
Evolving to the business data lake

Jan 17, 2015


Technology

Ignou Ignou

Every company is concerned about the rising costs of EDWs and how to better enable them to be successful in a modern business as the demands of big data and mobile or transactional access increase. Pivotal’s Business Data Lake helps you address that challenge by optimizing the data within your EDW and providing you with a way to add big data analytics without the cost of scaling the EDW to process big data volumes.
Transcript
Page 1: Evolving to the business data lake

Evolving to the Business Data Lake

Big Data shouldn’t mean big problems

The Pivotal Business Data Lake is a new approach to enterprise reporting and information management but it doesn’t mean an organization has to rip out its existing enterprise data warehouse (EDW). Instead the Pivotal Business Data Lake provides a way to extend the life of your current EDW and a platform to consolidate the existing data marts on to a single infrastructure.

The Pivotal Business Data Lake also provides a way to incorporate new big data sources, including unstructured data such as weblogs, call records or RFID data. This means the information within the Business Data Lake can be made available to your existing EDW customers.

Refocusing the EDW on what it does best

By design an EDW has multiple ‘layers’ of information. These have names such as ‘source layer’, ‘semantic layer’ and others but their purpose is always the same: to transform the loaded information into the format and structure that can be used for reporting.

These layers can often mean seven or more transformations before the information is available for reporting. These layers are required because the way users wish to report on information is not the same as the way it’s loaded. The layers make updates easier to handle. The problem with this, however, is two-fold.

How you can deliver value today and tomorrow by incorporating the Business Data Lake into existing enterprise data warehouse and data mart environments


Firstly, no-one wants to pay premium prices for these intermediary layers – they just want the reporting layer. Secondly, these additional layers all take up storage space as data is copied from one layer to another in the EDW. This means a 10TB EDW may have only 1TB of actual information at the reporting layer.
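The 10TB example above can be worked through as a quick calculation. This is only a minimal sketch: the function name `reporting_share` is hypothetical, and the only figures used are the 10TB and 1TB quoted in the text.

```python
def reporting_share(total_tb: float, reporting_tb: float) -> float:
    """Return the fraction of EDW storage that holds reporting-layer data."""
    if total_tb <= 0:
        raise ValueError("total_tb must be positive")
    if not 0 <= reporting_tb <= total_tb:
        raise ValueError("reporting_tb must be between 0 and total_tb")
    return reporting_tb / total_tb


# Figures taken from the example in the text.
total_tb = 10.0
reporting_tb = 1.0

share = reporting_share(total_tb, reporting_tb)
intermediary_tb = total_tb - reporting_tb  # space used by the non-reporting layers

print(f"Reporting layer: {share:.0%} of premium storage")
print(f"Intermediary layers: {intermediary_tb:.0f}TB ({1 - share:.0%})")
```

On these numbers, nine tenths of the premium storage is spent on layers that no reporting user ever queries directly.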

Moving the intermediary layers to the Business Data Lake

The Pivotal Business Data Lake enables you to move these layers out of the EDW and into the Pivotal Business Data Lake. Hadoop Distributed File System (HDFS) replaces the source loading layer, which significantly reduces costs. Pivotal Data Dispatch (PDD) and HAWQ are then used to re-create and potentially simplify the other layers.

[Figure: Traditional EDW versus Evolved EDW. In the traditional EDW, all five layers (reporting, semantic, translation, intermediary and source loading) sit inside the EDW. In the evolved EDW, the Enterprise Data Warehouse keeps only the reporting layer, while the semantic, translation, intermediary and source loading layers sit in the Business Data Lake.]

PDD is used to automatically update the reporting layer within your existing EDW. This means no code changes are needed to your existing reporting solutions. It also significantly extends the time before you need to allocate more space to your premium EDW. This approach can save over 80% of the cost of extending the landscape and add years to the life of an EDW infrastructure.


Adding big data and predictive analytics into your EDW landscape

Shifting to the Pivotal Business Data Lake is not simply about saving costs, however; it’s also about enabling the business to do more. Traditional EDW solutions cannot handle the volume of data required for big data analytics, but the results of those analytics are often a critical element that the people using the EDW need to see. With the Pivotal Business Data Lake you already have a leading big data platform in Pivotal One that enables you to take on petabytes of data and perform analytics on that information. It also enables you to perform matching and searching and use HAWQ and PDD to make that information available within your existing EDW.

Enabling mobility – adding an in-memory layer to your EDW

High-demand interactive reporting applications, such as those running on mobile devices, often make significant demands on traditional EDWs. Although they normally operate on a sub-set of data, these applications have a disproportionate impact on the processing requirements. With the Pivotal Business Data Lake the solution is to use PDD to copy the information from your EDW into SQLFire, a leading in-memory database which can support up to 2TB of data and provide lightning fast cached reporting for mobile and other high-demand clients. Using the Business Data Lake in this way also means transactional systems can access the valuable analytics and outputs from your EDW without impacting the infrastructure and processing requirements of your existing EDW.
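The copy-then-serve pattern described above can be sketched in a few lines. This is an illustration under stated assumptions, not Pivotal's implementation: a SQLite database stands in for the EDW, a plain Python dict stands in for the in-memory database, the `copy_subset` function stands in for the role PDD plays in the text, and the table name and sample rows are invented for the example. No SQLFire or PDD API is used or implied.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a stand-in 'EDW' with a small reporting table (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_report (region TEXT PRIMARY KEY, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales_report VALUES (?, ?)",
        [("north", 120.0), ("south", 95.5), ("east", 80.0)],
    )
    conn.commit()
    return conn


def copy_subset(conn: sqlite3.Connection, regions: list[str]) -> dict[str, float]:
    """Copy the selected reporting rows out of the warehouse into an in-memory dict."""
    if not regions:
        return {}
    placeholders = ",".join("?" for _ in regions)
    rows = conn.execute(
        f"SELECT region, revenue FROM sales_report WHERE region IN ({placeholders})",
        regions,
    )
    return {region: revenue for region, revenue in rows}


warehouse = build_warehouse()
cache = copy_subset(warehouse, ["north", "south"])  # only the sub-set clients need
warehouse.close()  # the reads below never touch the warehouse

print(cache["north"])
```

The design point is that high-demand clients read only from the copied sub-set, so their query load does not reach the warehouse at all; keeping the copy fresh is the job the text assigns to PDD.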

Add life to your EDW with the Pivotal Business Data Lake

Every company is concerned about the rising costs of EDWs and how to keep them effective in a modern business as the demands of big data and mobile or transactional access increase. Pivotal’s Business Data Lake helps you address that challenge by optimizing the data within your EDW and providing you with a way to add big data analytics into it without the cost of scaling the EDW to process big data volumes. Pivotal’s Business Data Lake also adds performance to your EDW by acting as a lightning fast in-memory cache for key information. This enables mobile users and transactional systems to leverage the power of your EDW without the cost of scaling traditional EDWs to meet transactional demands.



The information contained in this document is proprietary. ©2013 Capgemini. All rights reserved. Rightshore® is a trademark belonging to Capgemini.

About Capgemini

With more than 130,000 people in 44 countries, Capgemini is one of the world’s foremost providers of consulting, technology and outsourcing services. The Group reported 2012 global revenues of EUR 10.3 billion.

Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model.

Find out more at www.capgemini.com/bdl and www.gopivotal.com/businessdatalake

Or contact us at [email protected]
