Copyright © 2006, SAS Institute Inc. All rights reserved.
Data at its Best: How to keep large data volumes in order and ensure high quality?
Milen Georgiev Mihnev, Senior Consultant, Kontrax
Dec 23, 2015
Only One Guarantee: Organizations Have Lots of Data
And Lots of Systems Contributing
• ERP Systems
• Web Logs etc.
• Call Centre Apps
• Other Operational Apps
• Legacy Systems
• Operational Switches
• Unstructured Data (file-based information)
And more is on the way… New technologies just adding to the problem
• ERP Systems
• Web Logs etc.
• Call Centre Apps
• Other Operational Apps
• Legacy Systems
• Operational Switches
• Unstructured Data (file-based information)
• RFID
• Process Monitoring
The data explosion is underway
[Diagram: Legacy Systems, ERP, Marts & Systems, RDBMS]
There is no problem getting data… It comes from everywhere…
And it is all stored everywhere
And to emphasize the point
There are probably multiple systems across departments
From multiple different vendors added piecemeal over time
Running on many different types of hardware and operating systems… some examples
• PC-based Microsoft Windows
• IBM Mainframe with z/OS
• SPARC-based Sun Solaris
• Alpha-based OpenVMS
And this just scratches the surface!
Because of a siloed approach, information is in multiple places and is often duplicated and inconsistent…
Data duplication, inconsistency and system proliferation
The growing number of mergers and acquisitions is also adding new systems, new complexity and new costs
Originally…
Created a DWH to get a consolidated view of the many systems
Created a DWH to offload processing from already overloaded operational systems
Created a DWH to support BI and Analytics
Created a DWH to store historical data
Many successful projects… many failed ones. Data warehouses still have a major role to play in an organization
But… Internal pressure is rising
Pressure to consolidate operational RDBMSs is high, to reduce costs by simplifying infrastructure
Pressure to migrate legacy systems (such as core banking systems) is high or growing. Demand is there to modernize
Pressure to move to a single technology for building the warehouse or marts is high (2nd Generation)
Pressure to improve data quality at all points is growing
Moving beyond ETL… and into Data Integration
The Business Initiatives/Programs in detail
Data cleansing at point of entry as well as integrated into a real-time process or batch process.
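As a minimal sketch of the idea above, the same cleansing rule can be written once and reused both at point of entry (one record at a time) and in a batch run. The field names (`name`, `email`, `phone`) and the rules themselves are assumptions made for illustration, not something the slides specify.

```python
import re


def cleanse_record(record):
    """Return a cleaned copy of one record.

    Hypothetical fields: name, email, phone. The rules are illustrative only.
    """
    cleaned = dict(record)
    # Collapse repeated whitespace and normalize capitalization of the name.
    cleaned["name"] = " ".join(record.get("name", "").split()).title()
    # Trim and lower-case the e-mail address.
    cleaned["email"] = record.get("email", "").strip().lower()
    # Keep only the digits of the phone number.
    cleaned["phone"] = re.sub(r"\D", "", record.get("phone", ""))
    return cleaned


def cleanse_batch(records):
    """Apply the same rule to a whole batch of records."""
    return [cleanse_record(record) for record in records]
```

Calling `cleanse_record` inside a data-entry form handler covers the point-of-entry case, while `cleanse_batch` covers a scheduled batch job; both share one rule definition, so the two paths cannot drift apart.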
Data Synchronization / Replication
Batch and Multi-Transaction / Record Synchronization often via Change Data Capture mechanisms in low latency or batch mode
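To make the change data capture idea concrete, here is a small sketch that works in batch mode by comparing two snapshots of a source table and replaying the differences on a target. This is a simplification chosen for illustration: low-latency change data capture usually reads changes from the database log rather than diffing snapshots, and the function names and the dictionary-based table layout are assumptions.

```python
def capture_changes(previous, current):
    """Compare two snapshots (dicts keyed by primary key) and list the changes."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key in previous:
        if key not in current:
            changes.append(("delete", key, None))
    return changes


def apply_changes(target, changes):
    """Replay captured changes on a target table (also a dict keyed by primary key)."""
    for operation, key, row in changes:
        if operation == "delete":
            target.pop(key, None)
        else:
            target[key] = row
    return target
```

Only the changed records travel between systems, which is what keeps synchronization cheap compared with reloading the full table on every run.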
Ongoing Migration with Synchronization
Ad Hoc / Project Based
Master Data Management
Customer Data Integration
Product Information Management
x…Data Integration
The Past… Piecemeal Approach
Many technologies – one for each activity
Hand coding as the main mechanism
Complex
Time consuming to maintain
The Future… A Universal Data Integration Solution
One solution…
• That supports all the needs of an organization
• That spans the operational world and the business intelligence one
• Backed by people, process and methodology
• That is treated strategically
Able to interact with all systems, depending on what you are doing
Case Study – ETL/Data Quality
WHAT: Warehouse project
WHY: Provide a single source of information to support the Portfolio Management Division
HOW: Use data integration technologies to access 75+ source systems on various platforms, transform and cleanse the data and load the resulting data into an Oracle database for reporting with Cognos
RESULT: A single source of information for the Portfolio Management Division to report on
“We are pretty much using all of the various IT systems that the IT world has ever produced,” explains Eckart J. Schröer, head of information management. “The easy and transparent connection of the various data sources convinced us. No vendor other than SAS was able to provide us with the same capabilities. Our portfolio manager can take advantage of the successful integration of additional sources that are quickly accessible to them and made possible by our data management solution provided by SAS.”
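The extract, transform and load flow in the case study above can be sketched in a few lines. This is not the project's implementation: the source layout, the `portfolio` table and the cleansing rule are hypothetical, and an in-memory SQLite database stands in for the Oracle target so that the sketch is self-contained.

```python
import sqlite3


def extract(sources):
    """Yield (source_name, row) pairs from every source system."""
    for source_name, rows in sources.items():
        for row in rows:
            yield source_name, row


def transform(source_name, row):
    """Cleanse one row and tag it with the system it came from."""
    return (source_name, row["id"], row["name"].strip().title())


def load(connection, records):
    """Write the cleansed records into the reporting table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS portfolio (source TEXT, id INTEGER, name TEXT)"
    )
    connection.executemany("INSERT INTO portfolio VALUES (?, ?, ?)", records)
    connection.commit()


def run_etl(sources, connection):
    """Run the three steps in order and return the number of rows loaded."""
    records = [transform(name, row) for name, row in extract(sources)]
    load(connection, records)
    return len(records)
```

With `sqlite3.connect(":memory:")` as the connection and a dictionary of source rows, `run_etl` fills a single table that a reporting tool could then query, which mirrors the "single source of information" goal at toy scale.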
Case Study – Data Migration
WHAT: Migrate seamlessly from one data warehousing solution to another (24 million customer records and 7 TB)
WHY: The AA was sold by its parent company, Centrica. It needed to build a new data warehouse populated with the information housed on Centrica's system, and had less than 1 year to do it.
HOW: Used data integration technologies to extract the relevant data and build the new data warehouse
RESULT: New data warehouse in place within 6 months, with significantly reduced cost of operation and ownership compared with the old system