Orchestrate data with agility and responsiveness.
Learn how to manage a common data integration project
by SKENDER KOLLCAKU
Milan, 07/2017
keywords:
iPaaS, data integration, Talend, Salesforce, data-driven, use case, migration, cloud computing, SaaS, CRM, database, real-time, open-source, java, professional services, on-premise, mainframe, data quality, hybrid, repository, metadata, reusable job, data validation, bi-directional sync, design pattern, agile, business, ETL, project management, customer
Once the input flat files from the mainframe are available, the ETL (Extract, Transform and Load) operations to execute could be the following:
Cleanse
Validate
Format
Unify
Standardize
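As an illustration of the cleanse and standardize steps listed above, here is a minimal Java sketch applied to single fields of a flat-file record. The class name `FieldCleaner`, the whitespace rule and the country mapping are hypothetical and are not taken from the original job.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: names and rules are illustrative, not from the original job.
final class FieldCleaner {

    // Assumed mapping used to standardize free-text country values to one code.
    private static final Map<String, String> COUNTRY_CODES = new HashMap<>();
    static {
        COUNTRY_CODES.put("ITALY", "IT");
        COUNTRY_CODES.put("ITALIA", "IT");
        COUNTRY_CODES.put("IT", "IT");
    }

    // Cleanse: trim and collapse runs of whitespace; null becomes an empty string.
    static String cleanse(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.trim().replaceAll("\\s+", " ");
    }

    // Standardize: map known spellings to one code; unknown values pass through cleansed.
    static String standardizeCountry(String raw) {
        String cleaned = cleanse(raw);
        String code = COUNTRY_CODES.get(cleaned.toUpperCase(Locale.ROOT));
        return code != null ? code : cleaned;
    }

    public static void main(String[] args) {
        System.out.println(cleanse("  Via   Roma  10 "));
        System.out.println(standardizeCountry(" italia "));
    }
}
```

In a Talend job the same logic would more likely live in component expressions or a reusable routine; the sketch only shows the kind of rule each step applies.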
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
pull data from the mainframe (MF)
cleanse, validate, format
unify or standardize
provision DB schema compatibility
upload into the SaaS CRM
Data quality includes data validation
DATA VALIDATION
NULL HANDLING
STRING HANDLING
DATE HANDLING
THIRD-PARTY VALIDATION LIBRARIES
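The null, string and date handling categories above can be sketched in Java as follows. This is a minimal example under stated assumptions: the class `ValidationRules`, the `dd/MM/uuuu` source date layout and the length limit are hypothetical, chosen only to make each category concrete.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Optional;

// Hypothetical validation helpers; rules and names are assumptions for illustration.
final class ValidationRules {

    // Assumed source date layout (for example 31/07/2017); STRICT rejects dates such as 31/02.
    private static final DateTimeFormatter SOURCE_DATE =
            DateTimeFormatter.ofPattern("dd/MM/uuuu").withResolverStyle(ResolverStyle.STRICT);

    // Null handling: replace a missing value with an explicit default.
    static String nullToDefault(String value, String fallback) {
        return value == null ? fallback : value;
    }

    // String handling: non-blank after trimming and within a maximum length.
    static boolean isValidText(String value, int maxLength) {
        if (value == null) {
            return false;
        }
        String trimmed = value.trim();
        return !trimmed.isEmpty() && trimmed.length() <= maxLength;
    }

    // Date handling: return the parsed date, or empty when the text is missing or invalid.
    static Optional<LocalDate> parseDate(String value) {
        if (value == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(value.trim(), SOURCE_DATE));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}
```

Returning `Optional` instead of throwing keeps the decision about what to do with a bad date (reject the row, default it, log it) in the calling job.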
Talend Data Preparation self-service free tool
Business process model definition before we start implementing the job
Use of the Talend DI canvas to model the business process. The flow of data will satisfy the following business requirement: only matched/validated customer address records will be loaded into the SaaS CRM.
Use Talend to set up the data migration between on-premise input files and the target SaaS CRM object (Account in Salesforce)
Prior to address validation
Simplified job that uses the “magical” tMap component to validate the address
Simplified job that uses the tMap component to validate the customer address. The outputs are (1) valid customers, loaded into the Salesforce Account object as records, and (2) rejected customers with invalid addresses, written to an Excel spreadsheet for future analysis.
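The two-output routing of this job can be approximated in plain Java. This is only a sketch: the `Customer` class and the validity rule (street and city present, five-digit postal code) are assumptions, and the real job expresses the rule in tMap and writes to Salesforce and Excel through Talend components rather than returning lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java approximation of the tMap routing; all names here are hypothetical.
final class AddressRouting {

    static final class Customer {
        final String name;
        final String street;
        final String city;
        final String postalCode;

        Customer(String name, String street, String city, String postalCode) {
            this.name = name;
            this.street = street;
            this.city = city;
            this.postalCode = postalCode;
        }
    }

    private static boolean notBlank(String s) {
        return s != null && !s.trim().isEmpty();
    }

    // Assumed validity rule: street and city present, five-digit postal code.
    static boolean hasValidAddress(Customer c) {
        return notBlank(c.street)
                && notBlank(c.city)
                && c.postalCode != null
                && c.postalCode.matches("\\d{5}");
    }

    // Returns two lists: index 0 = accepted (CRM load), index 1 = rejected (report).
    static List<List<Customer>> route(List<Customer> input) {
        List<Customer> accepted = new ArrayList<>();
        List<Customer> rejected = new ArrayList<>();
        for (Customer c : input) {
            if (hasValidAddress(c)) {
                accepted.add(c);
            } else {
                rejected.add(c);
            }
        }
        return Arrays.asList(accepted, rejected);
    }

    public static void main(String[] args) {
        List<List<Customer>> out = route(Arrays.asList(
                new Customer("Acme", "Via Roma 10", "Milano", "20121"),
                new Customer("Globex", "", "Milano", "2012")));
        System.out.println("accepted=" + out.get(0).size() + ", rejected=" + out.get(1).size());
    }
}
```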
(2) Project phase: bi-directional synchronization between mainframe and SaaS
The Talend built-in component tSalesforceGetUpdated_1 is used to track changes (update, insert, upsert) in the Salesforce Account object and propagate them in real time to a table on the DB2 mainframe. The component can run in the background, given a past start and end time range. Another mechanism is CDC (Change Data Capture).
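Because the component works on a start and end time range, a scheduled job needs to remember where the previous range ended. The following Java sketch shows one way to hand out consecutive, non-overlapping ranges. The class `SyncWindowTracker` is hypothetical and is not part of Talend or the Salesforce API; it keeps the checkpoint in memory only.

```java
import java.time.Instant;

// Hypothetical helper that produces consecutive [start, end] polling windows.
final class SyncWindowTracker {

    // End of the last window handed out; the next window starts here.
    private Instant checkpoint;

    SyncWindowTracker(Instant initialStart) {
        this.checkpoint = initialStart;
    }

    // Returns {start, end} for the next poll and advances the checkpoint,
    // so successive windows neither overlap nor leave gaps.
    Instant[] nextWindow(Instant now) {
        if (now.isBefore(checkpoint)) {
            throw new IllegalArgumentException("now is before the last checkpoint");
        }
        Instant[] window = {checkpoint, now};
        checkpoint = now;
        return window;
    }

    public static void main(String[] args) {
        SyncWindowTracker tracker = new SyncWindowTracker(Instant.parse("2017-07-01T00:00:00Z"));
        Instant[] w = tracker.nextWindow(Instant.parse("2017-07-01T00:15:00Z"));
        System.out.println(w[0] + " -> " + w[1]);
    }
}
```

In a real job the checkpoint would presumably be persisted and advanced only after the changes have been written successfully to the target table, so that a failed run is retried over the same range.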
Bi-directional integration means real-time synchronization between the two databases
There are some key issues to consider:
How similar are the schemas of the databases to be kept in sync (this helps with any JOIN operations)?
How often do the databases need to be synced (query performance…)?
How will we resolve situations in which the same data has been modified in both databases since the last sync session (conflict resolution based on the “record owner” or “last modified” solution, to be described)?
How much effort and/or money are we willing to invest in developing our sync system (“keep project budget on track”)?
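For the conflict question above, a “last modified” policy can be sketched in a few lines of Java. The class names and the tie-breaking rule (the mainframe copy wins when both timestamps are equal) are assumptions made for the example; the deck does not specify them.

```java
import java.time.Instant;

// Hypothetical "last modified wins" conflict resolution between two copies of a record.
final class ConflictResolver {

    static final class RecordVersion {
        final String id;
        final String value;
        final Instant lastModified;

        RecordVersion(String id, String value, Instant lastModified) {
            this.id = id;
            this.value = value;
            this.lastModified = lastModified;
        }
    }

    // Picks the more recently modified copy; on a tie the mainframe copy is kept (assumption).
    static RecordVersion resolve(RecordVersion mainframe, RecordVersion crm) {
        return crm.lastModified.isAfter(mainframe.lastModified) ? crm : mainframe;
    }

    public static void main(String[] args) {
        RecordVersion mf = new RecordVersion("001", "Via Roma 10", Instant.parse("2017-07-01T10:00:00Z"));
        RecordVersion crm = new RecordVersion("001", "Via Roma 12", Instant.parse("2017-07-01T11:00:00Z"));
        System.out.println(resolve(mf, crm).value);
    }
}
```

A “record owner” policy would replace the timestamp comparison with a lookup of which system owns the record or field, at the cost of maintaining that ownership mapping.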
(3) Added value: technical perspective
External lookup with any other data sources (supply chain, e-commerce, BI (analysis of ROIs, deals/opportunities), DW, marketing, social network activity/engagement, distributed and cross-platform applications…)
Reusable jobs, thanks to repository metadata
Versioning of the generated Java code (GitHub, Maven…)
Statistical reports about job execution (performance)
Other applications can trigger the job (example: collecting data for reports and dashboards…)
Unified and scalable integration platform (Data Preparation, DI, Cloud integration, ESB, MDM, Big Data, Fabric…)
(3) Added value: business perspective
Give real value to the data asset (“enable data-driven organizations”)
Support decisions (“how to use the information obtained?”) and provide them in advance (apply rules automatically and review them regularly)
Remove data management risk when modernizing systems
Consolidate applications
Smooth subscription model (start with the free open-source tool and then upgrade in a predictable fashion depending on business needs – pay only for the number of developers…)
Optimize processes by keeping comprehensive, relevant and consistent data everywhere.
Real-time delivery and predictive analytics!
Big Data native suite of products
Thank you!