Cloud-based DWH Solution Using Amazon Redshift CeBIT | 12 March 2014 Ionut Hedesiu Senior Software Engineer
Transcript
Page 1: Case Study: Cloud based DWH Solution using Amazon Redshift

Cloud-based DWH Solution Using Amazon Redshift

CeBIT | 12 March 2014

Ionut Hedesiu, Senior Software Engineer

Page 2

What is Big Data?

Big: what does it stand for?

Does it really matter?

Page 3

What if?

affordable and intuitive framework

complete ETL flow ready in minutes

no third-party licensing royalties

any amount of data

no single point of failure

Page 4

Approach

inexpensive, highly performant data warehousing

strictly proven open source technologies

horizontally and vertically scalable

Page 5

Solution

independent, metadata-driven modules

collection of Python modules

deployed and tested on enterprise/commodity hardware and Amazon cloud solutions

Page 6

Implementation

• simple virtual Linux boxes

• instance auto-spawn

• SQL code on the fly

• AMQP standard messaging

• detailed logging, Splunk

• fully configurable
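
The "SQL code on the fly" bullet suggests that statements are generated from metadata rather than written by hand. The slides do not show how this is done, so the following is only a minimal sketch under stated assumptions: the metadata layout, the table and column names, and the `build_insert_select` helper are all hypothetical, and the metadata is assumed to be trusted configuration rather than user input.

```python
# Hypothetical metadata for one ETL step; every name here is illustrative
# and does not come from the presentation.
FLOW_METADATA = {
    "target": "dwh.sales_fact",
    "source": "staging.sales_raw",
    # target column -> SQL expression evaluated against the source table
    "columns": {
        "sale_id": "id",
        "amount_eur": "amount * fx_rate",
        "sold_on": "CAST(sold_at AS DATE)",
    },
    "filter": "amount IS NOT NULL",
}


def build_insert_select(meta):
    """Render an INSERT ... SELECT statement from a metadata dictionary."""
    target_cols = ", ".join(meta["columns"])
    select_list = ",\n       ".join(
        f"{expr} AS {col}" for col, expr in meta["columns"].items()
    )
    sql = (
        f"INSERT INTO {meta['target']} ({target_cols})\n"
        f"SELECT {select_list}\n"
        f"FROM {meta['source']}"
    )
    # The WHERE clause is optional in this sketch.
    if meta.get("filter"):
        sql += f"\nWHERE {meta['filter']}"
    return sql + ";"


if __name__ == "__main__":
    print(build_insert_select(FLOW_METADATA))
```

With this shape, adding a column or a new flow would mean editing metadata only, which is one plausible reading of "metadata-driven" in these slides.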

Page 7

Features

enterprise messaging

metadata-driven ETL flows

multiple work queues

detailed logging in multiple destinations

secure user access

alerts based on user-defined formulas
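
The deck does not say how "alerts based on user-defined formulas" are evaluated. One possible approach, shown here purely as an illustrative sketch, is to parse the formula with Python's standard `ast` module and allow only arithmetic and a single comparison over named metrics. The function name `evaluate_alert` and the metric names are assumptions made for this example.

```python
import ast
import operator

# Whitelisted operators; anything else in a formula is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_COMPARE = {
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def evaluate_alert(formula, metrics):
    """Evaluate a user-defined formula against a dict of metric values."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in metrics:
                raise ValueError(f"unknown metric: {node.id}")
            return metrics[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if (
            isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and type(node.ops[0]) in _COMPARE
        ):
            compare = _COMPARE[type(node.ops[0])]
            return compare(walk(node.left), walk(node.comparators[0]))
        # Function calls, attribute access, etc. are never evaluated.
        raise ValueError(f"unsupported expression: {type(node).__name__}")

    return walk(ast.parse(formula, mode="eval"))


if __name__ == "__main__":
    metrics = {"failed_rows": 10, "total_rows": 100}  # hypothetical metrics
    if evaluate_alert("failed_rows / total_rows > 0.05", metrics):
        print("ALERT: failure ratio above threshold")
```

Walking the syntax tree instead of calling `eval` keeps user-supplied formulas from running arbitrary code, which seems in keeping with the "secure user access" feature listed above.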

Page 8

Benefits

SCALABLE
• vertical and horizontal
• auto scalability and load balancing

CUSTOMISABLE
• platform and database agnostic
• quick module addition or removal

COST-EFFICIENT
• minimal cost and development time
• very low maintenance cost

Page 9

Benefits

POWERFUL
• real-time data analytics
• massively parallel processing
• intensive data mining and cleansing

ROBUST
• 99.5% availability
• minimal or no maintenance
• lightweight framework

FLEXIBLE
• one central point of control
• metadata-driven

Page 10

Case Study – Global Media Organisation

• 500+ source systems
• 3 database vendors
• local batch processing

• no global data overview
• no data integration

Page 11

Implementation Overview

• centralised data repository
• real-time processing
• metadata-driven
• customised to client needs

• Python
• RabbitMQ
• Amazon Redshift
• Tableau
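
The slides name Python and a message broker as building blocks but give no code. As a rough sketch of how independent modules might hand work to each other through named work queues, the example below uses the standard-library `queue` module as an in-memory stand-in for the broker. The queue names, message fields, and handler functions are all hypothetical.

```python
import json
import queue

# In-memory stand-ins for broker queues (assumption: a real deployment
# would declare these on the message broker instead).
WORK_QUEUES = {"extract": queue.Queue(), "load": queue.Queue()}


def publish(routing_key, payload):
    """Serialise a message and place it on the queue named by routing_key."""
    WORK_QUEUES[routing_key].put(json.dumps(payload))


def handle_extract(message):
    """Pretend to extract a source table, then request the load step."""
    publish("load", {"table": message["table"], "rows": message["rows"]})
    return f"extracted {message['table']}"


def handle_load(message):
    """Pretend to load the extracted rows into the warehouse."""
    return f"loaded {message['rows']} rows into {message['table']}"


HANDLERS = {"extract": handle_extract, "load": handle_load}


def drain(routing_key):
    """Process every message currently waiting on one queue."""
    results = []
    work_queue = WORK_QUEUES[routing_key]
    while not work_queue.empty():
        message = json.loads(work_queue.get())
        results.append(HANDLERS[routing_key](message))
    return results


if __name__ == "__main__":
    publish("extract", {"table": "sales", "rows": 3})
    print(drain("extract"))
    print(drain("load"))
```

Because each handler only reads a message and publishes the next one, modules stay independent of each other, and more workers could be attached to a busy queue without changing the handlers.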

Page 12

Benefits & Results

• tenfold cost reduction
• intuitive and easy to use
• secure and simple to administer

• real-time analytics
• improved decision-making
• minimal to no maintenance
• high scalability