A Hybrid Row-column OLTP Database Architecture for Operational Reporting Jan Schaffner, Anja Bog, Jens Krüger, Alexander Zeier
Dec 29, 2015

A Hybrid Row-column OLTP Database Architecture for Operational Reporting

Jan Schaffner, Anja Bog, Jens Krüger, Alexander Zeier

August 24, 2008 | A Hybrid Row-column OLTP Database Architecture

Agenda

■ Operational Reporting

■ Related Work

■ Architecture of Hybrid System

■ Virtual Cube

■ Outlook and Discussion

Operational Reporting

Distinction according to Inmon:

■ Informational Reporting

□ Supports long-term, strategic decisions

□ Summarized data

□ Long-term horizons

Typically done using a data warehouse (DW)

■ Operational Reporting

□ Supports day-to-day decisions

□ Data on a more detailed level

□ Takes up-to-the-minute data into account

Done using a DW or an OLTP system?

Operational Reporting (contd.)

■ Using a DW for Operational Reporting

□ DW must be designed to the same level of granularity as the OLTP systems → huge data volumes

□ Updates must be replicated into the DW frequently → endless optimization

■ Using an OLTP Store for Operational Reporting

□ Operational reporting queries are relatively long-running in comparison to pure OLTP workloads

□ Resource contention: locks held by long-running queries block the short-running ones

□ Different data model: not optimized for reporting (i.e. no star schema)

Common Data Warehouse Architecture

■ DW contains ETL processor which

□ ...extracts data from various OLTP sources into a staging area

□ ...applies transformations for cleansing and integration

□ ...stores data in a dimensional layout

■ OLAP engine runs queries against dimensional data store
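
The three ETL steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the order records, the `etl` function, and the `dim_customer` / `fact_sales` names are hypothetical and do not come from the talk; a real ETL processor would read from several OLTP sources and write to a persistent dimensional store.

```python
def etl(source_rows):
    """Extract order rows, cleanse them, and load a tiny star schema."""
    # Extract: copy the source rows into a staging area.
    staging = list(source_rows)

    # Transform: drop rows without an amount and normalise customer names.
    cleaned = [
        {**r, "customer": r["customer"].strip().lower()}
        for r in staging
        if r["amount"] is not None
    ]

    # Load: one dimension table (customer -> surrogate key) and one fact table.
    dim_customer = {}
    fact_sales = []
    for r in cleaned:
        key = dim_customer.setdefault(r["customer"], len(dim_customer) + 1)
        fact_sales.append((key, r["amount"]))
    return dim_customer, fact_sales


# Hypothetical OLTP rows, including a duplicate spelling and a missing value.
orders = [
    {"customer": " Acme ", "amount": 120},
    {"customer": "acme", "amount": 30},
    {"customer": "Globex", "amount": None},
    {"customer": "Globex", "amount": 80},
]
dim_customer, fact_sales = etl(orders)
```

An OLAP query then only touches `fact_sales` and joins to `dim_customer` by key, which is the point of the dimensional layout.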

“Real-Time” DW Architectures

■ Microbatch

□ Configure ETL process to run in very short intervals

□ Up-to-date data but very resource intensive

■ Push Architectures

□ Handling of deltas on a business or database transaction level

□ Up-to-date data but still resource intensive

■ Operational Data Store (ODS)

□ Store copy of the OLTP data using an integrated schema

□ High data granularity but no up-to-date data

“Real-Time” DW Architectures (contd.)

■ ELT

□ Data is extracted from the OLTP sources and loaded into the ODS

□ Transformations are done in the warehouse at query-runtime

□ High granularity (transactional data) but no up-to-date data

■ Virtual ODS

□ Virtual in the sense that queries are redirected against OLTP system

□ High granularity (transactional data) and up-to-date data

□ Performs ETL on-the-fly

□ Affects performance of OLTP system

Column-Stores: New “Trend” for OLAP

□ Column-store databases:

◊ Vertical fragmentation

◊ Fast aggregations (sum, min, max, avg, …) → more flexibility for ad-hoc reporting

◊ Each column can be compressed individually

□ Both disk-based …

◊ Vertica

◊ Greenplum

□ … and in-memory:

◊ SAP BIA

◊ MonetDB

◊ Exasol

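Row-oriented: one table holding all columns

sID | c1  | c2  | c3
1   | v11 | v12 | v13
2   | v21 | v22 | v23
3   | v31 | v32 | v33

Column-oriented: one table per column, each keyed by sID

sID | c1      sID | c2      sID | c3
1   | v11     1   | v12     1   | v13
2   | v21     2   | v22     2   | v23
3   | v31     3   | v32     3   | v33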
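
A minimal sketch of the two layouts, assuming plain Python lists stand in for storage pages. The names `rows`, `columns`, and `to_columns` and the numeric values are illustrative only and are not taken from any of the systems listed above.

```python
# Row-oriented layout: one tuple per record (sID, c1, c2, c3).
rows = [
    (1, 10, 100, 1000),
    (2, 20, 200, 2000),
    (3, 30, 300, 3000),
]


def to_columns(rows, names):
    """Vertically fragment a row table into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


# Column-oriented layout: each column is stored as its own list.
columns = to_columns(rows, ["sID", "c1", "c2", "c3"])

# Aggregating one attribute: the row layout visits every record in full,
# while the column layout scans a single list.
sum_c2_rows = sum(row[2] for row in rows)
sum_c2_cols = sum(columns["c2"])
```

Both sums are equal; the difference is how much data has to be read to compute them, which is why aggregations favour the column layout.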

Encoding Schemes

■ Ordered, few distinct values: sequence of triples (value, offset position, # occurrences)

■ Ordered, many distinct values: delta representation

■ Unordered, few distinct values: sequence of tuples (value, bitmap for positional occurrence)

■ Unordered, many distinct values: ??
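
The three encodings named on this slide can be sketched as follows. This is a toy illustration, assuming in-memory Python lists as columns and Python integers as bitmaps; the function names are hypothetical and the code is not the encoding used by any particular column store.

```python
def rle_encode(values):
    """Encode a column as (value, offset position, # occurrences) triples."""
    triples = []
    for pos, v in enumerate(values):
        if triples and triples[-1][0] == v:
            value, offset, count = triples[-1]
            triples[-1] = (value, offset, count + 1)   # extend the current run
        else:
            triples.append((v, pos, 1))                # start a new run
    return triples


def bitmap_encode(values):
    """Encode a column as (value, bitmap) tuples; bit i is set if row i holds the value."""
    bitmaps = {}
    for pos, v in enumerate(values):
        bitmaps[v] = bitmaps.get(v, 0) | (1 << pos)
    return sorted(bitmaps.items())


def delta_encode(values):
    """Store the first value followed by the differences between neighbours."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


ordered_few = ["A", "A", "A", "B", "B", "C"]
unordered_few = ["B", "A", "B", "C", "A", "B"]
ordered_many = [100, 103, 104, 110, 111]

runs = rle_encode(ordered_few)          # [("A", 0, 3), ("B", 3, 2), ("C", 5, 1)]
bitmaps = bitmap_encode(unordered_few)  # [("A", 0b010010), ("B", 0b100101), ("C", 0b001000)]
deltas = delta_encode(ordered_many)     # [100, 3, 1, 6, 1]
```

Run-length triples only pay off when equal values are adjacent, which is why they need an ordered column; bitmaps work regardless of order but need one bitmap per distinct value.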

Architecture of Hybrid System

■ Essentially an integration of a row-store and a column-store DB

■ MaxDB is used as the row store

□ Database underlying SAP Business ByDesign

□ Supports ACID transactions

■ TREX is used as the column store

□ Main memory

□ Engine underlying SAP BIA

□ Has a copy of (some of) the OLTP data

□ Primary OLTP system and main-memory database (MMDB) are governed by a single resource manager
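
One way to picture the coupling is a single coordinator that applies each write to the row store and to the column copy together. The slides do not say how MaxDB and TREX are actually kept in sync, so the sketch below is purely an assumption: `HybridStore` and its methods are hypothetical, and plain dictionaries and lists stand in for both engines.

```python
class HybridStore:
    """Toy coordinator: a row store for transactions plus a column copy for reporting."""

    def __init__(self, column_names):
        self.column_names = list(column_names)
        self.row_store = {}                                      # key -> full record
        self.column_store = {c: [] for c in self.column_names}   # name -> values

    def insert(self, key, record):
        """Apply one insert to both stores, or to neither if validation fails."""
        if key in self.row_store:
            raise KeyError(f"duplicate key: {key}")
        missing = [c for c in self.column_names if c not in record]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        # All checks passed: update the row store and the column copy together.
        self.row_store[key] = dict(record)
        for c in self.column_names:
            self.column_store[c].append(record[c])

    def lookup(self, key):
        """OLTP-style point read, served by the row store."""
        return self.row_store[key]

    def total(self, column):
        """Reporting-style aggregate, served by the column copy."""
        return sum(self.column_store[column])


store = HybridStore(["customer", "amount"])
store.insert(1, {"customer": "acme", "amount": 120})
store.insert(2, {"customer": "globex", "amount": 80})
```

Point reads such as `store.lookup(2)` never touch the column copy, and aggregates such as `store.total("amount")` never touch the row store, so the two workloads do not compete for the same data structures.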

Architecture of Hybrid System (contd.)

Virtual Cube

■ Architecture similar to the virtual ODS

□ Virtual Cube provides the same interface as a typical cube (slice, dice, drill-down, …)

□ Virtual Cube rewrites queries and issues them against the MMDB (TREX in our case)

□ TREX has a copy of the OLTP data

□ Primary OLTP system and MMDB are tied together as described above
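
The query-rewriting idea can be sketched as a thin class that turns cube operations into aggregate queries at call time, without materializing a cube. Everything here is an assumption for illustration: the `VirtualCube` class, the `sales` table, and its columns are hypothetical, and an in-memory SQLite database merely stands in for the MMDB as an executable query target (SQLite is not a column store, and TREX's real query interface is not shown in the slides).

```python
import sqlite3


class VirtualCube:
    """Cube-style interface that rewrites operations into SQL at query time."""

    def __init__(self, conn, table, measure, dimensions):
        self.conn = conn
        self.table = table
        self.measure = measure
        self.dimensions = set(dimensions)
        self.filters = {}                      # dimension -> required value

    def _check(self, dimension):
        if dimension not in self.dimensions:
            raise ValueError(f"unknown dimension: {dimension}")

    def slice(self, dimension, value):
        """Return a new cube restricted to one value of a dimension."""
        self._check(dimension)
        cube = VirtualCube(self.conn, self.table, self.measure, self.dimensions)
        cube.filters = {**self.filters, dimension: value}
        return cube

    def to_sql(self, group_by):
        """Rewrite the current cube state into a parameterised aggregate query."""
        for d in group_by:
            self._check(d)
        select = ", ".join([*group_by, f"SUM({self.measure})"])
        sql = f"SELECT {select} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"{d} = ?" for d in self.filters)
        if group_by:
            cols = ", ".join(group_by)
            sql += f" GROUP BY {cols} ORDER BY {cols}"
        return sql, list(self.filters.values())

    def aggregate(self, *group_by):
        """Run the rewritten query; adding dimensions corresponds to a drill-down."""
        sql, params = self.to_sql(group_by)
        return self.conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EU", "a", 10), ("EU", "b", 20), ("US", "a", 30), ("US", "a", 40)],
)

cube = VirtualCube(conn, "sales", "revenue", ["region", "product"])
by_region = cube.aggregate("region")
eu_by_product = cube.slice("region", "EU").aggregate("product")
```

Because every result is computed from the current table contents, a report always reflects the latest committed data, at the cost of running the aggregation on each request.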

Outlook

■ Build a “real” hybrid database in-memory as part of ChunkyStore

■ Data can be stored as either:

□ Rows

□ Columns

□ Chunks (adjacent fragments of rows and columns)

■ DB decides which physical storage alternative is most suitable

■ Main-memory implementation will cater for fast updates as well as fast operational reporting capabilities
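
The three storage alternatives can be viewed as different groupings of columns. The sketch below assumes that a chunk is a group of adjacent columns stored together row by row; that reading, the `partition` function, and the sample table are all hypothetical, since the slides do not describe how ChunkyStore lays out chunks or how the DB would choose between layouts.

```python
def partition(rows, names, groups):
    """Split a row table into chunks; each chunk stores one group of columns row-wise."""
    index = {name: i for i, name in enumerate(names)}
    return [
        {
            "columns": list(group),
            "data": [tuple(row[index[c]] for c in group) for row in rows],
        }
        for group in groups
    ]


names = ["id", "customer", "amount", "tax"]
rows = [(1, "acme", 120, 12), (2, "globex", 80, 8)]

# Rows: a single chunk holding every column.
as_rows = partition(rows, names, [names])

# Columns: one chunk per column.
as_columns = partition(rows, names, [[n] for n in names])

# Chunks: columns that are usually read together share a chunk.
as_chunks = partition(rows, names, [["id", "customer"], ["amount", "tax"]])
```

Under this view, pure row storage and pure column storage are the two extremes of the same partitioning scheme.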

Thank you

■ Questions?
