Slide 1
OLTP

Online Transaction Processing System (OLTP)

An online operational database system that performs online
transaction and query processing is called an Online Transaction
Processing (OLTP) system. Examples: day-to-day operations of
organizations, such as purchasing, inventory, manufacturing,
banking, payroll, registration, and accounting.

An OLTP system deals with operational data, that is, the data
involved in the operation of a particular system. Example: in a
banking system, you withdraw an amount from your account. The
account number, withdrawal amount, available amount, balance
amount, transaction number, etc. are then operational data elements.

OLTP

In an OLTP system, data are frequently updated and queried, so a
quick response to each request is expected. Since OLTP systems
involve a large number of update queries, the database tables are
optimized for write operations. To prevent data redundancy and
update anomalies, the database tables are normalized. Normalization
makes write operations on the database tables more efficient.

Operational data are usually of local relevance. Queries on them
typically access an individual tuple (individual record). Queries of
this type are termed point queries.

Examples of OLTP queries:
What is the salary of Mr. John?
Withdraw money from a bank account: this performs an update
operation if money is withdrawn from the account.
What is the address and email ID of the person who is the head of
the maths department?
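
The two kinds of OLTP request listed above, a point query and a short update transaction, could look roughly like the following. This is only a minimal sketch using Python's built-in sqlite3 module; the account table, its columns, and the get_balance and withdraw helpers are hypothetical names made up for illustration and do not come from any system described in these slides.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_no TEXT PRIMARY KEY, holder TEXT, balance REAL)"
)
conn.execute("INSERT INTO account VALUES ('A-100', 'John', 500.0)")
conn.commit()


def get_balance(conn, account_no):
    """Point query: reads exactly one tuple, located by its primary key."""
    row = conn.execute(
        "SELECT balance FROM account WHERE account_no = ?", (account_no,)
    ).fetchone()
    return None if row is None else row[0]


def withdraw(conn, account_no, amount):
    """Short update transaction: succeeds only if the balance covers the amount."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE account SET balance = balance - ? "
            "WHERE account_no = ? AND balance >= ?",
            (amount, account_no, amount),
        )
        return cur.rowcount == 1  # True if exactly one row was updated


ok = withdraw(conn, "A-100", 120.0)
balance_after = get_balance(conn, "A-100")
rejected = withdraw(conn, "A-100", 1000.0)
```

Both requests touch a single row found through the primary key, which is why such workloads favor fast writes and short transactions.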

What is OLAP

Basic idea: converting data into information that decision makers
need.
Concept: analyzing data by multiple dimensions in a structure called
a data cube.

OLAP

OLAP designates a category of applications and technologies that
allows the collection, storage, manipulation, and reproduction of
multidimensional data, with the goal of analysis.

History

In 1993, E. F. Codd coined the term online analytical processing
(OLAP) in his paper titled "Providing OLAP (On-line Analytical
Processing) to User-Analysts: An IT Mandate". The term OLAP seems
well suited to describe databases designed to facilitate decision
making (analysis) in an organization.

Purpose of OLAP

To derive summarized information from large-volume databases.
To generate automated reports for human viewing.

Examples of OLAP queries:
How is the profit changing over the years across different regions?
Is it financially viable to continue the production unit at
location X?
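
The first example question above (profit over the years across regions) is an aggregation over many rows rather than a lookup of one record. A minimal sketch with Python's built-in sqlite3 module follows; the sales table, its columns, and all figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and figures, for illustration only.
conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, profit REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (2021, "North", 100.0),
        (2021, "North", 50.0),
        (2021, "South", 80.0),
        (2022, "North", 120.0),
        (2022, "South", 60.0),
        (2022, "South", 30.0),
    ],
)

# Analytical query: scans all rows and summarizes them along two dimensions.
profit_by_year_region = conn.execute(
    "SELECT year, region, SUM(profit) FROM sales "
    "GROUP BY year, region ORDER BY year, region"
).fetchall()

for year, region, total in profit_by_year_region:
    print(year, region, total)
```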

OLAP, by Dr. Khalil

What and Why OLAP?

OLAP enables users to gain a deeper understanding and knowledge
about various aspects of their corporate data through fast,
consistent, interactive access to a variety of possible views of
data.

While OLAP systems can easily answer "who?" and "what?" questions,
it is their ability to answer "what if?" and "why?" questions that
distinguishes them from general-purpose query tools.

The types of analysis available from OLAP range from basic
navigation and browsing (referred to as slicing and dicing), to
calculations, to more complex analyses such as time series and
complex modeling.

OLAP Applications

Finance: Budgeting, activity-based costing, financial performance
analysis, and financial modeling.
Sales: Sales analysis and sales forecasting.
Marketing: Market research analysis, sales forecasting,
promotions analysis, customer analysis, and market/customer
segmentation.
Manufacturing: Production planning and defect analysis.

OLAP Benefits

Increased productivity of business end-users, IT developers, and
consequently the entire organization.
Reduced backlog of applications development for IT staff by making
end-users self-sufficient enough to make their own schema changes
and build their own models.
Retention of organizational control over the integrity of corporate
data, as OLAP applications depend on data warehouses and OLTP
systems to refresh their source-level data.
Improved potential revenue and profitability by enabling the
organization to respond more quickly
to market demands.

Comparison of the OLTP system (Online Transaction Processing,
operational system) and the OLAP system (Online Analytical
Processing, data warehouse):

Source of data
  OLTP: Operational data; OLTP systems are the original source of
  the data.
  OLAP: Consolidated data; OLAP data comes from the various OLTP
  databases.

Purpose of data
  OLTP: To control and run fundamental business tasks.
  OLAP: To help with planning, problem solving, and decision support.

What the data reveals
  OLTP: A snapshot of ongoing business processes.
  OLAP: Multi-dimensional views of various kinds of business
  activities.

Inserts and updates
  OLTP: Short and fast inserts and updates initiated by end users.
  OLAP: Periodic long-running batch jobs refresh the data.

Queries
  OLTP: Relatively standardized and simple queries returning
  relatively few records.
  OLAP: Often complex queries involving aggregations.

Processing speed
  OLTP: Typically very fast.
  OLAP: Depends on the amount of data involved; batch data refreshes
  and complex queries may take many hours; query speed can be
  improved by creating indexes.

Space requirements
  OLTP: Can be relatively small if historical data is archived.
  OLAP: Larger due to the existence of aggregation structures and
  history data; requires more indexes than OLTP.

Database design
  OLTP: Highly normalized with many tables.
  OLAP: Typically de-normalized with fewer tables; use of star
  and/or snowflake schemas.

Backup and recovery
  OLTP: Back up religiously; operational data is critical to run the
  business, and data loss is likely to entail significant monetary
  loss and legal liability.
  OLAP: Instead of regular backups, some environments may consider
  simply reloading the OLTP data as a recovery method.

Schema

Pronounced skee-ma, the structure of a database system,
described in a formal language supported by the database management
system (DBMS). In a relational database, the schema defines the
tables, the fields in each table, and the relationships between
fields and tables.

Schemas are generally stored in a data
dictionary. Although a schema is defined in a text database language,
the term is often used to refer to a graphical depiction of the
database structure.

Types of Schemas

In databases:
Hierarchical model
Network model
Relational model (RDBMS)

In data warehouses:
Star schema
Snowflake schema

Star schema

The star schema architecture is the simplest data warehouse schema.
It is called a star schema because the diagram resembles a star,
with points radiating from a center. The center of the star consists
of the fact table, and the points of the star are the dimension
tables. Usually the fact tables in a star schema are in third normal
form (3NF), whereas dimension tables are de-normalized. Although the
star schema is the simplest architecture, it is the most commonly
used nowadays and is recommended by Oracle.

In a relational database, denormalization is an approach to speeding
up read performance (data retrieval) in which the administrator
selectively adds back specific instances of redundant data after the
data structure has been normalized. A denormalized database should
not be confused with a database that has never been normalized.

Star Schema: Fact Tables

A fact table typically has two types of columns: foreign keys to
dimension tables, and measures, which contain raw numeric items that
represent relevant business facts. A fact table can contain fact
data at a detailed or aggregated level, so it tends to be very large.

Star Schema: Dimension Tables

A dimension table is a structure, usually composed of one or more
hierarchies, that categorizes data. If a dimension has no
hierarchies or levels, it is called a flat dimension or list. These
tables are joined to the fact table using foreign key references.
Dimension tables are generally smaller in size than the fact table.
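
To make the fact/dimension split concrete, here is a small sketch of a star schema built with Python's sqlite3 module. All table names, column names, and rows (dim_product, dim_store, fact_sales, and so on) are hypothetical and chosen only for illustration. The fact table holds foreign keys and measures, and each dimension is a single table joined to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables (hypothetical names and columns)
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, brand TEXT);
    CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, city TEXT, region TEXT);

    -- Fact table: foreign keys to the dimensions plus numeric measures
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        qty        INTEGER,
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'bolt', 'Acme'), (2, 'nut', 'Acme');
    INSERT INTO dim_store   VALUES (1, 'NYC', 'East'), (2, 'SFO', 'West'), (3, 'LA', 'West');
    INSERT INTO fact_sales  VALUES (1, 1, 1, 12.0), (2, 1, 2, 11.0), (1, 3, 5, 50.0);
    """
)

# One join per dimension that the query needs.
amount_by_region = conn.execute(
    """
    SELECT s.region, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY s.region
    ORDER BY s.region
    """
).fetchall()

print(amount_by_region)
```

Note that region is stored directly in dim_store, so the value 'West' is repeated for every western store; that repetition is the de-normalization of the dimension table.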

Typical fact tables store data about sales, while dimension tables
store data about geographic regions (markets, cities), customers,
products, and time.

Characteristics of star schema:
Simple structure -> easy to understand schema
Great query effectiveness -> small number of tables to join
Relatively long time of loading data into dimension tables ->
de-normalized
The most commonly used in data warehouse implementations -> widely
supported by a large number of business intelligence tools

Snowflake schema

It is a logical arrangement of tables in a multidimensional database
such that the entity relationship diagram resembles a snowflake
shape. The snowflake schema is represented by centralized fact
tables which are connected to multiple dimensions. "Snowflaking" is
a method of normalising the dimension tables in a star schema. When
it is completely normalised along all the dimension tables, the
resultant structure resembles a snowflake with the fact table in the
middle. The principle behind snowflaking is normalisation of the
dimension tables.

Snow-flake schema
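
Snowflaking can be illustrated by normalising one dimension. In the sketch below (Python's sqlite3 module; every table and column name is hypothetical), the region attribute of a store dimension is moved into its own dim_region table, so the region name is stored once and referenced by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The region attribute is moved out of the store dimension into its own table.
    CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE dim_store (
        store_id  INTEGER PRIMARY KEY,
        city      TEXT,
        region_id INTEGER REFERENCES dim_region(region_id)
    );
    CREATE TABLE fact_sales (
        store_id INTEGER REFERENCES dim_store(store_id),
        amount   REAL
    );

    INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
    INSERT INTO dim_store  VALUES (1, 'NYC', 1), (2, 'SFO', 2), (3, 'LA', 2);
    INSERT INTO fact_sales VALUES (1, 12.0), (1, 11.0), (3, 50.0);
    """
)

# Reaching the region name now takes two joins: fact -> store -> region.
amount_by_region = conn.execute(
    """
    SELECT r.region_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store  AS s ON s.store_id = f.store_id
    JOIN dim_region AS r ON r.region_id = s.region_id
    GROUP BY r.region_name
    ORDER BY r.region_name
    """
).fetchall()

print(amount_by_region)
```

The trade-off is visible in the query: less redundancy in the dimension, at the price of an additional join.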

Snowflake schema compared with star schema:

Ease of maintenance / change
  Snowflake: No redundancy, hence easier to maintain and change.
  Star: Has redundant data, hence less easy to maintain and change.

Ease of use
  Snowflake: More complex queries, hence less easy to understand.
  Star: Less complex queries, easy to understand.

Query performance
  Snowflake: More foreign keys, hence more query execution time.
  Star: Fewer foreign keys, hence less query execution time.

Type of data warehouse
  Snowflake: Good to use for the data warehouse core to simplify
  complex relationships (many:many).
  Star: Good for data marts with simple relationships (1:1 or
  1:many).

Joins
  Snowflake: Higher number of joins.
  Star: Fewer joins.

Dimension table
  Snowflake: May have more than one dimension table for each
  dimension.
  Star: Contains only a single dimension table for each dimension.

When to use
  Snowflake: When a dimension table is relatively big in size,
  snowflaking is better as it reduces space.
  Star: When a dimension table contains fewer rows, we can go for
  the star schema.

Normalization / de-normalization
  Snowflake: Dimension tables are in normalized form but the fact
  table is still in de-normalized form.
  Star: Both dimension and fact tables are in de-normalized form.

Data model
  Snowflake: Bottom-up approach.
  Star: Top-down approach.

Cube

A cube is a
multidimensional structure that contains information for analytical
purposes; the main constituents of a cube are dimensions and
measures. Dimensions define the structure of the cube that you use
to slice and dice over, and measures provide aggregated numerical
values of interest to the end user. As a logical structure, a cube
allows a client application to retrieve values of measures as if
they were contained in cells in the cube; cells are defined for
every possible summarized value. A cell in the cube is defined by
the intersection of dimension members and contains the aggregated
values of the measures at that specific intersection.

Benefit of Using Cubes

A cube provides a single place where all related data for analysis
is stored.
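
The idea that a cell exists for every possible summarized value can be sketched in a few lines of Python. The fact rows, the ALL marker, and the build_cube helper below are hypothetical and only illustrate the principle; they do not describe how any particular OLAP product stores its cubes.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (store, product, day, units_sold)
facts = [
    ("LA", "Bread", "M", 30),
    ("LA", "Bread", "M", 26),
    ("LA", "Milk", "T", 10),
    ("NY", "Bread", "M", 12),
]

ALL = "*"  # marker meaning "summarized over every member of this dimension"


def build_cube(rows):
    """Aggregate the measure into every cell, including summarized cells."""
    cube = defaultdict(int)
    for store, prod, day, units in rows:
        # Each fact contributes to 2**3 cells: every mix of member / ALL.
        for key in product((store, ALL), (prod, ALL), (day, ALL)):
            cube[key] += units
    return dict(cube)


cube = build_cube(facts)
print(cube[("LA", "Bread", "M")])  # one detailed cell
print(cube[("LA", ALL, ALL)])      # all sales in LA
print(cube[(ALL, ALL, ALL)])       # grand total
```

A client then reads any value by its coordinates, for example cube[("LA", ALL, ALL)], without knowing how the underlying rows were aggregated.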

3-D Cube

[Figure: fact table view and the corresponding multi-dimensional
cube (dimensions = 3), with layers for day 1 and day 2]

Example

[Figure: cube with dimensions Store (NY, SF, LA), Product (Juice,
Milk, Coke, Cream, Soap, Bread), and Time (M, T, W, Th, F, S, S);
56 units of bread sold in LA on M]

Dimensions: Time, Product, Store
Attributes: Product (upc, price, ...), Store ...
Hierarchies: Product -> Brand; Day -> Week -> Quarter; Store ->
Region -> Country
Roll-ups: roll-up to week, roll-up to brand, roll-up to region

Representation of Multi-Dimensional Data

OLAP database servers use multi-dimensional structures to store data
and relationships between data. Multi-dimensional structures are
best visualized as cubes of data, and cubes within cubes of data.
Each side of a cube is a dimension.

Representation of Multi-Dimensional Data

Multi-dimensional databases are a compact and easy-to-understand way
of visualizing and manipulating data elements that have many
inter-relationships. The cube can be expanded to include another
dimension, for example, the number of sales staff in each city.

The response time of a multi-dimensional query depends on how many
cells have to be added on-the-fly. As the number of dimensions
increases, the number of the cube's cells increases exponentially.
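
The growth in the number of cells follows from simple arithmetic: the number of base cells is the product of the member counts of all dimensions. A short Python sketch, with made-up dimension sizes:

```python
from math import prod


def cell_count(cardinalities):
    """Number of base cells = product of the member counts of all dimensions."""
    return prod(cardinalities)


# Hypothetical cube: 3 stores x 6 products x 7 days
base = cell_count([3, 6, 7])
# Adding a fourth dimension with 10 members multiplies the cell count by 10.
with_fourth_dimension = cell_count([3, 6, 7, 10])

# With n dimensions of 10 members each, the count is 10**n.
growth = [cell_count([10] * n) for n in range(1, 5)]

print(base, with_fourth_dimension, growth)
```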

Representation of Multi-Dimensional Data

Multi-dimensional OLAP supports common analytical operations, such
as:
Consolidation: involves the aggregation of data, such as roll-ups or
complex expressions involving interrelated data. For example, branch
offices can be rolled up to cities and rolled up to countries.
Drill-down: is the reverse of consolidation and involves displaying
the detailed data that comprises the consolidated data.
Slicing and dicing: refers to the ability to look at the data from
different viewpoints. Slicing and dicing is often performed along a
time axis in order to analyze trends and find patterns.
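
The operations above can be sketched over a small in-memory data set. The sales figures, the DIMS tuple, and the roll_up, slice_, and dice helpers are hypothetical names introduced for this illustration; real OLAP servers expose these operations through their own query languages and tools.

```python
from collections import defaultdict

# Hypothetical detailed data: (country, city, branch, month) -> sales
sales = {
    ("US", "NYC", "Branch 1", "Jan"): 10,
    ("US", "NYC", "Branch 2", "Jan"): 20,
    ("US", "LA", "Branch 3", "Jan"): 5,
    ("US", "NYC", "Branch 1", "Feb"): 15,
    ("UK", "London", "Branch 4", "Jan"): 7,
}
DIMS = ("country", "city", "branch", "month")


def roll_up(cells, keep):
    """Consolidation: sum the measure over every dimension not in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, value in cells.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


def slice_(cells, dim, member):
    """Slice: fix one dimension to a single member."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] == member}


def dice(cells, **allowed):
    """Dice: restrict several dimensions to sets of members."""
    return {
        k: v
        for k, v in cells.items()
        if all(k[DIMS.index(d)] in members for d, members in allowed.items())
    }


by_country = roll_up(sales, ["country"])        # branches rolled up to countries
by_city = roll_up(sales, ["country", "city"])   # drill down one level to cities
january = slice_(sales, "month", "Jan")         # slice along the time axis
us_jan = dice(sales, country={"US"}, month={"Jan"})
```

Drill-down is shown here simply as a roll-up that keeps one more level of the hierarchy, which mirrors its description as the reverse of consolidation.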

OLAP cube basics
Measures
Dimensions
Hierarchies
Levels

OLAP Implementation
Multidimensional OLAP (MOLAP)
Relational OLAP (ROLAP)
Hybrid OLAP (HOLAP)

Multi-dimensional OLAP (MOLAP)

MOLAP tools use specialized data structures and multi-dimensional
database management systems (MDDBMS) to organize, navigate, and
analyze data.
To enhance query performance, the data is typically aggregated and
stored according to predicted usage.
MOLAP data structures use array technology and efficient storage
techniques that minimize the disk space requirements through sparse
data management.

The development issues associated with MOLAP:
Only a limited amount of data can be efficiently stored and analyzed.
Navigation and analysis of data are limited because the data is
designed according to previously determined requirements.
MOLAP products require a different set of skills and tools to build
and maintain the database.
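
One simple way to picture sparse data management is to address cells by array-style coordinates while storing only the cells that actually hold a value. The sketch below uses a plain Python dictionary for this; the cube shape, the stored values, and the get_cell helper are assumptions made for illustration, and real MOLAP products typically use more elaborate storage schemes.

```python
# Hypothetical 3-dimensional cube: 100 stores x 1000 products x 365 days
shape = (100, 1000, 365)

# Sparse storage: keep only the cells that actually hold a value.
sparse_cells = {
    (0, 17, 3): 56,
    (42, 999, 364): 12,
}


def get_cell(cells, index, default=0):
    """Array-style lookup by coordinates; empty cells read as `default`."""
    if any(not 0 <= i < n for i, n in zip(index, shape)):
        raise IndexError(f"cell {index} is outside cube of shape {shape}")
    return cells.get(index, default)


dense_size = shape[0] * shape[1] * shape[2]  # cells a dense array would allocate
stored = len(sparse_cells)                   # cells the sparse form stores

print(get_cell(sparse_cells, (0, 17, 3)), dense_size, stored)
```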

Relational OLAP (ROLAP)

ROLAP is the fastest-growing type of OLAP tools.
ROLAP supports RDBMS products through the use of a metadata layer,
thus avoiding the requirement to create a static multi-dimensional
data structure. This facilitates the creation of multiple
multi-dimensional views of the two-dimensional relation.
To improve performance, some ROLAP products have enhanced SQL
engines to support the complexity of multi-dimensional analysis,
while others recommend, or require, the use of highly denormalized
database designs such as the star schema.

The development issues associated with ROLAP technology:
Performance problems associated with the processing of complex
queries that require multiple passes through the relational data.
Development of middleware to facilitate the development of
multi-dimensional applications.
Development of an option to create persistent multi-dimensional
structures, together with facilities to assist in the administration
of these structures.
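
The role of a metadata layer can be sketched as a small translator that turns a multi-dimensional request (dimensions plus a measure) into SQL against ordinary relational tables. Everything below is a hypothetical illustration using Python's sqlite3 module: the CUBE_META dictionary, the build_sql function, and the table names are invented and do not correspond to any specific ROLAP product.

```python
import sqlite3

# Hypothetical metadata: maps cube concepts onto relational tables and columns.
# It is assumed to be trusted configuration, not user input.
CUBE_META = {
    "fact_table": "fact_sales",
    "measures": {"amount": "SUM(fact_sales.amount)"},
    "dimensions": {
        "region": {
            "table": "dim_store",
            "column": "dim_store.region",
            "join": "dim_store.store_id = fact_sales.store_id",
        },
    },
}


def build_sql(meta, dims, measure):
    """Translate a multi-dimensional request into a SQL aggregate query."""
    if not dims:
        raise ValueError("at least one dimension is required in this sketch")
    cols = [meta["dimensions"][d]["column"] for d in dims]
    joins = " ".join(
        f"JOIN {meta['dimensions'][d]['table']} ON {meta['dimensions'][d]['join']}"
        for d in dims
    )
    group = ", ".join(cols)
    return (
        f"SELECT {group}, {meta['measures'][measure]} "
        f"FROM {meta['fact_table']} {joins} GROUP BY {group} ORDER BY {group}"
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_sales (store_id INTEGER, amount REAL);
    INSERT INTO dim_store VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales VALUES (1, 12.0), (1, 11.0), (2, 50.0);
    """
)

sql = build_sql(CUBE_META, ["region"], "amount")
result = conn.execute(sql).fetchall()
print(sql)
print(result)
```

No multi-dimensional structure is materialized here: each request is answered by a fresh relational query, which is also why complex requests can become slow.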

HOLAP

A hybrid of ROLAP and MOLAP.
Can be thought of as a virtual database whereby the higher levels of
the database are implemented as MOLAP and the lower levels of the
database as ROLAP.

Hybrid OLAP (HOLAP)

HOLAP tools provide limited analysis capability, either directly
against RDBMS products, or by using an intermediate MOLAP server.
HOLAP tools deliver selected data directly from the DBMS or via a
MOLAP server to the desktop (or local server) in the form of a data
cube, where it is stored, analyzed, and maintained locally.

The issues associated with HOLAP tools:
The architecture results in significant data redundancy and may
cause problems for networks that support many users.
The ability of each user to build a custom data cube may cause a
lack of data consistency among users.
Only a limited amount of data can be efficiently maintained.

MOLAP (Multidimensional Online Analytical Processing)

The MOLAP storage mode causes the aggregations of the partition and
a copy of its source data to be stored in a multidimensional
structure in Analysis Services when the partition is processed. This
MOLAP structure is highly optimized to maximize query performance.
The storage location can be on the computer where the partition is
defined or on another computer running Analysis Services. Because a
copy of the source data resides in the multidimensional structure,
queries can be resolved without accessing the partition's source
data. Query response times can be decreased substantially by using
aggregations. The data in the partition's MOLAP structure is only as
current as the most recent processing of the partition.

ROLAP (Relational Online Analytical Processing)

The ROLAP storage mode causes the aggregations of the partition to
be stored in indexed views in the relational database that was
specified in the partition's data source. Unlike the MOLAP storage
mode, ROLAP does not cause a copy of the source data to be stored in
the Analysis Services data folders. Instead, when results cannot be
derived from the query cache, the indexed views in the data source
are accessed to answer queries. Query response is generally slower
with ROLAP storage than with the MOLAP or HOLAP storage modes.
Processing time is also typically slower with ROLAP. However, ROLAP
enables users to view data in real time and can save storage space
when you are working with large datasets that are infrequently
queried, such as purely historical data.

HOLAP (Hybrid Online Analytical Processing)

The HOLAP storage mode combines attributes of both MOLAP and ROLAP.
Like MOLAP, HOLAP causes the aggregations of the partition to be
stored in a multidimensional structure in an SQL Server Analysis
Services instance. HOLAP does not cause a copy of the source data to
be stored. For queries that access only summary data in the
aggregations of a partition, HOLAP is the equivalent of MOLAP.
Queries that access source data (for example, if you want to drill
down to an atomic cube cell for which there is no aggregation data)
must retrieve data from the relational database and will not be as
fast as they would be if the source data were stored in the MOLAP
structure. With HOLAP storage mode, users will typically experience
substantial differences in query times depending upon whether the
query can be resolved from cache or aggregations versus from the
source data
itself.

[Embedded spreadsheet: example sales data with customer, product,
store, and sale tables, and related city, region, and store-type
tables]