BI Lecture Number 9

Apr 14, 2018

Vaibhav Gupta
  • 7/27/2019 BI Lecture Number 9

    1/70

    Dr. N.P. Singh, Professor (IT), 15.10.13

    Subject oriented
    Integrated
    Near-current data delivery
    Current data
    Detailed

    An ODS is an environment where data from different operational databases is integrated.

    The purpose is to provide the end-user community with an integrated view of enterprise data.

    It enables the user to address operational challenges that span more than one business function.
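
    As a rough illustration of what an integrated view means, the following Python sketch merges records from two operational sources into one view per customer. The source names (billing, crm), the field names, and the integrate function are hypothetical and not taken from the lecture.

```python
from collections import defaultdict

def integrate(sources):
    """Merge rows from several operational sources into one view per customer."""
    view = defaultdict(lambda: {"name": None, "products": []})
    for source_name, rows in sources.items():
        for row in rows:
            entry = view[row["customer_id"]]
            # Keep the first non-empty name seen; a real ODS would apply
            # explicit transformation and conflict-resolution rules here.
            entry["name"] = entry["name"] or row.get("name")
            entry["products"].append((source_name, row["product"]))
    return dict(view)

# Hypothetical sample data from two operational systems.
sources = {
    "billing": [
        {"customer_id": "C1", "name": "A. Rao", "product": "Savings account"},
    ],
    "crm": [
        {"customer_id": "C1", "product": "Credit card"},
        {"customer_id": "C2", "name": "B. Mehta", "product": "Home loan"},
    ],
}
ods_view = integrate(sources)
```

    Here ods_view["C1"] lists products from both systems, which is the kind of cross-function view that no single operational database holds on its own.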

    It is the right place to keep a central version of reference data that can be shared among different application systems.

    One way could be that the applications access the data in the ODS directly.

    Another way is to replicate data changes from the ODS into the databases of the legacy systems.

    The ODS can help to integrate new and existing systems.

    The ODS may shorten the time required to populate a DW, because part of the integrated data already resides in the ODS.

    The ODS provides improved accessibility to critical operational data.

    With an ODS, organizations have a complete view of their financial metrics and customer transactions.

    This is useful for a better understanding of the customer and for making well-informed business decisions.

    The ODS can provide the ability to request product and service usage data on a real-time or near-real-time basis.

    Operational reports can be generated with improved performance in comparison to the legacy systems.

    Frequency is how often the ODS is updated, quite possibly from completely different legacy systems using distinct population processes. It also takes into account the volume of updates that are occurring.

    Velocity is the speed with which an update must take place, from the point in time a legacy system change occurs to the point in time that it must be reflected in the ODS.
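
    The two measures can be made concrete with a small Python sketch. It assumes each update is logged as a pair of timestamps (when the legacy change occurred, when it was applied to the ODS); the function names, the sample timestamps, and the two-second target are illustrative assumptions only.

```python
from datetime import datetime, timedelta

def updates_per_hour(events, window_start, window_end):
    """Frequency: ODS updates applied per hour within [window_start, window_end)."""
    hours = (window_end - window_start).total_seconds() / 3600
    applied = [e for e in events if window_start <= e[1] < window_end]
    return len(applied) / hours

def worst_lag(events):
    """Velocity achieved: longest delay from legacy change to ODS apply."""
    return max(applied_at - changed_at for changed_at, applied_at in events)

# Hypothetical update log: (legacy change time, ODS apply time).
t0 = datetime(2013, 10, 15, 9, 0, 0)
events = [
    (t0, t0 + timedelta(seconds=2)),
    (t0 + timedelta(minutes=10), t0 + timedelta(minutes=10, seconds=1)),
    (t0 + timedelta(minutes=30), t0 + timedelta(minutes=30, seconds=5)),
]
frequency = updates_per_hour(events, t0, t0 + timedelta(hours=1))
lag = worst_lag(events)
# Assumed velocity requirement of two seconds, chosen only for the example.
meets_two_second_target = lag <= timedelta(seconds=2)
```

    In this sample the frequency is three updates per hour, while the slowest update takes five seconds and so misses the assumed two-second velocity target.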

    How to position the ODS within the BI architecture

    ODS in the DSS Environment: Corporate Information Factory

    Class I: transactions are moved to the ODS from the applications immediately, within 1 to 2 seconds from the moment the transaction is executed in the operational environment until it arrives at the ODS. In this case, the end user can hardly tell the difference between an activity that occurred in the operational environment and the same activity as transmitted to the ODS environment.

    Class II: activities that occur in the operational environment are stored and forwarded to the ODS every four hours or so. In this case, there is a noticeable lag between the original execution of the transaction and its reflection in the ODS environment. However, this class of ODS is much easier to build and operate than a class I ODS.

    Class III: the time lag between execution in the operational environment and reflection in the ODS is not four hours or so, but overnight. There is a noticeable time lag between the execution of the transaction in the operational environment and its reflection in the ODS environment. This type of ODS is relatively easy to build.

    Class IV: the ODS is fed from the data warehouse, using analysis created by the DSS analyst in the data warehouse environment and condensed down to a point where the results of the analytical processing fit comfortably in the ODS. The input to the ODS can be either regular or irregular. This class of ODS is very easy to build as long as the data warehouse has already been constructed.
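
    The four classes differ mainly in how the ODS is fed and how long the feed takes, which the following Python sketch turns into a simple lookup. The cut-off values (2 seconds, 4 hours) are assumptions derived from the approximate figures above, and the names OdsClass and classify_ods are hypothetical.

```python
from datetime import timedelta
from enum import Enum

class OdsClass(Enum):
    CLASS_I = "I"      # near-immediate transfer
    CLASS_II = "II"    # store and forward every few hours
    CLASS_III = "III"  # overnight transfer
    CLASS_IV = "IV"    # fed from the data warehouse

def classify_ods(lag=None, fed_from_warehouse=False,
                 class_i_limit=timedelta(seconds=2),
                 class_ii_limit=timedelta(hours=4)):
    """Map a feed description to an ODS class; the limits are assumed cut-offs."""
    if fed_from_warehouse:
        return OdsClass.CLASS_IV
    if lag is None:
        raise ValueError("lag is required unless the ODS is fed from the warehouse")
    if lag <= class_i_limit:
        return OdsClass.CLASS_I
    if lag <= class_ii_limit:
        return OdsClass.CLASS_II
    return OdsClass.CLASS_III
```

    Because "four hours or so" is only approximate, the limits are parameters rather than constants, so they can be adjusted to a particular installation.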

    Insurance
    Retail
    Banking
    Telecommunications

    How can I provide an up-to-date view of insurance products owned by each customer for our Customer Relationship Management (CRM) system?

    How can I consolidate all the information required to solve customer problems?

    How can I decrease the turn-around time for quotes?

    How can I reduce the time it takes to produce claim reports?

    How can we give suppliers the ability to co-manage our inventory?

    What inventory items should I be adjusting throughout the day?

    How can my customers track their own orders through the Web?

    What are my customers ordering across all subsidiaries?

    What is the buying potential of my customer at the point of sale?

    What is the complete credit picture of my customer, so I can grant an immediate increase?

    How can we provide customer service with a consolidated view of all products and transactions?

    How can we detect credit card fraud while the transaction is in progress?

    What is the current consolidated profitability status of a customer?

    Can we identify what our Web customers are looking for in real time?

    Which calling cards are being used for fraudulent calls?

    What are the current results of my campaign?

    How can we quickly monitor calling patterns after the merger?

    How do I provide an up-to-date view of cross-functional information for a particular business process when the data is spread across several disparate sources?

    To maximize customer satisfaction and profitability, the ultimate data store would contain all of the organization's operational data.

    This, of course, is neither economical nor technically feasible in one place with a single technology solution.

    The ODS seems to have the characteristics of an On-Line Transactional Processing (OLTP) system while at the same time accommodating some of the attributes of a data warehouse (for example, integrating and transforming data from multiple sources).

    Transferring the data
    Data characteristics
    The ODS environment
    ODS administration and maintenance

    Analyzing the business requirements
    Defining the ODS type needed
    Data modeling
    Defining and describing the different ODS layers

    The business scenarios were created from the following three business questions:

    Banking/finance: What is my customer's entire product portfolio?

    Retail: How can my customers track their own orders through the Web?

    Telecommunications: Which calling cards are being used for fraudulent calls?

    The figure describes the data flow for the order maintenance business scenario.

    Data is integrated and transformed from multiple heterogeneous data sources and used to populate the order maintenance ODS.

    An order maintenance application will be used by both customers and the customer service department to access and update the ODS.

    Changes made to the ODS through the order maintenance application will flow back to the source systems using a trigger and apply mechanism.

    Regularly scheduled reports will be created for the inventory management department.

    The figure represents the data flow for the consolidated call information scenario.

    Data is integrated and transformed from multiple homogeneous data sources and used to populate the calling transaction ODS. This data flow into the ODS is real-time.

    A custom-built fraud application will be used to verify calls and trigger customer service when a suspect call is identified.

    The existing customer service and billing applications will be migrated to the ODS, eliminating their data stores.

    A follow-on phase will eliminate the customer data store.

    An ODS type A includes real-time (or near-real-time) legacy data access and localized updates (data modifications are not fed back to the legacy systems). The localized updates would typically include new data not currently captured in the operational systems.

    An ODS type B includes the characteristics of an ODS type A along with a trigger and apply mechanism to feed data back to the operational systems. Typically these feedback requirements would be very specific to minimize conflicts.

    An ODS type C is either fully integrated with the legacy applications or uses real-time update and access.

    The ODS can be directly updated by front-end applications (such as Campaign Management, Customer Service, Call Center) or by the user directly through an application interface (such as a new Web application).

    The ODS can be a source of data for the warehouse. Batch processes will be used to populate the data warehouse.

    The ODS complements or extends the operational systems. It is not intended to replace them.

    Although most sources will be used to populate both the ODS and the data warehouse, two data acquisition streams will probably exist due to the temporal differences in the data required. For example, the data warehouse may require a monthly inventory snapshot whereas the ODS may require an up-to-the-minute inventory status.

    Data flows from the operational systems to the ODS through the data acquisition layer.

    Updates to the ODS can be real-time, store and forward, and/or batch.

    In a real-time environment, changes are applied to the ODS immediately, for example, using the same operational application.

    A store and forward scheme may use tools such as replication or messaging to populate the ODS.

    Changes which are only required daily, for example, could use a normal batch process.

    Operational systems are not updated from an ODS type A.
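
    The three update styles can be contrasted in a short Python sketch. It models the ODS as an in-memory dictionary and the store and forward buffer as a list; the class OdsLoader and its method names are hypothetical, and a real data acquisition layer would rely on replication or messaging products instead.

```python
class OdsLoader:
    """Toy data acquisition layer showing three ways of updating an ODS."""

    def __init__(self):
        self.ods = {}      # key -> latest value held in the ODS
        self.outbox = []   # changes stored until the next forward

    def apply_real_time(self, key, value):
        # Real-time: the change is visible in the ODS immediately.
        self.ods[key] = value

    def store(self, key, value):
        # Store and forward, step 1: remember the change for later.
        self.outbox.append((key, value))

    def forward(self):
        # Store and forward, step 2: apply stored changes in arrival order.
        count = len(self.outbox)
        for key, value in self.outbox:
            self.ods[key] = value
        self.outbox.clear()
        return count

    def load_batch(self, rows):
        # Batch: apply a whole set of changes at once, for example nightly.
        self.ods.update(rows)

loader = OdsLoader()
loader.apply_real_time("stock:widget", 40)
loader.store("stock:gadget", 12)
visible_before_forward = "stock:gadget" in loader.ods
forwarded = loader.forward()
loader.load_batch({"price:widget": 9.99})
```

    The stored change becomes visible only after forward() runs, which is the lag that separates store and forward from real-time updating.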

    The ODS type B includes the characteristics of an ODS type A plus the additional feature of an asynchronous triggering mechanism.

    This triggering mechanism is used to send ODS changes back to the operational systems.
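
    A minimal Python sketch of such a trigger and apply mechanism follows. It assumes that only a short list of fields is fed back (in line with keeping feedback requirements very specific) and uses a standard-library queue to decouple the ODS update from the later apply step; TypeBOds, apply_pending, and the field names are hypothetical.

```python
import queue

class TypeBOds:
    """ODS whose updates to selected fields are queued for the legacy systems."""

    def __init__(self, feedback_fields):
        self.rows = {}
        self.feedback_fields = set(feedback_fields)
        self.pending = queue.Queue()   # changes awaiting the apply step

    def update(self, key, field, value):
        self.rows.setdefault(key, {})[field] = value
        # Trigger: only the agreed feedback fields are sent back.
        if field in self.feedback_fields:
            self.pending.put((key, field, value))

def apply_pending(ods, operational_db):
    """Apply step: drain queued changes into the operational system."""
    applied = 0
    while True:
        try:
            key, field, value = ods.pending.get_nowait()
        except queue.Empty:
            return applied
        operational_db.setdefault(key, {})[field] = value
        applied += 1

ods = TypeBOds(feedback_fields={"address"})
ods.update("C1", "address", "12 Park Road")
ods.update("C1", "call_note", "asked about delivery")  # stays local to the ODS
legacy_db = {}
applied_count = apply_pending(ods, legacy_db)
```

    The apply step runs separately from the update, so the operational system sees the change some time after the ODS does; in practice it could run on a worker thread or a scheduler.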

    Data flows back and forth between the data sources and the ODS through the data acquisition layer on a real-time basis.

    The ODS becomes the single source for much of the corporation's key operational data.

                    ODS               DW
    Organization    Subject           Subject
    Users           Large number      Few
    Size            Small             Very large
    Growth          20-30% p.a.       50-180% p.a.
    Structure       Normalized        Denormalized
    Update          Several           None
    Volatile        Yes               No
    Metadata        Yes               Yes
    Design          Process driven    Data driven

    Issue: How built
      Operational: One application at a time in the legacy environment, or one subject area at a time in the ODS
      Warehouse: One or more subject areas at a time

    Issue: Requirements
      Operational: Known
      Warehouse: Vague

    Issue: Data access
      Operational: Smaller number of rows retrieved in a single call
      Warehouse: Large set of data is scanned to retrieve results

    Issue: Critical to
      Operational: Daily business operation
      Warehouse: Management decisions that may affect profitability

    Issue: Tuning
      Operational: Highly tuned for frequent access to small amounts of data
      Warehouse: Tuned for infrequent access to larger quantities of data

    Issue: Data volume
      Operational: Volume needed for daily operation
      Warehouse: Larger volumes needed to support statistical analysis, forecasting, ad hoc reporting, and querying

    Issue: Data retention
      Operational: Data retrieved to meet daily requirements
      Warehouse: Data retained longer to support historical reporting, comparison, analysis, etc.

    Issue: Data currency
      Operational: Must be up to the minute
      Warehouse: Usually does not require as high availability as the production environment unless worldwide access is necessary

    Data Warehouse: Designed for analysis of business measures by categories and attributes
    OLTP: Designed for real-time business operations

    Data Warehouse: Optimized for bulk loads and large, complex, unpredictable queries that access many rows per table
    OLTP: Optimized for a common set of transactions, usually adding or retrieving a single row at a time per table

    Data Warehouse: Loaded with consistent, valid data; requires no real-time validation
    OLTP: Optimized for validation of incoming data during transactions; uses validation data tables

    Data Warehouse: Supports few concurrent users relative to OLTP
    OLTP: Supports thousands of concurrent users

    On the one hand, the ODS is decidedly operational. It provides fast response times and high availability and is certainly qualified to act as the basis of mission-critical systems.

    On the other hand, the ODS has some very clear DSS features.

    The ODS is integrated and subject oriented, and it supports some important kinds of decision support processing.

    The ODS sits between the legacy applications and the DW.

    It is fed by integration and transformation (I/T) programs. These programs may be the same ones that feed the DW, or different ones.

    The ODS feeds data into the data warehouse.

    Some operational data traverses directly to the DW through the I/T layers.

    Some data passes from the operational foundations into the I/T layers, then to the ODS, and on to the DW.

    The ODS enables integrated, collective online processing.

    It supports online updates.

    It integrates many applications.

    It provides a view of the enterprise.

    It provides decision support processing.

    Complex structure
    Underlying technology
    Design
    Monitoring and maintaining

    Two types of users:

    Farmers perform the same task repetitively, look for a small amount of data, always get what they are looking for, and work in a structured world:
      Structured data
      Structured processing
      Structured procedures, and so forth

    Explorers are the antithesis of farmers: they operate in a random manner, do not know what they are looking for, operate in a heuristic mode, and look at very large sets of data. They look for:
      Associations
      Patterns
      Relationships not yet discovered

    Nothing

    Huge gold mines

    Unstructured manner

    The design must satisfy the needs of both types of users.

    Classical design:

    The DSS environment starts with a data model, which reflects the informational needs of the corporation.

    From the data model, normalized tables are generated. These tables are known as the logical model.

    The tables are combined into a form of physical design that can be termed a lightly normalized design.

    Tables are combined on the basis of containing common keys and general common usage.

    There is a fly in the ointment of this approach:
      Performance suffers where many tables must be joined.
      Performance suffers where there are many occurrences of the data.
      Users may find it unnatural to join many tables.

    Second approach: volume and usage

    When the volume and usage of the data are factored into the design, a mutant form of normalization is achieved.

    The normalization turns into heavy normalization.

    A structure known as a star join is created.

    Star join:

    A star join has two parts:

    Fact tables represent the structure that holds the majority of the occurrences of the data. They combine data and cross-reference keys from a variety of other tables.

    Dimension tables contain data that is not terribly voluminous and are related to the fact tables by foreign keys.

    Fact tables are efficient to access because the data has been pre-joined into the table at the moment of loading.
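
    The fact and dimension layout can be tried out with Python's built-in sqlite3 module. The sketch below is only an illustration: the table names (fact_sales, dim_product, dim_store), the columns, and the sample rows are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: small, descriptive.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);

-- Fact table: most of the rows, with foreign keys to each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id INTEGER REFERENCES dim_store(store_id),
    quantity INTEGER,
    amount REAL
);

INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_store VALUES (10, 'Delhi'), (20, 'Mumbai');
INSERT INTO fact_sales VALUES
    (1, 10, 3, 30.0),
    (1, 20, 1, 10.0),
    (2, 10, 2, 50.0);
""")

def sales_by_city(connection):
    """Join the fact table to one dimension and total the sales per city."""
    return connection.execute("""
        SELECT s.city, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY s.city
        ORDER BY s.city
    """).fetchall()

totals = sales_by_city(conn)
```

    Each query touches the central fact table plus only the dimensions it needs, so a question about cities never has to read dim_product.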

    Star join:

    Usage of the data must be known in advance.

    Without knowing the pattern of access and usage of the data, it is difficult to design the fact tables.

    One department may look at the same data differently from another.

    The star join for finance may be different from the star join for production.

    Normalized structure               Star structure
    Inefficient to access              Efficient to access
    Holds modest amounts of data       Holds large amounts of data
    Applicable to a wide audience      Applicable to a restricted audience
    Handles updates                    Does not handle updates

    Because the ODS environment serves both the operational and the DSS environment, the ODS is built with both a waterfall operational methodology and a spiral DSS methodology.

    Waterfall methodology:
      Requirements gathering and assimilation
      Analysis and systemization
      Design
      Programming
      Testing
      Implementation

    Legacy systems

    ETL Tools

    Operational data store

    Access Tools

    Legacy systems: ERP, CRM, Web, or any legacy system in which operational data is recorded.

    ETL tools: These tools are used to extract, transform, and load data from legacy systems to operational data stores.

    Operational data store

    BI tools: for analyzing the data and generating reports
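
    The extract, transform, and load steps can be sketched in a few lines of Python. This is a toy pipeline rather than a real ETL tool: the CSV export, its column names, and the cleaning rules are assumptions made up for the example, and the ODS is modeled as a plain dictionary.

```python
import csv
import io

# Hypothetical export from a legacy order system.
LEGACY_EXPORT = """order_id,customer,amount
1001, alice ,19.50
1002,BOB,5.00
"""

def extract(csv_text):
    """Extract: read raw rows from the legacy export."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: fix types and normalize customer names."""
    return [
        {
            "order_id": int(row["order_id"]),
            "customer": row["customer"].strip().title(),
            "amount": float(row["amount"]),
        }
        for row in rows
    ]

def load(ods, rows):
    """Load: store each cleaned row in the ODS, keyed by order id."""
    for row in rows:
        ods[row["order_id"]] = row
    return ods

ods = load({}, transform(extract(LEGACY_EXPORT)))
```

    Keeping the three steps as separate functions mirrors the way the tools are described above and lets each step be tested independently.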

    Legacy systems: data is extracted from e-mail, direct mail, telemarketing, kiosks, stores, call centers, and the Web using ETL tools and stored in operational data stores.

    Operational data store.

    Data warehouse.

    BI tools or analytical tools.

    Gartner introduced the concept of a zero-latency strategy, which means any strategy that exploits the immediate exchange of information across technical and organizational boundaries to achieve business benefit.

    Organizations that can make decisions based on up-to-the-second information and apply those decisions to operational systems and business processes are known as zero-latency enterprises (ZLE).

    Pull & Push Process