(2½ hours)
Total Marks: 75
N. B.: (1) All questions are compulsory.
(2) Make suitable assumptions wherever necessary and state the assumptions made.
(3) Answers to the same question must be written together.
(4) Numbers to the right indicate marks.
(5) Draw neat labeled diagrams wherever necessary.
(6) Use of non-programmable calculators is allowed.
1. Attempt any two of the following: 10
a. What are operational databases? Explain the following characteristics of data in a data warehouse.
i) Subject-oriented ii) Integrated iii) Time-variant iv) Non-volatile
Operational databases are often used for on-line transaction processing (OLTP). They deal
with day-to-day operations such as banking, purchasing, manufacturing, registration,
accounting, etc. These systems typically get data into the database. Each transaction
processes information about a single entity. The purpose of these queries is to support
business operations.
Features of Data warehouse
• Subject-oriented: A data warehouse is organized around major subjects, such as
customer, vendor, product, and sales. It focuses on the modeling and analysis of data
rather than day-to-day business operations.
• Integrated: A data warehouse is constructed by integrating data from multiple
heterogeneous data sources.
• Time-variant: A data warehouse is a repository of historical data. It gives the view of
the data for a designated time frame.
• Non-volatile: A data warehouse is always a physically separate store of data
transformed from the application data found in the operational environment. Due to
this separation, a data warehouse does not require transaction processing, recovery,
and concurrency control mechanisms.
b. Explain virtual data warehouse in detail.
This option provides end users with direct access to multiple operational databases
through middleware tools. That is, it provides on-the-fly data for decision support
purposes. The end users can generate "summarized data" reports for their data analysis.
The advantages of this approach are:
– Easy to build
– Elimination of the time and expense of developing a traditional data warehouse
– Flexibility
– No data redundancy
– Provides end users with the most current corporate information
The drawbacks of this approach include:
– Repetitive transformation and integration operations
– Impacts to source systems
– Loss of historical perspective
Source: ...muresults.net/itacademic/TYIT6/Nov17/DWSS.pdf
Where datetime-exp is the expression to be converted to text and datetime-fmt
is the format of the resulting text.
Example:
TO_CHAR(SALE_DATE, 'Month DD, YYYY') would return the sale date as 'April 07, 2009'
(Oracle blank-pads the month name to nine characters unless the FM modifier is used).
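The same idea, a date value plus a format string giving text, can be sketched outside Oracle. The following is only a Python analogue: the strftime codes are Python's, not Oracle format elements, and sale_date is a made-up value standing in for SALE_DATE.

```python
from datetime import date

# Hypothetical sale date, standing in for the SALE_DATE column.
sale_date = date(2009, 4, 7)

# %B = full month name, %d = zero-padded day, %Y = four-digit year.
# Month names follow the active locale; the default "C" locale gives English.
formatted = sale_date.strftime("%B %d, %Y")
print(formatted)
```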
b. Write a detailed note on data object validation in OWB.
• The process of validation is all about making sure the objects and mappings we've
defined in the Warehouse Builder have no obvious errors in design.
• Oracle Warehouse Builder runs a series of validation tests to ensure that data object
definitions are complete and that scripts can be generated and deployed.
• When these tests are complete, the results are displayed.
• Oracle Warehouse Builder enables you to open object editors and correct any invalid
objects before continuing.
• Validating objects and mappings can be done with the help of the Design Center.
• Validation of repository objects can be done with the help of the Data Object Editor.
• Validation of mappings can be done through the Mapping Editor.
• The validation will result in one of the following three possibilities:
– The validation completes successfully with no warnings and/or errors.
– The validation completes successfully, but with some non-fatal warnings.
– The validation fails due to one or more errors.
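The three possible outcomes can be illustrated with a small sketch. This is not how OWB implements validation; the ValidationResult class, the validate_table function, and both checks are invented here only to show how warnings and errors lead to one of the three results.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    """Hypothetical container for the messages produced by a validation run."""
    warnings: list = field(default_factory=list)
    errors: list = field(default_factory=list)

    @property
    def status(self):
        # Any error means failure; warnings alone still count as success.
        if self.errors:
            return "FAILED"
        if self.warnings:
            return "SUCCESS_WITH_WARNINGS"
        return "SUCCESS"


def validate_table(name, columns):
    """Run toy checks on a table definition given as {column_name: data_type}."""
    result = ValidationResult()
    if not columns:
        result.errors.append(f"{name}: table has no columns")
    for column, data_type in columns.items():
        if not data_type:
            result.errors.append(f"{name}.{column}: missing data type")
        elif data_type == "VARCHAR2":
            # Invented rule: a character column without a length is only a warning.
            result.warnings.append(f"{name}.{column}: no length given")
    return result


print(validate_table("SALES", {"ID": "NUMBER"}).status)
print(validate_table("SALES", {"ID": "NUMBER", "NAME": "VARCHAR2"}).status)
print(validate_table("EMPTY", {}).status)
```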
c. What is constant operator? Explain steps to create constants in mapping.
The Constant operator enables you to generate constant values. The outputs of the
Constant operator are constants.
• Drag a Constant Operator onto the mapping canvas.
• The Constant operator only allows output, so there is no input group defined or
allowed.
• The outputs from this operator are constants.
• Multiple constants can be defined using a single Constant operator.
• Right click on the constant operator and select Open Details... from the pop up
to open the CONSTANT editor dialog box.
• This dialog box helps in defining constant values as output.
• The Output Attributes tab of the CONSTANT editor allows you to add/remove
constants.
• Let us add two Output Attributes X and Y.
• Click on X and enter the value 10 for the Expression property (through the Properties
window).
• Similarly, click on Y and enter the value 20 for the Expression property.
• The next step is to connect our constants X and Y to target attributes where
constant values are needed.
d. Explain various functions of control center manager.
6. Attempt any two of the following: 10
a. Explain clipboard and recycle bin features of oracle warehouse builder.
Clipboard
The Clipboard is a concept that OWB has borrowed from operating systems. It lets you
cut, copy and paste objects. The Clipboard is a temporary storage
area for objects that you have copied or moved from one project and plan to use
somewhere else.
To view the content of the Clipboard:
– select Clipboard from the Tools menu
– or press the F8 key
Recycle Bin
• The Recycle Bin in OWB is similar to recycle bin in operating systems.
• OWB keeps deleted objects in the recycle bin.
• The deleted objects can be restored from the Recycle Bin.
• To undo a deletion, select an object from the Recycle Bin and click Restore.
• If the "Put in Recycle Bin" check box is checked while deleting, the object will be
sent to the Recycle Bin.
• The Warehouse Builder Recycle Bin window can be opened by clicking on the Tools
menu and selecting the Recycle Bin option.
• This window has a content area which shows all deleted objects.
• The content is shown with Object Parent as well as Time Deleted information.
• The Object Parent is the project from which the object was deleted, and the Time
Deleted is when the object was deleted.
• Below the content area it has two buttons:
– One for restoring a deleted object
– Another for emptying the content of recycle bin
b. What is Metadata Loader? What are its benefits?
The workspace objects can be exported to a file. We can export anything from an entire
project down to a single data object or mapping. The benefits of Metadata
Loader exports and imports are:
– Backup
– To transport metadata definitions to another repository for loading
If we choose an entire project or a collection such as a node or module, it will export all
objects contained within it. If we choose any subset, it will also export the context of the
objects so that it will remember where to put them on import.
Say for example if you export a table, the metadata also contains the definition for:
– the module in which it resides
– the project the module is in
We can also choose to export any dependencies on the object being exported if they exist.
To export an object, select the object to be exported and click Design | Export |
Warehouse Builder Metadata from the main menu. This launches the Metadata Export
dialog box, which lists all objects selected in the explorer for export. We can
specify the name of the export file. The dialog box also has an option to export
all dependencies of the object being exported.
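The idea that an export carries the object's context (its module and project) and, optionally, its dependencies can be sketched as follows. This is only an illustration: the JSON layout, the export_object function, the SALES_MODULE name and the depends_on key are assumptions made for this sketch and do not reflect the real Metadata Loader file format. Dependencies are assumed to live in the same module.

```python
import json

# Hypothetical workspace: project -> module -> object definitions.
workspace = {
    "MY_PROJECT": {
        "SALES_MODULE": {
            "SALES": {"type": "table", "depends_on": ["PRODUCTS"]},
            "PRODUCTS": {"type": "table"},
        }
    }
}


def export_object(workspace, project, module, obj_name, include_dependencies=False):
    """Return a JSON string holding the object plus the context needed on import."""
    obj = workspace[project][module][obj_name]
    export = {
        # The context records where the object belongs when it is imported again.
        "context": {"project": project, "module": module},
        "objects": {obj_name: obj},
    }
    if include_dependencies:
        for dependency in obj.get("depends_on", []):
            export["objects"][dependency] = workspace[project][module][dependency]
    return json.dumps(export, indent=2)


exported = json.loads(
    export_object(workspace, "MY_PROJECT", "SALES_MODULE", "SALES", include_dependencies=True)
)
print(exported["context"])
print(sorted(exported["objects"]))
```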
c. Explain data density and data sparsity with example.
d. Explain ROLAP and its merits.
ROLAP (relational OLAP) uses a standard relational database to store the physical data.
The data is stored in a special structure known as a star schema or snowflake schema.
A star schema consists of a central fact table containing measures and a set of dimension
tables with the hierarchy defined by child-parent columns. In the star schema model, the
fact table is at the center of the star and the dimension tables are the points of the star.
The dimension tables are then relationally joined with the fact table to allow
multidimensional queries. The data is retrieved from the relational database into the
client tool by SQL queries.
• Advantages:
– Because it utilizes a relational database, ROLAP can support massive
amounts of data
– Same technology as existing source systems (source systems are RDBMS
based)
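A minimal sketch of a star schema queried through SQL is given below. It uses Python's built-in sqlite3 module only so that it can be run anywhere; the table names, columns and rows are invented for illustration and are not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table (measures) in the center, two dimension tables as the points.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Desk', 'Furniture');
INSERT INTO dim_date VALUES (1, 2017, 10), (2, 2017, 11);
INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 200.0);
""")

# A multidimensional question (sales by category and year) answered by
# joining the fact table to its dimension tables.
rows = conn.execute("""
SELECT p.category, d.year, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
JOIN dim_date d ON d.date_id = f.date_id
GROUP BY p.category, d.year
ORDER BY p.category
""").fetchall()
print(rows)
```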
4. Attempt any three of the following: 15
a. Differentiate between OLTP database and Data warehouse database.
OLTP database | Data warehouse database
Application Oriented | Subject Oriented
Detailed data | Summarized and refined
Designed for real-time business transactions and concurrent processes | Designed for analysis of business
Isolated Data | Integrated Data
Repetitive access | Ad-hoc access
Performance Sensitive | Performance relaxed
Few Records accessed at a time | Large volumes accessed at a time
Optimized for common and known set of transactions, usually intensive nature; addition, updation and deletion of rows at a time per table. | Optimized for bulk loads and complex, unpredictable queries that access many rows per table
Database Size 100 MB to 100 GB | Database Size 100 GB to few terabytes
Very minimal historical data | Current as well as historical data
b. What is Oracle Warehouse Builder? Explain the significance of projects and
modules in OWB.
Oracle Warehouse Builder is an ETL tool produced by Oracle that offers a graphical
environment to build, manage and maintain data integration processes in business
intelligence systems.
The OWB design objects are organized under a project, which provides a means of
structuring the objects for security and reusability. Each project contains nodes for each
type of design object that you can create or import. These projects are stored in a
workspace, so a project has to exist before data can be extracted. A default project called
MY_PROJECT is automatically created when you create a workspace. Alternatively,
you can rename MY_PROJECT or define more projects. A project will contain one or
more modules. Therefore, before you import source metadata into Warehouse Builder,
create a module that will contain these metadata definitions. The type of module you
create depends on the source from which you are importing metadata. Oracle
Warehouse Builder supports Oracle, Non-Oracle and Flat File modules to import metadata
definitions from Oracle databases, non-Oracle databases and flat files respectively.
c. What is time dimension? Explain various steps to create it through time dimension
wizard.
The Time/Date dimension provides the time series information to describe warehouse
data. Most of the data warehouses include a time dimension. Also the information it
contains is very similar from warehouse to warehouse. It has levels such as days, weeks,
months, etc. The Time dimension enables the warehouse users to retrieve data by time
period.
Creation of Time dimension
• Launch Design Center
• Expand the Databases node under your project node and then right-click on the
Dimensions node; a pop-up menu appears. Select New | Using Time Wizard... from the
pop-up menu to launch the Time Dimension Wizard.
• The first screen is a welcome screen which shows various steps involved in creation
of a time dimension.
• Step 1: Provide a name and description for the time dimension.
• Step 2: Set the storage type for this dimension, either ROLAP or
MOLAP.
• Step3: Define the range of data stored in the time dimension. It asks us what year we
want to start with, and then how many total years to include starting with that year.
• Step4: This step is to choose hierarchy and the levels in the hierarchy. Following are
the hierarchies and the levels in the hierarchy:
1. Normal Hierarchy
• Calendar Year
• Calendar Quarter
• Calendar month
• Day
2. Week Hierarchy
• Calendar Week
• Day
• Step 5: This screen shows a summary of the Time Dimension before creation of the
Sequence and Map.
• Step 6: Progress Status
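What the wizard asks for in Step 3 and Step 4 (a start year, a number of years, and the levels of each hierarchy) can be pictured with a small sketch that generates one row per day. This is only an illustration of the resulting data, not what the wizard actually generates; the function name and column names are invented, and ISO week numbering is an assumption that may differ from OWB's calendar week.

```python
from datetime import date, timedelta


def build_time_dimension(start_year, number_of_years):
    """Return one row per day, carrying the levels of both hierarchies."""
    rows = []
    day = date(start_year, 1, 1)
    end = date(start_year + number_of_years, 1, 1)  # exclusive upper bound
    while day < end:
        rows.append({
            "day": day,
            # Normal hierarchy: day -> month -> quarter -> year
            "calendar_month": day.month,
            "calendar_quarter": (day.month - 1) // 3 + 1,
            "calendar_year": day.year,
            # Week hierarchy: day -> week (ISO week number, an assumption)
            "calendar_week": day.isocalendar()[1],
        })
        day += timedelta(days=1)
    return rows


rows = build_time_dimension(2017, 2)
print(len(rows), rows[0], rows[-1])
```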
d. Write a detailed note on staging and its benefits.
Staging is the process of copying the source data temporarily into tables in the target
database. The purpose is to perform any cleaning and transformations before loading the
source data into the final target tables. The data staging area is a temporary area for
storage and processing. This is the place where all the extracted data is put together and
prepared for loading into the data warehouse. A staging area is like a large table with
data pulled from various sources to be loaded into a data warehouse in the required
format. In the absence of a staging area, data must be loaded directly from the OLTP system
to the OLAP system. A staging area can be created:
– within the database ( using Database tables )
– outside the database (using Flat files)
Advantages of staging
• Source database connection can be freed immediately after copying the data to the
staging area. The formatting and restructuring of the data happens later with data in
the staging area.
• If the ETL process needs to be restarted, there is no need to go back and disturb the
source system to retrieve the data.
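The flow described above (copy the source data, free the source connection, then clean and load from the staging area) can be sketched with Python's built-in sqlite3 module. The two in-memory databases, the table names and the cleaning rules are all assumptions chosen for illustration.

```python
import sqlite3

source = sqlite3.connect(":memory:")     # stands in for the OLTP source system
warehouse = sqlite3.connect(":memory:")  # stands in for the target database

source.executescript("""
CREATE TABLE orders (id INTEGER, customer TEXT, amount TEXT);
INSERT INTO orders VALUES (1, '  alice ', '10.50'), (2, 'BOB', 'n/a'), (3, 'carol', '7');
""")

# 1. Extract: copy the raw rows, then free the source connection immediately.
raw_rows = source.execute("SELECT id, customer, amount FROM orders").fetchall()
source.close()

warehouse.executescript("""
CREATE TABLE stg_orders (id INTEGER, customer TEXT, amount TEXT);   -- staging table
CREATE TABLE fact_orders (id INTEGER, customer TEXT, amount REAL);  -- final target
""")
warehouse.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", raw_rows)

# 2. Clean and transform using only the staged copy, then load the target.
#    Rows whose amount does not start with a digit are left out.
warehouse.execute("""
INSERT INTO fact_orders
SELECT id, UPPER(TRIM(customer)), CAST(amount AS REAL)
FROM stg_orders
WHERE amount GLOB '[0-9]*'
""")

loaded = warehouse.execute("SELECT * FROM fact_orders ORDER BY id").fetchall()
print(loaded)
```

If the load step has to be rerun, it reads stg_orders again and the source system is not touched.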
e. Describe the SUBSTR transformation function.
The SUBSTR function is used to get a substring of a given string value. After dropping
the Transformation Operator on the mapping, it will pop up a dialog box where we
can select SUBSTR function. This operator needs three parameters, STRING,
POSITION and SUBSTR_LENGTH. The STRING parameter represents the string field
from which substring is to be extracted. The other two parameters must be constant
integer values. The POSITION is a number indicating the start position of the substring
within the source string. The SUBSTR_LENGTH specifies the length of the substring to
extract.
Map the input field to the STRING input attribute of the SUBSTR operator. Using
Constant operator, you can supply integer constant values to POSITION and
SUBSTR_LENGTH. The following figure shows what a SUBSTR Transformation Operator
looks like when it is placed on the mapping.
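The behaviour of the three parameters can be mimicked in a few lines of Python. This is a sketch for positive start positions only; the substr function and the sample value are illustrative, and the two module-level constants stand in for values supplied by a Constant operator.

```python
def substr(string, position, substr_length):
    """Mimic SUBSTR for positive POSITION values, where 1 is the first character."""
    if position < 1:
        raise ValueError("this sketch only handles positions starting at 1")
    start = position - 1  # Python strings are indexed from 0
    return string[start:start + substr_length]


# Constant integer inputs, as a Constant operator would supply them.
POSITION = 1
SUBSTR_LENGTH = 3

print(substr("Mumbai", POSITION, SUBSTR_LENGTH))
```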
f. What are snapshots? Explain full snapshot and signature snapshot.
Snapshot
• A snapshot is a point in time version of an object.
• The snapshot of an object captures all the metadata information about that object at the time when
the snapshot is taken.
• It enables you to compare the current object with a previously taken snapshot.
• Since objects can be restored from snapshots, it can be used as a backup mechanism.
Full Snapshots :
• Full snapshots provide complete metadata of an object that you can use to restore it later.
• So it is suitable for making backups of objects.
• Full snapshots take longer to create and require more storage space than signature snapshots.
Signature Snapshots:
• It captures only the signature of an object.
• A signature contains enough information about the selected metadata component to detect changes
when compared with another snapshot or the current object definition.
• Signature snapshots are small and can be created quickly.
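The difference between the two kinds of snapshot can be sketched as follows: a full snapshot keeps a complete copy that can be restored, while a signature keeps only enough to detect a change. Using a SHA-256 hash as the signature is an assumption made for this sketch; the answer above does not say how OWB computes signatures, and the object definition is invented.

```python
import copy
import hashlib
import json


def full_snapshot(obj):
    """Keep a complete, independent copy of the metadata (can be restored later)."""
    return copy.deepcopy(obj)


def signature_snapshot(obj):
    """Keep only a short fingerprint of the metadata (can detect changes only)."""
    canonical = json.dumps(obj, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical table definition.
table = {"name": "SALES", "columns": {"ID": "NUMBER", "AMOUNT": "NUMBER"}}
full = full_snapshot(table)
signature = signature_snapshot(table)

# Change the object, then compare it with the signature taken earlier.
table["columns"]["SALE_DATE"] = "DATE"
changed = signature_snapshot(table) != signature
print("changed:", changed)

# Only the full snapshot can bring the earlier definition back.
table = full_snapshot(full)
print("restored:", signature_snapshot(table) == signature)
```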