SSAS _ Dwbi Tutorials

Category Archives: SSAS


STORAGE MODES IN SSAS (MOLAP, ROLAP AND HOLAP)

APRIL 27, 2013

Introduction

There are three standard storage modes in OLAP applications: MOLAP, ROLAP and HOLAP. The storage mode affects the performance of OLAP queries and cube processing, the storage requirements, and the location where the data is stored.

SSAS (2005 and 2008) supports these three standard storage modes and also supports proactive caching, a feature introduced with SSAS 2005 that lets you combine the best of both worlds (ROLAP and MOLAP storage) for both frequency of data refresh and OLAP query performance.

In part 1 of this article, I will give an overview of each mode, and in part 2 I will cover the proactive caching feature of SSAS.

Basic storage modes

Cube data can be divided into three types: metadata, detail data and aggregate data. Whatever storage mode is used, the metadata is always stored on the OLAP server, but where the detail data and aggregate data are stored depends on the storage mode you specify.

MOLAP (Multidimensional OLAP)

This is the default and most frequently used storage mode. In this mode, when you process the cube, the source data is pulled from the relational store, the required aggregations are performed, and the data is stored on the Analysis Services server in a compressed and optimized multidimensional format.

Once processing has retrieved the data from the underlying relational database, no connection to the relational data store remains. Any subsequent changes in the relational data are not reflected in the cube until the cube is reprocessed, which is why MOLAP is called an offline dataset mode.

Since both the detail and aggregate data are stored locally on the OLAP server, the MOLAP storage mode is very efficient and provides the fastest query performance.

Pros

o Stores the detail and aggregate data in the OLAP server in a compressed multidimensional format; as a

result the cube browsing is fastest in this mode.

o Provides maximum query performance, because all the required data (a copy of the detail data and

calculated aggregate data) are stored in the OLAP server itself and there is no need to refer to the underlying

relational database.


o All the calculations are pre-generated when the cube is processed and are stored locally on the OLAP server, so even complex calculations that are part of a query result are performed quickly.

o MOLAP needs a connection to the underlying relational database only at processing time. Because it stores the detail and aggregate data on the OLAP server, the data can be viewed even when there is no connection to the relational database.

o MOLAP uses compression to store the data on the OLAP server and so has lower storage requirements than a relational database for the same amount of data. (Note, however, that beginning with SQL Server 2008 you can use data compression at the relational database level as well; see http://www.sql-server-performance.com/articles/dba/Data_Compression_in_SQL_Server_2008_p1.aspx )

Cons

o With MOLAP mode, you need frequent processing to pull in data that has changed since the last processing, which drains system resources.

o Latency: any changes made in the relational database after processing are not reflected on the OLAP server until the cube is reprocessed.

o MOLAP stores a copy of the relational data on the OLAP server and so requires additional investment in storage.

o If the data volume is high, cube processing can take longer, though you can use incremental processing to mitigate this.

ROLAP (Relational OLAP)

In contrast to MOLAP, ROLAP does not pull data from the underlying relational database to the OLAP server; both the cube detail data and the aggregations stay in the relational database. To store the calculated aggregations, the database server creates additional database objects (indexed views). In other words, ROLAP mode does not copy the detail data to the OLAP server, and when a query result cannot be obtained from the query cache, the indexed views are accessed to provide the results.

Pros

o Ability to view the data in near real-time.

o Since ROLAP does not make another copy of the data as MOLAP does, it has lower storage requirements. This is very advantageous for large datasets that are queried infrequently, such as historical data.

o In ROLAP mode, the detail data is stored in the underlying relational database, so the data size ROLAP can support is limited only by what the relational database can hold. In a nutshell, it can handle huge volumes of data.

Cons

o Compared to MOLAP or HOLAP, query response is generally slower because everything is stored in the relational database and not locally on the OLAP server.

o A permanent connection to the underlying database must be maintained to view the cube data.

Note:

If you use ROLAP storage mode and your relational database is SQL Server, the Analysis Services server may create indexed views for aggregations. However, this requires several prerequisites to be met. For example, the data source must be a table, not a view, and the table name must use the two-part naming convention, that is, it must be qualified with the owner/schema name. For a complete list of these prerequisites, refer to the link provided in the reference section.

HOLAP (Hybrid OLAP)



This mode is a hybrid of MOLAP and ROLAP and attempts to provide the greater data capacity of ROLAP together with the fast processing and high query performance of MOLAP.

In HOLAP storage mode, the cube detail data remains in the underlying relational data store and the aggregations are stored on the OLAP server. If you query only summary data held in aggregations, HOLAP works much like MOLAP. For queries on detail data, HOLAP has to drill through to the underlying relational data store, so performance is not as good as with MOLAP. Your query is therefore as fast as in MOLAP if its result can be provided from the query cache or the aggregations, but performance degrades if it needs detail data from the relational data store.

Pros

o HOLAP balances the disk space requirement, as it only stores the aggregate data on the OLAP server and

the detail data remains in the relational database. So no duplicate copy of the detail data is maintained.

o Since HOLAP does not store detail data on the OLAP server, the cube and partitions would be smaller in

size than MOLAP cubes and partitions.

o Performance is better than ROLAP as in HOLAP the summary data are stored on the OLAP server and

queries can be satisfied from this summary data.

o HOLAP is optimal where fast query response is required and query results are based on aggregations over large volumes of data.

Cons

o Query performance (response time) degrades if a query has to drill through to the detail data in the relational data store; in this case HOLAP performs very much like ROLAP.
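The differences between the three modes come down to where the detail data and the aggregations live. The following Python sketch is only an illustration of that summary, not SSAS code: the names `StorageLayout`, `LAYOUTS` and `needs_relational_source` are hypothetical, and the sketch deliberately ignores the query cache.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLayout:
    detail: str      # where the detail (leaf-level) data is kept
    aggregates: str  # where the pre-computed aggregations are kept


# Locations as described in this article; metadata is always on the OLAP server.
LAYOUTS = {
    "MOLAP": StorageLayout(detail="olap_server", aggregates="olap_server"),
    "ROLAP": StorageLayout(detail="relational_db", aggregates="relational_db"),
    "HOLAP": StorageLayout(detail="relational_db", aggregates="olap_server"),
}


def needs_relational_source(mode: str, query_needs_detail: bool) -> bool:
    """Return True if answering the query has to reach the relational database."""
    layout = LAYOUTS[mode]
    location = layout.detail if query_needs_detail else layout.aggregates
    return location == "relational_db"


if __name__ == "__main__":
    for mode in LAYOUTS:
        print(mode,
              "summary query hits relational DB:", needs_relational_source(mode, False),
              "| detail query hits relational DB:", needs_relational_source(mode, True))
```

Read this way, HOLAP behaves like MOLAP for summary queries and like ROLAP for detail queries, which matches the pros and cons listed above.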

Configuring the storage mode

Setting the storage mode is relatively straightforward: select an OLAP object in BIDS (Business Intelligence Development Studio), right-click it and select Properties. The property called StorageMode lets you set the storage mode to MOLAP, ROLAP or HOLAP.

Conclusion

In part 1 of this article, I discussed the basic OLAP storage modes and their pros and cons, and then showed how to configure the storage mode of OLAP objects in SSAS.

In part 2 of this article I will cover the proactive caching feature of SSAS, which gives the administrator better control over the frequency of cube data refresh, so that the cube can refer to near real-time data while still providing the query performance of MOLAP storage mode.


SQL SERVER ANALYSIS SERVICES GLOSSARY

MARCH 23, 2013


Following is a list of common terms used when working with SQL Server Analysis Services.

Cube - A cube is a multidimensional data structure composed of dimensions and measure groups. The intersection of the dimensions and measure groups contained in a cube returns the dataset.

Calculated Measure - Each field in a measure group is known as a base measure. Measures created using

MDX expressions with/without base measures are known as calculated measures.

Data Source View - It’s an insulation layer that inherits the basic schema from the data source with the

flexibility to manipulate the schema in this layer without modifying the actual schema in the data source.

Dimension - A dimension is an OLAP structure used to contain attributes related to an entity, in order to categorize data on the row / column axis. A dimension almost never contains measurable numeric data, and if it does, that data is used as an attribute. Typical examples of dimensions are Geography, Organization, Employee, Time etc.

Fact - A fact, known as a measure group in a cube, is an OLAP structure used to contain measurable numeric data for one or more entities. In cube parlance these entities are known as dimensions. A dimension need not be associated directly with a fact, but a fact is always associated directly with at least one dimension. Typical examples of facts are Sales, Performance, Tax etc.

Hierarchy - A hierarchy is a collection of nested attributes associated in a parent-child fashion with a defined cardinality. A dimension is formed of attributes, and a hierarchy contained in a dimension is formed of one or more attributes from the same dimension.

KPI - Key Performance Indicators are logical structures defined using MDX expressions. Each KPI has a goal, status, value, trend, and indicator associated with it. The value is derived from the definition of the KPI, and the rest vary based on this derived value. KPIs are the primary elements that make up a scorecard in a dashboard.

MDX - Multidimensional Expressions is the query language of multidimensional data structures. It can be considered the SQL of OLAP databases, with the major difference that MDX is mostly used only for reading data.

Named Set - A named set is a predefined MDX query defined in the script of the cube. It can be thought of as analogous to a view in a SQL Server database. Named sets can be dynamic or static, and this determines when the query is evaluated.

OLAP - Online Analytical Processing is a term used to represent analytical data sources and analysis systems. The fundamental expectation associated with the term OLAP is that it refers to multidimensional data and the environment hosting it.

Snowflake Schema - Snowflake schema is an OLAP schema, where one or more normalized dimension

tables are associated with a fact table. For example, Product Sub Category -> Product Category -> Product

can be three normalized dimension tables and Product table can be associated with a fact table like Sales.

This is a very common example of a snowflake schema.

Star Schema - Star schema is an OLAP schema, where all dimension tables are directly associated with

fact tables, and no normalized dimension tables are considered in the schema. For example, Time, Product,

Geography dimension tables would be directly associated with a fact table like Sales. This is a very common

example of star schema.

SQL SERVER ANALYSIS SERVICES (SSAS) STEP BY STEP : PART 4

MARCH 23, 2013

Calculated Measures and Named Sets

Fields from fact tables are converted into measures within measure groups in a cube. When measure groups are created in a cube, one measure group is created per fact table. In production systems, developing calculated measures is a regular requirement. Multidimensional Expressions (MDX) is the query language for a cube; it is to a cube what T-SQL is to SQL Server. Frequently used queries often need to be available in a ready-made form in the cube, so that users do not have to develop them over and over again. One solution for this is named sets, which can be thought of as queries already defined in the cube, similar to views in SQL Server. We will develop a calculated measure and a few named sets in this section.

Developing a Calculated Measure

Measures created directly from the fields of a fact table are called base measures. But often we require

measures based on custom requirements, so we apply some logic and/or formula to these base measures and

create calculated measures. We will add two measures from two measure groups and create a calculated

measure.

Open the cube designer, and click on the Calculations tab. Click on “New Calculated Measure” from the

toolbar, and key in the values as shown in the below screenshot.

We have named this new calculated measure “TotalSales”. The “Parent hierarchy” specifies which parent hierarchy the measure will be part of; in this case it is “Measures”. It’s a built-in hierarchy and all


measures normally fall under this.

In the Expression, we can specify any MDX expression. Here we are adding Internet Sales Amount from the FactInternetSales measure group and Reseller Sales Amount from the FactResellerSales measure group. You do not need to type the values; you can just drag and drop them from the panes on the left-hand side of the window.
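The calculated measure is entered as an MDX expression in the designer. The Python sketch below only illustrates the arithmetic it performs for each cell: the two base measures are added together. The sample figures and the function name `total_sales` are made up for illustration, and treating a missing cell as 0 is a simplification.

```python
# Hypothetical base-measure values per product; the numbers are invented.
internet_sales_amount = {"Bike": 1200.0, "Helmet": 150.0}
reseller_sales_amount = {"Bike": 3000.0, "Gloves": 80.0}


def total_sales(member: str) -> float:
    """Add the two base measures for one member; a missing cell counts as 0."""
    return (internet_sales_amount.get(member, 0.0)
            + reseller_sales_amount.get(member, 0.0))


if __name__ == "__main__":
    for product in ("Bike", "Helmet", "Gloves"):
        print(product, total_sales(product))
```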

In the additional properties you can set further options for this measure. Save your solution; in the next section we will create named sets and then deploy everything at the same time.

Developing Named Sets

Named sets return a dataset based on defined logic. They are primarily useful for creating datasets that are often requested from the cube. Named sets are of two types: static and dynamic. The difference between the two is that static named sets are calculated when they are requested the first time in a session, whereas dynamic named sets are calculated each time a query references them. In this section we will look at how to create dynamic named sets. Note that dynamic named sets were not introduced until SQL Server 2008.
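The difference in evaluation timing can be pictured with a small Python sketch. It is an analogy rather than SSAS behaviour in detail: `StaticSet`, `DynamicSet`, `top_product` and the sales figures are all hypothetical, and the sketch assumes that a static set is computed once without regard to the filter of the query that later uses it.

```python
from typing import Dict, List, Optional

# Hypothetical data: year -> product -> sales amount.
SALES: Dict[str, Dict[str, float]] = {
    "2012": {"Bike": 500.0, "Helmet": 900.0},
    "2013": {"Bike": 800.0, "Helmet": 100.0},
}


def top_product(year: Optional[str]) -> List[str]:
    """Best-selling product for one year, or across all years when year is None."""
    totals: Dict[str, float] = {}
    for y, by_product in SALES.items():
        if year is None or y == year:
            for product, amount in by_product.items():
                totals[product] = totals.get(product, 0.0) + amount
    return [max(totals, key=totals.get)]


class StaticSet:
    """Evaluated on first use, then the cached result is reused."""

    def __init__(self) -> None:
        self._cached: Optional[List[str]] = None

    def members(self, year: Optional[str]) -> List[str]:
        if self._cached is None:
            self._cached = top_product(None)  # the query's filter is not applied
        return self._cached


class DynamicSet:
    """Re-evaluated for every query, using that query's filter."""

    def members(self, year: Optional[str]) -> List[str]:
        return top_product(year)


if __name__ == "__main__":
    static_set, dynamic_set = StaticSet(), DynamicSet()
    for year in ("2012", "2013"):
        print(year, "static:", static_set.members(year),
              "dynamic:", dynamic_set.members(year))
```

With this data the static set keeps returning the overall best seller, while the dynamic set follows the year selected in each query.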

Open the cube designer, and click on the Calculations tab. Click on “New Named Set” from the toolbar and key in the values as shown

in the below screenshots.

Here we are creating two named sets, Internet Sales Top 25 and Reseller Sales Top 25, which return the top 25 products based on Internet Sales and Reseller Sales respectively. In the formula, the MDX function TopCount returns the top 25 members from the set.
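What TopCount does can be sketched in a few lines of Python: sort the members by a measure in descending order and keep the first N. This is an illustration of the idea only; the function name `top_count` and the sample data are hypothetical, and a member without a value is treated as 0 here.

```python
from typing import Dict, List


def top_count(members: List[str], count: int, measure: Dict[str, float]) -> List[str]:
    """Return the first `count` members after sorting by the measure, highest first."""
    ranked = sorted(members, key=lambda m: measure.get(m, 0.0), reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    # Hypothetical internet sales per product.
    internet_sales = {"Bike": 4200.0, "Helmet": 150.0, "Gloves": 80.0, "Jersey": 900.0}
    print(top_count(list(internet_sales), 2, internet_sales))
```

In the tutorial the count is 25 and the measure is Internet Sales or Reseller Sales, applied to the members of a Product attribute hierarchy.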

In the Type selection, we can select whether we want the named set to be static or dynamic. We have

selected Dynamic as we want to create a dynamic named set.

In the Display folder selection, we can specify where the named sets will appear. By default, named sets appear under the last dimension used in the formula. Here we have used an attribute hierarchy from the Product dimension, so the named sets should appear in that dimension under the “Named Sets” folder.

Save and deploy the solution, and then re-connect to the cube in the “Browser” pane. You should be able to

see the calculated measure and named sets as shown in the below screenshot.

Browsing a Cube Using Excel

Once the cube is deployed and ready to host queries from the data store, client applications can start

querying the cube. One of the most user friendly client tools for business users to query a cube is Microsoft

Excel. It has a built-in interface and components to support GUI based connection, querying and formatting

of data sourced from a cube. Business users can use the familiar interface of Excel and create ad-hoc pivot

table reports by querying the cube without any detailed knowledge about querying a multi-dimensional data

source. We will connect to the cube we just created using Excel and develop a very simple report using the

cube data.

Using Excel and Creating a Pivot Table Report

We will first create a connection to the cube we have developed in the previous exercises. After connecting

the cube we will use the calculated measures and a named set to create a very basic pivot table report. For

the purpose of demonstration, Excel 2010 is used and is installed on the development machine, but you can

also use Excel 2007 to connect to the cube.

Open Microsoft Excel and select the “Data” tab from the menu ribbon. Click on “From Other Sources” and

select “From Analysis Services” option as shown in the below screenshot.

In the next step specify the SSAS server name and logon credentials. If you have everything on the local

machine, you can also use “localhost” as the server name.

If you were able to successfully connect to the specified SSAS instance with the logon credentials specified,

in the next step you should be able to select the SSAS “Sales” database and find the Sales Cube. Select the

Sales Cube and proceed to the next step.

In the next step, specify the name of the connection file to save. This file will be saved as an .ODC file and

you can reuse this connection file when you want to use the same connection in other workbooks.

After saving the file, you will be prompted with the option to select the kind of report you want to create. We

will go with the default option and select “PivotTable Report”.

After selecting “PivotTable Report”, a designer will open with options to select dimension, attributes and

measures to populate your pivot table. Select the values as shown in the below screenshot. Our intention is to

display the hierarchy we created in the Sales Territory dimension on the columns axis, Internet Sales Top 25

named set on the rows axis, and the Total Sales calculated measure in the values area.

After making the above selections, your report should look like the below screenshot. Using the features available from the “Options” tab, you can format this report and give it a more professional look. You can try drilling down the hierarchy, but you will find that the hierarchies first need to be developed. Users who frequently want to see sales of products to top customers can pick any of the named sets that we defined earlier. Instead of defining formulas to add internet sales and reseller sales, users can simply select Total Sales.




Processing and Deploying a Cube

Once cube design and development are complete, the next step is to deploy the cube. When the cube is deployed, a database for the solution is created in the SSAS instance if it is not already present. Each dimension and measure group definition is read, and data is calculated and stored according to the design and configuration of these objects. Once the cube is successfully deployed, client applications can connect to it and browse the cube data. We will deploy the cube we have developed and test connecting to it.




We might also face errors during deployment, and we will attempt debugging and resolving these errors.

Debugging Deployment Errors

In a development environment, you will typically come across errors during deployment and processing of the cube. Debugging errors is an essential part of the cube development life cycle. We will configure the deployment properties, and we should encounter some errors during deployment. We will then analyze and resolve these errors.

Right-click the solution and select Properties; this brings up a pop-up window. Select the Deployment tab to bring up the deployment properties. Enter the SSAS server name and the name of the database for your solution in the SSAS instance. Since SSAS is installed on my local / development machine, I have chosen “localhost” as the server and “Sales” as the database name. We will keep the rest of the options at their defaults for now.

Right-click the solution and select “Deploy”; this starts deploying the solution. If you have not specified an appropriate account in the impersonation information, your deployment might fail because the account might not have sufficient privileges.

If you have followed all the previous steps as explained, you should see errors as shown below. From the error message you can tell that cube processing failed because of the Date dimension.

Right-click the Cube Dim Date dimension and select “Process”, and you will find the following error.

If you recall, we defined a hierarchy in the Date dimension, Year -> Semester -> Quarter -> Month, and the expected attribute relationship is one to many. If you browse the data, you will find that the same set of semester values exists in each year, so a semester value alone does not identify a single semester. When the Quarter attribute is processed, it finds duplicate Semester keys, because by default the key column for Semester is Semester itself, which is not unique. So we need to make each attribute unique by changing its key columns.

Edit the Date dimension in the dimension editor, select the Semester attribute and edit its Key Columns property. This should bring up a pop-up window as shown below. To make the Semester attribute unique, we need to make its key a composite of Year + Semester. So select the key columns as shown below.

When you select multiple key columns, the Name Column property becomes blank, and it is a mandatory property. So select this property and set it to Semester again, as we want semesters to be displayed when the attribute is browsed.

This should solve the error we were facing on the date dimension. Duplicate keys are one of the most

common errors during dimension processing and we just learned how to resolve this issue.
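The duplicate-key problem and the composite-key fix described above can be reproduced outside SSAS with a few lines of Python. This is a simplified model, not what the processing engine actually does: it flags a key as ambiguous when the same key value appears under more than one parent. The sample rows and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical rows from a date dimension table: (year, semester, quarter).
ROWS: List[Tuple[int, int, int]] = [
    (2012, 1, 1), (2012, 1, 2), (2012, 2, 3), (2012, 2, 4),
    (2013, 1, 1), (2013, 1, 2), (2013, 2, 3), (2013, 2, 4),
]

YEAR, SEMESTER = 0, 1  # column positions in each row


def ambiguous_keys(key_columns: Tuple[int, ...], parent_column: int) -> List[tuple]:
    """Return the key values that appear under more than one parent value."""
    parents: Dict[tuple, Set[int]] = {}
    for row in ROWS:
        key = tuple(row[i] for i in key_columns)
        parents.setdefault(key, set()).add(row[parent_column])
    return sorted(key for key, values in parents.items() if len(values) > 1)


if __name__ == "__main__":
    # Semester keyed by its own value: semesters 1 and 2 exist in both years.
    print("Semester only:", ambiguous_keys((SEMESTER,), YEAR))
    # Semester keyed by Year + Semester: every key belongs to exactly one year.
    print("Year + Semester:", ambiguous_keys((YEAR, SEMESTER), YEAR))
```

With the single-column key, each semester value belongs to two different years; with the composite key, every key maps to exactly one year, which is what a one-to-many attribute relationship requires.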

Processing Dimensions and Cube

SSAS provides various cube processing methods and options to configure error logging as well as impact on

processing when errors are encountered. We will briefly look at these options, understand what processing of

the cube means, deploy our cube and try to access data from the cube.

Right-click the dimension or cube and select “Process”; this should bring up a screen with processing options similar to the below screenshot. Various processing options are available in the dropdown. Unprocess removes the data and aggregations created by processing the object. Process Full does the same, but then processes the object again and recreates all the aggregations. More information about these options can be found in MSDN Books Online (BOL).

In the “Change Settings” and “Impact Analysis” options you will find more error configuration and other

options related to processing.

Deploy the cube; it should now deploy successfully. After successful deployment, go to the Browser pane, connect to the cube and browse the data by dragging and dropping dimension attributes and measures onto the browsing area. Below is an example.




Designing a Cube

Using BIDS, after the DSV is developed, the next step is to create dimensions. Dimensions are of two types: database dimensions and cube dimensions. Database dimensions can be perceived as a master template, and cube dimensions as instances / children of this master template.

We will start our development with the creation of database dimensions. If you consider a dimension as a table, all the fields in this table can be perceived as attributes. A hierarchy in a dimension is a group of attributes logically related to each other with a defined cardinality. Finally, we will create a cube from the dimensions we have developed, which become cube dimensions, and from the fact tables, which become measure groups.

Creating a Dimension

Dimensions are of two types: database dimensions and cube dimensions. Dimensions defined at the solution level are termed database dimensions, and those defined inside the cube are termed cube dimensions. The Dimension Wizard is the primary means of creating a dimension. We will create dimensions from the three dimension tables we have included in our schema.

Right-click the Dimensions folder and select “New Dimension”; this invokes the Dimension Wizard. The first screen should look like the below screenshot. You have the options of using an existing table, creating a table in the data source, or using a template. We already have the dimension table in our schema and will use it, so select “Use an existing table” and click “Next”.

Select the DSV we created earlier in the DSV selection. We intend to create a dimension from the DimSalesTerritory table, so select that table. Every dimension table needs a key attribute, and in this table SalesTerritoryKey is the primary key column, which is guaranteed to identify each record uniquely. Browsing the dimension by this key would not be meaningful to users, whereas the SalesTerritoryRegion field holds unique, readable values. We could use that field as both the key and the name column. For the purpose of our exercise, however, we will use the SalesTerritoryKey field as the key column and SalesTerritoryRegion as the name column. Although using the key field may look inappropriate here, when you are starting to develop an understanding of dimensions it helps to set a rule in your mind: a key field is always required and is usually a surrogate key, and you can set the name column to any field that makes browsing convenient.

In the next screen, you select the attributes that will be present in the dimension. If
you uncheck the “Enable Browsing” checkbox for an attribute, it won’t be visible to client applications when
they browse the dimension. Attributes can be of different types, and you can specify the type in the Attribute
Type field. The Dimension Wizard leaves the name column you chose out of this list, because it is already
available through the key attribute. So you won’t find that field in the list of available attributes.

Now the next step is to give a name to the dimension. Name it “Cube Dim Sales Territory” or anything
appropriate. After this step you have completed creating your first dimension.

In a similar manner, create the Product and Date dimensions using the Dimension Wizard.

Creating a Hierarchy

A Hierarchy is a set of logically related attributes with a fixed cardinality. While browsing the data, a
hierarchy exposes the top level attribute, which can be broken down into lower level attributes. For example,
Year -> Semester -> Quarter -> Month is a hierarchy. While analyzing the data, it might be required to drill
down from a higher level to a detail level, and exposing data as a hierarchy is one of the best solutions for
this.
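To make the idea of drilling down concrete, here is a minimal Python sketch (not SSAS code) that rolls the same rows up at different depths of a Year -> Semester -> Quarter -> Month hierarchy. The sample rows and the `roll_up` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical monthly sales rows: (year, semester, quarter, month, amount).
SALES = [
    (2008, "H1", "Q1", "Jan", 100.0),
    (2008, "H1", "Q1", "Feb", 150.0),
    (2008, "H1", "Q2", "Apr", 200.0),
    (2008, "H2", "Q3", "Jul", 50.0),
]

LEVELS = ("Year", "Semester", "Quarter", "Month")


def roll_up(rows, depth):
    """Sum the amount for each distinct path made of the first `depth` levels."""
    totals = defaultdict(float)
    for row in rows:
        path = row[:depth]        # e.g. (2008, "H1") for depth 2
        totals[path] += row[-1]   # the last field is the sales amount
    return dict(totals)


year_totals = roll_up(SALES, LEVELS.index("Year") + 1)        # top level
quarter_totals = roll_up(SALES, LEVELS.index("Quarter") + 1)  # drilled down
```

Each deeper level splits a total from the level above into its parts, which is what a client tool does when a user expands a hierarchy member.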

Creating a hierarchy is as easy as dragging and dropping attributes in the hierarchy pane of the dimension

editor. We want to create a hierarchy in the Sales Territory dimension. Open Sales Territory dimension in

the dimension editor, drag and drop attributes in the hierarchy pane, click on each of them and rename

them to something appropriate. After completing this, your hierarchy should look similar to the below

screenshot.

You will find a warning icon on the hierarchy pane, which says that attribute relationships are missing

between these attributes. Country has a one-to-many relationship with Region, and Group has a one-to-

many relationship with Country. But these relationships need to be defined explicitly in the dimension. Click

on Attribute Relationships tab, right-click the region attribute and select “New Attribute Relationship”. Set

the values as shown in the below screenshot to correct the relationships between these attributes.

After you have applied the above changes, your attribute relationship tab should look like the below

screenshot.

If you have observed carefully, relationships are of two types: Rigid and Flexible. This has an effect on
the processing of the cube. Rigid means that you do not expect the relationship to change, and
Flexible means that relationship values can change. In our dataset, Group is a logical way to categorize
countries and it can change, while regions within a country see limited or no change. So the relationship
type between Country and Group should be flexible, and the relationship type between Region (the sales
territory key) and Country should be rigid. Double-click the arrow joining the key attribute and Country, and
change the relationship type as shown below.
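An attribute relationship asserts that every member of the lower attribute belongs to exactly one member of the higher attribute. The Python sketch below checks that assumption on a few sample rows before you define the relationship. The `violations` helper and the sample rows are illustrative, not part of SSAS.

```python
def violations(rows, child, parent):
    """Return the child values that map to more than one parent value."""
    first_parent = {}
    bad = set()
    for row in rows:
        c, p = row[child], row[parent]
        # setdefault stores the first parent seen for this child value.
        if first_parent.setdefault(c, p) != p:
            bad.add(c)
    return bad


# Sample rows shaped like the Sales Territory dimension (illustrative values).
TERRITORIES = [
    {"Region": "Northwest", "Country": "United States", "Group": "North America"},
    {"Region": "Southwest", "Country": "United States", "Group": "North America"},
    {"Region": "Canada", "Country": "Canada", "Group": "North America"},
    {"Region": "France", "Country": "France", "Group": "Europe"},
]

region_to_country = violations(TERRITORIES, "Region", "Country")
country_to_group = violations(TERRITORIES, "Country", "Group")
group_to_country = violations(TERRITORIES, "Group", "Country")
```

The first two checks come back empty, so Region -> Country and Country -> Group are valid relationships. The reversed direction is not, because one Group contains several countries.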

Check out the Hierarchy pane, and you should find that the warning icon is no longer visible. You can

change the name of the hierarchy to something appropriate. In the interest of beginners who might get

confused with the distinction between attributes and hierarchy, we will keep the name as “Hierarchy”.

Edit the Date dimension, and create a Year – Semester – Quarter – Month hierarchy in the date dimension.

Creating a Cube using the Cube Wizard

A Cube acts as an OLAP database to the subscribers who need to query data from an OLAP data store. A
Cube is the main object of an SSAS solution, where the majority of fine tuning, calculations, aggregation
design, storage design, relationship definitions and many other configurations are developed. We will create a
cube using our dimension and fact tables.

Right-click the Cube folder and select “New Cube”, and it will invoke the Cube Wizard. In the first screen

you need to select one of the methods of creating a Cube. We already have our dimensions ready, and

schema is already designed to contain dimension and fact tables. So we will select the option of “Use existing

tables”.

In the next screen, we need to select the tables which will be used to create measure groups. We already

have a DSV which has fact tables in the schema. So we will use this as shown in the below screenshot.

In the next screen, we need to select the measures that we want to create from the fact tables we just

selected in the previous screen. For now, select all the fields as shown below and move to the next screen.

In this screen you need to select any existing dimensions. We have created three dimensions and we will

include all of these dimensions as shown below.

In the next screen, we can select if we want to create any additional new dimensions from the tables

available in the DSV. We do not want to create any more dimensions, so unselect any selected tables as

shown below and move to the next screen.

Finally you need to name your cube, which is the last step of the wizard before your cube is created. Name it

something appropriate like “Sales Cube” as shown below.

Now your cube should have been created, and if your cube editor is open you should find different tabs to
configure and design various features and aspects of the cube. If you look carefully at the screenshot below,
you will find the FactInternetSales and FactResellerSales measure groups. You will also find the Sales
Territory and Product dimensions, but the Date dimension is missing. Both fact tables have multiple fields
referencing the DateKey from the Date dimension. BIDS creates three cube dimensions from the Date
dimension and names each one after the fact table field that references it. So you will find three
instances of the Date dimension: Ship Date, Due Date and Order Date. These are known as role-playing
dimensions.
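In relational terms, a role-playing dimension corresponds to joining the same dimension table several times under different aliases. The sketch below shows this with Python's built-in sqlite3 module. The tables are cut-down, hypothetical versions of DimDate and FactInternetSales, and the data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, CalendarYear INTEGER);
CREATE TABLE FactInternetSales (
    OrderDateKey INTEGER, DueDateKey INTEGER, ShipDateKey INTEGER,
    SalesAmount REAL);
INSERT INTO DimDate VALUES (20071231, 2007), (20080105, 2008), (20080112, 2008);
INSERT INTO FactInternetSales VALUES (20071231, 20080112, 20080105, 100.0);
""")

# One physical date table plays three roles through three aliases.
row = conn.execute("""
SELECT o.CalendarYear, d.CalendarYear, s.CalendarYear
FROM FactInternetSales f
JOIN DimDate o ON o.DateKey = f.OrderDateKey
JOIN DimDate d ON d.DateKey = f.DueDateKey
JOIN DimDate s ON s.DateKey = f.ShipDateKey
""").fetchone()
```

The single fact row resolves to an order year of 2007 and due and ship years of 2008, all read from the same DimDate table.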


Overview

SQL Server Analysis Services (SSAS) is the technology from the Microsoft Business Intelligence stack, to

develop Online Analytical Processing (OLAP) solutions. In simple terms, you can use SSAS to create cubes

using data from data marts / data warehouse for deeper and faster data analysis.

Cubes are multi-dimensional data sources which have dimensions and facts (also known as measures) as their
basic constituents. From a relational perspective, dimensions can be thought of as master tables and facts
can be thought of as measurable details. These details are generally stored in a pre-aggregated proprietary
format, and users can analyze huge amounts of data and slice this data by dimensions very easily.
Multidimensional Expressions (MDX) is the query language used to query a cube, similar to the way T-SQL is
used to query a table in SQL Server.

Simple examples of dimensions are product, geography, time and customer, and simple examples
of facts are orders and sales. A typical analysis could be to analyze sales in the Asia-Pacific geography
during the past 5 years. You can think of this data as a pivot table where geography is the column axis,
years are the row axis, and sales are the values. Geography can also have its own hierarchy like
Country->State->City. Time can also have its own hierarchy like Year->Semester->Quarter. Sales could
then be analyzed using any of these hierarchies for effective data analysis.
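The pivot-table picture can be sketched in a few lines of Python. The fact rows and the `pivot` helper are hypothetical; a real cube stores such cells pre-aggregated rather than computing them on every request.

```python
from collections import defaultdict

# Hypothetical fact rows: (country, year, sales amount).
FACTS = [
    ("Australia", 2007, 120.0),
    ("Australia", 2008, 80.0),
    ("Japan", 2007, 60.0),
    ("Japan", 2007, 40.0),
]


def pivot(rows):
    """Build {year: {country: total sales}}: years on rows, geography on columns."""
    table = defaultdict(lambda: defaultdict(float))
    for country, year, amount in rows:
        table[year][country] += amount
    return {year: dict(columns) for year, columns in table.items()}


cells = pivot(FACTS)
```

Each entry in `cells` is one cell of the pivot table: a sales total for a single (year, country) combination.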

A typical higher level cube development process using SSAS involves the following steps:

1) Reading data from a dimensional model

2) Configuring a schema in BIDS (Business Intelligence Development Studio)

3) Creating dimensions, measures and cubes from this schema

4) Fine tuning the cube as per the requirements

5) Deploying the cube

In this tutorial we will step through a number of topics that you need to understand in order to successfully

create a basic cube. Our high level outline is as follows:

Design and develop a star-schema

Create dimensions, hierarchies, and cubes

Process and deploy a cube

Develop calculated measures and named sets using MDX

Browse the cube data using Excel as the client tool

When you start learning SSAS, you should have a reasonable relational database background. But when
you start working in a multi-dimensional environment, you need to stop thinking from a two-dimensional
(relational database) perspective. This shift in thinking develops over time.

In this tutorial, we will also try to develop an understanding of OLAP development from the eyes of an OLTP

practitioner.

Creating a Sample SSAS Project and Cube

Data in Online Transaction Processing (OLTP) systems is suited to support convenient data storage for user-

facing applications. The data model in such systems is highly normalized. For data warehousing

environments, data is required to be in a schema that supports a dimensional model. Data is therefore

transformed from the OLTP storage systems to a data warehouse using ETL, so that data can be aligned in a

suitable format to create data marts from the data warehouse.

Two major theories driving the design of a data warehouse and data marts are from Ralph Kimball and Bill
Inmon, and both are widely practiced in real-world environments. Generally data is gathered from OLTP systems
and brought to the data warehouse. From the data warehouse, context / requirement specific data marts are
created, which can be perceived as subsets of the data warehouse. Cubes source data from these data marts,
and client applications connect to the cube. The schema for a cube falls into two categories: Star and
Snowflake. In simple terms, Star Schema can be considered a more denormalized form of schema compared
to Snowflake.

Designing and developing a data warehouse is out of scope for this tutorial. For the purpose of development, we
will install and use the AdventureWorks DW database. We will then create an SSAS project and create a data
source which will connect to this database. Finally we will create a star schema using a Data Source View.

Installing AdventureWorks Sample Database

AdventureWorks is the sample database available from Microsoft for different purposes as well as different

SQL Server versions. We need to use the AdventureWorks DW 2008 R2 database for our cube design and

development. This database contains dimension and fact tables with prepopulated data. We can use this

database as a launchpad to start our SSAS project. Developing a data mart is out of the scope of this tutorial,

so we will use this sample database.

To install the AdventureWorks database, navigate to the CodePlex

(http://msftdbprodsamples.codeplex.com/) site and download the MSI for the version of SQL Server you are

using. This tutorial expects that the reader is using SQL Server 2008 R2, and all the exercises will be using

this version of SQL Server.

After downloading, start the installer and you should get a screen similar to the one below.

AdventureWorks Data Warehouse 2008R2 is the database we need for our exercises. Point the installer

to the SQL Server instance that you are using, and install the database. After the database is installed, open

SQL Server Management Studio to verify the databases that were installed. You should find something

similar to the below screenshot.

Expand the database highlighted above and check out the different Dim and Fact tables in this database. The

tables having the prefix Dim are suited to be used as Dimension tables, and tables having prefix Fact are

suited to be used as Fact tables.

Creating a SSAS Project

To start development, we need to create a new SSAS project using Business Intelligence Development Studio.

After creating the new project, we need to create a data source that points to the AdventureWorks DW 2008

R2 database.

Open Business Intelligence Development Studio (BIDS). Create a new SSAS Project, by selecting New Project

from the File menu. Name this project “MyOLAPProject”. As soon as the new project opens up, you should

find a list of folders in the explorer tab. Right-click on the data sources folder and select New DataSource. A

Data Source wizard will open with a Welcome screen, select Next and you should find a screen to define your

connection. We need to define a new connection, so select “New” and a screen should appear as shown

below. Point the connection to the AdventureWorksDW2008R2 database and click OK.

After this, you need to specify the impersonation information for the data source. This information
specifies the credentials that Analysis Services uses to connect to the data source. Every time
you process the solution (including processing that follows a deployment), these credentials will be used. So
keep in mind that the account you use should have sufficient privileges. If you are not sure which account to
use, it is suggested that you use an account with administrator privileges on your development machine. Please
keep in mind that this is not recommended and should not be done in production environments. It is suggested
only to get you started quickly with cube design and development.

After specifying this information, click “Next”. This should take you to the final screen where you need to

name the data source. Name it something appropriate and click OK, which should create your data source.

Creating a Star Schema Using a Data Source View

A data warehouse or data mart from where we would source our data could contain tens to hundreds of

tables. Also one would not have the liberty to change the schema of these tables to suit the requirements of

the cube design. The Data Source View is an insulation layer between the actual data source and the

solution. We can create and modify the schema we need in this layer and this is used as the data source for

the different objects we create in the solution. A Star Schema is a schema structure where different

dimension tables are directly connected to the fact table. If you imagine a fact table in the center and

different dimensions attached to it, you would find the figure similar to a star and hence the name star

schema. It’s the simplest form of the schema and hence we will use this in our exercise.
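As an illustration of the star shape, the sketch below uses Python's built-in sqlite3 module to build a tiny fact table with two dimension tables joined directly to it. The table names echo AdventureWorks, but the columns and data are simplified and hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE DimSalesTerritory (SalesTerritoryKey INTEGER PRIMARY KEY,
                                SalesTerritoryCountry TEXT);
CREATE TABLE FactSales (ProductKey INTEGER, SalesTerritoryKey INTEGER,
                        SalesAmount REAL);
INSERT INTO DimProduct VALUES (1, 'Bike'), (2, 'Helmet');
INSERT INTO DimSalesTerritory VALUES (1, 'Canada'), (2, 'France');
INSERT INTO FactSales VALUES (1, 1, 500.0), (2, 1, 30.0), (1, 2, 450.0);
""")

# Every dimension is exactly one join away from the fact table.
rows = conn.execute("""
SELECT t.SalesTerritoryCountry, p.ProductName, SUM(f.SalesAmount)
FROM FactSales f
JOIN DimProduct p ON p.ProductKey = f.ProductKey
JOIN DimSalesTerritory t ON t.SalesTerritoryKey = f.SalesTerritoryKey
GROUP BY t.SalesTerritoryCountry, p.ProductName
ORDER BY t.SalesTerritoryCountry, p.ProductName
""").fetchall()
```

The query never needs more than one join per dimension, which is the defining property of a star schema.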

Right-click on the Data Source Views folder and select New Data Source View, and a wizard should pop up with a
Welcome screen. Select “Next”, and the next screen should prompt you to select a relational data source.
Select the data source we just created and click “Next”. The next screen should prompt you to select the tables
that we intend to use in our solution. Select the tables as shown in the screenshot below. These fact and
dimension tables are chosen because they are interlinked with each other and suit the requirements of the
exercises to follow.

Select “Next”, give the DSV an appropriate name, and this should finally create your Data Source View.
After arranging the tables in the DSV, your schema should look similar to the screenshot below.

In the above figure, you can see that both the fact tables are related to all three dimensions in the same

manner. This is a typical case of a star schema. You can also browse the data, create calculated fields, assign

primary keys and carry out other similar functions in this designer to modify the schema without modifying

the actual schema in the database.



SSAS INTERVIEW QUESTIONS AND ANSWERS

MARCH 12, 2013

What is the difference between SSAS 2005 and SSAS 2008?

1. In 2005 it is not possible to create an empty cube, but in 2008 we can create an empty cube.

2. A new feature in Analysis Services 2008 is the Attribute Relationships tab in the Dimension
Designer. Implementing attribute relationships is more complex in SSAS 2005.

3. We can create ONLY 2000 partitions per Measure Group in SSAS 2005; this limit on
partitions is removed in SSAS 2008.

You can mention more differences, but if you finish with these points the interviewer will feel that you are
REALLY EXPERIENCED.





















http://dwbitutorials.wordpress.com/2013/03/12/ssas-interview-questions-and-answers/


What is a data warehouse (DWH)?

The datawarehouse is an informational environment that

Provides an integrated and total view of the enterprise

Makes the enterprise’s current and historical information easily available for decision making

Makes decision-support transactions possible without hindering operational systems

Renders the organization’s information consistent

Presents a flexible and interactive source of strategic information

OR, in Bill Inmon's definition, a data warehouse is a

Subject oriented

Integrated

Time variant

Non volatile

collection of data in support of management’s decision making process. He defined the terms in the
sentence as follows.

Subject oriented:
It is organized around a specific business domain, e.g. banking, retail, insurance, etc.

Integrated:
It should be able to integrate data from various source systems,
e.g. SQL Server, Oracle, DB2, etc.

Time variant:
It should be able to maintain data across various time periods.

Non volatile:
Once data is inserted it is not changed.

What is data mart?

A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data

subject that may be distributed to support business needs. Data marts are analytical data stores designed to

focus on specific business functions for a specific community within an organization.

Data marts are often derived from subsets of data in a data warehouse, though in the bottom-up data

warehouse design methodology the data warehouse is created from the union of organizational data marts.

There are 3 types of data mart:

1. Dependent

2. Independent

3. Logical data mart

What is the difference between a data mart and a data warehouse?

A data warehouse holds the complete data, whereas a data mart is a subset of it.

http://en.wikipedia.org/wiki/Data_warehouse

Ex:

All of an organisation's data, such as data for the finance, HR and banking departments, is stored in the data
warehouse, whereas a data mart stores only the finance data or only the HR department data. So a data
warehouse can be seen as a collection of different data marts.

Have you ever worked on performance tuning? If yes, what are the steps involved in it?

We need to identify the bottlenecks to tune the performance. To overcome the bottlenecks we need to
do the following.

1. Avoid named queries

2. Avoid unnecessary relationships between tables

3. Define proper attribute relationships

4. Use a proper aggregation design

5. Partition the data properly

6. Use a proper dimension usage design

7. Avoid unnecessary many-to-many relationships

8. Avoid unnecessary measures

9. Set AttributeHierarchyEnabled = FALSE for attributes that are not required

10. Do not include even a single measure that is not necessary.

What are the difficulties faced in cube development?

This question is asked either to test whether you are really experienced or when the interviewer does not have
any other questions to ask.

You can mention any area where you find it difficult to work, but the best answers are usually the following.

1. Giving attribute relationships

2. Calculations

3. Giving dimension usage (many to many relationship)

4. Analyzing the requirements

Explain the flow of creating a cube?

Steps to create a cube in ssas

1. Create a data source.

2. Create a datasource view.

3. Create Dimensions

4. Create a cube.

5. Deploy and Process the cube.

What is a datasource or DS?

The data source is the physical connection information that Analysis Services uses to connect to the database
that hosts the data. The data source contains the connection string, which specifies the server and the
database hosting the data as well as any necessary authentication credentials.
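To see the pieces a connection string carries, the Python sketch below splits one into its key/value parts. The example string is an assumption about what a typical SQL Server Native Client connection looks like; your provider, server name and database may differ.

```python
def parse_connection_string(text):
    """Split 'key=value;key=value' pairs into a dictionary."""
    parts = {}
    for pair in text.split(";"):
        if not pair.strip():
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        parts[key.strip()] = value.strip()
    return parts


# Assumed example string; adjust provider, server and database to your setup.
EXAMPLE = ("Provider=SQLNCLI10.1;Data Source=localhost;"
           "Integrated Security=SSPI;Initial Catalog=AdventureWorksDW2008R2")

settings = parse_connection_string(EXAMPLE)
```

Here "Data Source" names the server, "Initial Catalog" names the database, and "Integrated Security" selects Windows authentication.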

What is datasourceview or DSV?

A data source view is a persistent set of tables from a data source that supply the data for a particular cube.

BIDS also includes a wizard for creating data source views, which you can invoke by right-clicking on the

Data Source Views folder in Solution Explorer.

1. Datasource view is the logical view of the data in the data source.

2. Data source view is the only thing a cube can see.

What is named calculation?

A named calculation is a SQL expression represented as a calculated column. This expression appears and

behaves as a column in the table. A named calculation lets you extend the relational schema of existing

tables or views in a data source view without modifying the tables or views in the underlying data source.

Named calculation is used to create a new column in the DSV using hard coded values or by using existing

columns or even with both.
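Conceptually, a named calculation behaves like an extra expression appended to the SELECT list that reads the table. The sketch below imitates this with Python's built-in sqlite3 module. The `select_with_calculations` helper and the column names are hypothetical, and SQLite concatenates strings with `||` where T-SQL would use `+`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (CustomerKey INTEGER PRIMARY KEY,
                          FirstName TEXT, LastName TEXT);
INSERT INTO DimCustomer VALUES (1, 'Jon', 'Yang');
""")

# Calculated columns kept outside the database, as the DSV keeps them.
NAMED_CALCULATIONS = {
    "FullName": "FirstName || ' ' || LastName",  # built from existing columns
    "DataOrigin": "'AdventureWorks DW'",         # hard-coded value
}


def select_with_calculations(table, calculations):
    """Build a SELECT that exposes each calculation as an additional column."""
    extra = ", ".join(f"{expr} AS {name}" for name, expr in calculations.items())
    return f"SELECT *, {extra} FROM {table}"


row = conn.execute(
    select_with_calculations("DimCustomer", NAMED_CALCULATIONS)).fetchone()
```

The underlying table still has three columns, while the reader of the query sees five.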

What is named query?

A named query in a DSV is similar to a view in a database. It is used to create a virtual table in the DSV without
affecting the underlying database. A named query is mainly used to merge two or more tables in the
data source view or to filter the columns of a table.

Why do we need named queries?

A named query is used to join multiple tables or to remove unnecessary columns from a table of a database.
You can achieve the same in the database using views, but named queries are the best bet when you don’t
have access to create views in the database.
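The difference from a database view is where the query text lives. In the sketch below (Python with the built-in sqlite3 module), the query is stored on the client side and wrapped as a subquery, so nothing is created in the database. The `NAMED_QUERIES` dictionary, the `read_named_query` helper and the tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimProductCategory (CategoryKey INTEGER PRIMARY KEY, CategoryName TEXT);
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, ProductName TEXT,
                         Color TEXT, CategoryKey INTEGER);
INSERT INTO DimProductCategory VALUES (1, 'Bikes');
INSERT INTO DimProduct VALUES (10, 'Road-150', 'Red', 1);
""")

# The query text lives outside the database, as a named query lives in the DSV.
NAMED_QUERIES = {
    "ProductWithCategory": """
        SELECT p.ProductKey, p.ProductName, c.CategoryName
        FROM DimProduct p
        JOIN DimProductCategory c ON c.CategoryKey = p.CategoryKey
    """,
}


def read_named_query(connection, name):
    """Read a named query as if it were a table, without creating a view."""
    sql = f"SELECT * FROM ({NAMED_QUERIES[name]}) AS {name}"
    return connection.execute(sql).fetchall()


rows = read_named_query(conn, "ProductWithCategory")
views = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'view'").fetchone()[0]
```

The query joins two tables and drops the Color column, yet the database itself still contains no views.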

How will you add a new column to an existing table in data source view?

By using named calculations we can add a new column to an existing table in the data source view. Named

Calculation is explained above.

What is a dimension table?

A dimension table contains the hierarchical data by which you’d like to summarize. It contains
specific business information, including the specific name of each member of the
dimension. The name of the dimension member is called an “attribute”.

The key attribute in the dimension must contain a unique value for each member of the dimension. This key
attribute is called the “primary key column”.

The primary key column of each dimension table corresponds to one of the key columns in any related
fact table.

What is a fact table?

A fact table contains the basic information that you wish to summarize. The table that stores the detailed
values for measures is called a fact table. Put simply, it is “the table which contains the
METRICS” that are used to analyse the business.

It consists of 2 sections:

1) Foreign keys to the dimensions

2) Measures/facts (numerical values that are used to monitor business activity)

What is a factless fact table?

This is a very important interview question. A “Factless Fact Table” is a table which is similar to a fact table
except that it has no measures; it holds only the links to the dimensions. These tables
enable you to track events; indeed they exist for recording events.

Factless fact tables are used for tracking a process or collecting stats. They are called so because the fact
table does not have aggregatable numeric values or information. They hold only key values referencing
the dimensions, from which the stats can be collected.
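Because a factless fact table has no measure column, the only thing left to aggregate is the number of rows. The sketch below (Python with the built-in sqlite3 module) uses a hypothetical class attendance example, where each row records that an event happened.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimClass (ClassKey INTEGER PRIMARY KEY, ClassName TEXT);
-- Factless fact table: only foreign keys, no measure columns.
CREATE TABLE FactAttendance (DateKey INTEGER, StudentKey INTEGER, ClassKey INTEGER);
INSERT INTO DimClass VALUES (1, 'Maths'), (2, 'Physics');
INSERT INTO FactAttendance VALUES
    (20130311, 101, 1), (20130311, 102, 1), (20130312, 101, 2);
""")

# The "measure" is simply the count of recorded events.
attendance = conn.execute("""
SELECT c.ClassName, COUNT(*)
FROM FactAttendance f
JOIN DimClass c ON c.ClassKey = f.ClassKey
GROUP BY c.ClassName
ORDER BY c.ClassName
""").fetchall()
```

In a cube, such a table would typically back a measure group whose only measure is a row count.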

What are attribute relationships, and why do we need them?

Attribute relationships are the way of telling the Analysis Services engine how the attributes are related
to each other. Processing time will be decreased if proper relationships are given. This improves both
cube processing performance and MDX query performance.

In Microsoft SQL Server Analysis Services, attributes within a dimension are always related either directly or

indirectly to the key attribute. When you define a dimension based on a star schema, which is where all

dimension attributes are derived from the same relational table, an attribute relationship is automatically

defined between the key attribute and each non-key attribute of the dimension. When you define a

dimension based on a snowflake schema, which is where dimension attributes are derived from multiple

related tables, an attribute relationship is automatically defined as follows:

Between the key attribute and each non-key attribute bound to columns in the main dimension

table.

Between the key attribute and the attribute bound to the foreign key in the secondary table that

links the underlying dimension tables.

Between the attribute bound to foreign key in the secondary table and each non-key attribute bound

to columns from the secondary table.

How many types of attribute relationships are there?

There are 2 types of attribute relationships:

1. Rigid

2. Flexible

Rigid: In a rigid relationship, the relationship between the attributes is fixed; attribute members will not
change levels or their respective attribute relationships.

Example: The time dimension. We know that the month “January 2009” will ONLY belong to the year “2009” and it
won't be moved to any other year.

Flexible: In a flexible relationship, the relationship between the attributes can change.

Example: An employee and department. An employee can be in the Accounts department today, but it is possible
that the employee will be in the Marketing department tomorrow.

How many types of dimensions are there and what are they?

There are 3 types of dimensions:

1. Conformed dimension

2. Junk dimension

3. Degenerate dimension

What are conformed dimensions, junk dimensions and degenerate dimensions?

Conformed dimension: A dimension which is shared across multiple fact tables or data models. It should
not be confused with a role-playing dimension, which is a single dimension used several times in the same
cube under different names.

Junk dimension: A number of very small dimensions might be lumped together to
form a single dimension, a junk dimension, in which the attributes are not closely related. Grouping random
flags and text attributes and moving them to a separate sub-dimension is known as creating a junk
dimension.

Degenerate dimension: A degenerate dimension keeps its values in the fact table, and
there is no corresponding dimension table. A degenerate dimension is a dimension key without a
corresponding dimension.

Example: In the PointOfSale Transaction Fact table, we have:

Date Key (FK), Product Key (FK), Store Key (FK), Promotion Key (FK), and POS Transaction Number

The Date dimension corresponds to Date Key, and the Product dimension corresponds to Product Key. In a
traditional parent-child database, POS Transaction Number would be the key to the transaction header
record that contains all the info valid for the transaction as a whole, such as the transaction date and
store identifier. But in this dimensional model, we have already extracted this info into other dimensions.
Therefore, POS Transaction Number looks like a dimension key in the fact table but does not have a
corresponding dimension table.

What are the types of database schema?

There are 3 types of database schema:

1. Star

2. Snowflake

3. Starflake

What is star, snowflake and star flake schema?

Star schema: In a star schema the fact table is directly linked with all dimension tables. The star schema’s
dimensions are denormalized, with each dimension being represented by a single table. In a star schema a
central fact table connects a number of individual dimension tables.

Snowflake: The snowflake schema is an extension of the star schema, where each point of the star
explodes into more points. In a star schema, each dimension is represented by a single dimensional table,
whereas in a snowflake schema, that dimensional table is normalized into multiple lookup tables, each
representing a level in the dimensional hierarchy. In a snowflake schema, some dimension tables are linked
directly to the fact table, while others are reached only through intermediate dimension tables.

Star flake: A hybrid structure that contains a mixture of star (denormalized) and snowflake (normalized)
schemas.
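The practical effect of snowflaking is an extra join on the way from the fact table to a higher level of the hierarchy. The sketch below (Python with the built-in sqlite3 module) normalizes a hypothetical product dimension into product and category tables. Names and data are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimProductCategory (CategoryKey INTEGER PRIMARY KEY, CategoryName TEXT);
CREATE TABLE DimProduct (
    ProductKey INTEGER PRIMARY KEY, ProductName TEXT,
    CategoryKey INTEGER REFERENCES DimProductCategory(CategoryKey));
CREATE TABLE FactSales (
    ProductKey INTEGER REFERENCES DimProduct(ProductKey), SalesAmount REAL);
INSERT INTO DimProductCategory VALUES (1, 'Bikes'), (2, 'Accessories');
INSERT INTO DimProduct VALUES
    (10, 'Road-150', 1), (11, 'Mountain-100', 1), (20, 'Helmet', 2);
INSERT INTO FactSales VALUES (10, 500.0), (11, 400.0), (20, 30.0);
""")

# The category is two joins away from the fact table: fact -> product -> category.
by_category = conn.execute("""
SELECT c.CategoryName, SUM(f.SalesAmount)
FROM FactSales f
JOIN DimProduct p ON p.ProductKey = f.ProductKey
JOIN DimProductCategory c ON c.CategoryKey = p.CategoryKey
GROUP BY c.CategoryName
ORDER BY c.CategoryName
""").fetchall()
```

In a star schema, CategoryName would instead be a column of DimProduct, and the second join would disappear.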

How will you hide an attribute?

We can hide the attribute by selecting “AttributeHierarchyVisible = False” in properties of the attribute.

How will you prevent an attribute hierarchy from being processed?

By setting “AttributeHierarchyEnabled = False”, we prevent Analysis Services from building the attribute
hierarchy for that attribute.

What is the use of the IsAggregatable property?

In Analysis Services we generally see that every dimension has an All member. This is because of the
IsAggregatable property of the attribute. You can set its value to False so that the All member is not shown.
The All member is the default member for that attribute. If you hide this member, you will have to set another
attribute member as the default member; otherwise Analysis Services will pick some member as the default,
and this will create confusion when browsing data if someone is not aware of the change in default member.

http://www.1keydata.com/datawarehousing/www.1keydata.com/datawarehousing/star-schema.html

What are key, name and value columns of an attribute?

Key column of any attribute: Contains the column or columns that represent the key for the attribute,

which is the column in the underlying relational table in the data source view to which the attribute is

bound. The value of this column for each member is displayed to users unless a value is specified for the

NameColumn property.

Name column of an attribute: Identifies the column that provides the name of the attribute that is

displayed to users, instead of the value in the key column for the attribute. This column is used when the key

column value for an attribute member is cryptic or not otherwise useful to the user, or when the key column

is based on a composite key. The NameColumn property is not used in parent-child hierarchies; instead, the

NameColumn property for child members is used as the member names in a parent-child hierarchy.

Value columns of an attribute: Identifies the column that provides the value of the attribute. If the

NameColumn element of the attribute is specified, the same DataItem values are used as default values for

the ValueColumn element. If the NameColumn element of the attribute is not specified and the KeyColumns

collection of the attribute contains a single KeyColumn element representing a key column with a string data

type, the same DataItem values are used as default values for the ValueColumn element.

What is a hierarchy, what are its types and what is the difference between them?

A hierarchy is a very important part of any OLAP engine and allows users to drill down from summary levels.
Hierarchies represent the way users expect to explore data at a more detailed level.
A hierarchy is made up of multiple levels, creating a structure based on end user requirements.
For example, year -> quarter -> month -> week are all levels of a calendar hierarchy.

There are 2 types of hierarchies:

1. Natural hierarchy

2. Unnatural hierarchy

Natural hierarchy: This means that the attributes are intuitively related to one another. There is a clear

relationship from the top of the hierarchy to the bottom.

Example: An example of this would be date: year, quarter and month follow from each other, and in part,

define each other.

Unnatural hierarchy: This means that the attributes are not clearly related.

Example: An example of this might be geography; we may have country -> state -> city, but it is not clear

where Province might sit.

What is Attribute hierarchy?

An attribute hierarchy is created for every attribute in a dimension, and each hierarchy is available for

dimensioning fact data. This hierarchy consists of an “All” level and a detail level containing all members of

the hierarchy.

You can organize attributes into user-defined hierarchies to provide navigation paths in a cube. Under

certain circumstances, you may want to disable or hide some attributes and their hierarchies.

What is use of AttributeHierarchyDisplayFolder property ?

AttributeHierarchyDisplayFolder: Identifies the folder in which to display the associated attribute

hierarchy to end users. For example if I set the property value as “Test” to all the Attributes of a dimension

then a folder with the name “Test” will be created and all the Attributes will be placed into the same.

What is use of AttributeHierarchyEnabled?

AttributeHierarchyEnabled: Determines whether an attribute hierarchy is generated by Analysis

Services for the attribute. If the attribute hierarchy is not enabled, the attribute cannot be used in a user-

defined hierarchy and the attribute hierarchy cannot be referenced in Multidimensional Expressions (MDX)

statements.

What is use of AttributeHierarchyOptimizedState?

AttributeHierarchyOptimizedState: Determines the level of optimization applied to the attribute

hierarchy. By default, an attribute hierarchy is FullyOptimized, which means that Analysis Services builds

indexes for the attribute hierarchy to improve query performance. The other option, NotOptimized, means

that no indexes are built for the attribute hierarchy. Using NotOptimized is useful if the attribute hierarchy

is used for purposes other than querying, because no additional indexes are built for the attribute. Other

uses for an attribute hierarchy can be helping to order another attribute.

What is use of AttributeHierarchyOrdered ?

AttributeHierarchyOrdered: Determines whether the associated attribute hierarchy is ordered. The

default value is True. However, if an attribute hierarchy will not be used for querying, you can save

processing time by changing the value of this property to False.

What is the use of AttributeHierarchyVisible ?

AttributeHierarchyVisible : Determines whether the attribute hierarchy is visible to client applications.

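The default value is True. If users should not browse an attribute hierarchy directly (for example, because it is exposed only through a user-defined hierarchy), you can hide it by changing the value of this property to False. A hidden attribute hierarchy can still be referenced in MDX statements.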

What are types of storage modes?

There are three standard storage modes in OLAP applications

1. MOLAP

2. ROLAP

3. HOLAP

Compare the Three Storage Modes ?

Summary and comparison

Storage mode | Detail data location | Aggregations location | Storage space requirement | Query response time | Processing time | Latency
MOLAP | Multidimensional format | Multidimensional format | Medium (detail data is stored in compressed format) | Fast | Fast | High
HOLAP | Relational database | Multidimensional format | Small | Medium | Fast | Medium
ROLAP | Relational database | Relational database | Large | Slow | Slow | Low

What is MOLAP and its advantage?

MOLAP (Multidimensional Online Analytical Processing): MOLAP is the most used storage type. It is designed to offer maximum query performance to users. The data and aggregations are stored in a multidimensional format, compressed and optimized for performance. This is both good and bad: queries are fast, but when a cube with MOLAP storage is processed, the data is pulled from the relational database, the aggregations are performed, and the data is stored in the AS database. The data inside the cube is refreshed only when the cube is processed, so latency is high.

Advantages:

1. Since the data is stored on the OLAP server in optimized format, queries (even complex

calculations) are faster than ROLAP.

2. The data is compressed so it takes up less space.

3. Because the data is stored on the OLAP server, you don’t need to keep the connection to the relational database.


4. Cube browsing is fastest using MOLAP.

What is ROLAP and its advantage?

ROLAP (Relational Online Analytical Processing) : ROLAP does not have the high latency disadvantage of

MOLAP. With ROLAP, the data and aggregations are stored in relational format. This means that there will

be zero latency between the relational source database and the cube.

The disadvantage of this mode is performance: it gives the poorest query performance because no objects benefit from multidimensional storage.

Advantages:

1. Since the data is kept in the relational database instead of on the OLAP server, you can view the

data in almost real time.

2. Also, since the data is kept in the relational database, it allows for much larger amounts of data,

which can mean better scalability.

3. Low latency.

What is HOLAP and its advantage?

Hybrid Online Analytical Processing (HOLAP): HOLAP is a combination of MOLAP and ROLAP. HOLAP stores

the detail data in the relational database but stores the aggregations in multidimensional format. Because of this, the aggregations need to be reprocessed when changes occur. HOLAP gives medium query performance: not as slow as ROLAP, but not as fast as MOLAP. If, however, you were only

querying aggregated data or using a cached query, query performance would be similar to MOLAP. But when

you need to get that detail data, performance is closer to ROLAP.

Advantages:

1. HOLAP is best used when large amounts of aggregations are queried often with little detail data,

offering high performance and lower storage requirements.

2. Cubes are smaller than MOLAP since the detail data is kept in the relational database.

3. Processing time is less than MOLAP since only aggregations are stored in multidimensional format.

4. Low latency since processing takes place when changes occur and detail data is kept in the relational database.
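To see how the three modes differ in where data lives and how stale it can get, here is a toy model. It is only a sketch: the class names and the sample rows are made up, and real SSAS storage is far more involved. The point is just which queries hit the stored copy and which go back to the relational source.

```python
# Toy relational source: a list of (product, amount) rows.
source_rows = [("bike", 100), ("bike", 50), ("helmet", 20)]

def aggregate(rows):
    totals = {}
    for product, amount in rows:
        totals[product] = totals.get(product, 0) + amount
    return totals

class MolapCube:
    """Keeps its own copy of detail data and aggregations, refreshed only by process()."""
    def __init__(self, source):
        self.source = source
        self.detail = []
        self.totals = {}
    def process(self):
        self.detail = list(self.source)
        self.totals = aggregate(self.detail)
    def total(self, product):
        return self.totals.get(product, 0)

class RolapCube:
    """Stores nothing; every query goes back to the relational source."""
    def __init__(self, source):
        self.source = source
    def total(self, product):
        return aggregate(self.source).get(product, 0)

class HolapCube:
    """Stores aggregations only; detail queries go to the relational source."""
    def __init__(self, source):
        self.source = source
        self.totals = {}
    def process(self):
        self.totals = aggregate(self.source)
    def total(self, product):
        return self.totals.get(product, 0)
    def detail_rows(self, product):
        return [row for row in self.source if row[0] == product]

molap, rolap, holap = MolapCube(source_rows), RolapCube(source_rows), HolapCube(source_rows)
molap.process()
holap.process()

source_rows.append(("bike", 25))   # new data arrives after processing

print(molap.total("bike"))              # 150: stale until the cube is reprocessed
print(rolap.total("bike"))              # 175: always reads the source
print(holap.total("bike"))              # 150: aggregations are stale
print(len(holap.detail_rows("bike")))   # 3: detail comes from the source
```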


What are Translations and its use?

Translation: The translation feature in Analysis Services allows you to display captions and attribute names in a specific language. It helps in providing globalization for the cube.

What is Database dimension?

All the dimensions that are created using NEW DIMENSION Wizard are database dimensions. In other

words, the dimensions which are at Database level are called Database Dimensions.

What is Cube dimension?

An instance of a database dimension within a cube is called a cube dimension. A database dimension can be used in multiple cubes, and multiple cube dimensions can be based on a single database dimension.

Difference between Database dimension and Cube dimension?

1. The Database dimension has only Name and ID properties, whereas a Cube dimension has several

more properties.

2. A database dimension is created once, whereas a cube dimension is a reference to a database dimension.

3. A database dimension exists only once, whereas more than one cube dimension can be created from it using the role-playing dimension concept.

How will you add a dimension to cube?

To add a dimension to a cube follow these steps.

1. In Solution Explorer, right-click the cube, and then click View Designer.

2. In the Design tab for the cube, click the Dimension Usage tab.

3. Either click the Add Cube Dimension button, or right-click anywhere on the work surface and then click Add Cube Dimension.

4. In the Add Cube Dimension dialog box, use one of the following steps:

a. To add an existing dimension, select the dimension, and then click OK.

b. To create a new dimension to add to the cube, click New dimension, and then follow the steps in the Dimension Wizard.

What is SCD (slowly changing dimension)?

Slowly changing dimensions (SCD) determine how the historical changes in the dimension tables are

handled. Implementing the SCD mechanism enables users to know which category an item belonged to on any given date.

What are types of SCD?

SCD is a concept for STORING historical changes, and whenever someone finds a new way to store them, a new type comes into the picture. Basically there are 3 types of SCD:

1. SCD type1

2. SCD type2

3. SCD type3

What is Type1, Type2, Type3 of SCD?

Type 1: In Type 1 Slowly Changing Dimension, the new information simply overwrites the original

information. In other words, no history is kept.

In our example, recall we originally have the following table:

Customer Key Name State

1001 Christina Illinois

After Christina moved from Illinois to California, the new information replaces the original record, and we have the following table:

Customer Key  Name       State
1001          Christina  California

Advantages: This is the easiest way to handle the Slowly Changing Dimension problem, since there is no

need to keep track of the old information.

Disadvantages: All history is lost. By applying this methodology, it is not possible to trace back in history.

Usage: About 50% of the time.

When to use Type 1: Type 1 slowly changing dimension should be used when it is not necessary for the data

warehouse to keep track of historical changes.
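As a rough illustration of Type 1, the sketch below overwrites the attribute in place. The dictionary layout and the function name are assumptions for the example; in practice this would be an UPDATE performed by the ETL process.

```python
def apply_type1(dimension, customer_key, new_state):
    """Overwrite the attribute in place; the previous value is not kept."""
    for row in dimension:
        if row["customer_key"] == customer_key:
            row["state"] = new_state
    return dimension

customers = [{"customer_key": 1001, "name": "Christina", "state": "Illinois"}]
apply_type1(customers, 1001, "California")
print(customers)
```

After the call there is still one row for Christina, and nothing in the table records that she ever lived in Illinois.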

Type 2: In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new

information. Therefore, both the original and the new record will be present. The new record gets its own

primary key.




After Christina moved from Illinois to California, we add the new information as a new row into the table:

Customer Key  Name       State
1001          Christina  Illinois
1005          Christina  California

Advantages: This allows us to accurately keep all historical information.

Disadvantages:

1. This will cause the size of the table to grow fast. In cases where the number of rows for the table is

very high to start with, storage and performance can become a concern.

2. This necessarily complicates the ETL process.

Usage: About 50% of the time.
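A minimal sketch of Type 2, assuming the same dictionary layout as a stand-in for the dimension table: the change is appended as a new row with its own key, and the old row is left alone. The function name is hypothetical. Real implementations usually also carry effective dates or a current-row flag, which the example in the text does not show, so they are left out here.

```python
def apply_type2(dimension, name, new_state, new_key):
    """Append a new row with its own key; earlier rows stay untouched."""
    dimension.append({"customer_key": new_key, "name": name, "state": new_state})
    return dimension

customers = [{"customer_key": 1001, "name": "Christina", "state": "Illinois"}]
apply_type2(customers, "Christina", "California", new_key=1005)

for row in customers:
    print(row)
```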

Type 3: In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute

of interest, one indicating the original value, and one indicating the current value. There will also be a

column that indicates when the current value becomes active.




To accommodate Type 3 Slowly Changing Dimension, we will now have the following columns:

Customer Key, Name, Original State, Current State, Effective Date

After Christina moved from Illinois to California, the original information gets updated, and we have the following table (assuming the effective date of change is January 15, 2003):

Customer Key  Name       Original State  Current State  Effective Date
1001          Christina  Illinois        California     15-JAN-2003

Advantages:

1. This does not increase the size of the table, since new information is updated.

2. This allows us to keep some part of history.

Disadvantages: Type 3 will not be able to keep all history where an attribute is changed more than once. For

example, if Christina later moves to Texas on December 15, 2003, the California information will be lost.

Usage: Type 3 is rarely used in actual practice.
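The Type 3 behaviour, including the loss of intermediate history, can be sketched like this. The column names mirror the ones listed above; the function name and the dictionary layout are assumptions for the example.

```python
from datetime import date

def apply_type3(row, new_state, effective):
    """Keep the original value; overwrite the current value and its effective date."""
    row["current_state"] = new_state
    row["effective_date"] = effective
    return row

christina = {
    "customer_key": 1001,
    "name": "Christina",
    "original_state": "Illinois",
    "current_state": "Illinois",
    "effective_date": None,
}
apply_type3(christina, "California", date(2003, 1, 15))
apply_type3(christina, "Texas", date(2003, 12, 15))  # California is no longer recorded anywhere
print(christina)
```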

What is role playing dimension with two examples?

Role-playing dimensions: A role-playing dimension is a single database dimension that is used more than once in the same cube, each time in a different role. (It should not be confused with a conformed dimension, which is a dimension shared across fact tables or cubes.) You can recognize a role-playing dimension when there are multiple columns in a fact table that each have foreign keys to the same dimension table.

Ex1: There are three dimension keys in the FactInternetSales and FactResellerSales tables which all refer to the DimTime table, so the same time dimension is used in three roles (for example order date, ship date and due date). Because the time dimension is used by measure groups that contain either of these fact tables, the corresponding role-playing dimensions are automatically added to the cube.

Ex2: In retail banking, for a checking account cube we could have a transaction date dimension and an effective date dimension. Both dimensions have date, month, quarter and year attributes. The formats of the attributes are the same on both dimensions; for example, the date attribute is in ‘dd-mm-yyyy’ format. Both dimensions have members from 1993 to 2010.
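Here is a small sketch of the idea: one date dimension, joined to the same fact rows through different foreign key columns. The table contents, key values and column names are hypothetical.

```python
# One shared date dimension (hypothetical keys and attributes).
dim_date = {
    20130101: {"year": 2013, "month": 1},
    20130215: {"year": 2013, "month": 2},
}

# Each fact row has several foreign keys into the same dimension.
fact_sales = [
    {"order_date_key": 20130101, "ship_date_key": 20130215, "amount": 100},
    {"order_date_key": 20130215, "ship_date_key": 20130215, "amount": 40},
]

def sales_by_month(facts, role_key):
    """Total the amount by month, using the date dimension in the role given by role_key."""
    totals = {}
    for row in facts:
        month = dim_date[row[role_key]]["month"]
        totals[month] = totals.get(month, 0) + row["amount"]
    return totals

print(sales_by_month(fact_sales, "order_date_key"))  # {1: 100, 2: 40}
print(sales_by_month(fact_sales, "ship_date_key"))   # {2: 140}
```

The same dimension data answers both questions; only the key used for the join changes with the role.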

What is measure group, measure?

Measure groups: A measure group is a collection of measures that come from the same fact table. Different measure groups can contain different dimensions and be at different granularity

but so long as you model your cube correctly, your users will be able to use measures from each of these

measure groups in their queries easily and without worrying about the underlying complexity.

Creating multiple measure groups : To create a new measure group in the Cube Editor, go to the Cube

Structure tab and right-click on the cube name in the Measures pane and select ‘New Measure Group’. You’ll

then need to select the fact table to create the measure group from and then the new measure group will be

created; any columns that aren’t used as foreign key columns in the DSV will automatically be created as

measures, and you’ll also get an extra measure of aggregation type Count. It’s a good idea to delete any

measures you are not going to use at this stage.

Measures : Measures are the numeric values that our users want to aggregate, slice, dice and otherwise

analyze, and as a result, it’s important to make sure they behave the way we want them to. One of the

fundamental reasons for using Analysis Services is that, unlike a relational database it allows us to build into

our cube design business rules about measures: how they should be formatted, how they should aggregate up,

how they interact with specific dimensions and so on.

What is attribute?

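In general terms, an attribute is a specification that defines a property of an object, element, or file. It may also refer to or set the specific value for a given instance of such. In SSAS, an attribute is a property of a dimension member and is bound to one or more columns of the dimension table (for example, the City attribute of a Customer dimension).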

What is surrogate key?

A surrogate key is a system-generated key which acts as an alternate primary key for a table in the database. Data warehouses commonly use a surrogate key to uniquely identify an entity. A surrogate key is not generated by the user but by the system. In some databases, the primary difference between a primary key and a surrogate key is that the primary key uniquely identifies a record while the surrogate key uniquely identifies an entity.

Ex: An employee may be recruited before the year 2000 while another employee with the same name may be recruited after the year 2000. Here, the primary key will uniquely identify the record, while the surrogate key will be generated by the system (say, a serial number), since the SK is NOT derived from the data.
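A minimal sketch of a surrogate key generator, assuming a simple serial number as described above. The class name and the employee numbers are made up for illustration; in a real warehouse this is normally an identity column or a sequence.

```python
from itertools import count

class SurrogateKeyGenerator:
    def __init__(self, start=1):
        self._next = count(start)
        self._keys = {}

    def key_for(self, business_key):
        """Return the surrogate key for a business key, assigning a new one on first sight."""
        if business_key not in self._keys:
            self._keys[business_key] = next(self._next)
        return self._keys[business_key]

gen = SurrogateKeyGenerator()
# Two employees with the same name but different (hypothetical) employee numbers.
first = gen.key_for(("E-1998-017", "John Smith"))
second = gen.key_for(("E-2004-233", "John Smith"))
print(first, second)  # 1 2
```

The keys carry no meaning of their own: they are just serial numbers, so two people with the same name still get different keys.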

How many types of relations are there between dimension and measure group?

There are six types of relationship between a dimension and a measure group:

1. No Relationship

2. Regular

3. Referenced

4. Many to Many

5. Data Mining

6. Fact

What is regular type, no relation type, fact type, referenced type, many-to-many type

with example?

No relationship: The dimension and measure group are not related.

Regular: The dimension table is joined directly to the fact table.

Referenced: The dimension table is joined to an intermediate table, which in turn is joined to the fact table.

Many to many: The dimension table is joined to an intermediate fact table; the intermediate fact table is joined, in turn, to an intermediate dimension table to which the fact table is joined.

Data mining: The target dimension is based on a mining model built from the source dimension. The source dimension must also be included in the cube.

Fact: The dimension table is the fact table.

What are calculated members and what is its use?

Calculations are items in the cube that are evaluated at runtime.

Calculated members: You can create customized measures or dimension members, called calculated

members, by combining cube data, arithmetic operators, numbers, and/or functions.

Example: You can create a calculated member called Marks that converts dollars to marks by multiplying an

existing dollar measure by a conversion rate. Marks can then be displayed to end users in a separate row or

column. Calculated member definitions are stored, but their values exist only in memory. In the preceding

example, values in marks are displayed to end users but are not stored as cube data.
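The Marks example can be sketched as follows. Only the definition of the calculated member is stored; the value is worked out whenever it is asked for. The conversion rate, the figures and all names here are assumptions for illustration, and in SSAS the definition would be written in MDX rather than Python.

```python
# Stored measure values (hypothetical figures), keyed by product.
sales_dollars = {"bike": 200.0, "helmet": 50.0}

# Assumed conversion rate, for illustration only.
MARKS_PER_DOLLAR = 1.5

calculated_members = {
    # Only the definition is stored; values are computed on request.
    "Marks": lambda product: sales_dollars[product] * MARKS_PER_DOLLAR,
}

def query(measure, product):
    if measure == "Dollars":
        return sales_dollars[product]
    return calculated_members[measure](product)

print(query("Dollars", "bike"))  # 200.0
print(query("Marks", "bike"))    # 300.0
```

Because nothing is stored for Marks, a change to the dollar measure shows up in Marks on the next query.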

What are KPIs and what is its use?

In Analysis Services, a KPI is a collection of calculations that are associated with a measure group in a cube and are used to evaluate business success. We use KPIs to see the state of the business at a particular point in time; this is represented with graphical items such as traffic signals, gauges, etc.
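As a rough sketch, a KPI boils down to a value, a goal and a status calculation that drives the graphic. The thresholds below and the mapping to traffic-light colours are assumptions chosen for the example; a real KPI would define these as MDX expressions.

```python
def kpi_status(value, goal):
    """Return 1 (good), 0 (warning) or -1 (bad) by comparing value with goal."""
    ratio = value / goal
    if ratio >= 1.0:
        return 1
    if ratio >= 0.8:   # assumed threshold, for illustration only
        return 0
    return -1

# Traffic-light style indicator chosen from the status.
INDICATOR = {1: "green", 0: "yellow", -1: "red"}

for value in (120, 85, 50):
    status = kpi_status(value, goal=100)
    print(value, status, INDICATOR[status])
```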

What are actions, how many types of actions are there, explain with example?

Actions are powerful way of extending the value of SSAS cubes for the end user. They can click on a cube or

portion of a cube to start an application with the selected item as a parameter, or to retrieve information

about the selected item.

One of the objects supported by a SQL Server Analysis Services cube is the action. An action is an event that

a user can initiate when accessing cube data. The event can take a number of forms. For example, a user

might be able to view a Reporting Services report, open a Web page, or drill through to detailed information

related to the cube data.

Analysis Services supports three types of actions:

Report action: Returns a Reporting Services report that is associated with the cube data on which the action is based.

Drillthrough: Returns a result set that provides detailed information related to the cube data on which the action is based.

Standard: Has five action subtypes that are based on the specified cube data.

Dataset: Returns a multidimensional dataset.

Proprietary: Returns a string that can be interpreted by a client application.

Rowset: Returns a tabular rowset.

Statement: Returns a command string that can be run by a client application.

URL: Returns a URL that can be opened by a client application, usually a browser.

What is partition, how will you implement it?

You can use the Partition Wizard to define partitions for a measure group in a cube. By default, a single

partition is defined for each measure group in a cube. Access and processing performance, however, can

degrade for large partitions. By creating multiple partitions, each containing a portion of the data for a

measure group, you can improve the access and processing performance for that measure group.
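To illustrate why splitting a measure group helps, here is a sketch that groups fact rows by month and then answers a query from one partition only. The row layout and function names are hypothetical; in SSAS each partition would normally be bound to its own source query or table.

```python
from collections import defaultdict
from datetime import date

def partition_by_month(fact_rows):
    """Group fact rows into partitions keyed by (year, month) of the order date."""
    partitions = defaultdict(list)
    for row in fact_rows:
        d = row["order_date"]
        partitions[(d.year, d.month)].append(row)
    return dict(partitions)

def total_for_month(partitions, year, month):
    # Only the matching partition is scanned; the others are skipped.
    return sum(row["amount"] for row in partitions.get((year, month), []))

facts = [
    {"order_date": date(2013, 1, 5), "amount": 10},
    {"order_date": date(2013, 1, 20), "amount": 15},
    {"order_date": date(2013, 2, 3), "amount": 7},
]
parts = partition_by_month(facts)

print(sorted(parts))                     # [(2013, 1), (2013, 2)]
print(total_for_month(parts, 2013, 1))   # 25
```

The same split also means that when February data arrives, only the February partition needs to be processed.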

What is the minimum and maximum number of partitions required for a measure

group?

In 2005 a MAX of 2000 partitions can be created per measure group and that limit is lifted in later versions.

In any version the MINIMUM is ONE Partition per measure group.

What are Aggregations and its use?

Aggregations provide performance improvements by allowing Microsoft SQL Server Analysis Services (SSAS)

to retrieve pre-calculated totals directly from cube storage instead of having to recalculate data from an

underlying data source for each query. To design these aggregations, you can use the Aggregation Design

Wizard. This wizard guides you through the following steps:

1. Selecting standard or custom settings for the storage and caching options of a partition, measure

group, or cube.

2. Providing estimated or actual counts for objects referenced by the partition, measure group, or cube.

3. Specifying aggregation options and limits to optimize the storage and query performance delivered

by designed aggregations.

4. Saving and optionally processing the partition, measure group, or cube to generate the defined

aggregations.

After you use the Aggregation Design Wizard, you can use the Usage-Based Optimization Wizard to design aggregations based on the usage patterns of the business users and client applications that query the cube.
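The idea of answering queries from pre-calculated totals can be sketched as below. The totals are built once (think of it as processing time) and later queries just look them up. The sample rows and names are assumptions; which aggregations are actually built in SSAS is decided by the aggregation design, not hard-coded like this.

```python
from collections import Counter

# Detail rows: (year, month, amount). Hypothetical sample data.
detail = [(2012, 12, 30), (2013, 1, 10), (2013, 1, 15), (2013, 2, 7)]

def build_aggregations(rows):
    """Pre-calculate totals at the year level and at the (year, month) level."""
    by_year, by_month = Counter(), Counter()
    for year, month, amount in rows:
        by_year[year] += amount
        by_month[(year, month)] += amount
    return {"year": dict(by_year), "month": dict(by_month)}

aggs = build_aggregations(detail)   # done once, at processing time

def total_for_year(year):
    # Answered from the stored aggregation, without rescanning the detail rows.
    return aggs["year"].get(year, 0)

print(total_for_year(2013))        # 32
print(aggs["month"][(2013, 1)])    # 25
```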

What is perspective, have you ever created perspective?

Perspectives are a way to reduce the complexity of cubes by hiding elements like measure groups, measures, dimensions, hierarchies etc. A perspective is nothing but a subset (slice) of a cube. For example, if we have retail and hospital data and the end user is subscribed to see only hospital data, then we can create a perspective accordingly.

What is deploy, process and build?

Build: Verifies the project files and creates several local files.

Deploy: Deploys the structure of the cube (skeleton) to the server.

Process: Reads the data from the source and builds the dimensions and cube structures.

Elaborating the same is given below.

Build: In general software terms, a build is a version of a program. As a rule, a

build is a pre-release version and as such is identified by a build number, rather than by a release number.

Reiterative (repeated) builds are an important part of the development process. Throughout development,

application components are collected and repeatedly compiled for testing purposes, to ensure a reliable final

product. Build tools, such as make or Ant, enable developers to automate some programming tasks. As a verb,

to build can mean either to write code or to put individual coded components of a program together.

Deployment: During development of an Analysis Services project in Business Intelligence Development

Studio, you frequently deploy the project to a development server in order to create the Analysis Services

database defined by the project. This is required to test the project, for example, to browse cells in the cube, browse dimension members, or verify key performance indicator (KPI) formulas.

What is the maximum size of a dimension?

The maximum size of a dimension is 4 GB; this limit comes from the 4 GB cap on the string store files of a dimension.

What are the types of processing and explain each?

The following types of processing are available in SSAS:

Process Full

Process Data

Process Index

Process Incremental

Process Structure

Unprocess

Process Default

Process Full: Processes an Analysis Services object and all the objects that it contains. When Process Full

is executed against an object that has already been processed, Analysis Services drops all data in the object,

and then processes the object. This kind of processing is required when a structural change has been made

to an object, for example, when an attribute hierarchy is added, deleted, or renamed. This processing option

is supported for cubes, databases, dimensions, measure groups, mining models, mining structures, and

partitions.

Process Data: Processes data only without building aggregations or indexes. If there is data in the

partitions, it will be dropped before re-populating the partition with source data. This processing option is

supported for dimensions, cubes, measure groups, and partitions.


Process Index: Creates or rebuilds indexes and aggregations for all processed partitions. This option causes

an error on unprocessed objects. This processing option is supported for cubes, dimensions, measure groups,

and partitions.

Process Incremental: Adds newly available fact data and processes only the relevant partitions. This

processing option is supported for measure groups, and partitions.

Process Structure: If the cube is unprocessed, Analysis Services will process, if it is necessary, all the

cube’s dimensions. After that, Analysis Services will create only cube definitions. If this option is applied to a

mining structure, it populates the mining structure with source data. The difference between this option and

the Process Full option is that this option does not iterate the processing down to the mining models

themselves. This processing option is supported for cubes and mining structures.

Unprocess : Drops the data in the object specified and any lower-level constituent objects. After the data is

dropped, it is not reloaded. This processing option is supported for cubes, databases, dimensions, measure

groups, mining models, mining structures, and partitions.

Process Default: Detects the process state of an object, and performs processing necessary to deliver

unprocessed or partially processed objects to a fully processed state. This processing option is supported for

cubes, databases, dimensions, measure groups, mining models, mining structures, and partitions.
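How some of these options relate to each other can be sketched with a toy state model of a single partition. This is not the AMO API; the class, its attributes and its method names are made up to mirror the descriptions above.

```python
class Partition:
    """Toy model of a partition's processing state (not the AMO API)."""

    def __init__(self, source_rows):
        self.source_rows = source_rows
        self.data = None        # None means unprocessed
        self.indexed = False

    def process_data(self):
        self.data = list(self.source_rows)   # drop existing data and reload it
        self.indexed = False                  # no aggregations or indexes are built

    def process_index(self):
        if self.data is None:
            raise RuntimeError("Process Index on an unprocessed object is an error")
        self.indexed = True

    def process_full(self):
        self.process_data()
        self.process_index()

    def unprocess(self):
        self.data = None
        self.indexed = False

    def process_default(self):
        # Do only what is needed to reach a fully processed state.
        if self.data is None:
            self.process_data()
        if not self.indexed:
            self.process_index()

p = Partition([1, 2, 3])
p.process_data()
print(p.indexed)      # False: data is loaded but not indexed
p.process_default()
print(p.indexed)      # True: only the missing index step was run
```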

What is a cube?

The basic unit of storage and analysis in Analysis Services is the cube. A cube is a collection of data that’s

been aggregated to allow queries to return data quickly.

For example, a cube of order data might be aggregated by time period and by title, making the cube fast

when you ask questions concerning orders by week or orders by title.

What is AMO?

The full form of AMO is Analysis Management Objects. This is used to create or alter cubes from .NET code.

After creating the cube, if we added a new column to the OLTP table then how you add

this new attribute to the cube?

Just open the data source view, right-click, and choose the REFRESH option. Refreshing adds the new column to the table in the data source view, from where it can be added to the cube as a new attribute.

REAL TIME INTERVIEW QUESTIONS -

What is the size of the Cube in your last Project?

Answer to this question varies from project to project and mainly depends on how BIG is your database and

how COMPLEX the database design is. Generally for the database with a TRANSACTION TABLE of 50 crore

records, the cube size will be around 100GB. So, better go with 100GB as answer to this question.

What is size of the database in your last Project?

You can expect this question immediately after you answer 100GB to the last question. The database size will

be 600 to 800GB for which the cube will come to 100 GB. So go with 800GB for this question.

What is size of the fact(Transaction) table in your last Project?

This will be the next question if you answer 800GB as your database size. Here the interviewer is not expecting SIZE in GBs but the NUMBER OF ROWS in the Transaction table. Go with 57 crore

records for this question.

How frequently you process the cube?

You have to be very careful here. Frequency of processing cube depends on HOW FREQUENTLY YOU ARE

GETTING NEW DATA. Once the new data comes then SSIS team loads it and send a mail to SSAS team after

load is completed successfully. Once SSAS team receives the mail then these guys will look for best time to

PROCESS.

Typically we get data either Weekly or Monthly. So you can say that the processing of the cube will be done

either Weekly or monthly.

How frequently you get DATA from clients?

This answer should be based on your last answer. IF you answered WEEKLY to last question then the Answer

to this question also should be WEEKLY. IF MONTHLY for last question then this answer also should be

MONTHLY.

What type of Processing Options you used to process the cube in your Project?

This is the toughest question to answer. This depends on DATA you have and CLIENTS requirements. Let me

explain here.

1. If the database is SMALL, let’s say it has only 1 crore records, then people do FULL PROCESS as it won’t take much time.

2. If the database is MEDIUM, let’s say it has only 15 crore records then people prefer to do

INCREMENTAL PROCESS unless CLIENTS ask us to do FULL PROCESS as it takes little bit of time.

3. If the database is HUGE, let’s say it has more than 35 to 40 crore records then people prefer to do

INCREMENTAL PROCESS unless CLIENTS ask us to do FULL PROCESS as it takes lot of time. In this

case we TRY to convince clients for INCREMENTAL and if they don’t agree then we don’t have any

other option.

4. Incremental process will come into picture ONLY when there are no updates to the OLD data, i.e. no changes to already existing data; otherwise there is NO OTHER OPTION than FULL PROCESS.
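These rules of thumb can be written down as a small decision function. This is only a sketch: the text does not give an exact cut-off between SMALL and MEDIUM, so the 1 crore boundary used here is an assumption, as are the function and parameter names.

```python
CRORE = 10_000_000   # 1 crore = 10 million rows

def choose_processing(row_count, old_data_changed, client_requires_full=False):
    """Pick a processing option following the rules of thumb listed above."""
    if old_data_changed or client_requires_full:
        return "Process Full"          # no other option when existing data changed
    if row_count <= 1 * CRORE:
        return "Process Full"          # small database: full processing is cheap
    return "Process Incremental"

print(choose_processing(1 * CRORE, old_data_changed=False))    # Process Full
print(choose_processing(15 * CRORE, old_data_changed=False))   # Process Incremental
print(choose_processing(40 * CRORE, old_data_changed=True))    # Process Full
```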

How you provide security to cube?

By defining roles we provide security to cubes. Using roles we can restrict users from accessing

restricted data. Procedure as follows -

1. Define Role

2. Set Permission

3. Add appropriate Users to the role
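The three steps can be sketched as follows. This is a toy model, not the SSAS role API: the class, the cube names and the user names are all hypothetical.

```python
class Role:
    def __init__(self, name):
        self.name = name
        self.members = set()
        self.readable_cubes = set()

    def grant_read(self, cube):
        self.readable_cubes.add(cube)

    def add_member(self, user):
        self.members.add(user)

def can_read(roles, user, cube):
    """A user can read a cube if any role they belong to grants read access to it."""
    return any(user in role.members and cube in role.readable_cubes for role in roles)

analysts = Role("Analysts")             # 1. define the role
analysts.grant_read("Sales")            # 2. set permissions
analysts.add_member("DOMAIN\\alice")    # 3. add users to the role

print(can_read([analysts], "DOMAIN\\alice", "Sales"))     # True
print(can_read([analysts], "DOMAIN\\alice", "Finance"))   # False
print(can_read([analysts], "DOMAIN\\bob", "Sales"))       # False
```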

How you move the cube from one server to another?

There are many ways to do the same. Let me explain four here and cleverly you can say “I worked on 4 SSAS

projects till date and implemented different types in all the four.”

1. Backup and restore – This is the simplest way. Take the Backup from development server and copy

the backup to FTP folder of clients. After doing this drop a mail to Client’s Admin and he will take

care of RESTORE part.

2. Directly PROCESS the cube in PRODUCTION environment. For this you need access to Production

which will not be given by clients unless the clients are *********. One of the clients I worked for gave FULL access to me.

3. Under Start –> All Programs –> SQL Server –> Analysis Services you can see the Deployment Wizard.

This is one way of moving the cube. This method has some steps to follow. First deploy your cube and

FOUR files will be created in BIN folder of PROJECT folder. Copy those FOUR files and paste in

Production server in any directory. Then OPEN this DEPLOYMENT Wizard in production and when it asks for the Database file, point to the location where you copied the files. After that, click NEXT, NEXT, NEXT … OK, and the cube will be deployed and processed.

4. Synchronization. This way is the most beautiful one. In this we will first deploy and process the cube in

STAGING ENVIRONMENT and then we will go to production server. Connect to Analysis services in

SSMS and select Synchronize by right clicking on Databases folder in SSMS of analysis services.

Then select source as STAGING SERVER and then click on OK. The changes in the cube present in

the Staging server will be copied to the production server.

What is the toughest challenge you face in your Project?

There are a couple of areas where we face difficulty.

1. While working on RELATIONSHIPS between Measure Groups and Dimensions.

2. Working on Complex calculations

3. Performance tuning

How you created Partitions of the cube in your Last Project?

Partitions can be created on different data. A few people do it PRODUCT NAME wise and many prefer to do it DATE wise. You go with DATE wise.

With dates, we can create partitions MONTH wise, WEEK wise, QUARTER wise and sometimes YEAR wise. This all depends on how much data you are getting per WEEK or MONTH or QUARTER … If you are getting 50 lakh records per month then say you do it MONTH wise.

How many dimensions in your last cube?

47 to 50.

How many measure groups in your last cube?

Total 10 and in that 4 are Fact tables and remaining 6 are Fact less fact tables.

What is the Schema of your last cube?

Snowflake

Why not STAR Schema ?

My data base design doesn’t support STAR Schema.

What are the different relationships that you are used in your cube?

1. Regular

2. Referenced

3. Many to Many

4. Fact

5. No Relationship

Have you created the KPI’s , If then Explain?

Don’t add much to this as the questions in this will be tricky. Just tell that you worked on couple of KPI and

you have basic knowledge on this. (Don’t worry, this is not MANDATORY)

How you define Aggregations in your Project?

We defined the aggregations for MOST FREQUENTLY USED data in SSRS reports.

Size of SSAS team in your last Project?

Just 2 guys as we guys are really in demand and lot of scarcity:)

How many Resources worked on same Cube in your Project?

Only 2 and one in morning shift and another in Evening shift.

How much time it take to Process the Cube?

This is Very very important question. This again depends on the SIZE of database,Complexity of the database

and your server settings. For database with 50 cr transaction records, it generally takes 3.5 hrs.

How many Calculation you done in Your Project?

I answer more than 5000 and if you tell the same then you are caught unless you are super good in MDX.

Best answer for you is “Worked on 50 calculations”.



MDX 101: INTRODUCTION TO MDX JANUARY 14, 2013

About MDX

Multidimensional Expressions (MDX) is an extremely powerful tool that can allow you to fully realize the potential of your multidimensional database/cube. In this article, I’ll be covering the basics of MDX and will hopefully provide you with a solid understanding of MDX query syntax, why MDX is so powerful, and how you can use MDX to add serious value to your B.I. solution.

According to MSDN, the purpose of MDX is to make accessing data from multiple dimensions easier. And that’s exactly what MDX does, which makes it ideal for viewing data broken down by multiple categories and aggregated. You may be asking yourself, “Why not just write a SQL Stored Procedure and query the data warehouse? Anything I can get from the cube I can get from the data warehouse!” While this is true, imagine you have been given the requirements for a report, which should include Actual Sales, Sales Goals, and Prior Period Sales broken down by Sales Region, Stores, and Month grouped by Year. A stored procedure to pull the required data from the data warehouse would be large and messy, and more than likely it would execute very slowly simply because a data warehouse is not optimal for reporting on aggregated data.

Because a cube stores your data in an aggregated format, MDX can allow you to view that aggregated data quickly and efficiently across multiple dimensions (such as Region, Stores, and Month, just to name a few) in only a few lines of code. In short, MDX can allow you to build otherwise extremely complicated queries and reports with just a fraction of the code. Not only will the queries and reports be easier to build, they will usually execute much faster than if you had queried the same data with a SQL statement. With that said, let’s go over some of the terms we’ll need to understand before we can begin writing our first MDX query.




http://msdn.microsoft.com/en-us/library/aa216772(v=SQL.80).aspx

Basic Terms:

Cube: A cube is a collection of measures (facts) and dimensions, which are based on tables and views, usually from a data warehouse. Conceptually, the cube holds the aggregation of every measure for every combination of dimension members. For example, Sales may be a measure in a cube. At any time you may want to view Sales broken down by Product. The aggregated Sales for each product are stored within the cube. This is why MDX queries against the cube are so much faster than querying the data warehouse. In the image below, you can see a graphical representation of a cube with multiple axes featuring Time, Product, and Store dimensions. Also, be aware that an Analysis Services cube is not a cube geometrically speaking. The term cube is an accepted industry term for a multidimensional database.

Dimension: A dimension organizes data in relation to a certain interest. An example of a dimension in the Adventure Works cube is the Product dimension. A dimension can be based on a table, a view, or a select statement bringing together data from multiple tables and/or views.

Attribute: An attribute can be thought of as a qualitative way to describe a dimension. For example, some of the attributes of the Product dimension in the Adventure Works cube are Category, Model Name, Product, and Style, just to name a few.

Measures (Facts): A measure, or fact, contains quantitative data that can be aggregated. Some of the measures in the Reseller Sales measure group in the Adventure Works cube are Reseller Tax Amount, Reseller Freight Amount, and Reseller Order Quantity. These measures can be aggregated across dimensions in our cube. For instance, it may be valuable to the end user to see Reseller Order Quantities broken down by product, region, time period, or all of the above. Measures can be viewed this way.

Hierarchy: A hierarchy is a hierarchical structure of dimension attributes. Within a hierarchy are levels and within a level

are members, which you’ll learn about in a moment. In the image below, we have an illustration of a hierarchy. You can

clearly see how the levels make up a hierarchy and the members that are part of those levels.

Level: As mentioned above, a hierarchy is made up of levels. In the Adventure Works cube, the Fiscal Year hierarchy is made up of five levels: the top level is Year, the second level is Semester, the third level is Quarter, the fourth level is Month, and the final level is the Date level. When viewing aggregated measures across a hierarchy, we can see the measure aggregated up to each member in each level.

Member: A member is a value in a level. An example of a member of the Month level of the Fiscal Year hierarchy in the Adventure Works cube would be March.

Axis: An axis can be thought of as a line on which a dimension rests. This line and the dimension intersect our cube and measures. An MDX query can have up to 128 axes. Only 5 are aliased:

Columns (axis 0)

Rows (axis 1)

Pages (axis 2)

Sections (axis 3)

Chapters (axis 4)

Most MDX queries will only contain 2 axes since most reporting is done in a two dimensional tabular format. There

aren’t many vendors of multidimensional database software that support views of more than two dimensions at a time.

Tuple: A tuple is an ordered collection of members, each taken from a different hierarchy, that together identify a cell (or a slice of cells) in a cube. In the example below, I have a tuple that features the Fiscal Year attribute of the Date dimension along with the Sales Reason attribute of the Sales Reason dimension. Notice that the tuple is contained in parentheses.

--A tuple
([Date].[Fiscal Year].&[2006], [Sales Reason].[Sales Reason].&[9])

Set: A set is a collection of tuples. In the example of a set (seen below), I have two tuples within the set. The dimensionality of tuples within a set must be the same or your query will throw an error. In other words, every tuple in a set must reference the same hierarchies in the same order. A set must be wrapped in curly braces and each tuple in the set should have parentheses around it.

--A collection of tuples
{
    ([Date].[Fiscal Year].&[2006], [Sales Reason].[Sales Reason].&[9]),
    ([Date].[Fiscal Year].&[2006], [Sales Reason].[Sales Reason].&[5])
}
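To see where a set is actually used, here is a minimal sketch that places the two-tuple set on the Rows axis of a query. It assumes the Adventure Works sample cube; the Internet Order Quantity measure is chosen only for illustration, and the member keys are copied from the set example.

```mdx
-- Sketch: the set of two tuples on ROWS, one measure on COLUMNS.
-- Assumes the Adventure Works sample cube; adjust member keys to your data.
SELECT
    { [Measures].[Internet Order Quantity] } ON COLUMNS,
    {
        ([Date].[Fiscal Year].&[2006], [Sales Reason].[Sales Reason].&[9]),
        ([Date].[Fiscal Year].&[2006], [Sales Reason].[Sales Reason].&[5])
    } ON ROWS
FROM [Adventure Works]
;
```

Each tuple in the set becomes one row of the result. Because both tuples list Fiscal Year first and Sales Reason second, the set has consistent dimensionality.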

MDX Query Syntax

Here we have a basic MDX query. Let’s take a look at each piece of this query so we can understand the necessary parts of a basic MDX statement. When you first look at this query, you may think that it appears awfully similar to T-SQL, but that would be where the similarities end. MDX query statements function very differently compared to T-SQL, and it is important to remember that when you are learning to write MDX.
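The query this section walks through is not reproduced in the text, so the following is a reconstruction from the description that follows, not the original. It puts the Fiscal Year attribute on rows and the Internet Order Quantity measure on columns, with a slicer on the Product Category member whose key is 1. It assumes the Adventure Works sample cube.

```mdx
-- Reconstructed sketch of the basic query described in this section.
SELECT
    { [Measures].[Internet Order Quantity] } ON COLUMNS,   -- measure on the Columns axis
    [Date].[Fiscal Year].[Fiscal Year].MEMBERS ON ROWS     -- Fiscal Year attribute on the Rows axis
FROM [Adventure Works]                                     -- the cube to query
WHERE [Product].[Category].&[1]                            -- slicer: Category key 1 (Bikes)
;
```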

The first thing your MDX query will need is a SELECT statement. The SELECT statement in an MDX query is very different from a SELECT statement in a SQL query. In T-SQL, the SELECT statement can only define the column layout of your query results. In an MDX query, the SELECT statement defines the layout of multiple dimensions along up to 128 different axes. In our example query, I have specified that the Fiscal Year attribute of the Date dimension be displayed along the Rows axis and the Internet Order Quantity measure be displayed along the Columns axis.

The second thing your MDX query will need is a FROM statement. The FROM statement specifies the context of your query, or which cube you wish to query. In T-SQL, you can use the FROM statement to join multiple tables from multiple databases, but in MDX there is no joining of cubes together, so don’t even think about it.

The last piece of this MDX query is the WHERE statement. The WHERE statement limits the results of our MDX query by specifying what is known as a slicer dimension. You can see in this example that I am restricting the results of the MDX query to only the member with key 1 (which is actually the Bikes category) of the Category attribute of the Product dimension.

Basic Queries To Get Us Started

So now that we’ve learned a little bit about MDX, have covered some of the necessary terminology, and understand the MDX query syntax, let’s start writing some MDX.

Important Note: All of the examples in this article are for use with the Adventure Works 2008 Analysis Services database, which can be downloaded from here. Once the database is downloaded and installed, open the Analysis Services project (located by default in C:\Program Files\Microsoft SQL Server\100\Tools\Samples) in BIDS. Make sure you change the data source connection to the instance of SQL Server where you installed the sample database. Then deploy the project. Open SQL Server Management Studio. When the Connect to Server dialog window pops up, make sure to select Analysis Services in the drop-down list next to Server type. Specify the server where the Adventure Works AS database is located and click Connect. Navigate to the Analysis Services database called AdventureWorksCube. Right-click, select New Query, and select MDX.

Now that the MDX query editor window is open, we can start writing some MDX. The first query is very simple and its

purpose is to show us the default measure of the Adventure Works cube.

SELECT
FROM [ADVENTURE WORKS]
;

Results:

If you execute the query seen above, you’ll see a single number returned. While it would appear that we have not defined a measure or dimension in our query, the number being returned by the query is actually the value of the default measure specified in the Adventure Works cube, which is the Reseller Sales Amount. Since no dimension has been specified, this query returns the sum total of the Reseller Sales Amount across all products, all time, all regions, etc. So unless you explicitly specify in your query which measure you want to see, the measure that is going to be returned is the default measure Reseller Sales Amount. One way we could explicitly request a different measure is to use the WHERE statement and specify a slicer.

SELECT
FROM [ADVENTURE WORKS]
WHERE [Measures].[Reseller Total Product Cost]
;

Results:

Because we have used the WHERE statement to limit the scope of the query to the Reseller Total Product Cost, this query will return the sum total of the Reseller Total Product Cost, instead of the Reseller Sales Amount, across all dimensions. Now let’s specify some dimensions by which to slice and dice our cube.

SELECT [Date].[Calendar].[Calendar Year] ON COLUMNS,
[Product].[Product Categories].[Category] ON ROWS
FROM [ADVENTURE WORKS]
;

Results:

Now that we have specified a dimension on both the Columns axis and the Rows axis, we are starting to realize the power of the cube. Compare the above MDX query to a SQL query that would return similar results and you will see one of the reasons why MDX (and Analysis Services) is so powerful. Can you guess what measure is being displayed in our query results? If you said the Reseller Sales Amount, that would be right. Once again, because we have not explicitly specified which measure to bring back, the default measure Reseller Sales Amount will be returned. For our last query, let’s bring it all together and specify a dimension and attribute on each axis and a slicer dimension:

SELECT [Date].[Calendar].[Calendar Year] ON COLUMNS,
[Product].[Product Categories].[Category] ON ROWS
FROM [ADVENTURE WORKS]
WHERE [Measures].[Reseller Total Product Cost]
;


Results:

The above query is exactly the same as the previous query except now we have explicitly limited the scope of the query to the Reseller Total Product Cost.
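The WHERE slicer is one way to pick a measure. Another common pattern is to put one or more measures directly on an axis so that several can be compared side by side. The sketch below assumes the same Adventure Works cube.

```mdx
-- Sketch: two measures on COLUMNS instead of a single measure in the slicer.
SELECT
    { [Measures].[Reseller Sales Amount],
      [Measures].[Reseller Total Product Cost] } ON COLUMNS,
    [Product].[Product Categories].[Category].MEMBERS ON ROWS
FROM [Adventure Works]
;
```

Because the measures occupy the Columns axis here, no WHERE statement is needed to choose a measure.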



SQL SERVER ANALYSIS SERVICES (SSAS)

JANUARY 14, 2013

OVERVIEW

SQL Server Analysis Services (SSAS) is the technology from the Microsoft Business Intelligence stack, to

develop Online Analytical Processing (OLAP) solutions. In simple terms, you can use SSAS to create cubes

using data from data marts / data warehouse for deeper and faster data analysis.

Cubes are multi-dimensional data sources which have dimensions and facts (also known as measures) as their basic constituents. From a relational perspective, dimensions can be thought of as master tables and facts can be thought of as measurable details. These details are generally stored in a pre-aggregated proprietary format, and users can analyze huge amounts of data and slice this data by dimensions very easily. Multi-dimensional expressions (MDX) is the query language used to query a cube, similar to the way T-SQL is used to query a table in SQL Server.

Simple examples of dimensions can be product / geography / time / customer, and similar simple examples of facts can be orders / sales. A typical analysis could be to analyze sales in the Asia-Pacific geography during the past 5 years. You can think of this data as a pivot table where geography is the column axis and years is the row axis, and sales can be seen as the values. Geography can also have its own hierarchy like Country->State->City. Time can also have its own hierarchy like Year->Semester->Quarter. Sales could then be analyzed using any of these hierarchies for effective data analysis.
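As a rough illustration of the pivot-table idea, here is how such a view might be expressed in MDX. The cube, hierarchy and measure names are assumptions taken from the Adventure Works sample cube, not from the cube built later in this tutorial.

```mdx
-- Sketch: geography (countries) across the columns, years down the rows,
-- with sales as the cell values. Names assume the Adventure Works sample cube.
SELECT
    [Sales Territory].[Sales Territory].[Country].MEMBERS ON COLUMNS,
    [Date].[Calendar].[Calendar Year].MEMBERS ON ROWS
FROM [Adventure Works]
WHERE [Measures].[Reseller Sales Amount]
;
```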

A typical higher level cube development process using SSAS involves the following steps:

1) Reading data from a dimensional model

2) Configuring a schema in BIDS (Business Intelligence Development Studio)

3) Creating dimensions, measures and cubes from this schema

4) Fine tuning the cube as per the requirements

5) Deploying the cube

In this tutorial we will step through a number of topics that you need to understand in order to successfully

create a basic cube. Our high level outline is as follows:



Design and develop a star-schema

Create dimensions, hierarchies, and cubes

Process and deploy a cube

Develop calculated measures and named sets using MDX

Browse the cube data using Excel as the client tool
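To give a feel for the fourth item in the outline, here is a minimal sketch of a query-scoped calculated measure and named set. It is written against the Adventure Works sample cube rather than the cube built in this tutorial, and the names [Sketch Reseller Margin] and [Sketch Categories] are hypothetical, made up for this example.

```mdx
WITH
    -- Calculated measure: sales minus cost (hypothetical name).
    MEMBER [Measures].[Sketch Reseller Margin] AS
        [Measures].[Reseller Sales Amount] - [Measures].[Reseller Total Product Cost]
    -- Named set: all product categories (hypothetical name).
    SET [Sketch Categories] AS
        [Product].[Product Categories].[Category].MEMBERS
SELECT
    { [Measures].[Reseller Sales Amount],
      [Measures].[Sketch Reseller Margin] } ON COLUMNS,
    [Sketch Categories] ON ROWS
FROM [Adventure Works]
;
```

Definitions in a WITH clause exist only for the duration of the query. Calculations defined in the cube itself are available to every client that connects.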

When you start learning SSAS, you should have a reasonable relational database background. But when you start working in a multi-dimensional environment, you need to stop thinking from a two-dimensional (relational database) perspective; this way of thinking develops over time.

In this tutorial, we will also try to develop an understanding of OLAP development from the eyes of an OLTP

practitioner.

Creating a Sample SSAS Project and Cube

Data in Online Transaction Processing (OLTP) systems is suited to support convenient data storage for user-

facing applications. The data model in such systems is highly normalized. For data warehousing

environments, data is required to be in a schema that supports a dimensional model. Data is therefore

transformed from the OLTP storage systems to a data warehouse using ETL, so that data can be aligned in a

suitable format to create data marts from the data warehouse.

Two major approaches driving the design of a data warehouse and data marts come from Ralph Kimball and Bill Inmon, and both are widely practiced in real-world environments. Generally data is gathered from OLTP systems and brought to the data warehouse. From the data warehouse, context / requirement specific data marts are created, which can be perceived as a subset of the data warehouse. Cubes source data from these data marts, and client applications connect to the cube. The schema for a cube falls into two categories: Star and Snowflake. In simple terms, Star Schema can be considered a more denormalized form of schema compared to Snowflake.

Designing and developing a data warehouse is out of scope for this tutorial. For the purpose of development, we will install and use the AdventureWorks DW database. We will then create an SSAS project and create a data source which will connect to this database. Finally we will create a star schema using a Data Source View.

Installing AdventureWorks Sample Database

OVERVIEW

AdventureWorks is the sample database available from Microsoft for different purposes as well as different

SQL Server versions. We need to use the AdventureWorks DW 2008 R2 database for our cube design and

development. This database contains dimension and fact tables with prepopulated data. We can use this

database as a launchpad to start our SSAS project. Developing a data mart is out of the scope of this tutorial,

so we will use this sample database.

EXPLANATION

To install the AdventureWorks database, navigate to the CodePlex site (http://msftdbprodsamples.codeplex.com/) and download the MSI for the version of SQL Server you are

using. This tutorial expects that the reader is using SQL Server 2008 R2, and all the exercises will be using

this version of SQL Server.

After downloading, start the installer and you should get a screen similar to the one below.


AdventureWorks Data Warehouse 2008R2 is the database we need for our exercises. Point the installer to the SQL Server instance that you are using, and install the database. After the database is installed, open SQL Server Management Studio to verify the databases that were installed. You should find something similar to the below screenshot.

Expand the database highlighted above and check out the different Dim and Fact tables in this database. The tables having the prefix Dim are suited to be used as Dimension tables, and tables having the prefix Fact are suited to be used as Fact tables.

Creating a SSAS Project

OVERVIEW

To start development, we need to create a new SSAS project using Business Intelligence Development Studio.

After creating the new project, we need to create a data source that points to the AdventureWorks DW 2008

R2 database.

EXPLANATION

Open Business Intelligence Development Studio (BIDS). Create a new SSAS Project, by selecting New Project

from the File menu. Name this project “MyOLAPProject”. As soon as the new project opens up, you should

find a list of folders in the explorer tab. Right-click on the data sources folder and select New DataSource. A

Data Source wizard will open with a Welcome screen, select Next and you should find a screen to define your

connection. We need to define a new connection, so select “New” and a screen should appear as shown

below. Point the connection to the AdventureWorksDW2008R2 database and click OK.

After this, you need to specify the impersonation information for the data source. This determines which credentials the SSAS instance uses to connect to the data source. Every time you deploy or process the solution, these credentials will be used, so keep in mind that the account you use should have sufficient privileges on the source database. If you are not sure which account to use, it is suggested that you use an account with administrator privileges on your development machine. Please keep in mind that this is not recommended and should not be done in production environments. This is just suggested to quickly get you started with cube design and development.

After specifying this information, click “Next”. This should take you to the final screen where you need to

name the data source. Name it something appropriate and click OK, which should create your data source.

Creating a Star Schema Using a Data Source View

OVERVIEW

A data warehouse or data mart from where we would source our data could contain tens to hundreds of

tables. Also one would not have the liberty to change the schema of these tables to suit the requirements of

the cube design. The Data Source View is an insulation layer between the actual data source and the

solution. We can create and modify the schema we need in this layer and this is used as the data source for

the different objects we create in the solution. A Star Schema is a schema structure where different

dimension tables are directly connected to the fact table. If you imagine a fact table in the center and

different dimensions attached to it, you would find the figure similar to a star and hence the name star

schema. It’s the simplest form of the schema and hence we will use this in our exercise.

EXPLANATION

Right-click on the Data Source View and select New Data Source View and a wizard should pop-up with a

Welcome screen. Select “Next”, and the next screen should prompt you to select a relational data source.

Select the data source we just created and click “Next”, the next screen should prompt you to select tables

that we intend to use in our solution. Select the tables as shown in the below screenshot. The below fact and dimension tables are chosen as they are interlinked with each other and also suit the requirements of the exercises to follow.

Select “Next”, name the DSV to something appropriate and this should finally create your Data Source View.

After arranging the tables in the DSV, your schema should look similar to the below screenshot.

In the above figure, you can see that both the fact tables are related to all three dimensions in the same

manner. This is a typical case of a star schema. You can also browse the data, create calculated fields, assign

primary keys and carry out other similar function in this designer to modify the schema without modifying

the actual schema in the database.

Designing a Cube

Using BIDS, after the DSV is developed, the next step is to create dimensions. Dimensions are of two types: database dimensions and cube dimensions. Database dimensions can be perceived as a master template, and cube dimensions can be perceived as instances / children of this master template.

We will start our development with the creation of database dimensions. If you consider a dimension as a table, all the fields in this table can be perceived as attributes. A hierarchy in a dimension is a group of attributes logically related to each other with a defined cardinality. Finally we will create a cube from the dimensions we just developed and the fact tables; the wizard turns the former into cube dimensions and the latter into measure groups.

Creating a Dimension

Dimensions are of two types: database dimension and cube dimension. The dimensions that are defined at

the solution level can be termed as a database dimension and the ones defined inside the cube are termed as

a cube dimension. The Dimension Wizard is the primary means of creating a dimension. We will create one dimension from each of the three dimension tables which we have included in our schema.

EXPLANATION

Right-click the Dimensions folder and select “New Dimension”, this will invoke the Dimension Wizard. The

first screen should look like the below screenshot. You have the options of using an existing table, creating a

table in the data source and using a template. We already have the dimension table in our schema and we

will use this, so select “Use an existing table” and click “Next”.

Select the DSV we created earlier in the DSV selection. We intend to create a dimension from the DimSalesTerritory table, so select that table. Every dimension table needs to have a key attribute, and in this table SalesTerritoryKey is the primary key column, which is guaranteed to identify each record uniquely. It would not make sense to browse this attribute by its key values; the SalesTerritoryRegion field also has unique values, so we could use that field as both the key and the name column. But for the purpose of our exercise, we will use the SalesTerritoryKey field as the key column and SalesTerritoryRegion as the name column. Using the key field may look unnecessary here, but when you are starting to develop an understanding of dimensions, it helps to set a rule in your mind: a key field is always required (mostly a surrogate key), and you can set the name column to any field to facilitate convenient browsing.

In the next screen, you need to select the attributes that will be present in the dimension. If you uncheck the “Enable Browsing” checkbox for an attribute, it won’t be visible to client applications when they browse the dimension. Attributes can be of different types and you can specify the type in the Attribute Type field. The Dimension Wizard leaves the column you set as the name column out of this list, as it is already available through the key attribute. So you won’t find that field in the list of available attributes.

Now the next step is to give a name to the dimension, name it “Cube Dim Sales Territory” or anything

appropriate. After this step you have completed creating your first dimension.

In a similar manner create Product and Date dimension using the Dimension Wizard.

Creating a Hierarchy

A Hierarchy is a set of logically related attributes with a fixed cardinality. While browsing the data, a

hierarchy exposes the top level attribute which can be broken down into lower level attributes. For example,

Year -> Semester -> Quarter -> Month is a hierarchy. While analyzing the data, it might be required to drill

down from a higher level to a detail level, and exposing data as a hierarchy is one of the best solutions for

this.
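From a client’s point of view, drilling down a hierarchy looks something like the following MDX sketch. It assumes the Calendar hierarchy of the Date dimension in the Adventure Works sample cube, which has the same Year, Semester, Quarter, Month shape (plus a Date level); the hierarchy built in this tutorial will have different names.

```mdx
-- Sketch: each calendar year followed by its semesters (one level of drill-down).
SELECT
    { [Measures].[Reseller Sales Amount] } ON COLUMNS,
    DrilldownLevel(
        { [Date].[Calendar].[Calendar Year].MEMBERS }
    ) ON ROWS
FROM [Adventure Works]
;
```

Client tools such as Excel issue queries of this kind when a user expands a year to see the levels beneath it.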

EXPLANATION

Creating a hierarchy is as easy as dragging and dropping attributes in the hierarchy pane of the dimension

editor. We want to create a hierarchy in the Sales Territory dimension. Open Sales Territory dimension in

the dimension editor, drag and drop attributes in the hierarchy pane, click on each of them and rename

them to something appropriate. After completing this, your hierarchy should look similar to the below

screenshot.

You will find a warning icon on the hierarchy pane, which says that attribute relationships are missing

between these attributes. Country has a one-to-many relationship with Region, and Group has a one-to-

many relationship with Country. But these relationships need to be defined explicitly in the dimension. Click

on Attribute Relationships tab, right-click the region attribute and select “New Attribute Relationship”. Set

the values as shown in the below screenshot to correct the relationships between these attributes.

After you have applied the above changes, your attribute relationship tab should look like the below

screenshot.

If you have observed carefully, there are two relationship types: Rigid and Flexible. This has an effect on

the processing of the cube. Rigid means that you do not expect the relationship to change and

Flexible means that relationship values can change. In our dataset, Group is a logical way to categorize

countries and it can change, while regions within country have limited or no change. So the relationship

type between country and group should be flexible and relationship type between region (sales territory key)

and country should be rigid. Double click on the arrow joining Key attribute and Country, and change the

relationship type as shown below.

Check out the Hierarchy pane, and you should find that the warning icon is no longer visible. You can

change the name of the hierarchy to something appropriate. In the interest of beginners who might get

confused with the distinction between attributes and hierarchy, we will keep the name as “Hierarchy”.

Edit the Date dimension, and create a Year – Semester – Quarter – Month hierarchy in the date dimension.

Creating a Cube using the Cube Wizard

A Cube acts as an OLAP database to the subscribers who need to query data from an OLAP data store. A

Cube is the main object of an SSAS solution where the majority of fine tuning, calculations, aggregation

design, storage design, defining relationship and a lot of other configurations are developed. We will create a

cube using our dimension and fact tables.

EXPLANATION

Right-click the Cube folder and select “New Cube”, and it will invoke the Cube Wizard. In the first screen

you need to select one of the methods of creating a Cube. We already have our dimensions ready, and

schema is already designed to contain dimension and fact tables. So we will select the option of “Use existing

tables”.

In the next screen, we need to select the tables which will be used to create measure groups. We already

have a DSV which has fact tables in the schema. So we will use this as shown in the below screenshot.

In the next screen, we need to select the measures that we want to create from the fact tables we just

selected in the previous screen. For now, select all the fields as shown below and move to the next screen.

In this screen you need to select any existing dimensions. We have created three dimensions and we will

include all of these dimensions as shown below.

In the next screen, we can select if we want to create any additional new dimensions from the tables

available in the DSV. We do not want to create any more dimensions, so unselect any selected tables as

shown below and move to the next screen.

Finally you need to name your cube, which is the last step of the wizard before your cube is created. Name it

something appropriate like “Sales Cube” as shown below.

Now your cube should have been created and if your cube editor is open you should find different tabs to

configure and design various features and aspects of the cube. If you look carefully in the below screenshot,

you will find FactInternetSales and FactResellerSales measure groups. Also you will find Sales Territory and

Product dimension, but Date dimension is missing. Both fact tables have multiple fields referencing the

DateKey from the Date dimension. BIDS intelligently creates three dimensions from the Date dimension and

names each one after the fact table field that references the Date dimension. So you will find three instances of the Date dimension: Ship Date, Due Date and Order Date. These are known as role-playing dimensions.
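Role-playing dimensions show up in queries as separate dimensions with the same attributes. As a sketch, the Adventure Works sample cube (an assumption here; the cube built in this tutorial uses Order Date, Ship Date and Due Date) exposes the Date dimension in the roles Date, Ship Date and Delivery Date, so the same measure can be grouped by ship year instead of order year:

```mdx
-- Sketch: Internet sales grouped by the year in which orders shipped.
-- [Ship Date] is the Date dimension playing the "ship date" role.
SELECT
    { [Measures].[Internet Sales Amount] } ON COLUMNS,
    [Ship Date].[Calendar Year].[Calendar Year].MEMBERS ON ROWS
FROM [Adventure Works]
;
```

Swapping [Ship Date] for another role changes which date column of the fact table is used for grouping, without any change to the measure.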

Processing and Deploying a Cube

OVERVIEW

Once the cube design and development is complete, the next step is to deploy the cube. When the cube is

deployed, a database for the solution is created in the SSAS instance, if not already present. Each of the

dimensions and measure group definitions are read, and data is calculated and stored as per the design and

configuration of these objects. Once the cube is successfully deployed, client applications can connect to the

cube and browse the cube data. We will deploy the cube we have developed and test connecting to the cube.

We might also face errors during deployment, and we will attempt debugging and resolving these errors.

Debugging Deployment Errors

In a development environment, you will typically come across errors during deployment and processing of

the cube. Debugging errors is an essential part of the cube development life cycle. We will configure the

deployment properties and we should face some errors during the deployment. We will then analyze and

resolve these errors.

EXPLANATION

Right-click the solution and select Properties, this would bring up a pop-up window. Select the deployment

tab and it will bring up the deployment properties. Mention the SSAS server name and the database name

that was created for your solution in the SSAS instance. Since SSAS is installed on my local / development

machine, I have chosen server as “localhost” and name of the database as “Sales”. We will keep the rest of

the options as default for now.

Right-click the solution and select “Deploy”, this will start deploying the solution. If you have not specified

an appropriate account in the impersonation information, your deployment might fail as the account might

not have sufficient privileges.

If you have followed all the previous steps as explained, you should face errors as shown below. From the

error message you can make out that cube processing failed due to the Date dimension.

Right-click the Cube Dim Date dimension and select “Process”, and you would find the following error.

If you recall, we defined a hierarchy in the Date dimension, Year -> Semester -> Quarter -> Month,

and the attribute relationship expected at each level is one to many. If you browse the data, you will find that the same set of

semester values exists in every year, so a semester value on its own does not identify a single year. By default

the key column of the Semester attribute is Semester itself, which is not unique, so processing finds duplicate

Semester keys. We therefore need to make each attribute unique by changing its key columns.

Edit the Date dimension in the dimension editor, select the Semester attribute and edit the Key Columns

property. This should bring up a pop-up window as shown below. To make the Semester attribute unique, we

need to make its key a composite of Year + Semester. So select the key columns as

shown below.

When you select multiple columns in the key column, the name column property becomes blank and it’s a

mandatory property. So select this property and set it again to Semester as we want to display semesters

when this is browsed.

This should solve the error we were facing on the date dimension. Duplicate keys are one of the most

common errors during dimension processing and we just learned how to resolve this issue.
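
The duplicate key problem can be reproduced outside SSAS. The Python sketch below uses a made-up sample of Date rows and hypothetical helper names; it only illustrates why Semester alone is ambiguous and why the composite key Year + Semester is not.

```python
from collections import defaultdict

# Hypothetical sample of Date dimension rows: (year, semester, quarter).
rows = [
    (2011, 1, 1), (2011, 1, 2), (2011, 2, 3), (2011, 2, 4),
    (2012, 1, 1), (2012, 1, 2), (2012, 2, 3), (2012, 2, 4),
]

def ambiguous_keys(rows, key_of, parent_of):
    """Return the keys that roll up to more than one parent member."""
    parents = defaultdict(set)
    for row in rows:
        parents[key_of(row)].add(parent_of(row))
    return sorted(key for key, found in parents.items() if len(found) > 1)

# Semester keyed on its own column: each value appears under two years.
semester_only = ambiguous_keys(rows, lambda r: r[1], lambda r: r[0])

# Semester keyed on (Year, Semester): every key belongs to exactly one year.
year_semester = ambiguous_keys(rows, lambda r: (r[0], r[1]), lambda r: r[0])
```

Running a check like this against the source table before processing is one way to spot attributes that need a composite key.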

Processing Dimensions and Cube

SSAS provides various cube processing methods and options to configure error logging as well as impact on

processing when errors are encountered. We will briefly look at these options, understand what processing of

the cube means, deploy our cube and try to access data from the cube.

EXPLANATION

Right-click on the dimension or cube and select “Process”, and this should bring up a similar screen with

processing options as shown in the below screenshot. Various processing options are visible in the dropdown.

Unprocess removes the data and aggregations that earlier processing stored for the object. Process Full also

drops them, but then loads the data and builds the aggregations again. More information about these options can be

found in MSDN BOL.

In the “Change Settings” and “Impact Analysis” options you will find more error configuration and other

options related to processing.

Deploy the cube and the cube should be deployed successfully. Go to the Browser pane after successful

deployment, and try to connect to the cube and browse data by dragging and dropping dimension attributes

and measures on the browsing area. Below is an example.

Calculated Measures and Named Sets

Fields from fact tables get converted into measures in measure groups in a cube. When measure groups are

created in a cube, one measure group is created per fact table. Often in production systems, developing

calculated measures is a regular requirement. Multidimensional Expressions (MDX) is the query language

for a cube and is to a cube what T-SQL is to SQL Server. Queries that are frequently used often

need to be available in a ready-made form in the cube, so that users do not need to develop them over and over

again. One of the solutions for this is named sets, which can be perceived as a query already defined in the

cube, similar to views in SQL Server. We will develop a calculated measure and a few named sets in this

section.

Developing a Calculated Measure

Measures created directly from the fields of a fact table are called base measures. But often we require

measures based on custom requirements, so we apply some logic and/or formula to these base measures and

create calculated measures. We will add two measures from two measure groups and create a calculated

measure.

EXPLANATION

Open the cube designer, and click on the Calculations tab. Click on “New Calculated Measure” from the

toolbar, and key in the values as shown in the below screenshot.

We have named this new calculated measure “TotalSales”. The “Parent hierarchy” specifies which parent

hierarchy the measure will be part of, and in this case it is “Measures”. It’s a built-in hierarchy and all

measures normally fall under this.

In the Expression, we can specify any MDX expression. Here we are adding Internet Sales Amount from

FactInternetSales and Reseller Sales Amount from FactResellerSales measure groups. You do not need to type

the values; you can just drag and drop them from the panes on the left-hand side of the window.

In the additional properties you can set additional options for this measure. Save your solution; in the next

section we will create named sets and then deploy these at the same time.
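
The logic of the TotalSales calculated measure can be sketched in Python. In MDX the expression would presumably read [Measures].[Internet Sales Amount] + [Measures].[Reseller Sales Amount]; the product names and figures below are made up, and treating a missing value as 0 is a simplifying assumption of this sketch.

```python
# Hypothetical base measures keyed by product; the figures are invented.
internet_sales_amount = {"Bike": 1200.0, "Helmet": 150.0}
reseller_sales_amount = {"Bike": 3000.0, "Gloves": 80.0}

def total_sales(product):
    """Add the two base measures for one product.

    A product missing from one measure group contributes 0 from that group
    (an assumption made for this sketch).
    """
    internet = internet_sales_amount.get(product, 0.0)
    reseller = reseller_sales_amount.get(product, 0.0)
    return internet + reseller

bike_total = total_sales("Bike")
```

The calculated measure stores no data of its own; it is evaluated from the base measures whenever it is requested.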

Developing Named Sets

Named sets return a dataset based on defined logic. They are primarily useful to create datasets that are

often requested from the cube. Named sets are of two types: Static and Dynamic. The difference between

these two is that static named sets are calculated when they are requested the first time in a session and

dynamic named sets are calculated each time a query references them. In this section we will look at how to

create dynamic named sets. Note that dynamic named sets were not introduced until SQL Server 2008.
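
The difference between the two types can be pictured with a loose Python analogy. This is not how SSAS implements named sets; the class names, sample rows and the "rows visible to the query" idea are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, year, sales amount).
SALES = [
    ("Bike", 2011, 500.0), ("Helmet", 2011, 900.0), ("Gloves", 2011, 100.0),
    ("Bike", 2012, 2000.0), ("Helmet", 2012, 200.0), ("Gloves", 2012, 300.0),
]

def top_products(rows, count):
    """Rank products by summed amount and keep the first `count`."""
    totals = defaultdict(float)
    for product, _year, amount in rows:
        totals[product] += amount
    return sorted(totals, key=lambda p: totals[p], reverse=True)[:count]

class StaticSet:
    """Evaluated once, on first request, then reused for later queries."""
    def __init__(self, count):
        self.count = count
        self._members = None

    def members(self, query_rows):
        if self._members is None:
            # Computed over all rows; the query's own filter is ignored.
            self._members = top_products(SALES, self.count)
        return self._members

class DynamicSet:
    """Re-evaluated for every query, using the rows that query can see."""
    def __init__(self, count):
        self.count = count

    def members(self, query_rows):
        return top_products(query_rows, self.count)

rows_2011 = [row for row in SALES if row[1] == 2011]
static_top = StaticSet(2).members(rows_2011)
dynamic_top = DynamicSet(2).members(rows_2011)
```

For a query restricted to 2011, the static set still returns the overall top two products, while the dynamic set returns the top two for 2011.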

EXPLANATION

Open the cube designer, and click on the Calculations tab. Click on “New Named Set” from the toolbar and

key in the values as shown in the below screenshots.

Here we are creating two named sets, Internet Sales Top 25 and Reseller Sales Top 25. In these named sets,

we are returning the Top 25 products based on Internet Sales and Reseller Sales. In this formula, the MDX function

TopCount returns the top 25 members of the set, ranked by the given measure.
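
What TopCount does can be approximated in a few lines of Python. This is a sketch of the idea only, with a hypothetical helper name and invented figures, and it uses a count of 2 instead of 25 to keep the sample small.

```python
import heapq

def top_count(members, count, value_of):
    """Return the `count` members with the largest values, highest first."""
    return heapq.nlargest(count, members, key=value_of)

# Hypothetical Internet Sales amounts per product.
internet_sales = {"Bike": 1200.0, "Helmet": 150.0, "Gloves": 80.0, "Jersey": 400.0}

top_two = top_count(internet_sales, 2, internet_sales.get)
```

The three arguments mirror the set, the count and the numeric expression that the MDX function takes.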

In the Type selection, we can select whether we want the named set to be static or dynamic. We have

selected Dynamic as we want to create a dynamic named set.

In the Display folder selection, we can specify where the named sets will appear. By default named

sets appear in the last dimension that is used in the formula. Here we have used an attribute hierarchy from

Product dimension, so the named sets should appear in the same dimension under the “Named Sets” folder.

Save and deploy the solution, and then re-connect to the cube in the “Browser” pane. You should be able to

see the calculated measure and named sets as shown in the below screenshot.

Browsing a Cube Using Excel

Once the cube is deployed and ready to serve queries, client applications can start

querying the cube. One of the most user friendly client tools for business users to query a cube is Microsoft

Excel. It has a built-in interface and components to support GUI based connection, querying and formatting

of data sourced from a cube. Business users can use the familiar interface of Excel and create ad-hoc pivot

table reports by querying the cube without any detailed knowledge about querying a multi-dimensional data

source. We will connect to the cube we just created using Excel and develop a very simple report using the

cube data.

Using Excel and Creating a Pivot Table Report

OVERVIEW

We will first create a connection to the cube we have developed in the previous exercises. After connecting

to the cube we will use the calculated measure and a named set to create a very basic pivot table report. For

the purpose of demonstration, Excel 2010 is used and is installed on the development machine, but you can

also use Excel 2007 to connect to the cube.

EXPLANATION

Open Microsoft Excel and select the “Data” tab from the menu ribbon. Click on “From Other Sources” and

select “From Analysis Services” option as shown in the below screenshot.

In the next step specify the SSAS server name and logon credentials. If you have everything on the local

machine, you can also use “localhost” as the server name.

If you were able to successfully connect to the specified SSAS instance with the logon credentials specified,

in the next step you should be able to select the SSAS “Sales” database and find the Sales Cube. Select the

Sales Cube and proceed to the next step.

In the next step, specify the name of the connection file to save. This file will be saved as an .ODC file and

you can reuse this connection file when you want to use the same connection in other workbooks.

After saving the file, you will be prompted with the option to select the kind of report you want to create. We

will go with the default option and select “PivotTable Report”.

After selecting “PivotTable Report”, a designer will open with options to select dimension, attributes and

measures to populate your pivot table. Select the values as shown in the below screenshot. Our intention is to

display the hierarchy we created in the Sales Territory dimension on the columns axis, Internet Sales Top 25

named set on the rows axis, and the Total Sales calculated measure in the values area.

After making the above selections, your report should look like the below screenshot. Using the features

available from the “Options” tab, you can format this report and give it a more professional look. You can try

drilling down the hierarchy, but note that this only works for hierarchies you have developed in the dimensions. Users who frequently

want to see sales of the top products can pick any of the named sets that we defined earlier. Instead of

having users define formulas for adding internet sales and reseller sales, users can just select Total Sales.

SQL Server Analysis Services Glossary

Following is a list of common terms when working with SQL Server Analysis Services.

Cube - Cube is a multi dimensional data structure composed of dimensions and measure groups. The

intersection of dimension and measure groups contained in a cube returns the dataset.

Calculated Measure - Each field in a measure group is known as a base measure. Measures created using

MDX expressions with/without base measures are known as calculated measures.

Data Source View - It’s an insulation layer that inherits the basic schema from the data source with the

flexibility to manipulate the schema in this layer without modifying the actual schema in the data source.

Dimension - A dimension is an OLAP structure that contains attributes related to an entity and is used

to categorize data on the row / column axis. A dimension almost never contains measurable numeric data,

and if it does, that data is used as an attribute. Typical examples of dimensions are Geography,

Organization, Employee, Time etc.

Fact - A fact, known as a Measure Group in a cube, is an OLAP structure that contains

measurable numeric data for one or more entities. In cube parlance these entities are known as

Dimensions. A dimension need not be associated directly with a fact, but a fact is always

associated directly with at least one dimension. Typical examples of facts are Sales, Performance, Tax etc.

Hierarchy - A hierarchy is a collection of nested attributes associated in a parent-child fashion with a defined

cardinality. A dimension is formed of attributes, and a hierarchy contained in a dimension is formed of one or

more attributes from the same dimension.

KPI - Key Performance Indicators are logical structures defined using MDX expressions. Each KPI has a goal,

status, value, trend, and indicator associated with it. Value is derived based on the definition of KPI, all the

rest of these values vary based on this derived value. KPIs are the primary elements that make up a

scorecard in a dashboard.

MDX - Multi Dimensional Expressions is considered as the query language of multi dimensional data

structures. This can be considered as the SQL of OLAP databases, with the major difference that MDX is

mostly used for reading data only.

Named Set - A named set is a pre-defined MDX query defined in the script of the cube. It can be thought of

as analogous to views in a SQL Server database. Named sets can be dynamic or static, and this nature defines

the time when this query gets evaluated.

OLAP - Online Analytical Processing is a term used to represent analytical data sources and analysis

systems. The fundamental perception and expectation associated with the term OLAP is that it would contain

multi dimensional data and the environment hosting the same.

Snowflake Schema - Snowflake schema is an OLAP schema, where one or more normalized dimension

tables are associated with a fact table. For example, Product Sub Category -> Product Category -> Product

can be three normalized dimension tables and Product table can be associated with a fact table like Sales.

This is a very common example of a snowflake schema.

Star Schema - Star schema is an OLAP schema, where all dimension tables are directly associated with

fact tables, and no normalized dimension tables are considered in the schema. For example, Time, Product,

Geography dimension tables would be directly associated with a fact table like Sales. This is a very common

example of star schema.
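
The difference between the two schemas shows up in how an attribute such as the product category is looked up. The Python sketch below uses invented table contents and hypothetical function names; it is an illustration of the lookup paths, not a recommended storage design.

```python
# Snowflake: normalized tables, each pointing at the next one (sample data).
product = {1: {"name": "Road Bike", "subcategory_id": 10}}
product_subcategory = {10: {"name": "Road Bikes", "category_id": 100}}
product_category = {100: {"name": "Bikes"}}

# Star: one denormalized Product dimension row carries every attribute.
product_star = {
    1: {"name": "Road Bike", "subcategory": "Road Bikes", "category": "Bikes"},
}

def category_snowflake(product_id):
    """Walk Product -> Product Sub Category -> Product Category."""
    subcategory_id = product[product_id]["subcategory_id"]
    category_id = product_subcategory[subcategory_id]["category_id"]
    return product_category[category_id]["name"]

def category_star(product_id):
    """Read the category directly from the single dimension row."""
    return product_star[product_id]["category"]

snowflake_result = category_snowflake(1)
star_result = category_star(1)
```

Both lookups return the same category; the snowflake version needs two extra hops through the normalized tables.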
