Joins in Oracle BI EE
06/08/2010 BY DAAN BAKBOORD
There are a lot of questions on the Oracle Forums, some of them recent, about joins in Oracle BI EE. It seems
there are a lot of misconceptions about the use of joins in Oracle BI EE.
There are two types of joins:
Foreign Key Join
Complex Join
These joins can be applied in two different layers:
Physical Layer
Logical Layer (Business Model)
–> When would you use which join?
You can use both joins in both layers. There are some basic rules when it comes to using joins in Oracle BI EE.
In the Physical Layer, you should use Foreign Key joins, while in the Logical layer it’s common to use Complex Joins.
–> Why should you use these joins?
Physical Layer – Foreign Key Join
The Oracle BI Server uses the Foreign Key Joins to construct the WHERE clause when selecting from multiple tables.
There are two types of keys in the Physical Layer:
Primary Key – Unique identifier of a single record in a table
Foreign Key – Reference to the Primary Key of another table
So basically the Foreign Key Join is a ‘Primary Key / Foreign Key’ relationship between two tables.
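As a rough illustration of how a Primary Key / Foreign Key relationship ends up in a WHERE clause, here is a minimal Python sketch. It is not how the Oracle BI Server is implemented; the `ForeignKeyJoin` class, the `build_query` function, and all table and column names are hypothetical and only serve to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKeyJoin:
    """Hypothetical description of one physical-layer FK join."""
    fk_table: str   # table holding the foreign key (e.g. a fact table)
    fk_column: str
    pk_table: str   # table holding the referenced primary key
    pk_column: str


def build_query(columns, joins):
    """Build a SELECT whose WHERE clause comes only from the FK joins."""
    tables = []
    for j in joins:
        for t in (j.fk_table, j.pk_table):
            if t not in tables:
                tables.append(t)
    predicates = [
        f"{j.fk_table}.{j.fk_column} = {j.pk_table}.{j.pk_column}"
        for j in joins
    ]
    return (
        "SELECT " + ", ".join(columns)
        + " FROM " + ", ".join(tables)
        + " WHERE " + " AND ".join(predicates)
    )


# Hypothetical star schema: one fact table, two dimensions.
joins = [
    ForeignKeyJoin("SALES_F", "PRODUCT_KEY", "PRODUCT_D", "PRODUCT_KEY"),
    ForeignKeyJoin("SALES_F", "MONTH_KEY", "MONTH_D", "MONTH_KEY"),
]
sql = build_query(["PRODUCT_D.NAME", "SALES_F.AMOUNT"], joins)
print(sql)
```

Each FK join contributes exactly one equality predicate, which is why a clean set of Primary Key / Foreign Key definitions is enough to join any combination of tables.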
There are situations where you could use Complex Joins in the Physical Layer. Jeff McQuigg has written a good
blog post about this subject.
Basically it comes down to the following three situations:
Range Joins
Data Type Conversion Joins
Cartesian Joins
Logical Layer – Complex Join
You use Complex Joins in the Logical Layer only to tell the Oracle BI Server that there is a link between the two
tables. The Oracle BI Server should go to the Physical layer to see the actual link between the two tables.
A logical table consists of one or more Logical Table Sources. Multiple Logical Table Sources lead to multiple join
paths. Using a Foreign Key join in a situation where multiple join paths exist restricts the Oracle BI Server to
using only the specified join. Therefore it is common practice to use a Complex Join in the Logical Layer.
I hear you thinking: why is the Foreign Key Join supported in the first place? It seems to be there only for backwards
compatibility.
In the Logical Layer you define Primary Keys as well. The purpose of a Primary Key in the Logical Layer is…
… to identify the unique identifier of a single record in a Logical Table
… to identify the lowest level of detail of a Logical Table
There are three ways to fight circular joins, and if you follow them, you will never ever encounter
one. In fact, I haven’t seen a circular join in a few years.
1. Import physical tables without FK joins turned on. Make the joins yourself, so there is nothing hidden that could cause problems. You should “own” your joins, not the other way around.
2. Aliases are the technique one uses to eliminate circular joins. An easy example of how to use aliases is when you have a lookup table, like W_LOV_D – each version of it requires an alias, which will be joined to differently.
3. With all tables & aliases, using a naming standard will guarantee that only one physical layer object will exist in one Business Model object. If you prefix every object in the physical layer with its Logical usage first (essentially making everything an alias), you will not be able to use the Customer (W_COUNTRY_D) table in the Employee logical table, as the standard would require you to use a table alias called Employee (W_COUNTRY_D). The 7.9 OOB repository has finally adopted a similar approach.
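The alias technique from point 2 can be sketched with plain SQL aliases, run here through Python's built-in `sqlite3` module. This is only an analogy for physical layer aliases in the repository. W_LOV_D comes from the text above; the CASE_F table, its columns, and the alias names are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE W_LOV_D (LOV_KEY INTEGER PRIMARY KEY, TYPE TEXT, NAME TEXT);
    CREATE TABLE CASE_F (CASE_KEY INTEGER PRIMARY KEY,
                         STATUS_KEY INTEGER, PRIORITY_KEY INTEGER);
    INSERT INTO W_LOV_D VALUES (1, 'Case Status', 'Open'),
                               (2, 'Case Status', 'Closed'),
                               (3, 'Case Priority', 'High');
    INSERT INTO CASE_F VALUES (100, 1, 3), (101, 2, 3);
""")

# Each usage of the lookup table gets its own alias and its own join,
# so the two lookups never share a join path.
rows = conn.execute("""
    SELECT f.CASE_KEY, status_lov.NAME, priority_lov.NAME
    FROM CASE_F f
    JOIN W_LOV_D AS status_lov   ON f.STATUS_KEY   = status_lov.LOV_KEY
    JOIN W_LOV_D AS priority_lov ON f.PRIORITY_KEY = priority_lov.LOV_KEY
    ORDER BY f.CASE_KEY
""").fetchall()
print(rows)
```

Because the status lookup and the priority lookup are separate aliases, neither join can accidentally be reused for the other.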
Nullable Flag
The most devious Gotcha, however, is the nullable flag on each column. Many DBAs get lazy
and do not set NOT NULL constraints in the database; this misinformation is consumed by
the BI Server and gives it a tummy ache. Not-null settings are important to the BI Server,
as they guide not only how it generates SQL but also how it does internal joins.
By leaving the default of Nullable set to true, just think what that is saying: it is saying
that one can expect null values to appear in the column. Whenever a join on this
column is needed, such as for a subtotal or a logical query that derives into multiple
physical SQLs, the join syntax can be affected. When subtotaling on such a field,
certain kinds of metrics that are calculated in the SQL (such as a count distinct) can
cause problems.
When subtotaling on a field in a report, this will translate into a multi-part query of the
basic format:
SELECT fields
FROM (SELECT detail rows) DR,
     (SELECT subtotal rows) SR
WHERE DR.subtotal_field = SR.subtotal_field
Note that for straight additive measures, the server will do the calculation and not the
SQL.
If the field you are SubTotaling on is set to nullable, then you have the following join
condition:
nvl(DR.c1 , 88.0) = nvl(SR.c1 , 88.0)
and nvl(DR.c1 , 99.0) = nvl(SR.c1 , 99.0)
No, that is not a typo. Although the join condition it generates looks a little bizarre and
redundant, it is clear that the server is taking into consideration that the field may be
null. For varchar fields, it uses ‘q’ and ‘z’ instead of 88 and 99. (Using two different
substitute values keeps this safe even if you actually have those values in the field: a
real 88 compared with a null passes the first comparison but fails the second.)
When set to Not Null, your join condition will be simply:
DR.c1 = SR.c1
There are two errors you can make, each the opposite of the other:
1. Setting the column to nullable when it is really not null. This will give you the correct answer, but it will be slower than the straight join because indexes are not used.
2. Setting the column to not null when it is really nullable. This will result in the wrong answer, as the equi-join will remove null rows from the recordset completely.
Thus, be very accurate about your nullable settings in the physical layer. Ideally this
should be handled in the database data model before you import a table, to ensure that
your physical layer does not become out of sync with the actual database.
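The second error (a column flagged not null that really contains nulls) can be reproduced with a small sketch using Python's built-in `sqlite3` module. The DR and SR tables stand in for the detail and subtotal result sets; their columns and data are made up for the example, and `COALESCE` is used in place of Oracle's `nvl`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DR (c1 REAL, detail TEXT);     -- detail rows
    CREATE TABLE SR (c1 REAL, subtotal REAL);   -- subtotal rows
    INSERT INTO DR VALUES (1.0, 'a'), (NULL, 'b');
    INSERT INTO SR VALUES (1.0, 10.0), (NULL, 20.0);
""")

# Join condition used when the column is flagged Not Null: a plain equi-join.
equi = conn.execute("""
    SELECT DR.detail, SR.subtotal FROM DR, SR
    WHERE DR.c1 = SR.c1
    ORDER BY DR.detail
""").fetchall()

# Join condition used when the column is flagged nullable.
null_aware = conn.execute("""
    SELECT DR.detail, SR.subtotal FROM DR, SR
    WHERE COALESCE(DR.c1, 88.0) = COALESCE(SR.c1, 88.0)
      AND COALESCE(DR.c1, 99.0) = COALESCE(SR.c1, 99.0)
    ORDER BY DR.detail
""").fetchall()

print(equi)        # the row with a null c1 is missing
print(null_aware)  # the row with a null c1 is kept
```

The equi-join silently loses the detail row whose key is null, which is exactly the wrong answer described above.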
Primary Keys
When beginning with Analytics / OBI, a lot of people make the mistake of creating too
many primary keys when their PK simply has multiple columns.
Normally in Data Warehousing, a dimension table has a single field to be its primary
key. However sometimes we do not have that luxury, and must use multiple fields to
represent our PK. The best example of this is a Fact table: its PK is a combination of all
the FKs to the dimensions. For example, a small fact table with FKs to Product, Month
and Region has a composite Primary Key of PRODUCT_KEY, MONTH_KEY, REGION_KEY.
It is one key and not three: in the Analytics Physical layer, configure one Primary key
but select multiple physical columns to define the key.
As a basic rule, think of the Highlander: “There can be only one!” There are advanced
scenarios where there may be more than one, but 99% of the time, make sure every
physical layer table has a single PK.
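The "one key, several columns" idea can be sketched at the database level with Python's built-in `sqlite3` module. The SALES_F table name, the AMOUNT column, and the sample values are assumptions for the example; the three key columns are the ones named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SALES_F (
        PRODUCT_KEY INTEGER NOT NULL,
        MONTH_KEY   INTEGER NOT NULL,
        REGION_KEY  INTEGER NOT NULL,
        AMOUNT      REAL,
        PRIMARY KEY (PRODUCT_KEY, MONTH_KEY, REGION_KEY)  -- one key, three columns
    )
""")
conn.execute("INSERT INTO SALES_F VALUES (1, 201001, 7, 50.0)")
# Same product and region, different month: allowed, because only the
# combination of the three columns has to be unique.
conn.execute("INSERT INTO SALES_F VALUES (1, 201002, 7, 60.0)")

try:
    # Same combination as the first row: rejected by the composite key.
    conn.execute("INSERT INTO SALES_F VALUES (1, 201001, 7, 99.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM SALES_F").fetchone()[0]
print(duplicate_rejected, row_count)
```

Defining three separate single-column keys instead would wrongly claim that each product, month, and region can appear only once in the fact table.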
Complex Joins
Occasionally I will see complex joins where a complex join is not needed. The vast
majority of the time your joins should be straight FK joins, even if the PK of the parent
table is a composite PK (see above).
Complex joins are used in the following scenarios:
Range Joins. Useful for YTD metrics or rolling metrics. In 10.1.3.2, the need to do this has been eliminated, but your legacy config may still need them for this purpose. An example might be: W_DAY_D.DAY_KEY >= FACTS.DAY_KEY AND W_DAY_D.DAY_KEY < FACTS.DAY_KEY + 7. This join will bring back the last seven days of facts for the current day selected, assuming DAY_KEY is a sequential day number. (Yes, this looks strange, but try it out on a piece of paper to see how it works.)
Data Type Conversion Joins. This is common in non-star schema models, or poorly designed ones. The most common example is the conversion of a date field to a YYYYMMDD format to join to a date table that has a field of the same YYYYMMDD format. Incidentally, this is one of the reasons why you would have more than one PK on the date table: DAY_KEY and DAY_NUMBER would both be PKs if you had to do this.
Cartesian Joins. In strange circumstances, a 1=1 join can be used. Be extremely careful, as this is a Cartesian product, which means the same thing as no join, which in turn means all records from table 1 will be paired with all records from table 2. I have had to use this in my current project extensively, where we have to force a dimension to work with a fact table for the sake of the UI. When doing so, you must guarantee that only one of the dimensional values is selected in the report or prompt; otherwise your result will be substantially incorrect.
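To try the range join without a piece of paper, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes DAY_KEY is a sequential integer day number (not a YYYYMMDD value) and writes the rolling seven-day condition as W_DAY_D.DAY_KEY >= FACTS.DAY_KEY AND W_DAY_D.DAY_KEY < FACTS.DAY_KEY + 7. The AMOUNT column and the fourteen days of sample data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE W_DAY_D (DAY_KEY INTEGER PRIMARY KEY);
    CREATE TABLE FACTS (DAY_KEY INTEGER, AMOUNT INTEGER);
""")
# Fourteen consecutive days, one fact row of amount 1 per day.
conn.executemany("INSERT INTO W_DAY_D VALUES (?)", [(d,) for d in range(1, 15)])
conn.executemany("INSERT INTO FACTS VALUES (?, 1)", [(d,) for d in range(1, 15)])

# Select day 10 in the day dimension; the range join pulls in every fact
# dated on that day or on one of the six days before it.
first_day, last_day, total = conn.execute("""
    SELECT MIN(FACTS.DAY_KEY), MAX(FACTS.DAY_KEY), SUM(FACTS.AMOUNT)
    FROM W_DAY_D, FACTS
    WHERE W_DAY_D.DAY_KEY >= FACTS.DAY_KEY
      AND W_DAY_D.DAY_KEY <  FACTS.DAY_KEY + 7
      AND W_DAY_D.DAY_KEY = 10
""").fetchone()
print(first_day, last_day, total)
```

For the selected day 10 the join matches facts from day 4 through day 10, so a simple SUM over the joined rows behaves like a rolling seven-day metric.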
Scenarios when you should not use a complex join
1:1 Joins. This is common when using _DX and _FX tables in the Oracle BI applications (the RMW). A complex join is frequently over-included in SQL, meaning that sometimes it is added and slows down performance even when it is not needed. Model this with the _DX or _FX as the parent of the base _D or _F table using a regular FK join.
Filtering. This is where there is a filter in the join to exclude some rows from one of the tables. Think about a LOV table. One could make a join to a LOV and filter on the TYPE field to keep just ‘Case Status Types’. Although this will work, it is poor design and can be missed in other joins. It is best to put the filters in the LTS Content tab for the Dimension table it is mapped to.