8/18/2019 CapGemini Datastage Exercise
Training course Datastage (part 1)
V. BEYET
03/07/2006

Presentation ...
Who am I?
Who are you?

Summary
- General presentation (DataStage: what is it?)
- DataStage: how to use it?
- The other components (part 2)

General presentation

DataStage: what is it?
- An ETL tool: Extract-Transform-Load
- A graphical environment
- A tool integrated in a suite of BI tools
- Developed by Ascential (IBM)

DataStage: why use it?
- Large data volumes
- Multi-source and multi-target: files, databases (Oracle, SQL Server, Access, ...)
- Data transformation: select, format, combine, aggregate, sort

DataStage: how does it work?
Development is done:
- in client-server mode,
- with a graphical design of flows,
- with simple, basic elements,
- with a simple language (BASIC).
Treatments are:
- compiled and run by an engine,
- written to a Universe database.

The different tools
- Server
- Designer, Manager
- Administrator, Director

Server
The server contains programs and data.
Programs:
- Called jobs: stored first as source code and then as executable programs, written in the Universe database.
- But we cannot understand the source code.
Data:
- May be written in the Universe database, but is better kept in server directories.

Server
What is a project for DataStage?
- A server is organized in different environments called "projects".
- A project is a separate environment for jobs, table definitions and routines.
- A project can be created at any time.
- The number of projects is unlimited.
- The number of jobs is unlimited for each project.
- But the number of simultaneous client connections is limited.

Server
Universe database:
- The Universe database is a relational database with files.
- Tables are called "Hash Files".
- A Hash File is an indexed file; it is the central element for using all the possibilities of the DataStage engine.
- A Hash File with incorrectly defined keys may create disastrous problems.

Summary
- General presentation (DataStage: what is it?)
- DataStage: how to use it?
- The other components (part 2)

The Designer
The Designer is used to design jobs (look at the icon).
Jobs are composed of two kinds of stages, joined by links:
- active stages: action
- passive stages: data storage
- links: between the stages

Passive stages: a place for data storage (the data flow is from the stage or to the stage).
- Text File: sequential file.
- Hash File: it can be processed only by DataStage (not by WordPad, ...), but simultaneous access to a Hash File is possible.
- UV stage: the file is in the Universe (core DataStage engine).
- ODBC stage, OLEDB, ORAOCI: representation of a database; it gives direct access to a database through an ODBC link.

Active stages
An active stage represents a transformation on the data flow:
- Sort: sorts a file
- Aggregator: calculations
- Transformer: selection, transformation, propagation of properties
- ...

Links
- Between active and passive stages
- Between passive stages
- Between active stages

A job in the Designer
[Screenshot: a job. Labels: Passive Stage; Active Stage]

DataStage Designer:
Each job has:
- one or more sources of data
- one or more transformations
- one or more destinations for the data
The toolbar contains the stage icons used to design the jobs.
Jobs have to be compiled to create executable programs.

[Screenshot: Designer window. Labels: the repository; the toolbar with stage icons (palette); to compile the job; to run the job]

Let's now study the different stages:
- Sequential Files (text files)
- Transformer
- Hash Files
- Sort
- Aggregator
- Routines
- UV stages

Sequential File stage:
- Can be read
- Can be written
- Can be read and written in the same job
- Can be written with or without cache
- Can be a DOS file or a Unix file
- ...
- Can be read by two jobs at the same time
- Cannot be written by two jobs at the same time

Sequential File:
[Screenshot: stage properties. Labels: Stage name; File Type; Stage description]

Sequential File:
[Screenshot: Output link. Label: Stage name (to be written)]

Sequential File:
[Screenshot: Data Format (output file). Label: always use those values]

Sequential File:
[Screenshot: columns. Labels: to test the connection and view the data in the file; the different columns of the file (Output): type, length; size to display (for View Data)]

Sequential File:
To describe a file easily, use or create a "table definition".
- Group your table definitions by application.
- Create or modify the table definitions for files, databases, transformers, ...

Sequential File:
The table definition can then be used in different jobs (click on Load to find the right definition).

Sequential File:
[Screenshot: View Data]

Transformer stage:
- Multi-source and multi-target
- Waits for the source of data to be available
- Makes a lookup between 2 flows (reference)
- Transforms or propagates the data of each flow
- Lets you select, filter, and create a refusals file

Transformer stage:
Can process data with:
- a native BASIC function, or one created in the Manager,
- a DataStage function or DataStage macro,
- routines (before/after type).
Or it can simply propagate columns.

Transformer stage:
[Screenshot: Transformer editor. Labels: Input data; Output data; right click: propagate all the columns]

Transformer stage:
[Screenshot: Transformer editor. Labels: Input data; Output data]

Exercise n°1:
Objective: read a sequential file and create a new one (save the file).
The catalogue.in file has to be read and the catalogue_save.tmp file has to be written.
Source file: catalogue.in (in the /in directory)
Target file: catalogue_save.tmp (in the /tmp directory)
Steps:
1- Create a table definition (structure of the Catalogue table)
2- Design the job with 2 Sequential Files and 1 Transformer
3- Create the links (data flow)
4- Save and compile the job
5- Run the job
6- Look at the performance statistics (right click)

Transformer stage:
Look at the performance of your job: right click on the grid and then select "Show performance statistics".

Create the parameters of the job: menu Edit, Job Properties, Parameters tab.

Exercise n°2:
Objective: use environment variables
- create a job parameter: directory
- place it on all the paths from the job of the first exercise (example: directory/tmp)
- compile
- modify your input file (add your best film)
- run with a different path (other groups)

Hash File stage:
- Necessary for a lookup
- A Hash File is entirely written before it can be read (the FromTrans link must be finished before FromFilmTypeHF can start)
- Groups multiple records with the same key (suppresses duplicate keys)
- Can be read in different jobs simultaneously
- Can be written by different links simultaneously (in the same job or in different jobs)
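
The duplicate-key behaviour of a Hash File can be pictured with a keyed dictionary. This is only a minimal Python sketch, not DataStage code: the function name `write_hash_file` and the column names `type_id` and `label` are hypothetical, and it assumes that a later record with the same key replaces the earlier one.

```python
def write_hash_file(rows, key_columns):
    """Store rows under their key; a row with an already-seen key replaces the old one."""
    hash_file = {}
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        hash_file[key] = row  # same key: the previous record is overwritten
    return hash_file


# Hypothetical film-type records, with one duplicated key (type_id 1).
film_types = [
    {"type_id": 1, "label": "Comedy"},
    {"type_id": 2, "label": "Drama"},
    {"type_id": 1, "label": "Comedy (updated)"},
]

hf = write_hash_file(film_types, ["type_id"])
print(len(hf))            # number of distinct keys
print(hf[(1,)]["label"])  # the record kept for key 1
```

Three records go in and two come out, which is why the slides warn that badly chosen keys can silently lose data.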

Hash File:
[Screenshot: stage properties. Labels: Stage name; Account name (DataStage project); File path]

Hash File:
[Screenshot: input link. Labels: File name; for files to write]
Select this check box to specify that all records should be cached, rather than written to the hashed file immediately. This is not recommended where your job writes and reads to the same hashed file in the same stream of execution.

Hash File:
A key must be defined (it can be a single or multiple key).

Transformer stage: Lookup
- The main flow can be of any type.
- The secondary flow must be a Hash File to design a lookup (so very often you will have to design a temporary Hash File).
- The lookup is done with the key of the secondary flow.
- The number of records in the main flow cannot be higher after the lookup than before it.
- The lookup is shown with a dotted line.
- When a lookup is "exclusive", the number of records after the lookup is smaller than the number of records before it.
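
The lookup rules above can be sketched in Python. This is an illustration under assumptions, not DataStage behaviour taken from the product documentation: the function `lookup`, the flag `exclusive` and the column names are hypothetical, and the reference flow is modelled as a dictionary keyed like a Hash File.

```python
def lookup(main_rows, reference, key, exclusive=False):
    """Enrich each main-flow row with the reference record that has the same key."""
    output = []
    for row in main_rows:
        match = reference.get(row[key])
        not_found = match is None
        if not_found and exclusive:
            continue  # exclusive lookup: rows without a match are dropped
        merged = dict(row)
        merged["type_label"] = None if not_found else match["label"]
        output.append(merged)
    return output


# Hypothetical data: film 3 has a type that is missing from the reference.
films = [
    {"film": "A", "type_id": 1},
    {"film": "B", "type_id": 2},
    {"film": "C", "type_id": 9},
]
types_hf = {1: {"label": "Comedy"}, 2: {"label": "Drama"}}

plain = lookup(films, types_hf, "type_id")
strict = lookup(films, types_hf, "type_id", exclusive=True)
print(len(films), len(plain), len(strict))
```

Because the reference holds at most one record per key, the output never has more rows than the main flow, and the exclusive variant has fewer whenever a key is not found.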

Transformer stage: Lookup
[Screenshot: lookup job. Labels: Principal flow (horizontal); Reference flow (vertical flow)]

Exercise n°3:
Objective: make a lookup between the Catalog file and Film Type to put the film type in the output file.
Source file: catalogue.in (in the /in directory)
Target file: catalogue.out (in the /out directory)
Steps:
1- Create a table definition (structure of the FilmType table)
2- Modify your job to create a Hash File from the FilmType.in file
3- Create the link to show the lookup (data flow)
4- Save and compile the job
5- Run the job
6- Look at the performance statistics (right click)

Exercise n°4:
Objective: put the director name and the film name together, separated by a "...". If the film type is not found, put "..." in the output file.
What happens when the director name is empty? Find a solution.

Exercise n°5:
Objective: if the film type is not found (use a constraint), put the film in a refusals file (first a Sequential File and then a Hash File).

Lookup with selection (exclusive lookup)
Don't forget: a lookup can be designed with an ORAOCI stage or a UV stage, but it works better with Hash Files.

Exercise n°6:
Objective: select only the films for which the type is known (that means that the lookup is OK).

Exercise n°7:
Objective: select all the clients who are female and put them in an output file.
The SEXE column contains M (male) or F (female).
Then create an annotation for this job (all the jobs must have annotations).

The Director
The Director is the job controller. It lets you:
- Run jobs: immediately or later, with more options than in the Designer
- Control job status: Compiled, Running, Aborted, Validated, Failed validation, ...
- Monitor jobs: check the number of lines processed by each active stage of a job

Run jobs with the Director
[Screenshot. Labels: select the job and click here; then enter the parameters]

To run a job later:
[Screenshot. Labels: click here; then choose the date and time]

To modify the running parameters for a job: Limits tab
- Warnings limit: the job stops after x warnings
- Rows limit: the job stops after x rows (on each flow)

Verify the status of jobs with the Director
The status can be:
- "Not compiled"
- "Compiled"
- "Failed validation"
- "Validated ok"
- "Aborted"
- "Finished"
- "Running"

Example: list of jobs
[Screenshot: Director toolbar. Labels: to run jobs; to stop jobs; to run jobs later; to view the log; to reset job status]

Example of a Monitor:
For each step:
- the number of lines processed (input and output)
- the start time
- the execution duration (Elapsed time)
- the status
- the performance (rows/sec)
Link type:
- Pri: principal flow
- Ref: reference flow (lookup)
- Out: output flow
The monitor lets you follow the different stages of a job. This shows the importance of good names for the stages and the links!

Example of a log:
- Green: OK, no problem
- Yellow: warning
- Red: blocking problem
To look at error messages, choose the job and click on the "log" button.
Don't forget: clear the log from time to time (Job > Clear log).

The Manager
The Manager is the tool used to export/import elements from one DataStage project to another.
All the elements (jobs, routines, table definitions) are classified in categories, but each name must be unique within a project.
- To import or export elements, click on the appropriate button
- File > Open Project to change project
- Drag and drop an element to change its category

EXPORT
Choose what you want to export (creates a .dsx file):
- Jobs
- Routines (always check the "Source Code" box)
- Table definitions
[Screenshot: export dialog. Labels: to append to an existing file; to change the selection options: by category or by individual components]

IMPORT
Choose what you want to import. This will create/modify elements in the DataStage project.
[Screenshot: import dialog. Label: make your choice]

With the Manager, you can compile many jobs at the same time (multiple job compile):
- Tools > Run multiple job compile
- select the type of jobs you want to compile, select "Show manual selection page" and click on the "Next" button
- select the jobs and click on the "Next" button
- click on the "Start compile" button

The Designer

Sort stage:
- The sort criteria are filled in on the Stage tab / Properties tab
- Modify those parameters if the file to sort has a lot of lines

Exercise n°8:
Objective: when you have selected all the women, sort the file in alphabetical order.

Aggregator stage:
- Aggregates data into a smaller number of records
- Intermediate processing is executed in memory
- Can execute a before/after routine (before or after the stage processing, when all the lines have been processed)
- Performance is better if data is sorted (Input tab)
- The Aggregator does not sort the records
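
A group-by over input that is already sorted can be done in one pass, which is the idea behind the "sorted input" advice. The Python sketch below only illustrates that idea; `aggregate_sorted` and the column names are hypothetical, and like the Aggregator it does not sort anything itself.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows, group_key, value_key):
    """One-pass group-by; rows must already be sorted on group_key."""
    for key, group in groupby(rows, key=itemgetter(group_key)):
        values = [row[value_key] for row in group]
        yield {group_key: key, "count": len(values), "sum": sum(values)}


# Hypothetical hire records, already sorted by cassette id.
hires = [
    {"id_cas": 10, "days": 2},
    {"id_cas": 10, "days": 5},
    {"id_cas": 11, "days": 1},
]

totals = list(aggregate_sorted(hires, "id_cas", "days"))
print(totals)
```

If the rows were not sorted, the same key could appear in several output groups, so a Sort stage would be needed upstream.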

Aggregator stage: Input tab
[Screenshot. Label: when input data is sorted]

Aggregator stage: Output tab
[Screenshot. Labels: Group by; different functions]

Exercise n°9:
Objective: create a job which reads location.in and calculates the hit-parade of the most hired cassettes (ordered by number of hires, descending). Also output the name of the film and not only the number of the cassette (lookup with catalogue.in).
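
The expected result of this exercise can be sketched outside DataStage to check a job's output. The Python below is an assumption-laden illustration: the layout of location.in and catalogue.in is not given in the slides, so the column name `id_cas`, the function `hit_parade` and the sample data are all hypothetical.

```python
from collections import Counter


def hit_parade(hires, catalogue):
    """Count hires per cassette, order by count descending, add the film name."""
    counts = Counter(hire["id_cas"] for hire in hires)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(id_cas, catalogue.get(id_cas, "unknown"), n) for id_cas, n in ranked]


# Hypothetical hire records and catalogue (cassette id -> film name).
hires = [{"id_cas": 7}, {"id_cas": 3}, {"id_cas": 7}, {"id_cas": 7}, {"id_cas": 3}]
catalogue = {3: "Film X", 7: "Film Y"}

ranking = hit_parade(hires, catalogue)
print(ranking)
```

In a job, the count would come from an Aggregator, the ordering from a Sort stage, and the film name from a lookup on a Hash File built from the catalogue.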

Exercise n°10:
Objective: create a job which reads location.in and calculates the average number of hires for each cassette (2 different methods can be used).

Exercise n°9 (job to design): [screenshot]

Exercise n°10 (job to design): [screenshot]

Hash File stage:
- We have seen that the Hash File is necessary for a lookup.
- We have also seen that a Hash File suppresses duplicate keys.
- Let's now see how it is useful for grouping different flows.

Exercise n°11:
Objective: with the job from exercise 10 (use the 2 methods in the same job), create a Hash File to put the different results in the same Hash File.
Column 1: ... or ...
Column 2: the result of each method
In the Hash File you must have 2 lines.

Exercise n°11 (job to design): [screenshot]

Stage variables:
Simple processing can be done easily with stage variables.
- A stage variable is a value that remains "active" for the whole duration of the stage. So you can find a max (if data is sorted), calculate a sum or count something.
- In the Transformer, right click and then select "Show Stage variables". Example:
[Screenshot: stage variables in the Transformer]

Another example:
[Screenshot: stage variables in the Transformer]

Exercise n°12:
Objective: try to calculate the average with stage variables.

Exercise n°13:
Objective: create a job that creates a file with all the clients (key) and, in a second column, the list of the films (separated by a dot).

Exercise n°13 (job to design): [screenshot]
- The order of the different variables is important. The instructions are executed in the order of the stage variables (to change the order => right click > stage properties > Link ordering tab).
- The variables must be initialized (=> right click > stage properties > variables).
- There must be a hash file after the stage.
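
Why the order, the initialization and the final hash file matter can be seen in a small Python sketch of stage-variable logic. It is an illustration under assumptions: the names `prev_client`, `film_list`, `id_cli` and `film` are hypothetical, the input is assumed to be sorted by client, and a dictionary stands in for the hash file.

```python
def films_per_client(sorted_rows):
    """Build 'film1.film2...' per client from rows sorted by client id."""
    hash_file = {}
    prev_client = None  # stage variable, initialised before the first row
    film_list = ""      # stage variable, initialised before the first row
    for row in sorted_rows:
        # Evaluated in order: film_list must read prev_client BEFORE it is updated.
        if row["id_cli"] == prev_client:
            film_list = film_list + "." + row["film"]
        else:
            film_list = row["film"]
        prev_client = row["id_cli"]
        # Every row is written; the keyed file keeps only the last (complete) list.
        hash_file[row["id_cli"]] = film_list
    return hash_file


rows = [
    {"id_cli": 1, "film": "A"},
    {"id_cli": 1, "film": "B"},
    {"id_cli": 2, "film": "C"},
]
result = films_per_client(rows)
print(result)
```

Swapping the two assignments would compare each row with itself, and without the keyed output every intermediate list would appear as a separate record.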

DataStage variables:
Different variables are defined by DataStage:
- @NULL
- @INROWNUM, @OUTROWNUM
- @DATE
- @TRUE, @FALSE
- @PATH
Link variables:
The most useful is NOTFOUND.

Routines:
- Source code (written in the BASIC language)
- A routine is external to the jobs and can be used many times at many levels
- It can be a Transform function or a Before/After function:
  - a transform function is called for each line
  - a before subroutine is called before the first line (example: empty a file)
  - an after subroutine is called when all the lines have been processed
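
The three call points can be pictured with a small Python sketch. It is not DataStage BASIC and does not reproduce the engine: `run_stage` and its parameters are hypothetical names used only to show when each kind of routine would run.

```python
def run_stage(rows, transform, before=None, after=None):
    """Call `before` once, `transform` on every row, then `after` once."""
    if before is not None:
        before()                      # e.g. empty a file before the first line
    output = [transform(row) for row in rows]
    if after is not None:
        after()                       # runs when all lines have been processed
    return output


calls = []
doubled = run_stage(
    [1, 2, 3],
    transform=lambda value: value * 2,
    before=lambda: calls.append("before"),
    after=lambda: calls.append("after"),
)
print(doubled, calls)
```

The transform function runs once per line, so it should stay cheap; one-off work belongs in the before or after subroutine.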

Routines (1/2)
[Screenshot: routine properties. Labels: Type of routine; Name of the routine; always fill in this short description]

Routines (2/2)
[Screenshot: routine code. Labels: Code: use the argument names; Save; Compile; Test of the routine]

Routines: access to a sequential file
OpenSeq Fic To xxx Then ... End Else ... End
ReadSeq Fic From xxx Then ... End Else ... End
WriteSeq Fic To xxx Then ... End Else ... End
Weofseq xxx (to empty the file)
CloseSeq Fic
File Header

Routines: BASIC syntax
If ... Then
...
End
Else
...
End
GoTo
For i ... To ...
Next i
Loop While ... Repeat
Loop Until ... Repeat
Call DSLogInfo("Information", "RoutineName")
Call DSLogWarn("Warning", "RoutineName")
Call DSLogFatal("Abort", "RoutineName")
A = 'Hello '
B = 'World'
C = A:B   (C = 'Hello World')
Field(...): extracts a delimited substring (slide example: the field after the third comma)
Trim(..., ' ', 'T'): suppresses the trailing spaces
Upcase(...)
Iconv(...
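
For readers who do not know BASIC, the string operations on this slide have close Python counterparts. The sketch below is an approximation under assumptions, not the BASIC functions themselves: the Python helper `field` takes a 1-based occurrence and an optional count, which is how the BASIC Field function is usually described, and the sample string is invented.

```python
def field(text, delimiter, occurrence, count=1):
    """Return `count` delimited substrings starting at the 1-based `occurrence`."""
    parts = text.split(delimiter)
    start = occurrence - 1
    return delimiter.join(parts[start:start + count])


line = "12,Dupont,Paris,2006"

after_third_comma = field(line, ",", 4)   # the field after the third comma
two_fields = field(line, ",", 2, 2)       # two consecutive fields
missing = field(line, ",", 9)             # beyond the last field

greeting = "Hello " + "World"             # BASIC: A:B
trimmed = "value   ".rstrip(" ")          # BASIC: Trim(..., ' ', 'T')
upper = "datastage".upper()               # BASIC: Upcase(...)

print(after_third_comma, two_fields, repr(missing), greeting, trimmed, upper)
```

Asking for a field past the end returns an empty string here, which is a convenient way to handle short lines without raising an error.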

Routines: Test
[Screenshot: routine test window. Label: double-click on the Result column]
85/122
85
Exercise n°14 :
+te* 1 :
Objective : write a routine w"ic" calculates t"e nu)'er of da.
'etween two dates&
?f 'e%in date is null t"en return 0
?f end date is null t"en initialiIe it wit" date of toda.
+a!e co)*ile and test t"e routine&
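
The rules of this routine are easy to check with a reference implementation before writing the BASIC version. The Python sketch below follows the two rules stated in the exercise; the function name `days_between` and the optional `today` argument (added only so the result can be tested with a fixed date) are assumptions.

```python
from datetime import date


def days_between(begin, end, today=None):
    """Number of days from begin to end, following the exercise rules."""
    if begin is None:
        return 0                          # rule 1: no begin date
    if end is None:
        end = today or date.today()       # rule 2: default end date is today
    return (end - begin).days


print(days_between(date(2006, 7, 1), date(2006, 7, 11)))
print(days_between(None, date(2006, 7, 11)))
print(days_between(date(2006, 7, 1), None, today=date(2006, 7, 3)))
```

The same three cases (both dates, no begin date, no end date) make a reasonable test set for the routine's Test window.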
87/122
87
Exercise n°14 :
+te* 2
Objective : Read location&in %enerate a file wit" t"e "ire
duration returned cassettes onl.#
Jon returned cassettes after 10 da.s end date null# will 'e
written in a refusals file wit" t"e na)e and address of client to
send t"en a )ail#
T#e $esigner
T#e $esigner Designer Designer

Exercise n°14 (job to be designed): [screenshot]

Exercise n°15:
Objective: with a routine (use CASE), calculate the amount for the cassette hire (number of days * hire price * coefficient).
The coefficient is calculated with this rule:
- < 5 days = days * hire price
- >= 5 and < 10 days = days * hire price * 1.20
- >= 10 and < 30 days = days * hire price * 1.50
- >= 30 days = days * hire price * 3
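
The pricing rule is a plain tiered lookup, shown here in Python as a reference for checking the routine's results. The function name `hire_amount` is hypothetical, and the if/elif chain plays the role that a CASE statement would play in the BASIC routine.

```python
def hire_amount(days, hire_price):
    """Amount for a hire: days * hire price * a coefficient that depends on the duration."""
    if days < 5:
        coefficient = 1.0
    elif days < 10:
        coefficient = 1.20
    elif days < 30:
        coefficient = 1.50
    else:
        coefficient = 3.0
    return days * hire_price * coefficient


for days in (4, 5, 10, 30):
    print(days, hire_amount(days, 2.0))
```

The boundary values 5, 10 and 30 are the ones worth testing, since each belongs to the higher tier.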

UV stage:
- works with internal hash files (in the DataStage project)
- makes a Cartesian product
- uses SQL requests (select ... from ... where ... order by ...)

Exercise n°16: execute the Cartesian product on the Clients file and the Cassettes file
Objective: propose to each client the cassettes he has never hired
- Step 1: create the job parameter
- Step 2: create a job to write the clients hash file and the cassettes hash file in the DS project (with an account parameter)
- Step 3: in a new job, use those hash files to make the Cartesian product
- Look at your job performance!
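
The size of a Cartesian product explains why the exercise asks you to watch performance. The Python sketch below illustrates the idea with invented data; `never_hired` and the column names `id_cli` and `id_cas` are hypothetical, and the slides do not say how the job itself should exclude the cassettes already hired.

```python
from itertools import product


def never_hired(clients, cassettes, hires):
    """All (client, cassette) pairs minus the pairs that appear in the hire history."""
    hired = {(hire["id_cli"], hire["id_cas"]) for hire in hires}
    return [pair for pair in product(clients, cassettes) if pair not in hired]


clients = [1, 2]
cassettes = [10, 11, 12]
hires = [{"id_cli": 1, "id_cas": 10}, {"id_cli": 2, "id_cas": 12}]

proposals = never_hired(clients, cassettes, hires)
print(len(clients) * len(cassettes), len(proposals))
print(proposals)
```

The product always has clients x cassettes rows, so the row count grows very quickly with the size of both files.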

Exercise 16: Step 1 and Step 2
[Screenshot: job design]

Step 3:
[Screenshot: job design]

[Screenshot. Label: the number of records]

Normalization:
Multi-valued file (one record with key 12, five values held in one field):
12  A, B, C, D, E
Normalized file (one record per value):
12  A
12  B
12  C
12  D
12  E
Normalization goes from the multi-valued file to the normalized file; un-normalization goes the other way.

Normalization:
A multi-valued file must have:
1- a key

Exercise n°17: normalization / un-normalization
- Step 1: create a job which reads the location.in file and writes a hash file (Id_Cli as the key and the list of all Id_Cas separated by @VM): use the Sort stage and stage variables!
  => View Data on the Input link of the Hash File
- Step 2: modify the job to add normalization of this file
  => View Data on the Output link of the Hash File
- Step 3: compare the sequential file with the location.in file
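
Normalization and un-normalization are inverse operations, which is what Step 3 checks. The Python sketch below illustrates that round trip; it assumes the value mark is character 253 (the usual UniVerse convention), and the function names and sample values are hypothetical.

```python
VM = chr(253)  # assumption: the UniVerse value mark


def unnormalize(rows):
    """Group (key, value) rows into one multi-valued field per key."""
    grouped = {}
    for key, value in rows:
        grouped.setdefault(key, []).append(value)
    return {key: VM.join(values) for key, values in grouped.items()}


def normalize(multivalued):
    """Expand each multi-valued field back into one (key, value) row per value."""
    return [(key, value) for key, field in multivalued.items() for value in field.split(VM)]


rows = [(12, "A"), (12, "B"), (12, "C"), (13, "D")]
packed = unnormalize(rows)
unpacked = normalize(packed)
print(unpacked == rows)
```

The round trip gives back the original rows only if they were grouped by key to begin with, which is why Step 1 uses a Sort stage.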

Exercise n°17: job to design and View Data
[Screenshot: job design]

The ORAOCI stages:
- The version of Oracle used is 9i, so use the ORAOCI9 stage
- You can:
  - either use a query generated by DataStage
  - or use a user-defined query
  - or a combination of both
- The access parameters have to be defined by job parameters
- The stage can access one table or more
- Different actions can be programmed: read, insert, update
- You can also use stored procedures

The ORAOCI stages:
The access parameters have to be defined by job parameters.
[Screenshot: connection parameters]

The ORAOCI stages: Output link
[Screenshot. Label: query generated by DataStage or user-defined query]

Query generated by DataStage:
[Screenshot: query builder. Labels: selection of the table(s); selection of the columns; "Group by" clause; sort parameters]

Generate SELECT clause from column list; enter other clauses
[Screenshot]

Enter custom SQL statement: when you want to add something specific, for example to format a date.
[Screenshot]

The ORAOCI stages: Output link
[Screenshot. Labels: choose the table; choose the action; important parameters]

The ORAOCI stages: Output link
[Screenshot. Label: number of lines between 2 commits]
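
The "number of lines between 2 commits" setting amounts to writing rows in batches. The Python sketch below only illustrates that batching; it does not talk to Oracle, and `commit_batches` and `rows_per_commit` are hypothetical names.

```python
def commit_batches(rows, rows_per_commit):
    """Yield the rows in groups; each group stands for one commit."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == rows_per_commit:
            yield batch
            batch = []
    if batch:
        yield batch  # the last, possibly smaller, group is committed too


batches = list(commit_batches(range(7), 3))
print(batches)
```

A larger interval means fewer commits but more work to redo if the job aborts in the middle of a batch.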

The ORAOCI stages: verify error code (1/2)
[Screenshot. Label: whether the job must abort when there is an SQL error]

The ORAOCI stages: verify error code (2/2)
[Screenshot. Labels: process lines 1 by 1; to receive the SQL error code; to select the errors]

The ORA Bulk stages:
- to insert into a table (like SQLLOAD)
- very fast (deactivates the index before the load and reactivates it after the load)
- but no warning if the index is in Unusable state after the load (when there are duplicate keys, for example)
- few date and time formats (DD.MM.YYYY, YYYY-MM-DD, DD-MON-YYYY, MM/DD/YYYY - hh

The ORA Bulk stages
[Screenshot: stage properties. Labels: DSN; user; password; table name (with oracle.tableName); date and time format; number of lines between 2 commits]

How to create a table definition from a table in the database?
In the repository, right click on Table Definitions, then choose "Import", and then Plug-in Meta Data Definitions.
Then choose the table(s) and click on "Import".
The table definitions will be created in the category "ODBC".

Exercise n°18: read a database
Objective: create a job which reads the table RE,($E in the BIOS database
Step 1: create the table definition from the database
Step 2: create the job that reads the table

Exercise n°19: write in a database
Objective: create a job which writes in the table $+$(?J(GF in the BIOS database, only the first 2 columns (keys).
location.in          $+$(?J(GF
Id_Cli  ========>>  CHAR1
Id_Cas  ========>>  CHAR2
In CHAR1, put a letter (different for each group) before the client number (Id_Cli).
Step 1: use the ORAOCI stage
Step 2: same exercise with the ORA Bulk stage

Exercise n°20: update a database
Objective: create a job to update the columns BEGIN_DATE and END_DATE in the table $+$(?J(GF in the BIOS database from the location.in file.
BEGIN_DATE and END_DATE are defined as timestamp!

The Administrator
The Administrator is used to:
- create a DataStage project
- unlock a job
Sometimes, due to server problems, the Designer (or Manager) crashes and some elements may stay locked (jobs, table definitions, routines, ...). In that case, in the Administrator (with administrator security rights):

Unlock a job (1/2)
[Screenshot: Administrator. Labels: choose your project and click on the Command button; to create a project]

Unlock a job (2/2)
- unlock your job with its device number
- or with the user number (UNLOCK USER UserNumber READULOCK)
- or unlock everything (UNLOCK ALL)

Create a project
[Screenshot. Labels: Project name; location for the project]
The location holds the project's jobs, routines, UV hash files, table definitions, ... on the server. It must be different from the location for the directories of data!