DataStage Administrator and Director: Basic
C3: Protected
© Copyright 2005, Cognizant Academy, All Rights Reserved
About the Author
Created By: Mandhagini P.S (127057)
Credential Information:
A DataStage expert with 3 years of IT experience
Version and Date:
DS/PPT/1106/1.0
DataStage Administrator and Director: Overview
Introduction:
DataStage is a widely used data warehousing (DW) tool for developing
complex ETL jobs. It supports real-time integration and provides a
user-friendly interface. DataStage also has many features that make
back-end queries easier.
DataStage Administrator lets you set up DataStage projects and carry out
general administration of DataStage.
DataStage Director lets you schedule, run, and monitor jobs, and view the
job log after a job has run.
DataStage Administrator and Director: Objectives
Objective: After completing this chapter, you will be able to:
• Identify what the DataStage tool is
• Define DataStage Administrator
• Work with DataStage Administrator
• Explain DataStage Director
• Work with DataStage Director
DataStage Administrator: Logging In
• Logging into a DataStage server using the Administrator requires the host name
of the server (fully qualified if necessary) or the server’s IP address, and
an operating system username and password.
• For UNIX servers, users who log in as root, as a root-equivalent account, or as
dsadm have full administrative rights.
• For Windows servers, users who are members of the Local Administrators group
(standalone server) or the Domain Administrators group (domain controller or
servers in an Active Directory forest) have full administrative rights.
DataStage Administrator: Logging In (Contd.)
The Administrator Login Dialog Box
Enter the hostname or IP address of the server where DataStage is installed
Enter your operating system username and password
Viewing the Project List
• The Projects page lists the DataStage projects and shows the pathname of the selected project in the Project pathname field. It has the following buttons:
– Add: Adds new DataStage projects. This button is enabled only if you have administrator status.
– Delete: Deletes projects. This button is enabled only if you have administrator status.
– Properties: Views or sets the properties of the selected project.
– NLS: Lets you change project maps and locales (if the NLS option was installed during the server installation).
– Command: Issues DataStage Engine commands directly from the selected project.
Adding Projects
• Provided that you have the proper permissions, you can add as many projects to the DataStage server as necessary.
• In normal projects, any DataStage developer can create, delete, or modify any object within the project once it has been created.
Tip: By default, projects are created under the root directory of the DataStage server installation. For example, if the server was installed to /appl/Ascential/DataStage, the projects would be created in /appl/Ascential/DataStage/Projects/{project name}.
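As a minimal sketch of the default location described in the tip on this slide, the project directory can be derived from the installation root. The helper name `default_project_path` and the project name `dev_project` are hypothetical; the code only mirrors the path layout from the example and does not call any DataStage API.

```python
import posixpath


def default_project_path(install_root: str, project_name: str) -> str:
    """Return <install root>/Projects/<project name> as a UNIX-style path.

    Hypothetical helper: it only reproduces the layout given in the tip.
    """
    return posixpath.join(install_root, "Projects", project_name)


# "dev_project" is an illustrative project name, not one from the course.
print(default_project_path("/appl/Ascential/DataStage", "dev_project"))
```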
Deleting Projects
Make sure you have a current backup of your project, just in case!
Highlight the project to be deleted
General Project Options
• Enable job administration in Director - enabling this feature lets the user run
Cleanup Resources and Clear Status File from the Job menu of
DataStage Director.
• Enable Runtime Column Propagation for Parallel Jobs - if you enable this feature,
stages in parallel jobs can handle undefined columns that they encounter when
the job is run, and propagate these columns through to the rest of the stages in
the job.
• Auto-purge of job log - this setting automatically purges job log entries
according to the auto-purge action setting. For example, if you choose to auto-purge
up to the previous 3 job runs, only the entries for the 3 most recent job runs are
kept as new job runs are completed.
General Project Options (Contd.)
Auto-purge settings for job logs: not a global or retroactive setting
Create environment variables
Setting Project-wide Environment Variables
• You can set project-wide defaults for general environment variables or ones
specific to parallel jobs from this page.
• You can also specify new variables. All of these are then available to be used in
jobs.
• In each of the categories except User Defined, only the default value can be
modified. In the User Defined category, users can create new environment
variables and assign default values.
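The rule on this slide (predefined categories allow only a change of default value, while the User Defined category also allows new variables) can be sketched as a small model. This is an illustration under stated assumptions, not how DataStage stores its settings: the class `ProjectEnvironment` and the variable names `EXAMPLE_BUILTIN` and `EXAMPLE_SOURCE_DIR` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectEnvironment:
    """Toy model of project-wide environment variable defaults."""

    # Predefined categories: the variables exist already, only defaults change.
    predefined: dict = field(default_factory=dict)
    # User Defined category: new variables may be created here.
    user_defined: dict = field(default_factory=dict)

    def set_default(self, name: str, value: str) -> None:
        """Change the default value of a variable that already exists."""
        if name in self.predefined:
            self.predefined[name] = value
        elif name in self.user_defined:
            self.user_defined[name] = value
        else:
            raise KeyError(f"unknown variable: {name}")

    def create_user_defined(self, name: str, default: str) -> None:
        """Create a new variable; only the User Defined category allows this."""
        if name in self.predefined or name in self.user_defined:
            raise ValueError(f"variable already exists: {name}")
        self.user_defined[name] = default

    def defaults_for_jobs(self) -> dict:
        """Return every variable with its default, as made available to jobs."""
        return {**self.predefined, **self.user_defined}


# Hypothetical variable names, used only to exercise the model.
env = ProjectEnvironment(predefined={"EXAMPLE_BUILTIN": "old"})
env.set_default("EXAMPLE_BUILTIN", "new")
env.create_user_defined("EXAMPLE_SOURCE_DIR", "/data/in")
print(env.defaults_for_jobs())
```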
Setting Project-wide Environment Variables (Contd.)
Enable Server-Side Job Tracing
Trace files that have been created
Enable or disable tracing in the project
View or delete the currently highlighted file
You can trace the activities on the server to help diagnose project problems.
Validating User Account for Job Scheduling
Select a user account with proper access to the DataStage project
Verification that the currently selected user account can schedule jobs
• This tab applies to Windows NT/2000 servers only.
• DataStage uses the Windows NT Schedule service to schedule jobs.
Performance Tuning Options
Some performance tuning options are:
• Row buffering
• Hashed file stage caching
Server Commands
Select a project and click ‘Command’
Enter a valid DataStage command
When you execute the command, a new window will show the response from the engine
Assigning Roles (Operator/Developer) to User Accounts
There are four roles for a DataStage user account:
• DataStage Developer: Has full access to all areas of a DataStage project.
• DataStage Production Manager: Has full access to all areas of a DataStage
project, and can also create and manipulate protected projects.
• DataStage Operator: Has permission to run and manage DataStage jobs.
• <None>: Does not have permission to log on to DataStage.
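The four roles above can be summarised as a lookup table. The sketch below is a simplification: the capability labels (`log_on`, `run_jobs`, `edit_project`, `manage_protected_projects`) are hypothetical names invented for illustration, and the mapping only restates the role descriptions from this slide.

```python
from enum import Enum


class Role(Enum):
    DEVELOPER = "DataStage Developer"
    PRODUCTION_MANAGER = "DataStage Production Manager"
    OPERATOR = "DataStage Operator"
    NONE = "<None>"


# Hypothetical capability labels; "edit_project" stands in for
# "full access to all areas of a DataStage project".
PERMISSIONS = {
    Role.DEVELOPER: {"log_on", "run_jobs", "edit_project"},
    Role.PRODUCTION_MANAGER: {
        "log_on",
        "run_jobs",
        "edit_project",
        "manage_protected_projects",
    },
    Role.OPERATOR: {"log_on", "run_jobs"},
    Role.NONE: set(),  # cannot log on to DataStage at all
}


def can(role: Role, capability: str) -> bool:
    """Return True if the role includes the given capability label."""
    return capability in PERMISSIONS[role]


for role in Role:
    print(role.value, sorted(PERMISSIONS[role]))
```

In this model only the Production Manager role carries `manage_protected_projects`, which matches the description that it alone can create and manipulate protected projects.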
Assigning Roles (Operator/Developer) to User Accounts (Contd.)
Select the user role to assign to each user account.
Settings for Parallel Jobs
• Enable Runtime Column Propagation for Parallel Jobs
When this feature is enabled, stages in parallel jobs can handle undefined
columns that they encounter when the job is run, and propagate these columns
through to the rest of the job.
• Enable Remote Execution of Parallel Jobs
Select this to specify that parallel jobs in this project are to be deployed on a USS
(UNIX System Services) machine. When this option is selected, the Remote tab
is enabled and you can specify details about the jobs that are deployed.
Settings for Parallel Jobs (Contd.)
Enable these options.
DataStage Director: Logging In
• Logging into a DataStage server using the Director requires the host name of the
server (fully qualified if necessary) or the server’s IP address, and an
operating system username and password.
DataStage Director: Logging In (Contd.)
The Director Login Dialog Box
Enter the hostname or IP address of the server where DataStage is installed
Enter your operating system username and password
Select the project to attach to
Viewing the Job Run Status
• The Job Status view shows the status of all the jobs in the currently selected job category, or, if the job category pane is hidden, in the current project. The view has the following columns:
– Job name: The name of the job.
– Status: The status of the job.
– Started on date: The time and date a job was started. These fields are only filled in for a job with a status of Running.
– Last ran on date: The time and date the job was finished, stopped, or aborted. These columns are blank for jobs that have never been run.
– Description: A description of the job, if available.
• To view more details about a job’s status, select the job and do one of the following:
– Choose View -> Detail.
– Right-click to display the shortcut menu and choose Detail.
– Double-click the job.
Viewing the Job Run Status (Contd.)
Detailed information about a job’s status
Validating a Job
• You can check that a job or job invocation will run successfully by validating it.
• Jobs should be validated before running them for the first time, or after making
any significant changes to job parameters.
• When a server job is validated, the following checks are made without actually
extracting, converting, or writing data:
– Connections are made to the data sources or data warehouse.
– SQL SELECT statements are prepared.
– Files are opened. Intermediate files in Hashed File, UniVerse, or ODBC stages
that use the local data source are created, if they do not already exist.
Validating a Job (Contd.)
Click Validate when Job Run Options and parameters have been set
Running a Job
Click Run when Job Run Options, parameters, and tracing options have been set
Monitoring a Job
Expand tree to see all links attached to an active stage
Optionally show CPU utilization for each active stage
Stopping a Job
Click Stop button to stop a running job
Resetting a Job
• If a job has stopped or aborted, it is difficult to determine whether all the
required data was written to the target data tables. When a job has a status of
Stopped or Aborted, you must reset it before running the job again. By resetting
a job, you set it back to a runnable state and, optionally, return your target files to
the state they were in before the job was run.
• To reset a job or job invocation:
1. Select the job or invocation you want to reset in the Job Status view.
2. Choose Job -> Reset or click the Reset button on the toolbar. A message
box appears.
3. Click Yes to reset the tables. All the files in the job are reinstated to the state
they were in before the job was run. The job’s status is updated to “Has been
reset”.
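The reset rule on this slide (a job with a status of Stopped or Aborted must be reset before it can run again) can be sketched as a small state model. This is an illustration only: the `Job` class, the `JobNotRunnableError` exception, and the job name `load_customers` are hypothetical, and the sketch does not restore any target files.

```python
# Statuses that block a rerun until the job is reset (taken from the slide).
MUST_RESET = {"Stopped", "Aborted"}


class JobNotRunnableError(Exception):
    """Raised when a job is run while it still needs a reset."""


class Job:
    def __init__(self, name: str, status: str) -> None:
        self.name = name
        self.status = status

    def can_run(self) -> bool:
        """A job is runnable unless it is in a Stopped or Aborted state."""
        return self.status not in MUST_RESET

    def reset(self) -> None:
        """Return a Stopped or Aborted job to a runnable state.

        Assumption: other statuses are left unchanged by this sketch.
        """
        if self.status in MUST_RESET:
            self.status = "Has been reset"

    def run(self) -> None:
        if not self.can_run():
            raise JobNotRunnableError(
                f"{self.name} is {self.status}; reset it before running again"
            )
        self.status = "Running"


job = Job("load_customers", "Aborted")  # hypothetical job name
print(job.can_run())
job.reset()
print(job.status)
job.run()
print(job.status)
```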
Resetting a Job (Contd.)
Click Reset button to return a job to a runnable state
Interpreting the Job Execution Details in Log View
Current run: black
Previous run: blue
Additional information is available for this entry (…)
Log Event Detail Window
Detail information can be copied to the system clipboard and pasted into a text editor—useful for sending errors to support!
Additional lines of information regarding this particular event
Filtering Log Events
Where to start showing log entries
Where to stop showing log entries
How many log entries to show
What type of log entries to show
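The four filter criteria listed on this slide (where to start, where to stop, how many entries, and which types) can be sketched as a function over an in-memory list. The `LogEntry` structure, the `filter_log` function, and the sample messages are hypothetical; the sketch does not read a real DataStage job log, and it assumes that a count limit keeps the most recent entries.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass
class LogEntry:
    timestamp: datetime
    event_type: str  # illustrative labels such as "Info", "Warning", "Fatal"
    message: str


def filter_log(
    entries: Iterable[LogEntry],
    start: Optional[datetime] = None,
    stop: Optional[datetime] = None,
    limit: Optional[int] = None,
    types: Optional[Set[str]] = None,
) -> List[LogEntry]:
    """Apply the four filter criteria; None means no restriction."""
    selected = sorted(
        (
            e
            for e in entries
            if (start is None or e.timestamp >= start)
            and (stop is None or e.timestamp <= stop)
            and (types is None or e.event_type in types)
        ),
        key=lambda e: e.timestamp,
    )
    if limit is not None:
        # Assumption: when a count is given, keep the most recent entries.
        selected = selected[-limit:] if limit > 0 else []
    return selected


log = [
    LogEntry(datetime(2005, 11, 1, 9, 0), "Info", "Starting job"),
    LogEntry(datetime(2005, 11, 1, 9, 5), "Warning", "Null value in column"),
    LogEntry(datetime(2005, 11, 1, 9, 9), "Fatal", "Job aborted"),
]
for entry in filter_log(log, types={"Warning", "Fatal"}):
    print(entry.timestamp, entry.event_type, entry.message)
```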
Clearing Log Entries
Immediately delete log entries or automatically purge entries
Which entries to remove immediately
Which entries to remove automatically
Clearing Log Entries (Contd.)
Options in Auto-Purge:
• Up to previous (job runs): Purges old log entries, leaving the specified number
of recent job run entries in the file.
• Older than (days): Purges all log entries older than the specified number of
days.
Specify the number of job run entries or days by clicking the arrow buttons
or entering the value directly.
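The two auto-purge options on this slide can be sketched as two filters over a list of log entries. This is only a model of the described behaviour, assuming each entry records which job run produced it; the `LogEntry` tuple and both function names are hypothetical and do not correspond to a DataStage API.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    run_number: int  # which job run produced the entry (assumed field)
    timestamp: datetime
    message: str


def purge_up_to_previous(entries: List[LogEntry], keep_runs: int) -> List[LogEntry]:
    """Keep only the entries that belong to the most recent `keep_runs` runs."""
    runs = sorted({e.run_number for e in entries})
    kept = set(runs[-keep_runs:]) if keep_runs > 0 else set()
    return [e for e in entries if e.run_number in kept]


def purge_older_than(entries: List[LogEntry], days: int, now: datetime) -> List[LogEntry]:
    """Drop every entry older than `days` days, measured from `now`."""
    cutoff = now - timedelta(days=days)
    return [e for e in entries if e.timestamp >= cutoff]


# Four runs on four consecutive mornings (illustrative data).
entries = [
    LogEntry(run, datetime(2005, 11, run, 9, 0), f"run {run} finished")
    for run in range(1, 5)
]
print([e.run_number for e in purge_up_to_previous(entries, keep_runs=3)])
print([e.run_number for e in purge_older_than(entries, 2, datetime(2005, 11, 4, 12, 0))])
```

Passing `now` explicitly keeps the second function deterministic, which makes the cutoff easy to check by hand.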
Schedule View
Scheduling a Job Execution
You can schedule a job to run in a number of ways:
• Once today at a specified time
• Once tomorrow at a specified time
• On a specific day and at a particular time
• Daily at a particular time
• On the next occurrence of a particular date and time
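To make the scheduling options above concrete, the sketch below computes the run time each option would produce from a given "now". It is an interpretation rather than the Director's actual scheduler: the function names are hypothetical, and "next occurrence of a particular date" is assumed to mean the next time a given month and day comes around.

```python
from datetime import date, datetime, time, timedelta


def once_today(now: datetime, at: time) -> datetime:
    """Run today at the given time (may already be past; caller must check)."""
    return datetime.combine(now.date(), at)


def once_tomorrow(now: datetime, at: time) -> datetime:
    """Run tomorrow at the given time."""
    return datetime.combine(now.date() + timedelta(days=1), at)


def next_daily(now: datetime, at: time) -> datetime:
    """Next run of a daily schedule: today if still ahead, else tomorrow."""
    candidate = datetime.combine(now.date(), at)
    return candidate if candidate > now else candidate + timedelta(days=1)


def next_occurrence(now: datetime, month: int, day: int, at: time) -> datetime:
    """Next time the given month/day at the given time comes around."""
    for year in range(now.year, now.year + 9):
        try:
            candidate = datetime.combine(date(year, month, day), at)
        except ValueError:
            continue  # e.g. 29 February in a non-leap year
        if candidate > now:
            return candidate
    raise ValueError("no valid occurrence found for that month and day")


now = datetime(2005, 11, 15, 10, 0)
print(once_today(now, time(22, 0)))
print(once_tomorrow(now, time(22, 0)))
print(next_daily(now, time(9, 0)))
print(next_occurrence(now, 1, 1, time(0, 0)))
```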
Scheduling a Job Execution (Contd.)
Select a job and click the Schedule button
Rescheduling a Job Execution
Select a previously scheduled job and click the Reschedule button
Unscheduling a Job Execution
Right-click a previously scheduled job and click Unschedule
Cleaning Up Resources
• If the Enable Job Administration in Director option has been set in the DataStage
Administrator, certain functions are available to help you clean up the
resources of a job that has hung or aborted, or to return a job to a state in which
you can rerun it after the cause of the problem has been fixed.
• Use these functions with care, and only after you have tried to reset the job and
you are sure it has hung or aborted.
• The Cleanup Resources command lets you:
– View and end job processes
– View and release the associated locks
Cleaning Up Resources (Contd.)
Operating system’s process ID number
Logout (kill) selected O/S process
Engine locks associated with processes
Clearing the Status File
Select a hung job and select Clear Status File from Job menu
Clearing the Status File (Contd.)
Before you clear a status file you should:
• Try to reset the job.
• Ensure that all the job’s processes have ended.
• Allow time for questions from participants
Test Your Understanding
• What is the use of having user-defined environment variables?
• Can a DataStage Operator manipulate a protected project?
• What is the default cache size for a hashed file?
• When will “Clear Status File” be enabled in Director?
• What does (…) in the job log mean?
• Where do you see the CPU utilization of each stage in a job?
DataStage Administrator and Director: Summary
• DataStage is an ETL tool widely used in data warehousing. It has 4 components: Administrator, Director, Designer, and Manager.
• Administrator can be used to:
– Create or delete projects
– Assign roles to user accounts
– Set project-specific environment variables
– Enable tracing and performance tuning
• Director can be used to:
– View job statistics
– Validate, run, monitor, stop, reset, and schedule jobs
– View logs, filter log events, and clear log entries
– Clean up job resources
DataStage Administrator and Director: Source
• DataStage 7.5.1 manual
Disclaimer: Parts of the content of this course are based on the materials available from the Web sites and books listed above. The materials that can be accessed from linked sites are not maintained by Cognizant Academy and we are not responsible for the contents thereof. All trademarks, service marks, and trade names in this course are the marks of the respective owner(s).
You have successfully completed
DataStage Administrator and Director.