GDC - Wrapper

Apr 11, 2015

api-3754320
Page 1: GDC - Wrapper

DDW Using Wrapper Scripts

Page 2: GDC - Wrapper

DDW Apollo - Process Automation Using Wrapper Scripts

Process Highlights

• Run any executable file including Ab Initio deployed scripts

• Restart without touching/deleting flags manually

• Run processes from different servers without impacting inter-process communication

• Check/Set Object status on Oracle

• Update ASLAM on Teradata and Oracle

• Collect Statistics and prepare tie-out

• Archive log files

• Archive data files

• Communicate completion/failure/time-out thru email to different mailing lists and pager

Page 3: GDC - Wrapper

Process Architecture

• Main Process - submitted thru crontab

• Sub-processes - submitted by Main Process

• Executable files - submitted by Sub-process

[Diagram: the Main Process submits three Sub-processes; each Sub-process submits an executable (Run Ab Initio Graph, Collect Statistics, Send Mail)]

Page 4: GDC - Wrapper

Main Process

• Submitted thru crontab

• Sets environment for the entire process

• Validates existence/executability of sub-process files

• Submits one or more Sub-Processes

• Waits for Sub-Process completion

• Updates ASLAM on Teradata

• Archives Log Files

• Archives Data Files

Page 5: GDC - Wrapper

Sub-Process

• Submitted by the Main process

• All sub-processes are submitted simultaneously, not sequentially

• Has capability to wait for a variety of dependencies, including other Sub-Processes

• Can perform various functions depending on RUN_TYPE definition

Page 6: GDC - Wrapper

Sub Process - Functionality

Can perform any of the functions as determined by RUN_TYPE

P : Process (submit any executable file such as Ab Initio deployed script)

F : Set Flag on local and remote directory location

O : Set Object Status on Oracle

OA : Set ASLAM on Oracle
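
The RUN_TYPE codes above lend themselves to a simple case dispatch. The following is a minimal sketch only: describe_run_type is a hypothetical name and does not come from ddw_sub_process.ksh, whose actual logic is not shown in this deck.

```shell
# Hypothetical helper (not part of the wrapper): maps a RUN_TYPE code
# to the function listed on this slide.
describe_run_type() {
    case "$1" in
        P)  echo "run executable" ;;
        F)  echo "set flag" ;;
        O)  echo "set object status on Oracle" ;;
        OA) echo "set ASLAM on Oracle" ;;
        *)  echo "unknown RUN_TYPE: $1" >&2; return 1 ;;
    esac
}

RUN_TYPE=OA
describe_run_type "$RUN_TYPE"
```

Rejecting unknown codes with a non-zero return lets a caller fail early instead of silently doing nothing.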

Page 7: GDC - Wrapper

Sub Process - Dependencies

Can wait for one or more or all of the dependencies

D : Data file

F : Flag set by another process

O : Object Status on Oracle

S : Another Sub-process

Page 8: GDC - Wrapper

File System - Overview

• Common Files

Sourced by every process

Ease of code maintenance

Extend new features to all processes

Developers cannot alter code – maintains integrity

• Local Files

Process-specific files

Defined by Developer

Source common files

Page 9: GDC - Wrapper

File System – Common Files

• Located in /usr/local/abinitio/common (DDW_COMMON_DIR ) on every server

• Files include

ddw_main_process.ksh

ddw_sub_process.ksh

archive.ksh

get_ora_cnt.ksh

get_td_cnt.ksh

update_aslam.ksh

chk_object_status.sql

set_object_status.sql

fill_job_detail_nolsn.sql

fill_job_detail.sql

hosts.env

Page 10: GDC - Wrapper

File System – Local Files

• Main process file

• Sub-Process file(s)

• Ab Initio deployed scripts

• List files

• Mail files

• Pager file

Page 11: GDC - Wrapper

File System – hosts.env

• Required for defining host name of each process

• Located in /usr/local/abinitio/common

• Used for checking/setting flags in inter-process communication

• Useful for fail-over protection and inter-process communication

• e.g.

export US_FIN_ITEM_ROLLUP_HOST=harp
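
As an illustration of how such an entry might be consumed, the sketch below looks up the host for a process name. host_for is a hypothetical helper, and the assumption that every entry follows a <PROCESS>_HOST naming pattern is inferred from the single example above.

```shell
# Entry in the style of hosts.env (value taken from the example above).
export US_FIN_ITEM_ROLLUP_HOST=harp

# Hypothetical helper: prints the value of <PROCESS>_HOST for a process name.
# The name is validated first because eval is used for the indirect lookup.
host_for() {
    case "$1" in
        ""|*[!A-Za-z0-9_]*) return 1 ;;
    esac
    eval "printf '%s\n' \"\${${1}_HOST:-}\""
}

host_for US_FIN_ITEM_ROLLUP
```

Because callers refer to the variable rather than a literal server name, moving a process to another server only requires changing hosts.env.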

Page 12: GDC - Wrapper

File System - Directories

• Home directory (also known as Sandbox) is project specific

e.g. /usr/dell/us_fin/fin/orders/us/load

• Following sub-directories required by the Wrapper

bin      main process file

db       dbc files

dml      dml files

env      list files

flags    setting/checking flags

logs     log files

mail     mail-related files

paging   pager related files

run      sub-process and ab initio deployed files

temp     temporary files created by Wrapper and Ab Initio scripts
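
Creating this layout can be scripted. The sketch below is illustrative only: make_sandbox is a hypothetical helper, and a temporary directory stands in for a real project home.

```shell
# Hypothetical helper: creates the wrapper sub-directories under a home directory.
make_sandbox() {
    for d in bin db dml env flags logs mail paging run temp; do
        mkdir -p "$1/$d" || return 1
    done
}

SANDBOX=$(mktemp -d)   # stand-in for a project home directory
make_sandbox "$SANDBOX"
ls "$SANDBOX"
```

mkdir -p makes the helper safe to re-run on a sandbox that already exists.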

Page 13: GDC - Wrapper

File System – Main Process file

• Copy /usr/local/abinitio/template/template_main.ksh to bin/ directory ($AI_BIN)

• Rename as desired

• Make sure it’s executable

• Modify just one line of the file (directory path)

. $HOME/<directory path>/ab_project_setup.ksh $HOME/<directory path>

• Sources /usr/local/abinitio/common/ddw_main_process.ksh

Page 14: GDC - Wrapper

File System – Sub-Process file - 1

• Copy /usr/local/abinitio/template/template_sub.run to run/ directory ($AI_RUN)

• Rename as desired

• Make sure it’s executable

• Define RUN_TYPE and related parameters

• Sources /usr/local/abinitio/common/ddw_sub_process.ksh

Page 15: GDC - Wrapper

File System – Sub-Process file - 2

RUN_TYPE=P

• Runs a process by submitting an executable file such as Ab Initio deployed graph

• Parameter required is

RUN_JOB: executable shell script that needs to be processed

Page 16: GDC - Wrapper

File System – Sub-Process file - 3

RUN_TYPE=O

• Sets Object Status on Oracle

• Sources /usr/local/abinitio/common/set_object_status.sql

• Parameters required are

OS_REGION: region code

OS_SUBJECT_AREA: subject area code

OS_OBJECT_NAME: object name

OS_ACTION: START or FINISH

OS_LOAD_SEQ_NUM: $LOAD_SEQ_NUM or 0

OS_COMMENTS: Any non-null value

OS_SID_NAME: Oracle SID where process is running

OS_SCHEMA: Oracle Schema where process is running

• Two run files required – one for START and another for FINISH

Page 17: GDC - Wrapper

File System – Sub-Process file - 4

RUN_TYPE=F

• Sets flags on local and target directories

• Parameters required are

FLAG_NAME: flag name

REMOTE_HOST: host server name of downstream process as in hosts.env

REMOTE_USER: userid for logging in to REMOTE_HOST

REMOTE_DIR: directory on REMOTE_HOST for setting the flag

Page 18: GDC - Wrapper

File System – Sub-Process file - 5

RUN_TYPE=OA

• Updates ASLAM on Oracle

• Parameters required are

OA_JOB_CODE: Job code for ASLAM on Oracle

OA_ACTION: 'START' or 'FINISH'

OA_LOAD_SEQ_NUM: $LOAD_SEQ_NUM

OA_FINISH_PROCESS_NAME: run file name where OA_ACTION is 'FINISH'

• OA_FINISH_PROCESS_NAME is required only when OA_ACTION is START

• Two run files required – one for START and another for FINISH

Page 19: GDC - Wrapper

File System – List Files

• Used by Main Process or Sub-processes

• Located in env/ directory ($AI_ENV)

• Can be any of the following types

Job List

Dependency List

Stats List

Archive List

Page 20: GDC - Wrapper

File System – Job List File

• Lists the sub-processes to be submitted by Main Process

• SUB_JOB_LIST parameter defines the file name

• ON/OFF flag determines which process to run

• Order in the job list file does not indicate order of their run

• Main Process reads this file several times - hence a copy of this file is stored in $AI_ENV/.ENV and accessed to preserve integrity from changes to the file till process completion

• e.g. entries

ON extract_svc_tags.run
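
A minimal sketch of reading such a file, assuming one "<ON|OFF> <run file>" entry per line as in the example above. enabled_jobs is a hypothetical helper and collect_stats.run is an illustrative file name; the wrapper's own parsing is not shown in this deck.

```shell
# Hypothetical helper: prints the sub-processes flagged ON in a job list file.
enabled_jobs() {
    while read -r flag job _; do
        [ "$flag" = "ON" ] && printf '%s\n' "$job"
    done < "$1"
    return 0
}

JOB_LIST=$(mktemp)     # stand-in for the file named by SUB_JOB_LIST
cat > "$JOB_LIST" <<'EOF'
ON extract_svc_tags.run
OFF build_svc_tags.run
ON collect_stats.run
EOF

enabled_jobs "$JOB_LIST"
```

The order of output follows the file, but as noted above that order says nothing about run order; sequencing comes from the dependency list.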

Page 21: GDC - Wrapper

File System – Dependency List File - 1

• Lists the dependencies for which a sub-process may wait

• SUB_DEPEND_LIST parameter defines the file name

• Four types of dependencies

D: Data File

F: Flag set by another process

O: Object Status

S: Another Sub-process

• Dependency checking is always in the alphabetical order i.e. D F O S

Page 22: GDC - Wrapper

File System – Dependency List File - 2

• Main Process reads this file several times - hence a copy of this file is stored in $AI_ENV/.ENV and accessed to preserve integrity from changes to the file till process completion

Page 23: GDC - Wrapper

File System – Dependency List File - 3

Dependency Type = D

• Waits for Data files

• Format for entry

<subprocess> D <depended file> <DEFAULT/directory location>

e.g.

build_svc_tags.run D sthsflat.sql DEFAULT

(where DEFAULT = $COLL_HOME/load_seq_num)

Page 24: GDC - Wrapper

File System – Dependency List File - 4

Dependency Type = F

• Waits for a flag set by another process

• Format for entry

<subprocess> F <depended flag> <directory location> <remote server> <remote userid>

e.g.

build_svc_tags.run F customer_can_${LOAD_SEQ_NUM}.1_moved.flg $CUST_MOVED_DIR $CAN_FIN_CUST_LOAD_HOST can_svc

• Remote server parameter is as defined in /usr/local/abinitio/common/hosts.env

Page 25: GDC - Wrapper

File System – Dependency List File - 5

Dependency Type = O

• Waits for Object Status on Oracle

• Format for entry

<subprocess> O <object_name> <subject name> <region>

e.g.

copy_girp_oh_od_all_us.run O PROD_ORDER_DETAIL FINANCE AMER

Page 26: GDC - Wrapper

File System – Dependency List File - 6

Dependency Type = S

• Waits for another Sub-Process submitted by the same Main Process

• Format for entry

<sub-process> S <depended sub-process>

e.g.

build_svc_tags.run S extract_svc_tags.run

Page 27: GDC - Wrapper

Dependencies – Important Points

• A Sub-Process can have any kind and number of dependencies

• If a Sub-Process has more than one kind of dependency, waiting is in the alphabetical order of the kind (D F O S)

• A Sub-process can wait for any number of other Sub-Processes

• Any number of Sub-Processes can wait for a Sub-Process

• Setting OFF of dependent Sub-Process ignores the dependency – if Sub-Process B waits for Sub-Process A and Sub-Process A is set to OFF, the dependency is ignored

• Ignoring the Sub-Process dependency does not cascade – if Sub-Process C waits for Sub-Process B and Sub-Process B in turn waits for Sub-Process A, setting Sub-Process B to OFF does not make Sub-Process C wait for Sub-Process A.
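
The OFF and no-cascade rules can be illustrated with a small sketch. effective_sub_deps and the a.run to d.run file names are hypothetical; the file formats assumed are the job list and S-type dependency entries shown on the earlier slides.

```shell
# effective_sub_deps JOB_LIST DEP_LIST SUBPROCESS
# Hypothetical helper: prints the S-type dependencies of SUBPROCESS that are
# still in force, i.e. whose target is ON in the job list. A target that is
# OFF is dropped, and its own dependencies are not inherited.
effective_sub_deps() {
    awk -v target="$3" '
        NR == FNR { state[$2] = $1; next }     # first file: job list
        $1 == target && $2 == "S" && state[$3] == "ON" { print $3 }
    ' "$1" "$2"
}

JOBS=$(mktemp); DEPS=$(mktemp)
printf '%s\n' 'ON a.run' 'OFF b.run' 'ON c.run' 'ON d.run' > "$JOBS"
printf '%s\n' 'b.run S a.run' 'c.run S b.run' 'd.run S a.run' > "$DEPS"

effective_sub_deps "$JOBS" "$DEPS" d.run   # prints a.run
effective_sub_deps "$JOBS" "$DEPS" c.run   # prints nothing: b.run is OFF, no cascade to a.run
```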

Page 28: GDC - Wrapper

File System – Stats List File - 1

• Required for collecting the record count for the tie-out report

• STAT_LIST_FILE parameter defines the file name

• collect_stats.ksh is used to collect record count

• File contents are not read till collect_stats Sub-Process submits collect_stats.ksh

• Can collect record count of any of the following sources

Data Files used/generated by Ab Initio

Table on Teradata (with/without a where condition)

Table on Oracle (with/without a where condition)

Page 29: GDC - Wrapper

File System – Stats List File - 2

• Source Type determines how to get the record count

• Source Types

AF : Anomaly File downloaded from source

AT : Anomaly Table on Teradata along with database name, with optional where condition

BT : Base Table on Teradata along with database name, with optional where condition

DF : Records discarded with D flag - file created by Ab Initio graph

DR : Delete Resent - file created by Ab Initio graph

EF : Extract File

IF : Incremental file

IT : Incremental Table on Teradata along with database name, with optional where condition

MB : Datamart base table on Oracle, with optional where condition

MI : Oracle Incremental Datamart table, with optional where condition

MT : Teradata Incremental Datamart table/view, with optional where condition

OE : Extract from Oracle Tables, with optional where condition

• Custom source types can be added to the list after modifying collect_stats.ksh accordingly
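
One way to picture how a source type selects the counting method is a case statement over the codes listed above. This is a sketch under assumptions: source_kind is a hypothetical name, and the grouping into data files, Teradata and Oracle follows the descriptions in the list, not the actual contents of collect_stats.ksh.

```shell
# Hypothetical helper: classifies a source type code by where the count comes from.
source_kind() {
    case "$1" in
        AF|DF|DR|EF|IF) echo "data file" ;;
        AT|BT|IT|MT)    echo "teradata" ;;
        MB|MI|OE)       echo "oracle" ;;
        *)              echo "unknown source type: $1" >&2; return 1 ;;
    esac
}

source_kind BT
source_kind OE
```

A custom source type would need a new branch here, which mirrors the note above about modifying collect_stats.ksh.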

Page 30: GDC - Wrapper

File System – Stats List File - 3

Data Files

• Source Types of AF, DF, DR, EF, IF

• Format for definition

<Table Name> <Source Type> <File Name With Location> <DML File With Location>

e.g. SVC_TAG AF $SVC_TAG_ANOM_IN_DAT $SVC_TAG_ANOM_DML

• Both multi-file and single-file systems are handled

• DML file required in definition for multi-file system, not for single-file systems

Page 31: GDC - Wrapper

File System – Stats List File - 4

Teradata Tables

• Source Types of AT, BT, IT, MT

• Format for definition

<Table Name> <Source Type> <database.tablename> [<"Where Condition">]

e.g. SVC_TAG BT $SVC_TAG "where svc_business_unit_id = 707 and load_seq_num=${LOAD_SEQ_NUM}.1"

• When no where condition is defined, whole table count is returned

• Sources /usr/local/abinitio/common/get_td_cnt.ksh

• Requires $TD_LOGON parameter defined that points to a dbc file to login to Teradata

Page 32: GDC - Wrapper

File System – Stats List File - 5

Oracle Tables

• Source Types of OE, MI, MB

• Format for definition

<Table Name> <Source Type> <Source Name> <Oracle Schema> <Oracle Sid> [<"Where Condition">]

e.g. ORDER_DETAIL OE RAW_RAW_STAT_ORDER_DETAIL_AMER am_fl_extract proc

• When no where condition is defined, whole table count is returned

• Sources /usr/local/abinitio/common/get_ora_cnt.ksh

• Gets oracle password using getpasswd function using oracle schema and oracle sid defined in this file

Page 33: GDC - Wrapper

File System – Archive List File

• Used for archiving data files

• ARCHIVE_FILES_LIST parameter defines the file name

• Format for definition

ON <file name with directory path>

e.g.

ON $AI_OUT_DATA/r_svc_tag.dat

Page 34: GDC - Wrapper

File System – Mail files - 1

• Files used for sending mails

• Files located in mail/ ($AI_MAIL)

• Three types of files required

List file

Subject File

Body text File

Page 35: GDC - Wrapper

File System – Mail files - 2

• List file sends mail to the listed email addresses

e.g. Format for entry

mail -s "`cat $1` `date`" [email protected] < "$2"

• Add additional email addresses delimited by a comma

• Subject file contains text that forms the subject part of a mail

e.g. Format for entry

ERROR - <region> <process name> <task> Aborted

• Body text file contains text that forms the body of a mail

Page 36: GDC - Wrapper

File System – Pager files

• File used for sending pager messages

• File located in paging/ ($AI_PAGING)

• File contains the logic and page-id for sending the code

e.g. Format for entry

echo $1 $2 | /usr/bin/Mail -s "ravi" [email protected]

• Enter multiple entries for multiple ids

Page 37: GDC - Wrapper

ASLAM on Teradata

• Each process has a Process_id and makes an entry to ASLAM tables

• Each Process works on one or more Objects and ASLAM tables maintain the relation between a Process and the Objects

• A Process can end with any of the four statuses

S : Successful

E : Errored

T : Timed-out

U : Unknown

• A Group of Processes can be defined to group related Processes

Page 38: GDC - Wrapper

Log Files - 1

• Four levels of log files are created in $AI_LOG

Generated by Main Process

Generated by Sub-Process

Generated by the executable submitted by a Sub-Process

Log files defined inside an Ab Initio graph

• Main Process log file defined as parameter LOG_FILE and each time Main Process is submitted, a separate log file is created

e.g.

Definition: <identifier>_`date +%b%d_%Y:%H:%M:%S`.log

Actual: build_corp_lookup_May24_2002:16:03:01.log
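
The naming pattern above can be reproduced with the date command. main_log_name is a hypothetical helper used only to show the expansion; in the wrapper the pattern is simply assigned to LOG_FILE.

```shell
# Hypothetical helper: builds a Main Process log file name from an identifier
# using the date format shown in the definition above.
main_log_name() {
    printf '%s_%s.log\n' "$1" "$(date +%b%d_%Y:%H:%M:%S)"
}

LOG_FILE=$(main_log_name build_corp_lookup)
echo "$LOG_FILE"
```

Because the timestamp includes seconds, each submission of the Main Process gets its own log file.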

Page 39: GDC - Wrapper

Log Files - 2

• Sub-Process log file is created by suffixing date(MMMDD_YYYY:HH:MI:SS) to the run file and each time Sub-Process is submitted, a new log file is created.

e.g.

generate_dml_May22_2002:21:03:02.log

• The log file generated by the executable submitted by the Sub-Process (such as Ab Initio deployed script) takes its name from the script name appended with YYYYMMMDD, and the extension is ‘out’ instead of ‘log’. Each time the process is submitted, output is appended to this file (i.e. only one file per day)

e.g.

collect_stats_lookup_download_2002May24.out

Page 40: GDC - Wrapper

Log Files - 3

• Log files defined inside the Ab Initio graphs have constant names and are always replaced when the graph is re-run.

• Log files older than a certain number of days (LOG_FILE_KEEP_DAYS) are archived and compressed by the Main Process and copied to $AI_LOGS/archive directory.

• Archived log files older than a certain number of days (LOG_ARCH_KEEP_DAYS) are removed by the Main Process.
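
The two retention rules can be sketched with find and gzip. This is not the code in archive.ksh, which is not shown in this deck: archive_old_logs is a hypothetical helper, the *.log pattern is an assumption, and the defaults of 3 and 14 days are taken from the parameter slides.

```shell
LOG_FILE_KEEP_DAYS=${LOG_FILE_KEEP_DAYS:-3}
LOG_ARCH_KEEP_DAYS=${LOG_ARCH_KEEP_DAYS:-14}

# archive_old_logs LOG_DIR (hypothetical helper)
archive_old_logs() {
    mkdir -p "$1/archive" || return 1
    # Compress logs older than LOG_FILE_KEEP_DAYS and move them to archive/.
    find "$1" -maxdepth 1 -type f -name '*.log' -mtime +"$LOG_FILE_KEEP_DAYS" |
    while IFS= read -r f; do
        gzip "$f" && mv "$f.gz" "$1/archive/"
    done
    # Remove archived logs older than LOG_ARCH_KEEP_DAYS.
    find "$1/archive" -type f -name '*.gz' -mtime +"$LOG_ARCH_KEEP_DAYS" -exec rm -f {} +
}
```

A call would look like archive_old_logs "$AI_LOG". Note that gzip keeps the original modification time, so the age of an archived file is measured from when the log was last written, not from when it was archived.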

Page 41: GDC - Wrapper

Automating a Process - 1

1. Create the directory structure

2. Setup Project environment

3. Create Main Process File

(copy /usr/local/abinitio/template/template_main.ksh to $AI_BIN and modify)

4. Create Sub-process files

(copy /usr/local/abinitio/template/template_sub.run to $AI_RUN and modify)

5. Define Job list file

(refer to /usr/local/abinitio/template/template_job.lst for sample)

Page 42: GDC - Wrapper

Automating a Process - 2

5. Define Dependency list file

(refer to /usr/local/abinitio/template/template_dependency.lst for sample)

6. Define Stats list file

(refer to /usr/local/abinitio/template/template_stats.lst for sample)

7. Define Archive list file

(refer to /usr/local/abinitio/template/template_archive.lst for sample)

8. Define Mail files

(refer to mail*.lst and mail*.txt in /usr/local/abinitio/template for sample)

9. Copy /usr/local/abinitio/template/mail_done_template.ksh to $AI_RUN and rename to use for submitting by a sub-process.

Page 43: GDC - Wrapper

Automating a Process - 3

10. Define Pager file

(Copy /usr/local/abinitio/template/page_oncall to $AI_PAGING and modify as required)

11. Copy /usr/local/abinitio/template/collect_stats.ksh to $AI_RUN – customize if custom source types are defined or tie-out calculation needs to be modified

12. Define wrapper related Parameters in the project setup and make sure they are exported

13. Setup ASLAM metadata

Page 44: GDC - Wrapper

Parameters that change runtime behavior - 1

LSN_REQUIRED (Y/N) Whether to check that the file providing the Load Sequence Number exists

IGNORE_RUNNING_FLAGS (Y/N) Whether to resubmit the running sub-processes again when the main-process is restarted

IGNORE_ASLAM (Y/N) Whether to update ASLAM tables or not

PAGE_SUCCESSFUL_RUN (Y/N) Whether to send a pager message when main process completes successfully

PAGE_SUBPROCESS_FAIL (Y/N) Whether the sub-process should page when it fails

PAGE_SUB_DEPENDENCY (Y/N) Whether the sub-process should page while waiting for another sub-process

Page 45: GDC - Wrapper

Parameters that change runtime behavior - 2

ARCHIVE_LOG_FILES (Y/N) Whether to archive log files or not.

LOG_FILE_KEEP_DAYS (3) Number of days after which log files are archived

LOG_ARCH_KEEP_DAYS (14) Number of days after which archived log files are deleted

ARCHIVE_FILES (Y/N) Whether to archive data files or not

TIEOUT_FAIL_EXIT (Y/N) Whether the process should terminate if the tie-out fails

PRINT_BASE_TIEOUT (Y/N) Whether to print base tie-out in the report

PRINT_MART_TIEOUT (Y/N) Whether to print mart tie-out in the report

Page 46: GDC - Wrapper

Parameters that change runtime behavior - 3

MAIN_SLEEP_TIME Time in seconds the main-process waits for sub-process completion between each cycle

MAIN_PAGE_CNT No. of cycles after which main-process sends pager message

MAIN_EXIT_CNT No. of cycles after which main-process times out

SUB_SLEEP_TIME Time in seconds sub-process sleeps to check for completion of a dependency between each cycle

SUB_PAGE_CNT No. of cycles after which sub-process sends a pager

SUB_EXIT_CNT No. of cycles after which sub-process times-out
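
How the sleep, page and exit counters interact can be shown with a small polling loop. This is a sketch of the general pattern, not the wrapper's code: wait_for is a hypothetical helper, and the page is represented by a message on stderr.

```shell
# wait_for CHECK_CMD [ARGS...] (hypothetical helper)
# Re-runs the check every SUB_SLEEP_TIME seconds. Emits a page message after
# SUB_PAGE_CNT cycles and gives up (returns 1) after SUB_EXIT_CNT cycles.
wait_for() {
    cycle=0
    until "$@"; do
        cycle=$((cycle + 1))
        [ "$cycle" -ge "$SUB_EXIT_CNT" ] && return 1
        [ "$cycle" -eq "$SUB_PAGE_CNT" ] && echo "page: still waiting after $cycle cycles" >&2
        sleep "$SUB_SLEEP_TIME"
    done
    return 0
}

SUB_SLEEP_TIME=0   # seconds; 0 only to keep this demo fast
SUB_PAGE_CNT=2
SUB_EXIT_CNT=3
wait_for test -f /nonexistent/dependency.flg || echo "timed out"
```

The total wait before a time-out is roughly SUB_SLEEP_TIME multiplied by SUB_EXIT_CNT, so the two values should be tuned together. The same pattern applies to the MAIN_* parameters.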

Page 47: GDC - Wrapper

Flags and Process Control - 1

• Main Process sets $RUNNING_FLAG to prevent another concurrent session

• Main Process writes Load Sequence Number and time of completion to $DONE_FLAG, to prevent another run for the same day

• Each Sub-Process sets a flag depending on the status

Running: <sub_process_name>_running.flg

Done: <sub_process_name>_done.flg

Failed: <sub_process_name>_error.flg

• Sub-Process on failure or time-out sets $ABORT_FLAG

• Main Process terminates when it finds $ABORT_FLAG
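
The status flags can be modelled as empty files in the flags directory. The sketch below is an assumption-laden illustration: set_status_flag is a hypothetical helper, clearing the other two flags on each transition is an assumption, and abort.flg is a made-up default for $ABORT_FLAG.

```shell
# set_status_flag FLAG_DIR SUB_PROCESS STATUS (hypothetical helper)
# Leaves exactly one of the running/done/error flags for the sub-process.
set_status_flag() {
    case "$3" in
        running|done|error) ;;
        *) return 1 ;;
    esac
    rm -f "$1/${2}_running.flg" "$1/${2}_done.flg" "$1/${2}_error.flg"
    : > "$1/${2}_${3}.flg" || return 1
    # On failure also raise the abort flag so the Main Process terminates.
    if [ "$3" = "error" ]; then
        : > "${ABORT_FLAG:-$1/abort.flg}"
    fi
}

FLAG_DIR=$(mktemp -d)  # stand-in for the flags/ directory
set_status_flag "$FLAG_DIR" extract_svc_tags running
set_status_flag "$FLAG_DIR" extract_svc_tags done
ls "$FLAG_DIR"
```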

Page 48: GDC - Wrapper

Flags and Process Control - 2

• To resubmit processes whose running flags exist (occurs when the process is killed / server has failed), remove flags manually or set IGNORE_RUNNING_FLAGS to Y and restart

• Flags set by upstream processes are deleted at the end of successful completion of process

• Flags set for downstream processes are deleted when Main Process is started afresh (when $ABORT_FLAG does not exist)

Page 49: GDC - Wrapper

Flags and Restartability - 1

• Restarting a process is required under one of these situations

Main-Process failed or timed-out

One or more Sub-Processes failed or timed-out

Main-Process/Sub-Processes killed manually or due to server failure

• Restarted Main Process ignores Sub-Processes whose done flags or running flags exist.

• Restarting Main Process cleans up abort flags and error flags – NEVER EVER DELETE ANY FLAGS WHEN RESTARTING

Page 50: GDC - Wrapper

Flags and Restartability - 2

• Deleting done flags before restart will resubmit the Sub-Processes that have already completed

• Deleting running flags before restart can lead to concurrent sessions of the Sub-Processes whose outcome may be unpredictable

• Deleting ABORT_FLAG will remove any flags set for downstream processes (since absence of ABORT_FLAG is taken as a fresh process)
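
The restart rules amount to a per-sub-process decision based on the flags that exist. The following sketch states that decision explicitly; should_submit is a hypothetical helper and the flag file names follow the pattern given on the Flags and Process Control slide.

```shell
# should_submit FLAG_DIR SUB_PROCESS (hypothetical helper)
# Returns 0 if a restarted Main Process should submit the sub-process.
should_submit() {
    [ -f "$1/${2}_done.flg" ] && return 1                    # already completed
    if [ -f "$1/${2}_running.flg" ]; then
        # Treated as still running unless IGNORE_RUNNING_FLAGS=Y.
        [ "${IGNORE_RUNNING_FLAGS:-N}" = "Y" ] || return 1
    fi
    return 0
}
```

This is why deleting flags by hand is risky: removing a done flag makes the check return 0 for completed work, and removing a running flag makes it return 0 for a sub-process that may still be active.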

Page 51: GDC - Wrapper

Important Considerations

• Main Process time-out does not kill any sub-process it has submitted – they are still running – so just restart the Main Process

• Sub-Process time-out indicates it has timed out before finishing its job

• Sub-Process never times out waiting for the process it has submitted (such as an Ab Initio deployed script)

• Failure of one Sub-Process in no way influences the outcome of another Sub-Process (except it may time-out if it has a dependency)

• Deleting running flags before restart can lead to concurrent sessions of the Sub-Process whose outcome may be unpredictable

Page 52: GDC - Wrapper

The Wrapper Advantage

• Easy to maintain code because it’s centralized

• Easy to extend new features to every process with few changes to individual processes

• Easy to set-up a process which improves productivity

• Easy to support because of uniformity in code/processing across regions/subject/processes

• Easy to move processes across servers without impacting inter-process communication

Page 53: GDC - Wrapper

The Wrapper Advantage

Easy to Set-up. Easy to Support.

Easy as Dell.