05/06/22 by Manish Shekhar - Infosys
04/12/23 by Manish Shekhar - Infosys
Index
• What are psets?
• Creation of psets.
• What is analysis_level parameter?
• Achieving data lineage for generic graphs using psets.
• Physical datasets and logical EME datasets.
• Parameters to handle parallel running jobs calling the same graph.
• Capturing job statistics details in the EME when using generic graphs.
What are psets and how are they created
• A pset (parameter set) is a set of input parameter and value pairs. You create one with the Input Values Editor in the Edit menu, which lets you specify a set of values for the graph's formal parameters and then save it as a separate .pset (parameter set) file in any of the directories under the private sandbox.
Steps:
a. Select Edit > Input Values... from the GDE menu.
This looks the same as the graph parameter editor, with two columns: the parameter name and its value.
b. For each formal parameter, enter the required value in the value field.
c. Then select File > Save As and save the value set as <graph name>.pset under the private sandbox's pset directory.
Note: The editor defaults to the project's mp directory as the location of the new .pset file, so you need to navigate to the pset directory in the sandbox.
Along with the existing formal parameters of the generic graph, define a formal parameter called analysis_level and set its value to none.
• Check in the generic graph from the common sandbox to the EME.
• Dependency analysis will not be performed on the generic graph because its analysis_level parameter is set to none.
• Each separate set of input values you create in this step represents a separate instance of the graph. To enable job tracking of the generic graph for these different value sets, simply check the corresponding .pset files into the EME data store.
• The graph instance represented by a .pset file is analyzed and saved in the EME data store as a graph object. For a .pset file to be analyzed, set the analysis_level parameter in each parameter set to expand. This was mandatory in Ab Initio V-13.
NOTE: Ab Initio V-14 automatically expands psets when they are checked in.
Achieving data lineage for generic graphs using psets.
• To achieve data lineage, distinct values of the logical EME datasets are passed from different psets to the same generic graph. When the psets are checked in, they are expanded and dependency analysis takes place. The different instances of the generic graph then show up in the EME with unique values of the logical datasets.
EME view of distinct instances of generic graph:
The two instances of the same graph show different data lineage in the EME.
• Physical dataset names override the logical EME dataset names passed from psets. The physical dataset names are set in the wrapper and then passed via the pset when the graph is executed.
For example, exporting physical datasets
Calling the graph and passing parameters
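The two steps named above might look like the following in a wrapper script. This is only a sketch under stated assumptions: the parameter names INPUT_FILE and OUTPUT_FILE, the paths, the pset name, and the run_generic_graph function are all hypothetical. The function only prints what it would run and stands in for whatever command your site uses to execute a graph with a pset.

```shell
#!/bin/sh
# Hypothetical physical dataset names. At run time these take the place of
# the logical EME dataset names stored in the pset.
export INPUT_FILE="/data/in/customers_20230412.dat"
export OUTPUT_FILE="/data/out/customers_clean_20230412.dat"

# Stand-in for the site-specific command that runs the generic graph with a
# pset. It only prints the call and does not invoke any Ab Initio tool.
run_generic_graph() {
    pset="$1"
    echo "run ${pset} INPUT_FILE=${INPUT_FILE} OUTPUT_FILE=${OUTPUT_FILE}"
}

run_generic_graph "pset/load_customers.pset"
```

The pset checked into the EME keeps the logical dataset names for lineage, while the exported values are what the job actually reads and writes.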
Handling multiple instances of a graph running concurrently
AB_JOB_PREFIX – To avoid problems when multiple instances of a graph run concurrently in the same directory, you can make the AB_JOB value unique by exporting the AB_JOB_PREFIX configuration variable.
AB_JOB_PREFIX should be assigned a dynamic value, such as the process ID (PID=$$) or a date timestamp in YYYYMMDDHHMISS format.
Setting this parameter ensures that AB_JOB resolves to ${AB_JOB_PREFIX}${AB_JOB}, so the recovery files are also created with different names.
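A minimal sketch of the export described above, using the process ID as the dynamic value. The trailing underscore and the job name load_customers are illustrative assumptions, and UNIQUE_JOB is a hypothetical variable used only to show the resolved name.

```shell
#!/bin/sh
# Use the process ID of the calling script as the dynamic prefix.
PID=$$
export AB_JOB_PREFIX="${PID}_"

# Alternative: a date timestamp in YYYYMMDDHHMISS format.
# export AB_JOB_PREFIX="$(date +%Y%m%d%H%M%S)_"

# Hypothetical job name, set here only to show the combined value.
AB_JOB="load_customers"
UNIQUE_JOB="${AB_JOB_PREFIX}${AB_JOB}"
echo "$UNIQUE_JOB"
```

Two jobs started within the same second would get the same timestamp prefix, so the process ID is the safer choice when runs can start close together.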
Capturing job statistics details in the EME when using generic graphs
AB_AIR_JOB_GRAPH –
Specifies the graph/application being run so that it may be linked to the job object.
- When a generic graph is called the job statistics are stored in the EME under the name of the generic graph. This causes confusion and discrepancies when tracking stats in EME because a generic graph may be used in multiple projects. The objective is to store job statistics under the pset name so that they can be correlated with the logical use of the generic graph.
- This parameter must be set in the calling script or program so that a generic graph reposits its tracking information to the pset version (.graph) of the graph.
- If the graph is generic, you should set AB_AIR_JOB_GRAPH because you want the job to be associated with the pset instance of the graph, which performs its specific task according to the values passed through the pset.
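As a rough illustration, the calling script could export the variable before it runs the generic graph. The pset name and the path form below are hypothetical. This document does not state the exact value the EME expects, so treat the value as an assumption and check it against your installation.

```shell
#!/bin/sh
# Hypothetical pset name for this logical use of the generic graph.
PSET_NAME="load_customers"

# Assumed value format: a sandbox-relative path to the pset.
export AB_AIR_JOB_GRAPH="pset/${PSET_NAME}.pset"

echo "Job statistics will be linked to: ${AB_AIR_JOB_GRAPH}"
```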
In Coop Sys 2.14 and above
Benefits
• Job statistics will be reposited with the logical use of the graph.
• The statistics will be accurately reported by the appropriate job group or project.
• Performance improvement in graph execution time.