Page 1: Mainframes Interview Questions

DB2

1 What are joins, views, synonyms and aliases?

Joins: A join combines data from two or more tables by matching columns. The types of joins are as follows.

Inner Join: An inner join is a method of combining two tables that discards rows of either table that do not match any row of the other table. The matching is based on the join condition.

Outer Join: An outer join is a method of combining two or more tables so that the result includes unmatched rows of one or the other table, or of both. The matching is based on the join condition. DB2® supports three types of outer joins:

Full outer join: Includes unmatched rows from both tables. If any column of the result table does not have a value, that column has the null value in the result table.

Left outer join: Includes rows from the table that is specified before LEFT OUTER JOIN that have no matching values in the table that is specified after LEFT OUTER JOIN.

Right outer join: Includes rows from the table that is specified after RIGHT OUTER JOIN that have no matching values in the table that is specified before RIGHT OUTER JOIN.
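As an illustration of the difference between an inner join and a left outer join, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for DB2, and the DEPT and EMP tables are hypothetical. RIGHT and FULL OUTER JOIN are left out because older SQLite versions do not accept them.

```python
import sqlite3

def join_demo():
    """Return (inner_rows, left_outer_rows) for two tiny hypothetical tables."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.executescript("""
        CREATE TABLE DEPT (DEPTNO INTEGER, DEPTNAME TEXT);
        CREATE TABLE EMP  (EMPNO INTEGER, DEPTNO INTEGER);
        INSERT INTO DEPT VALUES (10, 'SALES'), (20, 'AUDIT');
        INSERT INTO EMP  VALUES (1, 10), (2, 30);
    """)
    # Inner join: employee 2 has no matching department and is discarded.
    inner = con.execute("""
        SELECT E.EMPNO, D.DEPTNAME
          FROM EMP E INNER JOIN DEPT D ON E.DEPTNO = D.DEPTNO
         ORDER BY E.EMPNO""").fetchall()
    # Left outer join: employee 2 is kept, with NULL for the department name.
    left = con.execute("""
        SELECT E.EMPNO, D.DEPTNAME
          FROM EMP E LEFT OUTER JOIN DEPT D ON E.DEPTNO = D.DEPTNO
         ORDER BY E.EMPNO""").fetchall()
    con.close()
    return inner, left
```

Calling join_demo() returns one row for the inner join and two rows for the left outer join; the unmatched row carries None (NULL) in the DEPTNAME position.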

Views: Views provide an alternative way of looking at the data in one or more tables. A view is a named specification of a result table. For retrieval, all views can be used like base tables. A maximum of 15 base tables can be used in a view.

Views are used for the following purposes:

Data security: Views allow different security levels to be set up for the same base table. Sensitive columns can be hidden from unauthorized IDs.

Derived columns: A view can present additional information such as derived columns.

Hiding complex queries: A developer can create a view that results from a complex join on multiple tables. The user can then simply query the view as if it were a separate base table, without knowing the complexity behind it.

Domain support: A domain identifies the valid range of values that a column can contain. A view achieves this with WITH CHECK OPTION.
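The security and derived-column uses of a view can be sketched as follows. This assumes SQLite as a stand-in for DB2; the EMP table, the EMP_PUBLIC view and the column names are hypothetical. WITH CHECK OPTION is not shown because SQLite does not support it.

```python
import sqlite3

def view_demo():
    """Return (column_names, rows) as seen through a restricted view."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.executescript("""
        CREATE TABLE EMP (EMPNO INTEGER, NAME TEXT, SALARY INTEGER, SSN TEXT);
        INSERT INTO EMP VALUES (1, 'ANN', 1000, '111-11-1111');
        -- The view hides SSN and adds a derived column.
        CREATE VIEW EMP_PUBLIC AS
            SELECT EMPNO, NAME, SALARY * 12 AS ANNUAL_SALARY FROM EMP;
    """)
    cur = con.execute("SELECT * FROM EMP_PUBLIC")
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    con.close()
    return columns, rows
```

A user who is granted access only to EMP_PUBLIC sees EMPNO, NAME and ANNUAL_SALARY, and never the SSN column of the base table.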

Read Only Views: Views on which INSERT, UPDATE and DELETE operations cannot be carried out are called non-updateable views or read-only views.

Updateable Views: A view is updateable when all of the following hold:

· It is derived from one base table and does not contain derived columns.
· It does not contain GROUP BY, HAVING or DISTINCT clauses at the outermost level.
· It is not defined over a read-only view.
· It includes all the NOT NULL columns of the base table.

Restriction: You cannot create an index for a view. In addition, you cannot create any form of a key or a constraint (referential or otherwise) on a view. Such indexes, keys, or constraints need to be built on the tables that the view references.

Synonyms: A synonym is used to reference a table or view by another name. The other name can then be written in the application code pointing to test tables in the development stage and to production entities when the code is migrated. The synonym is linked to the AUTHID that created it.


A synonym is also specific to the DB2 subsystem: it can only access a table or view in the subsystem in which it is defined. A synonym is dropped when its table is dropped.

Alias: An alias is an alternative to a synonym, designed for a distributed environment to avoid having to use the location qualifier of a table or view. The alias is not dropped when the table is dropped.

An alias is a logical pointer to an alternate table name. The purpose of an alias is to resolve loops in the paths of joins. In some cases more than one alias may be necessary for a given table.

A synonym can be used only by the user who created it, whereas an alias can be used by any user. A synonym is dropped when its base table is dropped, but an alias is not. Synonyms are recorded in the catalog table SYSIBM.SYSSYNONYMS and aliases in SYSIBM.SYSTABLES.

2 What is a null indicator, and how is it handled in a program?

In DB2, columns that allow NULL must be handled carefully; otherwise the program receives an SQL error when a null value is returned. A null indicator is used to handle this.

NULL is stored using a special one-byte null indicator that is "attached" to every nullable column.

If the column is set to NULL, then the indicator field is used to record this.

Using NULL never saves space in a DB2 database design; in fact, it adds an extra byte for every column that can be NULL. The byte is used whether or not the column is actually set to NULL. The indicator is transparent to an end user.

To handle null values, the program declares a null indicator variable as PIC S9(4) COMP. An indicator variable is shared by both the database manager and the host application, so it must be declared in the application as a host variable that corresponds to the SQL data type SMALLINT.

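What values a null indicator holds:

-1 : The field has a NULL value.
 0 : The field is not NULL.
-2 : The field is NULL because of a numeric conversion or arithmetic error.
A positive value : A string was truncated; the indicator holds its original length.

How and why to handle null values: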

When processing INSERT or UPDATE statements, the database manager checks the null-indicator variable, if one exists. If the indicator variable is negative, the database manager sets the target column value to null, provided that nulls are allowed; otherwise it returns SQL error code -407. On a SELECT or FETCH, a null value returned into a host variable that has no indicator variable gives SQL error code -305. Null indicators are needed to handle these situations.


If the null-indicator variable is zero or positive, the database manager uses the value of the associated host variable.

Causes of -407 and -305, and their resolution:

1) The table column is defined as NOT NULL (with no default) and the program tries to insert a null value. DB2 returns -407 in this case. Resolution: make sure that the inserted value is not null. A null indicator cannot be used here because the column is defined as NOT NULL. (Validate the data: if it is not numeric or is less than spaces, move spaces into it and then insert or update the table.)

2) The table column is defined as nullable, the row holds a null value, and the SELECT or FETCH names no null indicator for the host variable. DB2 returns -305. Resolution: code a null indicator in the host program and test it after the SELECT or FETCH. To store a null value in the column, move -1 to the null indicator before the INSERT or UPDATE.

     05  WS-SSN-NULL-IND          PIC S9(04) BINARY.

     EXEC SQL
          SELECT PREIN_SOC_SEC_NBR
            INTO :PREIN-SOC-SEC-NBR:WS-SSN-NULL-IND
            FROM PL_TOR_TEE_INST
           WHERE (PREIN_SOC_SEC_NBR = :PREIN-SOC-SEC-NBR
              OR  PREIN_SOC_SEC_NBR LIKE :WS-SSN-LIKE
              OR  PREIN_SOC_SEC_NBR IS NULL)
     END-EXEC

     IF WS-SSN-NULL-IND = -1
        MOVE SPACES TO PNMIN-SOC-SEC-NBR
     END-IF

     IF PNMIN-SOC-SEC-NBR = SPACES
        MOVE -1 TO WS-SSN-NULL-IND
     ELSE
        MOVE ZERO TO WS-SSN-NULL-IND.
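The indicator convention can also be mimicked outside COBOL. The following is only a sketch of the idea, assuming SQLite as a stand-in for DB2 and a hypothetical EMP table; Python's driver returns None for NULL, and the helper turns that into a (value, indicator) pair with -1 for NULL and 0 otherwise.

```python
import sqlite3

def fetch_ssn(con, empno):
    """Return (ssn, indicator), imitating a host variable and its indicator."""
    row = con.execute(
        "SELECT SSN FROM EMP WHERE EMPNO = ?", (empno,)).fetchone()
    if row[0] is None:
        return " " * 9, -1   # NULL: indicator -1, value replaced by spaces
    return row[0], 0         # not NULL: indicator 0, value is usable

def null_indicator_demo():
    """Fetch one non-null and one null SSN from a hypothetical EMP table."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.executescript("""
        CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, SSN TEXT);
        INSERT INTO EMP VALUES (1, '123456789'), (2, NULL);
    """)
    result = [fetch_ssn(con, 1), fetch_ssn(con, 2)]
    con.close()
    return result
```

The second pair plays the role of the MOVE SPACES branch in the COBOL fragment: the program never looks at the value when the indicator is -1.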

3 What is difference between inner join and outer join?

Inner Join: An inner join is a method of combining two tables that discards rows of either table that do not match any row of the other table. The matching is based on the join condition.


Outer Join: An outer join is a method of combining two or more tables so that the result includes unmatched rows of one or the other table, or of both. The matching is based on the join condition. DB2® supports three types of outer joins:

Full outer join: Includes unmatched rows from both tables. If any column of the result table does not have a value, that column has the null value in the result table.

Left outer join: Includes rows from the table that is specified before LEFT OUTER JOIN that have no matching values in the table that is specified after LEFT OUTER JOIN.

Right outer join: Includes rows from the table that is specified after RIGHT OUTER JOIN that have no matching values in the table that is specified before RIGHT OUTER JOIN.

4 Are views updatable, and can rows be deleted through them? Give an example with a query.

Views can be updatable or not updatable. If a view is updatable, that means you can use its name in DML statements to actually update, insert, and delete the underlying table's rows. A view can be updatable only if all these rules hold:

The select statement does not contain any table joins; that is, the view is based on one and only one table or view. (In the latter case, the underlying view must also be updatable.)

All underlying table's mandatory (NOT NULL) columns are present in the view definition.

The underlying query does not contain set operations like UNION, EXCEPT, or INTERSECT; the DISTINCT keyword is also not allowed.

No aggregate functions or expressions can be specified in the select statement clause.

The underlying query cannot have a GROUP BY clause.

A view is dropped with DROP VIEW <view name>.

5 What is referential integrity? And what is SQL error code for referential integrity.

Referential integrity refers to the consistency that must be maintained between primary and foreign keys, i.e. every foreign key value must have a corresponding primary key value. The relation of the primary key of a base table with foreign key of the reference table is known as referential integrity.

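-530 : Referential integrity is preventing the INSERT or UPDATE (the foreign key value has no matching parent key).
-532 : Referential integrity (DELETE RESTRICT rule) is preventing the DELETE.
-536 : The DELETE is invalid because a table that the referential constraints reach can be affected by the operation.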
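The two most common failures can be reproduced in miniature. The sketch below assumes SQLite as a stand-in for DB2, so the error surfaces as sqlite3.IntegrityError rather than as SQLCODE -530 or -532; the DEPT and EMP tables are hypothetical.

```python
import sqlite3

def referential_integrity_demo():
    """Try an orphan INSERT and a restricted DELETE; report what happened."""
    con = sqlite3.connect(":memory:")        # SQLite stands in for DB2 here
    con.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
    con.executescript("""
        CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY);
        CREATE TABLE EMP  (EMPNO INTEGER PRIMARY KEY,
                           DEPTNO INTEGER REFERENCES DEPT (DEPTNO)
                                          ON DELETE RESTRICT);
        INSERT INTO DEPT VALUES (10);
        INSERT INTO EMP  VALUES (1, 10);
    """)
    outcome = {}
    try:
        # Foreign key 99 has no parent row (DB2 would return -530).
        con.execute("INSERT INTO EMP VALUES (2, 99)")
        outcome["insert"] = "accepted"
    except sqlite3.IntegrityError:
        outcome["insert"] = "rejected"
    try:
        # Parent row still has a dependent row (DB2 would return -532).
        con.execute("DELETE FROM DEPT WHERE DEPTNO = 10")
        outcome["delete"] = "accepted"
    except sqlite3.IntegrityError:
        outcome["delete"] = "rejected"
    con.close()
    return outcome
```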

6 What is primary key and unique key difference and also unique index?


· Primary key and unique key are entity integrity constraints.
· A primary key allows each row in a table to be uniquely identified and ensures that no duplicate rows exist and no null values are entered.
· A unique key constraint prevents the duplication of key values within the rows of a table.
· The primary key uniquely identifies the record, and the UNIQUE index on the primary key ensures that no duplicate values are stored in the table.
· Only one primary key can be defined for a table, whereas any number of unique indexes can be defined.
· A primary key cannot have null values, but a unique index can be defined on a nullable column. Although one null value is never equal to another in a comparison, a plain UNIQUE index treats nulls like any other value, so a single-column key can hold at most one null; in DB2 for z/OS, UNIQUE WHERE NOT NULL allows several.

7 What is Cursor, and various options, errors related to cursor.

Most programming languages can only process one record at a time and so when you use a DB2 query where you expect multiple rows to be returned the results get stored in an intermediate buffer. The cursor is the pointer to the record in that buffer that you are currently up to in processing of the results returned from the query.

DECLARE cursor-name CURSOR [WITH HOLD]
        [WITH RETURN [TO CALLER | TO CLIENT]]
        FOR {select-statement | statement-name}

(TO CALLER is the default when WITH RETURN is specified.)
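The row-at-a-time processing that a cursor makes possible can be sketched as below. This assumes SQLite through Python's sqlite3 module as a stand-in for DB2; the comments map each call only loosely onto DECLARE, OPEN, FETCH and CLOSE, and the EMP table is hypothetical.

```python
import sqlite3

def process_rows():
    """Walk a result set one row at a time; return (rows_fetched, salary_total)."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.executescript("""
        CREATE TABLE EMP (EMPNO INTEGER, SALARY INTEGER);
        INSERT INTO EMP VALUES (1, 1000), (2, 2000), (3, 3000);
    """)
    cur = con.cursor()                                           # roughly DECLARE
    cur.execute("SELECT EMPNO, SALARY FROM EMP ORDER BY EMPNO")  # roughly OPEN
    fetched = 0
    total = 0
    while True:
        row = cur.fetchone()   # roughly FETCH ... INTO
        if row is None:        # end of data; DB2 reports this as SQLCODE +100
            break
        fetched += 1
        total += row[1]
    cur.close()                # roughly CLOSE
    con.close()
    return fetched, total
```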

8 What is difference between group by and order by?

GROUP BY: The GROUP BY clause is used for grouping rows by the values of one or more columns. You can then apply a column function to each group. Except for the columns named in the GROUP BY clause, the SELECT statement must specify any other selected column as an operand of one of the column functions. If a column specified in the GROUP BY clause contains null values, DB2 considers those null values to be equal, so all nulls form a single group.

ORDER BY: The ORDER BY clause is used to sort the rows of a result set by the values contained in the specified columns. The default is ascending order (ASC). If the keyword DESC follows the column name, descending order is used. An integer giving the position of a column in the select list can also be used in place of the column name.
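A short sketch combining both clauses, assuming SQLite as a stand-in for DB2 and a hypothetical EMP table. It groups by DEPT, which contains nulls, and sorts by the second select-list column using its position number.

```python
import sqlite3

def group_and_order():
    """Total salary per department, largest total first."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.executescript("""
        CREATE TABLE EMP (DEPT TEXT, SALARY INTEGER);
        INSERT INTO EMP VALUES ('A', 100), ('A', 200), ('B', 50),
                               (NULL, 10), (NULL, 20);
    """)
    rows = con.execute("""
        SELECT DEPT, SUM(SALARY)
          FROM EMP
         GROUP BY DEPT
         ORDER BY 2 DESC""").fetchall()
    con.close()
    return rows
```

The two rows with a null DEPT end up in a single group, which is the behaviour the text describes for DB2.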

9 What is deadlock error?

Deadlock: A deadlock occurs when two or more transactions are waiting simultaneously, each for one of the others to release a lock before it can proceed.


-911: The current unit of work has been rolled back due to deadlock or timeout. The current unit of work was the victim in a deadlock, or experienced a timeout, and had to be rolled back. The reason code indicates whether a deadlock or timeout occurred. A long-running application, or an application that is likely to encounter a deadlock, should (if possible) issue frequent COMMIT commands.

-913: Unsuccessful execution caused by deadlock or timeout. The application was the victim in a deadlock or experienced a timeout. The application should either commit or roll back to the previous COMMIT.

***** IMR CHANGES FOR DEADLOCK RESOLUTION BEGIN HERE *********
     MOVE 'N' TO WS-SQLCODE.
     PERFORM VARYING WS-LOOP-CTR FROM +1 BY +1
             UNTIL WS-LOOP-CTR > WS-LOOP-CTR-MAX
                OR WS-SQLCODE = 'Y'
***** IMR CHANGES END HERE ***********************************
        EXEC SQL
             SELECT PLNAM_PLAT_ID, PLNAM_STATUS
               INTO :PLNAM-PLAT-ID, :PLNAM-STATUS
               FROM PL_PLAT_NAMES
              WHERE PLNAM_PLAT_ID = :PLNAM-PLAT-ID
                AND PLNAM_NAME_TYPE = 'P'
        END-EXEC
***** IMR CHANGES FOR DEADLOCK RESOLUTION BEGIN HERE *********
        IF SQLCODE NOT = -913
           MOVE 'Y' TO WS-SQLCODE
        END-IF
     END-PERFORM.
***** IMR CHANGES END HERE ***********************************
     IF SQLCODE = ZERO
        NEXT SENTENCE
     ELSE
        IF SQLCODE = WS-ROW-NOT-FOUND
           GO TO 3000-CONTINUE
        ELSE
           MOVE SQLCODE TO WS-RETURN-CODE
           MOVE 'PLAT NAMES FAIL; SQLCODE = ' TO WS-ABEND-MSG1
           MOVE WS-RETURN-CODE TO WS-ABEND-MSG2
           GO TO 9999-ABEND.

10 What happens if two resources are trying to access same table, will we get -911 or -904.

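-911. If both are trying to update the same data, one of them is chosen as the victim and receives -911 (deadlock or timeout). -904 is returned when a resource is unavailable, not for lock contention between two users.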

11 What is DCLGEN, host variables, SQLCA?

DCLGEN: DCLGEN stands for DeCLaration GENerator. It is an IBM-provided function that generates INCLUDE members for DB2 tables for use in COBOL and PL/I programs. These INCLUDE members contain SQL table declarations and working-storage structures.

Host Variables: Host variables are data items defined within a COBOL program. They are used to pass values to and receive values from a database. Host variables can be defined in the File Section, Working-Storage Section, Local-Storage Section or Linkage Section of your COBOL program and have any level number between 1 and 48. Level 49 is reserved for VARCHAR data items.

When a host variable name is used within an embedded SQL statement, the data item name must begin with a colon (:) to enable the compiler to distinguish between host variables and tables or columns with the same name.

Host variables are used in one of two ways:

Input host variables: These are used to specify data that will be transferred from the COBOL program to the database.

Output host variables: These are used to hold data that is returned to the COBOL program from the database.

SQLCA: (SQL communications area) An SQLCA is a collection of variables that is updated at the end of the execution of every SQL statement. A program that contains executable SQL statements and is precompiled with option LANGLEVEL SAA1 (the default) or MIA must provide exactly one SQLCA, though more than one SQLCA is possible by having one SQLCA per thread in a multi-threaded application. When a program is precompiled with option LANGLEVEL SQL92E, an SQLCODE or SQLSTATE variable may be declared in the SQL declare section or an SQLCODE variable can be declared somewhere in the program. An SQLCA should not be provided when using LANGLEVEL SQL92E. The SQL INCLUDE statement can be used to provide the declaration of the SQLCA in all languages but REXX. The SQLCA is automatically provided in REXX. To display the SQLCA after each command executed through the command line processor, issue the command db2 -a. The SQLCA is then provided as part of the output for subsequent commands. The SQLCA is also dumped in the db2diag.log file.


12 What is pre-compilation process?

A DB2 program is first fed to the DB2 precompiler, which extracts the SQL statements into a DBRM and replaces them in the source program with COBOL CALL statements. The modified source is passed to the COBOL compiler and then to the link editor to generate the load module. During precompilation, a timestamp token is placed in both the modified source and the DBRM. Separately, the DBRM undergoes the bind process, in which the DB2 optimizer chooses the best access path for each extracted SQL statement and stores it in the plan.

13 What is bind, what is package?

Bind
· A type of compiler for SQL statements.
· It reads the SQL statements from the DBRM and produces a mechanism to access data as directed by the SQL statements being bound.
· It checks syntax, checks the correctness of table and column definitions against the catalog information, and performs authorization validation.

Package
· It is a single bound DBRM with optimized access paths.
· It also contains a location identifier, a collection identifier and a package identifier.
· A package can have multiple versions, each with its own version identifier.

Advantages of Package
· Reduced bind time.
· Versioning.
· Provides remote data access.
· Can specify bind options at the programmer level.

14 Difference between plan and package, and DBRM and Package.

Plan
· An application plan contains one or both of the following elements:
    o A list of package names.
    o The bound form of SQL statements taken from one or more DBRMs.
· Every DB2 application requires an application plan.
· Plans are created using the DB2 subcommand BIND PLAN.

Package
· It is a single bound DBRM with optimized access paths.
· It also contains a location identifier, a collection identifier and a package identifier.
· A package can have multiple versions, each with its own version identifier.

Advantages of Package
· Reduced bind time.
· Versioning.
· Provides remote data access.
· Can specify bind options at the programmer level.

DBRM: Database Request Module, has the SQL statements extracted from the host language program by the pre-compiler.

15 How to resolve dead-lock error.

***** IMR CHANGES FOR DEADLOCK RESOLUTION BEGIN HERE *********
     MOVE 'N' TO WS-SQLCODE.
     PERFORM VARYING WS-LOOP-CTR FROM +1 BY +1
             UNTIL WS-LOOP-CTR > WS-LOOP-CTR-MAX
                OR WS-SQLCODE = 'Y'
***** IMR CHANGES END HERE ***********************************
        EXEC SQL
             SELECT PPLNT_ST,
                    PPLNT_CTY,
                    PPLNT_PLNT_FROM_DT,
                    PPLNT_CRT_THRU_DT,
                    PPLNT_THRU_FULL_DT,
                    PPLNT_THRU_INST_DT
               INTO :PPLNT-ST,
                    :PPLNT-CTY,
                    :PPLNT-PLNT-FROM-DT,
                    :PPLNT-CRT-THRU-DT,
                    :PPLNT-THRU-FULL-DT,
                    :PPLNT-THRU-INST-DT
               FROM @PTR00.PL_PLANT_PARAM
              WHERE PPLNT_ST  = :WS-PBPARM-STATE
                AND PPLNT_CTY = :WS-PBPARM-COUNTY
        END-EXEC
***** IMR CHANGES FOR DEADLOCK RESOLUTION BEGIN HERE *********
        IF SQLCODE NOT = -913
           MOVE 'Y' TO WS-SQLCODE
        END-IF
     END-PERFORM.
***** IMR CHANGES END HERE ***********************************
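The retry idea in this COBOL fragment (repeat the statement while it returns -913, up to a maximum number of attempts) can be sketched in a language-neutral way. In the sketch below, SqlError, run_with_retry and make_flaky_select are hypothetical names invented for illustration; no real DB2 driver is involved, and the failing SELECT is simulated.

```python
class SqlError(Exception):
    """Hypothetical stand-in for a failed SQL call; carries the SQLCODE."""
    def __init__(self, sqlcode):
        super().__init__("SQLCODE %d" % sqlcode)
        self.sqlcode = sqlcode

def run_with_retry(statement, max_attempts=3, retry_codes=(-913,)):
    """Call statement() until it succeeds or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return statement()
        except SqlError as err:
            # A non-retryable code, or the last attempt, goes back to the
            # caller (the equivalent of falling through to the abend path).
            if err.sqlcode not in retry_codes or attempt == max_attempts:
                raise

def make_flaky_select(failures):
    """Build a fake SELECT that hits -913 `failures` times, then succeeds."""
    state = {"calls": 0}
    def select():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise SqlError(-913)
        return ("PLAT01", "A"), state["calls"]
    return select
```

Only -913 is retried by default, as in the COBOL. After a -911 the unit of work has already been rolled back, so retrying a single statement is not enough; the whole unit of work would have to be repeated.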

16 What are index, table spaces, and database? Define different types of table spaces.

Index: An index is an ordered set of pointers to rows of a base table. Each index is based on the values of data in one or more columns. An index is an object that is separate from the data in the table. When you define an index using the CREATE INDEX statement, DB2 builds this structure and maintains it automatically. DB2 can use an index to improve performance and to ensure uniqueness. In most cases, access to data is faster with an index. A table with a unique index cannot have rows with identical keys.

Database: It is not a physical object. It is a logical grouping that consists of table spaces, index spaces, views, tables and so on. Whenever you create an object, you either say explicitly which database it belongs to or DB2 implicitly assigns the object to the right database. When creating a table, we say in which database and in which table space the table is held. When creating an index, we mention only the table name; DB2 determines the database from the base table. A database can also be defined as a unit of start and stop. There can be a maximum of 65,279 databases in a DB2 subsystem.

Table space: Table spaces are the physical space where the tables are stored. There are three types of table spaces.

Simple table space: More than one table is stored in the table space, and a single page can contain rows from all the tables.

Segmented table space: A segment is a logically contiguous set of n pages, where n is defined by the SEGSIZE parameter of the TABLESPACE definition. Tables are stored in one or more segments. Mass delete and sequential access are faster in a segmented table space. Reorganization restores the table to its clustered order. The LOCK TABLE command locks only the table and not the table space.

Partitioned table space: Only one table can be stored, and 1-64 partitions are possible in the table space. NUMPARTS of the TABLESPACE definition decides the number of partitions. The table is partitioned by value ranges of a single column or a combination of columns, and these columns cannot be updated. Individual partitions can be independently recovered and reorganized. Different partitions can be stored in different storage groups.

17 What is copy pending status?

If you LOAD or REORG the table space with the option LOG NO, the table space is placed in COPY PENDING status. This means that an image copy of the table space is needed.

18 What is check pending status?

If you LOAD the table space with the ENFORCE NO option, the table space is placed in CHECK PENDING status. This means that the table space was loaded without enforcing constraints. The CHECK utility needs to be run on the table space.

19 What is Image copy?

After a Bulk Load of the Table or Table Space by means of a DB2 Load Utility, we need to go for an Image Copy so as to restore the system in case of some unexpected situations.


Generally, DB2 itself expects such an image copy and issues messages in the output such as 'Image Copy Required' and 'Table Space Remains in Copy Pending State'.

It is up to the programmer either to take an image copy or to reset the copy pending state with the REPAIR utility (REPAIR SET with NOCOPYPEND).

20 What is difference between Image copy and Unload file?

Image copy produces a binary copy of the data, including all the internal DB2 data (such as the object id and the position of data within the table space). If the table space was not reorganized and the data was badly organized within it, recovering to the image copy puts the table space back in the same badly organized state. When you drop a table and recreate it without any changes to the table definition, DB2 gives the table a new object id. The old image copy still contains the object id that the table had before it was dropped, so old image copies cannot be used without translating the old object id to the new one.

Unload contains the data in a more readable form, structured so that a load utility can use it. If you want to move the data to a different environment, unload is probably your choice. Unload is slower, though, and has other drawbacks. Both unload and image copy can be used, because they target different situations.

21 What is Restart Logic? How will you handle it?

Usually there is only one COMMIT or ROLLBACK, just before the termination of the transaction. But this is not always preferable.

If a program is updating one million records in a table space and abnormally ends after processing ninety thousand records, the ROLLBACK brings the database back to the point of transaction initiation. Because nothing was committed, all the updates have to be made again. This repeated cost is the result of bad application design.

If the program is expected to do huge updates, the commit frequency has to be chosen properly. Let us say that, after careful analysis, we have designed the application with a COMMIT frequency of one thousand records. If the program abnormally ends while processing the 1500th record, the restart should begin not from the first record but from the 1001st. This is done using restart logic.

Create a small restart table called RESTARTS with a dummy record, and insert one record into it for every commit, holding the key and the occurrence of the commit. This insertion should happen just BEFORE the COMMIT is issued.

The first paragraph of the procedure should read the last record of the table and skip the records that are already processed and committed (1000 in the previous case). After all the records are processed, delete the entries in the table and issue one final COMMIT.

22 Query to get salary with occurrence of two from table.


SELECT SALARY FROM EMPTABLE GROUP BY SALARY HAVING COUNT (*) = 2
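To see this query at work, here is a small sketch that assumes SQLite as a stand-in for DB2 and uses invented sample salaries. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

def salaries_occurring_twice():
    """Return the salaries that appear exactly twice in EMPTABLE."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.executescript("""
        CREATE TABLE EMPTABLE (EMPID INTEGER, SALARY INTEGER);
        INSERT INTO EMPTABLE VALUES (1, 1000), (2, 1000), (3, 2000),
                                    (4, 3000), (5, 3000), (6, 3000);
    """)
    rows = con.execute("""
        SELECT SALARY
          FROM EMPTABLE
         GROUP BY SALARY
        HAVING COUNT(*) = 2
         ORDER BY SALARY""").fetchall()
    con.close()
    return [salary for (salary,) in rows]
```

In the sample data 1000 occurs twice, 2000 once and 3000 three times, so only 1000 qualifies.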

23 What is difference between having and order by?

HAVING: The HAVING clause is used to specify a search condition that each retrieved group must satisfy. The HAVING clause acts on groups as the WHERE clause acts on rows, and it contains the same kind of search conditions that you specify in a WHERE clause. The search condition in the HAVING clause tests properties of each group rather than properties of individual rows in the group.

ORDER BY: The ORDER BY clause is used to sort the rows of a result set by the values contained in the specified columns. The default is ascending order (ASC). If the keyword DESC follows the column name, descending order is used. An integer giving the position of a column in the select list can also be used in place of the column name.

24 What are and how many types of ISOLATION LEVELS.

CURSOR STABILITY (CS): As the cursor moves from a record in one page to a record in the next page, the lock over the first page is released (provided the record is not updated). It prevents concurrent update or deletion of the row that is currently being processed. It provides WRITE integrity.

REPEATABLE READ (RR): All the locks acquired are retained until the commit point. Prefer this option when your application has to retrieve the same rows several times and cannot tolerate different data each time. It provides READ and WRITE integrity.

READ STABILITY (RS): It is similar to repeatable read, but only the rows that qualify are kept locked until commit, so rows inserted by other applications (phantom rows) can appear when the query is repeated.

UNCOMMITTED READ (UR): It is also known as DIRTY READ. It can be applied only to retrieval SQL. It takes no locks during a read, so it may read data that is not committed. It is highly dangerous; use it only when concurrency is your only requirement. It is of great use in statistical calculations over large tables and in data-warehousing environments.

25 How many ways are there to delete a record from table?

Two ways, DELETE and UPDATE.

26 What is Index Cardinality?

The number of distinct values for a column is called index cardinality. DB2's RUNSTATS utility analyzes column value redundancy to determine whether to use a table space or index scan to search for data.

27 What is difference between join and Union?

UNION: It is used to get information about more entities. In other words, it returns more rows.


JOIN: It is used to get detail information about entities. In other words, it returns more columns.
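The "more rows" versus "more columns" contrast can be shown with a minimal sketch, assuming SQLite as a stand-in for DB2. The three tables EMP_CURRENT, EMP_FORMER and EMP_PAY are hypothetical.

```python
import sqlite3

def union_vs_join():
    """Return (union_rows, join_rows) for three tiny hypothetical tables."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.executescript("""
        CREATE TABLE EMP_CURRENT (EMPNO INTEGER, NAME TEXT);
        CREATE TABLE EMP_FORMER  (EMPNO INTEGER, NAME TEXT);
        CREATE TABLE EMP_PAY     (EMPNO INTEGER, SALARY INTEGER);
        INSERT INTO EMP_CURRENT VALUES (1, 'ANN');
        INSERT INTO EMP_FORMER  VALUES (2, 'BOB');
        INSERT INTO EMP_PAY     VALUES (1, 1000);
    """)
    # UNION stacks rows of the same shape: more entities.
    union_rows = con.execute("""
        SELECT EMPNO, NAME FROM EMP_CURRENT
        UNION
        SELECT EMPNO, NAME FROM EMP_FORMER
        ORDER BY EMPNO""").fetchall()
    # JOIN widens each row with columns from another table: more detail.
    join_rows = con.execute("""
        SELECT C.EMPNO, C.NAME, P.SALARY
          FROM EMP_CURRENT C JOIN EMP_PAY P ON C.EMPNO = P.EMPNO""").fetchall()
    con.close()
    return union_rows, join_rows
```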

28 What is RUNSTATS, REORG and EXPLAIN.

RUNSTATS: It collects statistical information for DB2 tables, table spaces, partitions, indexes and columns, and places the information into DB2 catalog tables. The DB2 optimizer uses the data in these tables to determine the optimal access path for SQL queries. CATMAINT, MODIFY and STOSPACE are the other catalog manipulation utilities.

REORG: It re-clusters the data in a table space, resets the free space to the amount specified in the CREATE DDL, and deletes and redefines the VSAM data sets for STOGROUP-defined objects.

EXPLAIN: EXPLAIN(YES) loads the access path selected by the optimizer into PLAN_TABLE. EXPLAIN(NO) is the default and loads no such details into PLAN_TABLE.

29 What is start and stop command in db2?

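-STA DB is the short form of -START DATABASE and -STO DB of -STOP DATABASE. They are typically used together with the display and terminate commands:

a. -DISPLAY UTILITY(*)
b. -TERM UTILITY(*)
c. -DIS DB(DTATNV03) SPACE(PNV03*) LIMIT(*)
d. -STO DB(DTATNV03) SP(PNV03NI*)
e. -STA DB(DTATNV03) SP(PNV03NI2) ACCESS(FORCE)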

30 Maximum Salary

First max salary:
SELECT MAX(SALARY) FROM EMPTABLE

Second max salary:
SELECT MAX(SALARY) FROM EMPTABLE A WHERE SALARY < (SELECT MAX(SALARY) FROM EMPTABLE B)

Nth max salary:
SELECT DISTINCT A.SALARY FROM EMPTABLE A WHERE N = (SELECT COUNT(DISTINCT B.SALARY) FROM EMPTABLE B WHERE B.SALARY >= A.SALARY)
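A minimal sketch of the Nth-highest salary idea, assuming SQLite as a stand-in for DB2 and invented sample data. It counts DISTINCT salaries in a correlated subquery with an explicit alias B, so that duplicate salaries do not skew the rank; the value of N is passed as a parameter.

```python
import sqlite3

def nth_max_salary(con, n):
    """Return the n-th highest distinct salary, or None if there is none."""
    row = con.execute("""
        SELECT DISTINCT A.SALARY
          FROM EMPTABLE A
         WHERE ? = (SELECT COUNT(DISTINCT B.SALARY)
                      FROM EMPTABLE B
                     WHERE B.SALARY >= A.SALARY)""", (n,)).fetchone()
    return row[0] if row else None

def make_emptable():
    """Build a hypothetical EMPTABLE with one duplicated salary."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.executescript("""
        CREATE TABLE EMPTABLE (EMPID INTEGER, SALARY INTEGER);
        INSERT INTO EMPTABLE VALUES (1, 1000), (2, 2000), (3, 2000), (4, 3000);
    """)
    return con
```

For a given salary, the subquery counts how many distinct salaries are greater than or equal to it; that count is the salary's rank from the top.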

31 Select the emp ids whose salary is not less than 2000 in any month.

EMPID  MONTH  SALARY
100    JAN    1000
100    FEB    2000
100    MAR    3000
200    JAN    2000
200    FEB    3000
200    MAR    4000
300    JAN    3000
300    FEB    4000
300    MAR    5000

SELECT DISTINCT EMPID FROM EMPTABLE WHERE EMPID NOT IN (SELECT EMPID FROM EMPTABLE WHERE SALARY < 2000)
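The sample table from this question can be loaded and queried as a check. The sketch assumes SQLite as a stand-in for DB2; the subquery tests SALARY < 2000, and DISTINCT keeps each EMPID to a single row.

```python
import sqlite3

def empids_never_below_2000():
    """Return the EMPIDs that have no month with a salary under 2000."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for DB2 here
    con.execute("CREATE TABLE EMPTABLE (EMPID INTEGER, MONTH TEXT, SALARY INTEGER)")
    con.executemany("INSERT INTO EMPTABLE VALUES (?, ?, ?)", [
        (100, "JAN", 1000), (100, "FEB", 2000), (100, "MAR", 3000),
        (200, "JAN", 2000), (200, "FEB", 3000), (200, "MAR", 4000),
        (300, "JAN", 3000), (300, "FEB", 4000), (300, "MAR", 5000),
    ])
    rows = con.execute("""
        SELECT DISTINCT EMPID
          FROM EMPTABLE
         WHERE EMPID NOT IN (SELECT EMPID FROM EMPTABLE WHERE SALARY < 2000)
         ORDER BY EMPID""").fetchall()
    con.close()
    return [empid for (empid,) in rows]
```

Employee 100 is excluded because of the 1000 salary in JAN.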

JCL

1 What is COND parameter?

The COND parameter specifies a condition under which a job step is bypassed, based on the return codes from previous steps.

COND can be coded on both the JOB and EXEC statements. When it is coded on both, the system tests the conditions on the JOB statement first; if none of them is satisfied, it then tests the conditions on the EXEC statement.

A maximum of 8 conditions can be coded in the COND parameter. With multiple conditions, if ANY of them is true the step is bypassed (for COND on the JOB statement, the job stops proceeding further).

It bypasses the step if the condition is true.

SYNTAX : COND=(code,operator,stepname)
code can be 0 to 4095.
operator can be GT, GE, EQ, NE, LT or LE.
stepname is optional; if omitted, the return codes of all the previous steps are checked.

On the EXEC statement you may also find:

COND=ONLY allows the step to execute only if a prior step has ABENDed.
COND=EVEN allows the step to execute even if a prior step has ABENDed.

Example:
//STEP2 EXEC PGM=PROG12,COND=(4,GT,STEP1)

Here the system bypasses STEP2 if 4 is greater than the return code from STEP1.
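Because the COND test reads "backwards" (the step is bypassed when the test is true), a small model can help. The sketch below is an illustration only, not how the system is implemented: step_is_bypassed is a hypothetical helper, return codes are passed in as a dictionary, and a test naming a step that did not run is assumed to be ignored.

```python
import operator

# Comparison operators allowed in a COND test.
_OPS = {
    "GT": operator.gt, "GE": operator.ge, "EQ": operator.eq,
    "NE": operator.ne, "LT": operator.lt, "LE": operator.le,
}

def step_is_bypassed(conditions, return_codes):
    """Decide whether a step is bypassed.

    conditions   -- list of (code, operator, stepname); stepname may be None
    return_codes -- dict of return codes of the steps that already ran
    """
    for code, op, stepname in conditions:
        if stepname is None:
            codes = list(return_codes.values())   # test every previous step
        elif stepname in return_codes:
            codes = [return_codes[stepname]]
        else:
            codes = []                            # step did not run: ignored
        # The test compares the coded value with the return code.
        if any(_OPS[op](code, rc) for rc in codes):
            return True                           # any true test bypasses
    return False
```

For COND=(4,GT,STEP1), a STEP1 return code of 0 makes "4 GT 0" true, so the step is bypassed; a return code of 8 makes the test false, so the step runs.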

2 What is RD?

RD (restart definition) specifies whether the system is to allow the operator the option of performing automatic step or checkpoint restart if a job step abends with a restartable abend code.

RD[.procstepname]={R|RNC|NR|NC}

R   : Restart, checkpoints allowed
RNC : Restart, no checkpoints
NR  : No automatic restart, checkpoints allowed
NC  : No automatic restart, no checkpoints


3 What is REFERBACK?

The backward reference, or refer-back, lets you obtain information from a previous JCL statement in the job stream. “*” is the refer-back operator. It improves consistency and makes the coding easier.

DCB, DSN, VOL=REF, OUTPUT and PGM can be referred back.

A refer-back can be coded in the following ways.

a) Another DD of the same step is referred to: *.DDNAME

b) A DD of another step is referred to: *.STEPNAME.DDNAME (DDNAME of the STEPNAME)

c) A DD of a step inside a procedure is referred to: *.STEP-INVOKING-PROC.PROC-STEP-NAME.DDNAME

An asterisk in the SYSOUT parameter (SYSOUT=*) refers back to the MSGCLASS of the JOB card.

Example:
//STEP1    EXEC PGM=TRANS
//TRANFILE DD DSN=AR.TRANS.FILE,DISP=(NEW,KEEP),
//            UNIT=SYSDA,VOL=SER=MPS800,
//            SPACE=(CYL,(2,1)),
//            DCB=(DSORG=PS,RECFM=FB,LRECL=80)
//TRANERR  DD DSN=AR.TRANS.ERR,DISP=(NEW,KEEP),
//            UNIT=SYSDA,VOL=SER=MPS800,
//            SPACE=(CYL,(2,1)),
//            DCB=*.TRANFILE
//STEP2    EXEC PGM=TRANSEP
//TRANIN   DD DSN=*.STEP1.TRANFILE,DISP=SHR
//TRANOUT  DD DSN=AR.TRANS.A.FILE,DISP=(NEW,KEEP),
//            UNIT=SYSDA,VOL=REF=*.STEP1.TRANFILE,
//            SPACE=(CYL,(2,1)),
//            DCB=*.STEP1.TRANFILE

4 What is difference between DISP=OLD, DISP=SHR, DISP=MOD

NEW  Creates a new data set.
OLD  The OS searches for an existing data set of the specified name and gives the job exclusive control of it. If the file is written to, its old data is lost and replaced by the new data.
SHR  Identical to OLD, except that OLD gives exclusive control of the data set to the job, whereas SHR allows multiple jobs to read the same data set.
MOD  The OS searches for an existing data set of the specified name. If the file is not there, it is created. If the file is written to, the data is appended to the existing file.


5 There are 100 steps in JCL, I want to execute only from 51 to 100 how can I do it, (Restart should not be used) with syntax.

6 What is GDG, GDG base, GDG Model? How will you create GDG?

GDG (Generation Data Group) is a group of datasets that are related to each other chronologically or functionally. Each of these datasets is called a generation. The generation number distinguishes each generation from the others.

If the GDG Base is MM01.PAYROLL.MASTER, then their generations are identified using the generic name “MM01.PAYROLL.MASTER.GnnnnVxx”. nnnn is generation number (0001-9999) and xx is version number (00-99).

The GDG base is created using IDCAMS. The parameters given while creating the GDG are:

Parameter            Purpose
NAME                 The name of the GDG base.
LIMIT                The maximum number of generations that can exist at any point of time. It is a number and should be less than 256.
EMPTY/NOEMPTY        Decides what happens when the LIMIT is exceeded. EMPTY un-catalogues all the existing generations, so only the most recent generation stays catalogued. NOEMPTY un-catalogues only the oldest generation, so the LIMIT newest generations are kept.
SCRATCH/NOSCRATCH    SCRATCH un-catalogues and physically deletes the generations that are not kept. NOSCRATCH only un-catalogues them; they are not physically deleted from the volume.
OWNER                Owner of the GDG.
FOR(days)/TO(date)   Expiry date. It can be coded either as a number of days or as a particular date.

The model dataset is defined after or along with the creation of the base. Once the model DCB is defined, there is no need to code the DCB parameter when new generations are allocated. The model DCB parameters can be overridden by coding new parameters while creating a generation. It is worth noting that two generations of the same GDG can therefore exist in two different formats.

//GDG      EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//MODEL    DD DSN=MM01.PAYROLL.MASTER,DISP=(NEW,KEEP),
//            UNIT=SYSDA,VOL=SER=MPS800,SPACE=(CYL,(10,20)),
//            DCB=(DSORG=PS,RECFM=FB,LRECL=400)
//SYSIN    DD *
  DEFINE GDG (NAME(MM01.PAYROLL.MASTER) -
         LIMIT(5) -
         NOEMPTY -
         SCRATCH)

7 What is Procedure?

A set of job control statements that is used frequently is saved separately as a procedure and invoked from the JOB. Using procedures minimizes duplication of code and the probability of error, because a procedure should consist of pre-tested statements. Each time a procedure is invoked, there is no need to retest its functionality, since it is pre-tested.

Statements not allowed in a procedure:
1) The JOB statement and JES2/JES3 control statements
2) The JOBCAT and JOBLIB statements
3) An in-stream procedure (an internal PROC/PEND pair)
4) In-stream data (SYSIN DD * and DD DATA statements)

8 What is difference between In-stream procedure and Catalogue procedure?

If a procedure is placed in the same job stream, it is called an in-stream procedure. In-stream procedures start with a PROC statement and end with a PEND statement. They should be coded before the first EXEC statement.

If a procedure exists in a PDS, it is called a catalogued procedure. The PEND statement is not needed for catalogued procedures.

One procedure can invoke another procedure; this is called nesting. Nesting can be done up to 15 levels. But there is an indirect limit of 255 steps per job, so we have to make sure that the expansion of the procs in the JOB does not exceed this limit on the number of steps.

9 Where will default values for Catalogue proc is defined.

On the PROC statement, which is coded before the first step of the procedure.

10 How to pass the data in in-stream procedure.

We have to code SYSIN DD DUMMY in the proc and override SYSIN in the JCL that invokes the proc.

11 What is DUMMY, and NULLFILE, Specific usage of these?

There is said to be a slight difference between NULLFILE and DUMMY. When multiple files are concatenated in the same DD statement and DD DUMMY is used in the list, all the files beyond that point are dropped, because DUMMY signals end of file. Using NULLFILE, in contrast, is said to allow the remaining files and records to be read and processed.

12 What is REGION?

Syntax: REGION={ xK |yM } ( x can be 1-2096128 & y can be 1-2047)


It is used to specify the amount of central/virtual storage that the job requires. It can be requested in units of kilobytes (xK) or megabytes (yM). If it is requested in kilobytes, x should be a multiple of 4; otherwise the system rounds it up to the next multiple of 4K.

REGION can be coded on the EXEC statement also. A REGION parameter coded on the JOB card overrides the parameter coded on the EXEC card.

REGION=0M allocates all the available memory in the address space to the JOB.

Region-related ABENDs: When the requested region is not available, the JOB will ABEND with S822. When the requested region is not enough for the program to run, you will get ABEND S80A or S804.

13 What happens if Time=0 is given, will JCL run.

YES

14 What are Symbolic parameter and Overriding parameter.

Symbolic parameter: A symbolic is a placeholder in a PROC. It is referred to as &symbol inside the PROC, and its value is supplied when the PROC is invoked (symbol=value). If no value is provided on the invocation, the default value coded in the PROC definition is used for the substitution.

Overriding parameter: An override changes the dataset names or parameters that are already coded in the procedure. When you override a cataloged procedure, the override applies just to that execution of the job. The cataloged procedure itself isn’t changed. Multiple overrides are allowed, but they must follow the order of the steps: first the overrides for step1, then for step2 and then for step3. Any overrides in the wrong order are IGNORED. If the step name is not coded in the override, the system applies the override to the first step alone.

15 What is Temporary data set, I have steps1 to steps10, step5 has created temporary datasets and it is being used in step 8, job abended at step6 what happens to temporary data set.

All the temporary datasets created in step5 will get deleted.

16 What is IEFBR14, what is the specific use? What happens is disposition is given as MOD, DEL, DEL.

It is a dummy program. It is a two-instruction assembler program that clears register 15 and branches to the address in register 14. So whatever operation you do with IEFBR14, you will get a return code of 0. In this sense it is not really a utility program, but it is used like any other utility program.

If you rerun a job without deleting the datasets created by the prior run, the job ends with a JCL error and the message ‘duplicate dataset name found’ (NOT CATLGD 2). To avoid this, all such datasets are deleted in the first step using IEFBR14.


With disposition (MOD,DELETE,DELETE), an existing dataset is deleted; if the dataset is not found, it is allocated and then deleted.

IEFBR14 can be used for allocating datasets and for deleting the datasets in TAPE, which you cannot do in TSO.

17 What is SB37, SD37, SE37, I know it’s a space ABEND be specific in each.

SB37: End of volume. Suppose we have requested 1600 tracks (including primary and secondary). When the program tries to write more than 1600 tracks, the operation ends with SB37. Note that although 1600 tracks were requested, this ABEND can occur after just 400 tracks, because 1600 tracks is the best-case allocation and 400 tracks is the worst-case allocation.

Solution:

1) Increase the size of the primary and secondary space reasonably. If you increase the size blindly, you will still get SB37 if the primary or secondary amount is not available within five extents.

2) If the job comes down again and again, a simple solution is to make the dataset multi-volume by coding VOL=(,,,3). The dataset can then grow to 48 extents (16 on each of the 3 volumes), which should be more than enough for a successful run.

SD37: Secondary space is not given. Only the primary space was coded and it is already filled; if the program then tries to write more, you will get the SD37 space ABEND.

Solution:

Include the secondary space in the SPACE parameter of the dataset that has given the problem.

SE37: End of volume. This is the same as SB37. You will usually get this ABEND for a partitioned dataset.

Solution:

If the partitioned dataset already exists, compress it using ‘Z’ in ISPF or using IEBCOPY, and then submit the job once again. If it ABENDs again, create a new dataset with more primary, secondary and directory space, copy all the members of the current PDS to the new PDS, delete the old one, rename the new one to the old name and resubmit the JOB.

18 If Limit is reached and empty and No-scratch is defined, what happens to new version, how will it be created?

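When the LIMIT is exceeded, EMPTY un-catalogues all the existing generations, so only the newly created generation remains catalogued. Because NOSCRATCH is defined, the old generations are only un-catalogued; they are not physically deleted from the volume.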
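The interplay of LIMIT, EMPTY/NOEMPTY and SCRATCH/NOSCRATCH can be sketched as a small model. This is an illustrative Python function, not IDCAMS behaviour in every detail; the name roll_in and the three returned lists are assumptions made for the example.

```python
def roll_in(cataloged, new_gen, limit, empty=False, scratch=False):
    """Model what happens when a new generation is added to a GDG.

    Returns a tuple (still_cataloged, uncataloged_but_on_disk, deleted).
    """
    generations = list(cataloged) + [new_gen]
    if len(generations) <= limit:
        # LIMIT not exceeded: nothing rolls off.
        return generations, [], []
    if empty:
        # EMPTY: every existing generation rolls off, only the new one stays.
        removed = list(cataloged)
        kept = [new_gen]
    else:
        # NOEMPTY: only the oldest generations beyond the LIMIT roll off.
        overflow = len(generations) - limit
        removed = generations[:overflow]
        kept = generations[overflow:]
    if scratch:
        # SCRATCH: rolled-off generations are physically deleted.
        return kept, [], removed
    # NOSCRATCH: rolled-off generations are only un-catalogued.
    return kept, removed, []
```

With LIMIT(3), EMPTY and NOSCRATCH, adding a fourth generation leaves only the new generation catalogued, while the three older ones stay on the volume un-catalogued.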

19 How to copy VB file to FB file.

  SORT FIELDS=(7,8,CH,A)
  OUTFIL FNAMES=FB1,VTOF,OUTREC=(5,76)

20 What is Difference between RECFM=FB,F,VB,V,U. Which is efficient?

RECFM={F | FB | FBS | FS | V | VB | VBS | VS | U}[A | M]

Record format is:
F: fixed length
B: blocked
S: spanned
V: variable length
U: undefined length

Control characters are:
A: ISO/ANSI code
M: machine code

FB is more efficient.

21 How many max steps can be used in JCL.

255

22 How many DD statements can be given?

3273 DD statements can be coded in a single EXEC step.

23 How can you create a PS without using 3.2 options?

//STEP1 EXEC PGM=IEFBR14
//DD1   DD DSN=TI822SA.TEST.FILE,
//         DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,VOL=SER=MPS800,
//         SPACE=(CYL,(5,5)),
//         DCB=(DSORG=PS,RECFM=FB,LRECL=80)

24 How will you find hex values in a file?

By HEX ON

25 What is REPRO command?

It is a general-purpose command that can operate on both VSAM and non-VSAM datasets. It performs three basic functions.

1) It loads an empty VSAM cluster with records.

2) It creates a backup of a VSAM dataset on a physical sequential dataset, which can later be used to restore or rebuild the VSAM dataset.

3) It merges data from two VSAM datasets.

REPRO INFILE(DDNAME) | INDATASET(DATASET-NAME) -
      OUTFILE(DDNAME) | OUTDATASET(DATASET-NAME)

INFILE/INDATASET and OUTFILE/OUTDATASET point to the input and output datasets.

REPRO can be used for selective copying.

Where to start REPRO   Where to stop REPRO   Valid for
FROMKEY(REC-KEY)       TOKEY(REC-KEY)        KSDS, ISAM
FROMADDRESS(RBA)       TOADDRESS(RBA)        KSDS, ESDS
FROMNUMBER(RRN)        TONUMBER(RRN)         RRDS
SKIP(number)           COUNT(number)         KSDS, ESDS, RRDS

SKIP refers to the number of input records to skip before beginning to copy, and COUNT specifies the number of output records to copy.

26 What is default in DISP?

(NEW, DELETE, DELETE)

27 What is Keyword and positional parameters specify them.

All the parameters of JOB, EXEC and DD statements can be broadly classified into two types. They are POSITIONAL and KEYWORD parameters.

A parameter whose meaning is defined by its position is a positional parameter. If a positional parameter is omitted, the omission must be indicated to the system by a comma (‘,’). Ex: the accounting information and programmer name on the JOB card.

Keyword parameters follow the positional parameters and can be coded in any order. Ex: all the parameters suffixed by ‘=’ are keyword parameters. PGM= and PROC= are exceptions to this rule; they are positional parameters.

28 What are the ways to pass a parameter from JCL to COBOL?

PARM and SYSIN

29 What is the Max Length of Parm that can be passed through JCL?


100 bytes

30 How can you pass /* through SYSIN.

//SYSIN DD DATA,DLM=##
//EMPFILE DD *
2052 MUTHU
1099 DEV
/*
##

31 How can you pass a value from COBOL to JCL?

Through RETURN-CODE

32 What happens if we try to write a file with DISP=OLD and DISP=MOD and DISP=SHR option.

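DISP=OLD: The file is overwritten by the new records.
DISP=MOD: The records are appended at the end of the file.
DISP=SHR: The file is overwritten from the beginning as with OLD, but the job does not get exclusive control of the dataset, so other jobs can access it at the same time. SHR is therefore not recommended for output.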

33 What are STEPLIB and JOBLIB? Which override which value?

STEPLIB: It follows the EXEC statement. Load modules are searched for first in this library and then in the system libraries. If a module is not found in either place, the JOB ABENDs with S806.

JOBLIB: It follows the JOB statement. Load modules of any step (EXEC) that does not have its own STEPLIB are searched for in this PDS. If a module is not found, the system libraries are searched. If it is not found there either, the JOB ABENDs with S806.

STEPLIB overrides JOBLIB.

34 What is JCLLIB?

It follows the JOB statement. Catalogued procedures used in the JOB are searched for in the PDSs named here. If they are not found, the system procedure libraries are searched. If they are not there either, there will be a JCL error with a ‘Proc not found’ message.

Syntax: //PROCLIB JCLLIB ORDER=(PDS1,PDS2)

INCLUDE members are also kept in procedure libraries (JCLLIB).

35 What is BLOCK, TRACK, CYLINDER parameter? What is the value for each?

1 Block = 32KB of formatted space or 42KB of unformatted space
1 Track = 6 Blocks
1 Cylinder = 15 Tracks

36 What is the default Disposition parameter for abnormal termination of job?

DELETE

37 How will you alter a GDG.

With ALTER command.

//STEP1    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  ALTER YOUR.GDG.NAME LIMIT(20)
/*

38 What is difference between Force and Purge.

FORCE: FORCE allows you to delete data spaces, generation data groups and user catalogs without first ensuring that these objects are empty.
PURGE: PURGE deletes the dataset even if its retention period has not expired.

39 Can we change a Limit of GDG after its creation?

Yes, we can change the parameter values of the GDG with the ALTER command.

//STEP1    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  ALTER YOUR.GDG.NAME LIMIT(20)
/*

40 How will you override a particular dataset name of a procedure .How will you restart a proc step.

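Overriding a dataset: code the DD statement as //STEP-NAME-IN-PROC.DDNAME-OF-STEP DD ... after the EXEC statement that invokes the procedure.

Restarting a proc step: code RESTART=PROC-STEP-NAME.STEP-NAME-IN-PROC on the JOB statement, where PROC-STEP-NAME is the job step that invokes the procedure.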

41 Can we delete a record in ESDS.

NO

42 What is Difference between KSDS, RRDS, and ESDS?

Characteristics of ESDS, KSDS and RRDS:

Entry sequence
  ESDS: Based on entry sequence.
  KSDS: Based on collating sequence by key field.
  RRDS: Based on relative record number order.

Access
  ESDS: Only sequential access is possible.
  KSDS: Sequential and random access is possible. Random access is through the primary or alternate key.
  RRDS: Can be accessed directly using the relative record number, which serves as the address.

Alternate index
  ESDS: May have one or more alternate indexes, but they cannot be used in batch COBOL. They can be used in CICS COBOL.
  KSDS: May have one or more alternate indexes.
  RRDS: Not applicable.

Location of record
  ESDS: A record's RBA cannot be changed.
  KSDS: A record's RBA can be changed.
  RRDS: A relative record number cannot be changed.

Free space
  ESDS: Exists at the end of the dataset to add records.
  KSDS: Distributed free space for inserting records in between or changing the length of an existing record.
  RRDS: Free slots are available for adding records at their location.

Deletion
  ESDS: Records cannot be deleted. REWRITE with the same length is possible.
  KSDS: DELETE is possible.
  RRDS: DELETE is possible.

Record size
  ESDS: Fixed or variable.
  KSDS: Fixed or variable.
  RRDS: Fixed.

Spanned records
  ESDS: Possible.
  KSDS: Possible.
  RRDS: Not possible.

Specialty
  ESDS: Occupies less space.
  KSDS: Easy random access; the most popular method.
  RRDS: Fastest access method.

Preference
  ESDS: Applications that require sequential access only. Ex: PAYROLL PROCESSING
  KSDS: Applications that require each record to have a key field and require both direct and sequential access. Ex: BANKING APPLICATION
  RRDS: Applications that require only direct access. There should be a field in the record that can be easily mapped to the RRN.

43 What is ALTERNATE INDEX?

It is a way of accessing KSDS and ESDS datasets by a key other than the primary key.

44 What is BLDINDEX?

An alternate index should hold all the alternate keys with their corresponding primary key pointers. After the AIX is defined, this information must be loaded from the base cluster. Only then can we access the records using the AIX. BLDINDEX does this load operation.

INFILE and OUTFILE point to the base cluster and the alternate index cluster respectively.

45 How will you calculate REC length for AIX for KSDS and ESDS?

Unique case: 5 + (alt-key-length + primary-key-length)
Non-unique case: 5 + (alt-key-length + n * primary-key-length), where n = number of duplicate records for the alternate key
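The two formulas differ only in the multiplier n, so they can be written as one small function. This is a minimal sketch in Python; the function name aix_record_length and its parameter names are hypothetical, and the key lengths in the usage line are example values.

```python
def aix_record_length(alt_key_len, prime_key_len, max_duplicates=1):
    """Maximum AIX record length: 5 + alt-key-length + n * primary-key-length.

    max_duplicates is n, the number of records sharing one alternate key;
    n = 1 gives the unique-key case.
    """
    if alt_key_len <= 0 or prime_key_len <= 0 or max_duplicates < 1:
        raise ValueError("lengths must be positive and n must be at least 1")
    return 5 + alt_key_len + max_duplicates * prime_key_len
```

For a 10-byte alternate key over a 6-byte primary key, aix_record_length(10, 6) covers the unique case and aix_record_length(10, 6, 5) allows up to five duplicates per alternate key.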

46 IF/THEN/ELSE in JCL

//JOBE     JOB ...
//PROC1    PROC
//PSTEPONE EXEC PGM=...
//         PEND
//PROC2    PROC
//PSTEPTWO EXEC PGM=...
//         PEND
//EXP1     EXEC PROC=PROC1
//EXP2     EXEC PROC=PROC2
//IFTEST4  IF (EXP1.PSTEPONE.RC > 4) THEN
//STEP1ERR EXEC PGM=PROG1
//         ELSE
//IFTEST5  IF (EXP2.PSTEPTWO.ABENDCC=U0012) THEN
//STEP2ERR EXEC PGM=PROG2
//         ELSE
//NOERR    EXEC PGM=PROG3
//ENDTEST5 ENDIF
//ENDTEST4 ENDIF
//NEXTSTEP EXEC ...

COBOL

1 What is Static call and Dynamic call, how will you identify whether call is dynamic or static by seeing program.

Sl #  STATIC call compared with DYNAMIC call

1  STATIC:  Identified by a CALL literal. Ex: CALL 'PGM1'.
   DYNAMIC: Identified by a CALL variable, and the variable should be populated at run time.
       01 WS-PGM PIC X(08).
       MOVE 'PGM1' TO WS-PGM
       CALL WS-PGM

2  STATIC:  The default compiler option is NODYNAM, so all the literal calls are considered static calls.
   DYNAMIC: If you want to convert the literal calls into dynamic calls, the program should be compiled with the DYNAM option. By default, call variables and any unresolved calls are considered dynamic.

3  STATIC:  If the subprogram undergoes a change, both the subprogram and the main program need to be recompiled.
   DYNAMIC: If the subprogram undergoes a change, recompilation of the subprogram is enough.

4  STATIC:  Sub modules are link-edited with the main module.
   DYNAMIC: Sub modules are picked up at run time from the load library.

5  STATIC:  Size of the load module will be large.
   DYNAMIC: Size of the load module will be less.

6  STATIC:  Fast.
   DYNAMIC: Slow compared to a static call.

7  STATIC:  Less flexible.
   DYNAMIC: More flexible.

8  STATIC:  The sub-program will not be in its initial state the next time it is called unless you explicitly use INITIAL or you do a CANCEL after each call.
   DYNAMIC: The program will be in its initial state every time it is called.

2 What are compiler options for static and dynamic call, where will you specify them?

We pass the values to the compiler through PARM.
Ex: CPARM='NODYNAM,OFFSET,TEST(NONE,SYM),NOZWB'
Static = NODYNAM
Dynamic = DYNAM

3 What is call by reference and call by content.

Sl #  Pass BY REFERENCE compared with pass BY CONTENT

1  BY REFERENCE: CALL 'sub1' USING BY REFERENCE WS-VAR1
   BY CONTENT:   CALL 'sub1' USING BY CONTENT WS-VAR1 (the BY CONTENT keyword is needed)

2  BY REFERENCE: It is the default in COBOL. The BY REFERENCE keyword is not needed.
   BY CONTENT:   The BY CONTENT keyword is mandatory to pass an element by value.

3  BY REFERENCE: The address of WS-VAR1 is passed.
   BY CONTENT:   The value of WS-VAR1 is passed.

4  BY REFERENCE: The sub-program's modifications to the passed elements are visible in the main program.
   BY CONTENT:   The sub-program's modifications to the passed elements are local to that sub-program and not visible in the main program.

4 What is stop run and go back, what happens if go back is given instead of Stop run.

The following statements affect the state of a file differently:

An EXIT PROGRAM statement does not change the status of any of the files in a run unit unless:

o The ILE COBOL program issuing the EXIT PROGRAM has the INITIAL attribute. If it has the INITIAL attribute, then all internal files defined in that program are closed.


o An EXIT PROGRAM statement with the AND CONTINUE RUN UNIT phrase is issued in the main program of a *NEW activation group. In this case, control returns from the main program to the caller, which, in turn, causes the *NEW activation group to end, closing all of the files scoped to the activation group.

A STOP RUN statement returns control to the caller of the program at the nearest control boundary. If this is a hard control boundary, the activation group (run unit) will end, and all files scoped to the activation group will be closed.

A GOBACK statement issued from a main program (which is always at a hard control boundary) behaves the same as the STOP RUN statement. A GOBACK statement issued from a subprogram behaves the same as the EXIT PROGRAM statement. It does not change the status of any of the files in a run unit unless the ILE COBOL program issuing the GOBACK has the INITIAL attribute. If it has the INITIAL attribute, then all internal files defined in that program are closed.

A CANCEL statement resets the storage that contains information about the internal file. If the program has internal files that are open when the CANCEL statement is processed, those internal files are closed when that program is canceled. The program can no longer use the files unless it reopens them. If the canceled program is called again, the program considers the file closed. If the program opens the file, a new linkage to the file is established.

For a standalone program, it is the same. But STOP RUN from a called program will not return control to the calling program.

5 What is Linkage section, what is the important parameter in it?

This section allows a COBOL program to receive values from JCL. Also, if you are calling a sub-program and some values have to be passed between the programs, this is the section that should be used. In short, this section allows a value to be passed into a COBOL program from JCL as well as from a linked program; hence the name Linkage Section.

Handling parameters in a COBOL procedure

Each parameter to be accepted or passed by a procedure must be declared in the LINKAGE SECTION. For example, this code fragment comes from a procedure that accepts two IN parameters (one CHAR(15) and one INT), and passes an OUT parameter (an INT):

    LINKAGE SECTION.
    01 IN-SPERSON   PIC X(15).
    01 IN-SQTY      PIC S9(9) USAGE COMP-5.
    01 OUT-SALESSUM PIC S9(9) USAGE COMP-5.

Ensure that the COBOL data types you declare map correctly to SQL data types. For a detailed list of data type mappings between SQL and COBOL, see "Supported SQL Data Types in COBOL".

Each parameter must then be listed in the PROCEDURE DIVISION. The following example shows a PROCEDURE DIVISION that corresponds to the parameter definitions from the previous LINKAGE SECTION example.

    PROCEDURE DIVISION USING IN-SPERSON
                             IN-SQTY
                             OUT-SALESSUM.

The LINKAGE SECTION is used to access data that is external to the program. JCL can send a maximum of 100 characters to a program through PARM. To receive PARM data, the linkage section MUST be coded with a halfword binary length field before the actual data field. If the length field is not coded, the first two bytes of the field coded in the linkage section are filled with the length, so 2 bytes of the actual data may be truncated.

    01 LK-DATA.
       05 LK-LENGTH   PIC S9(04) COMP.
       05 LK-VARIABLE PIC X(08).

6 Is Using clause mandatory while calling sub program.

NO

7 What is REDEFINES.

The REDEFINES clause allows you to use different data description entries to describe the same computer storage area. Redefining declaration should immediately follow the redefined item and should be done at the same level. Multiple redefinitions are possible. Size of redefined and redefining need not be the same.

Example:
    01 WS-DATE PIC 9(06).
    01 WS-REDEF-DATE REDEFINES WS-DATE.
       05 WS-YEAR PIC 9(02).
       05 WS-MON  PIC 9(02).
       05 WS-DAY  PIC 9(02).

8 What will be the value in WS-B2?

    01 WS-A PIC X(30) VALUE 'ABCDEFGHIJKLMNOPQRSTUVWXYZ '.
    01 WS-B REDEFINES WS-A.
       05 WS-B1 PIC X(10).
       05 WS-B2 PIC 9(10).
       05 WS-B3 PIC X(10).

KLMNOPQRST
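The answer follows from the byte positions alone: a REDEFINES item is just another view of the same 30 bytes. The sketch below models that in Python with string slicing; the helper redefine is hypothetical, and the third field is called WS-B3 here so that the names are unique.

```python
def redefine(storage, layout):
    """Split one storage area into consecutive named fields.

    layout is a list of (name, length) pairs, like the 05 levels of a
    REDEFINES group; each field is a view of the same underlying bytes.
    """
    fields = {}
    offset = 0
    for name, length in layout:
        fields[name] = storage[offset:offset + length]
        offset += length
    return fields


# PIC X(30): the 26 letters are padded with spaces on the right.
WS_A = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".ljust(30)
WS_B = redefine(WS_A, [("WS-B1", 10), ("WS-B2", 10), ("WS-B3", 10)])
```

WS-B2 covers bytes 11 to 20 of WS-A, which hold the letters K to T, even though its picture is numeric.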

9 What is COMP SYNC

SYNC (SYNCHRONIZED) causes the item to be aligned on its natural boundary. It can be SYNCHRONIZED LEFT or RIGHT.

For binary data items, address resolution is faster if they are located at word boundaries in memory. For example, on the mainframe the memory word size is 4 bytes, which means that each word starts at an address divisible by 4. If my first variable is X(3) and the next one is S9(4) COMP, then without the SYNC clause the S9(4) COMP item starts at byte 3 (assuming that addresses start from 0). If you specify SYNC, the binary data item starts at address 4. Some memory is wasted, but access to the computational field is faster.
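The offsets in this example can be reproduced with a small layout calculation. This is a minimal Python sketch, assuming that S9(4) COMP occupies a halfword (2 bytes) and is aligned on a 2-byte boundary when SYNC is coded; the function name layout and the field names are made up for the illustration.

```python
def layout(fields, sync=False):
    """Compute the starting offset of each field in a record.

    fields is a list of (name, size, alignment) tuples. With sync=True,
    each field is moved up to the next multiple of its alignment, and the
    skipped bytes are slack (wasted) bytes.
    """
    offsets = {}
    pos = 0
    for name, size, alignment in fields:
        if sync and alignment > 1:
            pos = (pos + alignment - 1) // alignment * alignment
        offsets[name] = pos
        pos += size
    return offsets


# X(3) followed by S9(4) COMP (assumed: 2 bytes, halfword alignment)
RECORD = [("WS-X", 3, 1), ("WS-BIN", 2, 2)]
```

Without SYNC the binary item starts at offset 3; with SYNC one slack byte is inserted and it starts at offset 4.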

10 What happens if I give 01 PIC X(10).

It is equivalent to FILLER

11 What happen if I initialize FILLER and OCCUR.

While initializing, FILLER and OCCURS DEPENDING ON items are not affected.

12 What is Subscript and Index, what is the difference?

Sl #  Subscript compared with Index

1  Subscript: Working-storage item.
   Index:     Internal item; no need to declare it.

2  Subscript: It means occurrence.
   Index:     It means displacement.

3  Subscript: The occurrence is in turn translated to a displacement to access the elements, so it is slower than INDEX access.
   Index:     Faster and efficient.

4  Subscript: It can be used in any arithmetic operation or for display.
   Index:     It cannot be used for arithmetic operations or for display purposes.

5  Subscript: Subscripts can be modified by any arithmetic statement.
   Index:     An INDEX can only be modified with SET, SEARCH and PERFORM statements.

13 Can we change Subscript value at run time? If yes give syntax.

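YES. A subscript is an ordinary working-storage item, so it can be changed with any arithmetic or MOVE statement, for example ADD 1 TO WS-SUB (WS-SUB being the subscript item).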

14 Can we change index value at run time, if yes give syntax.

YES, but only with the SET, SEARCH and PERFORM statements, for example SET index-name-1 UP BY 1.

15 What is search and search all, syntax for Search and Search All.

Sl #  Sequential SEARCH compared with Binary SEARCH

1  Sequential: SEARCH
   Binary:     SEARCH ALL

2  Sequential: Table should have an INDEX.
   Binary:     Table should have an INDEX.

3  Sequential: Table need not be in sorted order.
   Binary:     Table should be in sorted order of the searching argument. There should be an ASCENDING/DESCENDING KEY clause.

4  Sequential: Multiple WHEN conditions can be coded.
   Binary:     Only one WHEN condition can be coded.

5  Sequential: Any logical comparison is possible.
   Binary:     Only = is possible. Only AND is possible in compound conditions.

6  Sequential: Index should be set to 1 before using SEARCH.
   Binary:     Index need not be set to 1 before SEARCH ALL.

7  Sequential: Prefer when the table size is small.
   Binary:     Prefer when the table size is significantly large.

Sequential SEARCH

During a serial SEARCH, the first entry of the table is searched. If the condition is met, the table look-up is completed. If the condition is not met, the index or subscript is incremented by one and the next entry is searched; the process continues until a match is found or the table has been completely searched.

    SET indexname-1 TO 1.
    SEARCH identifier-1
        AT END DISPLAY 'match not found:'
        WHEN condition-1 imperative-statement-1 / NEXT SENTENCE
        WHEN condition-2 imperative-statement-2 / NEXT SENTENCE
    END-SEARCH

Identifier-1 should be an OCCURS item and not an 01 item.
Condition-1 and condition-2 compare an input field or search argument with a table argument.
Though the AT END clause is optional, it is highly recommended to code it. If it is not coded and the element being looked for is not found, control simply passes to the next statement after SEARCH, where an invalid table item can be referred to, and that may lead to incorrect results or abnormal ends.

SET statement syntax:
    SET index-name-1 TO / UP BY / DOWN BY integer-1.

    01 TABLE-ONE.
       05 TABLE-ENTRY1 OCCURS 10 TIMES
             INDEXED BY TE1-INDEX.
          10 TABLE-ENTRY2 OCCURS 10 TIMES
                INDEXED BY TE2-INDEX.
             15 TABLE-ENTRY3 OCCURS 5 TIMES
                   ASCENDING KEY IS KEY1
                   INDEXED BY TE3-INDEX.
                20 KEY1 PIC X(5).
                20 KEY2 PIC X(10).
    . . .
    PROCEDURE DIVISION.
    . . .
        SET TE1-INDEX TO 1
        SET TE2-INDEX TO 4
        SET TE3-INDEX TO 1
        MOVE "A1234" TO KEY1 (TE1-INDEX, TE2-INDEX, TE3-INDEX + 2)
        MOVE "AAAAAAAA00" TO KEY2 (TE1-INDEX, TE2-INDEX, TE3-INDEX + 2)
    . . .
        SEARCH TABLE-ENTRY3
            AT END
                MOVE 4 TO RETURN-CODE
            WHEN TABLE-ENTRY3(TE1-INDEX, TE2-INDEX, TE3-INDEX) = "A1234AAAAAAAA00"
                MOVE 0 TO RETURN-CODE
        END-SEARCH

Values after execution:
TE1-INDEX = 1
TE2-INDEX = 4
TE3-INDEX points to the TABLE-ENTRY3 item that equals "A1234AAAAAAAA00"
RETURN-CODE = 0

Binary SEARCH

When the table is large and is arranged in some sequence, either ascending or descending on the search field, a BINARY SEARCH is the efficient method.

    SEARCH ALL identifier-1
        AT END imperative-statement-1
        WHEN dataname-1 = identifier-2 / literal-1 / arithmetic-expression-1
         AND dataname-2 = identifier-3 / literal-2 / arithmetic-expression-2
    END-SEARCH.

Dataname-1 and dataname-2 are key fields of the table, named in its ASCENDING/DESCENDING KEY clause and subscripted by the table index; identifier-2 and identifier-3 are the search arguments. The item to be searched is compared with the item at the center of the table. If it matches, the search ends; otherwise the process is repeated with the left or right half, depending on where the item lies.

    01 TABLE-A.
       05 TABLE-ENTRY OCCURS 90 TIMES
             ASCENDING KEY-1, KEY-2
             DESCENDING KEY-3
             INDEXED BY INDX-1.
          10 PART-1 PIC 99.
          10 KEY-1  PIC 9(5).
          10 PART-2 PIC 9(6).
          10 KEY-2  PIC 9(4).
          10 PART-3 PIC 9(18).
          10 KEY-3  PIC 9(5).

You can search this table using the following instructions:

    SEARCH ALL TABLE-ENTRY
        AT END
            PERFORM NOENTRY
        WHEN KEY-1 (INDX-1) = VALUE-1 AND
             KEY-2 (INDX-1) = VALUE-2 AND
             KEY-3 (INDX-1) = VALUE-3
            MOVE PART-1 (INDX-1) TO OUTPUT-AREA
    END-SEARCH
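The difference between SEARCH and SEARCH ALL is the difference between a linear scan and a binary search. The following Python sketch is an analogy, not COBOL semantics in every detail: the function names are hypothetical, positions are 1-based like a COBOL index, and each function also counts how many table entries it compared.

```python
def serial_search(table, key):
    """Like SEARCH: test the entries one by one, starting at the first."""
    comparisons = 0
    for position, entry in enumerate(table, start=1):
        comparisons += 1
        if entry == key:
            return position, comparisons
    return None, comparisons  # corresponds to the AT END path


def binary_search(table, key):
    """Like SEARCH ALL: halve an ascending table until the key is found."""
    low, high = 0, len(table) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        if table[mid] == key:
            return mid + 1, comparisons
        if table[mid] < key:
            low = mid + 1   # key can only be in the right half
        else:
            high = mid - 1  # key can only be in the left half
    return None, comparisons  # corresponds to the AT END path
```

On a sorted table of 90 entries, the serial search may need up to 90 comparisons for the last entry, while the binary search needs at most 7.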

16 What is internal sort, explain with syntax.

COBOL sort is known as internal sort.

    SORT SORTFILE ON ASCENDING/DESCENDING KEY sd-key-1, sd-key-2
        USING file1 file2 / INPUT PROCEDURE IS section-1
        GIVING file3 / OUTPUT PROCEDURE IS section-2.

17 Can I display comp-3 and comp values.

YES

18 What is SOC7 Error, SOC4 Error?

An S0C4 abend (often written SOC4) may be due to the following reasons.
1. Missing SELECT statement (during compile time)
2. Bad subscript/index
3. Read/write attempt to an unopened file
4. Move of data to/from an unopened file
5. Missing parameters in a called subprogram

An S0C7 abend (often written SOC7) may be due to the following reasons.
1. Numeric operation on non-numeric data
2. Coding past the maximum allowed subscript
3. Uninitialized working storage

19 What is SSRANGE AND NOSSRANGE?

SSRANGE is a compiler option used to handle an array overflow. Default Compiler Option is NOSSRANGE and hence when required to handle the overflow condition, SSRANGE Compiler Option needs to be specified in the COBOL Program.


20 What is Evaluate Statement, Can we Specify the when statements in any order.

No we cannot specify the when statement in any order.

With COBOL85, we use the EVALUATE verb to implement the case structure of other languages. Multiple IF statements can be efficiently and effectively replaced with EVALUATE statement. After the execution of one of the when clauses, the control is automatically come to the next statement after the END-EVALUATE. Any complex condition can be given in the WHEN clause. Break statement is not needed, as it is so in other languages.

General Syntax

    EVALUATE subject-1 (ALSO subject2..)
        WHEN object-1 (ALSO object2..)
        WHEN object-3 (ALSO object4..)
        WHEN OTHER imperative statement
    END-EVALUATE

1. The number of subjects in the EVALUATE clause should be equal to the number of objects in every WHEN clause.
2. A subject can be a variable, an expression or the keyword TRUE/FALSE, and the corresponding objects can be values, TRUE/FALSE or any condition.
3. If none of the WHEN conditions is satisfied, then the WHEN OTHER path will be executed.

Sample

    EVALUATE SQLCODE ALSO TRUE
        WHEN 100 ALSO A=B
            imperative statement
        WHEN -305 ALSO (A/C=4)
            imperative statement
        WHEN OTHER
            imperative statement
    END-EVALUATE
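To see why the order of the WHEN clauses matters, here is a small Python sketch of the "first satisfied branch wins" rule. It is only an analogy: first_match, the score variable and the branch labels are hypothetical names, and unlike COBOL, Python evaluates every condition in the list up front before the loop starts.

```python
def first_match(branches, other):
    """Return the result of the first branch whose condition is true.

    branches is a list of (condition, result) pairs, tried top to bottom,
    in the same spirit as the WHEN clauses of an EVALUATE TRUE statement.
    """
    for condition, result in branches:
        if condition:
            return result
    return other  # plays the role of WHEN OTHER


score = 85

# The more specific test is coded first, so it gets the chance to match.
specific_first = first_match(
    [(score >= 80, "distinction"), (score >= 50, "pass")], "fail"
)

# Same tests, other order: the general test matches first and hides the second.
general_first = first_match(
    [(score >= 50, "pass"), (score >= 80, "distinction")], "fail"
)

print(specific_first, general_first)
```

The two calls differ only in the order of the branches, yet they give different answers for the same score, which is the reason overlapping WHEN clauses cannot be coded in an arbitrary order.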

21 What is Continue and next sentence?

CONTINUE is no operation statement. The control is just passed to next STATEMENT (One or more valid words and clauses). NEXT SENTENCE passes the control to the next SENTENCE (One or more statements terminated by a period)

22 Can we display value in 88 level.

NO. This level is known as a CONDITION name and is identified by the special level number 88. A condition name specifies the values that a field can contain and is used as an abbreviation in condition checking.

    01 SEX PIC X.
       88 MALE   VALUE '1'.
       88 FEMALE VALUE '2' '3'.

IF SEX = '1' can also be coded as IF MALE in the Procedure Division. SET FEMALE TO TRUE moves the value '2' to SEX. If multiple values are coded in the VALUE clause, the first value is moved when the condition name is set to true.

23 How can you declare a file in COBOL Pgm, without assigning any DDNAME to it in JCL.

    SELECT [OPTIONAL] FILENAME ASSIGN TO DDNAME    => all files

SELECT statement, OPTIONAL clause: This can be coded only for files opened in INPUT, I-O or EXTEND mode. If OPTIONAL is not coded, the input file is expected to be present in the JCL; if it is not, an execution error occurs. If OPTIONAL is coded and the file is not mapped in the JCL, it is treated as an empty file and the first read results in end of file. The file can also be dynamically allocated instead of statically allocated in the JCL.

24 What happens if I don't assign a file in the JCL but the file is defined in the COBOL pgm? Will the job run, and if so, why? If an error occurs, how will you overcome it?

A COBOL pgm looks for the file's DD statement in the JCL only when the OPEN statement for that file is executed. So a file that is declared in the COBOL pgm but never used does not require a DD statement in the JCL. Declaring a file in COBOL without coding a DD statement for it causes no problem as long as no OPEN statement for that file is executed.

25 What is Recording format of a LOAD LIBRARY.

U - undefined-length records

26 What is the record length for report file?

133 bytes

SQLCA

Name Data Type Field Values

sqlcaid CHAR(8) An "eye catcher" for storage dumps containing 'SQLCA'. The sixth byte is 'L' if line number information is returned from parsing an SQL procedure body.

sqlcabc INTEGER Contains the length of the SQLCA, 136.

sqlcode INTEGER Contains the SQL return code.

    Code       Means
    0          Successful execution (although one or more SQLWARN indicators may be set).
    positive   Successful execution, but with a warning condition.
    negative   Error condition.

sqlerrml SMALLINT Length indicator for sqlerrmc, in the range 0 through 70. 0 means that the value of sqlerrmc is not relevant.

sqlerrmc VARCHAR (70) Contains one or more tokens, separated by X'FF', which are substituted for variables in the descriptions of error conditions.

This field is also used when a successful connection is completed.

When a NOT ATOMIC compound SQL statement is issued, it may contain information on up to seven errors.

The last token might be followed by X'FF'. The sqlerrml value will include any trailing X'FF'.

sqlerrp CHAR(8) Begins with a three-letter identifier indicating the product, followed by five digits indicating the version, release, and modification level of the product. For example, SQL08010 means DB2 Universal Database Version 8 Release 1 Modification level 0.

If SQLCODE indicates an error condition, this field identifies the module that returned the error.

This field is also used when a successful connection is completed.

sqlerrd ARRAY Six INTEGER variables that provide diagnostic information. These values are generally empty if there are no errors, except for sqlerrd(6) from a partitioned database.

sqlerrd(1) INTEGER If connection is invoked and successful, contains the maximum expected difference in length of mixed character data (CHAR data types) when converted to the database code page from the application code page. A value of 0 or 1 indicates no expansion; a value greater than 1 indicates a possible expansion in length; a negative value indicates a possible contraction. On successful return from an SQL procedure, contains the return status value from the SQL procedure.

sqlerrd(2) INTEGER If connection is invoked and successful, contains the maximum expected difference in length of mixed character data (CHAR data types) when converted to the application code page from the database code page. A value of 0 or 1 indicates no expansion; a value greater than 1 indicates a possible expansion in length; a negative value indicates a possible contraction. If the SQLCA results from a NOT ATOMIC compound SQL statement that encountered one or more errors, the value is set to the number of statements that failed.

sqlerrd(3) INTEGER If PREPARE is invoked and successful, contains an estimate of the number of rows that will be returned. After INSERT, UPDATE, DELETE, or MERGE, contains the actual number of rows that qualified for the operation. If compound SQL is invoked, contains an accumulation of all sub-statement rows. If CONNECT is invoked, contains 1 if the database can be updated, or 2 if the database is read only.

If the OPEN statement is invoked, and the cursor contains SQL data change statements, this field contains the sum of the number of rows that qualified for the embedded insert, update, delete, or merge operations.

If CREATE PROCEDURE for an SQL procedure is invoked, and an error is encountered when parsing the SQL procedure body, contains the line number where the error was encountered. The sixth byte of sqlcaid must be 'L' for this to be a valid line number.

sqlerrd(4) INTEGER If PREPARE is invoked and successful, contains a relative cost estimate of the resources required to process the statement. If compound SQL is invoked, contains a count of the number of successful sub-statements. If CONNECT is invoked, contains 0 for a one-phase commit from a down-level client; 1 for a one-phase commit; 2 for a one-phase, read-only commit; and 3 for a two-phase commit.

sqlerrd(5) INTEGER Contains the total number of rows deleted, inserted, or updated as a result of both:

The enforcement of constraints after a successful delete operation

The processing of triggered SQL statements from activated triggers

If compound SQL is invoked, contains an accumulation of the number of such rows for all sub-statements. In some cases, when an error is encountered, this field contains a negative value that is an internal error pointer. If CONNECT is invoked, contains an authentication type value of 0 for server authentication; 1 for client authentication; 2 for authentication using DB2 Connect; 4 for SERVER_ENCRYPT authentication; 5 for authentication using DB2 Connect with encryption; 7 for KERBEROS authentication; 8 for KRB_SERVER_ENCRYPT authentication; 9 for GSSPLUGIN authentication; 10 for GSS_SERVER_ENCRYPT authentication; and 255 for unspecified authentication.

sqlerrd(6) INTEGER For a partitioned database, contains the partition number of the partition that encountered the error or warning. If no errors or warnings were encountered, this field contains the partition number of the coordinator node. The number in this field is the same as that specified for the partition in the db2nodes.cfg file.

sqlwarn Array A set of warning indicators, each containing a blank or W. If compound SQL is invoked, contains an accumulation of the warning indicators set for all sub-statements.

sqlwarn0 CHAR(1) Blank if all other indicators are blank; contains W if at least one other indicator is not blank.

sqlwarn1 CHAR(1) Contains W if the value of a string column was truncated when assigned to a host variable. Contains N if the null terminator was truncated. Contains A if the CONNECT or ATTACH is successful, and the authorization name for the connection is longer than 8 bytes. Contains P if the PREPARE statement relative cost estimate stored in sqlerrd(4) exceeded the value that could be stored in an INTEGER or was less than 1, and either the CURRENT EXPLAIN MODE or the CURRENT EXPLAIN SNAPSHOT special register is set to a value other than NO.

sqlwarn2 CHAR(1) Contains W if null values were eliminated from the argument of a function (see the note that follows the sqlstate entry).

sqlwarn3 CHAR(1) Contains W if the number of columns is not equal to the number of host variables. Contains Z if the number of result set locators specified on the ASSOCIATE LOCATORS statement is less than the number of result sets returned by a procedure.

sqlwarn4 CHAR(1) Contains W if a prepared UPDATE or DELETE statement does not include a WHERE clause.

sqlwarn5 CHAR(1) Reserved for future use.

sqlwarn6 CHAR(1) Contains W if the result of a date calculation was adjusted to avoid an impossible date.

sqlwarn7 CHAR(1) Reserved for future use. If CONNECT is invoked and successful, contains 'E' if the DYN_QUERY_MGMT database configuration parameter is enabled.

sqlwarn8 CHAR(1) Contains W if a character that could not be converted was replaced with a substitution character.

sqlwarn9 CHAR(1) Contains W if arithmetic expressions with errors were ignored during column function processing.

sqlwarn10 CHAR(1) Contains W if there was a conversion error when converting a character data value in one of the fields in the SQLCA.

sqlstate CHAR(5) A return code that indicates the outcome of the most recently executed SQL statement.

Note on sqlwarn2: Some functions may not set SQLWARN2 to W, even though null values were eliminated, because the result was not dependent on the elimination of null values.
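The SQLCA fields described above can be read mechanically. The following Python sketch is only an illustration of three of the rules in the table: the sign of sqlcode, the X'FF' separators in sqlerrmc, and the product/version layout of sqlerrp. The function names are hypothetical, the values are passed in as plain Python values rather than read from a real SQLCA, and parse_sqlerrp assumes the field holds a product signature such as SQL08010 (after an error it identifies a module instead, so the parse would not apply).

```python
def classify_sqlcode(sqlcode: int) -> str:
    """Map sqlcode to the three outcomes listed in the table."""
    if sqlcode == 0:
        return "success"
    return "warning" if sqlcode > 0 else "error"


def split_sqlerrmc(sqlerrmc: bytes, sqlerrml: int) -> list:
    """Return the message tokens held in sqlerrmc.

    Only the first sqlerrml bytes are relevant; tokens are separated by
    X'FF' and the last token may be followed by a trailing X'FF'.
    """
    used = sqlerrmc[:sqlerrml]
    if used.endswith(b"\xff"):
        used = used[:-1]
    return used.split(b"\xff") if used else []


def parse_sqlerrp(sqlerrp: str) -> dict:
    """Split a product signature into a three-letter product and five digits."""
    product, digits = sqlerrp[:3], sqlerrp[3:8]
    return {
        "product": product,
        "version": int(digits[0:2]),
        "release": int(digits[2:4]),
        "modification": int(digits[4]),
    }


tokens = split_sqlerrmc(b"TABLE1\xffCOL1\xff" + b" " * 58, 12)
signature = parse_sqlerrp("SQL08010")
print(classify_sqlcode(100), tokens, signature)
```

The sample token bytes are made up for the example; in a real program the tokens are in the application code page and would usually be decoded before being substituted into the message text.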

Introduction to JOINS

If there is one SQL construct that I believe has generated the most confusion since its overhaul in DB2 for OS/390® Version 6, it would have to be outer joins.

Version 6 expanded the capabilities for coding predicates within the ON clause, as well as introducing a host of other optimization and query rewrite enhancements. Enhancing the syntax has definitely increased the potential usage of outer joins, but this also means that there is more to understand. The syntax too has been aligned much more closely with its cousins on the UNIX®, Linux, Windows®, and OS/2® platforms, making it easier to be consistent with your SQL coding across the DB2 family.

In this article, which consists of two parts, I attempt to assemble a guide for coding outer joins to achieve two goals:

The most important goal, obtaining the correct result.
Secondly, consideration of the performance implications of coding your predicates in different ways.

Part 1 covers the simpler constructs of outer joins, providing a simple comparison of the effect of coding predicates in the ON or WHERE clause. In Part 2, I will cover the more complex topics such as outer join simplification and nesting of outer joins.

The examples in this article use extracts from the DB2 Universal Database (UDB) (non-OS/390) sample database. The data (in Figure 1) is a subset of the full tables. To cater for all outer join combinations, the row with PROJNO = 'IF2000' in the project table has been updated to set the DEPTNO = 'E01'.

For z/OS® and OS/390 users, the table names differ:

    DB2 on Workstation table names    DB2 for OS/390 and z/OS table names
    EMPLOYEE                          EMP
    DEPARTMENT                        DEPT
    PROJECT                           PROJ

Inner to outer joins

Inner joins
Outer join classification
Left outer join
Right outer join
Full outer join

Inner joins

For an inner join (or simple table join), only matched rows based on the join predicates are included in the result. Therefore, unmatched rows are not included.

In Figure 2, when joining the Project and Department tables on the DEPTNO column, the row with DEPTNO = 'E01' in the Project (left) table does not have a matched row in the Department table, and is therefore not returned in the result. Similarly, the row with DEPTNO = 'A00' in the Department (right) table is also unmatched and not returned.

This example uses the "explicit join" syntax, whereby the keywords "INNER JOIN" (or simply JOIN) are coded between the joined tables. The join predicates are coded in the ON clause. Although this is not mandatory syntax for inner joins, it is for outer joins, and it is therefore good programming practice for consistency. There are a number of other reasons to consider this syntax:

It is more descriptive than simply coding a "comma" in the FROM clause to separate tables. This is important as queries become larger.

It forces the join predicates (ON clause) to be coded after each join, which means you are less likely to forget to code join predicates.

It is easy to determine which join predicates belong to what tables.

An inner join can be easily converted to an outer join if required.
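The inner join behaviour described above can be tried out with a few rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for DB2, so it shows only the result of the query, not DB2's access path. The three project rows and three department rows are an assumed subset chosen to match the text (a project in 'E01' with no department, and department 'A00' with no project); they are not the contents of Figure 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project    (projno TEXT, deptno TEXT);
    CREATE TABLE department (deptno TEXT, deptname TEXT);
    -- Assumed sample rows, chosen to match the unmatched values in the text.
    INSERT INTO project VALUES
        ('AD3100', 'D01'), ('IF1000', 'C01'), ('IF2000', 'E01');
    INSERT INTO department VALUES
        ('A00', 'SPIFFY COMPUTER SERVICE DIV.'),
        ('C01', 'INFORMATION CENTER'),
        ('D01', 'DEVELOPMENT CENTER');
""")

# Explicit join syntax: the join predicate sits in the ON clause.
inner_rows = conn.execute("""
    SELECT p.projno, p.deptno, d.deptname
    FROM project p
    INNER JOIN department d ON p.deptno = d.deptno
    ORDER BY p.projno
""").fetchall()

for row in inner_rows:
    print(row)
```

Only the two matched projects come back: the project in 'E01' and the department 'A00' both drop out of the result.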

And, finally, on the subject of inner joins, people often ask me: "Does it matter in what order I code my tables in the FROM clause?" For retrieving the correct result, the answer is "no." For performance, the answer is "generally, no." The DB2 optimizer evaluates all possible join permutations (sequences) and selects the most efficient one. However, to quote the DB2 UDB for OS/390 and z/OS Administration Guide: "The order of tables or views in the FROM CLAUSE can affect the access path." My interpretation of this statement is that if two (or more) different join sequences equate to the same cost, then the tie-breaker may be the table order in the FROM clause.

Outer join table classification

Before exploring outer join examples, it is important to first understand how we classify tables in the join.

Tables in the FROM clause of an outer join can be classified as either preserved row or NULL-supplying. The preserved row table refers to the table that preserves rows when there is no match in the join operation. Therefore, all rows from the preserved row table that qualify against the WHERE clause will be returned, regardless of whether there is a matched row in the join.

The preserved row table is:

The left table in a left outer join.
The right table in a right outer join.
Both tables in a full outer join.

The NULL-supplying table supplies NULLs when there is an unmatched row. Any column from the NULL-supplying table referred to in the SELECT list or subsequent WHERE or ON clause will contain NULL if there was no match in the join operation.

The NULL-supplying table is:

The right table in a left outer join.
The left table in a right outer join.
Both tables in a full outer join.

In a full outer join, both tables can preserve rows, and also can supply NULLs. This is significant, because there are rules that apply to purely preserved row tables that do not apply if the table can also supply NULLs.

The order of coding tables in the FROM clause can have extreme significance for left and right outer joins -- and also for outer joins involving more than two tables -- because preserved row and NULL-supplying tables behave differently when there is an unmatched row in the join.

Left outer join

Figure 3 shows a simple left outer join.

The left outer join returns those rows that exist in the left table and not in the right table (DEPTNO = 'E01'), plus the inner join rows. Unmatched rows are preserved from the preserved row table and are supplied with NULLs from the NULL-supplying table. That is, when the row is unmatched with the right table (DEPTNO = 'E01'), then the DEPTNO value is NULL-supplied from the Department table.

Note the select list includes DEPTNO from both the preserved row and NULL-supplying table. From the output you can see that it is important to select columns from the preserved row table where possible, otherwise the column value may not exist.
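A small runnable version of the left outer join may help here. As before, this is a sketch that uses Python's sqlite3 module in place of DB2, with an assumed subset of rows rather than the data of Figure 1. It selects DEPTNO from both tables so that the NULL supplied for the unmatched row is visible (Python shows a NULL as None).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project    (projno TEXT, deptno TEXT);
    CREATE TABLE department (deptno TEXT, deptname TEXT);
    -- Assumed sample rows, chosen to match the unmatched values in the text.
    INSERT INTO project VALUES
        ('AD3100', 'D01'), ('IF1000', 'C01'), ('IF2000', 'E01');
    INSERT INTO department VALUES
        ('A00', 'SPIFFY COMPUTER SERVICE DIV.'),
        ('C01', 'INFORMATION CENTER'),
        ('D01', 'DEVELOPMENT CENTER');
""")

# project is the preserved row table, department is the NULL-supplying table.
left_rows = conn.execute("""
    SELECT p.projno,
           p.deptno AS preserved_deptno,
           d.deptno AS null_supplied_deptno
    FROM project p
    LEFT OUTER JOIN department d ON p.deptno = d.deptno
    ORDER BY p.projno
""").fetchall()

for row in left_rows:
    print(row)
```

The row for project IF2000 keeps 'E01' in the column taken from the preserved row table but has no value in the column taken from the NULL-supplying table, which is why the join column is best selected from the preserved row table.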

Right outer join

The right outer join returns those rows that exist in the right table and not in the left table (DEPTNO = 'A00'), plus the inner join rows. Unmatched rows are preserved from the preserved row table and are supplied with NULLs from the NULL-supplying table.

For a right outer join, the right table becomes the preserved row table, and the left table is the NULL-supplying table. The DB2 for OS/390 and z/OS optimizer rewrites all right outer joins to become left outer joins, by simply inverting the tables in the FROM clause and by changing the keyword RIGHT to LEFT. This query rewrite can only be seen by the presence of the value "L" in the JOIN_TYPE column of the plan table. For this reason, you should avoid coding right outer joins to avoid confusion when you are interpreting the access path in the plan table.

Full outer joins

The full outer join returns those rows that exist in the right and not in the left (DEPTNO = 'A00'), plus the rows that exist in the left table and not in the right table (DEPTNO = 'E01'), and the inner join rows.

Both tables can supply NULLs but also preserve rows. However, the tables are identified as NULL-supplying because there are "query rewrite" and "WHERE clause predicate evaluation" rules that apply separately to NULL-supplying and to preserved row tables. I'll describe more about these differences in later examples.

In this example, both join columns have been selected to show that either table can supply NULL for unmatched rows.

To ensure that a non-NULL is always returned, code the COALESCE, VALUE, or IFNULL function, which returns the first argument that is not NULL, as shown here:

    COALESCE(P.DEPTNO,D.DEPTNO)
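The effect of COALESCE on a full outer join result can be sketched as follows. Two assumptions apply: Python's sqlite3 module stands in for DB2, and because older SQLite versions have no FULL OUTER JOIN, the full join is emulated here as a left outer join plus the department rows that have no project. The emulation is a substitute technique for the demonstration, not how DB2 processes a full outer join, and the rows are an assumed subset rather than the data of Figure 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project    (projno TEXT, deptno TEXT);
    CREATE TABLE department (deptno TEXT, deptname TEXT);
    -- Assumed sample rows, chosen to match the unmatched values in the text.
    INSERT INTO project VALUES
        ('AD3100', 'D01'), ('IF1000', 'C01'), ('IF2000', 'E01');
    INSERT INTO department VALUES
        ('A00', 'SPIFFY COMPUTER SERVICE DIV.'),
        ('C01', 'INFORMATION CENTER'),
        ('D01', 'DEVELOPMENT CENTER');
""")

# Emulated full outer join: left join rows, plus unmatched department rows.
full_rows = conn.execute("""
    SELECT p.projno                       AS projno,
           COALESCE(p.deptno, d.deptno)   AS deptno,
           p.deptno                       AS p_deptno,
           d.deptno                       AS d_deptno
    FROM project p
    LEFT OUTER JOIN department d ON p.deptno = d.deptno
    UNION ALL
    SELECT NULL, d.deptno, NULL, d.deptno
    FROM department d
    WHERE NOT EXISTS (SELECT 1 FROM project p WHERE p.deptno = d.deptno)
    ORDER BY 2
""").fetchall()

for row in full_rows:
    print(row)
```

In every row the second column has a value, even where one of the two source columns is NULL: 'A00' comes from the department side and 'E01' from the project side.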

Outer join predicate types

Before-join predicates
During-join predicates
After-join predicates

In releases before DB2 for OS/390 Version 6, predicates could be applied only before the join, or totally after the join. Version 6 introduced the concepts of "during-join" predicates and "after-join-step" predicates.

DB2 can apply before-join predicates before the join to limit the number of rows that are joined to subsequent tables. These "local," or "table access," predicates are evaluated as regular indexable, stage 1 or stage 2 predicates on the outer table of a pairwise join. Pairwise join is the term used to describe each join step of two or more tables. For example, a row from table 1 and table 2 is joined, and the result is joined to table 3. Each join only joins rows from two tables at a time.

During-join predicates are those coded in the ON clause. For all but full outer joins, these predicates can be evaluated as regular indexable, stage 1 or stage 2 predicates (similar to before-join predicates) on the inner table of a pairwise nested loop or hybrid join. For a full outer join, or any join using a merge scan join, these predicates are applied at stage 2, where the physical joining of rows occurs.

After-join-step predicates can be applied between joins. These are applied after the join in which all columns of the where clause predicate become available (simple or complex predicate separated by OR), and before any subsequent joins.

Totally-after-join predicates are dependent on all joins occurring before they can be applied.

Before-join predicates

Before DB2 for OS/390 Version 6, DB2 had a limited ability to push down WHERE clause predicates for application before the join. Therefore, to ensure a WHERE clause predicate was applied before the join, you had to code the predicate in a nested table expression. This not only added complexity to achieve acceptable performance, but the nested table expression required the additional overhead of materializing the result before the join.

From Version 6 onwards, DB2 can merge the nested table expression into a single query block, and thus avoids any unnecessary materialization. DB2 aggressively merges any nested table expression based upon the standard materialization rules listed in the Administration Guide or the Application Programming and SQL Guide.

Instead of coding these in nested table expressions, the predicates can now be coded in the WHERE clause as shown in Figure 7.

The rule for before-join predicates coded in a WHERE clause is that they must apply to the preserved row table only; or to be more specific, the WHERE clause must not apply to a NULL-supplying table. This means you no longer have to code these predicates in nested table expressions.
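The two ways of coding a before-join predicate on the preserved row table can be compared side by side. The sketch below again uses Python's sqlite3 module as a stand-in for DB2 with an assumed subset of rows, and the predicate on PROJNO is a hypothetical example. It demonstrates only that both forms return the same rows; it says nothing about when DB2 applies the predicate or about materialization.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project    (projno TEXT, deptno TEXT);
    CREATE TABLE department (deptno TEXT, deptname TEXT);
    -- Assumed sample rows, chosen to match the unmatched values in the text.
    INSERT INTO project VALUES
        ('AD3100', 'D01'), ('IF1000', 'C01'), ('IF2000', 'E01');
    INSERT INTO department VALUES
        ('A00', 'SPIFFY COMPUTER SERVICE DIV.'),
        ('C01', 'INFORMATION CENTER'),
        ('D01', 'DEVELOPMENT CENTER');
""")

# Predicate on the preserved row table, coded in the WHERE clause.
where_form = conn.execute("""
    SELECT p.projno, p.deptno, d.deptname
    FROM project p
    LEFT OUTER JOIN department d ON p.deptno = d.deptno
    WHERE p.projno LIKE 'IF%'
    ORDER BY p.projno
""").fetchall()

# The same predicate coded inside a nested table expression.
nested_form = conn.execute("""
    SELECT p.projno, p.deptno, d.deptname
    FROM (SELECT projno, deptno FROM project WHERE projno LIKE 'IF%') p
    LEFT OUTER JOIN department d ON p.deptno = d.deptno
    ORDER BY p.projno
""").fetchall()

print(where_form == nested_form, where_form)
```

Both queries keep the unmatched project IF2000, because the predicate refers only to the preserved row table and so does not reject the NULL-supplied row.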

For a full outer join, neither table can be identified as only being a preserved row table, and of course, both are NULL-supplying. For NULL-supplying tables, the risks of coding predicates in the WHERE clause are that they will either be applied totally after the join or will cause simplification of the outer join (which I will talk about in Part 2). To apply the predicates before the join, you must code them in nested table expressions as shown in Figure 8.

Because they limit the number of rows that will be joined, before-join predicates are the most efficient of the predicate types described here. If you begin with a 5-million row table, which returns one row after the WHERE clause is applied, it is obviously more efficient to apply the predicate before joining the one row. The other alternative, which is not efficient, is to join 5 million rows, and then apply the predicate to produce a result of one row.

During-join predicates

Coding join predicates in the ON clause is mandatory for outer joins. In DB2 for OS/390 Version 6 and later, you can also code expressions, or "column-to-literal" comparisons (such as DEPTNO = 'D01'), in the ON clause. However, coding expressions in the ON clause can produce very different results from those same expressions coded in a WHERE clause.

This is because predicates in the ON clause, or during-join predicates, do not limit the result rows that are returned; they only limit which rows are joined. Only WHERE clause predicates limit the number of rows that are actually retrieved.

Figure 9 demonstrates the result of coding an expression in the ON clause of a left outer join. This is not the result expected by most people when coding this type of query.

In this example, because there are no WHERE clause predicates to limit the result, all rows of the preserved row (left) table are returned. But the ON clause dictates that the join only occurs when both P.DEPTNO = D.DEPTNO AND P.DEPTNO = 'D01'. When the ON clause is false (that is, P.DEPTNO <> 'D01'), then the row is supplied NULLs for those columns selected from the NULL-supplying table. Similarly, when P.DEPTNO is 'E01', then the first element of the ON clause fails, and the row from the left table is preserved, and the null is supplied from the right table.

When DB2 accesses the first table and determines that the ON clause will fail (such as when P.DEPTNO <> 'D01'), then to improve performance, DB2 immediately supplies NULL for the NULL-supplying table columns without even attempting to join the row.
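The difference between coding P.DEPTNO = 'D01' in the ON clause and coding it in the WHERE clause is easiest to see by running both. This sketch uses Python's sqlite3 module as a stand-in for DB2, with an assumed subset of rows rather than the data of Figure 1, so it shows the result rows only and not the optimization described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project    (projno TEXT, deptno TEXT);
    CREATE TABLE department (deptno TEXT, deptname TEXT);
    -- Assumed sample rows, chosen to match the unmatched values in the text.
    INSERT INTO project VALUES
        ('AD3100', 'D01'), ('IF1000', 'C01'), ('IF2000', 'E01');
    INSERT INTO department VALUES
        ('A00', 'SPIFFY COMPUTER SERVICE DIV.'),
        ('C01', 'INFORMATION CENTER'),
        ('D01', 'DEVELOPMENT CENTER');
""")

# During-join predicate: limits which rows are joined, not which are returned.
on_rows = conn.execute("""
    SELECT p.projno, p.deptno, d.deptname
    FROM project p
    LEFT OUTER JOIN department d
        ON p.deptno = d.deptno AND p.deptno = 'D01'
    ORDER BY p.projno
""").fetchall()

# WHERE clause predicate: limits the rows that are returned.
where_rows = conn.execute("""
    SELECT p.projno, p.deptno, d.deptname
    FROM project p
    LEFT OUTER JOIN department d ON p.deptno = d.deptno
    WHERE p.deptno = 'D01'
    ORDER BY p.projno
""").fetchall()

print(len(on_rows), len(where_rows))
```

With the predicate in the ON clause all three project rows come back, and only the 'D01' row carries a department name. With the predicate in the WHERE clause a single row is returned.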

Now let's talk about during-join predicates for a full outer join. The rules for the ON clause are the same for full joins as for left and right outer joins: the predicates in the ON clause do not limit the resultant rows which are returned, only which rows are joined.

For the example in Figure 10, because there are no WHERE clause predicates to limit the result and because both tables of a FULL JOIN preserve rows, then all rows of the left and right tables are returned. But the ON clause dictates that the join only occurs when P.DEPTNO = 'D01'. When the ON clause is false (that is, P.DEPTNO <> 'D01'), then the row is supplied NULLs for those columns selected from the opposite table to the table whose row is being preserved.

Note: This syntax is non-OS/390 only because OS/390 does not permit expressions in the ON clause of a full outer join.

To make the non-OS/390 query comply with DB2 for OS/390 syntax, we must first derive the expression as a column within a nested table expression, and then perform the join. By first deriving the column DEPT2 as 'D01' in Figure 11, the ON clause effectively becomes a join only when P.DEPTNO = 'D01'.

After-join predicates

Figure 12 contains a query with both after-join-step and totally-after-join predicates.

The first compound predicate in the WHERE clause refers only to tables D and E (D.MGRNO = E.EMPNO OR E.EMPNO IS NULL). Therefore, if the join sequence chosen by the optimizer mimics the coding of the SQL, then DB2 can apply the WHERE clause predicate after the join between D and E, and before the join to P. However, the second compound predicate in the WHERE clause refers to tables D and P (D.MGRNO = P.RESPEMP OR P.RESPEMP IS NULL).

These are the first and third tables in the join sequence. This predicate cannot be applied therefore until the third table is joined, which is the final table in the join sequence. Hence this is referred to as a totally-after-join predicate.

It is likely that an after-join-step predicate may revert to a totally-after-join predicate if the table join sequence alters, which is possible given that the DB2 OS/390 optimizer can reorder the table join sequence based upon the lowest cost access path. Given that DB2 is able to apply the predicate as early as possible in between joins to limit the rows required for subsequent joins, then you should also attempt to code your predicates such that DB2 is able to apply them as early in the join sequence as possible.

Conclusion

In this article, I described several topics:

The order of tables in the FROM clause and the effect on inner and outer joins
The differences between these types of joins
The different predicate types.

To recap, WHERE clause predicates that are applied to the preserved row table can filter rows as either:

Before-join predicates
After-join-step or totally-after-join predicates.

If these predicates are currently coded in a nested table expression, you can now write them in the WHERE clause. Before-join predicates are the most efficient predicates, because they limit the number of rows before the join. After-join-step predicates also limit the number of rows for subsequent joins. Totally-after-join predicates are the least efficient, since filtering occurs completely after all joins have taken place.

Predicates in the ON clause are the biggest surprise, because they only filter rows on the NULL-supplying table as during-join predicates. They do not filter rows on the preserved row table, as WHERE clause predicates do.

In Part 2 of this article, I will describe what happens if WHERE clause predicates are coded against the NULL-supplying table.

I hope that this article has given you some insight into outer joins and has given you some clues on how to solve the mystery of where to code your outer join predicates.

FILE STATUS (error handling) in COBOL

A number of errors can result from file input/output that the programmer may wish to deal with in order to avoid unexpected program termination.

Run time errors can arise quite easily from a file not being available to open or, if it is present, from the data being corrupted. Furthermore, there may be no more disk space available, or not enough space may have been allocated to allow for the addition of new data. Other errors, such as attempting to close a file that isn't open, or to read a file opened for output only, may well derive from logical errors (that is, programming mistakes) but can be dealt with nonetheless when debugging. These kinds of errors will normally result in termination of the program run, whereas using File Status allows the programmer to deal with any such problems without the program run stopping and returning to the operating system.

File Status Codes are made of two digits; the first indicates one of 5 classes:

    0 ----> Input/output operation successful
    1 ----> File "at end" condition
    2 ----> Invalid key
    3 ----> Permanent I/O error
    4 ----> Logic error

The second digit refers to the particular case within the class. Here are examples common to both Micro Focus and Fujitsu compilers (although there are more besides). Check your compiler documentation for the full list.

    ----------------------------------------------------------------------------------------
    Code   Meaning
    ----------------------------------------------------------------------------------------
    00 ---> Input/output operation successful
    02 ---> Duplicate record key found (READ ok)
    04 ---> Length of record too large (READ ok)
    10 ---> File AT END
    14 ---> "The valid digits of a read relative record number are greater than the size of the relative key item of the file."
    16 ---> Program tries to read file already AT END
    22 ---> Program attempts to write a record with a key that already exists
    23 ---> Record not found
    24 ---> Program attempts to write record to a disk that is full
    30 ---> Input/output operation unsuccessful, no further information available
    34 ---> Program attempts to write record to a disk that is full
    35 ---> Program tries to open non-existent file for INPUT, I-O or EXTEND
    37 ---> Program tries to open line sequential file in I-O mode
    41 ---> Program tries to open file that is already open
    42 ---> Program tries to close file that is not open
    43 ---> Program tries to delete or rewrite a record that has not been read
    44 ---> Program tries to write or rewrite a record of incorrect length
    46 ---> Program tries to read a record where the previous read or START has failed or the AT END condition has occurred
    47 ---> Program tries to read a record from a file opened in the incorrect mode
    48 ---> Program tries to write a record from a file opened in the incorrect mode
    49 ---> Program tries to delete or rewrite a record from a file opened in the incorrect mode
    ----------------------------------------------------------------------------------------

To use these codes you need to include the FILE STATUS clause in the SELECT statement of the environment division:

    SELECT TEST-FILE ASSIGN TO 'TEST-DATA.DAT'
        ORGANIZATION IS SEQUENTIAL
        FILE STATUS IS W-STATUS.

Of course W-STATUS could be any user-defined name you like. It must, however, be defined in working storage as PIC XX, i.e. as alphanumeric and not numeric. So, if a certain input/output error occurs during a program run, rather than terminating, the program simply sets the file status to an error code.

code:

    * Here there is a possible danger of too big a record being moved into W-RECORD
    READ RECORD-IN INTO W-RECORD
    IF W-STATUS = "04" THEN
        DISPLAY "Over-sized record has been read"
        SET REC-XS-FLAG TO TRUE
    END-IF

Another example might be, when reading from an indexed file:

    READ IN-FILE
    IF W-STATUS = "23" THEN
        DISPLAY "Record not found"
    ELSE
        PERFORM MAIN-PROCESS
    END-IF

You could just as easily have written:

    READ IN-FILE
        INVALID KEY
            DISPLAY "Record not found"
        NOT INVALID KEY
            PERFORM MAIN-PROCESS
    END-READ
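The two-digit structure of the file status code lends itself to a simple lookup. The following Python sketch is an illustration of that structure only, not COBOL and not any compiler's API: the names STATUS_CLASSES, describe_file_status and is_successful are hypothetical, and only the five classes listed above are covered.

```python
# First digit of the file status code mapped to its class, as listed above.
STATUS_CLASSES = {
    "0": "input/output operation successful",
    "1": "file at end condition",
    "2": "invalid key",
    "3": "permanent I/O error",
    "4": "logic error",
}


def describe_file_status(status: str) -> str:
    """Return the class of a two-character file status code."""
    if len(status) != 2:
        raise ValueError("a file status code has exactly two characters")
    return STATUS_CLASSES.get(status[0], "class not covered by this sketch")


def is_successful(status: str) -> bool:
    """True for every code in class 0, including 02 and 04 (READ ok)."""
    return len(status) == 2 and status[0] == "0"


for code in ("00", "04", "10", "23", "35", "42"):
    print(code, describe_file_status(code), is_successful(code))
```

Testing the first digit in this way is the same idea as checking W-STATUS after each input/output statement: a program can branch on the class first and look at the second digit only when it needs the particular case.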
