Tera-Tom on Teradata Utilities V12-V13 by Tom Coffing Coffing Data Warehousing. (c) 2011. Copying Prohibited. Reprinted for Jaskaran Singh, Capgemini US LLC [email protected] Reprinted with permission as a subscription benefit of Books24x7, http://www.books24x7.com/ All rights reserved. Reproduction and/or distribution in whole or in part in electronic,paper or other forms without written permission is prohibited.

Chapter 8: INMOD Processing

“Democracy is a process by which people are free to choose the man who will get the blame.” 

- Laurence J. Peter

What is an INMOD

When data is being loaded into Teradata the processing of the data is performed by the utility. All of the utilities are able to read files that contain a variety of formatted and unformatted data. They are able to read from disk and from tape. These files and devices must support a sequential access method. Then, the utility is responsible for incorporating the data into SQL for use by Teradata. However, there are times when it is advantageous to use a different access technique or a special device.

When special input processing is desired, an INMOD (an acronym for INput MODule) is a potential approach to solving the problem. An INMOD is written to perform the input of the data from a data source. It removes the responsibility of performing the input from the utility. Many times an INMOD is written because the utility is not capable of performing the particular input processing. Other times, it is written for convenience.

The INMOD is a user-written routine that performs the specialized access to the file system, device or database. The INMOD does not replace the utility; it becomes a part of and an extension of the utility. The major difference is that instead of receiving the data directly, the utility receives the data from the INMOD. An INMOD can be written to work with FastLoad, MultiLoad, and FastExport.

As an example, an INMOD might be written to access the data directly from another RDBMS besides Teradata. It would be written to perform the following steps:

1. Connect to the RDBMS

2. Retrieve a row using a SELECT or DECLARE CURSOR

3. Pass the row to the utility

4. Loop back and do steps 2 & 3 until there is no more data

5. When there is no more data, disconnect from the RDBMS
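The five steps above can be sketched in C, the language the chapter lists for UNIX and Windows INMODs. This is only a minimal sketch under stated assumptions: the "RDBMS" is a hypothetical in-memory array, and the names demo_source, source_fetch, row_sink and pump_rows are invented for illustration. In a real INMOD the utility drives the loop by calling the routine once per record, so the explicit while loop here only models the order of the steps.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a foreign RDBMS cursor: a fixed array of rows. */
static const char *const demo_rows[] = { "row-1", "row-2", "row-3" };
#define DEMO_ROW_COUNT (sizeof demo_rows / sizeof demo_rows[0])

typedef struct {
    int    connected;  /* 1 after step 1, 0 after step 5 */
    size_t next;       /* index of the next row to fetch */
} demo_source;

/* Step 1: connect to the (simulated) RDBMS. */
static void source_connect(demo_source *s) { s->connected = 1; s->next = 0; }

/* Step 2: fetch one row; returns NULL when there is no more data. */
static const char *source_fetch(demo_source *s) {
    if (!s->connected || s->next >= DEMO_ROW_COUNT) return NULL;
    return demo_rows[s->next++];
}

/* Step 5: disconnect from the (simulated) RDBMS. */
static void source_disconnect(demo_source *s) { s->connected = 0; }

/* Assumed callback standing in for "pass the row to the utility". */
typedef void (*row_sink)(const char *row, size_t len, void *ctx);

/* Runs steps 1 through 5 and returns the number of rows passed on. */
static size_t pump_rows(demo_source *s, row_sink sink, void *ctx) {
    size_t passed = 0;
    const char *row;

    source_connect(s);                          /* step 1 */
    while ((row = source_fetch(s)) != NULL) {   /* steps 2 and 4 */
        sink(row, strlen(row), ctx);            /* step 3 */
        passed++;
    }
    source_disconnect(s);                       /* step 5 */
    return passed;
}

/* Example sink: adds up the bytes it is handed. */
static void count_bytes(const char *row, size_t len, void *ctx) {
    (void)row;
    *(size_t *)ctx += len;
}
```

Calling pump_rows with count_bytes and a size_t accumulator walks all of the demo rows and leaves the source disconnected.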

An INMOD is sometimes called an exit routine. This is because the utility exits itself by calling the INMOD and passing control to it. The INMOD performs its processing and exits back as its method for passing the data back to the utility. The following diagram illustrates the normal logic flow when using the utility:


As seen in the above diagram, there is an extra step involved with the processing of an INMOD. On the other hand, it can eliminate the need to create an intermediate file by literally using another RDBMS as its data source. However, the user still scripts and executes the utility, just as when using a file; that portion does not change. The following chart shows the appropriate languages that an INMOD can be written in for mainframe and network-attached systems:

Operating System    Programming Language
VM or MVS           Assembler, COBOL, SAS/C or IBM PL/I
UNIX or Windows     C (although not supported, MicroFocus COBOL can be used)

Calling an INMOD from FastLoad

As shown in the diagrams above, the user still executes the utility and the utility is responsible for calling the INMOD. Therefore, the utility needs an indication from the user that it is supposed to call the INMOD instead of reading a file.

Normally the utility script contains the name of the file or JCL statement (DDNAME). When using an INMOD, the file designation is no longer specified. Instead, the name of the program to call is defined in the script.

The following chart indicates the appropriate statement to define the INMOD:

Utility Name                       Statement (replaces FILE or DDNAME)
FastLoad                           DEFINE INMOD=<INMOD-name>
MultiLoad, TPump and FastExport    .IMPORT INMOD=<INMOD-name>

Writing an INMOD

The writing of an INMOD is primarily concerned with processing an input data source. However, it cannot do the processing haphazardly. It must wait for the utility to tell it what operation to perform and when to perform it.

It has been previously stated that the INMOD returns data to the utility. At the same time, the utility needs to know that it is expecting to receive the data. Therefore, a high degree of handshake processing is necessary for the two components (INMOD and utility) to know what is expected.

As well as passing the data, a status code is sent back and forth between the utility and the INMOD. As with all processing, we hope for a successful completion. Earlier in this book, it was shown that a zero status code indicates a successful completion. That same situation is true for communications between the utility and the INMOD.

Therefore, a memory area must be allocated that is shared between the INMOD and the utility. The area contains the following elements:

1. The return or status code

2. The length of the data that follows

3. The data area
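As a rough illustration, the three elements listed above could be declared in C along the following lines. This is a sketch, not the layout from the vendor manuals: the 32-bit field widths, the INMOD_DATA_MAX buffer size and the names inmod_area and inmod_area_set are all assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

/* Assumed maximum record size; the real limit depends on the utility. */
#define INMOD_DATA_MAX 32768

/* Shared area: status code, length of the data, then the data itself. */
typedef struct {
    int32_t status;                /* 1. the return or status code        */
    int32_t length;                /* 2. the length of the data follows   */
    char    data[INMOD_DATA_MAX];  /* 3. the data area                    */
} inmod_area;

/* Copies one record into the area and marks it as successfully returned.
 * Returns 0 on success, or -1 if the record does not fit (area untouched). */
static int inmod_area_set(inmod_area *a, const void *rec, size_t len) {
    if (len > INMOD_DATA_MAX) return -1;
    memcpy(a->data, rec, len);
    a->length = (int32_t)len;
    a->status = 0;                 /* zero signals success, as in the text */
    return 0;
}
```

Keeping the status and length ahead of the data means both sides can inspect the outcome of a call before touching the record bytes.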

Writing an INMOD for FastLoad

The following charts show the various programming statements to define the data elements, status codes and other considerations for the various programming languages.

Parameter definition for FastLoad

Figure 7-6

Return/status codes from FastLoad to the INMOD

Value  Indicates that . . .
0      FastLoad is calling the INMOD for the first time. The INMOD should open/connect to the data source, read the first record and return it to FastLoad.
1      FastLoad is calling for the next record. The INMOD should read the next record and return it to FastLoad.
2      FastLoad and the INMOD failed and have been restarted. The INMOD should use the saved record count to reposition in the input data source to where it left off. Since checkpoint is optional in FastLoad, it must be requested in the script. This also means that for values 0 and 1, the INMOD must count each record and save the record count for use if needed. Do not return a record to FastLoad.
3      FastLoad has written a checkpoint. The INMOD should guarantee that the record count has been written to disk. Do not return a record to FastLoad.
4      The Teradata RDBMS failed. The INMOD should use the saved record count to reposition in the input data source to where it left off. Do not return a record to FastLoad.
5      FastLoad has finished loading the data to Teradata. The INMOD should clean up and end.

Return/status codes for the INMOD to FastLoad

Value  Indicates that . . .
0      The INMOD is returning data to the utility.
Not 0  The utility is at end of file.

Entry point for FastLoad used in the DEFINE:

SAS/C       <dynamic-name-by-user>
All others  BLKEXIT


NCR Corporation provides two examples for writing an INMOD. The first is called BLKEXIT.C, which does not contain the checkpoint and restart logic, and the other is BLKEXITR.C, which does contain both checkpoint and restart logic.
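To make the FastLoad status codes more concrete, here is a minimal C sketch of how a routine with checkpoint and restart logic might dispatch on them. It is not the vendor's BLKEXITR.C. The in-memory row array, the 256-byte data area, the names fl_area, fl_state and fl_dispatch, and the choice to answer non-record calls with status 0 and length 0 are assumptions for illustration. The status value 16 for an unexpected code mirrors the COBOL sample later in this chapter, and a real routine would be exported under the entry point name given in the chart (BLKEXIT for everything except SAS/C).

```c
#include <stdint.h>
#include <string.h>

/* Codes passed from FastLoad to the INMOD, as listed in the chart. */
enum { FL_FIRST = 0, FL_NEXT = 1, FL_RESTART = 2,
       FL_CHECKPOINT = 3, FL_DBS_RESTART = 4, FL_DONE = 5 };

/* Assumed shared area: status, length, data (field sizes are illustrative). */
typedef struct {
    int32_t status;
    int32_t length;
    char    data[256];
} fl_area;

/* Hypothetical INMOD state; rows stand in for the real input source. */
typedef struct {
    const char *const *rows;
    int32_t row_count;
    int32_t position;     /* number of records returned so far */
    int32_t saved_count;  /* count at last checkpoint; a real INMOD
                             would write this to disk */
} fl_state;

/* Returns the next record, or a non-zero status at end of file. */
static void fl_read(fl_state *st, fl_area *a) {
    if (st->position >= st->row_count) {
        a->length = 0;
        a->status = 1;                    /* non-zero: end of file */
        return;
    }
    const char *row = st->rows[st->position];
    size_t len = strlen(row);
    if (len > sizeof a->data) {           /* record does not fit */
        a->length = 0;
        a->status = 16;
        return;
    }
    memcpy(a->data, row, len);
    st->position++;                       /* count every record returned */
    a->length = (int32_t)len;
    a->status = 0;
}

/* On entry a->status holds the code from FastLoad; on exit it holds
 * the INMOD's reply. */
static void fl_dispatch(fl_state *st, fl_area *a) {
    switch (a->status) {
    case FL_FIRST:                        /* open the source, read first */
        st->position = 0;
        st->saved_count = 0;
        fl_read(st, a);
        break;
    case FL_NEXT:                         /* read the next record */
        fl_read(st, a);
        break;
    case FL_RESTART:                      /* reposition, no record */
    case FL_DBS_RESTART:
        st->position = st->saved_count;
        a->length = 0;
        a->status = 0;
        break;
    case FL_CHECKPOINT:                   /* save the count, no record */
        st->saved_count = st->position;
        a->length = 0;
        a->status = 0;
        break;
    case FL_DONE:                         /* clean up and end */
        a->length = 0;
        a->status = 0;
        break;
    default:                              /* unexpected code */
        a->length = 0;
        a->status = 16;
        break;
    }
}
```

Because the record count is bumped on every successful read for codes 0 and 1, a restart after a checkpoint resumes at the first record that had not been counted when the checkpoint was taken.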

Writing an INMOD for MultiLoad, TPump, and FastExport

The following charts show the data statements used to define the two parameter areas for the various languages.

First Parameter definition for MultiLoad, TPump and FastExport to the INMOD

Second Parameter definition for INMOD to MultiLoad, TPump and FastExport

Return/status codes for MultiLoad, TPump and FastExport to the INMOD

Value  Indicates that . . .
0      The utility is calling the INMOD for the first time. The INMOD should open/connect to the data source, read the first record and return it to the utility.
1      The utility is calling for the next record. The INMOD should read the next record and return it to the utility.
2      The utility and the INMOD failed and have been restarted. The INMOD should use the saved record count to reposition in the input data source to where it left off. Since checkpoint is optional in the utility, it must be requested in the script. This also means that for values 0 and 1, the INMOD must count each record and save the record count for use if needed. Do not return a record to the utility.
3      The utility needs to write a checkpoint. The INMOD should guarantee that the record count has been written to disk and return it to the utility in the second parameter to be stored in the LOGTABLE. Do not return a record to the utility.
4      The Teradata RDBMS failed. The INMOD should receive the record count from the utility in the second parameter for use in repositioning in the input data source to where it left off. Do not return a record to the utility.
5      The utility has finished loading the data to Teradata. The INMOD should clean up and end.
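The main difference from FastLoad in the chart above is that codes 3 and 4 exchange the record count through the second parameter. The C sketch below models only that exchange. The layout of the second parameter is not given in this transcript, so ut_parm2 with a single 32-bit count is a hypothetical stand-in, and ut_state and ut_checkpoint_restart are likewise invented names.

```c
#include <stdint.h>

/* Checkpoint/restart codes from the chart for MultiLoad, TPump
 * and FastExport. */
enum { UT_RESTART = 2, UT_CHECKPOINT = 3, UT_DBS_RESTART = 4 };

/* Hypothetical second parameter: just the record count. */
typedef struct {
    int32_t record_count;
} ut_parm2;

/* Hypothetical INMOD state. */
typedef struct {
    int32_t position;     /* records returned so far in this run */
    int32_t saved_count;  /* stand-in for the count saved on disk */
} ut_state;

/* Handles the checkpoint/restart codes. Returns 0 if the code was
 * handled, or -1 if it is not one of the three codes above. */
static int ut_checkpoint_restart(int32_t code, ut_state *st, ut_parm2 *p2) {
    switch (code) {
    case UT_CHECKPOINT:
        /* Save the count and hand it to the utility, which stores
         * it in the LOGTABLE. */
        st->saved_count = st->position;
        p2->record_count = st->position;
        return 0;
    case UT_DBS_RESTART:
        /* The utility sends the count back; reposition from it. */
        st->position = p2->record_count;
        return 0;
    case UT_RESTART:
        /* Both sides restarted; reposition from the INMOD's own copy. */
        st->position = st->saved_count;
        return 0;
    default:
        return -1;
    }
}
```

An INMOD originally written for FastLoad would have no equivalent of the p2 handling, which is the gap that has to be closed when such a routine is moved to one of these utilities.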


The following diagram shows how to use the return codes of 6 and 7:

6      The INMOD should initialize and prepare to receive the first data record from the utility.
7      The INMOD should receive the next data record from the utility.

Return/status codes for the INMOD to MultiLoad, TPump and FastExport:

Value  Indicates that . . .
0      The INMOD is returning data to the utility.
Not 0  The utility is at end of file.

Entry point for MultiLoad, TPump and FastExport:

All languages  <dynamic-name-by-user>

Migrating an INMOD

As seen above, many of the return codes are the same. However, it should also be noted that a FastLoad INMOD must remember the record count in case a restart is needed, whereas the other utilities send the record count to the INMOD. If the INMOD fails to accept the record count when it is sent, the job will abort or hang and never finish successfully.

This means that if a FastLoad INMOD is used in one of the other utilities, it will work as long as the utility never requests that a checkpoint take place. Remember that unlike FastLoad, the newer utilities default to a checkpoint every 15 minutes. The only way to turn it off is to set the CHECKPOINT option of the .BEGIN to a number that is higher than the number of records being processed.

Therefore, it is not the best practice to simply use a FastLoad INMOD as if it were interchangeable. It is better to modify the restart and checkpoint logic so that it receives the record count and uses it for the repositioning operation.

Writing a NOTIFY Routine

As seen earlier in this book, there is a NOTIFY statement. If the standard values are acceptable, you should use them. However, if they are not, you may write your own NOTIFY routine.

If you choose to do this, refer to the NCR Utilities manual for guidance on writing this processing. We just want you to know here that it is something you can do.


Sample INMOD

Below is an example of the PROCEDURE DIVISION commands that might be used for MultiLoad or FastExport.

PROCEDURE DIVISION USING PARM-1, PARM-2.
BEGIN.
MAIN.
    { specific user processing goes here, followed by: }
*   Dispatch on the status code received from the utility.
    IF RETCODE = 0 THEN
        DISPLAY "RECEIVED - RETURN CODE 0 - INITIALIZE & READ "
        PERFORM 100-OPEN-FILES
        PERFORM 200-READ-INPUT
        GOBACK
    ELSE IF RETCODE = 1 THEN
        DISPLAY "RECEIVED - RETURN CODE 1 - READ"
        PERFORM 200-READ-INPUT
        GOBACK
    ELSE IF RETCODE = 2 THEN
        DISPLAY "RECEIVED - RETURN CODE 2 - RESTART "
        PERFORM 900-GET-REC-COUNT
        PERFORM 950-FAST-FORWARD-INPUT
        GOBACK
    ELSE IF RETCODE = 3 THEN
        DISPLAY "RECEIVED - RETURN CODE 3 - CHECKPOINT "
        PERFORM 600-SAVE-REC-COUNT
        GOBACK
    ELSE IF RETCODE = 5 THEN
        DISPLAY "RECEIVED - RETURN CODE 5 - DONE "
        MOVE 0 TO RETLENGTH
        MOVE 0 TO RETCODE
        GOBACK
    ELSE
        DISPLAY "RECEIVED - INVALID RETURN CODE "
        MOVE 0 TO RETLENGTH
        MOVE 16 TO RETCODE
        GOBACK.
100-OPEN-FILES.
    OPEN INPUT DATA-FILE.
    MOVE 0 TO RETCODE.
*   Read one record and return it with its length and a zero status.
200-READ-INPUT.
    READ DATA-FILE INTO DATA-AREA1
        AT END GO TO END-DATA.
    ADD 1 TO NUMIN.
    MOVE 80 TO RETLENGTH.
    MOVE 0 TO RETCODE.
    ADD 1 TO NUMOUT.
*   End of input: close the file and return a zero-length record.
END-DATA.
    CLOSE DATA-FILE.
    DISPLAY "NUMBER OF INPUT RECORDS = " NUMIN.
    DISPLAY "NUMBER OF OUTPUT RECORDS = " NUMOUT.
    MOVE 0 TO RETLENGTH.
    MOVE 0 TO RETCODE.
    GOBACK.
