MODIFY your way of thinking when it comes to anomalous data formats Steve Simon State Street Corporation

Jan 18, 2016

Transcript

MODIFY your way of thinking when it comes to anomalous data formats

Steve Simon
State Street Corporation

What we shall examine during this hour

• Data files of different formats.
• Examine ways and means of massaging the different formats into one ‘usable’ format.
• Examine ways of “manufacturing” records to facilitate generating end-user reports.

A bit of history

• While working at a major airline a few years back, I encountered a problem: one of our databases held flight information as a start date for the service and a planned termination date for the service.

Orig  Dest  Origin City  Dest City       Start Date  End Date
ALB   SDF   ALBANY NY    LOUISVILLE KY   20070122    20080403
ABQ   LBB   LUBBOCK TX   ALBUQUERQUE NM  20070101    20071031

• Our booking database, on the other hand, contained ‘daily records’ of the seating status of each class for each flight segment (which could consist of one or more legs).

• This necessitated breaking the data shown above down into a “record per day” format.

Date      Orig  Dest  The Key
20070101  ABQ   LBB   20070101ABQLBB
20070102  ABQ   LBB   20070102ABQLBB
20070103  ABQ   LBB   20070103ABQLBB
20070104  ABQ   LBB   20070104ABQLBB
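The breakdown into one record per day, together with its key, can be sketched in Python. This is an illustration only: the deck itself does the work in FOCUS MODIFY, and the names `parse_yyyymmdd` and `expand_route` are assumptions that do not come from the original material. The key layout (date followed by the two airport codes) follows the table above.

```python
from datetime import date, datetime, timedelta


def parse_yyyymmdd(text: str) -> date:
    """Convert a date stored as 'YYYYMMDD' text into a date object."""
    return datetime.strptime(text, "%Y%m%d").date()


def expand_route(orig: str, dest: str, start: str, end: str):
    """Yield one (date, orig, dest, key) tuple per day, end date included."""
    current = parse_yyyymmdd(start)
    last = parse_yyyymmdd(end)
    while current <= last:
        stamp = current.strftime("%Y%m%d")
        # The key is the date followed by the two airport codes.
        yield stamp, orig, dest, stamp + orig + dest
        current += timedelta(days=1)


# The first four days of the ABQ-LBB service, as in the table above.
rows = list(expand_route("ABQ", "LBB", "20070101", "20070104"))
for row in rows:
    print(*row)
```

Treating the end date as inclusive is an assumption here; it matches the loop test shown later in the deck, which stops only once the working date is greater than the end date.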

With this key in place, we could effect a join to the “Available Seating” database.

Available Seating

KEY             F  Y  M  N  Q  S
20070101ABQLBB  9  8  1  2  3  9
20070102ABQLBB  2  5  2  7  3  4
20070103ABQLBB  4  4  1  8  9  5
20070104ABQLBB  9  2  1  1  3  4
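A minimal Python sketch of the join on that key follows. The seat counts are copied from the seating table above; the function name `join_on_key`, the dictionary layout, and the extra 20070105 flight day are assumptions made for illustration. The sketch behaves like an inner join, so a flight day with no seating row is dropped.

```python
# Daily flight records, one per day, as (date, orig, dest).
daily_flights = [
    ("20070101", "ABQ", "LBB"),
    ("20070102", "ABQ", "LBB"),
    ("20070103", "ABQ", "LBB"),
    ("20070104", "ABQ", "LBB"),
    ("20070105", "ABQ", "LBB"),  # hypothetical day with no seating row
]

# Available seats per booking class, keyed by date + orig + dest.
available_seating = {
    "20070101ABQLBB": {"F": 9, "Y": 8, "M": 1, "N": 2, "Q": 3, "S": 9},
    "20070102ABQLBB": {"F": 2, "Y": 5, "M": 2, "N": 7, "Q": 3, "S": 4},
    "20070103ABQLBB": {"F": 4, "Y": 4, "M": 1, "N": 8, "Q": 9, "S": 5},
    "20070104ABQLBB": {"F": 9, "Y": 2, "M": 1, "N": 1, "Q": 3, "S": 4},
}


def join_on_key(flights, seating):
    """Attach seat counts to each daily flight that has a matching key."""
    joined = []
    for day, orig, dest in flights:
        key = day + orig + dest
        seats = seating.get(key)
        if seats is not None:  # inner join: skip days without a seating row
            joined.append({"date": day, "orig": orig, "dest": dest, **seats})
    return joined


joined = join_on_key(daily_flights, available_seating)
for record in joined:
    print(record)
```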

The raw data

FILEDEF RAW DISK C:/ibi/apps/steve/AirlineSchedule.txt
FILEDEF AIRLINE DISK C:\ibi\apps\steve\AIRLINE.FOC
-RUN
CREATE FILE AIRLINE
-RUN
MODIFY FILE AIRLINE
  FIXFORM DEPARTURE/3 DEPARTURECITY/50 ARRIVAL/3 ARRIVALCITY/50
  FIXFORM STARTDATE/A8 ENDDATE/A8
  MATCH WITH-UNIQUES DEPARTURE ARRIVAL
    ON MATCH REJECT
    ON NOMATCH INCLUDE
DATA ON RAW
END
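For readers who do not use FOCUS, the load step can be approximated in Python. The field widths come from the FIXFORM statements above (3 + 50 + 3 + 50 + 8 + 8 = 122 characters per record); everything else, including the names `parse_line`, `load`, and `make_line` and the sample lines built in memory, is an assumption for illustration. The real raw file is not reproduced in the transcript. A second record with the same departure and arrival pair is rejected, which is how the ON MATCH REJECT / ON NOMATCH INCLUDE pair is read here.

```python
# Field names and widths taken from the FIXFORM statements.
FIELDS = [
    ("DEPARTURE", 3), ("DEPARTURECITY", 50), ("ARRIVAL", 3),
    ("ARRIVALCITY", 50), ("STARTDATE", 8), ("ENDDATE", 8),
]


def parse_line(line: str) -> dict:
    """Slice one fixed-width record into named, whitespace-trimmed fields."""
    record, pos = {}, 0
    for name, width in FIELDS:
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record


def load(lines):
    """Keep the first record per (DEPARTURE, ARRIVAL); reject later duplicates."""
    table, rejected = {}, []
    for line in lines:
        record = parse_line(line)
        key = (record["DEPARTURE"], record["ARRIVAL"])
        if key in table:
            rejected.append(record)   # comparable to ON MATCH REJECT
        else:
            table[key] = record       # comparable to ON NOMATCH INCLUDE
    return table, rejected


def make_line(dep, dep_city, arr, arr_city, start, end) -> str:
    """Build a sample fixed-width line (helper for this sketch only)."""
    return f"{dep:<3}{dep_city:<50}{arr:<3}{arr_city:<50}{start:<8}{end:<8}"


sample = [
    make_line("ALB", "ALBANY NY", "SDF", "LOUISVILLE KY", "20070122", "20080403"),
    make_line("ALB", "ALBANY NY", "SDF", "LOUISVILLE KY", "20070122", "20080403"),
]
table, rejected = load(sample)
print(len(table), "loaded,", len(rejected), "rejected")
```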

Creating that “record per day”

FILEDEFs and variable initialization

FILEDEF AIRLINE1 DISK C:/ibi/apps/steve/AIRLINE.OUTTT
-RUN
MODIFY FILE AIRLINE
  COMPUTE STARTDATE1/YYMD = 0;
  COMPUTE ENDDATE1/YYMD = 0;
  COMPUTE STARTCITY/A50=;
  COMPUTE ENDCITY/A50=;
  COMPUTE STARTCODE/A3=;
  COMPUTE ENDCODE/A3=;
  COMPUTE TEMPDATE/YYMD=0;
  PERFORM EXTRACT1

We shall utilize the Scratch Pad Area (SPA)

Get the data from the database, record by record

CASE EXTRACT1
  NEXT WITH-UNIQUES DEPARTURE ARRIVAL
    ON NEXT ACTIVATE DEPARTURECITY ARRIVALCITY STARTDATE ENDDATE
    ON NEXT COMPUTE STARTDATE1 = D.STARTDATE;
    ON NEXT COMPUTE ENDDATE1 = D.ENDDATE;
    ON NEXT COMPUTE STARTCITY = D.DEPARTURECITY;
    ON NEXT COMPUTE ENDCITY = D.ARRIVALCITY;
    ON NEXT COMPUTE STARTCODE = D.DEPARTURE;
    ON NEXT COMPUTE ENDCODE = D.ARRIVAL;
    ON NEXT COMPUTE TEMPDATE = D.STARTDATE;
    ON NEXT PERFORM EXTRACT2
    ON NONEXT GOTO EXIT
ENDCASE

Is the current date (TEMPDATE, which begins at the start date) greater than the end date?
Yes: quit the case. No: write the record to file.

CASE EXTRACT2
  IF TEMPDATE GT ENDDATE1 THEN PERFORM EXTRACT1;
  TYPE ON AIRLINE1
  "<TEMPDATE><STARTCODE><ENDCODE><STARTCITY> <ENDCITY>"
  COMPUTE TEMPDATE = TEMPDATE + 1;
  GOTO EXTRACT2
ENDCASE
DATA
END
-RUN
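The control flow of the two cases can be mirrored in Python as an outer loop over routes and an inner loop over days. This is a rough analogue under stated assumptions, not a translation of the MODIFY procedure: `write_daily_records` is a hypothetical name, the output line carries only the date and the two codes, and the end dates in the sample are made up. The second sample route deliberately has a start date after its end date, to show that the test produces no records in that case.

```python
import io
from datetime import date, timedelta


def write_daily_records(routes, out):
    """Write one line per route per day to `out`; return the line count."""
    written = 0
    for dep, arr, start, end in routes:   # like EXTRACT1: fetch the next route
        temp_date = start                 # like TEMPDATE = D.STARTDATE
        while temp_date <= end:           # like EXTRACT2: loop until past the end date
            out.write(f"{temp_date:%Y/%m/%d} {dep} {arr}\n")
            temp_date += timedelta(days=1)
            written += 1
    return written


routes = [
    ("ABE", "MHT", date(2006, 11, 30), date(2006, 12, 1)),  # end date is hypothetical
    ("ALB", "SDF", date(2007, 1, 22), date(2007, 1, 21)),   # start after end: no output
]
buffer = io.StringIO()
count = write_daily_records(routes, buffer)
print(buffer.getvalue(), end="")
```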

The output

Where do we go from here?

The available seating table resides in a SQL Server database.

Load this data into our SQL Server data repository

Create INSERT statements

FILEDEF ROUTECOUNT DISK C:/ibi/apps/steve/AIRLINE.OUTTT
-RUN
APP HOLD steve
TABLE FILE ROUTECOUNT
PRINT *
ON TABLE HOLD AS RECCOUNT
END
-SET &LLINES = &LINES;
-START111
-SET &FILENUM = 1;
-SET &CURRENTCTR =0;
-SET &FIRSTLINE = 'INSERT INTO DailyFlights(Date,Start,Destination,';
-SET &FIRSTLINE1 = 'StartCity,DestinationCity)';
-SET &SECONDLINE =;
-SET &THIRDLINE = ;
-SET &APOST = HEXBYT(39,'A1');
-SET &DATEE=;
-SET &STARTC=;
-SET &DESTC=;
-SET &SCDEST=;
-SET &ECDEST=;

Write the SQL “USE” statements

FILEDEF ROUTECOUNT1 DISK C:/ibi/apps/steve/AIRLINE.OUTTT
FILEDEF SCHEDULE DISK C:/ibi/apps/steve/AIRLINE.SQL1
-RUN
-WRITE SCHEDULE USE FUSE2007
-WRITE SCHEDULE GO
-WRITE SCHEDULE BEGIN TRANSACTION

Read all records & write to file

-REPEAT LOOPER FOR &I FROM 1 TO &LLINES STEP 1
-READ ROUTECOUNT1 &A.2 &DATEE.10 &C.1 &STARTC.3 &A.1 &DESTC.3 &B.1 &SCDEST.50,
- &CA.1 &ECDEST.50
-SET &SECONDLINE = ' VALUES (' || &APOST || &DATEE || &APOST;
-SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &STARTC || &APOST;
-SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &DESTC || &APOST;
-SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &SCDEST || &APOST;
-SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &ECDEST || &APOST;
-SET &SECONDLINE = &SECONDLINE || ');';
-WRITE SCHEDULE &FIRSTLINE
-WRITE SCHEDULE &FIRSTLINE1
-WRITE SCHEDULE &SECONDLINE
-LOOPER
-WRITE SCHEDULE COMMIT TRANSACTION

The Insert Statements

USE FUSE2007
GO
BEGIN TRANSACTION
INSERT INTO DailyFlights(Date,Start,Destination,StartCity,DestinationCity)
 VALUES ('2006/11/30','ABE','MHT','ALLENTOWN, PA','MANCHESTER, NH');
INSERT INTO DailyFlights(Date,Start,Destination,StartCity,DestinationCity)
 VALUES ('2006/12/01','ABE','MHT','ALLENTOWN, PA','MANCHESTER, NH');
…..
COMMIT TRANSACTION
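The statement-building step can also be sketched in Python. The database, table, and column names are the ones shown in the statements above; the function names `sql_literal` and `build_script` are assumptions. Doubling embedded apostrophes is an addition that the original Dialogue Manager code does not perform, included here because a city name containing an apostrophe would otherwise break the generated statement.

```python
COLUMNS = "Date,Start,Destination,StartCity,DestinationCity"


def sql_literal(value: str) -> str:
    """Quote a string for T-SQL, doubling any embedded apostrophes."""
    return "'" + value.replace("'", "''") + "'"


def build_script(rows, database="FUSE2007", table="DailyFlights") -> str:
    """Return a script that inserts every row inside one transaction."""
    lines = [f"USE {database}", "GO", "BEGIN TRANSACTION"]
    for row in rows:
        values = ",".join(sql_literal(field.strip()) for field in row)
        lines.append(f"INSERT INTO {table}({COLUMNS})")
        lines.append(f" VALUES ({values});")
    lines.append("COMMIT TRANSACTION")
    return "\n".join(lines)


# The two rows shown in the generated statements above.
rows = [
    ("2006/11/30", "ABE", "MHT", "ALLENTOWN, PA", "MANCHESTER, NH"),
    ("2006/12/01", "ABE", "MHT", "ALLENTOWN, PA", "MANCHESTER, NH"),
]
script = build_script(rows)
print(script)
```

Where a database driver is available, parameterized inserts are usually preferable to generated text; the text file approach is kept here because the deck hands the file to a separate load step.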

The 50 million foot view

Raw Data -> Sequential File -> SQL Statements -> FileSystemWatcher & SSIS Load Package

cd C:\Program Files\Microsoft SQL Server\90\DTS\Binn

DTExec /f "C:\AirlineScheduleLoad\AirlineSchedule\AirlineSchedule\bin\LoadSchedule.dtsx"

Query created by join

JOIN DATEM AND START AND DESTINATION IN DAILYFLIGHTS TO
     DATEE AND START AND DESTINATION IN BOOKINGS AS J1
-RUN
DEFINE FILE DAILYFLIGHTS
CITYPAIR/A10 = START || '-'|| DESTINATION;
END
TABLE FILE DAILYFLIGHTS
PRINT DATEM AS 'Date'
      CITYPAIR AS 'City Pair'
      STARTCITY AS 'Origin'
      DESTINATIONCITY AS 'Destination'
      FCLASS AS 'F'
      YCLASS AS 'Y'
      MCLASS AS 'M'
      NCLASS AS 'N'
      QCLASS AS 'Q'
      SCLASS AS 'S'
BY START NOPRINT
BY DESTINATION NOPRINT
BY DATEM NOPRINT
ON TABLE SUBHEAD
"Orange Free State Airlines"
"Flight Schedule"

…..

During this hour we

• Examined data files of different formats.
• Examined ways and means of massaging the different formats into one ‘usable’ format.
• Examined ways of “manufacturing” records to facilitate generating end-user reports.
• Verified that the data was correct.

During this hour we

• Saw that there are many “different” ways to modify anomalous data into the format of your choice, to produce the reports that you require.

Which really goes to show that you can...

MODIFY your way of thinking when it comes to anomalous data formats

Steve Simon
State Street Corporation

PowerPoint presentation & code samples may be found at: http://cid-4c765fc825912e4d.skydrive.live.com/browse.aspx/Public

or by email [email protected]