Tushar Mahapatra - Portfolio for recent Projects

Jun 24, 2015

Page 1: Tushar Mahapatra - Portfolio for recent Projects

Career Portfolio

Name: Tushar Mahapatra

Email: [email protected]

As of: August 2010

1. Contents

1. Contents
2. Introduction
3. 09/2009 – present: Weather Data ETL
    3.1 Analysis
    3.2 Design
    3.3 Open-source ETL
4. 09/2009 – 03/2010: SharePoint Training
    4.1 SharePoint Student Project
    4.2 SharePoint Team Project
5. 2008 – 2009: Modeling with XSD, AIXM5, GML, UML
6. 2004: Job Queue Framework and Data Extractors
    6.1 Design
    6.2 Implementation

Page 2 of 24

2. Introduction

This portfolio describes aspects of selected projects I have worked on recently, covering the period from 2004 to the present (08/2010). Except for the SharePoint training projects, which were completed as part of training by SetFocus, all projects were executed as a consultant with the Federal Aviation Administration.

3. 09/2009 – present: Weather Data ETL

3.1 Analysis

Since September 2009, I have been developing a solution for capturing weather data and saving it to a database. The long-term requirement is to support a variety of weather data formats, which consist of complex and archaic codes. So far, I have worked with the METAR (meteorological report) and TAF (terminal aerodrome forecast) formats, which are similar to some extent. The available documentation does not describe the formats in sufficient detail, and many variations of them exist. I analyzed the specific file formats I was provided with and described their syntax in a document using railroad diagrams. The specification is integrated, i.e. both the METAR and the TAF formats are described in one specification, and common elements are specified only once. The following picture shows a part of the document.

The texts with grey shading are hyperlinks, which make it easier to drill down deeper.
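
As a rough illustration of how common elements can be specified once and reused, the Python sketch below defines a station group and a wind group a single time and shares them between a METAR pattern and a TAF pattern. It covers only a few leading groups of each report. The pattern names, the function name and the sample reports are assumptions made for illustration; they are not taken from the specification document or from the Java parsing code.

```python
import re

# Shared elements, each defined once and reused by both report patterns.
STATION = r"(?P<station>[A-Z][A-Z0-9]{3})"
# Wind: direction (or VRB for variable), speed, optional gust, unit.
WIND = (
    r"(?P<wind_dir>\d{3}|VRB)(?P<wind_speed>\d{2,3})"
    r"(?:G(?P<gust>\d{2,3}))?(?P<unit>KT|MPS)"
)

# METAR head: optional type keyword, station, observation time (ddhhmmZ), wind.
METAR_HEAD = re.compile(
    rf"^(?:METAR\s+|SPECI\s+)?{STATION}\s+(?P<time>\d{{6}})Z\s+(?:AUTO\s+)?{WIND}\b"
)
# TAF head: keyword, station, issue time, validity period (ddhh/ddhh), wind.
TAF_HEAD = re.compile(
    rf"^TAF\s+(?:AMD\s+|COR\s+)?{STATION}\s+(?P<time>\d{{6}})Z"
    rf"\s+(?P<valid>\d{{4}}/\d{{4}})\s+{WIND}\b"
)


def parse_head(report: str) -> dict:
    """Return the leading groups of a METAR or TAF report as a dict."""
    pattern = TAF_HEAD if report.startswith("TAF") else METAR_HEAD
    match = pattern.match(report)
    if match is None:
        raise ValueError(f"unrecognized report: {report!r}")
    return match.groupdict()
```

Because the wind group is written only once, a correction to it (for example, allowing another speed unit) applies to both report types at the same time.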

3.2 Design

Based on my findings in the analysis phase, I designed the class diagram shown below to guide the implementation of the parsing code. Implementation was in Java using the Eclipse development tool. I used a plug-in that supported model-driven architecture: whenever I modified the UML class diagram, the corresponding Java code was generated or modified, and conversely, when I changed the Java code, the model was updated automatically. Both the METAR and the TAF reports are modeled in an integrated manner.
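
One way to picture an integrated model is a shared base class that holds the elements common to both report types. The Python sketch below is only an illustration of that idea: the class and field names are hypothetical and do not reproduce the actual class diagram, which was implemented in Java.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Wind:
    direction: Optional[int]  # degrees; None when the direction is variable
    speed: int
    gust: Optional[int] = None
    unit: str = "KT"


@dataclass
class WeatherReport:
    """Elements common to METAR and TAF live in the shared base class."""
    station: str
    issue_time: str  # ddhhmm, UTC
    wind: Optional[Wind] = None


@dataclass
class Metar(WeatherReport):
    """Observation-specific elements (hypothetical selection)."""
    temperature_c: Optional[int] = None
    dew_point_c: Optional[int] = None


@dataclass
class Taf(WeatherReport):
    """Forecast-specific elements (hypothetical selection)."""
    valid_period: str = ""
    change_groups: List[str] = field(default_factory=list)
```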

3.3 Open-source ETL

To take advantage of open-source code, I researched the available open-source ETL tools. My comparative analysis led me to select the Pentaho Data Integration (PDI) product (formerly called Kettle, and a part of the Pentaho BI Project suite of business intelligence tools).

To integrate my parsing code with PDI, I developed a ‘METAR Input’ step plug-in. PDI solutions are packaged as ‘transformations’ and ‘jobs’. ‘Transformations’ contain the actual ETL functionality, whereas ‘jobs’ glue together other jobs and transformations, using job steps that contain non-ETL functionality. Transformations and jobs are edited and assembled diagrammatically using a PDI editing tool called ‘Spoon’. Because the diagrams are themselves the units of executable code, all PDI code is self-documenting.

I developed two solutions to implement the ETL of METAR data: one is for downloading and ingesting METAR files, and the other is for ingesting METAR files already downloaded.

3.3.1 ‘FTP & ingest METAR files’ solution

The ‘FTP & ingest METAR files’ solution is designed to be scheduled to run at the top of each hour. On each run, 24 METAR files (one for each hour of the day) are downloaded from NOAA’s NWS website and then ingested into the database. The following pictures illustrate parts of my solution. The first picture shows the top-level job named ‘FTP_ingest_METAR_files’. The current UTC hour is determined and then used to drive certain actions. For example, log and data files are compressed for archival twice a day. A list of the names of the 24 METAR files, ordered in a certain sequence, is created. That list is then used to drive the FTP download and ingestion of the METAR files.

The following picture shows the ‘Set_session_constants’ transformation opened in Spoon. The ‘Set fields to session constants’ step is an instance of the ‘Javascript Values’ step where Javascript is used to determine the current (UTC) hour and today’s and yesterday’s dates. The configuration dialog for the step is shown opened.
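
The calculation performed by this step can be sketched as follows. This is a Python approximation, not the Javascript from the step itself; the function name, the key names and the date format are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def session_constants(now: Optional[datetime] = None) -> dict:
    """Compute the current UTC hour plus today's and yesterday's UTC dates."""
    # Accept an explicit timestamp so the function is easy to test.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    yesterday = now - timedelta(days=1)
    return {
        "UTC_HOUR": now.hour,                       # hypothetical field name
        "TODAY": now.strftime("%Y%m%d"),            # assumed date format
        "YESTERDAY": yesterday.strftime("%Y%m%d"),
    }
```

Working in UTC throughout avoids the date ambiguity that local time would introduce around midnight.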

The ‘Generate_METAR_filenames’ transformation is shown opened in Spoon below. The configuration dialog for the ‘Set METAR_FILE_NAME of each row’ step is shown opened. This step is also an instance of the ‘Javascript Values’ step. The Javascript code shows how the names of the METAR files to be processed are generated. The order of the files is from least recent to most recent.
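
A minimal Python sketch of such an ordering is given below. It rests on two assumptions that are not stated in this portfolio: that each hourly file is named after its UTC hour (the `hhZ.TXT` pattern used here is a placeholder), and that the file for the hour after the current one holds the oldest data.

```python
from typing import List


def metar_file_names(current_utc_hour: int) -> List[str]:
    """List 24 hourly file names, least recent first, ending at the current hour."""
    if not 0 <= current_utc_hour <= 23:
        raise ValueError("hour must be in 0..23")
    # Assumption: the hour after the current one was last written yesterday,
    # so it is the least recent file; the current hour is the most recent.
    hours = [(current_utc_hour + 1 + offset) % 24 for offset in range(24)]
    return [f"{hour:02d}Z.TXT" for hour in hours]  # placeholder naming pattern
```

Processing the files from least recent to most recent means that, if the same observation appears more than once, the latest version is the one ingested last.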

On the left, the ‘FTP_ingest_METAR_file’ job is shown opened in the Spoon editor. The configuration dialog for the ‘FTP METAR file’ job step is also shown opened. This is an instance of the ‘Get a file with FTP’ job step. It shows how a field which was previously set to a METAR file name is being used.

The ‘Ingest_METAR_file’ transformation is shown opened below. The ‘METAR Input’ step plugin whose development was discussed above is shown in the toolbox. An instance of the step is shown being used in the transformation. The transformation parses the METAR file using the new plugin and then inserts the data in the database. Data for three child tables is denormalized before insertion.

The configuration dialog for the ‘Read & parse METAR’ step, a part of the new plugin, is shown opened below.

3.3.2 ‘Ingest local METAR files’ solution

This solution ingests a folder tree containing METAR files and loads the data into a database. The ‘Ingest local METAR files’ job is shown below. First, a list of all METAR files in all subfolders is created. That list is then used to determine what folders need to be processed. For each folder in the list, the ‘Ingest METAR folder’ job is called.
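
The first two steps (list every file, then derive the folders to process) can be sketched in Python as below. The function name and the `*.TXT` file pattern are assumptions for illustration; the actual solution performs these steps with PDI transformations.

```python
from pathlib import Path
from typing import List


def metar_folders(root: Path, pattern: str = "*.TXT") -> List[Path]:
    """Return the distinct folders under root that contain at least one METAR file."""
    # Step 1: list all matching files in root and all of its subfolders.
    files = [p for p in root.rglob(pattern) if p.is_file()]
    # Step 2: reduce the file list to one entry per containing folder.
    return sorted({p.parent for p in files})
```

Each folder returned would then be handed to the per-folder ingestion job in turn.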

The ‘Determine METAR folders’ transformation is shown below.

The ‘Ingest METAR folder’ job is shown below. A list of the METAR files in the folder is first made. For each file, the ‘Ingest METAR file’ job is called. After ingestion of all the METAR files in the folder is done, the METAR and log files are zipped and the folder is deleted if it has no other files.
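
The per-folder logic can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the file patterns (`*.TXT`, `*.log`) are placeholders, the archive is assumed to be written outside the folder, the originals are assumed to be removed once archived, and `ingest_file` is a hypothetical callback standing in for the ‘Ingest METAR file’ job.

```python
import zipfile
from pathlib import Path
from typing import Callable


def ingest_folder(folder: Path, archive: Path,
                  ingest_file: Callable[[Path], None]) -> None:
    """Ingest each METAR file, archive METAR and log files, then drop the folder if empty."""
    metar_files = sorted(folder.glob("*.TXT"))
    for path in metar_files:
        ingest_file(path)  # stand-in for the per-file ingestion job

    to_archive = metar_files + sorted(folder.glob("*.log"))
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in to_archive:
            zf.write(path, arcname=path.name)

    # Assumption: originals are removed only after the archive has been closed.
    for path in to_archive:
        path.unlink()

    # Delete the folder only when no other files remain in it.
    if not any(folder.iterdir()):
        folder.rmdir()
```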

The ‘Get METAR filenames’ transformation is shown below.

The ‘Ingest METAR file’ job is shown below. The ‘Ingest METAR file’ transformation invoked by the job is the same one used in the solution discussed in the previous section.

4. 09/2009 – 03/2010: SharePoint Training

Between September 2009 and March 2010, I was enrolled in the SharePoint Training track of SetFocus’ Master’s Program. This track is an intensive SharePoint training experience designed to prepare students for development opportunities with Microsoft’s SharePoint 2007 product.

As part of the training, students were expected to complete two projects simulating real-world projects. The first project was a student project which each student completed alone. The second project was a team project where all students collaborated in the completion of the project.

4.1 SharePoint Student Project

The student project was for a fictitious towing company called Acme. We had to design and establish a SharePoint Solution Management Portal to help manage all of the SharePoint solutions created for the company. The portal, located on the company intranet, was to be used by the company’s SharePoint developers to organize and manage their solution projects. The following picture is a screenshot of the portal I developed.

The top section titled ‘Create Solution Sites’ is an instance of a ‘CreateSolutionSiteWebPart’ web part I developed. It accepts site creation data from the user and uses it to create a web site. At the top of the web page, there is a collection of tabs, some of which have names of the form ‘Test Site ##’. These are sites created by this web part. Below this web part, in the section titled ‘Solution Management’, is an instance of another web part I created named ‘SolutionManagementWebPart’. This web part uses SharePoint’s SPGridView and SPDataSource controls to display a grid view of all solution items in the ‘Solution List’ SharePoint list. After this web part is a SharePoint list created from a ‘Change Management List Definition’ I developed. Next is an instance of the Content Query Web Part I developed to find and display all Solution items in the site collection.

4.2 SharePoint Team Project

The team project was also for a fictitious towing company called Acme. We had to design and establish a SharePoint application to support the company’s towing providers. We were expected to perform the following:

Create a SharePoint Application with internal as well as extranet visibility

Develop an InfoPath document library

Develop custom workflows

Implement Content Management

I developed the following InfoPath form which was meant for the user to submit purchase order data to a form library.

Below is another InfoPath form I developed similarly for the invoice form library.

I developed an ‘Invoice’ ECB menu item for the purchase order list. The menu item is shown open below. Selecting the ‘Invoice’ option led to the display of an ASP.NET application page (shown next) where the user could review purchase order data retrieved from the purchase orders list, enter invoice data, and submit it to the invoice form library.

The invoice list below shows invoice list items created by the process described above.

5. 2008 – 2009: Modeling with XSD, AIXM5, GML, UML

The TFR Automation team in the FAA is responsible for managing the enhancement and support of the TFR Automation system. This system facilitates the tracking of TFRs (Temporary Flight Restrictions). The system’s web interface is heavily used and is consulted by pilots before flights. As a member of this team, I implemented compliance with the AIXM5 standard for data interchange. AIXM is an XML standard for aeronautical information. It is an extension of another XML standard called GML, which supports the exchange of geographical information.

For the implementation, I had to develop extensions to the AIXM5 standard. To do this, I first had to understand the requirements of the TFR Automation system in detail. Next, the requirements had to be modeled in UML using AIXM5 and GML constructs. The UML model was then converted by scripts into XML schemas (XSD). These schemas were used to develop serialization/de-serialization code that reads and writes XML documents complying with the schemas. This code was then invoked by the TFR Automation code for reading and writing AIXM5-compliant documents.
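
To give a feel for what serialization/de-serialization code of this kind does, here is a small Python round-trip sketch using only the standard library. It is not the project's code and does not validate against a schema. The extension namespace, the `Restriction` class and the element layout are hypothetical; the GML namespace URI is the GML 3.2 one, and the GML version is itself an assumption here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

GML_NS = "http://www.opengis.net/gml/3.2"  # GML 3.2 namespace (assumed version)
TFR_NS = "urn:example:tfr-extension"       # hypothetical extension namespace


@dataclass
class Restriction:
    identifier: str
    description: str


def to_xml(r: Restriction) -> str:
    """Serialize a Restriction into a namespaced XML document."""
    root = ET.Element(f"{{{TFR_NS}}}Restriction", {f"{{{GML_NS}}}id": r.identifier})
    ET.SubElement(root, f"{{{GML_NS}}}description").text = r.description
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Restriction:
    """De-serialize a document produced by to_xml back into a Restriction."""
    root = ET.fromstring(text)
    if root.tag != f"{{{TFR_NS}}}Restriction":
        raise ValueError(f"unexpected root element: {root.tag}")
    description = root.findtext(f"{{{GML_NS}}}description", default="")
    return Restriction(root.attrib[f"{{{GML_NS}}}id"], description)
```

Keeping the reading and writing functions as a matched pair makes a round-trip check (`from_xml(to_xml(x)) == x`) a cheap regression test whenever the schema changes.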

The screenshot below shows the UML model developed in Rational Rose for the project.

This UML model was converted by scripts to the following schema shown in pictorial format in the XMLSpy tool.

The text for this XML schema is shown below.

An XML document instance of this schema is shown below with the specification of the location of the schema highlighted in yellow.

A screenshot showing web pages at the TFR web site is shown below. The red circle encircles a button labeled ‘AIXM5’ which links to the XML document shown earlier containing the TFR information.

6. 2004: Job Queue Framework and Data Extractors

A common requirement in the Airspace Information Management (AIM) laboratory of the FAA was the extraction of data from various data stores. Typically, these extractions would take significant amounts of time to complete and it was not convenient or reasonable to expect the user to wait for their completion after they were initiated. In response, I designed and implemented a Job Queue framework which was used as the basis for various data extraction systems in the lab. The framework was developed using .NET 1.1, C#, VB.NET, Oracle ODP.NET and ASP.NET. I also developed a couple of data extractors (‘Offload Extractor’ for traffic data, ‘Obstacles Extractor’ for obstacles data). Other data extractors were developed using this framework by other developers.
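
The core idea of a job queue (accept the request immediately, run the long extraction in the background, and let the user check on it later) can be sketched in a few lines of Python. This in-memory sketch is an illustration only: the class and method names are hypothetical, and it does not reflect the actual framework, which was built with .NET and Oracle.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Job:
    job_id: int
    work: Callable[[], Any]
    status: str = "queued"
    result: Any = None


class JobQueue:
    """Accept jobs immediately and run them on a background worker thread."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[Job]" = queue.Queue()
        self._jobs: Dict[int, Job] = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, work: Callable[[], Any]) -> int:
        """Queue a job and return its id without waiting for it to finish."""
        with self._lock:
            job = Job(len(self._jobs) + 1, work)
            self._jobs[job.job_id] = job
        self._pending.put(job)
        return job.job_id

    def status(self, job_id: int) -> str:
        return self._jobs[job_id].status

    def result(self, job_id: int) -> Any:
        return self._jobs[job_id].result

    def wait_all(self) -> None:
        """Block until every queued job has been processed."""
        self._pending.join()

    def _run(self) -> None:
        while True:
            job = self._pending.get()
            job.status = "running"
            try:
                job.result = job.work()
                job.status = "done"
            except Exception as exc:  # record the failure instead of stopping the worker
                job.result = exc
                job.status = "failed"
            finally:
                self._pending.task_done()
```

A caller would submit an extraction with `submit(...)`, keep the returned id, and poll `status(...)` from a queue page rather than holding a request open while the extraction runs.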

6.1 Design

Some diagrams from the design document which I authored are shown below. The first diagram is an ‘Architectural Overview’ diagram.

The ‘Deployment Diagram’ is shown below.

The object model for the Offload Extractor system is shown below.

6.2 Implementation

Some screenshots of actual web pages currently in production for the Offload Extractor system are shown below.

The first screenshot is that of the main menu page of the Offload Extractor.

The screenshot below is for the web page which shows the Offload Extractor job queue.

The web page showing details for an Offload Extractor job is shown below.
