Page 1: ACNCN 2012 17th 18th March Libre

242. Preventing Cross-site Scripting (XSS) Attacks Using One-Time Password in Cloud Computing
M. Narasimha Raju, K. Rama Devi, P.L.V.V. Naga Lakshmi
84

243. Protocol Sensor Based Embedded Technique for Efficient Water-Saving Agriculture
Naresh Kumar Reddy, Pradeep G, Rajesh Chandra Gogineni, A. S. N. Chakravarthy
84

244. High Throughput DA-Based DCT With High Accuracy Error-Compensated Adder Tree
S.V.S. Gowtham Reddi, Mrs. E. Chitra
85

245. Impact of Mobility on Route Failure in MANET
Ms. Rasika Vispute, Ms. Akriti Bhat, Ms. Anjali Thite
85

246. Managing Effective Information Security - Unified and Proposed Framework
Usha Bala Varanasi, Suvarna Kumar G, Sumit Gupta
85

247. Implementation of Meteorological Data Acquisition System Using ARM9 and CAN Bus
D. Anusha
85

248. A Novice Approach to Identity Based Page Redirection Anti-Phishing Technique
K. Krishna Kiran, Dr. Ch. Rupa, Dr. P. S. Avadhani
86

249. Effective Recovery Technique in Visual Cryptography
V. Srinivas, Dr. E.V. Krishna Rao, Ch. Madhava Rao, K. Anitha
86

250. An Introduction to Quantum Computing and a Holographic Grating Based Approach to Meet Its Data Storage Requirements: An Analytical Approach
L. SaiRam, Dheeraj
87

251. Anonymous Approach for File Sharing
R.R. Srikanth, P. Ranjana
87

252. Clock-Tree Power Reduction Technique
S. Venkatesh, Mrs. T. Gowri
87

253. Comparison of Various Vulnerability Analysis Methods in Service Oriented Architecture
D. Gayathri, R.V. Lakshmi Priya
88

254. Data Warehousing Concept Using ETL Process for Informatica Mapping Designer
K. Srikanth, N.V.E.S. Murthy, J. Anitha
88


Data Warehousing Concept Using ETL Process For Informatica Mapping Designer

K. Srikanth, M.Tech (Ph.D), Andhra University, [email protected]
N.V.E.S. Murthy, Professor, Andhra University, dr [email protected]
J. Anitha, M.Tech (Ph.D), Andhra University, [email protected]

Abstract: The topic of data warehousing encompasses architectures, algorithms, and tools for bringing together selected data from multiple databases or other information sources into a single repository, called a data warehouse, suitable for direct querying or analysis. Data warehousing is a single, unified enterprise data integration platform that allows companies and government organizations of all sizes to access, discover, and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise at any speed. A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets. When the Integration Service runs a session, it uses the instructions configured in the mapping to read, transform, and write data in an ETL (Extract, Transform and Load) process.

Keywords: ETL; Metadata; Mapping; Data Warehouse

INTRODUCTION

1. STUDY AND ANALYSIS OF ETL

The data of a data warehouse come from different business systems. They are initially extracted from the raw data of the various source systems, then pass through a series of filtering and converting steps, and are finally loaded into the data warehouse. This process is known as the ETL process. Data Warehouse (DW) systems are used by decision makers to analyze the status and the development of an organization. DWs are based on large amounts of data integrated from heterogeneous sources into multidimensional schemata which are optimized for data access in a way that comes natural to human analysts. A data warehouse is a single source for the key corporate information needed to enable business decisions. An application that updates the database is called an on-line transaction processing (OLTP) application. An application that issues queries to the read-only database is called a decision support system (DSS). A data mart is a subset of the data warehouse that may make it simpler for users to access key corporate data. The warehouse obtains results from the information sources and performs appropriate translation, filtering, and merging of the information before delivering it to the user or application.


Figure 1: Informatica Server Architecture

The architecture of conventional ETL is shown in Fig. 1. The phases of extract, transform, and load are executed in one single process. Under the framework of conventional ETL, the ETL process is defined [1] as follows: for each data source, develop and compile a program or script; retrieve records from the database; after the extract, exchange the data according to the users' requirements; load the data into the target data warehouse; and process the records piece by piece until the end of the source database. The integrator is responsible for installing the information in the warehouse, which may include filtering the information, summarizing it, or merging it with information from other sources. In order to properly integrate new information into the warehouse, it may be necessary for the integrator to obtain further information from the same or different information sources. The architecture and basic functionality we have described are more general than those provided by most commercial data warehousing systems. In particular, current systems usually assume that the sources and the warehouse subscribe to a single data model (normally relational) and that propagation of information from the sources to the warehouse is performed as a batch process.

1.2 Extract, Transform and Load is used to populate a data warehouse

A. Extract is where data is pulled from source systems: SQL connect over the network, flat files, or transaction messaging.

B. Transformation can be the most complex part of data warehousing: convert text to numbers and apply business logic in this stage.

C. Load is where data is loaded into the data warehouse: sequential or bulk loading.
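The three phases above can be sketched as a minimal pipeline. This is an illustrative sketch, not Informatica code: the file name emp.csv, the SAL/COMM/ENAME columns, and the SQLite target are all assumptions made for the example.

```python
import csv
import sqlite3

def extract(path):
    # Extract: pull rows from a flat-file source
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    # Transform: convert text to numbers and apply business logic
    out = []
    for r in rows:
        sal = float(r["SAL"])
        comm = float(r["COMM"]) if r["COMM"] else 0.0  # empty commission counts as 0
        out.append((r["ENAME"], sal + comm))
    return out

def load(rows, conn):
    # Load: bulk-insert the transformed rows into the warehouse table
    conn.execute("CREATE TABLE IF NOT EXISTS emp_target (ename TEXT, total REAL)")
    conn.executemany("INSERT INTO emp_target VALUES (?, ?)", rows)
    conn.commit()
```

Each phase is an independent function, mirroring the way a mapping separates source reads, transformation logic, and target writes.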


2. Power Center Client Applications:

Figure 2 : Use a pass-through mapping

The Source Qualifier can be termed a default transformation, which comes up when we select a source for the mapping. It provides a default SQL query, which can be generated by clicking Generate SQL (Properties tab -> SQL Query attribute). You can have multiple sources in a mapping, but only one Source Qualifier per mapping, and you can enter any SQL statement supported by your source database, with proper joins to the other sources. When you drag a source (representing data read from relational or flat-file sources) into the Mapping Designer workspace [2] (Fig. 2), the Designer adds an instance of the source definition to the mapping, and one Source Qualifier automatically comes with it. If we use multiple sources, then multiple Source Qualifiers automatically pop up (as shown in the screenshot below), which requires us to delete the additional Source Qualifiers [3].
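A Source Qualifier override is ordinary SQL run against the source database. The sketch below shows what such an override looks like for two sources read through one qualifier; the EMP/DEPT tables and the in-memory SQLite database (standing in for Oracle) are assumptions for the example.

```python
import sqlite3

# Stand-in source database with two related source tables
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP (EMPNO INT, ENAME TEXT, DEPTNO INT);
CREATE TABLE DEPT (DEPTNO INT, DNAME TEXT);
INSERT INTO EMP VALUES (7369, 'SMITH', 20), (7499, 'ALLEN', 30);
INSERT INTO DEPT VALUES (20, 'RESEARCH'), (30, 'SALES');
""")

# The override query: one qualifier reading both sources via a join,
# with the SELECT list in the same order as the qualifier's ports
OVERRIDE_SQL = """
SELECT E.EMPNO, E.ENAME, D.DNAME
FROM EMP E JOIN DEPT D ON E.DEPTNO = D.DEPTNO
ORDER BY E.EMPNO
"""
rows = conn.execute(OVERRIDE_SQL).fetchall()
```

Joining in the override lets a single Source Qualifier replace the extra qualifiers that would otherwise be created for each source.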

2.1 Mapping (Fig. 3): Logically Defines the ETL Process

Figure 3: Use a pass-through mapping & Expression

A mapping: 1. reads data from sources, 2. applies transformation logic to the data, and 3. writes the transformed data to targets.

3. Mapping Design Process:

Table 1: Oracle SQL Query On EMP Table


Describe an Oracle Table

Figure 4: Creating ODBC connection on Source creation

Figure 5: Creating ODBC connection on Target creation

Figs. 4 and 5. In Windows, in the System DSN tab of the ODBC Data Source Administrator, use a user ID granted DBA and the default TNS name ORCL. Create ODBC connections named scott_source and kanth_target.


Figure 6: Source Table in the Source Analyzer

Fig 6. When you add a relational or flat-file source definition to a mapping, you need to connect it to a Source Qualifier transformation [4]. The Source Qualifier transformation represents the records that the Informatica server reads when it runs a session.

Figure 7: Target Table in the Target Designer

Fig 7. Target definitions define the structure of tables in the target database, or the structure of file targets the PowerCenter Server creates when you run a workflow. If you add a target definition to the repository that does not exist in a relational database, you need to create the target tables in your target database [5]. You do this by generating and executing the necessary SQL code within the Warehouse Designer.
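Generating and executing the target-table DDL amounts to building a CREATE TABLE statement from the target definition's column list and running it against the target database. A rough sketch, with the EMP_TARGET columns assumed and SQLite standing in for the target database:

```python
import sqlite3

def generate_target_ddl(table, columns):
    # Build CREATE TABLE DDL from a target definition's (name, type) column list
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return f"CREATE TABLE {table} ({cols})"

conn = sqlite3.connect(":memory:")
ddl = generate_target_ddl(
    "EMP_TARGET",
    [("EMPNO", "INTEGER"), ("ENAME", "TEXT"), ("TOTAL_SAL", "REAL")],
)
conn.execute(ddl)  # execute the generated SQL against the target database
```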


Figure 8: The logic ports of the Expression Transformation

Fig 8. In the following steps, you will copy the EMPLOYEES source definition into the Warehouse Designer to create the target definition [6]. Then, you will modify the target definition by deleting and adding columns to create the definition you want.
1. In the Designer, switch to the Warehouse Designer.
2. Click and drag the EMPLOYEES source definition from the Navigator to the Warehouse Designer workspace. The Designer creates a new target definition, EMPLOYEES, with the same column definitions as the EMPLOYEES source definition and the same database type. Next, you will modify the target column definitions.
3. Double-click the EMPLOYEES target definition to open it. The Edit Tables dialog box appears.
4. Click Rename and name the target definition EMP_TARGET.
5. Click the Columns tab.

Figure 9: Creating Variable Port Logic


We have two columns, salary and commission, and the commission column contains some null values [7]. For example (Fig 9), if the commission is null while the salary column has a value, we can write the following statement.

IIF(ISNULL(COMM),SAL,SAL+COMM)
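The semantics of IIF(ISNULL(COMM), SAL, SAL+COMM) can be expressed as the following Python function; this is a sketch of the behavior, not Informatica expression code.

```python
def total_pay(sal, comm):
    # IIF(ISNULL(COMM), SAL, SAL + COMM):
    # a null commission contributes nothing instead of making the result null
    return sal if comm is None else sal + comm
```

Without the ISNULL guard, adding a null commission would propagate the null into the output port.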

Figure 10: Creating output port logic

Fig 10. The expression below computes tax on the EMP table: for salaries where V_TOTAL is above 3000, the tax rate is 0.25, and where V_TOTAL is 3000 or below, the rate is 0.15.

IIF(V_TOTAL>3000, V_TOTAL*0.25, V_TOTAL*0.15)
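In Python, the variable-port and output-port logic of Figures 9 and 10 combines into one function (thresholds and rates taken from the expressions above; treating a null commission as 0 per the earlier ISNULL logic):

```python
def tax(sal, comm):
    # Variable port: V_TOTAL = SAL plus COMM, with a null commission counted as 0
    v_total = sal + (comm or 0)
    # Output port: IIF(V_TOTAL > 3000, V_TOTAL * 0.25, V_TOTAL * 0.15)
    return v_total * 0.25 if v_total > 3000 else v_total * 0.15
```

The variable port holds the intermediate total so the threshold test and both rate branches reuse one computed value.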

Figure 11: Mapping Designer Process

Open the Mapping Designer.

Choose Mappings > Create, or drag a repository object into the workspace.

Enter a name for the new mapping and click OK.

Fig 11 & Fig 12. Map the columns from the source tables through the Source Qualifier, and put the SQL query under the Properties tab's SQL Query attribute. The SQL query should contain all the fields [1], joins, and table names required for getting the required data.

Under the Ports tab, the ports (fields) should be in the same order as in the SQL [3], and the data types should be the same as in the source table.

Now create an Expression transformation: go to Transformation (under the toolbar), select Create, and select Expression from the dropdown.

Map the ports from the Source Qualifier to the Expression transformation.

Map the ports to the target table.

Figure 12: Arrange the Mapping Designer

Figure 13: Workflow process

Fig 13. In the diagram above, the Workflow Designer creates a workflow connected to a session; this session internally performs the read and write operations defined in the Mapping Designer.


Figure 14: Executing the Informatica PowerCenter Workflow Monitor

Fig 14. The Informatica PowerCenter Workflow Monitor displays the result of the executing process: success or failure.

Table 2 : Display the Designer Preview Data

4. Result: The tax is calculated for all 14 employees using the Expression Transformation logic ports described above (the variable port and output port of Figure 9 and Figure 10), and the result is displayed in the Designer's Preview Data. Source data: Table 1. Target data: Table 2.


5. CONCLUSIONS AND FUTURE WORK

Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for the extraction of data from several sources. In this paper, we have focused on the problem of the definition of ETL activities and provided foundations for their conceptual representation. The phases of extract, transform, and load are executed in one single process; under the framework of conventional ETL, the ETL process is defined [7] per data source: develop and compile a program or script, and retrieve records from the database. In this paper, a useful engineering trade study for ETL tool selection was developed. In the end, all three initial objectives were achieved: comprehensive ETL criteria were identified, testing procedures were developed, and this work was applied to commercial ETL tools. The study covered all major aspects of ETL usage and can be used to effectively compare and evaluate various ETL tools.

REFERENCES

[1] I. William, S. Derek, and N. Genia, DW 2.0: The Architecture for the Next Generation of Data Warehousing. Burlington, MA: Morgan Kaufmann, 2008, pp. 215-229.
[2] R. J. Davenport, "ETL vs. ELT: A Subjective View," Insource IT Consulting Ltd., U.K., September 2007. [Online]. Available: http://www.insource.co.uk/pdf/ETL_ELT.pdf
[3] T. Jun, C. Kai, F. Yu, and T. Gang, "The Research and Application of ETL Tools in Business Intelligence Project," in Proc. International Forum on Information Technology and Applications, IEEE, 2009, pp. 620-623.
[4] Informatica PowerCenter. [Online]. Available: www.informatica.com/products/data integration/power center/default.htm
[5] Teradata. [Online]. Available: www.teradata.com
[6] Sun SPARC M9000 Processor. [Online]. Available: http://www.sun.com/servers/highend/m9000/
[7] L. Troy and C. Pydimukkala, "How to Use PowerCenter with Teradata to Load and Unload Data," Informatica Corporation. [Online]. Available: www.myinformatica.com