Informatica Big Data Trial Sandbox for Hortonworks Quick Start
2014 Informatica Corporation. No part of this document may be
reproduced or transmitted in any form, by any means (electronic,
photocopying, recording or otherwise) without prior consent of
Informatica Corporation. All other company and product names may be
trade names or trademarks of their respective owners and/or
copyrighted materials of such owners.
Abstract

This document describes how to use Informatica Big Data Trial Sandbox for Hortonworks to run sample mappings based on common big data use cases. After you understand the sample big data use cases, you can create and run your own big data mappings.

Supported Versions

Informatica 9.6.1 HotFix 1
Table of Contents

Installation and Configuration Overview
Step 1. Download the Software
    Download and Install VMware Player
    Register at Informatica Marketplace
    Download the Big Data Trial Sandbox for Hortonworks Files
Step 2. Start the Big Data Trial Sandbox for Hortonworks Virtual Machine
Step 3. Configure and Install the Big Data Trial Sandbox for Hortonworks Client
    Configure the Domain Properties on the Windows Machine
    Configure a Static IP Address on the Windows Machine
    Install the Big Data Trial Sandbox for Hortonworks Client
Step 4. Access the Big Data Trial Sandbox for Hortonworks Sandbox
    Apache Ambari
    Informatica Administrator
    Informatica Developer
Big Data Trial Sandbox for Hortonworks Samples
    Running Common Tutorial Mappings on Hadoop
    Performing Data Discovery on Hadoop
    Performing Data Warehouse Optimization
    Processing Complex Files
        Reading and Parsing Complex Files
        Writing to Complex Files
    Working with NoSQL Databases
        HBase
Troubleshooting
Installation and Configuration Overview

Big Data Trial Sandbox for Hortonworks consists of a virtual machine component and a client component. Use Big Data Trial Sandbox for Hortonworks to run Informatica mappings on a Hortonworks virtual machine configured for the Hadoop environment.

The Big Data Trial Sandbox for Hortonworks virtual machine has the following components:
- Informatica 9.6.1 services
- Hortonworks 2.1.3
- Sample data
- Sample mappings for common big data use cases

Note: The Informatica Big Data Trial Sandbox for Hortonworks installation and configuration document is available on the desktop of the virtual machine.

The Big Data Trial Sandbox for Hortonworks client installs the libraries and binaries required for the Informatica Developer (Developer tool) client.
Step 1. Download the Software

Before you download the Big Data Trial Sandbox for Hortonworks software, you must download and install VMware Player. Then, register at Informatica Marketplace and download the Big Data Trial Sandbox for Hortonworks virtual machine and client.
Download and Install VMware Player

To play the Big Data Trial Sandbox for Hortonworks virtual machine, download and install VMware Player.

Download VMware Player from the following VMware website:
https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/6_0

The software available for download at the referenced links belongs to a third party or third parties, not Informatica Corporation. The download links are subject to the possibility of errors, omissions or change. Informatica assumes no responsibility for such links and/or such software, disclaims all warranties, either express or implied, including but not limited to, implied warranties of merchantability, fitness for a particular purpose, title and non-infringement, and disclaims all liability relating thereto.

You must have at least 10 GB of RAM and 30 GB of disk space available on the machine on which you download and install VMware Player.
Register at Informatica Marketplace

Create an account at Informatica Marketplace. Then, log in to Informatica Marketplace to download the Big Data Trial Sandbox for Hortonworks client and server software.

You can access Informatica Marketplace here: https://marketplace.informatica.com/bdehortonworks.

When you register with Informatica Marketplace, you get a free 60-day trial of Big Data Trial Sandbox for Hortonworks.
Download the Big Data Trial Sandbox for Hortonworks Files

After you log in to Informatica Marketplace, download the Big Data Trial Sandbox for Hortonworks virtual machine and client. Download the following files:

BigDataTrialSandboxForHortonworks.ova
    Includes the Big Data Trial Sandbox for Hortonworks virtual machine. Download the file to the machine on which VMware Player is installed.

961_BigDataTrial_Client_Installer_win32_x86.zip
    Includes the compressed Big Data Trial Sandbox for Hortonworks client. Download the file to an Informatica client installation directory on a 32-bit Microsoft Windows machine. Extract the files in the client zip file to a directory on your local machine. For example, extract the files to the C:\ drive on your machine.
Step 2. Start the Big Data Trial Sandbox for Hortonworks Virtual Machine

Open the Big Data Trial Sandbox for Hortonworks virtual machine in VMware Player.

1. Go to the directory where you downloaded BigDataTrialSandboxForHortonworks.ova and double-click the file. VMware Player opens and starts the BigDataTrialSandboxForHortonworks virtual machine.
2. Optionally, in VMware Player, click Browse > Import to extract the contents of the virtual machine to the selected location and start the virtual machine. Then, click Play virtual machine.

You are logged in to the virtual machine. The Informatica services and Hadoop services start automatically.
Step 3. Configure and Install the Big Data Trial Sandbox for Hortonworks Client

Before you run the client installer, you must configure the domain properties so that the Big Data Trial Sandbox for Hortonworks client can communicate with the virtual machine. Optionally, to avoid updating the IP address of the virtual machine each time it changes, you can configure a static IP address for the virtual machine. Then, you can run the silent installer to install the Big Data Trial Sandbox for Hortonworks client.
Configure the Domain Properties on the Windows Machine

Configure the IP address and host name of the virtual machine for the Developer tool.

1. Click Applications > System Tools > Terminal to open a terminal in which to run commands.
2. Run the ifconfig command to find the IP address of the virtual machine. The ifconfig command returns all interfaces on the virtual machine. Select the eth interface to get the value for the IP address.

   The following image shows the ifconfig command with the return value for inet addr highlighted with a red arrow.

3. Add the IP address and the default host name hdp-bde-demo to the hosts file on the Windows machine on which you install the Developer tool. The hosts file is located in the following location: C:\Windows\System32\drivers\etc\hosts. Add a line with the IP address followed by the host name. For example, add the following line:

   192.168.159.159 hdp-bde-demo
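The hosts entry in step 3 can also be appended from a shell. A minimal sketch, where HOSTS_FILE is a hypothetical local stand-in for the Windows hosts file path:

```shell
# Append the virtual machine entry to the hosts file. On the Windows
# machine the real path is C:\Windows\System32\drivers\etc\hosts and
# the prompt must be elevated; HOSTS_FILE is a stand-in for illustration.
HOSTS_FILE="${HOSTS_FILE:-./hosts}"
echo "192.168.159.159 hdp-bde-demo" >> "$HOSTS_FILE"
grep "hdp-bde-demo" "$HOSTS_FILE"   # confirm the entry was written
```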
Configure a Static IP Address on the Windows Machine

Optionally, to avoid updating the IP address in the hosts file each time the IP address of the virtual machine changes, configure a static IP address for the virtual machine.

1. Click Applications > System Tools > Terminal to open a terminal in which to run commands.
2. Run the ifconfig command to find the IP address and hardware ethernet address of the virtual machine. The ifconfig command returns all interfaces on the virtual machine. Select the eth interface to get the value for the hardware ethernet address.

   The following image shows the ifconfig command with the return values for inet addr and HWaddr outlined with red boxes.

3. Edit vmnetdhcp.conf to add the values for the host name, IP address, and hardware ethernet address. vmnetdhcp.conf is located in the following directory: C:\ProgramData\VMware

   Add the following entry before the #END tag at the end of the file:

   host <host name> {
       hardware ethernet <hardware ethernet address>;
       fixed-address <IP address>;
   }

   The following sample code shows how to set a static IP address:

   host hdp-bde-demo {
       hardware ethernet 00:0C:29:10:F9:4C;
       fixed-address 192.168.159.159;
   }

4. Add the IP address and the default host name hdp-bde-demo to the hosts file on the Windows machine on which you install the Developer tool. The hosts file is located in the following location: C:\Windows\System32\drivers\etc\hosts. Add a line with the IP address followed by the host name. For example, add the following line:

   192.168.159.159 hdp-bde-demo

5. Shut down the virtual machine.
6. Restart the host machine and the virtual machine.
Install the Big Data Trial Sandbox for Hortonworks Client

To install the client libraries and binaries, perform the following steps:

1. Go to the directory that contains the client installation files.
2. Click silentInstall.bat to run the silent installer. The silent installer runs in the background. The process can take several minutes. The command window displays a message that indicates that the installation is complete.

You can find the Informatica_Version_Client_InstallLog.log file in the following directory: C:\Informatica\9.6.1_BDE_Trial\

After the installation process is complete, you can launch the Big Data Trial Sandbox for Hortonworks client.
Step 4. Access the Big Data Trial Sandbox for Hortonworks Sandbox

You can log in to Apache Ambari to install, configure, and manage Hadoop clusters. You can log in to Informatica Administrator (the Administrator tool) to monitor Informatica services and the status of mapping jobs. You can log in to the Developer tool to run the sample mappings based on common big data use cases. You can also create your own mappings and run them from the Developer tool.

For more information about how to run mappings in the Developer tool, see the Informatica Big Data Trial Sandbox for Hortonworks User Guide.
Apache Ambari

You can log in to Ambari from the following URL: http://hdp-bde-demo:8080/#/login. Enter the following credentials to log in to Ambari:

User name: admin
Password: admin

Informatica Administrator

You can access the Administrator tool from the following URL: http://hdp-bde-demo:6005. Enter the following credentials to log in to the Administrator tool:

User name: Administrator
Password: Administrator

Informatica Developer

You can start the Developer tool client from the Windows Start menu. Enter the following credentials to connect to the Model repository Infa_mrs:

User name: Administrator
Password: Administrator
Big Data Trial Sandbox for Hortonworks Samples

The Big Data Trial Sandbox for Hortonworks provides samples based on common Hadoop use cases. The Big Data Trial Sandbox for Hortonworks includes samples for the following use cases:

- Running common tutorial mappings on Hadoop
- Performing data discovery on Hadoop
- Performing data warehouse optimization
- Processing complex files
- Working with NoSQL databases

After you run the mappings in the Developer tool, you can monitor the mapping jobs in the Administrator tool.
Running Common Tutorial Mappings on Hadoop

Big Data Trial Sandbox for Hortonworks provides sample tutorial mappings that read text files and count how often words occur. The word count mappings appear in the Hadoop_tutorial project in the Developer tool. After you open a mapping, you can right-click the mapping to run it on Hadoop.

The Hadoop_tutorial project contains the following sample mappings:

m_DataLoad_1
    m_DataLoad_1 loads data from the READ_WordFile1 flat file on your machine to the WRITE_HDFSWordFile1 flat file on HDFS. The following image shows the mapping m_DataLoad_1.

m_DataLoad_2
    m_DataLoad_2 loads data from the READ_WordFile2 flat file on your machine to the WRITE_HDFSWordFile2 file on HDFS. The following image shows the mapping m_DataLoad_2.

m_WordCount
    m_WordCount reads two source files from HDFS, parses the data, and writes the output to a flat file on HDFS. The following image shows the mapping m_WordCount.
The mapping contains the following objects:
- Sources. HDFS files.
- Expression transformations. Remove the carriage return and newline characters from a word.
- Union transformation. Forms a collective data set.
- Aggregator transformation. Counts the occurrence of each word in the mapping.
- Target. Flat file on HDFS.
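The same word-count logic (strip line-ending characters, union the sources, aggregate by word) can be sketched with standard shell tools. The two files below are illustrative stand-ins for the HDFS word files, not the sample data:

```shell
# Mirror the m_WordCount flow: clean line endings, union two inputs,
# and count how often each word occurs.
printf 'apple banana\r\napple\n' > words1.txt   # stand-in source files
printf 'banana cherry\n' > words2.txt
cat words1.txt words2.txt \
  | tr -d '\r' \
  | tr -s ' ' '\n' \
  | sort | uniq -c
# apple and banana each appear twice, cherry once
```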
Performing Data Discovery on Hadoop

Big Data Trial Sandbox for Hortonworks provides samples that you can use to discover data on Hadoop and run and create profiles on the data. After you open a profile, you can right-click the profile to run it. Running a profile on any data source in the enterprise gives you a good understanding of the strengths and weaknesses of its data and metadata.

The DataDiscovery project in the Developer tool includes the following samples that you can use to perform data discovery on Hadoop:

- CustomerData. Flat file data source that includes customer information.
- Profile_CustomerData. Profiles the customer data to determine the characteristics of the customer data.

Use the samples to understand how to perform data discovery on Hadoop. You want to discover the quality of the source customer data in the CustomerData flat file before you use the customer data as a source in a mapping. You should verify the quality of the customer data to determine whether the data is ready for processing. You can run the Profile_CustomerData profile based on the source data to determine the characteristics of the customer data.

The profile determines the characteristics of columns in a data source, such as value frequencies, unique values, null values, patterns, and statistics. The profile determines the following characteristics of the source data:

- The number of unique and null values in each column, expressed as a number and a percentage.
- The patterns of data in each column and the frequencies with which these values occur.
- Statistics about the column values, such as the maximum value length, minimum value length, first value, and last value in each column.
- The data types of the values in each column.

The following figure shows the profile results that you can analyze to determine the characteristics of the customer data.
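For intuition, similar column statistics can be approximated on a small delimited file with awk. The sample file and the choice of the name column below are hypothetical, not the CustomerData source:

```shell
# Approximate profile results for one column of a CSV: null count,
# distinct-value count, and minimum/maximum value length.
printf 'id,name\n1,Ann\n2,\n3,Bob\n4,Ann\n' > customers.csv   # hypothetical sample
awk -F, 'NR > 1 {
  v = $2
  if (v == "") { nulls++ }
  else {
    seen[v] = 1
    if (minlen == "" || length(v) < minlen) minlen = length(v)
    if (length(v) > maxlen) maxlen = length(v)
  }
}
END {
  for (k in seen) distinct++
  printf "nulls=%d distinct=%d minlen=%d maxlen=%d\n", nulls, distinct, minlen, maxlen
}' customers.csv
# nulls=1 distinct=2 minlen=3 maxlen=3
```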
Performing Data Warehouse Optimization

You can optimize an enterprise data warehouse with the Hadoop system to store more terabytes of data cheaply in the warehouse. Big Data Trial Sandbox for Hortonworks provides samples that demonstrate how to perform data warehouse optimization on Hadoop. The DataWarehouseOptimization project in the Developer tool includes samples that you can use to perform data warehouse optimization on Hadoop.

Use the samples to analyze customer portfolios by processing the records that have changed in a 24-hour time period. You can offload the data to Hadoop, find the customer records that have been inserted, deleted, and updated in the last 24 hours, and then update those records in your data warehouse. You can capture these changes even if the number of columns changes or if the keys change in the source files.

To capture the changes, use the Data Warehouse Optimization workflow. The workflow contains mappings that move the data from local flat files to HDFS, identify the changes, and then load the final output to flat files.

The following image shows the sample Data Warehouse Optimization workflow.
To run the workflow, enter the following command from the command line:

./infacmd.sh wfs startWorkflow -dn infa_domain -sn infa_dis -un Administrator -pd Administrator -Application App_DataWarehouseOptimization -wf wf_DataWarehouseOptimization
To run the mappings in the workflow, open a mapping and right-click the mapping to run it. The workflow contains the following mappings and transformations:

Mapping_Day1
    The workflow object Mapping_Day1 reads customer data from flat files in a local file system and writes to an HDFS target for the first 24-hour period.

Mapping_Day2
    The workflow object Mapping_Day2 reads customer data from flat files in a local file system and writes to an HDFS target for the next 24-hour period.

m_CDC_DWHOptimization
    The workflow object m_CDC_DWHOptimization captures the changed data. It reads data from HDFS and identifies the data that has changed. To increase performance, you can configure the mapping to run on Hadoop cluster nodes in a Hive environment.

    The following image shows the mapping m_CDC_DWHOptimization.

    The mapping contains the following objects:
    - Sources. HDFS files that were the targets of the previous two mappings. The Data Integration Service reads all of the data as a single column.
    - Expression transformations. Extract a key from the non-key values in the data. The expressions use the INSTR function and the SUBSTR function to extract the key values.
    - Joiner transformation. Performs a full outer join on the two sources based on the keys generated by the Expression transformations.
    - Filter transformations. Use the output of the Joiner transformation to filter rows based on whether the rows should be updated, deleted, or inserted.
    - Targets. HDFS files. The Data Integration Service writes the data to three HDFS files based on whether the data is inserted, deleted, or updated.
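The change-capture idea can be illustrated with comm and join on two key-sorted snapshots. The files and the single-field key below are hypothetical stand-ins for the sample data, not the actual mapping logic:

```shell
# Split two daily snapshots (keyed on field 1) into rows that are
# new, gone, or changed between day 1 and day 2.
printf '1,Ann\n2,Bob\n3,Cal\n' | sort > day1.csv
printf '1,Ann\n3,Carl\n4,Dee\n' | sort > day2.csv
comm -13 day1.csv day2.csv > day2_only.csv   # inserted or updated rows
comm -23 day1.csv day2.csv > day1_only.csv   # deleted or updated rows
# A key on both sides of the diff was updated; the rest are pure
# inserts (day2 only) or deletes (day1 only).
join -t, -j 1 day1_only.csv day2_only.csv | cut -d, -f1   # prints the updated key: 3
```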
Consolidated_Mapping
    The workflow object Consolidated_Mapping consolidates the data in the HDFS files and loads the data to the data warehouse.

    The following figure shows the mapping Consolidated_Mapping.

    The mapping contains the following objects:
    - Sources. The HDFS files that were the targets of the previous mapping are the sources of this mapping.
    - Expression transformations. Add the deleted, updated, or inserted tags to the data rows.
    - Union transformation. Combines the records.
    - Target. Flat file that acts as a staging location on the local file system.
Processing Complex Files

Big Data Trial Sandbox for Hortonworks provides samples to process large volumes of data from complex files that contain unstructured data. The data might be on the Hadoop Distributed File System (HDFS) or on your local file system. Big Data Trial Sandbox includes samples that demonstrate the following use cases for processing complex files:

- Reading and parsing complex files
- Writing to complex files
Reading and Parsing Complex Files

Capturing and analyzing unstructured or semi-structured data such as web traffic records is a challenge because of the volume of data involved. Big Data Trial Sandbox for Hortonworks provides samples to read and process semi-structured or unstructured data in complex files. The LogProcessing project in the Developer tool includes samples that you can use to read and parse complex files.

Use the samples to process daily web logs from an online trading site and write the parsed data to a flat file. The web logs contain details about visitors who log in to the website and look up the value of stocks using stock symbols. To process the web logs, use the web log processing workflow.
The following image shows the sample web log processing workflow.

To run the workflow, enter the following command from the command line:

./infacmd.sh wfs startWorkflow -dn infa_domain -sn infa_dis -un Administrator -pd Administrator -Application app_logProcessing -wf wf_LogProcessing
To run the mappings in the workflow, open a mapping and right-click the mapping to run it. You can run the following mappings and transformations in the workflow:

m_LoadData
    The workflow object m_LoadData reads the parsed web log data and writes to a flat file target. The source and target are flat files. The following image shows the mapping m_LoadData.

m_sample_weblog_parsing
    The workflow object m_sample_weblog_parsing is a logical data object read mapping that reads data from an HDFS source, parses the data using a Data Processor transformation, and writes to a logical data object.
The following image shows the mapping m_sample_weblog_parsing.

The following image shows the expanded logical data object read mapping m_sample_weblog_parsing.

The mapping contains the following objects:
- Source. HDFS file that was the target of the previous mapping.
- Data Processor transformation. Processes the input binary stream of data, parses the data, and writes to XML format.
- Joiner transformation. Combines the activity of visitors who return to the website on the same day with stock queries.
- Expression transformation. Adds the current date to each transformed record.
- Target. Flat file.
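As a rough analogue of the parsing step, the sketch below pulls a visitor and a stock symbol out of log lines with awk. The log format shown is hypothetical, not the format of the sample web logs:

```shell
# Parse hypothetical web log lines of the form
#   <date> <visitor> GET /quote?symbol=<SYMBOL>
# into "visitor,symbol" records.
printf '2014-08-01 alice GET /quote?symbol=INFA\n2014-08-01 bob GET /quote?symbol=HDP\n' > weblog.txt
awk '{
  split($4, parts, "symbol=")   # $4 is the request path
  print $2 "," parts[2]         # visitor,symbol
}' weblog.txt
```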
Writing to Complex Files

Big Data Trial Sandbox for Hortonworks provides samples to read, parse, and write large volumes of unstructured data to complex files. The Complex_File_Writer project in the Developer tool includes samples that you can use to write unstructured data to complex files.

Use the samples to generate a report in XML format of the sales by country for each customer. You know the customer purchase order details, such as customer ID, product names, and item quantity sold. The purchase order details are stored in semi-structured compressed XML files in HDFS. Create a mapping that reads all the customer purchase records from the files in HDFS and use a Data Processor transformation to process the sales by country for each customer. The mapping converts the semi-structured data to relational data and writes it to a relational target.
The following figure shows the Complex File Writer sample mapping.

The mapping contains the following objects:

HDFS inputs
    The inputs, Read_customers_flatfile, Read_products_flatfile, Read_sales_flatfile, Read_promotions_flatfile, and Read_countries_flatfile, are flat files stored in HDFS.

Transformations
    The Joiner transformation Joiner_products joins product and sales data. The Joiner transformation Joiner_promotions joins sales and promotion data. The Data Processor transformation, customer_sales_xml_generator, provides a binary, hierarchical output for sales by country for each customer.

HDFS output
    The output, Write_binary_single_file, is a complex file stored in HDFS.
Working with NoSQL Databases

Big Data Trial Sandbox for Hortonworks provides samples that demonstrate how to read from and write to NoSQL databases. You can run the sample mappings to understand simple extract, transform, and load scenarios when you use a NoSQL database. Big Data Trial Sandbox for Hortonworks provides samples for the following NoSQL database:

- HBase
HBase

Use HBase when you need random real-time reads and writes on a database. HBase is a non-relational distributed database that runs on top of the Hadoop Distributed File System (HDFS) and can store sparse data. Big Data Trial Sandbox for Hortonworks provides samples that demonstrate how to read and process binary data from HBase.

The HBase_Binary_Data project in the Developer tool includes samples that you can use to read and process binary data in HBase tables to string data in a flat file target. The sample HBase table contains the details of people and the cars that they purchased over a period of time. The table contains the Details and Cars column families. The column names of the Cars column family are of the String data type. You can get all columns in the Cars column family as a single binary column. You can use the sample Java transformation to convert the binary data to string data. You can join the data from both column families and write it to a flat file.

To process the HBase binary data, use the wf_HBase_Binary_Data workflow. The following figure shows the wf_HBase_Binary_Data workflow.

To run the workflow, enter the wfs startWorkflow command from the command line. To run the mappings in the workflow, open a mapping and right-click the mapping to run it.

The workflow contains the following mappings and transformations:

m_person_Cars_Write_Static
    The workflow object references the m_person_Cars_Write_Static HBase write data object mapping that writes data to the columns in the Cars and Details column families.

m_preson_Cars_Write_Static1
    The workflow object references the m_pers_cars_static_reader mapping that transforms the binary data in an HBase data object to columns of the String data type and writes the details to a flat file data object.
The HBase mapping contains the following objects:

Person_Car_Static_Read
    The first source for the mapping is an HBase data object named Person_Car_Static that contains the columns in the Details column family. The HBase read data object operation is named Person_Car_Static_Read.

pers_cars_Static_bin_read
    The second source for the mapping is an HBase data object named Person_cars_Static_bin that contains the data in the Cars column family. The HBase read data object operation is named pers_cars_Static_bin_read.

Transformations
    - The HBase_ProtoBuf_Read_String.xml Java transformation transforms the single column of binary data in the Person_Car_Static data object to column values of the String data type.
    - The Sorter transformation sorts the data in ascending order based on the row ID.
    - The Expression and Aggregator transformations convert the row data to columnar data.
    - The Joiner transformation combines the data from both HBase input sources before the data is loaded to the flat file data object.
    - The Filter transformation filters out any person with an age less than or equal to 43.

Write_Person_Cars_FF
    The target for the mapping is a flat file data object named Person_Cars_FF. The flat file data object write operation is named Write_Person_Cars_FF and writes data from the Cars and Details column families.

The Data Integration Service converts the binary column in Person_cars_Static_bin, joins the data in Person_Car_Static, and writes the data to the flat file data object Write_Person_Cars_FF.
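The Filter transformation's condition (keep persons older than 43) reduces to a one-line awk filter on a flattened view of the joined data. The column layout and values below are hypothetical:

```shell
# Keep only rows whose age (field 2) exceeds 43, mirroring the
# Filter transformation (hypothetical flattened person/cars data).
printf 'ann,42,sedan\nbob,57,coupe\ncal,43,truck\n' > person_cars.csv
awk -F, '$2 > 43' person_cars.csv   # prints only bob,57,coupe
```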
Troubleshooting

This section describes troubleshooting information.

Informatica services shut down
    The Informatica services might shut down when the machine on which you run the virtual machine goes into hibernation or when you resume the virtual machine. Run the following command to restart the services on the operating system of the virtual machine:

    sh /home/infauser/BDETRIAL/.cmdInfaServiceUtil.sh start

Debug mapping failures
    To debug mapping failures, check the error messages in the mapping log file. The mapping log file appears in the following location: /home/infauser/bdetrial_repo/informatica/informatica/tomcat/bin/disTemp

Virtual machine does not start because of a 64-bit error
    VMware Player displays a message that states it cannot power on a 64-bit virtual machine. Or, VMware Player might display the following error when you play the virtual machine:

    The host supports Intel VT-x, but Intel VT-x is disabled. Intel VT-x might be disabled if it has been disabled in the BIOS/firmware settings or the host has not been power-cycled since changing this setting.

    You must enable Intel Virtualization Technology in the BIOS of the machine on which VMware Player runs. For more information, refer to the VMware Knowledge Base.
Virtual machine is in a suspended state
    If the virtual machine is in a suspended state, resume the virtual machine and log in. After you log in, the Informatica services and Hadoop services start automatically. In VMware Player, select the virtual machine and click Play virtual machine. Enter a user name and password for the virtual machine. The default user name and password are: infa / infa

The Developer tool takes a long time to connect to the Model repository
    The Developer tool might take a long time to connect to the Model repository because the virtual machine cannot find the IP address and host name of the client machine. You must add the IP address and host name of the client machine to the hosts file of the virtual machine. Use the ipconfig and hostname commands from the command line of the Windows machine to find the IP address and host name of the Windows machine. Then add a line with the IP address of the client machine followed by its host name to the hosts file on the virtual machine. The hosts file is located in the following location on the virtual machine: /etc/hosts

Mapping fails and job execution failed errors appear in the mapping log
    If the mapping fails and you cannot determine the cause of the job execution failed errors that appear in the mapping log, you can clear the contents of the following directory on the machine that hosts the virtual machine: /tmp/infa. Then, run the mapping again.
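The cleanup step can be scripted from a terminal. A minimal sketch, where TMP_DIR is a local stand-in for /tmp/infa on the machine that hosts the virtual machine:

```shell
# Clear the temporary mapping artifacts but keep the directory itself,
# then rerun the mapping.
TMP_DIR="${TMP_DIR:-./infa_tmp}"     # stand-in for /tmp/infa
mkdir -p "$TMP_DIR"
find "$TMP_DIR" -mindepth 1 -delete  # remove all contents
ls -A "$TMP_DIR" | wc -l             # 0 entries remain
```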
Author
Big Data Edition Team