7/27/2019 Identifying Bottlenecks
http://slidepdf.com/reader/full/identifying-bottlenecks 1/7
General Informatica Best Practices
Performance and Tuning Overview
Identifying ETL Bottlenecks
Target Bottlenecks
Source Bottlenecks
Mapping Bottlenecks
Session Bottlenecks
System Bottlenecks
Partitioning: How to make it fly
Identifying Bottlenecks
Target Bottleneck
Common sources of problems:
indexes or key constraints
database checkpoints
small database network packet size
too many target instances in your mapping
target table is too wide
Common solutions:
drop indexes and key constraints before loading, rebuild after loading
use bulk loading or external loaders when practical
increase database network packet size
decrease the frequency of database checkpoints
optimize target database disk allocation
when using partitions, consider partitioning your target table as well
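The "drop indexes before loading, rebuild after loading" advice can be sketched outside Informatica. The following is a minimal illustration using Python's built-in sqlite3 module, not a PowerCenter feature; the table name `sales`, the index name `idx_sales_customer`, and the function `bulk_load` are hypothetical. In a real session the drop and rebuild would more likely be issued as pre- and post-load SQL against the actual target database.

```python
import sqlite3

def bulk_load(conn, rows):
    """Load rows into the hypothetical `sales` table with its index dropped during the load."""
    cur = conn.cursor()
    # Drop the index so each insert does not pay for index maintenance.
    cur.execute("DROP INDEX IF EXISTS idx_sales_customer")
    # One batched insert in a single transaction stands in for a bulk loader.
    cur.executemany("INSERT INTO sales (customer_id, amount) VALUES (?, ?)", rows)
    # Rebuild the index once, after all rows are in place.
    cur.execute("CREATE INDEX idx_sales_customer ON sales (customer_id)")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
conn.execute("CREATE INDEX idx_sales_customer ON sales (customer_id)")

bulk_load(conn, [(i % 100, float(i)) for i in range(10_000)])

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
# PRAGMA index_list returns one row per index; column 1 is the index name.
index_names = [r[1] for r in conn.execute("PRAGMA index_list('sales')")]
```

Whether this pays off depends on the database and the load size; for small incremental loads the rebuild can cost more than it saves.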
Source Bottleneck
Common sources of problems:
slow query
small database network packet size
wide source tables
Common solutions:
analyze the query issued by the Source Qualifier. It appears in the session log.
consider using database optimizer hints when joining several tables in a Source Qualifier
consider indexing tables when you have ORDER BY or GROUP BY clauses
try database parallel queries if supported
try partitioning the session if appropriate; try partitioning your source database as well
test Source Qualifier conditional filter versus filtering at the database level
increase database network packet size
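The idea of testing where a filter runs can be illustrated with a small sketch. This is generic Python with sqlite3, not Informatica; the `orders` table and its `status` column are hypothetical. It compares reading every row and filtering downstream against letting the database apply the filter, and shows that both give the same result while the second moves far fewer rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (id, status) VALUES (?, ?)",
    [(i, "OPEN" if i % 10 == 0 else "CLOSED") for i in range(1000)],
)

# Option A: read everything, then filter downstream in the pipeline.
all_rows = conn.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
filtered_downstream = [r for r in all_rows if r[1] == "OPEN"]

# Option B: push the filter into the source query so fewer rows leave the database.
filtered_at_source = conn.execute(
    "SELECT id, status FROM orders WHERE status = ? ORDER BY id", ("OPEN",)
).fetchall()

rows_moved_a = len(all_rows)            # rows transferred when filtering downstream
rows_moved_b = len(filtered_at_source)  # rows transferred when filtering at the source
```

Which option is faster in practice depends on the database, the indexes, and the network, which is why the slide says to test rather than assume.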
Mapping Bottleneck
Common sources of problems:
too many transforms
unused links between ports
too many input/output or output ports in Aggregator or Rank transformations
unnecessary data type conversions
Common solutions:
eliminate transformation errors
if several mappings read from the same source, try single pass reading
optimize data types, use integers for comparisons.
don’t convert back and forth between data types
optimize lookups and lookup tables, using cache and indexing tables
put your filters early in the data flow, use a simple filter condition
for aggregators, use sorted input, integer columns to group by and simplify expressions
use reusable sequence generators, increase number of cached values
if you use the same logic in different data streams, apply it before the streams branch off
optimize expressions: isolate slow and complex expressions
reduce or simplify aggregate functions
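Why sorted input helps an aggregator can be shown with a short sketch. This is plain Python, not the Informatica Aggregator itself; the function names are hypothetical. With unsorted input every group must stay in memory until the last row arrives, while with input sorted by the group key each group can be emitted as soon as the key changes.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_unsorted(rows):
    """Sum amounts per key; every group stays cached until the input ends."""
    totals = {}
    for key, amount in rows:
        totals[key] = totals.get(key, 0) + amount
    return totals

def aggregate_sorted(rows):
    """Sum amounts per key for input already sorted by key.

    Only the current group is held; each total is yielded when the key changes.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(amount for _, amount in group)

# Input is already sorted by an integer group key.
rows = [(1, 10), (1, 5), (2, 7), (3, 1), (3, 2)]
sorted_totals = dict(aggregate_sorted(rows))
unsorted_totals = aggregate_unsorted(rows)
```

The sorted variant gives wrong results if the input is not actually sorted by the group key, so the sort order has to be guaranteed upstream.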
Session Bottleneck
Common sources of problems:
inappropriate memory allocation settings
running in series rather than in parallel
error tracing override set to high level
Common solutions:
calculate DTM buffer pool and buffer block size
make sure to keep data and index caches in memory; paging to disk is very slow
if your mapping allows it, use partitioning
run sessions in parallel, within concurrent batches, whenever possible
increase database commit interval
turn off recovery and decimal arithmetic (they’re off by default)
use the debugger rather than high error tracing; always reduce your tracing level for production runs
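The effect of the commit interval can be sketched as follows. This is a generic illustration using Python's sqlite3 module, not a PowerCenter setting; the `target` table and the helper names are hypothetical. It counts how many commits a load issues for a given interval, since each commit is a round trip the database has to flush.

```python
import sqlite3

def new_target():
    """Create an in-memory database with a hypothetical `target` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER, val TEXT)")
    return conn

def load_with_commit_interval(conn, rows, commit_interval):
    """Insert rows, committing every `commit_interval` rows; return the commit count."""
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO target (id, val) VALUES (?, ?)", row)
        pending += 1
        if pending >= commit_interval:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the final partial batch.
        conn.commit()
        commits += 1
    return commits

rows = [(i, str(i)) for i in range(1000)]
commits_small_interval = load_with_commit_interval(new_target(), rows, 10)
commits_large_interval = load_with_commit_interval(new_target(), rows, 500)
```

A larger interval means fewer commits, at the cost of more work to redo if the load fails partway through.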
System Bottleneck
Common sources of problems:
slow network connections
overloaded or under-powered servers
slow disk performance
Common solutions:
get the best machines to run your server. Better yet, use several servers against the same repository (PowerCenter only)
use multiple CPUs and session partitioning
make sure Informatica servers and database servers are closely located in your network
if you have several CPUs, several disk drives and gobs of RAM, consider having Informatica server and database server on the same machine
shut down unneeded processes or network services on your servers
use 7 bit ASCII data movement (the default) if you don’t need Unicode
evaluate hard disk performance, try locating sources and targets on different drives
get as much RAM as you can for your servers
Partitioning
A partition is a pipeline stage that executes in a single thread
Partition points mark the thread boundaries in a pipeline and divide the pipeline process into stages
The partition strategy can be different at each partition point in the pipeline process
Adding partitions increases the number of threads created by Informatica
PowerCenter allows up to 16 partitions at each partition point
Increasing partition points increases the number of threads, which can improve performance. HOWEVER, load on the server also increases, so on an undersized server partitioning is of no value and can actually decrease performance
Partitioning continued
Partition Types
Round Robin
Key Range
Hash Key
Pass Through
Performance can be increased by changing the partitioning strategy at different partition points:
Source Qualifier: Key Range or Hash Auto Keys
Expression or Filter: Round Robin
Sorter and Aggregator: Hash Auto Keys
Target: Key Range
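How the partition types distribute rows can be sketched in a few lines. This is a conceptual illustration in plain Python, not PowerCenter's implementation; the function names, the `cust` and `id` columns, and the use of CRC32 as the hash are all assumptions made for the example. Pass Through is omitted because it leaves each row in the partition it is already in.

```python
import bisect
import zlib

def round_robin(rows, n):
    """Deal rows out evenly across n partitions, ignoring their content."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_key(rows, n, key):
    """Send rows with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        # CRC32 is used here only because it is stable across runs.
        h = zlib.crc32(str(row[key]).encode())
        parts[h % n].append(row)
    return parts

def key_range(rows, boundaries, key):
    """Assign rows by value range; `boundaries` are sorted, exclusive upper bounds.

    Values at or above the last boundary go to the final partition.
    """
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect.bisect_right(boundaries, row[key])].append(row)
    return parts

rows = [{"id": i, "cust": i % 7} for i in range(12)]
rr = round_robin(rows, 3)
hk = hash_key(rows, 3, "cust")
kr = key_range(rows, [4, 8], "id")
```

Round robin balances row counts but scatters related rows, which is why it suits row-by-row transformations, while hash partitioning keeps each group together for sorters and aggregators.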