IBM Information Server
WebSphere DataStage 8.0
Richard Hedges
Program Director, Product Management
IBM Information Server
Agenda
IBM Information Server Overview & Architecture
WebSphere DataStage Usability Improvements
Best in class Data Transformation
Focus on Connectivity
Performance, Performance, and Performance
Installation, Configuration, Administration, Reporting
Upgrade to WebSphere DataStage v8.0
IBM Information Server
Delivering information you can trust
– Understand: Discover, model, and govern information structure and content
– Cleanse: Standardize, merge, and correct information
– Transform: Combine and restructure information for new uses
– Deliver: Synchronize, virtualize and move information for in-line delivery
Parallel Processing
Rich Connectivity to Applications, Data, and Content
Unified Deployment
Unified Metadata Management
IBM Information Server Architecture
UNIFIED USER INTERFACE
– Analysis Interface
– Web Admin Interface
– Development Interface
COMMON SERVICES
– Metadata Services
– Security Services
– Logging & Reporting Services
– Unified Service Deployment
UNIFIED METADATA
– Design
– Operational
UNIFIED PARALLEL PROCESSING
– Understand, Cleanse, Transform, Deliver
COMMON CONNECTIVITY
– Structured, Unstructured, Applications, Mainframe
Supporting IBM WebSphere Application Server
Supporting IBM DB2, Oracle, and MS SQL Server
DataStage and QualityStage Designer
Quick Find - Basic
Find item in Repository tree
– In-place find
– Find by Name (Full or Partial)
– Wild card support
– Find next…
– Filter on type
Find – Advanced Search Criteria
Search on following criteria:
– Object type
• Job, Table Definition, Stage etc.
– Creation
• Date/Time
• By User
– Last Modification
• Date/Time
• By User
– Where Used
• What other objects use this object?
– Dependencies of
• What does this object use?
Options
– Case
– Match on “name & description” or “name or description”
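The search criteria above can be pictured as a filter over repository items. The following is only a minimal sketch: `RepoItem`, `advanced_find`, and all field names are hypothetical and are not part of any DataStage API. It assumes wildcard matching behaves like shell-style patterns.

```python
from dataclasses import dataclass
from datetime import datetime
from fnmatch import fnmatchcase


@dataclass
class RepoItem:
    """Hypothetical stand-in for a repository object."""
    name: str
    description: str
    object_type: str      # e.g. "Job", "Table Definition", "Stage"
    created_by: str
    created_at: datetime


def advanced_find(items, pattern="*", object_type=None, created_by=None,
                  created_after=None, case_sensitive=False, match_both=False):
    """Return the items that satisfy every supplied criterion.

    match_both=True  -> pattern must match name AND description
    match_both=False -> pattern may match name OR description
    """
    def matches(text):
        if case_sensitive:
            return fnmatchcase(text, pattern)
        return fnmatchcase(text.lower(), pattern.lower())

    results = []
    for item in items:
        if object_type and item.object_type != object_type:
            continue
        if created_by and item.created_by != created_by:
            continue
        if created_after and item.created_at < created_after:
            continue
        hits = (matches(item.name), matches(item.description))
        if (all(hits) if match_both else any(hits)):
            results.append(item)
    return results
```

A query such as "all items created by user x today" would then pass `created_by` together with `created_after` set to midnight of the current day.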
Impact Analysis – Graphical View
Results shown using the Advanced Find window
Impact Analysis:
– Find dependencies… What does this item depend on?
– Find where used… Where is this item used?
Impact Analysis – Tabular View
Results can be saved to an HTML or XML file for additional processing or for viewing by remote users. Within the application, the results list can feed export, reporting, or compilation functions.
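Conceptually, "find dependencies" and "find where used" are the two directions of a traversal over a usage graph. The sketch below illustrates that idea only. The `USES` map and its object names are invented, and whether the product follows indirect references in the same way is an assumption.

```python
from collections import deque

# Hypothetical repository: each object maps to the objects it uses directly.
USES = {
    "LoadCustomers": ["CustomerTable", "DbParams"],
    "LoadOrders": ["OrderTable", "DbParams"],
    "NightlySequence": ["LoadCustomers", "LoadOrders"],
    "CustomerTable": [],
    "OrderTable": [],
    "DbParams": [],
}


def dependencies_of(item, uses=USES):
    """Everything `item` depends on, directly or indirectly (breadth-first)."""
    seen, queue = set(), deque(uses.get(item, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(uses.get(current, []))
    return seen


def where_used(item, uses=USES):
    """Every object that uses `item`, directly or indirectly."""
    # Invert the graph, then reuse the same traversal.
    reverse = {}
    for owner, targets in uses.items():
        for target in targets:
            reverse.setdefault(target, []).append(owner)
    return dependencies_of(item, reverse)
```

The returned sets correspond to the tabular result list; a graphical view would draw the same edges.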
Job, Table or Routine Difference
Available for Jobs, Tables & Routines
Textual report with hot links to the relevant editor in Designer.
Job Parameter Sets
New object in repository that contains the names and values of job parameters
A Parameter Set can be referenced by one or more jobs
Job Parameter Sets
Can use Impact Analysis to determine which Jobs are using a Parameter Set
Works for DataStage Server and DataStage Enterprise Edition
Easier to share job parameters across jobs
Easier to deploy jobs across machines
Easier to propagate a changed job parameter value
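The benefit of sharing can be shown with a small model in which several jobs hold a reference to one parameter set rather than their own copies. This is an illustrative sketch only: `ParameterSet`, `Job`, and `resolve` are hypothetical names, not DataStage objects or functions.

```python
from dataclasses import dataclass, field


@dataclass
class ParameterSet:
    """Hypothetical named collection of job parameter values."""
    name: str
    values: dict


@dataclass
class Job:
    """Hypothetical job that references parameter sets instead of copying them."""
    name: str
    parameter_sets: list = field(default_factory=list)

    def resolve(self, set_name, param):
        for pset in self.parameter_sets:
            if pset.name == set_name:
                return pset.values[param]
        raise KeyError(f"{self.name} does not reference {set_name}")


db_params = ParameterSet("DbParams", {"host": "dev-db", "schema": "STAGING"})
load_customers = Job("LoadCustomers", [db_params])
load_orders = Job("LoadOrders", [db_params])

# One change to the shared set is seen by every job that references it.
db_params.values["host"] = "prod-db"
```

Because both jobs point at the same set, moving them to another machine means changing the set once rather than editing each job.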
Collaboration: Multi-User Environment
Locking to prevent concurrent update clashes
Optional “read-only” view when items already locked in Repository
Visible lock “owner” to aid identification
– By Name & Session ID
Identified user for “last modified” or “created by” actions
– Searchable using Advanced Find
– E.g. “Find all items created by user x today”
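The locking behavior described above can be sketched as a table of lock owners keyed by item. This is a simplified, single-process illustration; `RepositoryLocks` and its methods are hypothetical names and say nothing about how the repository implements locking internally.

```python
class RepositoryLocks:
    """Hypothetical lock table: item name -> (user name, session id)."""

    def __init__(self):
        self._owners = {}

    def open_item(self, item, user, session_id):
        """Return (mode, owner). The first opener gets "edit"; others get "read-only"."""
        owner = self._owners.setdefault(item, (user, session_id))
        mode = "edit" if owner == (user, session_id) else "read-only"
        return mode, owner

    def release(self, item, user, session_id):
        """Only the owning user and session can release the lock."""
        if self._owners.get(item) == (user, session_id):
            del self._owners[item]
```

Returning the owner alongside the mode mirrors the visible lock owner: a second user can see who holds the item and in which session.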
Export Improvements
Export based on a result of a search
Available from
The new GUI allows modification of the originally populated export list: items can be added, removed, or filtered out.
Meta Data Sharing
DataStage, QualityStage & Information Analyzer
Sharing meta data with WebSphere Information Analyzer
– Both tools store Table meta data in the common repository
– DataStage users can see the table meta data from Information Analyzer
• Allows sharing of meta data definitions
• Provides a single meta data import from the data source, for use in both tools
• Enables DS user to see IA analysis data for shared tables
– Where is the IA analysis information available in DS/QS Designer?
• “Analytical Information” tab on the EditRow dialog when looking at the details of an individual column from…
– …a Table Definition
– …a stage editor
• “Analytical Information” tab on the Table Definition dialog
Lookup Stage – New Range Capabilities
Range check box allows you to specify a range key for a 1 to 2 type range lookup
Key Type drop down allows you to specify a range key for a 2 to 1 type range lookup
Double clicking on the Key Expression field of a range key will bring up the Range Expression dialog
New Range Expression Dialog
Column selection for the range key from the reference table
Column selection for the bounding columns from the primary input
Range expression operator drop down. Specifies whether the range bounds are inclusive or exclusive
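A range lookup of the kind configured in this dialog can be sketched as follows, for the case where one input value is checked against two bounding columns of each reference row. The function and column names are hypothetical, and the operator choice stands in for the inclusive/exclusive drop-down.

```python
import operator

# Exclusive vs. inclusive bound operators, as chosen per bound.
BOUND_OPS = {"<": operator.lt, "<=": operator.le}


def range_lookup(value, reference_rows, low_col, high_col,
                 low_op="<=", high_op="<="):
    """Return reference rows where  low <low_op> value <high_op> high."""
    lower, upper = BOUND_OPS[low_op], BOUND_OPS[high_op]
    return [row for row in reference_rows
            if lower(row[low_col], value) and upper(value, row[high_col])]


# Hypothetical reference data with adjoining bands.
bands = [
    {"band": "low", "min": 0, "max": 100},
    {"band": "mid", "min": 100, "max": 500},
]
```

With both bounds inclusive, the value 100 falls into both bands; making the upper bound exclusive (`high_op="<"`) assigns it to "mid" only, which is why the operator choice matters for adjoining ranges.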
Surrogate Key Management
New engine functionality
Exposed in 2 new stages and 1 old one
– Surrogate Key Generator
– Slowly Changing Dimension
– Transformer – Initialize(), GetNextKey()
How it works
– Uses built-in state files or DBMS sequences (DB2 & Oracle)
– Supports large integer (uint64) surrogate key values
– Can be used to discover surrogate key values which are already being used so that use of duplicate key values will be avoided
– Customizable block size to manage key gaps vs. performance
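The trade-off between block size, key gaps, and performance can be illustrated with a toy generator that reserves keys from a state file one block at a time. This is a sketch under stated assumptions, not the engine's implementation: the class, the JSON state format, and the file name are invented, and there is no file locking, so it is not safe for truly concurrent processes.

```python
import json
import os
import tempfile


class SurrogateKeyGenerator:
    """Hands out unique integer keys, reserved from a state file in blocks."""

    def __init__(self, state_path, block_size=100):
        self.state_path = state_path
        self.block_size = block_size
        self.next_key = 0
        self.block_end = 0   # exclusive upper bound of the reserved block

    def _reserve_block(self):
        # Read the first unreserved key (keys start at 1 for a new state file).
        start = 1
        if os.path.exists(self.state_path):
            with open(self.state_path) as fh:
                start = json.load(fh)["next_unreserved"]
        self.next_key, self.block_end = start, start + self.block_size
        # Record that the whole block is taken, even if not all keys get used.
        with open(self.state_path, "w") as fh:
            json.dump({"next_unreserved": self.block_end}, fh)

    def get_next_key(self):
        if self.next_key >= self.block_end:
            self._reserve_block()
        key = self.next_key
        self.next_key += 1
        return key


# Two hypothetical processing nodes sharing one state file.
state_file = os.path.join(tempfile.mkdtemp(), "customer_keys.json")
node_a = SurrogateKeyGenerator(state_file, block_size=10)
node_b = SurrogateKeyGenerator(state_file, block_size=10)
keys = [node_a.get_next_key(), node_b.get_next_key(), node_a.get_next_key()]
```

Here `node_a` draws from the block starting at 1 and `node_b` from the block starting at 11. A larger block means fewer trips to the state file but larger gaps when a node finishes without using its whole block.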
New Functionality to Support SCD
New engine capabilities
– Surrogate Key management
– Updatable in-memory lookups
New & enhanced stages
– Surrogate Key Generator
– Slowly Changing Dimension
Connectivity Updates
New functionality and more DB supported in SQL builders
– SQL Server, Teradata, ODBC
New Stored Procedures functionality and for more DBs
– SQL Server, Teradata
Latest/Greatest version support (not all listed)
– DB2 9.1
– Oracle 10gR2
– SQL Server 2005
– Teradata v2r6.1 (DB server) / 8.1 (TTU)
– Sybase ASE 15, Sybase IQ 12.7
– Informix 10 (IDS)
– SAS 9.1
– IBM WS MQ 6.1, WS MB 5.1
– Netezza v3.1
New Connectivity
– Stages for WebSphere Federation and Classic Federation
• Server and Enterprise stages
• DRS Support
• Native integration with Federation and Classic Federation
– Netezza Enterprise Stage
• Parallel Loader leveraging NZ_Load and External Tables
– SFTP Enterprise Stage
• Secure data transmission
– iWay Enterprise Stage
• Integration with over 250 disparate/legacy sources
Connection Objects
New top-level repository object
Allows saving of a re-usable connection path to a specific source or target
– Username, password, db name etc.
Supported on specific stage types
– New Rich Connectors
– Enterprise Stages: DB2, Informix, Oracle, Teradata
– For Plug-ins…
– For Server built-ins
• ODBC, UniVerse, UniData
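A connection object can be thought of as a named, reusable bundle of connection properties that is copied onto a stage. The sketch below models only that idea. `ConnectionObject`, `Stage`, and `apply_connection` are hypothetical names, and the sample values are invented.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ConnectionObject:
    """Hypothetical reusable connection definition stored once in the repository."""
    name: str
    database: str
    username: str
    password: str


@dataclass
class Stage:
    """Hypothetical stage with a free-form property dictionary."""
    name: str
    properties: dict

    def apply_connection(self, conn):
        # Copy the connection properties onto the stage; other stage
        # properties (table name, SQL, and so on) are left untouched.
        self.properties.update(
            {key: value for key, value in asdict(conn).items() if key != "name"})


warehouse = ConnectionObject("ProdWarehouse", "DWH", "etl_user", "secret")
target = Stage("OracleTarget", {"table": "CUSTOMER_DIM"})
target.apply_connection(warehouse)
```

Applying the same object to several stages keeps their connection details consistent without retyping them.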
Next Generation “Rich” Connectors
Combining the best of the plug-ins, operators, plus more.....
ODBC
– Embedded DataDirect v5.2 Connect for ODBC drivers
DB2 – Q107
– For DPF and non-DPF
Teradata – Q107
– New support for Teradata Parallel Transport (TPT)
Oracle – Q107
– New support for 10gR2
WebSphere MQ – Q107
– Adding support for “client only” configuration
Next Generation “Rich” Connectors
Connection objects allow properties to be dropped onto stage
Diagram lets you select the link to edit as though you’re on the canvas
Warning sign tells you which fields are mandatory
Test the connection instantly
Parameter button on every field
Graphical SQL builder
Enterprise Packs Updates
– New Validations for enterprise apps versions
• SAP ECC 6.0
• SAP BI 7.0
• Siebel 7.8
• JD Edwards EnterpriseOne 8.12
– New SAP Unicode Certifications
• BW-STA 3.5 : Staging BAPI certification for BW Load
• BW-OHS 3.5 : Open-Hub service certification for BW Extract
• CA-ALE 4.0 : IDoc Load and Extract supports Web AS 6.40
• IA-BAPI : BAPI Load and Extract supports Web AS 6.40
– New Functionality
• Enhanced support for Siebel EIM and Business Components
• New Metadata browser and importer for Oracle Applications
• Greater support for large enterprise class deployments
CFF Stage – Multi-Format Record Support
Complex Flat File stage now processes Multi Format Flat (MFF) files
Constraints can be specified on the output links to filter data and/or define when a record should be sent down the link
New Fast Path feature provides guided creation
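Output-link constraints on a multi-format file amount to routing each record down the links whose condition it satisfies. The following is a rough sketch of that routing. The record layout (a one-character type prefix), the link names, and `route_records` are all assumptions made for illustration.

```python
def route_records(records, output_links):
    """Send each record down every link whose constraint accepts it."""
    routed = {name: [] for name in output_links}
    for record in records:
        for name, constraint in output_links.items():
            if constraint(record):
                routed[name].append(record)
    return routed


# Hypothetical multi-format file: the first character identifies the record type.
lines = ["HBATCH001", "D0001 19.99", "D0002 5.00", "T2"]

# One constraint per output link.
links = {
    "header": lambda rec: rec.startswith("H"),
    "detail": lambda rec: rec.startswith("D"),
    "trailer": lambda rec: rec.startswith("T"),
}

routed = route_records(lines, links)
```

A record that satisfies no constraint is simply filtered out, and one that satisfies several is sent down each matching link.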
Performance Improvements
Improved Job Startup Time
– Allow efficient use of DS EE against smaller data sets
Buffer Optimization
– Improved buffer placement algorithm
– E.g., Removed unnecessary buffer before parallel sort in some instances
Combinability Optimizations
– More combinable stages
– Intelligent combining
Adaptive Job Monitoring
– The Adaptive Job Monitoring feature detects when CPU utilization by the conductor reaches 80% and throttles the volume of job monitoring data
– Note: only monitor messages will be throttled, metadata and summary messages are not affected
– Time-based monitoring is now supported
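The throttling rule for adaptive monitoring can be sketched as a filter that thins out monitor messages once conductor CPU reaches the 80% threshold while letting other message kinds through. The message representation and the `keep_every` sampling policy are assumptions; the slides do not say how the volume is actually reduced.

```python
CPU_THRESHOLD = 0.80  # conductor CPU utilization at which throttling starts


def filter_messages(messages, conductor_cpu, keep_every=10):
    """Throttle only "monitor" messages; metadata and summary always pass.

    `messages` is a list of (kind, payload) tuples. `keep_every` is a
    hypothetical sampling policy: keep one monitor message in every N.
    """
    if conductor_cpu < CPU_THRESHOLD:
        return list(messages)
    kept, monitor_seen = [], 0
    for kind, payload in messages:
        if kind != "monitor":
            kept.append((kind, payload))
            continue
        if monitor_seen % keep_every == 0:
            kept.append((kind, payload))
        monitor_seen += 1
    return kept
```

Below the threshold every message is delivered unchanged, so the cost of full monitoring is only shed when the conductor is actually under load.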
Job Performance Analysis
A new visualization tool which:
Provides deeper insight into runtime job behavior.
Offers several categories of visualizations, including:
– Record Throughput
– CPU Utilization
– Job Timing
– Job Memory Utilization
– Physical Machine Utilization
Hides runtime complexity by emphasizing the stages on the designer canvas.
Resource Estimation
Difficult to estimate resources required for job execution
– Scratch space, CPU, etc.
What happens if data volume increases?
How do I prevent job aborting due to lack of system resources?
Resource Estimation Tool Layout Overview
New IBM Information Server Installation
Create Users, Assign Roles, and Map Credentials
1. On the Administration tab, click on users, then select create new users
2. Enter values for the different user attributes. Id, Password, First Name and Last Name are required
3. Assign Suite and Product Roles as appropriate
4. Click on Save
5. Map Credentials
Security Services
Internal Directory
– Defines users, groups, roles
– Supports browsing/creation/deletion/update operations
External Directories
– LDAP, Active Directory, Unix
– External directory passwords are not stored
– Supports browsing/partial update operations
Roles
– Suite roles: Suite User, Suite Administrator
– Product roles: e.g. DataStage user
– Project roles: e.g. Information Analyzer User
Standards-Based Authentication
– JAAS
– Works against the supported directories
Logging
A new common logging facility
– Used by all the products of the Suite
– Logs go into the operational repository
DataStage Client log viewer does not change
Logging administration done from the administration console
Logging Views are “saved queries”
– Opening a view displays the log events corresponding to the “saved query”
– Example
• Severity level: Error
• Category: DataStage
• Timestamp: past 12 hours
– A user can now view logs in a Production environment via a browser and perform nothing else in that environment
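A logging view as a "saved query" can be modeled as a stored set of filter criteria that is re-evaluated each time the view is opened. This is a minimal sketch using the example criteria above; `LogEvent` and `LogView` are hypothetical types, not part of the logging service.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LogEvent:
    severity: str
    category: str
    timestamp: datetime
    message: str


@dataclass
class LogView:
    """Hypothetical saved query: criteria are stored, results are not."""
    severity: str
    category: str
    max_age: timedelta

    def open(self, events, now):
        """Evaluate the saved criteria against the current log events."""
        cutoff = now - self.max_age
        return [event for event in events
                if event.severity == self.severity
                and event.category == self.category
                and event.timestamp >= cutoff]


# The example view from the list above.
recent_errors = LogView("Error", "DataStage", timedelta(hours=12))
```

Because only the criteria are saved, opening the same view later returns whatever events match at that moment.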
Reporting Console
Can publish reports from DataStage to the IBM Information Server Reporting Console
Job Reports, Advanced Find, Impact Analysis, etc.
Source-to-Target and Target-to-Source
Upgrade
All objects from DataStage v7 projects upgrade into DataStage v8.0
– Export projects and Import into DataStage v8.0
– All jobs (Server, Parallel, Mainframe, and Sequencer) along with all other objects will migrate
Unix users can install IBM Information Server and previous versions on the same server
Note: DataStage Version Control not in v8.0.
Platforms
At GA
– DS & QS Client: Windows XP
– Windows Server 2003
– AIX 5.2, 5.3
– Red Hat Enterprise Linux AS 3.0
– Red Hat Enterprise Linux AS 4.0
– SuSE Enterprise Linux 9, 10
– HP-UX 11i1 (11.11), 11i2 (11.23) – PA-RISC
– Solaris 2.9, 2.10
NLS Support, but not localized
The IBM Information Server Advantage
A Complete Information Infrastructure
A comprehensive, unified foundation for enterprise information architectures, scalable to any volume and processing requirement
Auditable data quality as a foundation for trusted information across the enterprise
Metadata-driven integration, providing breakthrough productivity and flexibility for integrating and enriching information
Consistent, reusable information services—along with application services and process services, an enterprise essential
Accelerated time to value with proven, industry-aligned solutions and expertise
Broadest and deepest connectivity to information across diverse sources: structured, unstructured, mainframe, and applications
Thank You!