
Talend Open Studio for MDM

Installation Guide

5.0_a


Talend Open Studio: Installation Guide

Adapted for the Talend Open Studio for MDM and Talend MDM Web User Interface v5.0.x releases.

Copyleft

This documentation is provided under the terms of the Creative Commons Public License (CCPL).

For more information about what you can and cannot do with this documentation in accordance with the CCPL, please read: http://creativecommons.org/licenses/by-nc-sa/2.0/


Talend Open Studio for MDM Installation Guide

Table of Contents

Preface
    1. General information
        1.1. Purpose
        1.2. Audience
        1.3. Typographical conventions
    2. History of changes
    3. Feedback and Support

Chapter 1. Prior to installing MDM
    1.1. Hardware requirements
        1.1.1. Memory usage
        1.1.2. Disk usage
        1.1.3. Compatible Operating Systems
        1.1.4. Compatible Web browsers
        1.1.5. Naming conventions
    1.2. Third-party softwares

Chapter 2. Installing the MDM server
    2.1. Two different installation modes
    2.2. Installing MDM modules using the Windows/Linux executable file
    2.3. Installing MDM modules using the jar file
        2.3.1. Installing in GUI mode
        2.3.2. Installing in Command/Console mode

Chapter 3. Migrating databases and MDM objects
    3.1. Migrating MDM projects
        3.1.1. Migrating the eXist database
        3.1.2. Reimporting and redeploying your Jobs
        3.1.3. Moving the pictures and web resources

Chapter 4. Managing MDM database(s)
    4.1. Managing the eXist database
        4.1.1. eXist tuning and performance
        4.1.2. eXist’s database backup/restore
        4.1.3. Standalone eXist

Chapter 5. Important Configuration subjects
    5.1. Configuring session timeout for the Web User Interface
    5.2. Configuring access control information for the Studio and the Web User Interface
    5.3. Changing the default ports in JBOSS
        5.3.1. Default port list
        5.3.2. Using an alternate binding


Preface

1. General information

1.1. Purpose

This Installation Guide explains how to install and configure Talend MDM modules and related applications. For detailed explanations of how to use and fine-tune Talend MDM applications, please refer to the Talend Open Studio for MDM Administrator Guide and the Talend MDM Web User Interface User Guide.

Information presented in this document applies to Talend MDM releases beginning with 5.0.x.

1.2. Audience

This guide is intended for administrators of Talend Open Studio for MDM and Talend MDM Web User Interface.

The layout of GUI screens provided in this document may vary slightly from your actual GUI.

1.3. Typographical conventions

This guide uses the following typographical conventions:

• text in bold: window and dialog box buttons and fields, keyboard keys, menus, and menu options,

• text in [bold]: window, wizard, and dialog box titles,

• text in courier: system parameters typed in by the user,

• text in italics: file, schema, column, row, and variable names,

• The icon indicates an item that provides additional information about an important point. It is also used to add comments related to a table or a figure,

• The icon indicates a message that gives information about the execution requirements or recommendation type. It is also used to refer to situations or information the end user needs to be aware of or pay special attention to.

Any command is highlighted with a grey background.

2. History of changes

The table below lists the changes made to the Talend MDM Installation Guide.

Version    Date          History of change

v 4.2_a    19/05/2011    Creation of an MDM Installation Guide

v 4.2_b    11/07/2011    Updates in the Talend MDM Installation Guide include:
                         - A new hardware and software prerequisites chapter.
                         - Slight modification and reorganization of the MDM server installation chapter.
                         - A new section in the database management chapter about managing the Talend XML database.

v 5.0_a    21/11/2011    Updates in the Talend MDM Installation Guide include:
                         - Splitting the MDM IG into two guides: one for Talend Open Studio for MDM and the other for Talend Enterprise MDM Studio.
                         - Updated documentation to reflect new product names. For further information on these changes, see the Talend website.

3. Feedback and Support

Your feedback is valuable. Do not hesitate to give your input, make suggestions or requests regarding this documentation or product, and find support from the Talend team, on Talend’s Forum website at:

http://talendforge.org/forum


Chapter 1. Prior to installing MDM

This chapter provides useful information on the software and hardware prerequisites you should be aware of prior to starting the installation of Talend MDM modules.


1.1. Hardware requirements

To make the most of Talend MDM, please consider the hardware recommendations listed in the following sections.

As the installation of MDM includes Talend Open Studio for Data Integration and its web application, related modules are included in the recommendation lists.

1.1.1. Memory usage

Memory usage depends heavily on the size and nature of your Talend projects. In short, if your Jobs include many transformation components, you should consider increasing the total amount of memory allocated to your servers, based on the following recommendations.

Product                         Client/Server   Recommended allocated memory
Talend MDM Web User Interface   Server          1 GB minimum (default configuration), 4 GB recommended
Talend Open Studio for MDM      Client          1 GB minimum, 2 GB recommended
Talend Administration Center    Server          1 GB
Commandline                     Server          1 GB
JobServer                       Server          Depending on your projects: 4 GB+

1.1.2. Disk usage

The same considerations apply to disk usage: it also depends on your projects, but can be summarized as follows:

Product                                   Required disk space for installation   Required disk space for use
Talend MDM Web User Interface             700 MB                                 Server: 1 GB+; MDM database: see (1)
Talend Open Studio for MDM                400 MB                                 1 GB+
Talend Administration Center              50 MB (WAR) + 70 MB (deployed)         ~50 MB (cache)
Commandline                               400 MB                                 1 GB+
JobServer                                 3 MB                                   3 MB + project size = 100 MB+
Talend Open Studio for Data Integration   400 MB                                 1 GB+

1. MDM database: 2 x number of records, in KB. For example: 5 M records = 10 GB. This is the size that will be needed on the disk. However, we recommend doubling this size in order to avoid problems during periods of heavy transaction load.
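
The MDM database sizing rule in the table above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the conversion assumes decimal units (1 GB = 1,000,000 KB), which is what makes the guide's own example (5 M records = 10 GB) come out exactly.

```python
def mdm_database_disk_gb(record_count, safety_factor=2):
    """Estimate the disk space (in GB) needed by the MDM database.

    Rule of thumb from the table: 2 KB per record, then multiply by a
    safety factor (2 is the recommended value) to absorb heavy
    transaction periods.
    """
    kb_per_record = 2
    kb_per_gb = 1_000_000  # assumption: decimal units, as in "5 M records = 10 GB"
    base_gb = record_count * kb_per_record / kb_per_gb
    return base_gb * safety_factor


if __name__ == "__main__":
    # Size strictly needed for 5 million records, then with the recommended margin.
    print(mdm_database_disk_gb(5_000_000, safety_factor=1))
    print(mdm_database_disk_gb(5_000_000))
```

Passing safety_factor=1 gives the size strictly needed on disk; the default applies the recommended doubling.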


1.1.3. Compatible Operating Systems

Despite our intensive tests, you might encounter some issues when installing Talend MDM on some Operating Systems.

Please refer to the grid below for a summary of supported OS environments. Based on reported issues, some operating systems are considered unsupported even though the issue can be resolved under particular conditions; in such cases, a note provides configuration details.

OS                                      Talend MDM Web User Interface   Talend Open Studio for MDM   Talend Administration Center / CommandLine   Talend Open Studio for Data Integration   JobServer
SUN SOLARIS 64 bits                     Working                         Working                      Working                                      Working                                   Working
SUN SOLARIS SPARC                       Working                         Working                      Working                                      Working                                   Working
SUN SOLARIS x86-64                      Working                         Working                      Working                                      Working                                   Working
WINDOWS XP                              Working                         Working                      Working                                      Working                                   Working
WINDOWS VISTA (32/64 bits)              Working                         Working                      Working                                      Working                                   Working
WINDOWS 2003/2008 SERVER (32/64 bits)   Working                         Working                      Working                                      Working                                   Working
LINUX MANDRIVA                          Working                         Working                      Working                                      Working                                   Working
LINUX DEBIAN / UBUNTU                   Working                         Working                      Working                                      Working                                   Working
LINUX REDHAT                            Working                         Working                      Working                                      Working                                   Working
LINUX CENTOS                            Working                         Working                      Working                                      Working                                   Working
HP UX                                   Working                         Working                      Working                                      Working                                   Working
IBM AIX (32/64 bits)                    Working (1)                     Working (2)                  Working (1)                                  Working (2)                               Working

1. Requires the use of an IBM JVM version 1.6+ 32 bits. Only limited support is provided. Contact Support for details.

2. Graphical mode is not supported, so only Commandline can be used.

1.1.4. Compatible Web browsers

Despite our intensive tests, you might encounter some issues when accessing Talend MDM Web User Interface with some Web browsers.

Please refer to the table below for a summary of supported Web browsers. Based on reported issues, some Web browsers are considered unsupported even though the issue can be resolved under particular conditions; in such cases, a note provides configuration details.

Web browser                               Talend MDM Web User Interface
Mozilla Firefox                           Working (Versions 3.0 and above)
Microsoft Internet Explorer 7 and above   Working (Versions below 9.0)
Google Chrome                             Working (1)
Safari                                    Working (1)
Opera                                     Working (1)

1. Only limited support is provided. Contact Support for details.

1.1.5. Naming conventions

The email you received from Talend lists a number of links to the software modules you are allowed to download according to the license you have. The file naming conventions are as follows:

Zip file naming convention                Example                                        Description
Talend-All-rYYYY-vA.B.C                   Talend-All-r63143-V4.2.2.zip                   Commandline interface to the IDE + Talend Open Studio for MDM IDE (GUI)
TMDM_TDQEEMPX-Server-All-rYYYY-VA.B.C     TMDM_TDQEEMPX-Server-All-r63143-V4.2.2.jar     The MDM server
TAC-rYYYY-vA.B.C                          TAC-r63143-V4.2.2.zip                          Talend Administration Center: Web-based application used to administrate Talend Integration Suite projects and users
org.talend.remote.jobserver_A.B.C_rYYYY   org.talend.remote.jobserver_4.2.2_r63143.zip   JobServer: Standalone execution server
Soamanager-rYYYY-VA.B.C                   soamanager-63143-V4.2.2.jar                    SOA Manager: helps deploy Web services Jobs

Where:

• YYYY: Revision number,

• A.B.C: Major.Minor.Patch version numbers (the patch level is given if relevant).

All the software modules must have the same version and revision. This means that both YYYY and A.B.C must match on the client side and on the server side.
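
The matching rule above can be checked mechanically before installing. The following Python sketch is an illustration only: the function names and the regular expressions are assumptions derived from the example file names in the table, and a name that does not carry an "r" before its revision number is reported as unrecognized rather than guessed at.

```python
import re

# Assumed patterns, inferred from the examples in the naming table:
# a revision written as "-rYYYY" or "_rYYYY", and a version written as A.B.C.
REVISION = re.compile(r"[-_]r(\d+)")
VERSION = re.compile(r"(\d+\.\d+\.\d+)")


def parse_module_name(file_name):
    """Return (revision, version) extracted from a module file name."""
    revision = REVISION.search(file_name)
    version = VERSION.search(file_name)
    if revision is None or version is None:
        raise ValueError(f"unrecognized module file name: {file_name}")
    return revision.group(1), version.group(1)


def modules_match(file_names):
    """True when every module shares the same revision and version."""
    return len({parse_module_name(name) for name in file_names}) <= 1


if __name__ == "__main__":
    downloaded = [
        "Talend-All-r63143-V4.2.2.zip",
        "TMDM_TDQEEMPX-Server-All-r63143-V4.2.2.jar",
        "org.talend.remote.jobserver_4.2.2_r63143.zip",
    ]
    print(modules_match(downloaded))
```

Comparing the parsed pairs as a set keeps the check independent of the order in which the client and server archives are listed.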

1.2. Third-party softwares

Some additional third-party applications are required for Talend MDM modules to work smoothly together.

As the installation of MDM includes Talend Open Studio for Data Integration and its web application, related applications are included in the lists.

• A Web application server able to deploy WAR files, for example:

- Apache Tomcat version 5.5 or 6.0 (version 6.0 is recommended) - http://tomcat.apache.org and/or

- JBoss Application Server version 4.2.2 - http://www.jboss.org/jbossas/downloads/

By default, the Talend global Installer installs both of the above servers. You can still customize the installation to deploy everything on JBoss alone. However, this configuration requires some expertise.

You are also not required to download JBoss prior to installation, as the server is included in the install bundle. For further information on the Talend global Installer, see the User Guide.

• Sun Microsystems (JDK or JRE) JVM 1.5+ (but version 1.6+ is recommended) - http://java.sun.com/javase/downloads/index.jsp

• Subversion for storing your projects - http://subversion.tigris.org/ or http://www.visualsvn.com/server/download/


Chapter 2. Installing the MDM server

This chapter provides information about how to install the MDM server using a graphical installer, your console, or the silent installation XML file generated at the end of installing the server via the installer.


2.1. Two different installation modes

You can install the MDM modules using either an executable or a jar file.

The common installation mode uses the executable file to install the MDM modules on Windows or Linux. The less common installation mode uses the jar file to install the MDM modules on all platforms other than Windows and Linux.

2.2. Installing MDM modules using the Windows/Linux executable file

The executable file allows you to launch a global Installer that helps to set up all Talend modules, including those for MDM.

However, if you want to use the global Installer to install only the MDM modules, you must select the Custom installation type in the Installer. For further information about using the global Installer to install MDM on Windows or Linux, see the User Guide.

2.3. Installing MDM modules using the jar file

The jar file allows you to launch a cross-platform, MDM-dedicated installer that installs JBoss 4.2.2 and deploys the MDM Server. The jar file is usually used on platforms other than Windows and Linux.

Using the jar file provided by Talend, you can install the MDM modules in two different modes:

• GUI mode: a cross-platform graphical installer helps you install JBoss 4.2.2 and deploy the MDM Server in simple click-next steps. On Windows, just double-click the .jar file included in the product archive file and follow the instructions. On other platforms, you may execute the jar by right-clicking it and selecting the OpenJDK JRE or Sun's JRE. For further information, see Section 2.3.1, “Installing in GUI mode”.

• Command/Console mode: open your command line and use the command: java -jar <jar name>.jar -console, and then follow the instructions to complete the installation of the MDM server. For further information, see Section 2.3.2, “Installing in Command/Console mode”.

The sections below explain in detail the above installation modes.

2.3.1. Installing in GUI mode

Talend Open Studio for MDM and Talend MDM Web User Interface, which make up Talend MDM, require that you install an MDM server.

Prerequisite(s):

- JDK 1.6.0 must be installed. You should also make sure that the JAVA_HOME environment variable is set to point to the JDK directory.

For example, if the path is C:\Java\JDKx.x.x\bin, you must set the JAVA_HOME environment variable to point to: C:\Java\JDKx.x.x.

- (Linux only) A window manager must be installed.

- It is highly recommended that the full path to the server installation directory be as short as possible and not contain any space characters.

- If you already have a suitable JDK installed in a path that contains a space, you simply need to put quotes around the path when setting the value of the environment variable.
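
The JAVA_HOME prerequisites above lend themselves to a quick pre-flight check. The Python sketch below is a hypothetical helper, not part of the Talend installer: the function name and the messages are assumptions, and it only tests the two points the guide mentions (the variable must point to the JDK directory rather than its bin folder, and a path containing a space should be quoted).

```python
import os


def check_java_home(java_home):
    """Return a list of problems with a JAVA_HOME value (empty list if none)."""
    if not java_home:
        return ["JAVA_HOME is not set"]
    problems = []
    stripped = java_home.strip()
    unquoted = stripped.strip('"')
    # Accept both Windows and Unix separators when reading the last folder name.
    last_folder = unquoted.replace("\\", "/").rstrip("/").split("/")[-1]
    if last_folder.lower() == "bin":
        problems.append("JAVA_HOME points to the bin folder instead of the JDK directory")
    if " " in unquoted and not stripped.startswith('"'):
        problems.append("JAVA_HOME contains a space and is not quoted")
    return problems


if __name__ == "__main__":
    for problem in check_java_home(os.environ.get("JAVA_HOME")):
        print(problem)
```

Run before launching the installer, the script prints nothing when the variable looks consistent with the prerequisites.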

To install the MDM server using a .jar file, complete the following:

• Unzip the server file provided by Talend.

• On Windows, double-click the cross-platform .jar file to run the installer.

A language selection pop-up displays.

On other platforms, you may execute the .jar file by right-clicking it and selecting the OpenJDK JRE or Sun's JRE.

• From the language selection pop-up, select an installation language from the list and click OK to close the pop-up and proceed to the next step.

• On the Talend MDM welcome page, click Next to proceed to the next step.

• Read the license agreement and select the accept option. Click Next to proceed to the next step.

• Read the JBoss information and click Next to proceed to the next step.

• Select the check boxes of the packs you want to install, and then click Next to proceed to the next step.

The check boxes of required packs are already selected and unavailable (MDM in this case). If you have a JBoss application server already installed on your machine and you do not want to re-install it, clear the JBoss check box.

• Browse to where you want to install JBoss and the MDM server, and then click Next to proceed to the next step. A message displays to inform you about the creation of a target directory.

• If you want to install JBoss as a Windows service, select the Create JBoss Windows service check box and then click Next to proceed to the next step.

• Read the installation settings, and then click Next to proceed to the next step and start the installation.

Two progress bars indicate how much of the installation has been completed.

• When the progress bars indicate the end of the installation, click Next to display a confirmation message that the installation completed successfully.

• Click Done to close the installer.

The MDM server is installed.

An MBean is provided to manage the MDM server caches and it is available in the JBoss JMX console.

To run the MDM server, execute run.bat (Windows) or run.sh (Linux) in the bin folder of the jboss-4.2.2.GA directory.

To shut the MDM server down, press Ctrl + C in the console window, or run bin/shutdown.bat or bin/shutdown.sh.

2.3.2. Installing in Command/Console mode

You can install the MDM server in a non-GUI mode using the command-line.

Prerequisite(s):

- JDK 1.6.0 must be installed. You should also make sure that the JAVA_HOME environment variable is set to point to the JDK directory.

For example, if the path is C:\Java\JDKx.x.x\bin, you must set the JAVA_HOME environment variable to point to: C:\Java\JDKx.x.x.

- (Linux only) A window manager must be installed.

- It is highly recommended that the full path to the server installation directory be as short as possible and not contain any space characters.

- If you already have a suitable JDK installed in a path that contains a space, you simply need to put quotes around the path when setting the value of the environment variable.

To use the command-line capabilities to install the MDM server:

• Unzip the .jar server file provided by Talend.

• Open a console on your platform.

• Enter the command below, and then press the Enter key on your keyboard to launch the installation procedure through this text-only interface.

java -jar <jar name>.jar -console

• Follow the instructions to install the MDM server.

Chapter 3. Migrating databases and MDM objects

This chapter provides you with information on how to migrate XML databases and other MDM objects (Jobs, pictures, workflows, etc.) on the MDM server.

3.1. Migrating MDM projects

The MDM repository and master records are both stored in the database. On startup, MDM compares the initial database version (the version that was set when you first launched the software) with the current version of the software and, if necessary, applies all the migration tasks needed to upgrade the database to the correct version.

However, as not everything is in the database, you must manually import and redeploy everything that is not in the database, namely:

• Jobs,

• workflows,

• pictures,

• web resources.

You must delete your web browser cache and cookies whenever you change the version of the Studio (Talend Open Studio for MDM or Talend Enterprise MDM Studio) or of Talend MDM. Unpredictable behavior or display errors will occur if you do not.

The sections below explain all the tasks you must carry out to complete the migration of all the data objects you have on the MDM server, including master records, Jobs, workflows, pictures and web resources.

3.1.1. Migrating the eXist database

Prerequisite(s): Make sure that neither MDM server is running.

To migrate the eXist database to a newer MDM version, complete the following:

• In the JBoss folder of the old MDM version, browse to:

jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war/WEB-INF/data

• Copy the data folder of the old MDM version and paste it in the same path in the new MDM version.

If you have pictures in your model, make sure to copy jboss-4.2.2.GA/server/default/deploy/zz.50.ext.imageserver.war/upload to the same path on the new server. For detailed information, see Section 3.1.3, “Moving the pictures and web resources”.

• Launch the MDM server and then Talend Open Studio for MDM of the new MDM version as usual. You should have access to the migrated data objects.
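
The copy step of this procedure can also be scripted. The Python sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, both arguments are expected to be the jboss-4.2.2.GA folders of the old and new installations, both servers are assumed to be stopped, and any folder already present at the target is renamed with an ".orig" suffix instead of being overwritten (a precaution that is not part of the guide).

```python
import shutil
from pathlib import Path

# Relative paths taken from the guide, below the jboss-4.2.2.GA folder.
EXIST_DATA = Path("server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war/WEB-INF/data")
IMAGE_UPLOAD = Path("server/default/deploy/zz.50.ext.imageserver.war/upload")


def migrate_folder(old_jboss, new_jboss, relative):
    """Copy one folder from the old JBoss tree to the same path in the new one."""
    source = Path(old_jboss) / relative
    target = Path(new_jboss) / relative
    if not source.is_dir():
        raise FileNotFoundError(source)
    if target.exists():
        # Assumption: keep what the new installation shipped with, as a backup.
        backup = target.with_name(target.name + ".orig")
        if backup.exists():
            raise FileExistsError(backup)
        target.rename(backup)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, target)
    return target
```

Calling migrate_folder(old, new, EXIST_DATA) moves the database content; calling it again with IMAGE_UPLOAD covers the pictures mentioned in the note.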

3.1.2. Reimporting and redeploying your Jobs

Prerequisite(s): Make sure that the MDM server is up and running.

If you have Talend Jobs in your old MDM application, complete the following to migrate these Jobs:

• Switch to the Data Integration perspective and import the old workspace to retrieve the Jobs. For further information on importing items from the remote repository, see the Talend Open Studio for Data Integration User Guide.

You can simply import your Jobs if they are exported in archive files from older MDM Studios. For further information on importing/exporting items, Routes or Jobs, see the Talend Open Studio for Data Integration User Guide.

• Deploy the Jobs to the new MDM server one by one. For further information, see the Talend Open Studio for MDM Administrator Guide.

You can also copy/paste the Job scripts (.war or .zip) from their corresponding folder in the old application to the same folder in the new application: jboss-4.2.2.GA/server/default/deploy for wars and jboss-4.2.2.GA/jobox/deploy for zips. However, this will not import the Job design, which you may need at some point. Another limitation of this copy/paste mode is that it is recommended only between two MDM servers that have the same major version (the first number of the version identifier). If the major versions differ, it is very likely that the MDM components will not work with the new MDM Server. If you are migrating between two identical versions, or two versions where only the minor version differs, copying the wars or zips will be a lot faster than redeploying the Jobs.

3.1.3. Moving the pictures and web resources

Prerequisite(s): Make sure that neither MDM server is running.

If you use pictures in your data model, complete the following to migrate them to the new MDM server:

• In the JBoss folder of the old MDM version, browse to:

jboss-4.2.2.GA/server/default/deploy/zz.50.ext.imageserver.war/upload

• Copy the upload folder of the old MDM version and paste it in the same path in the new MDM version.

• Launch the MDM server and then Talend Open Studio for MDM of the new MDM version as usual. You should have access to the migrated data objects.

If you use web resources (images, css, js, etc.) in your smart views, complete the following to migrate them to the new MDM server:

• In the JBoss folder of the old MDM version, browse to:

jboss-4.2.2.GA/server/default/deploy/jboss-web.deployer/ROOT.war

• Copy the web resources from the old MDM version and paste them in the same path in the new MDM version.

• Launch the MDM server and then Talend Open Studio for MDM of the new MDM version as usual. You should have access to the migrated data objects.

Chapter 4. Managing MDM database(s)

This chapter describes some XML database management options regarding database performance, database backup and restore, and the installation of an eXist standalone database.

4.1. Managing the eXist database

Talend Enterprise MDM Studio uses an eXist database to store the MDM repository and master data records. The sections below detail some management operations you can carry out on the eXist database.

4.1.1. eXist tuning and performance

The performance of Talend MDM depends to a large extent on the eXist database. Below are some tuning tips you can use to improve performance.

4.1.1.1. Configuration of the eXist cache

The eXist cache needs all the memory you can give it. By default, the eXist cache is very conservative: 48 MB. With this setting, there is a very good chance that for every request you make, eXist spends most of its time swapping pages back and forth in the cache. The same applies to most operations, including loading data. The eXist cache has a big impact on paging the record sets in both Talend Open Studio for MDM and the Talend MDM Web User Interface.

If you used the installer to install the MDM server, eXist is part of a web application which is hosted by JBoss:

TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war

Therefore it shares the JVM with JBoss. The total memory allocated to the JVM is specified with the -Xmx switch.

On Windows, set it in:

TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/bin/run.bat

On Linux, set it in:

TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/bin/run.conf

The default is 1 GB. Example in run.bat:

set JAVA_OPTS=%JAVA_OPTS% -Xms512m -Xmx1024m -XX:MaxPermSize=256m

Some of this memory can be allocated specifically to eXist cache. This is specified in the eXist settings:

TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war/WEB-INF/conf.xml

The property to look for is cacheSize. Change the 48 MB default to something more realistic.

It is recommended not to go over 1/2 of the JVM total memory (-Xmx) when the database coexists with other applications. Keep in mind that everything else in JBoss still needs memory to keep working.

<db-connection cacheSize="48M" (...)> <!-- change to 512M max when -Xmx is 1024 -->

On a 64-bit machine with plenty of memory, and with a 64-bit JVM, you can set -Xmx to a high number, say 8 GB, and cacheSize to much more than half of that.
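
The half-of-Xmx guideline for a JVM shared with JBoss can be derived directly from the JAVA_OPTS string. The Python sketch below is illustrative only: the function names are hypothetical, it covers only the shared-JVM case, and it assumes -Xmx is written with an m or g suffix as in the run.bat example.

```python
import re

# Assumption: -Xmx is given as a whole number followed by m/M or g/G.
XMX = re.compile(r"-Xmx(\d+)([mMgG])")


def xmx_megabytes(java_opts):
    """Extract the -Xmx value from a JAVA_OPTS string, in megabytes."""
    match = XMX.search(java_opts)
    if match is None:
        raise ValueError("no -Xmx switch found")
    value, unit = int(match.group(1)), match.group(2).lower()
    return value * 1024 if unit == "g" else value


def max_shared_cache_size(java_opts):
    """Largest cacheSize when eXist shares the JVM with JBoss: half of -Xmx."""
    return f"{xmx_megabytes(java_opts) // 2}M"


if __name__ == "__main__":
    print(max_shared_cache_size("-Xms512m -Xmx1024m -XX:MaxPermSize=256m"))
```

The returned string uses the same notation as the cacheSize attribute in conf.xml, so it can be pasted into the db-connection element as an upper bound.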

4.1.1.2. eXist outside J2EE

eXist also works as a standalone application outside a J2EE container. It then has its own JVM, so you can set -Xmx independently. You may delete or move:

TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war

And follow the instructions outlined in Section 4.1.3, “Standalone eXist”.

You can safely move the folder below to this new instance to get your data back:

TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war/WEB-INF/data

4.1.1.3. Range indexes

When you click the Search button with no search criteria, Talend MDM basically performs a full search, and the index is of no help. However, as soon as you set some criteria, you should set up the corresponding range indexes in the database to improve performance.

The following procedure is based on http://exist-db.org/indexing.html.

Basically, you may want to index every primary key and foreign key, as well as every element specified in the searchable section of the browse items views.

Talend MDM data containers are stored as collections under /db. For instance, the path to a Product data container is: /db/Product. To specify an index for this collection, create a document called collection.xconf under:

/db/system/config/<same full path to the collection>

For example:

/db/system/config/db/Product/collection.xconf
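
The mapping from a data container to the location of its index configuration is mechanical, as this small Python sketch shows. The function name is hypothetical; the rule itself (prefix the full collection path with /db/system/config and append collection.xconf) is the one stated above.

```python
import posixpath

CONFIG_ROOT = "/db/system/config"


def xconf_path(collection_path):
    """Map a collection path such as /db/Product to its collection.xconf path."""
    if collection_path != "/db" and not collection_path.startswith("/db/"):
        raise ValueError("data containers are stored as collections under /db")
    # The configuration tree mirrors the full collection path, including "db".
    return posixpath.join(CONFIG_ROOT, collection_path.strip("/"), "collection.xconf")


if __name__ == "__main__":
    print(xconf_path("/db/Product"))
```

For the Product container this yields the example path shown above.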

• Launch the Admin Client as outlined in Section 4.1.2.1, “Launching the eXist Admin Client ”.

• Navigate to:

/db/system/config

• Use File - Create Collection to create a “db” collection under:

/db/system/config

• Go into the newly created collection, so the current path is now:

/db/system/config/db

• Use File - Create Collection to create a collection with the exact same name as the data-container to index; e.g. Product.

• Go into the newly created collection, so the current path is for instance:

/db/system/config/db/Product

• Use File - Create blank Document to create a new empty document with this exact name: collection.xconf.

• Open collection.xconf to specify the index.

Below is a sample collection.xconf:

<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://exist-db.org/collection-config/1.0">
  <index>
    <!-- Range indexes -->
    <create qname="Id" type="xs:string"/>
    <create qname="AgencyFK" type="xs:string"/>
    <create qname="Name" type="xs:string"/>
    <create qname="Firstname" type="xs:string"/>
    <create qname="Lastname" type="xs:string"/>
    <!-- Full text index -->
    <lucene>
      <text qname="Product">
        <ignore qname="Id"/>
      </text>
    </lucene>
  </index>
</collection>

• Navigate back to the top level /db, select your data-container (e.g. Product) and run File - Reindex Collection.

If you skip this step, only new records will be indexed, so the index will be incomplete and, consequently, you will be able to search only new records.

You can specify range indexes for integers, decimals, dates and strings. You can also create full-text indexes. Please refer to http://exist-db.org/indexing.html.

It is recommended to identify the element by QName instead of XPath. Therefore, it is a best practice to always give the PKs the same name, for instance Id, so that if you set an index on this QName, all PKs will be indexed.
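
When several data containers need the same kind of index definition, the collection.xconf document can be generated rather than typed by hand. The following Python sketch is an illustration, not a Talend or eXist tool: the function name and its parameters are hypothetical, and it types every range index as xs:string, as in the sample, although other types are possible.

```python
from xml.sax.saxutils import quoteattr


def build_collection_xconf(range_index_qnames, fulltext_qname, ignored_qname="Id"):
    """Build a collection.xconf document with range indexes and a Lucene index."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<collection xmlns="http://exist-db.org/collection-config/1.0">',
        "  <index>",
    ]
    for qname in range_index_qnames:
        # Assumption: every indexed element is a string, as in the sample.
        lines.append(f'    <create qname={quoteattr(qname)} type="xs:string"/>')
    lines += [
        "    <lucene>",
        f"      <text qname={quoteattr(fulltext_qname)}>",
        f"        <ignore qname={quoteattr(ignored_qname)}/>",
        "      </text>",
        "    </lucene>",
        "  </index>",
        "</collection>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_collection_xconf(["Id", "AgencyFK", "Name"], "Product"))
```

The generated text still has to be pasted into the collection.xconf document through the Admin Client, and the collection reindexed afterwards.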

4.1.1.4. Full-text indexes (Lucene)

As eXist embeds Lucene, you just need to add the entity name (e.g. Product) in the searchable section of the browse items view in order to allow the web interface to select “full-text search”.

Keep in mind that the search is performed in Lucene, so if your Lucene index is not populated, you will get no results. You specify the Lucene index in collection.xconf, just like the range indexes:

<lucene>
  <text qname="Product">
    <ignore qname="Id"/>
  </text>
</lucene>

The <ignore> element tells Lucene not to index the Product key (it is usually meaningless, so there is no need to pollute the index).

4.1.1.5. “Too many open files” issue

By default eXist serializes the XML documents extracted from the database onto the file-system. It does that twice:on the server-side, and as part of the XML-RPC API we use to communicate with eXist. As a result, when you setthe page size to a relatively high number (several hundred), eXist creates too many temporary files. This might beinvisible on some OS (Windows) but you are very likely to reach a system limit on Linux, where the maximumnumber of the file handles that can be created by one JVM is 1024.

eXist creates temporary files in the first place because native XML databases were originally containers for big, if not huge, XML documents, and deserializing those in memory was hardly an option. However, most of the time this use case does not apply to MDM, where you will usually have numerous small documents.

In addition, since eXist is an open-source database, we have modified it to optionally not create temporary files. The standard installation by the graphical installer uses this modified version by default. To activate the option, add the following options in JAVA_OPTS:

-Dorg.exist.xmldb.inMemory.remote.content=true
-Dorg.exist.xmlrpc.inMemory.retrieve.content=true

In the end, this is what the JAVA_OPTS variable could look like in run.bat:

set JAVA_OPTS=%JAVA_OPTS% -Xms512m -Xmx1024m -XX:MaxPermSize=256m -Dorg.exist.xmldb.inMemory.remote.content=true -Dorg.exist.xmlrpc.inMemory.retrieve.content=true
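On Linux, the same options would go into the shell startup script. The lines below are a sketch only: they assume that run.sh reads a JAVA_OPTS variable in the same way as run.bat, which you should verify in your own copy of the script.

```shell
#!/bin/sh
# Assumption: run.sh honours JAVA_OPTS just as run.bat does.
# Memory settings, same values as in the run.bat example.
JAVA_OPTS="${JAVA_OPTS:-} -Xms512m -Xmx1024m -XX:MaxPermSize=256m"
# Keep eXist from creating temporary files on both code paths.
JAVA_OPTS="${JAVA_OPTS} -Dorg.exist.xmldb.inMemory.remote.content=true"
JAVA_OPTS="${JAVA_OPTS} -Dorg.exist.xmlrpc.inMemory.retrieve.content=true"
export JAVA_OPTS

echo "JAVA_OPTS=${JAVA_OPTS}"
```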

4.1.2. eXist’s database backup/restore

Backups are strongly recommended to protect your data in the event of a system crash or loss of data. Backups are also very useful for exporting data in order to re-import all or parts of the data to a different database, e.g. while upgrading eXist to a newer version.

eXist provides different methods for creating backups. For detailed information, see http://exist.sourceforge.net/backup.html. You may use the Admin Client web application to back up and restore eXist data.

4.1.2.1. Launching the eXist Admin Client

Before you can use the eXist Java Admin Client to back up and restore data, you first need to launch it.

To launch the Java Admin Client, complete the following:

• Connect to http://localhost:8080/exist/.

• At the bottom left corner of the page and in the Administration panel, click Launch.

• Click OK to accept that the Java Webstart Launcher starts the administration client.

If a security warning message is displayed, accept to run the application in order to proceed to the next step.

• In the login page of the administration client, enter admin in the Username field.

• Make sure the Type is set to Remote and the URL is set to:

xmldb:exist://localhost:8080/exist/xmlrpc

• In the Password field, enter 1bc29b36f623ba82aaf6724fd3b16718.

The administrator password is specified in jboss-4.2.2.GA/bin/mdm.conf.

• If required, enter a favorite in the Title field, MDM DB for example, and then click the Save button to the right of the page.

The new favorite is listed in the Favorites list. The next time you want to launch the administration client, you can double-click this favorite to fill in the login information instead of entering it manually.

• Click OK to close the login page and open the administration client.

From this page, you can see the content of the eXist database. You can also use the buttons at the top of the page to carry out different data management operations, including creating backups.

4.1.2.2. Backing up and restoring data

From the Java Admin Client, you can use the backup and restore buttons to create backups of the eXist data and to restore your database files from a backup, respectively. For detailed information, see http://exist.sourceforge.net/backup.html.

4.1.3. Standalone eXist

The XML database can be installed and run in two modes in Talend MDM: either embedded as a component of the application server, which is the default, or as a separate application, independent of the application server.

In standalone mode, eXist can run on a different machine from the MDM server. The only requirement is that a TCP connection can be established on a single selectable port from the machine hosting the MDM server to the machine hosting the eXist server.
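The TCP requirement can be checked from the machine hosting the MDM server before changing any configuration. The sketch below is one possible way to do it, assuming bash is available (it relies on bash's /dev/tcp redirection, which plain sh does not provide); can_connect is a hypothetical helper name, and exa and 8088 are placeholder values.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Relies on bash's built-in /dev/tcp redirection, so it needs bash, not plain sh.
can_connect() {
  host="$1"
  port="$2"
  # The subshell opens file descriptor 3 on the socket; the descriptor is
  # closed again when the subshell exits.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example call (exa and 8088 are placeholder values):
#   can_connect exa 8088 && echo "eXist port reachable" || echo "eXist port not reachable"
```

Note that this helper sets no timeout of its own: if a firewall silently drops the packets, the call may block until the operating system gives up on the connection.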

The sections below describe the steps required to install a standalone eXist and use it with Talend MDM.

4.1.3.1. Downloading eXist

• Download the latest eXist .jar installer from http://www.exist-db.org/download.html.

Make sure you remember the password of the admin user.

4.1.3.2. Fine-tuning eXist

How to edit {eXist Dir}/bin/functions.d/eXist-settings.sh

Allocate memory to eXist by updating the -Xmx parameter in set_java_options(). For instance, this will allocate 2GB of RAM for eXist:

set_java_options() {
    if [ -z "${JAVA_OPTIONS}" ]; then
        JAVA_OPTIONS="-Xms128m -Xmx2048m -Dfile.encoding=UTF-8";
    fi
    JAVA_OPTIONS="${JAVA_OPTIONS} -Djava.endorsed.dirs=${JAVA_ENDORSED_DIRS}";
}

How to edit {eXist Dir}/conf.xml

• If needed, change the admin password in:

<cluster dbaPassword=[enter your password here]

• Increase the cache memory in:

<db-connection cacheSize="xxM"

to no more than half of the allocated heap size (i.e. the previous -Xmx parameter in JAVA_OPTIONS).

• Increase the collection cache in:

<db-connection collectionCache="yyM"

to no more than half of the cache, and only if you are using a lot of containers/collections (heavy use of versions and revisions in Talend Open Studio for MDM).

• Activate automatic backups (recommended) by uncommenting the section:

<job type="system" name="backup"

Backups are triggered every 6 hours by default. This may be changed using a cron-like syntax.
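The two "no more than half" rules for the cache settings above are easy to apply with a little arithmetic. The sketch below works through them for the 2GB heap used in the eXist-settings.sh example; half_of is a hypothetical helper name, and the resulting figures are upper bounds, not recommended values.

```shell
#!/bin/sh
# Hypothetical helper: integer half of a size given in megabytes.
half_of() {
  echo $(( $1 / 2 ))
}

xmx_mb=2048                                  # -Xmx2048m from the example above
cache_mb=$(half_of "$xmx_mb")                # upper bound for cacheSize
collection_cache_mb=$(half_of "$cache_mb")   # upper bound for the collection cache

echo "cacheSize: at most ${cache_mb}M"
echo "collection cache: at most ${collection_cache_mb}M"
```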

How to edit {eXist Dir}/server.xml

Change the port on which eXist listens for requests in the element:

<listener port="xxxx"

A typical value is 8088.

The default value (8080) clashes with the port used by JBoss and the MDM Server.

4.1.3.3. Launching eXist

You can run eXist in two modes:

• A lightweight, server-only mode, with no web-based administration,

• A complete mode that includes the web administration. Check the eXist documentation for the implications.

Start eXist through:

• server.sh, or server.bat as appropriate, for the server-only mode,

• startup.sh, or .bat, for the complete web-based mode.

4.1.3.4. Updating the MDM server to use the standalone eXist

• Stop the MDM Server.

• In {MDM JBoss Dir}/bin/mdm.conf, set the eXist DB settings as follows:

######################################################
# eXist DB Setting
######################################################
xmldb.server.name = the server name or the IP address of the server running eXist
xmldb.server.port = the port set in server.xml above
xmldb.administrator.username = usually "admin"
xmldb.administrator.password = the admin password, may be empty after a default install
xmldb.dburl = xmlrpc/db if you start eXist through server.sh, or exist/xmlrpc/db if you start it through startup.sh
xmldb.isupurl = leave empty if you start eXist through server.sh, or xmlrpc/db if you start it through startup.sh

Make sure the settings in “xmldb.dburl” and “xmldb.isupurl” are consistent with the mode you chose.

Below is an example for a default install of eXist on a machine called exa, starting eXist with server.sh:

######################################################
# eXist DB Setting
######################################################
xmldb.server.name=exa
xmldb.server.port=8088
xmldb.administrator.username=admin
xmldb.administrator.password=
xmldb.dburl=xmlrpc/db
xmldb.isupurl=
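Because xmldb.dburl and xmldb.isupurl must match the way eXist is started, it can help to derive both values from the start script name. The sketch below encodes only the rules given in this section; xmldb_urls_for is a hypothetical helper name and is not shipped with Talend MDM.

```shell
#!/bin/sh
# Hypothetical helper: print the xmldb.dburl and xmldb.isupurl lines that
# match the script used to start eXist, following the rules listed above.
xmldb_urls_for() {
  case "$1" in
    server.sh|server.bat)
      # Server-only mode: no "exist/" prefix, empty isupurl.
      printf 'xmldb.dburl=%s\nxmldb.isupurl=%s\n' 'xmlrpc/db' ''
      ;;
    startup.sh|startup.bat)
      # Complete web-based mode.
      printf 'xmldb.dburl=%s\nxmldb.isupurl=%s\n' 'exist/xmlrpc/db' 'xmlrpc/db'
      ;;
    *)
      echo "unknown start script: $1" >&2
      return 1
      ;;
  esac
}

xmldb_urls_for startup.sh
```

Comparing this output with the two corresponding lines in mdm.conf is a quick way to spot a mismatch between the configuration and the chosen mode.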

Since you will not need the embedded eXist, you may archive {MDM JBoss Dir}/server/default/deploy/exist-1.4.0-rev10440.war somewhere else to prevent JBoss from deploying it. You may also want to copy the WEB-INF/data directory to {eXist Dir}/webapp/WEB-INF/data if you want to restore the exact same database.

At this point you can start up the MDM Server.

4.1.3.5. General notes

• The URL to enter in the eXist client to access a standalone, server-mode eXist (started through server.sh) is:

xmldb:exist://{name of the machine}:8088/xmlrpc

• The Talend MDM run.sh or run.bat startup script should include a mechanism to start the eXist server before the MDM server itself is started. Something like (Unix/Linux/Mac only):

# Check if eXist is up
if [ -n "`pgrep -l -f exist.home`" ]; then
  echo "eXist is already running"
else
  echo "*****************************************"
  echo "** Starting eXist"
  echo "*****************************************"
  # Timestamp used to name the log file
  d=`date +%Y%m%d%H%M%S`
  # Start eXist in the background, logging stdout and stderr (bash syntax)
  /opt/eXist-1.4/bin/server.sh &> server_$d.log &
fi

Chapter 5. Important Configuration subjects

This chapter provides useful information about miscellaneous configuration subjects, including configuring the session timeout or access control and changing the default ports in JBoss.

5.1. Configuring session timeout for the Web User Interface

The user session timeout for Talend MDM Web User Interface is set to 30 minutes by default. The business user or data steward is redirected to the login page of the Web User Interface after 30 minutes of inactivity.

You can change this session timeout, if required.

To set up a new timeout for users connecting to the Web User Interface, complete the following:

• In the JBoss folder, browse to the web.xml file in:

server\default\deploy\jboss-web.deployer\conf

• Open the web.xml file in a text editor and search for the following tag:

<!-- Default Session Configuration -->

• Change the value of the default session timeout as desired.

• Save your modifications.

The new session timeout parameter has been set for users connecting to the Web User Interface.
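In the standard servlet deployment descriptor, the default timeout is expressed in minutes by a session-timeout element nested in session-config, which is what follows the Default Session Configuration comment. The sketch below reads and rewrites that value with sed. The helper names and file names are hypothetical, and the change is written to a copy so that the original web.xml is left untouched.

```shell
#!/bin/sh
# Hypothetical helper: print the <session-timeout> value (in minutes)
# found in the web.xml file given as $1.
get_session_timeout() {
  sed -n 's:.*<session-timeout>\([0-9][0-9]*\)</session-timeout>.*:\1:p' "$1"
}

# Hypothetical helper: write a copy of web.xml ($1) to $2 with the timeout
# set to $3 minutes. The original file is not modified.
set_session_timeout() {
  sed "s:<session-timeout>[0-9][0-9]*</session-timeout>:<session-timeout>$3</session-timeout>:" "$1" > "$2"
}

# Example (file names are placeholders):
#   get_session_timeout web.xml
#   set_session_timeout web.xml web.xml.new 60
```

Review the generated copy before replacing the original file, and restart JBoss for the new value to be taken into account.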

5.2. Configuring access control information for the Studio and the Web User Interface

The default authorized users for Talend Open Studio for MDM and Talend MDM Web User Interface use the following authentication information: admin as the login and talend as the password for the Studio; user or administrator as the login, with user or administrator respectively as the password, for the Web User Interface.

It is possible for an administrator to change this access control information, if required.

To configure new logins and passwords, complete the following:

• Browse to the login-config.xml file in:

JBoss\server\default\conf

• Double-click this file to open it and search for the following tag:

<!-- Policy for talend MDM -->

• Change the default access control information in the following elements, as desired:

<module-option name="logins">admin,administrator,user</module-option>

<module-option name="passwords">talend,administrator,user</module-option>

• Save your modifications.

The new logins and passwords have been set for the Studio and the Web User Interface.
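The default values pair each login with the password at the same position in the list (admin with talend, administrator with administrator, user with user). Assuming this positional pairing also applies to your own values, a quick check that both lists have the same number of entries can catch an editing mistake. The sketch below is illustrative only; count_entries is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count the entries in a comma-separated list.
count_entries() {
  echo "$1" | awk -F',' '{ print NF }'
}

# Values copied from the two module-option elements.
logins="admin,administrator,user"
passwords="talend,administrator,user"

if [ "$(count_entries "$logins")" -eq "$(count_entries "$passwords")" ]; then
  echo "logins and passwords have the same number of entries"
else
  echo "mismatch between logins and passwords" >&2
fi
```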

5.3. Changing the default ports in JBOSS

You may also want to browse the JBoss documentation for running multiple instances of JBoss on the same machine at http://community.jboss.org/wiki/ConfiguringMultipleJBossInstancesOnOnemachine.

5.3.1. Default port list

Below is the default port list:

Port    Change in

8080    bin/mdm.conf
        deploy/jboss-web.deployer/server.xml
        deploy/http-invoker.sar/META-INF/jboss-service.xml
        deploy/jbossws.sar/jbossws.beans/META-INF/jboss-beans.xml

8443    deploy/jboss-web.deployer/server.xml
        deploy/jbossws.sar/jbossws.beans/META-INF/jboss-beans.xml

8009    deploy/jboss-web.deployer/server.xml

3873    deploy/ejb3.deployer/META-INF/jboss-service.xml

8093    deploy/jms/uil2-service.xml

8083    conf/jboss-service.xml

1099    conf/jboss-minimal.xml
        conf/jboss-service.xml

1098    conf/jboss-minimal.xml
        conf/jboss-service.xml

4444    conf/jboss-service.xml

4445    conf/jboss-service.xml

4446    conf/jboss-service.xml

5.3.2. Using an alternate binding

• Browse to the following file:

jboss-4.2.2.GA\server\default\conf\jboss-service.xml

• Uncomment the following:

<mbean code="org.jboss.services.binding.ServiceBindingManager"

name="jboss.system:service=ServiceBindingManager">

<attribute name="ServerName">ports-01</attribute>

<attribute name="StoreURL">${jboss.home.url}/docs/examples/binding-manager/sample-bindings.xml</attribute>

<attribute name="StoreFactoryClassName">

org.jboss.services.binding.XMLServicesStoreFactory

</attribute>

</mbean>

• In \jboss-4.2.2.GA\bin\mdm.conf, modify the HTTP port accordingly:

#xmldb.server.port=8080

xmldb.server.port=8180

• Windows service only: update the port in \jboss-4.2.2.GA\bin\service.bat:

call shutdown -s jnp://localhost:1199 -S < .s.lock >>shutdown.log 2>&1
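The two values shown in this procedure (8180 for HTTP and 1199 for JNP) are the defaults 8080 and 1099 shifted by 100. The sketch below reproduces that arithmetic; shift_port is a hypothetical helper name, and applying the same offset to the other ports of the default port list is an assumption that you should confirm in sample-bindings.xml.

```shell
#!/bin/sh
# Hypothetical helper: add an offset ($2) to a port number ($1).
shift_port() {
  echo $(( $1 + $2 ))
}

# Offset observed between the default ports and the values used above.
offset=100

# 8080 is the default HTTP port, 1099 the default JNP port.
for port in 8080 1099; do
  echo "${port} -> $(shift_port "$port" "$offset")"
done
```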