Understanding ATG Data Anywhere Architecture™ Efficient, transactional data access without writing code using Dynamo Repositories

April 2002

ATG White Paper Pat Durante Senior Practice Manager, ATG Education Services

UNDERSTANDING ATG DATA ANYWHERE ARCHITECTURE™

Contents

1 Executive Summary

2 The Challenge of the Data Access Problem
    Hasn’t the Data Access Problem Been Solved?
    Why Should You Care about the ATG Data Anywhere Architecture™?

3 ATG Data Anywhere Architecture™
    Data Source Independence
    Understanding the ATG Data Anywhere Architecture™
    Repository Basics
    Using Repository Data
        The Repository API
        RepositoryFormHandler
        Dynamo Servlet Beans and the Repository Query Language (RQL)

4 Less Java Code: Faster Time-to-Market, Less Maintenance
    Using the Visitor Profile (Out-of-the-Box)
    Extending the Definition of a Repository Item
        Using a Simple Auxiliary Table (to model a one-to-one relationship)
        Using a "Multi" Table (to model a one-to-many relationship)
    Switching to an Alternative Relational Database Management System
    Converting from one type of database to another

5 A Unified View of Customer Interactions

6 Maximum Performance Through Intelligent Caching
    Case 1: Single Dynamo Server
    Case 2: Read frequently, modify rarely or never
    Case 3: Modifications made by one Dynamo server at a time
    Case 4: Modification by multiple Dynamo servers
    Case 5: Modification by a non-Dynamo Application
        Disabling Caching
        Invalidating the Cache
    Controlling the Cache Sizes

7 Simplified Transactional Control
    Overview of Transactional Integrity
    The J2EE Approach to Transactional Integrity
    ATG Data Anywhere Support for Transactional Integrity
    Advantages of the ATG Data Anywhere Approach
    Example Page
    Default Transactional Behavior
    Recommendations

8 Strong Built in Search Capabilities

9 Fine-grained Access Control
    Case 1: Controlling access to all items of the same type
    Case 2: Controlling access to specific items
    Case 3: Controlling access to specific properties
    Case 4: Limiting Query Results
    Creating a Secured Repository

10 Conclusions

Appendix: Other Sources of Information
    Documentation
    Education

1 Executive Summary

Providing good online service requires access to lots of data. At most companies, this data is spread among

different data stores and in different data formats across the enterprise. To provide a single face to their

customers, firms need to utilize the data in all those silos. Companies also benefit by putting together a complete

picture about each customer and driving their marketing and sales efforts more effectively.

Accessing data for online use is difficult. Data has to be cached efficiently to prevent bottlenecks. Software has to

provide transactional integrity, so that accounts will be accurate. It has to provide rich tools for important

functions like searching. And data needs to be secured to prevent unauthorized access. Most importantly,

accessing data has to be easy. Since there is so much work to create and maintain data access, some developers

end up spending the majority of their time simply trying to integrate data sources.

The ATG Data Anywhere Architecture™, featuring Dynamo Repositories, provides a world in which a simple XML

file is all you need to integrate a new data source for online use. This environment provides a wealth of caching

choices, ensures transactional integrity, and offers the rich tools needed to rapidly manipulate, search and secure

data. It also provides a world where access to data stored in file systems, relational databases and LDAP directories

is all accomplished using the same set of interfaces. This world is accessible to all applications built using ATG

products.

What does this mean? Faster time to market, better maintainability, and more extensibility combine to decrease

total cost of ownership of web applications. With ATG Data Anywhere Architecture™, developers can focus on

implementing business logic rather than spending time writing "wrapper classes" for each persistent data type.

ATG Data Anywhere Architecture™ offers several advantages over the standard data access methods such as Java

Data Objects (JDO), Enterprise JavaBeans (EJB), and Java Database Connectivity (JDBC). Among the differences:

• Data source independence – ATG Data Anywhere Architecture™ provides access to relational

database management systems, LDAP directories, and file systems using the same interfaces. This

insulates application developers from changes to both the schema and the storage mechanism. Data can

even move from a relational database to an LDAP directory without requiring re-coding. Java Data

Objects support data source independence, but it is up to vendors to provide an LDAP

implementation.

• Fewer lines of Java code – Less code leads to faster time-to-market and reduced maintenance cost.

Persistent data types created using ATG Data Anywhere are described in an XML file. Absolutely no

Java code is required.

• Unified view of all customer interactions – A unified view of customer data (gathered using web

applications, call center applications, and ERP systems) can be provided without copying data into

a central data source. This unified view of customer data leads to a coherent and consistent

customer experience.

Figure 1 – The unified view of data access provided by the ATG Data Anywhere Architecture™

• Maximum performance – Our intelligent caching of data objects ensures excellent performance and timely,

accurate results. The JDO and EJB standards rely on a vendor implementation of caching which may or may not be

available.

• Simplified Transactional Control – The key to overall system performance is minimizing the impact of transactions

while maintaining the integrity of your data. In addition to full Java Transaction API (JTA) support, ATG Data

Anywhere allows both page developers and software engineers to control the scope of transactions using the

same transactional modes (required, supports, never, etc.) used by EJB deployment engineers.

• Powerful built-in search capabilities – Quality search tools lead to increased visitor satisfaction and efficiency

(which often leads to increased or sustained revenue). Customers can’t buy what they can’t find.

• Fine-grained access control – Control who has access to which data at the level of the data type, the data object, or even the

individual property, using Access Control Lists (ACLs).

• Integration with ATG product suites – Our award winning personalization, scenarios, commerce, and portal

applications all make use of Repositories for data access. A development team is free to use EJBs alongside ATG

technology, but the easiest way to leverage investment in ATG technology is to follow the example set by our

solution sets. Our solution sets satisfy all of their data access needs using Repositories.

Technical leads and architects are faced with difficult choices to make when deciding upon the data access mechanism used for a new

application. Some think JDO or J2EE/EJB may be the right choice since they both offer portability across application server vendors.

However, in addition to all of the advantages above, the ATG Data Anywhere Architecture™ is also portable across application servers.

With support for Dynamo Application Server, BEA WebLogic and IBM WebSphere, applications built using ATG Data Anywhere

Architecture™ can be deployed on the majority of the application server market. The bottom line: ATG Data Anywhere Architecture is

the most powerful, most flexible, easiest to use data access method available. It saves developers time and frustration. It helps

customers have a better experience. It saves organizations money. Can there be any other choice for your next project?

[Figure 1 diagram labels: Profiles; Physical Data Storage; Customer Profile Data; Employee Directory; Product Catalog; Content Management System; Call Center Database; Sales Force Database; Product Catalog; Content; Analytics; EJB Containers; Security]

2 The Challenge of the Data Access Problem

In the first generation of sites for the World Wide Web, most companies developed simple sites with largely static content

describing their goods and services. Known as “Brochure Ware,” early sites proved to be a cost effective tool for displaying

information, and a popular way for clients to do basic research on companies.

As the Internet became more popular, firms recognized that the Web had the possibility of becoming a complete channel,

able to service a large section of their client base for many of their needs. While providing the access that clients increasingly

sought, Web sites also offered firms the ability to significantly decrease the cost of serving customers by making a self-

service option available around the clock. Firms like Amazon and Fidelity recognized that providing excellent service at low

cost could create a significant competitive advantage over their rivals.

However, firms quickly realized that to change customer behavior in large numbers, web sites had to offer a similar or better

level of service than traditional channels offered. Many firms with business strategies predicated on lower cost, without

providing a superior customer service experience, dropped from the market in record numbers.

In order to provide good service via the Internet, firms have had to offer intelligent web sites. These next generation web

sites are able to understand a client’s account, just as a real customer service representative is able to do. To fulfill the vision of

excellent self-service, sites had to develop from simple brochures to rich, transactional environments able to satisfy

customer needs as completely as possible. The more data sources that are available, the better the chance that the

customer’s request can be answered.

Figure 2– EMC's Powerlink Required Approximately 20 Integrations

EMC’s Powerlink system (see Figure 2) is a great example of the kinds of integrations necessary for a modern, world-class

web site. (A detailed case study on the EMC Powerlink project is available on atg.com.) A typical enterprise web site will

require integrating data from 15 to 50 systems. A more complex site can require far more integrations.

Adding to the challenge of numerous and varied data sources is the problem of transforming data gathered from these

external systems into an object-oriented framework. Information is organized differently in these external systems.

Relational databases store information in tables; file systems and LDAP directories store data hierarchically.

Hasn’t the Data Access Problem Been Solved?

Some data access problems have, in fact, been solved. Java Database Connectivity (JDBC) enables our web applications to

interact with relational database management systems in a vendor independent way. Your applications are finally loosely

coupled with your database vendor. Unfortunately, JDBC does not insulate your application from the database schema, nor

does it map well into the object-oriented space. JDBC is a fairly low-level technology. Application developers execute SQL

statements to interact with the data source and if a result set is returned from a query, the developer must transform the

results into objects by writing code.

Enterprise JavaBean (EJB) technology enables web applications to interact with relational databases in a schema

independent way. The mapping between EJB properties and database columns is provided in an XML deployment descriptor

file. Also, the connection logic and the transactional control are handled outside of application code, thus freeing up the

developers to focus on application logic. On the downside, the need for absolute portability across application servers has led

to complexity in both the coding and the configuration required to get an Enterprise JavaBean up and running. Developers

have to write at least two interfaces and one Java class for each EJB as well as provide a wide assortment of intricate

configuration details. Modern development tools such as JBuilder (by Borland) have made the process of creating and

configuring EJBs easier than ever, but developers still face a steep learning curve and tedious development and

configuration tasks when working with EJB technology.

Java Data Objects (JDO) is the latest standard data access approach approved through the Java Community Process. JDO

offers a more transparent data access mechanism. JDO allows developers to persist any Java class without source code

modification. On the downside, developers still need to write a Java class for each persistent type (and modify that Java class

to add or remove properties). Also, JDO is going to rely on vendor implementations for caching of data objects and LDAP

access.

JDBC, Enterprise JavaBeans, and JDO focus on solving only some of the data access challenges described above. If your web

application needs to access data in a file system or in an LDAP directory, you'll be forced to use yet-another technology.

Why Should You Care about the ATG Data Anywhere Architecture™?

As the chart below shows, ATG Data Anywhere Architecture can do anything the other approaches can do, and also much

more. The ATG Data Anywhere Architecture™ was designed to meet the demanding requirements of web applications. Our

technology enables web applications to access data in a data source and schema independent way without writing code to

transform or store data in an object. Data Anywhere Architecture is a higher-level abstraction that leads to faster time-to-

market and higher reliability.

Think about the possibilities: if the integration with individual data sources is simpler, the number of integrations your team

can complete in the same amount of time will increase. The more successful integrations your team builds, the more

intelligent your customer interactions can be. In an economy where customer retention is key, the web experience can make

or break the success of your business.

As an instructor who has taught and used both J2EE/EJB technology and ATG Data Anywhere Architecture since they were

first implemented, I have seen the differences firsthand. To teach a developer the J2EE/EJB approach (create a JSP that

accesses either a JavaBean or a Servlet that in turn accesses a container-managed entity bean) takes 4 days. In contrast, I

teach developers to use the ATG Data Anywhere Architecture™ in 2 days, including covering much of the additional

functionality provided. This is my personal measure of the elegance of the ATG Data Anywhere Architecture.

Challenge | Description | Entries for ATG Data Anywhere Architecture, JDO, Enterprise JavaBeans (EJB), Java Database Connectivity (JDBC)

Data Source Independence | Application logic does not change based upon the type of data source (e.g., relational database, XML file, LDAP directory, etc.). | ✓ ✓ With Connectors or BMP

Schema Independence | Application logic is completely independent from the schema (e.g., table names, column names, table relationships, etc.) so that if the schema needs to change (e.g., a column is added or removed), the application doesn't need to be changed. | ✓ ✓ ✓

Object Relational Mapping | Applications interact with objects, not relationally or hierarchically organized data. | ✓ ✓ ✓

No Java Classes | Developers do not need to write, compile and test Java classes (or interfaces) for each persistent data type that they want to use in their application, reducing development time and errors. | ✓

Portability Across Application Servers | Applications that make use of the data access solution are portable to other application servers. | ✓ ✓ ✓ ✓

Intelligent Caching | The data access mechanism provides a caching mechanism so that frequently accessed information is available in memory, improving application reliability and scalability. | ✓ Vendor Specific, Vendor Specific

Simplified Transactional Control | The data access mechanism ensures the integrity of the data and the transactional scope can be controlled programmatically (using modes) or via provided dynamic page tags. | ✓ ✓ ✓

Searching | The ability to search across data sources/types. | ✓

Access Control | The ability to control access to data objects and properties within those data objects. | ✓ ✓

3 ATG Data Anywhere Architecture™

Data Source Independence

Figure 3 below provides a high-level overview of the ATG Data Anywhere Architecture™.

Figure 3 – The ATG Data Anywhere Architecture™

• With ATG Data Anywhere, the application logic created by developers uses the same approach to

interact with data regardless of the source of that data. One of the most powerful aspects of this

architecture is that the source of the data is hidden behind the Dynamo Repository abstraction. It

would be easy to change from a relational data source to an LDAP directory since none of the

application logic would need to change.

• Once data is retrieved from a data source it is transformed into an object-oriented representation.

Manipulation of the data can then be done using simple getPropertyValue and

setPropertyValue methods.
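The two points above can be illustrated with a minimal, self-contained sketch. Everything in it (SketchItem, SketchStore, MapItem, RowBackedStore, DirectoryBackedStore, AgeLogic) is a hypothetical stand-in written for this illustration and is not part of the atg.repository API. It only shows the idea that application logic written against one item interface keeps working when the backing store changes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the atg.repository types.
interface SketchItem {
    Object getPropertyValue(String name);
    void setPropertyValue(String name, Object value);
}

interface SketchStore {
    SketchItem getItem(String id);
}

// A map-backed item shared by both stores below.
final class MapItem implements SketchItem {
    private final Map<String, Object> values;
    MapItem(Map<String, Object> values) { this.values = values; }
    @Override public Object getPropertyValue(String name) { return values.get(name); }
    @Override public void setPropertyValue(String name, Object value) { values.put(name, value); }
}

// Pretends to sit in front of relational rows (column name -> typed value).
final class RowBackedStore implements SketchStore {
    private final Map<String, Map<String, Object>> rows = new HashMap<>();
    void putColumn(String id, String column, Object value) {
        rows.computeIfAbsent(id, key -> new HashMap<>()).put(column, value);
    }
    @Override public SketchItem getItem(String id) {
        Map<String, Object> row = rows.get(id);
        return row == null ? null : new MapItem(row);
    }
}

// Pretends to sit in front of directory entries ("attribute=value" strings).
final class DirectoryBackedStore implements SketchStore {
    private final Map<String, Map<String, Object>> entries = new HashMap<>();
    void addAttribute(String id, String attribute) {
        int eq = attribute.indexOf('=');
        entries.computeIfAbsent(id, key -> new HashMap<>())
               .put(attribute.substring(0, eq), attribute.substring(eq + 1));
    }
    @Override public SketchItem getItem(String id) {
        Map<String, Object> entry = entries.get(id);
        return entry == null ? null : new MapItem(entry);
    }
}

final class AgeLogic {
    // Application logic: the same code runs against either store.
    static int ageNextYear(SketchStore store, String id) {
        Object age = store.getItem(id).getPropertyValue("age");
        return Integer.parseInt(String.valueOf(age)) + 1;
    }
}
```

Calling AgeLogic.ageNextYear with a RowBackedStore or with a DirectoryBackedStore takes the same code path; only the store that is passed in differs.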

[Figure 3 diagram labels: Repository API or Droplets; Java Object (property, property, property, …); SQL Connector; LDAP Connector; File System Connector; RDBMS; LDAP Directory]

Understanding the ATG Data Anywhere Architecture™

• A Repository is a data access layer that defines a generic representation of a data store. Application

developers access data using this generic representation by using only interfaces such as Repository and

RepositoryItem.

• A Repository accesses the underlying data store through a connector, which translates the request

into whatever calls are needed to access that particular data store. Connectors for relational databases, LDAP

directories, and file systems are provided out-of-the-box. Connectors use an open, published interface, so

additional custom connectors can be added if necessary.

• Developers use Repositories to create, query, modify, and remove Repository Items.

• A Repository Item is like a JavaBean, but its properties are determined dynamically at runtime. From the

developer’s perspective, the available properties in a particular repository item depend on the type of item

they are working with. One item might represent the user profile (name, address, phone number), while

another may represent the meta-data associated with a news article (author, keywords, synopsis).

• The purpose of the Repository interface system is to provide a unified perspective for data access. For

example, developers can use targeting rules with the same syntax to find people or content.

• Applications that use only the Repository interfaces to access data can interface to any number of back-end

data stores solely through configuration.

• Developers do not need to write a single interface or Java class to add a new persistent data type to an

application.
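The idea that an item's available properties depend on its type, and are known only at runtime, can be sketched as follows. TypeSketch and DynamicItem are hypothetical names invented for this illustration; they are not ATG classes, and the real product defines item types in an XML repository definition file rather than in Java.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not ATG's RepositoryItem or item descriptor classes.
final class TypeSketch {
    final String name;
    final Set<String> propertyNames;

    TypeSketch(String name, String... propertyNames) {
        this.name = name;
        this.propertyNames = new LinkedHashSet<>(Arrays.asList(propertyNames));
    }
}

final class DynamicItem {
    private final TypeSketch type;
    private final Map<String, Object> values = new HashMap<>();

    DynamicItem(TypeSketch type) { this.type = type; }

    Object getPropertyValue(String property) {
        requireKnown(property);
        return values.get(property);
    }

    void setPropertyValue(String property, Object value) {
        requireKnown(property);
        values.put(property, value);
    }

    // The set of legal properties comes from the type, not from a compiled class.
    private void requireKnown(String property) {
        if (!type.propertyNames.contains(property)) {
            throw new IllegalArgumentException(
                "Type '" + type.name + "' has no property '" + property + "'");
        }
    }
}
```

In this sketch, adding a property to a type means changing the data passed to TypeSketch, with no new Java class to write or compile.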

ATG also provides a unified view of your application’s data through the ATG Control Center, a graphical user interface

that uses the Repository interfaces to allow users to create, query, update, and remove repository items. Figure 4 below

shows the interface to user repository items. This UI looks the same regardless of the data source used to store the user

data.

Figure 4 – Using the ATG Control Center to Access the User Repository

Repository Basics

Figure 5 below shows an example of a repository that stores customer information.

Figure 5 – Sample Repository

Inside each repository, there can be several types of items (which are called "item-descriptors") and for each type there can be

several repository items. The definition of each type of item is described in a repository definition file using XML. In this

example, the Visitor Profile Repository defines two types of items (user and address).

[Figure 5 diagram: repository items labeled “Joe” and “Sue”, and repository items labeled with a Main St address and a Park St address]

Developers can model relationships between types of items as shown in figure 6.

Figure 6 – Relationships between Repository Items

Dynamo Repositories use the Java Collections Framework to model complex relationships between items using familiar

object-oriented concepts. You can store the "list" of addresses as a Set, List, Map, or array (whatever makes sense for

your application’s needs).
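As a rough illustration of a one-to-many relationship held in a Java collection, the sketch below uses plain maps as stand-ins for repository items. RelationshipSketch and the property names "street" and "addresses" are assumptions made for this example; they do not come from the ATG API or from the figures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; plain maps stand in for repository items.
final class RelationshipSketch {
    static Map<String, Object> newAddress(String street) {
        Map<String, Object> address = new HashMap<>();
        address.put("street", street);
        return address;
    }

    static Map<String, Object> newUser(String name, List<Map<String, Object>> addresses) {
        Map<String, Object> user = new HashMap<>();
        user.put("name", name);
        // One-to-many: the user item holds a List of address items.
        user.put("addresses", addresses);
        return user;
    }

    // Walks the relationship from a user to the streets of its addresses.
    @SuppressWarnings("unchecked")
    static List<String> streetsOf(Map<String, Object> user) {
        List<String> streets = new ArrayList<>();
        for (Map<String, Object> address : (List<Map<String, Object>>) user.get("addresses")) {
            streets.add((String) address.get("street"));
        }
        return streets;
    }
}
```

A List keeps the addresses in order; a Set or Map could be substituted where uniqueness or keyed lookup matters more.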

But the boundary does not fall at the Repository’s wall. Developers can create links between items in different repositories

(see figure 7 below). This allows you to create repository items that are composed of properties retrieved from more than

one data source. You’ll have to keep in mind though that the properties in the adjunct repositories will not be queryable.

Applications that need to query against properties from multiple data sources can still make use of Repositories, but the

developers will need to query each repository separately.
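Querying each repository separately and then combining the results might look like the sketch below. It is only an illustration under stated assumptions: each "repository" is modeled as a map from item id to a property map, the two repositories are assumed to share item ids, and CrossRepositoryQuery is a hypothetical helper rather than an ATG class.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: each "repository" is a map of item id -> property map.
final class CrossRepositoryQuery {
    // Runs one query against one repository and returns the matching ids.
    static Set<String> matchingIds(Map<String, Map<String, Object>> repository,
                                   Predicate<Map<String, Object>> criteria) {
        Set<String> ids = new LinkedHashSet<>();
        for (Map.Entry<String, Map<String, Object>> entry : repository.entrySet()) {
            if (criteria.test(entry.getValue())) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    // Combines two separate query results by keeping ids present in both.
    static List<String> inBoth(Set<String> first, Set<String> second) {
        List<String> result = new ArrayList<>();
        for (String id : first) {
            if (second.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }
}
```

The combination step happens in application code, which is the cost of spreading queryable properties across more than one data source.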

In the example shown below, the majority of the information about a particular visitor is stored in a relational database. In

many web applications, an LDAP directory is used to store information about the organizational structure of a company

and/or userid/password combinations for authentication. Dynamo Repositories allow you to create an item that has access

to both relational data and LDAP data from the same object.

[Figure 6 diagram: repository items labeled “Joe” and “Sue” related to repository items labeled with a Main St address and a Park St address]

Figure 7 – Linking Between Repositories

Using Repository Data

Dynamo provides many powerful ways to make use of repository data in your application:

• Programmatically via the Repository API

• Through the use of RepositoryFormHandler

• On a dynamic page (through Dynamo Servlet Beans and the Repository Query Language (RQL))

The Repository API

The Repository API allows you to programmatically create, retrieve, update, or delete items. The power of the Repository API

is that developers use the same API regardless of data source. An item that contains data from an LDAP directory is

manipulated the same way that an item that contains relational data is manipulated.

Here’s a code example that shows how a developer can retrieve the age property of a user item (assuming that the id of the

user is known – in this case '9'):

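    import atg.repository.*;

    // getRepository() is assumed to return the configured Repository component
    Repository repository = getRepository();
    // Look up the item of type "user" whose id is "9"
    RepositoryItem user = repository.getItem("9", "user");
    // Property values come back as Object and are cast to the expected type
    Integer age = (Integer) user.getPropertyValue("age");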

[Figure 7 diagram labels: items “Joe” and “Sue” (RDBMS); items “ATG\Education” and “ATG\Services” (LDAP Directory)]

The following code snippet shows how a developer can change the age property of a user item:

    try {
        MutableRepository mutableRepository =
            (MutableRepository) getRepository();
        // Fetch a writable version of the item
        MutableRepositoryItem mutableUser =
            mutableRepository.getItemForUpdate("9", "user");
        mutableUser.setPropertyValue("age", Integer.valueOf(43));
        // Write the change back to the repository
        mutableRepository.updateItem(mutableUser);
    }
    catch (RepositoryException exc) {
        // handle the failure (for example, log it)
    }

Notice that the code created by the application developer uses only the Repository API. The code has no knowledge of the

type of data source nor does the code have any knowledge of the schema. There is much more in the Repository API that you

will want to explore (such as the ability to query the repository, control transactional boundaries, and control the validity of

the repository items that are cached to improve performance), but this should give you a taste of what is involved.

RepositoryFormHandler

As you know, ATG provides a robust form handling framework that can be used by developers whenever a web form needs to

be created. ATG provides a specific form handler that can be used to manipulate repository data as well. The

RepositoryFormHandler can be used out-of-the-box to create, update, or delete repository items. And like any Java

class, it can be extended if you need specialized behavior.

Before you can use the RepositoryFormHandler, you'll need to configure a component based on this class. You will most

likely want to configure the repository it will be interacting with as well as the type of repository item. Here's a property file

(this is the configuration syntax used by Dynamo's Nucleus component framework) that shows an example configuration for

this type of form handler:

    # /RepositoryFormHandler
    #Thu Sep 06 08:41:24 EDT 2001
    $class=atg.repository.servlet.RepositoryFormHandler
    $scope=request
    itemDescriptorName=topic
    repository=/MyApplication/TopicRepository
    requireIdOnCreate=false

Once you've configured a form handler as shown above, a page designer can make use of it. Here's an example

page that allows a visitor to add a new topic to the TopicRepository.

    <H1>Add a New Topic</H1>
    <dsp:form action="addTopic.jsp" method="POST">
      <!-- Default form error handling support -->
      <dsp:droplet name="/atg/dynamo/droplet/ErrorMessageForEach">
        <dsp:oparam name="output">
          <B><dsp:valueof param="message"/></B><BR>
        </dsp:oparam>
        <dsp:oparam name="outputStart">
          <LI>
        </dsp:oparam>
        <dsp:oparam name="outputEnd">
          </LI>
        </dsp:oparam>
      </dsp:droplet>
      Enter the Topic Name:<BR>
      <dsp:input bean="/RepositoryFormHandler.value.topicName"
                 name="topicName" size="24" type="TEXT"
                 required="<%=true%>"/><BR>
      <dsp:input bean="/RepositoryFormHandler.value.topicBody"
                 name="topicBody" type="TEXT"/><BR>
      <dsp:input bean="/RepositoryFormHandler.createSuccessURL"
                 type="HIDDEN" value="/Discussion/alltopics.jsp"/>
      <dsp:input bean="/RepositoryFormHandler.create" type="Submit"
                 value="Add Forum"/>
    </dsp:form>

A few notes about this example:

• This form handler makes it extremely easy to tie a form element to a specific item property.

Note the syntax used is /FormHandlerComponentName.value.propertyName.

• This form handler provides several "handlers" to enable a page developer to perform the

various operations (create, update, delete). This example uses the create handler.

Dynamo Servlet Beans and the Repository Query Language (RQL)

ATG provides several Servlet Beans (also known as droplets) to allow page developers to retrieve and display repository data

on dynamic pages (JSP or DSP). In the simplest case (when the page developer knows the unique id of the repository item),

the ItemLookupDroplet can be used. The following code example shows this droplet in action (using JSP in this case):

    <%@ taglib uri="/dspTaglib" prefix="dsp"%>
    <dsp:page>
      <dsp:importbean bean="/MyApplication/TopicRepository"/>
      <dsp:importbean bean="/atg/dynamo/droplet/ItemLookupDroplet"/>
      <dsp:setvalue bean="ItemLookupDroplet.useParams" value="true"/>
      <dsp:droplet name="ItemLookupDroplet">
        <dsp:param name="id" value="1"/>
        <dsp:param name="repository" bean="TopicRepository"/>
        <dsp:param name="itemDescriptor" value="topic"/>
        <dsp:oparam name="output">
          Name: <dsp:valueof param="element.topicName"/><br>
          Body: <dsp:valueof param="element.topicBody"/><br>
        </dsp:oparam>
      </dsp:droplet>
    </dsp:page>

A couple of notes about this example:

• We provided three inputs: the unique id of the item (1), the name of the repository

(TopicRepository) that contains the item, and the type of item (topic).

• The output of the droplet is a RepositoryItem called element. We can retrieve the properties of that

item using the simple dot notation (element.topicName for example).

In many cases, page developers will not know the unique id of the item (or items) they want to display on the page. In fact,

what page developers often need to do is query the repository for a set of items that match some criteria. You might assume

that the page developers will use an industry standard such as SQL to perform this query. The problem with SQL is that it is

designed to work only with relational databases. Since a repository may have a relational database, an LDAP directory, or a

file system behind it, SQL is not an appropriate query language for repositories. ATG provides a SQL-like query language for repositories called

the Repository Query Language (RQL). ATG also provides droplets that can be used by page developers to execute RQL queries

and loop over the results.

The code example below shows how a JSP developer can use the RQLQueryForEach droplet to display a list of all the topics

in the TopicRepository that have at least 1 reply associated with them.


<%@ taglib uri="/dspTaglib" prefix="dsp"%>

<dsp:page>

<dsp:droplet name="/atg/dynamo/droplet/RQLQueryForEach">

<dsp:param name="queryRQL" value="numReplies >= 1"/>

<dsp:param name="repository"

value="/MyApplication/TopicRepository"/>

<dsp:param name="itemDescriptor" value="topic"/>

<dsp:oparam name="output">

Name: <dsp:valueof param="element.topicName"/><br>

Body: <dsp:valueof param="element.topicBody"/><br>

</dsp:oparam>

</dsp:droplet>

</dsp:page>

The primary difference between the ItemLookupDroplet and the RQLQueryForEach droplet is that

RQLQueryForEach requires an RQL statement as an input rather than an id. Also, the output oparam will be

rendered once for each item that the RQL query returns.
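As a rough illustration of what the droplet does with its inputs, the plain-Java sketch below filters a list of items with a predicate (standing in for the RQL statement) and renders an output fragment once per match. It is not ATG code: the class QueryForEachSketch, its topic and render methods, and the use of a map in place of a RepositoryItem are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for the droplet's behavior; not part of any ATG API.
final class QueryForEachSketch {

    // Builds a map that plays the role of a "topic" repository item.
    static Map<String, Object> topic(String name, String body, int numReplies) {
        Map<String, Object> item = new HashMap<>();
        item.put("topicName", name);
        item.put("topicBody", body);
        item.put("numReplies", numReplies);
        return item;
    }

    // Renders the "output" fragment once for each item the query matches.
    static String render(List<Map<String, Object>> items,
                         Predicate<Map<String, Object>> query) {
        StringBuilder page = new StringBuilder();
        for (Map<String, Object> element : items) {
            if (query.test(element)) {
                page.append("Name: ").append(element.get("topicName")).append("<br>\n");
                page.append("Body: ").append(element.get("topicBody")).append("<br>\n");
            }
        }
        return page.toString();
    }

    public static void main(String[] args) {
        List<Map<String, Object>> topics = new ArrayList<>();
        topics.add(topic("Caching", "How do I size a cache?", 2));
        topics.add(topic("RQL", "Is there a LIKE operator?", 0));
        // Equivalent in spirit to the RQL statement "numReplies >= 1".
        System.out.print(render(topics, t -> ((Integer) t.get("numReplies")) >= 1));
    }
}
```

The point of the sketch is the division of labor: the query decides which items are selected, and the output fragment decides how each selected item is displayed.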

Less Java Code Leads to Faster Time-to-Market and Less Maintenance

Developers who use the ATG Data Anywhere Architecture do not need to write, compile or test Java classes or interfaces for

each persistent data type that they want to use in their application. A new persistent data type can be created by simply

creating an XML file which defines a mapping between a repository item and the underlying data structure as shown in

example 1 below.

Example 1: XML Required to Define a Persistent Type using Dynamo Repositories

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE gsa-template PUBLIC "-//Art Technology Group, Inc.//DTD Dynamo Security//EN"
    "http://www.atg.com/dtds/gsa/gsa_1.0.dtd">
<gsa-template>
  <header>
    <name>Account Repository</name>
    <author>Pat Durante</author>
  </header>
  <item-descriptor name="account" default="true">
    <table name="Account" type="primary" id-column-name="accountId">
      <property name="accountId" column-name="account_id" data-type="string"/>
      <property name="type" data-type="string"/>
      <property name="balance" data-type="double"/>
      <property name="customerId" data-type="string"/>
    </table>
  </item-descriptor>
</gsa-template>

In this example, we are creating a new type of repository item that represents a bank account. The account item contains

four properties (accountId, type, balance, and customerId) that are mapped into the columns of the Account database

table.
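To make the idea of a declarative mapping concrete, here is a minimal Java sketch of how one generic class could serve any item type once it knows the table and the property-to-column mapping. It is only an illustration, not how Dynamo Repositories are implemented: the class ItemDescriptorSketch and its methods are hypothetical, and the rule that a column name defaults to the property name is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: one generic class handles every item type described by a mapping.
final class ItemDescriptorSketch {
    private final String table;
    // Property name -> column name, kept in declaration order.
    private final Map<String, String> propertyToColumn = new LinkedHashMap<>();

    ItemDescriptorSketch(String table) {
        this.table = table;
    }

    // Registers a property; a null column falls back to the property name (assumed rule).
    ItemDescriptorSketch property(String name, String column) {
        propertyToColumn.put(name, column == null ? name : column);
        return this;
    }

    // Builds the SELECT statement that the mapping implies for a lookup by id.
    String selectById(String idColumn) {
        return "SELECT " + String.join(", ", propertyToColumn.values())
                + " FROM " + table + " WHERE " + idColumn + " = ?";
    }

    // Mirrors the four properties of the account item descriptor.
    static ItemDescriptorSketch account() {
        return new ItemDescriptorSketch("Account")
                .property("accountId", "account_id")
                .property("type", null)
                .property("balance", null)
                .property("customerId", null);
    }

    public static void main(String[] args) {
        System.out.println(account().selectById("account_id"));
    }
}
```

Adding a new persistent type in this style means describing another mapping, not writing another class, which is the property the white paper is emphasizing.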


With this item descriptor in place, we can easily display all of the accounts on a dynamic web page, as shown in the example below:

<%@ taglib uri="/dspTaglib" prefix="dsp"%>

<dsp:page>

<dsp:droplet name="/atg/dynamo/droplet/RQLQueryForEach">

<dsp:param name="queryRQL" value="ALL"/>

<dsp:param name="repository"

value="/MyApplication/AccountRepository"/>

<dsp:param name="itemDescriptor" value="account"/>

<dsp:oparam name="output">

Account Id: <dsp:valueof param="element.accountId"/><br>

Balance: <dsp:valueof param="element.balance"/><br>

</dsp:oparam>

</dsp:droplet>

</dsp:page>

The J2EE mechanism for representing a new persistent data type is to define a new Enterprise JavaBean (specifically an

EntityBean). Creating an EJB requires writing a fair amount of code (as shown in example 2 below). And deploying an EJB

requires a significant amount of configuration work (XML) as well.

Example 2 – The Code Required for a Container Managed Entity Bean (EJB) (Account.java, AccountHome.java, and AccountBean.java)

Account.java:

package atg.atm.account;

import java.rmi.RemoteException;
import javax.ejb.*;

public interface Account extends EJBObject {
    public void setBalance(double pBalance) throws RemoteException;
    public double getBalance() throws RemoteException;
    public String getType() throws RemoteException;
    public String getCustomerId() throws RemoteException;
}

AccountHome.java:

package atg.atm.account;

import javax.ejb.*;
import java.rmi.RemoteException;
import java.util.*;

public interface AccountHome extends EJBHome {
    public Account create(String accountId, String customerID,
                          double initialBalance, String type)
        throws CreateException, RemoteException;

    public Account findByPrimaryKey(String primaryKey)
        throws FinderException, RemoteException;

    public Enumeration findAccountsForACustomer(String custId)
        throws FinderException, RemoteException;
}

AccountBean.java:

package atg.atm.account;

import java.io.Serializable;
import java.rmi.RemoteException;
import java.rmi.Remote;
import javax.ejb.*;
import java.util.*;

public class AccountBean implements EntityBean {
    private transient EntityContext ctx;

    public String accountId;
    public String customerId;
    public double balance;
    public String type;

    public void ejbActivate() throws RemoteException { ... }
    public void ejbPassivate() throws RemoteException { ... }

    public void setEntityContext(EntityContext ctx) throws RemoteException {
        this.ctx = ctx;
    }

    public void unsetEntityContext() throws RemoteException {
        this.ctx = null;
    }

    public void ejbLoad() throws RemoteException { }
    public void ejbStore() throws RemoteException { }
    public void ejbRemove() throws RemoteException { }

    public String ejbCreate(String accountId, String customerId,
                            double initialBalance, String type) {
        this.accountId = accountId;
        this.customerId = customerId;
        this.balance = initialBalance;
        this.type = type;
        return null;
    }

    public void ejbPostCreate(String accountId, String customerId,
                              double initialBalance, String type) { }

    public void setBalance(double pBalance) { balance = pBalance; }
    public double getBalance() { return balance; }
    public String getType() { return type; }
    public String getCustomerId() { return customerId; }
}

To be fair, there are tools in the marketplace that can generate most of this "boilerplate" code for a new EJB, but still this code

needs to be maintained and extended as the system grows. Also, even with the development and deployment of this new

EJB, the data is still not available to a dynamic page designer. According to the Sun Blueprint methodology, a JSP should not

access an EJB directly. This means that the developer has to write either a JavaBean or a Servlet that interacts with the EJB.

Only then can a JSP be created that includes dynamic information from a data source.

The JDO approach requires just a standard Java class. It provides transparent data access for all Java classes. Developers can use existing classes or write new classes that need persistence. Example 3 below shows the code needed for our account type.

Example 3: The Java Class Required By JDO

public class Account {
    private String accountId;
    private String customerId;
    private double balance;
    private String type;

    public Account(String accountId, String customerId,
                   double initialBalance, String type) {
        this.accountId = accountId;
        this.customerId = customerId;
        this.balance = initialBalance;
        this.type = type;
    }

    public void setBalance(double pBalance) { balance = pBalance; }
    public double getBalance() { return balance; }
    public String getType() { return type; }
    public String getCustomerId() { return customerId; }
}


Using the Visitor Profile (Out-of-the-Box)

By default, the visitor profiling data that is used by the ATG e-Business Platform is stored in the Solid relational database

management system that ships with the product suite. The basic information about a visitor is stored in a table called

dps_user (although several auxiliary tables are used to store additional information about visitors).

Figure 8 below provides a conceptual view of the out-of-the-box architecture.

Figure 8 – Out of the Box Profile Architecture

Note that the data source configuration files contain the only dependency on the Solid RDBMS.

[Figure 8 diagram not reproduced. Recoverable labels: item properties id, firstName, and login; table columns id, first_name, and login; table names dps_user and Usr_tbl.]


Extending the Definition of a Repository Item

Using a Simple Auxiliary Table (To model a one-to-one relationship)

Let's say we want to extend the profile definition to include a subscription id for each visitor at the site. This too is extremely easy to do. The steps are as follows:

• Create the additional table to store the new data (create a one-to-one relationship between the existing dps_user table and your new table). For example:

CREATE TABLE elrn_user (

id VARCHAR(40) not null,

subscription_id VARCHAR(32) null,

constraint elrn_user_p primary key ( id ),

constraint elrn_user_f foreign Key ( id )

references dps_user(id)

)

• Create a new userprofile.xml file (somewhere in your CONFIGPATH) to extend the definition of the out-of-the-box user item descriptor. For example:

<gsa-template>

<item-descriptor name="user">

<table name="elrn_user" type="auxiliary"

id-column-name="id">

<property name="subscription_id"

column-name="subscription_id"

data-type="string"

category="eLearning"

display-name="Subscription Id"/>

</table>

</item-descriptor>

</gsa-template>

• Restart the server. That's it! No code changes required. The new property will show up in the ATG Control Center and you'll be able to retrieve and/or modify the value of this new property from your dynamic web pages. For example:

<dsp:page>

<dsp:importbean bean="/atg/userprofiling/Profile"/>

Welcome back, <dsp:valueof bean="Profile.firstName"/>!

Your subscription id is:

<dsp:valueof bean="Profile.subscription_id"/>.

</dsp:page>


Adding a Property to an EJB

Adding a single property to an EJB is considerably more involved than adding a property to a Repository Item type.

Let's say we would like to add a single boolean property to our Account EJB presented above (to keep track of whether or not the account includes overdraft protection). The steps are as follows:

• Modify the schema of the account table to include a new column. (Alternatively, you can create a new table and use a vendor-specific mapping to build a relationship between tables.)

• Modify the Account interface code to allow other programmers to gain access to the new property:

public boolean getOverdraftProtection() throws RemoteException;

• Modify the AccountHome interface to allow the overdraft property to be initialized upon account creation:

public Account create(String accountId, String customerID,

double initialBalance, String type, boolean overdraft)

throws CreateException, RemoteException;

• Modify the AccountBean class to accommodate the additional create parameter:

public String ejbCreate(String accountId, String customerId, double

initialBalance, String type, boolean overdraft)

{

this.accountId = accountId;

this.customerId = customerId;

this.balance = initialBalance;

this.type = type;

this.overdraft = overdraft;

return null;

}

public void ejbPostCreate(String accountId, String customerId,

double initialBalance, String type,

boolean overdraft)

{ }

• Add a field and a get method for the new property to the AccountBean class (the method name must match the one declared in the Account interface):

public boolean overdraft;

public boolean getOverdraftProtection() { return overdraft; }

• Re-deploy the J2EE application, making sure to map the new bean property to the appropriate database column.

• Modify the JavaBean or the Servlet code that interacts with your EJB (since JSPs should not access an EJB directly). At a minimum, you will need to add a method that can check whether the account has overdraft protection.

• You are now ready to access your new property from a JSP.


Using a "Multi" Table (to model a one-to-many relationship)

Let's say we want to extend the profile definition to include a list of each visitor's favorite subjects. Modeling a one-to-many relationship is a little more involved, but still generally straightforward. The steps are as follows:

• Create the additional table to store the new data (create a one-to-many relationship between the existing dps_user table and your new table). For example:

CREATE TABLE elrn_subjects (

id VARCHAR(32) not null,

subject VARCHAR(32) not null,

constraint elrn_subjects_p primary key ( id, subject ),

constraint elrn_subjects_f foreign Key ( id ) references dps_user(id)

)

• Create a new userprofile.xml file (somewhere in your CONFIGPATH) to extend the definition of the out-of-the-box user item descriptor. Note that you can use a Set, List, array, or Map as the data-type of a multi-value property. In this example, we will use a Set (since the order of the visitor's favorite subjects is not important and we want each subject to be included only once). For example:

<gsa-template>

<item-descriptor name="user">

<table name="elrn_subjects" type="multi" id-column-name="id">

<property name="favoriteSubjects" column-name="subject"

data-type="set" component-data-type="string"/>

</table>

</item-descriptor>

</gsa-template>

• Restart the server. That's it! No code changes required. The new property will show up in the ATG Control Center and you'll be able to retrieve and/or modify the values assigned to this new property from your dynamic web pages. For example:

<dsp:page>

<dsp:importbean bean="/atg/userprofiling/Profile"/>

<dsp:importbean bean="/atg/dynamo/droplet/ForEach"/>

Welcome back, <dsp:valueof bean="Profile.firstName"/>!

Your favorite subjects are:

<dsp:droplet name="/atg/dynamo/droplet/ForEach">

<dsp:param bean="Profile.favoriteSubjects" name="array"/>

<dsp:oparam name="output">

<li><dsp:valueof param="element"/>

</dsp:oparam>

</dsp:droplet>

</dsp:page>


Switching to an Alternative Relational Database Management System

At times, companies need to change from one database system to another. For example, during the prototyping phase of a

new project, many web architects choose to use the free Solid database included with ATG Dynamo. Eventually, the application

will need to switch from the Solid RDBMS to a production-grade RDBMS (such as Oracle). The ATG Data Anywhere

Architecture makes this switch incredibly easy. All you have to do is create the appropriate tables in the RDBMS of your choice

and change a few properties files to change the connection information.

We recently built an application that makes use of Microsoft SQL Server instead of Solid. Here's what we needed to do to get

the ATG Dynamo e-Business Platform running against SQL Server:

• Create the necessary tables and indices in a SQL Server database using the provided SQL (for each ATG product there is a set of SQL files containing DDL that can be used to create the appropriate tables and indices). For example, under \ATG\Dynamo5.6\DAS\sql\install\mssql there is a file called das_ddl.sql which you can use for this purpose.

• Create a new data source (or replace the existing configuration). We chose to replace the existing configuration as shown below:

# /atg/dynamo/service/jdbc/MyDataSource

#Wed Nov 14 15:07:38 EST 2001

$class=atg.service.jdbc.MonitoredDataSource

$description=JTA Participating eLearning Datasource

$scope=global

dataSource=/atg/dynamo/service/jdbc/MyXADataSource

logListeners=/atg/dynamo/service/logging/LogQueue,/atg/dynamo/service/logging/ScreenLog

max=5

min=5

transactionManager=/atg/dynamo/transaction/TransactionManager

# /atg/dynamo/service/jdbc/MyXADataSource

#Wed Nov 14 15:05:06 EST 2001

$class=atg.service.jdbc.MyXADataSource

$scope=global

URL=jdbc\:inetdae7\:hostname.atg.com\:1433

dataSourceJNDIName=

database=eLearningBeta

driver=com.inet.tds.TdsDriver

logListeners=/atg/dynamo/service/logging/LogQueue,/atg/dynamo/service/logging/ScreenLog

password=thepassword

server=hostname.atg.com\:1433

user=theuserid

IMPORTANT: Note that the change in data source required no changes to the application code, nor did it involve

changing the repository configuration. Changes were isolated to the data source configuration files.


Converting from one type of database to another

If you want to switch over to using an LDAP directory (such as iPlanet Directory Server), you can do that easily as well. This paper does not provide details on how to accomplish this. To learn more, please read the following sections in the ATG Personalization Programming Guide:

• Setting Up an LDAP Profile Repository

• Linking SQL and LDAP Repositories

A Unified View of Customer Interactions

The ATG Data Anywhere Architecture excels at providing a unified view of customer interactions. A unified view of customer

data leads to a coherent and consistent customer experience. For example, when a service call is logged using a call center

application, your web application is aware of the service call and its status.

One of the biggest challenges faced by a web application is getting access to information about a customer gathered outside

of a web context. As you know, many applications within the enterprise record information about customer interactions.

Call center applications and enterprise resource planning (ERP) systems are good examples of the kinds of systems used to

service customers in the enterprise.

The flexibility of ATG Data Anywhere allows you to "hook into" the important data gathered by call center, ERP, and other

enterprise applications without having to copy it all into a central repository.

Figure 9 below shows an example of an enterprise that gathers data about customer interactions using three disparate

systems (a web application, a call center application, and an ERP system).

Figure 9 – Enterprise data is often managed by disparate systems

[Figure 9 diagram not reproduced. Recoverable labels: three separate tables, Usr_tbl (id, first_name, login), Service_request (id, description, status), and Order_history (id, order_num, date).]


With ATG Data Anywhere, you can access all of this customer-focused data without relocating it. Figure 10 below shows one way this

can be accomplished.

Figure 10 – A Unified View of Customer Data using ATG Data Anywhere

[Figure 10 diagram not reproduced. Recoverable labels: a single item with properties id, firstName, login, orders, and calls, drawing on Usr_tbl (id, first_name, login) labeled "Web Application Data", Service_request (id, description, status) labeled "Call Center Application Data", and Order_history (id, order_num, date) labeled "ERP System Data".]

Maximum Performance Through Intelligent Caching

Thoughtful design of database access is a key to achieving acceptable performance in web applications. You want to

minimize how often your application needs to access the database, while still maintaining data integrity. An intelligent

caching strategy is central to achieving these goals. Dynamo SQL Repositories provide intelligent caching of data objects to

ensure excellent performance and timely/accurate results.

The ATG Data Anywhere Architecture supports an intelligent and flexible caching model that provides fine-grained control to

deployment experts. When an item is first retrieved from the database, it is stored in memory (in a cache). Subsequent

queries for the same item will not necessarily need to access the database (as long as the cache is still "valid" the data

currently stored in memory can be used).

ATG Data Anywhere was designed to work in the harsh web environment, while other systems make caching assumptions

that are more appropriate to a low-scale intranet. In most cases, a single server is accessing a data object (like the user

profile) at a time. Our locked-mode caching (see case #3 below) offers both high performance and data integrity. Locked

caching introduces a little bit of overhead to data access, since locks must be checked, set and removed during I/O. This

checking ensures that data will never be stale, which preserves data integrity. In the worst case, if one is reading and writing the very

same item all of the time, then the performance effect is similar to omitting caching altogether. However, in the normal

case, when different data elements are being read and written by different systems, then locked caching offers performance

similar to simple caching. The bottom line is high-performance caching with assured data integrity: the best of both worlds.

See the white paper called "Caching Data for Scalability Without Losing Data Integrity" on atg.com for more information

about this topic.

Perhaps the most important thing to understand about Repository caching is that it is highly configurable to meet the needs

of your application. Developers can choose the appropriate caching strategy in the repository definition file (XML).

Dynamo SQL Repositories define four caching modes that can be used by developers as appropriate:

• Simple caching (which is the default)

• Locked caching

• Distributed caching

• Disabled (no caching)

Case 1: Single Dynamo Server

During development (and in some rare cases in production), an application may be deployed entirely on one Dynamo server.

The number of Dynamo servers that you'll be running on your production site will vary depending on the size/complexity of

your application and the number of simultaneous visitors the site needs to support.

Simple caching is recommended for single-Dynamo-server environments.

With simple cache mode, cached items are stored in each individual Dynamo server’s memory. No attempt is made to keep

cache items in sync between Dynamo servers. If one Dynamo server modifies an item, other servers will have inaccurate data

in cache until the cache is manually or automatically flushed (see "Invalidating the Cache" below).


Note that simple caching is the default. Sites running multiple Dynamo servers will want to change the cache mode (to

locked or distributed) on important item types (whose data changes periodically).

Case 2: Read frequently, modify rarely or never

Some types of repository items (such as product catalog items) are modified rarely or never on the production site. Simple

caching can be used in these situations too. This includes all items that you modify only on a staging server and that you do

not modify once they are published to your live site.

Developers will need to flush the cache on all Dynamo instances in the production environment whenever modifications are

being pushed from the staging environment to the live site (see "Invalidating the Cache" below).

Case 3: Modifications made by one Dynamo server at a time

Some types of repository items (such as the User Profile) are consistently used by one server at a time. The data may change

frequently during a visitor's session, but a session is handled by a single Dynamo server (unless a failover occurs).

Locked caching is recommended when typical usage will involve modification by one server at a time. If more than one server

tries to modify an item at the same time, the 2nd server will be locked out until the 1st server completes its modifications. If

you allow this to happen frequently, your site performance will suffer.

Locked Caching is based on write-locks and read-locks. If no servers have a write-lock for an item, any number of servers may

have a read-lock on that item. When a server requests a write-lock, all other servers are instructed to release their read-locks.

Once an item is write-locked, no other server may get a read-lock or write-lock until the first server releases its write-lock. In

other words, once a server has a write-lock on an item, all access to that item is blocked until the write is completed.

A server requests a read-lock the first time it tries to access an item. Once the server has a read-lock on the item, it holds

that read-lock until the lock manager notifies the server to release its read-lock. At that time, it drops the item from its cache.
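The read-lock and write-lock rules described above can be sketched in a few lines of Java. This is a simplified, single-threaded model meant only to show the bookkeeping: the class LockedCacheSketch and its methods are hypothetical, a blocked request simply returns false instead of waiting, and nothing here reflects the actual lock manager implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of locked-mode bookkeeping; not the Dynamo lock manager.
final class LockedCacheSketch {
    private final Map<String, Set<String>> readLocks = new HashMap<>(); // item id -> servers
    private final Map<String, String> writeLocks = new HashMap<>();     // item id -> server
    private final Map<String, Set<String>> caches = new HashMap<>();    // server -> item ids

    // A server takes a read-lock the first time it accesses an item, then caches it.
    boolean read(String server, String itemId) {
        String writer = writeLocks.get(itemId);
        if (writer != null && !writer.equals(server)) {
            return false; // another server holds the write-lock: access is blocked
        }
        readLocks.computeIfAbsent(itemId, k -> new HashSet<>()).add(server);
        caches.computeIfAbsent(server, k -> new HashSet<>()).add(itemId);
        return true;
    }

    // Requesting a write-lock makes every other server release its read-lock
    // and drop the item from its cache.
    boolean beginWrite(String server, String itemId) {
        String writer = writeLocks.get(itemId);
        if (writer != null && !writer.equals(server)) {
            return false; // second writer is locked out until the first finishes
        }
        Set<String> holders = readLocks.getOrDefault(itemId, new HashSet<>());
        for (String other : new HashSet<>(holders)) {
            if (!other.equals(server)) {
                holders.remove(other);
                caches.get(other).remove(itemId);
            }
        }
        writeLocks.put(itemId, server);
        return true;
    }

    // Releases the write-lock, but only if this server actually holds it.
    void endWrite(String server, String itemId) {
        writeLocks.remove(itemId, server);
    }

    boolean isCached(String server, String itemId) {
        return caches.getOrDefault(server, Collections.emptySet()).contains(itemId);
    }
}
```

In this model, a server that only reads keeps its cached copy indefinitely, which is why locked caching performs much like simple caching when items are rarely contended.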

Case 4: Modification by multiple Dynamo servers

In extreme cases, in a multiple Dynamo server environment, you need the ability to notify all other servers that an item has

been modified (even if those other servers are not going to modify the item themselves).

Use distributed caching for items that are modified infrequently during runtime. Distributed mode works best if there is

little chance that two Dynamo servers will attempt to access and change a repository item at the same time. For items that

change more frequently, use locked mode.

Distributed mode allows any Dynamo server to read or modify an item in cache. When one Dynamo server modifies an item,

it broadcasts a JMS cache invalidation event to all servers (see "Invalidating the Cache" below). Distributed caching uses

asynchronous message delivery. This means that there is a slight chance of a user getting stale data, until the invalidation

event message is received by all servers: if a user logged in on one server makes a change to an item, and another user logged

in on a different server requests that item after the change is made, but before the second server has received the cache

invalidation event, the second user would get stale data.
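The stale-data window can be reproduced with a small Java model in which invalidation events sit in a queue until they are delivered. This is a sketch under stated assumptions, not the JMS-based mechanism itself: the class DistributedCacheSketch, its methods, and the in-memory queue standing in for asynchronous message delivery are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of distributed-mode caching with delayed invalidation.
final class DistributedCacheSketch {
    private final Map<String, String> database = new HashMap<>();
    private final Map<String, Map<String, String>> serverCaches = new HashMap<>();
    // Each pending event is {targetServer, itemId}; the queue models asynchronous delivery.
    private final Queue<String[]> pendingInvalidations = new ArrayDeque<>();

    DistributedCacheSketch(String... servers) {
        for (String server : servers) {
            serverCaches.put(server, new HashMap<>());
        }
    }

    void seed(String itemId, String value) {
        database.put(itemId, value);
    }

    // Reads from the server's cache, loading from the database on a miss.
    String read(String server, String itemId) {
        return serverCaches.get(server).computeIfAbsent(itemId, database::get);
    }

    // Writes through to the database and queues an invalidation for every other server.
    void write(String server, String itemId, String value) {
        database.put(itemId, value);
        serverCaches.get(server).put(itemId, value);
        for (String other : serverCaches.keySet()) {
            if (!other.equals(server)) {
                pendingInvalidations.add(new String[] {other, itemId});
            }
        }
    }

    // Delivers all queued events; each one drops the item from the target server's cache.
    void deliverPendingInvalidations() {
        String[] event;
        while ((event = pendingInvalidations.poll()) != null) {
            serverCaches.get(event[0]).remove(event[1]);
        }
    }
}
```

A read on the second server between write and deliverPendingInvalidations returns the old value, which is exactly the window described above; after delivery, the next read reloads the current value.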

This mode is seldom used; in most cases, locked caching is preferable to distributed caching.


Case 5: Modification by a non-Dynamo Application

On some sites, the data used by the web application can be modified by a third party system. In these cases, you will need to

either disable caching or find a way to notify the production Dynamo servers whenever a change is made by the third party

application (using messaging).

Disabling Caching

Disabled caching should be used with great caution, because it will result in database access for every page that

accesses an item of this type. This potentially has a severe impact on performance.

Caching should be disabled when there is a possibility that the underlying data will be changed by a non-Dynamo

Repository application. For instance, if you have an on-line banking application, and the same data is accessed by

other applications in addition to Dynamo, you may want to turn off caching for displaying the user’s account

balances.

The other caching modes can only be set on a per-item-type basis, but disabled caching mode may be set on a per-

property basis. If a request is made for a disabled cache property of a cached item, the database will be queried.

Example from userprofile.xml:

<property category="Login" name="password" data-type="string"

required="true"

column-name="password" cache-mode="disabled"/>

Invalidating the Cache

Usually cache invalidation happens automatically when repository items are changed using the Repository API.

Sometimes it is necessary to force cache invalidation manually, such as when the contents of the database are

changed directly by a third-party application (without going through the Repository API).

One way of handling integration with a third-party application is to have the third-party application send a JMS

message whenever interesting data is modified. Your Dynamo application can then receive the message and

programmatically invalidate the appropriate cache.

To flush all items and all queries in all caches in a specific repository, use:

atg.repository.RepositoryImpl.invalidateCaches()

To flush the caches associated with a specific item type, use:

// The following method empties the item cache for the given item descriptor
atg.repository.ItemDescriptorImpl.invalidateItemCache()

// The following method empties the item cache and query caches for this item descriptor
atg.repository.ItemDescriptorImpl.invalidateCaches()

// The following method removes a specific repository item from the cache
atg.repository.ItemDescriptorImpl.removeItemFromCache(String itemId)

Controlling the Cache Sizes

The size of each repository item cache is configurable as well. By default, the item cache size is 1000 items. After running

your site for some time, you can get a good idea of how well the repository item caches are working by going to the

repository's page in the Dynamo Administration interface. For example, the Administration interface page for the Commerce

Product Catalog repository is:

http://localhost:8830/nucleus/atg/commerce/ProductCatalog

Under the heading Cache usage statistics, this page lists, for each item descriptor, the number of items and queries in the

item and query caches, the cache size, the percent of the cache in use, the hit count, the miss count, and the hit ratio. If you

have many misses and no hits, you are gaining no benefit from caching and can probably turn it off by

setting the cache size to 0. If you have a mix of hits and misses, you might want to increase the cache size. If you have all hits

and no misses, your cache size is big enough and perhaps too big. An oversized cache does no harm unless it

eventually fills up and consumes more memory than is necessary.
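The rules of thumb above can be written down as a small helper. This is only a sketch of the reasoning: the class and method names are invented for this example, and the hit and miss counts would be read by hand from the Cache usage statistics page rather than through any Dynamo API.

```java
final class CacheAdvisor {

    /** Fraction of lookups served from the cache; 0.0 when nothing has been looked up yet. */
    static double hitRatio(long hits, long misses) {
        long lookups = hits + misses;
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    /** Applies the rules of thumb from the text to one item descriptor's statistics. */
    static String advise(long hits, long misses) {
        if (hits + misses == 0) {
            return "no lookups yet: keep collecting statistics";
        }
        if (hits == 0) {
            return "no benefit: consider setting the cache size to 0";
        }
        if (misses == 0) {
            return "big enough: possibly larger than needed";
        }
        return "mixed: consider increasing the cache size";
    }

    public static void main(String[] args) {
        long hits = 750;   // made-up sample numbers
        long misses = 250;
        System.out.printf("hit ratio %.2f -> %s%n", hitRatio(hits, misses), advise(hits, misses));
    }
}
```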


The cache size can be adjusted in a repository definition file as shown in the following example:

<gsa-template>

<item-descriptor name="topic" cache-mode="locked"

query-cache-size="100" item-cache-size="1500">

...

</item-descriptor>

</gsa-template>

There are actually two types of caches in the repository. The item-cache caches the item and the properties; the

query-cache caches the result set so that you don't need to hit the database to find out which items to return

when the same query is issued again and again. By default, query caching is turned off (the default

query-cache-size is set to zero).


Simplified Transactional Control

Overview of Transactional Integrity

A web application needs to be built carefully to balance the integrity of the data it manages with its performance goals.

Consider a web application that allows visitors to transfer funds between bank accounts. A "transfer" operation really

involves two actions (a debit from the source account and a credit to the destination account). In order to maintain the

integrity of the data, both actions must complete successfully. If anything goes wrong during the "transfer" operation, the

account balances should be "rolled back" to their original amounts. A system with transactional integrity will allow a

developer to group multiple actions (e.g., a debit and a credit) into a single activity that either succeeds as a whole or fails as a

whole.
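The all-or-nothing behavior of the "transfer" operation can be sketched in a few lines of plain Java. This is a toy model under stated assumptions, not how a transaction manager works internally: the balances live in an in-memory map, rollback is simulated by restoring a copy, and the class and account names are invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class TransferSketch {

    /**
     * Moves an amount between two accounts as one unit of work.
     * Returns true if both actions succeeded, false if the balances were rolled back.
     */
    static boolean transfer(Map<String, Long> balances, String from, String to, long cents) {
        Map<String, Long> original = new HashMap<>(balances); // copy kept for rollback
        try {
            if (cents <= 0) {
                throw new IllegalArgumentException("amount must be positive");
            }
            adjust(balances, from, -cents); // debit the source account
            adjust(balances, to, cents);    // credit the destination account
            return true;                    // both actions completed: "commit"
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(original);      // undo any partial change: "rollback"
            return false;
        }
    }

    /** Adds delta to an account, refusing unknown accounts and overdrafts. */
    private static void adjust(Map<String, Long> balances, String account, long delta) {
        Long current = balances.get(account);
        if (current == null) {
            throw new IllegalArgumentException("unknown account " + account);
        }
        if (current + delta < 0) {
            throw new IllegalStateException("insufficient funds in " + account);
        }
        balances.put(account, current + delta);
    }

    public static void main(String[] args) {
        Map<String, Long> balances = new HashMap<>();
        balances.put("checking", 10_000L);
        balances.put("savings", 5_000L);
        // Succeeds: both the debit and the credit are applied.
        System.out.println(transfer(balances, "checking", "savings", 2_500L) + " " + balances);
        // Fails on the credit: the debit that already happened is undone.
        System.out.println(transfer(balances, "checking", "missing", 2_500L) + " " + balances);
    }
}
```

In the second call the debit has already been applied when the credit fails, which is exactly the situation in which the balances must return to their original amounts.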

The J2EE Approach to Transactional Integrity

J2EE provides a vendor- and data-source-independent mechanism for managing transactions: the Java Transaction API (JTA).

JTA allows developers to control transactional boundaries (start, commit, rollback). J2EE also defines six transaction

demarcation modes (REQUIRED, REQUIRES_NEW, NOT_SUPPORTED, SUPPORTS, MANDATORY, NEVER) for specifying the

scope and impact of transactions on a particular Enterprise JavaBean method. All J2EE containers must provide a

UserTransaction component which exposes programmatic control of transactions to developers.
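The six demarcation modes differ only in what happens when a method is entered with or without a transaction already in progress. The following sketch tabulates that decision as defined for EJB transaction attributes. The class, enum, and outcome names are invented for this example; it models the rules only and does not call JTA.

```java
final class DemarcationModes {

    enum Mode { REQUIRED, REQUIRES_NEW, NOT_SUPPORTED, SUPPORTS, MANDATORY, NEVER }

    enum Outcome {
        JOIN_EXISTING,            // run inside the caller's transaction
        BEGIN_NEW,                // start a transaction for this method
        SUSPEND_AND_BEGIN_NEW,    // park the caller's transaction, start a fresh one
        SUSPEND_AND_RUN_WITHOUT,  // park the caller's transaction, run with none
        RUN_WITHOUT,              // run with no transaction at all
        FAIL                      // the container rejects the call
    }

    /** What happens on method entry, given whether the caller already has a transaction. */
    static Outcome onEntry(Mode mode, boolean transactionActive) {
        switch (mode) {
            case REQUIRED:
                return transactionActive ? Outcome.JOIN_EXISTING : Outcome.BEGIN_NEW;
            case REQUIRES_NEW:
                return transactionActive ? Outcome.SUSPEND_AND_BEGIN_NEW : Outcome.BEGIN_NEW;
            case NOT_SUPPORTED:
                return transactionActive ? Outcome.SUSPEND_AND_RUN_WITHOUT : Outcome.RUN_WITHOUT;
            case SUPPORTS:
                return transactionActive ? Outcome.JOIN_EXISTING : Outcome.RUN_WITHOUT;
            case MANDATORY:
                return transactionActive ? Outcome.JOIN_EXISTING : Outcome.FAIL;
            case NEVER:
                return transactionActive ? Outcome.FAIL : Outcome.RUN_WITHOUT;
            default:
                throw new AssertionError(mode);
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": with transaction = " + onEntry(mode, true)
                    + ", without = " + onEntry(mode, false));
        }
    }
}
```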

J2EE also paved the way for declarative transaction demarcation, which allows the developer to establish the

transactional behavior outside of Java code. In J2EE, the transactional behavior for a particular method is specified at

deployment time in a deployment descriptor (an XML file).

ATG Data Anywhere Support for Transactional Integrity

The ATG Data Anywhere Architecture™ supports all of the requirements of the J2EE specification.

• The ATG Dynamo Application Server™ provides a fully J2EE-compliant TransactionManager,

but if you are building on a third-party application server (such as BEA WebLogic), you can use its

TransactionManager in place of ours.

• Transactional boundaries can be set declaratively (for EJBs) and programmatically (using JTA).

The downside of transactional integrity is that data access is slower, because of the overhead of

tracking the data access operations that occur within transactions. The key to good overall system performance is to minimize

the impact of transactions. For this reason, the ATG Data Anywhere Architecture™ allows page developers and Java

developers to control the scope of transactions.

Advantages of the ATG Data Anywhere Approach

• Page developers can use simple droplets to control transactional boundaries (without writing Java code).

• Java developers can leverage the transactional demarcation modes using the provided

TransactionDemarcation class. (In J2EE, only EJB methods can use the modal demarcation technique;

with ATG, this technique can be used in simple JavaBeans and Servlets as well.)


Example Page

<dsp:page>

<dsp:importbean bean="/atg/dynamo/transaction/droplet/Transaction"/>

<dsp:importbean bean="/atg/dynamo/transaction/droplet/EndTransaction"/>

<dsp:importbean bean="/atg/userprofiling/Profile"/>

<dsp:droplet name="Transaction">

<dsp:param name="transAttribute" value="required"/>

<dsp:oparam name="output">

One transaction instead of three:

<P> <dsp:valueof bean="Profile.firstName" />

<P> <dsp:valueof bean="Profile.lastName" />

<P> <dsp:valueof bean="Profile.city" />

<dsp:droplet name="EndTransaction">

<dsp:param name="op" value="commit"/>

<dsp:oparam name="successOutput">

The transaction ended successfully!

</dsp:oparam>

<dsp:oparam name="errorOutput">

Failure: <dsp:valueof param="errorMessage"/>

</dsp:oparam>

</dsp:droplet>

</dsp:oparam>

</dsp:droplet>

</dsp:page>

Default Transactional Behavior

In order to protect the integrity of data, SQL Repositories wrap a "required" mode transaction around every property read and

write. This default enforces transactional integrity, but developers need to consider the

performance implications of such fine-grained transactional scope. Unless a developer creates a transaction explicitly

(programmatically or using droplets), a new transaction is begun and ended every time the getPropertyValue or

setPropertyValue method is called on a repository item. To achieve good performance, developers need to be

aware of this default behavior and override it when appropriate. For example, on a dynamic page that displays multiple properties

from the user profile, it is more efficient to read all of the properties in a single transaction than to begin and end a

transaction for each property.
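The effect of the default behavior can be seen in a small simulation of "required" mode. This is a sketch only: RequiredModeCounter is a hypothetical class that counts how many transactions are begun, and its read method merely stands in for a property read that the repository wraps in a "required" transaction. No repository or transaction manager is involved.

```java
final class RequiredModeCounter {

    private boolean active; // is a transaction currently in progress?
    private int begun;      // how many transactions have been started so far

    /** Runs work in "required" mode: join the current transaction or start (and end) a new one. */
    void required(Runnable work) {
        if (active) {       // a transaction already exists: take part in it
            work.run();
            return;
        }
        active = true;      // none exists: begin one just for this call
        begun++;
        try {
            work.run();
        } finally {
            active = false; // end the transaction that was started here
        }
    }

    /** Stand-in for a property read that is wrapped in "required" mode by default. */
    String read(String property) {
        final String[] value = new String[1];
        required(() -> value[0] = "<" + property + ">");
        return value[0];
    }

    int transactionsBegun() {
        return begun;
    }

    public static void main(String[] args) {
        // Default behavior: each read begins and ends its own transaction.
        RequiredModeCounter separate = new RequiredModeCounter();
        separate.read("firstName");
        separate.read("lastName");
        separate.read("city");

        // Explicit outer transaction: the three reads join it.
        RequiredModeCounter grouped = new RequiredModeCounter();
        grouped.required(() -> {
            grouped.read("firstName");
            grouped.read("lastName");
            grouped.read("city");
        });

        System.out.println(separate.transactionsBegun() + " vs " + grouped.transactionsBegun());
    }
}
```

The first counter ends at three transactions and the second at one, which is the "one transaction instead of three" saving that the example page obtains with the Transaction droplet.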

Recommendations

• Use the Transaction droplets when displaying repository information.

• When processing a form, use programmatic demarcation (typically at the start and end of your handler

methods).


Strong Built-in Search Capabilities

The ATG Data Anywhere Architecture™ provides a powerful set of Repository searching tools. We've already examined

two of the searching mechanisms that are provided: searching by id (using the ItemLookupDroplet) and querying

against a single item type within a single repository (using the RQLForEachDroplet). ATG also provides a

SearchFormHandler that can be configured for most of your "search page" needs.

The SearchFormHandler supports several types of searching:

• Keyword searches allow you to build a search page in which visitors enter a set of keywords; the handler then queries

all of the item properties that have been designated to hold keywords. For example, "find all products in the catalog

with the keyword tools".

• Text searches allow your visitors to perform full-text searches. Dynamo can simulate full-text searches

or make use of your RDBMS-specific one (if it is available). For example, "find all products in the catalog

whose description contains quality".

• Hierarchical searches allow your visitors to limit a search to a particular subset of items. For example,

"find all products in the catalog with the keyword tools in the home-goods category".

• Advanced searches (also called Parametric searches) allow your visitors to limit the search based on a

range of property values ("find all recipes whose cook time is between 5 and 20 minutes") or based on a

specific enumerated value ("find all movies with the keyword action where the rating is PG-13").

• Combination searches: any of the above search types can be combined.

Searching for content across repositories and item types is an extremely powerful feature. It allows visitors and developers

to find the data they need more rapidly. Quality searching tools lead to higher satisfaction, greater efficiency, and potentially

more revenue.
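To make the search types listed above more concrete, the following sketch expresses a keyword search, a simulated text search, and a parametric range search as predicates over in-memory items, and then combines two of them. It illustrates the semantics only. It is not how the SearchFormHandler is configured or implemented, and the Recipe class and its property names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class SearchSketch {

    /** Hypothetical item type; the property names are made up for this example. */
    static final class Recipe {
        final String name;
        final List<String> keywords;
        final String description;
        final int cookMinutes;

        Recipe(String name, List<String> keywords, String description, int cookMinutes) {
            this.name = name;
            this.keywords = keywords;
            this.description = description;
            this.cookMinutes = cookMinutes;
        }
    }

    /** Keyword search: the item's keywords property contains the word. */
    static Predicate<Recipe> keyword(String word) {
        return r -> r.keywords.contains(word);
    }

    /** Text search, simulated as a case-insensitive substring match on the description. */
    static Predicate<Recipe> text(String phrase) {
        return r -> r.description.toLowerCase().contains(phrase.toLowerCase());
    }

    /** Advanced (parametric) search on a range of property values, bounds inclusive. */
    static Predicate<Recipe> cookTimeBetween(int min, int max) {
        return r -> r.cookMinutes >= min && r.cookMinutes <= max;
    }

    /** Returns the names of all items that satisfy the criteria, in input order. */
    static List<String> search(List<Recipe> items, Predicate<Recipe> criteria) {
        List<String> names = new ArrayList<>();
        for (Recipe r : items) {
            if (criteria.test(r)) {
                names.add(r.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Recipe> recipes = Arrays.asList(
                new Recipe("Omelette", Arrays.asList("quick", "eggs"), "A quality breakfast", 10),
                new Recipe("Stew", Arrays.asList("slow", "beef"), "Hearty winter dish", 120),
                new Recipe("Toast", Arrays.asList("quick", "bread"), "Plain and simple", 3));
        // Combination search: keyword "quick" AND cook time between 5 and 20 minutes.
        System.out.println(search(recipes, keyword("quick").and(cookTimeBetween(5, 20))));
    }
}
```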

Once again, developers do not need to write Java code to include a search function in their applications. The provided form

handler can be configured to perform a great variety of searches. If the provided form handler does not meet your developers'

needs, they can use inheritance to extend the provided class. For example, ATG developers specialized the search form

handler for searching the commerce catalog (it allows the results to be presented as matching categories followed by

matching products).


Fine-grained Access Control

The ATG Data Anywhere Architecture™ provides a Secured Repository system that works with the Dynamo Security System

to provide fine-grained access control to repository item descriptors, individual repository items, and even individual

properties through the use of Access Control Lists (ACLs).

Any repository can be secured by configuring an instance of a Secured Repository Adapter on top of the repository

instance. Depending on the security features you desire, some new properties (such as an owner property and an acl

property) may have to be added to the underlying repository to store access control information.

Case 1: Controlling access to all items of the same type

The most basic level of access control is at the item type level. This is similar to controlling access to a particular database

table. For example, you can specify that only members of the administrators group have access to user profile items.

Case 2: Controlling access to specific items

The next level of access control is on specific items. This is similar to access control of a single row in a database. For

example, you can specify that members of the education managers group have access to user profile items for people who

work in the education department.

Case 3: Controlling access to specific properties

You can even control who is allowed to read/write a particular property of an item. For example, you can specify that

members of the human resources group can retrieve the salary property within certain user profile items.

Case 4: Limiting Query Results

You can control who can receive certain repository items as results from a repository query. For example, you can specify that

only the owner can query new items until the owner previews and approves the item.


Creating a Secured Repository

1. Modify the underlying repository. For those item descriptors you want to secure, you need to make some

minor modifications to the underlying data and item descriptors to add properties with which to store the

Access Control List (a String or an array of Strings) and owner (a user profile) information. For example:

<item-descriptor name="account">

<table name="Account" type="primary"

id-column-name="accountId">

<property name="accountId" data-type="string" />

<property name="type" data-type="string" />

<property name="ACL" data-type="string" />

<property name="accountOwner" component-type="user" />

</table>

</item-descriptor>

2. Configure the Secured Repository Adapter component. You need to wrap a Secure Repository component

around the underlying repository. For example:

SecureAccountRepository.properties:

$class=atg.adapter.secure.GenericSecuredMutableRepository

$scope=global

# The name property is for the ACC.

name=Secure Account Repository

repositoryName=SecureAccountRepository

# The repository property refers to the underlying repository

repository=AccountRepository

configurationFile=secureAccountRepository.xml

securityConfiguration=/atg/dynamo/security/SecuredRepository/SecurityConfiguration


3. Write a secure repository definition file to spell out the access control you desire. The following example

first establishes the name of the owner property (accountOwner) and the name of the property holding

the access control list (ACL). It then establishes an ACL that grants read, write, and list (for queries) access to

account items to members of the ACC's administrators-group.

<!DOCTYPE secured-repository-template

PUBLIC "-//Art Technology Group, Inc.//DTD General SQL Adapter//EN"

"http://www.atg.com/dtds/security/secured_repository_template_1.1.dtd">

<secured-repository-template>

<item-descriptor name="account">

<owner-property name="accountOwner"/>

<acl-property name="ACL"/>

<descriptor-acl

value="Admin$role$administrators-group:list,read,write;"/>

</item-descriptor>

</secured-repository-template>
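The descriptor-acl value above pairs an identity with a comma-separated list of rights and ends each entry with a semicolon. The following sketch parses a string of that shape and checks a right against it. It assumes only the grant syntax visible in the example and that identities contain no semicolon; the class and method names are invented for this example, and this is not the Dynamo Security System's own parser.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class AclSketch {

    /** Parses "identity:right,right;identity:right;" into a map of identity to rights. */
    static Map<String, Set<String>> parse(String acl) {
        Map<String, Set<String>> entries = new LinkedHashMap<>();
        for (String entry : acl.split(";")) {
            entry = entry.trim();
            if (entry.isEmpty()) {
                continue; // tolerate empty entries such as the one after a trailing ';'
            }
            // Rights contain no ':', so the last ':' separates the identity from its rights.
            int colon = entry.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("missing ':' in ACL entry " + entry);
            }
            String identity = entry.substring(0, colon).trim();
            Set<String> rights = entries.computeIfAbsent(identity, k -> new HashSet<>());
            for (String right : entry.substring(colon + 1).split(",")) {
                if (!right.trim().isEmpty()) {
                    rights.add(right.trim());
                }
            }
        }
        return entries;
    }

    /** True if any of the caller's identities has been granted the requested right. */
    static boolean allows(Map<String, Set<String>> acl, Set<String> callerIdentities, String right) {
        for (String identity : callerIdentities) {
            Set<String> rights = acl.get(identity);
            if (rights != null && rights.contains(right)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> acl = parse("Admin$role$administrators-group:list,read,write;");
        Set<String> admin = Collections.singleton("Admin$role$administrators-group");
        System.out.println("admin may write: " + allows(acl, admin, "write"));
        System.out.println("admin may delete: " + allows(acl, admin, "delete"));
    }
}
```

A check for the list right is what would decide whether an item may appear in query results (Case 4 above), while read and write govern access to the item itself.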


Conclusions: ATG Data Anywhere Architecture™ Decreases Total Cost of Ownership

At the heart of all web applications is data access. Data access makes web sites more intelligent and thereby more useful for

companies and customers. Web applications require a data access mechanism to interact with user profiling information,

web site content and enterprise data. A customer facing web site needs to have a unified view of all customer interactions

(including sales force interactions, call center interactions, and web experiences). This unified view of customer data leads to

an integrated and coherent customer experience.

Data access for a web application is especially complex because the object-oriented world of a Java application is quite

different from the structure of data in a relational database, an LDAP directory, or a file system. The way you access each of

these data sources varies dramatically, so developers have to learn the tools and tips of each kind of system.

J2EE provides some support for data access in the form of JDBC and container-managed entity beans (EJBs), but most

implementations of these technologies focus on mapping relational data to objects. Of course, developers can create

bean-managed EJBs that interface with whatever data source they want (as long as they are willing to write a lot of code or

use a tool to help).

JDO is another data access standard that supports data source independence, but it still requires developers to

write a Java class for each persistent type, and caching of data objects is left to the vendor implementations.

As you can see, the ATG Data Anywhere Architecture™ has several advantages over traditional data access mechanisms as

summarized below:

• Insulates application developers from changes to the schema and to the storage mechanism (data can move

from a relational database to an LDAP directory without requiring any re-coding)

• Unifies your customer data without copying it all into a central data source

• Provides intelligent caching of data objects to ensure excellent performance and timely, accurate results

• Simplifies transactional control (programmatic demarcation using modes or droplets on a dynamic

page)

• Provides powerful searching tools out-of-the-box that can span data sources and data types

• Provides fine-grained access control to data, all the way down to the individual property level

• Is easier to use and more powerful than Java Data Objects (JDO) and Enterprise JavaBeans (shorter

learning curve, no code required to represent a persistent type, simpler configuration, more than just

relational database support)

The ATG Data Anywhere Architecture provides advantages that go well beyond the other options. Dynamo Repositories

allow developers to focus on implementing business logic rather than spending time writing "wrapper classes" for each

persistent data type used by the application. This focus directly improves time-to-market and significantly reduces the total

cost of ownership of web applications.


Appendix: Other Sources of Information

Documentation

ATG Dynamo Application Server Programming Guide

Part II: Repositories

ATG Dynamo Personalization Programming Guide

Setting Up a Profile Repository

Setting Up an LDAP Profile Repository

Linking SQL and LDAP Repositories

Working with the Dynamo User Directory

Setting Up an LDAP User Directory

ATG Dynamo Administration Guide

Using JDBC with Dynamo

Configuring Databases

Managing Database Servers

Repository and Database Performance

ATG Dynamo Page Developers Guide

Using Search Forms

Dynamo 5 ER Diagrams

Education

See atg.com for more information about these education offerings.

Instructor-led Training

ATG Dynamo Essentials for Java Developers (5 days)

Utilizing Dynamo Repositories (2 days)

Self-directed learning

Mastering Web Applications

Mastering Personalized Applications

ATG e-Learning Connection

Extending the User Profile (an e-Course)


This publication may not, in whole or in part, be copied, photocopied, translated, or reduced to any electronic medium or machine-readable form for commercial use without prior consent, in writing, from Art Technology Group (ATG), Inc. ATG does authorize you to copy documents published by ATG on the World Wide Web for non-commercial uses within your organization only. In consideration of this authorization, you agree that any copy of these documents which you make shall retain all copyright and other proprietary notices contained herein.

This documentation is provided “as is” without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. The contents of this publication could include technical inaccuracies or typographical errors. Changes are periodically added to the information herein; these changes will be incorporated in the new editions of the publication. ATG may make improvements and/or changes in the publication and/or product(s) described in the publication at any time without notice.

In no event will ATG be liable for direct, indirect, special, incidental, economic, cover, or consequential damages arising out of the use of or inability to use this documentation even if advised of the possibility of such damages. Some states do not allow the exclusion or limitation of implied warranties or limitation of liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you.

Acknowledgments

I would like to thank all of the people who have contributed along the way to the creation of this paper. First and foremost, thanks to Bill Morrison, ATG Product Marketing Manager, who sponsored the creation of this white paper.

Extra special thanks to the following trainers and courseware developers whose inspiration, ideas, diagrams, words, and experience have been used as source material for this white paper: Diana Carroll, Blake Crawford, Kevin Johnson, Pierre Billon, Karin Layher, and Paul Donovan. Thanks go to the following folks who reviewed the paper and provided helpful feedback: Blake Crawford, Karen Kilty, Joyce Wang, and Nathan Abramson. Final thanks go to my wife Bonnie Durante who put up with long hours spent designing, writing, and proofreading.


www.atg.com/offices

America Headquarters Art Technology Group, Inc.

25 First Street Second Floor

Cambridge, MA 02141 USA

Tel: +1 617 386 1000

Fax: +1 617 386 1111

North American Offices

Atlanta / Chicago / Dallas / Los Angeles / New York / Palo Alto / San Francisco / Toronto / Washington DC

European Headquarters

Art Technology Group (Europe), Ltd

Apex Plaza Forbury Road

Reading RG1 1AX UK

Tel: +44 0 118 956 5000

Fax: +44 0 118 956 5001

European Offices

Amsterdam / Frankfurt / London / Milan / Paris / Stockholm

Asia/Pacific Headquarters

Art Technology Group, Inc.

Suite 46 Level 11 Tower B

Zenith Centre

821 Pacific Highway

Chatswood NSW 2067

Sydney Australia

+61 2 8448 2071

+61 2 8448 2010

Asia/Pacific Offices Hong Kong / Singapore

Japan Headquarters Art Technology Group, Inc.

Imperial Tower, 15th Floor

1-1-1 Uchisaiwaicho

Chiyoda-ku, Tokyo 100-0011, Japan

www.atg.com 6540001-01 April 2002

© 2002, Art Technology Group, Inc. ATG, Art Technology Group, the techmark, the ATG Logo, and Dynamo are registered trademarks, and Personalization Server and Scenario Server are trademarks of Art Technology Group. All other trademarks are the property of their respective holders. All specifications are subject to change without notice. Art Technology Group, Inc. cannot accept liability for any loss or damage arising from the use of information or particulars in the brochure.

NASDAQ:ARTG

About ATG A trusted, global specialist in e-commerce, ATG has spent the last decade focused on helping the world's premier brands maximize the success of their online businesses. The ATG Commerce application suite is the top-rated platform by industry analysts for powering highly personalized, efficient and effective e-commerce sites. The company's platform-neutral e-commerce optimization services can be easily added to any Web site to increase conversions and reduce abandonment. These services include ATG Recommendations and eStara Connections. For more information, please visit http://www.atg.com. ©2009 Art Technology Group, Inc. ATG, Art Technology Group and the ATG logo are registered trademarks of Art Technology Group. All other trademarks are the property of their respective holders. NASDAQ:ARTG