October 25–29, 2009 • Mandalay Bay • Las Vegas, Nevada
Revolutionizing the Data Abstraction Layer with IBM Optim pureQuery
Dr. Vladimir Bacvanski, Vice President, InferData, [email protected]
Daniel Galvin, Consultant, Galvin Consulting, [email protected]
Session Number 2171
Revolutionizing the Data Abstraction Layer with IBM Optim pureQuery and DB2
Looking for a more flexible and efficient way for Java programs to access the database? Join us as we explore how you can bridge the gap between Java and relational databases. Enhance your Java environment with access layer generation, data access best practices, traceability between Java packages and SQL statements, improved impact analysis and more. And most importantly, see how new technology can improve not only new development, but existing applications as well. Be prepared to see designs and code samples!
What is this revolution about?
NO SLOW APPS!
NO BAD SQL!
GET CONTROL BACK!
Show of Hands: What Data Access Technology Have You Used?
What’s most important to you?
– Productivity
– Performance
– Security
– Portability
Technologies and concerns shown on the slide: JDBC, iBatis, SQLJ, Hibernate, EJB Entity Beans, JPA, mashup, HTTP, Stored Procedures, JSON, QoS goals, JSP, XML, Runstats, Response Time!, SQL, Spring, REORG, Partition strategy
Java Data Access – Two Views of the World
Application Developer (SQLJ, JDBC, JPA, iBatis, Spring, . . .)
– Why does this query take so long?
– I can’t believe I got called out last week. I wish I could see how these queries will run in production.
– Writing Java code is so easy with this Eclipse environment. I wish it were that easy to get the SQL right.
– This ORM doesn’t allow me to leverage all my database’s SQL.
– Static SQL? Sounds like another delay to getting my program deployed.
– Sometimes I need POJOs, sometimes JSON, sometimes XML. What should I use?
Database Developer & Administrator
– Another runaway query! Where are these coming from? JDBC? Hmmm…
– Inconsistent response time? How long will it take me to find the offending application sending bad SQL this time?
– These ad-hoc queries are dangerous. We need a library of tested SQL interfaces.
– Can I examine the SQL “before” the application is deployed?
– Another GRANT request? This security administration is out of control.
Data Mapping Approaches
Application-Centric
– Top-Down
– Start with Object Domain Model
– ORM Mapping
– Well supported in dynamic languages and frameworks
Hybrid
– Meet in the middle
– Can be challenging w/o compromising
Data-Centric
– Bottom-Up
– Start with Relational Data Model
– Not well supported in dynamic languages and frameworks
Diagram: the Persistence Layer can be reached Top Down, Bottom Up, or by Meet in the Middle
EJB, JPA, and Hibernate vs. The Database
DBA and SQL developer chasm
– Where is the SQL coming from?
– What is it?
– Where is it?
– How do we tune it?
– How do we manage it?
Performance Concerns
– Some app server vendors claim (unsurprisingly) that managed objects perform fine.
– There are many user claims on the web that managed object performance is bad.
– As always, the truth is in the middle, and will depend on your app server, application, database, etc.
“Our top story: Large Customer moves from COBOL to Java to become more agile. In other news, DBAs develop amnesia.”
Introducing pureQuery
A high-performance data access platform to simplify developing, managing, securing, and optimizing data access.
pureQuery Components:
Simple and intuitive API
– Enables SQL access to databases or in-memory Java objects
– Facilitates best practices
Optim Development Studio (integrates with RAD/RSA)
– Integrated development environment with Java and SQL support
– Improve problem isolation and impact analysis
Optim pureQuery Runtime
– Flexible static SQL deployment for DB2
pureQuery Balances Productivity and Control
Spectrum: Full SQL control at one end, Managed objects (Object-relational mapping) at the other
Technologies shown: JDBC / SQLJ, Spring templates, iBATIS, pureQuery, Hibernate, OpenJPA (EJB3)
Capabilities along the spectrum:
– Code all your SQL
– Use SQL templates, inline only
– Add basic OR mapping and annotated-method style (pureQuery)
– Complex OR mapping and persistence management, but loss of controls
– Adds container management option
Existing JDBC to Static
– Reroute Dynamic Queries to Static
Jump Start Application Design
– Generate SQL and Code from Database Objects
– Set up basic DAO Pattern
Oracle Support
– Replace Query w/o changing source
Code Example: JDBC
Table  Column     Type
EMP    NAME       CHAR(64)
EMP    ADDRESS    CHAR(128)
EMP    PHONE_NUM  CHAR(10)

class Employee {
  String name;
  String homeAddress;
  String homePhone;
  …
}

// Prepare the query and bind the employee name to the parameter marker.
java.sql.PreparedStatement ps = con.prepareStatement(
    "SELECT NAME, ADDRESS, PHONE_NUM FROM EMP WHERE NAME=?");
ps.setString(1, name);
java.sql.ResultSet rs = ps.executeQuery();
// Move to the first row, then copy each column into the bean by hand.
rs.next();
Employee myEmp = new Employee();
myEmp.setName(rs.getString(1));
myEmp.setHomeAddress(rs.getString(2));
myEmp.setHomePhone(rs.getString(3));
rs.close();
Code Example: pureQuery
Employee myEmp = db.queryFirst( "SELECT NAME, ADDRESS, PHONE_NUM FROM EMP WHERE NAME=?", Employee.class, name);
Even simpler, if we have a method getEmployee with a Java annotation or XML file with SQL for the query:
Employee myEmp = getEmployee(name);
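The annotated-method style can be pictured with plain Java. The sketch below is not the pureQuery API: the `@Sql` annotation, the `EmployeeQueries` interface, and the `SqlBinder` helper are hypothetical names, and the executor is a stand-in function rather than a database call. It only shows how a call such as getEmployee(name) can be resolved to the SQL text declared next to the method.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Proxy;
import java.util.function.BiFunction;

// Hypothetical annotation that carries the SQL text for a method.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Sql {
    String value();
}

// Hypothetical data access interface: one method per SQL statement.
interface EmployeeQueries {
    @Sql("SELECT NAME, ADDRESS, PHONE_NUM FROM EMP WHERE NAME=?")
    Object getEmployee(String name);
}

final class SqlBinder {
    private SqlBinder() {}

    // Creates an implementation of the interface. Each call looks up the
    // method's @Sql text and hands it, with the arguments, to the executor.
    // The executor stands in for whatever actually runs the statement.
    static <T> T bind(Class<T> iface, BiFunction<String, Object[], Object> executor) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (self, method, args) -> {
                    Sql sql = method.getAnnotation(Sql.class);
                    if (sql == null) {
                        throw new UnsupportedOperationException(
                                "No @Sql on " + method.getName());
                    }
                    return executor.apply(sql.value(), args == null ? new Object[0] : args);
                });
        return iface.cast(proxy);
    }
}
```

Because the SQL lives in one declared place per method, a tool can list every statement an application may issue without running it, which is the property the rest of the session builds on.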
WHY SHOULD DATA SPECIALISTS BE INTERESTED IN PUREQUERY?
Motivations of the Data Specialists
SQL Performance Tuning
– Ease of Tuning
– Autonomy of Developers
– Predictability of an Optimized Data Access Path
Reduction of Costs to satisfy SQL statements
– Optimized Access Paths
– Reduction of CPU intensive components of SQL Execution
– Utilization of Specialty Processors
Capacity Planning
– Hardware Utilization
Problem Determination capabilities
pureQuery Capabilities
– Static SQL for Runtime with Dynamic SQL Execution in Development
– pureQuery can utilize SELECT INTO from a Java application
– With Client Optimization, Static SQL from existing JDBC with no changes to the Application
– Homogeneous and Heterogeneous Batching of Statements
– Statically bound packages are easy to EXPLAIN and monitor for changes in access path
– pureQuery coupled with IBM Optim Performance Monitoring provides E2E Performance Monitoring and Problem Determination
– Impact analysis is greatly improved by the static packages and the ability to tie each statement to a method in the application code
STATIC VS. DYNAMIC SQL EXECUTION
Static vs. Dynamic SQL
Dynamic SQL – Full Prepare (~ 300,000 lines of code)
– Check Plan/Package Authorization and look for stmt in Cache
– Parse SQL Statement
– Validate SQL against DB2 Catalog
– Check Table/View authorization
– Compute Access Path
– Create Skeleton in EDM Pool
– Execute Statement
Dynamic SQL – Short Prepare (~ 15,000 lines of code)
– Check Plan/Package Authorization and look for stmt in Cache
– Copy skeleton from cache to local DB2 thread storage
– Execute Statement
Static SQL
– Check Plan/Package Authorization
– Load package into EDM if not previously loaded
– Execute Statement
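The three paths can be illustrated with a toy model. This is a sketch under simplifying assumptions, not DB2 internals: the class and method names are hypothetical, the statement cache is a plain set keyed by SQL text, and "executing" a statement only records which path it would take.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the three execution paths; not DB2 internals.
final class PreparePathModel {
    enum Path { FULL_PREPARE, SHORT_PREPARE, STATIC }

    private final Set<String> boundStatements = new HashSet<>(); // bound into a package
    private final Set<String> dynamicCache = new HashSet<>();    // dynamic statement cache
    private final Map<Path, Integer> counts = new HashMap<>();

    // Models binding a statement into a static package before run time.
    void bind(String sql) {
        boundStatements.add(sql);
    }

    // Returns the path a statement would take and records it.
    Path execute(String sql) {
        Path path;
        if (boundStatements.contains(sql)) {
            path = Path.STATIC;          // access path was computed at bind time
        } else if (dynamicCache.contains(sql)) {
            path = Path.SHORT_PREPARE;   // cache hit: reuse the prepared statement
        } else {
            dynamicCache.add(sql);       // cache miss: parse, validate, compute access path
            path = Path.FULL_PREPARE;
        }
        counts.merge(path, 1, Integer::sum);
        return path;
    }

    int count(Path path) {
        return counts.getOrDefault(path, 0);
    }
}
```

In this model the cache is keyed by the exact SQL text, so a statement built by concatenating literal values produces a new text on each call and keeps taking the full-prepare path, while the same statement written with a parameter marker is prepared once and then reused.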
Cost of Prepare
– CPU cost of Short Prepare on DB2 9 for z/OS: between 400µs and 1ms
– CPU cost of Full Prepare on DB2 9 for z/OS: approximately 30 to 50ms. The cost can be much higher and generally increases with statement complexity.
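A rough way to read these numbers is as an expected prepare cost per statement: the cache hit ratio times the short-prepare cost, plus the miss ratio times the full-prepare cost. The sketch below does that arithmetic. The class and method names are hypothetical, and the inputs used in the note that follows (an 80% hit ratio, 0.4 ms short prepare, 30 ms full prepare) are assumed values for illustration, not measurements.

```java
final class PrepareCost {
    private PrepareCost() {}

    // Expected prepare CPU cost per statement, in milliseconds:
    // hitRatio * shortPrepareMs + (1 - hitRatio) * fullPrepareMs.
    static double expectedMillis(double hitRatio, double shortPrepareMs, double fullPrepareMs) {
        if (hitRatio < 0.0 || hitRatio > 1.0) {
            throw new IllegalArgumentException("hitRatio must be between 0 and 1");
        }
        return hitRatio * shortPrepareMs + (1.0 - hitRatio) * fullPrepareMs;
    }
}
```

With the assumed inputs, `PrepareCost.expectedMillis(0.80, 0.4, 30.0)` comes to about 6.32 ms, of which about 6.0 ms is contributed by the 20% of statements that miss the cache. The misses dominate, which is why removing the prepare step altogether with static SQL matters even when the cache hit ratio looks healthy.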
IRWW – an OLTP workload, Type 4 driver
– Cache hit ratio between 70 and 85%
– 23% improvement in throughput using pureQuery over dynamic JDBC
– 15% - 25% reduction in CPU per transaction over dynamic JDBC
Chart: Normalized Throughput (ITR) by API for JDBC Type 4 Driver (bar values: 274, 360, 420, 446, 485, 524)
Chart: % increase/reduction in CPU per transaction compared to JDBC using Type 4 driver (bar values: -35%, -14%, 6%, 15%, 25%)
How well does it work? – .NET applications
– IRWW – OLTP application
– Application accesses DB2 for z/OS
– Throughput during static execution increased by 159% over dynamic SQL execution, assuming a 79% statement cache hit ratio
*Any performance data contained in this document were determined in various controlled laboratory environments and are for reference purposes only. Customers should not adapt these performance numbers to their own environments as system performance standards. The results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.
BATCHING SQL STATEMENTS
Homogeneous & Heterogeneous Batch
Homogeneous Batch – all instances in the batch are the same statement and require only 1 line turn
Heterogeneous Batch – allows different SQL statements to be included in batch.
Both Utilize Multi-Row Insert
Heterogeneous Batches may contain 0 to many Homogeneous Batches
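One way to picture the relationship between the two kinds of batch: a heterogeneous batch is an ordered list of statements, and each run of consecutive identical statements inside it behaves like a homogeneous batch. The sketch below only groups SQL text. It is not the driver's or pureQuery's batching implementation, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchGrouping {
    private BatchGrouping() {}

    // Splits an ordered list of SQL statements into runs of consecutive
    // identical statements. Each run could be sent as one homogeneous batch.
    static List<List<String>> homogeneousRuns(List<String> statements) {
        List<List<String>> runs = new ArrayList<>();
        List<String> current = null;
        for (String sql : statements) {
            // Start a new run whenever the statement text changes.
            if (current == null || !current.get(0).equals(sql)) {
                current = new ArrayList<>();
                runs.add(current);
            }
            current.add(sql);
        }
        return runs;
    }
}
```

An empty list yields no runs and a list of identical inserts yields exactly one, which matches the "0 to many" wording above. Statement order is preserved, so the grouping does not change what the batch does.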
Allows you to bind static SQL packages from existing JDBC code
Avoids the cost of rewriting the application to code to the pureQuery API
Allows Heterogeneous batch with minor changes to the code
None of the productivity advantages are realized. Code is still maintained in JDBC.
End-to-End monitoring lacks some introspective capability into the coding
Creation of the static packages requires that you run the code.
Some overhead at runtime related to resolution of statements to static packages
Optimize Existing JDBC Applications
pureQuery client optimization enables static execution for JDBC applications (custom-developed, framework-based, or packaged)
Diagram: Existing JDBC Application → JDBC Driver w/ pureQuery → DB2 Data Servers, with either dynamic SQL execution or static SQL execution
Captured SQL-related metadata
Capture → Configure → Bind → Execute
"The ability to use static SQL with pureQuery is huge. Recently, I worked with a client who could reduce CPU usage by 7 percent thanks to this one feature."
— David Beulke, Pragmatic Solutions Inc.
Improve performance for DB2 – without changing a line of code
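The Capture, Configure, Bind, Execute flow can be pictured as a small router sitting in front of the driver. This is a conceptual sketch only, with hypothetical names; it does not show pureQuery's actual configuration or file formats. During capture, every SQL string the application issues is recorded. Once the recorded statements have been bound, known statements are routed to their static form and anything else is treated as dynamic.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Conceptual model of client optimization; all names are hypothetical.
final class SqlRouter {
    enum Mode { CAPTURE, EXECUTE }
    enum Route { DYNAMIC, STATIC }

    private final Set<String> captured = new LinkedHashSet<>();
    private Mode mode = Mode.CAPTURE;

    // Models the point at which the captured statements have been bound.
    void switchToExecute() {
        mode = Mode.EXECUTE;
    }

    // Decides how a statement issued by the unchanged application would run.
    Route route(String sql) {
        if (mode == Mode.CAPTURE) {
            captured.add(sql);       // record for the later bind step
            return Route.DYNAMIC;    // still runs dynamically while capturing
        }
        return captured.contains(sql) ? Route.STATIC : Route.DYNAMIC;
    }

    Set<String> capturedStatements() {
        return Collections.unmodifiableSet(captured);
    }
}
```

Only statements that were actually exercised during capture end up in the captured set, which is why creating the static packages requires running the code.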
Why should DBAs care?
DBAs have little to no visibility of application SQL before deployment, and no opportunity for review and optimization
Problem isolation takes days in contemporary environments such as Java, PHP, .NET, etc., because SQL cannot be traced back to the application and its source code
Constantly increasing Java application workload taxes existing systems – need to fit more work into existing systems
SQL injection represents an increasing risk to data security
Why should Developers care?
Get data access right the first time!
Get it done faster - Improved productivity
Single environment that spans Java application and database development
Improved problem isolation and resolution
Control performance
– Decide at deployment time how the SQL is executed
– Understand and lock down the access plan for SQL
– Replace suboptimal SQL without changing the application
Control security
– Prevent SQL injection
– Prevent execution of unauthorized SQL
– Better manage database security
See inside applications that are driving your database
– Understand where SQL comes from
– Understand when frameworks and ORMs are getting in the way
Simplify problem determination and troubleshooting
– Correlate problem SQL with applications, ORMs and frameworks