Page 1
© 2012 SL Corporation. All Rights Reserved.
© 2013 SL Corporation. All Rights Reserved. 1
Real-Time Coherence Monitoring in Integrated Environments
Correlating Coherence Monitoring Metrics with Infrastructure, Database, and Application Server Metrics
5 December 2013 - London, UK
Everett Williams
Senior Director of Technology
SL Corporation
Tom Lubinski
Chief Technology Officer
SL Corporation
Page 2
Disclaimer
The following is intended to outline our general product direction. It is
intended for information purposes only, and may not be incorporated
into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing
decisions. The development, release, and timing of any features or
functionality described for SL’s products remains at the sole discretion
of SL.
Page 3
Agenda
• Customer quotes/Problem Statement
• Data Collected
• Current Tools/Analysis capabilities
• Architecture
• Demo
• Expanding to App Servers
• Challenges
Page 4
Customer Questions/Comments
• Nodes, Caches, and Services are nice. But we are server people. – Major Online Retailer
• How do I know if hardware or the network is causing my Coherence problem? – Major Apparel Company
• Does blade configuration #1 or blade configuration #2 run my application better? – Major Retailer
• We spend a great deal of time “poking around” looking for system metrics. – Investment Bank
Page 5
Assumptions
• 100-500 JVMs
• 10-50 hosts
• 100s of caches
• Analysis over time
• Overlapping data sets (more than outlier analysis)
Page 6
Data Collected and Aggregated
• Coherence Information
– Cache
– Service
– Node
– Storage Manager
• Host Information
– Host CPU/Memory
– Host Network
– Host Process CPU/Memory
• Coherence information aggregated to the host level.
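The host-level roll-up in the last bullet could be sketched as follows. This is a minimal illustration under stated assumptions, not RTView's implementation: the `NodeSample` fields and the `aggregate_by_host` function are hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One per-node (JVM) sample; the field names are hypothetical."""
    host: str
    node_id: int
    heap_used_mb: float
    packets_resent: int


def aggregate_by_host(samples):
    """Sum per-node values so that each host ends up with a single row."""
    totals = defaultdict(
        lambda: {"nodes": 0, "heap_used_mb": 0.0, "packets_resent": 0}
    )
    for s in samples:
        row = totals[s.host]
        row["nodes"] += 1
        row["heap_used_mb"] += s.heap_used_mb
        row["packets_resent"] += s.packets_resent
    return dict(totals)


samples = [
    NodeSample("hostA", 1, 512.0, 3),
    NodeSample("hostA", 2, 256.0, 0),
    NodeSample("hostB", 3, 768.0, 7),
]
host_rows = aggregate_by_host(samples)
```

With one row per host, Coherence totals can sit next to host CPU, memory, and network figures that are already reported per host.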
Page 7
What makes Coherence “unique”
• Single task spread across multiple processes and servers.
• Impact of network and latency
Page 8
Server view of data
• Single Server
• Single state
• No Context
• Doesn’t scale visually
Page 9
Top
• Single Server
• Single state
• No Context
• Doesn’t scale visually
Page 10
Top/NMon
• Single Server
• Single state
• No Context
• Doesn’t scale visually
Page 11
Taskmon Graphs
• Single Server
• No Context
• Time Series not aligned
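One way to address the "not aligned" limitation is to snap samples from different sources onto shared time buckets before comparing them. The sketch below assumes samples arrive as `(epoch_seconds, value)` pairs. The function names, bucket width, and sample values are illustrative only.

```python
def bucket(series, width_s):
    """Map (epoch_seconds, value) samples to the last value seen per bucket."""
    out = {}
    for ts, value in sorted(series):
        out[ts - ts % width_s] = value
    return out


def align(series_a, series_b, width_s=10):
    """Return (bucket_start, a, b) rows for buckets present in both series."""
    a = bucket(series_a, width_s)
    b = bucket(series_b, width_s)
    return [(t, a[t], b[t]) for t in sorted(a.keys() & b.keys())]


# Hypothetical samples taken at slightly different instants.
host_cpu = [(1003, 41.0), (1012, 55.0), (1021, 90.0)]
node_gc_ms = [(1000, 12.0), (1015, 14.0), (1027, 80.0), (1031, 20.0)]
aligned = align(host_cpu, node_gc_ms)
```

Buckets that only one source reported are dropped here. Interpolating or carrying the last value forward would be alternative design choices.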
Page 12
Stacked Graphs
• Doesn’t visually scale
• Top N Servers don’t re-sort
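The re-sort limitation can be avoided by recomputing the ranking for every interval instead of fixing the server list once. A minimal sketch, with hypothetical server names and values:

```python
import heapq


def top_n(latest, n):
    """Return the n (server, value) pairs with the highest current value."""
    return heapq.nlargest(n, latest.items(), key=lambda kv: kv[1])


# Hypothetical CPU percentages for two consecutive intervals.
interval_1 = {"srv1": 80.0, "srv2": 35.0, "srv3": 60.0}
interval_2 = {"srv1": 20.0, "srv2": 95.0, "srv3": 60.0}

ranked_1 = top_n(interval_1, 2)
ranked_2 = top_n(interval_2, 2)
```

Here `srv2` enters the top two in the second interval, which a chart with a fixed server list would not show.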
Page 13
Single Graph with Multiple Trends
• Good for Outlier analysis
• Not good for overlapping trends.
• Single Server
• No Context
Page 14
RTView Enterprise Solution
Collect, Analyze, Correlate, and Visualize data from multiple disparate sources
[Diagram labels: RTView Enterprise System, RTView Developer, Generic JVM, VMware, Oracle WebLogic, Oracle Coherence, Oracle Database, TIBCO …, IBM …, Custom Package, System Metrics, OEM Connector, OEM, Target Systems, … and many more]
Page 15
RTView Enterprise Solution
EM Central Server(s) provide configuration to DataServers and alert management to users. They also provide a CacheMap identifying the location of all data contained in DataServers.
[Diagram labels: Users, EM Central Server(s), DataServers, Display Server, Configuration Management, Alert Aggregation, Directory, Cache Map, CMDB, ALERTDEFS, HISTORY]
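The CacheMap on this slide can be thought of as a directory from a data table name to the DataServer that holds it. The sketch below is only an illustration of that idea. The table names, DataServer names, and the `locate` and `plan_queries` helpers are all hypothetical and do not reflect RTView's actual API.

```python
# Hypothetical directory: cache (data table) name -> DataServer holding it.
cache_map = {
    "CoherenceNodeStats": "dataserver-coherence",
    "HostStats": "dataserver-hosts",
    "WlsServerStats": "dataserver-weblogic",
}


def locate(cache_name):
    """Return the DataServer name for a cache, or raise if it is unknown."""
    try:
        return cache_map[cache_name]
    except KeyError:
        raise LookupError(f"no DataServer registered for {cache_name!r}") from None


def plan_queries(cache_names):
    """Group the requested caches by the DataServer that must be queried."""
    plan = {}
    for name in cache_names:
        plan.setdefault(locate(name), []).append(name)
    return plan


plan = plan_queries(["HostStats", "CoherenceNodeStats"])
```

A display that correlates host and Coherence data would then issue one query per DataServer in `plan`.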
Page 16
DataServer Configuration
RTView DataServer – with H/A Deployment
• Primary / Backup Servers run on different machines
• DataServer components obtain configuration information via EM ConfigServer
[Diagram labels: Host A, Host B, DataServer <primary>, DataServer <backup>, Historian <primary>, Historian <backup>, RTView EM ConfigServer, H/A Database, Systems Being Monitored …]
Page 17
End-To-End Monitoring
• Capture the nested levels at which systems are implemented (heterogeneous component layering)
• Host Layer
– Physical Servers, Network, Disk, OS
• App Server
– Servlet, JSP, EJB
• Messaging
– Topic, Queue, Route
• Caching
– Cache, Service
Page 18
DEMO
Page 19
Challenges
• Consolidated snapshot of data
• Visual scalability
• Multiple combinations of trends
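The "multiple combinations of trends" challenge can be reduced by scoring every pair of metrics and looking at the strongest relationships first. The sketch below uses a plain Pearson correlation as one possible scoring technique. The deck does not state which method the product uses, and the metric names and values are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    # A constant series has zero variance and would divide by zero here.
    return cov / sqrt(var_x * var_y)


def rank_pairs(metrics):
    """Score every metric pair, strongest absolute correlation first."""
    scored = [
        (a, b, pearson(metrics[a], metrics[b]))
        for a, b in combinations(sorted(metrics), 2)
    ]
    return sorted(scored, key=lambda row: abs(row[2]), reverse=True)


# Hypothetical, already time-aligned samples.
metrics = {
    "host_net_retrans": [1, 2, 3, 4, 5],
    "service_request_ms": [10, 20, 30, 40, 50],
    "host_cpu_pct": [50, 48, 52, 49, 51],
}
ranking = rank_pairs(metrics)
```

Correlation only suggests where to look. It does not establish that the network caused the slow requests.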
Page 20
Process/Node
Page 21
Network/Service
Page 22
Network/Service #2
Page 23
Host/Service
Page 24
Network/TCMP
Page 25
Network/TCMP #2
Page 26
Expanding to Application Servers
Correlation of WebLogic and Coherence Monitoring Metrics
Page 27
On-Line Store Overview Diagram
System overview diagram
Page 28
WebLogic Cluster/Server Summary
All Servers Organized by Cluster, with Health State
Page 29
WebLogic Cluster App Summary
Each Cluster shown as a unit, with server metrics aggregated
Page 30
Load Balance Analysis
Load Balance Comparison of multiple metrics across WebLogic and Coherence
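A load balance comparison across tiers needs a measure that does not depend on each metric's units. One option, assumed here rather than taken from the product, is the coefficient of variation (standard deviation divided by mean) across cluster members. Metric and member names are hypothetical.

```python
from statistics import mean, pstdev


def imbalance(values):
    """Coefficient of variation: 0.0 means a perfectly even load."""
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0


# Hypothetical per-member rates for one WebLogic and one Coherence metric.
load = {
    "weblogic_requests_per_s": {"wls1": 100, "wls2": 100, "wls3": 100},
    "coherence_cache_gets_per_s": {"node1": 50, "node2": 150},
}
report = {
    name: imbalance(list(by_member.values()))
    for name, by_member in load.items()
}
```

Because the score is unitless, a WebLogic request rate and a Coherence cache rate can be compared on the same scale.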
Page 31
Aggregating Other Middleware Information
Health State of each service aggregated from multiple components
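The deck does not say how component states are combined into a service state. A common convention, assumed for this sketch, is "worst state wins". The `Health` levels and component names below are hypothetical.

```python
from enum import IntEnum


class Health(IntEnum):
    """Hypothetical severity levels, ordered from best to worst."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def service_health(component_states):
    """A service is treated as being as unhealthy as its worst component."""
    return max(component_states.values(), default=Health.OK)


# Hypothetical components behind a single business service.
order_service = {
    "weblogic-cluster": Health.OK,
    "coherence-cache-service": Health.WARNING,
    "oracle-database": Health.OK,
}
state = service_health(order_service)
```

In this example the service reports `WARNING` because the Coherence cache service does, even though the other components are healthy.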
Page 32
Aggregating Other Middleware Information
Health State of each service aggregated from multiple components
Page 33
Aggregating Other Middleware Information
Including Aggregate Service Alert History over Time
Page 34
Aggregating Other Middleware Information
Including Detailed History of Coherence Cache Service
Page 35
Thank you!
For more information, please visit
www.sl.com and www.sl.com/blog