(ATS3-APP14) Troubleshooting Symyx Notebook client performance Mike Wilson Advisory Product Manager [email protected]
Nov 10, 2014
The information on the roadmap and future software development efforts is intended to outline general product direction and should not be relied on in making a purchasing decision.
Agenda
• Notebook application overview
• System performance – a complex set of relationships
• Assessing Notebook performance
• Leveraging Accelrys support
• Q&A
Notebook High-Level Architecture
[Architecture diagram. Labels recoverable from the slide: Symyx Notebook by Accelrys; Experiment Editor; Notebook; Browser; Reporting; Search; Materials & Chemistry; Notebook Platform; Framework Platform; Pluggable Services; Lab Automation; Material Registration; Material Lookup; OpenEye; Accelrys Draw; Renditor; Administration; Workflow Designer; Configuration Management; Accelrys Vault Server; Services; Query; Index; Workflow; Repository; Authentication; Authorization; Oracle; Direct Cartridge; Windows Communication Foundation; DiscoveryGate W-S; Registration; Automation Studio; Content Management; Data Warehouse]
Symyx Notebook Deployment
• Centralized Deployment
  – App servers located close to database server
  – Clients installed locally or accessed centrally via Citrix
• Scaling
  – Multiple app servers behind load balancer
  – Database clustering via Oracle tools
• Virtualization
  – VMware ESX server support
  – Citrix XenApp client support
Response Time Breakdown
• Response time can be divided into 9 components
  – Client processing
  – Network transit from client to middle tier
  – Middle tier processing
  – Network transit from middle tier to database tier
  – Database tier processing
  – Network transit from database tier to middle tier
  – Middle tier processing
  – Network transit from middle tier to client
  – Client processing
• The key to understanding a performance problem is to isolate these 9 components to determine which contributes most to the problem. With that information it becomes possible to define an approach to solving the problem that will yield the largest improvement.
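The isolation step can be sketched in a few lines of Python. This is only an illustration: the timings are made-up values, not measurements from any Notebook deployment, and the names `COMPONENTS` and `largest_contributor` are hypothetical.

```python
# The nine response-time components, in request/response order.
COMPONENTS = [
    "client processing (request)",
    "network: client to middle tier",
    "middle tier processing (request)",
    "network: middle tier to database tier",
    "database tier processing",
    "network: database tier to middle tier",
    "middle tier processing (response)",
    "network: middle tier to client",
    "client processing (response)",
]


def largest_contributor(timings_ms):
    """Return (component, share of total) for the slowest component.

    timings_ms: nine durations in milliseconds, ordered as COMPONENTS.
    """
    if len(timings_ms) != len(COMPONENTS):
        raise ValueError("expected one timing per component")
    total = sum(timings_ms)
    if total <= 0:
        raise ValueError("total response time must be positive")
    index = max(range(len(timings_ms)), key=lambda i: timings_ms[i])
    return COMPONENTS[index], timings_ms[index] / total


# Illustrative values only (assumed, not measured).
example = [40, 15, 120, 5, 300, 5, 90, 400, 25]
name, share = largest_contributor(example)
print(f"{name}: {share:.0%} of total response time")
```

With these assumed numbers the return network leg dominates, which would point the investigation at the network rather than the servers.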
A day in the life…
• “Hello, Help Desk, Mike speaking…”
  – “This is Conrad in the lab. I’m trying to save a Notebook document and it’s taking FOREVER! It never used to do this but now it’s so slow that I can’t do anything!”
• “Hmmm… Did you try re-booting your computer?”
  – “Are you serious?”
• “Okay…we’ll take a look at the server and see what’s going on.”
• A few hours later…
• “Hey Conrad, this is Mike from the Help Desk. Is Notebook working better now?”
  – “Let me check. Wow, it’s fast again! What did you do?”
• “We checked the servers and everything looked fine. Must have been a temporary glitch. I will mark your trouble ticket resolved.”
And the next day? Guess who’s calling again…
Getting Started
• Performance problems can be intermittent, are often reported qualitatively ("it seems slower"), and are generally of unknown origin when first reported
• Prerequisites for a positive troubleshooting experience
  – Up-to-date network diagram for your deployment
  – Baseline performance data based on periodic measurements
First Questions…
• Is the problem localized or widespread?
• Can it be easily reproduced?
• Can the problem be isolated to a specific response time component?
• Your toolkit
  – Client-side logging
  – Network application monitors
  – Network health monitors
The next question…What Changed?
• Multi-tier systems have many moving parts that are typically managed by many groups (or companies!)
  – Database hardware, OS, Oracle, database instance
  – Middle tier hardware, OS, IIS, firewall, AV, proxy, domain contact
  – Client tier OS, patch set, pushed app configuration
  – Virtualization and shared services add another complication
  – Number of users, changes to system configuration
• One of the first challenges is to figure out what is different today. Your toolkit includes:
  – Installation Qualification, Operational Qualification
  – Health monitors
  – Periodic configuration/status reports (database and app tier)
  – Change history log that tracks all configuration changes made to the system
Tools for Troubleshooting SN Performance
Periodic Performance Baselines
• Consistent, periodic baselines aid later troubleshooting
  – Important to run across sites
• Accelrys provides a standard set of tests via the support team
  – Using the same tests allows comparison to Accelrys performance testing
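One way to put a baseline to work is to compare the median of today's timings with the median of the baseline run. The sketch below assumes timings in seconds for a single operation; the function name `compare_to_baseline`, the 25% tolerance, and the sample values are all assumptions for illustration, not part of the Accelrys test set.

```python
from statistics import median


def compare_to_baseline(baseline_s, current_s, tolerance=0.25):
    """Compare current timings against a baseline using medians.

    Returns (ratio, regressed): ratio is current median / baseline median,
    and regressed is True when the ratio exceeds 1 + tolerance.
    """
    if not baseline_s or not current_s:
        raise ValueError("both samples must be non-empty")
    ratio = median(current_s) / median(baseline_s)
    return ratio, ratio > 1 + tolerance


# Assumed example: document save times in seconds.
baseline = [2.0, 2.2, 1.9, 2.1, 2.0]
today = [3.9, 4.2, 4.0, 4.1, 3.8]
ratio, regressed = compare_to_baseline(baseline, today)
print(f"median ratio {ratio:.2f}, regressed: {regressed}")
```

Medians are used here because a single slow outlier should not trigger an alarm on its own; a different statistic may suit other deployments better.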
Client Performance Logging
• Client performance logging is enabled in the app config file
  – Login performance is tracked by default
  – Check-in, Check-out, Script performance must be enabled manually
• Logs can be easily analyzed in Excel
  – Append multiple logs
  – Date/Time in GMT
  – User and source computer captured
• Tip: log periodic baselines for easy analysis
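The same analysis can be scripted instead of done in Excel. The sketch below appends several logs, shifts the GMT timestamps to a local offset, and averages duration per operation. The column names and sample rows are hypothetical; the actual client log layout is not specified here and may differ.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

# Hypothetical log layout and sample rows; the real client log may differ.
LOG_A = """timestamp,user,computer,operation,duration_ms
2014-11-03 14:00:05,conrad,LAB-PC-01,Login,1800
2014-11-03 14:02:10,conrad,LAB-PC-01,CheckIn,4200
"""
LOG_B = """timestamp,user,computer,operation,duration_ms
2014-11-04 09:15:00,mike,HELP-PC-07,Login,2200
2014-11-04 09:20:30,mike,HELP-PC-07,CheckIn,9800
"""


def load_logs(texts, utc_offset_hours=0):
    """Append several logs and shift GMT timestamps to a local offset."""
    local = timezone(timedelta(hours=utc_offset_hours))
    rows = []
    for text in texts:
        for row in csv.DictReader(io.StringIO(text)):
            stamp = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
            row["local_time"] = stamp.replace(tzinfo=timezone.utc).astimezone(local)
            row["duration_ms"] = int(row["duration_ms"])
            rows.append(row)
    return rows


def mean_by_operation(rows):
    """Average duration in milliseconds for each operation."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["operation"]].append(row["duration_ms"])
    return {op: mean(values) for op, values in grouped.items()}


rows = load_logs([LOG_A, LOG_B], utc_offset_hours=-5)
print(mean_by_operation(rows))
```

Because user and source computer are captured, the same grouping could be keyed on `user` or `computer` to check whether a slowdown is localized or widespread.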
Change Logs
• A complete chronology, from the initial install to the present, detailing what changes have been made to the system. This should include, but not be limited to, the following:
  – How and when users were added to the system
  – Changes to workflow
  – Any customizations that were introduced, and the source code, tests, requirements
    • Scripts
    • Custom forms
    • Custom workflow activities
    • Custom Web services
• What (if any) adapters are installed
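A change log kept as structured records makes the "what changed?" question easy to answer. The following is a minimal sketch, assuming each entry records a date, a tier, and a description; the `Change` class, the 14-day window, and the sample entries are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Change:
    when: date
    tier: str          # e.g. "database", "middle", "client", "network"
    description: str


def changes_before(log, problem_start, window_days=14):
    """Return changes made in the window leading up to problem_start, newest first."""
    earliest = problem_start - timedelta(days=window_days)
    recent = [c for c in log if earliest <= c.when <= problem_start]
    return sorted(recent, key=lambda c: c.when, reverse=True)


# Invented entries for illustration only.
log = [
    Change(date(2014, 6, 2), "client", "Custom form added to experiment template"),
    Change(date(2014, 10, 28), "middle", "Load balancer rules updated"),
    Change(date(2014, 11, 5), "database", "Statistics job schedule modified"),
]
for change in changes_before(log, problem_start=date(2014, 11, 7)):
    print(change.when, change.tier, change.description)
```

Listing the newest changes first puts the most likely suspects at the top, though a gradual slowdown may call for a much wider window.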
Leveraging Accelrys Support
Customer Resources
• Line up the following resources to assist the Accelrys support team
  – Database tier owner (DBA)
  – Middle tier owner (OS/IIS)
  – Network analyst (load balancer, DNS, etc.)
System Configuration Info
• Site audit
  – Current deployment network diagram
  – Adapters that are part of the deployment
  – # of users, # of PSDs
  – IQ/OQ from most recent upgrade/install
• Change log from initial install to present
  – History of user expansion in system
  – Customizations that have been introduced (and when)
Operational Data
• Middle Tier
  – All Vault debug logs
  – Configuration files
  – Windows event logs
  – Ping times to/from client and database
  – CPU utilization reports
  – Load balancer configuration
• Database Tier
  – Strongly recommended: AWR and ADDM reports from periods with good as well as bad performance (requires Oracle Diagnostics Pack)
  – Oracle health monitor report
  – Verify that statistics are up to date and are collected regularly
  – CPU and disk utilization reports
CSI: Symyx Notebook
Common root causes taken from real support cases:
• General slowdowns gradually getting worse over time
  – Oracle database statistics jobs were not running. Without up-to-date statistics, Oracle queries become progressively more inefficient
• Periodic slowdowns affecting all users
  – Oracle tablespace extent size set very low, causing frequent extensions to tablespaces and slowing response as disk was allocated
• Some users get fast responses while others are slow
  – Load balancer distribution overloaded one server while others were lightly used
• Saving documents extremely slow while browse/open are normal
  – Problem eventually isolated to a network link that was slow in one direction
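The load balancer case lends itself to a quick check: count requests per app server and compare the busiest server with an even share. This is a sketch under assumptions; the server names, the request list, and the 2.0 threshold are invented, and how per-server request counts are actually obtained depends on the load balancer in use.

```python
from collections import Counter


def busiest_server(requests, servers, threshold=2.0):
    """Return (server, factor, overloaded) for the busiest app server.

    requests: one server name per handled request.
    servers: every server behind the load balancer, including idle ones.
    factor: busiest server's request count divided by an even share.
    """
    if not requests or not servers:
        raise ValueError("need at least one request and one server")
    counts = Counter(requests)
    server, hits = counts.most_common(1)[0]
    even_share = len(requests) / len(servers)
    factor = hits / even_share
    return server, factor, factor >= threshold


# Invented example: one server takes 8 of 10 requests.
servers = ["app1", "app2", "app3"]
requests = ["app1"] * 8 + ["app2", "app3"]
server, factor, overloaded = busiest_server(requests, servers)
print(f"{server} handles {factor:.1f}x an even share, overloaded: {overloaded}")
```

Passing the full server list matters because a server that received no requests at all would otherwise be left out of the even-share calculation.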
Summary
• Effective performance troubleshooting benefits from a consistent approach and strong collaboration between your team and Accelrys support
• Resources
  – Notebook IT/Admin forum on the Accelrys Community
    • Email [email protected] to join
  – Troubleshooting guidance: [email protected]
For more information on the Accelrys Tech Summits and other IT & Developer information, please visit: https://community.accelrys.com/groups/it-dev