Oracle Operational Timing Data
Paper 36571
Cary Millsap ([email protected]), Hotsos Enterprises, Ltd.
OracleWorld 2003 / San Francisco, California USA
8:30a–9:30a Wednesday 10 September 2003
Example: What’s the problem here? It’s obvious, right?
Response Time Component          # Calls
------------------------------   -------
CPU service                       18,750
SQL*Net message to client          6,094
SQL*Net message from client        6,094
db file sequential read            1,740
log file sync                        681
SQL*Net more data to client          108
SQL*Net more data from client         71
db file scattered read                34
direct path read                       5
free buffer waits                      4
log buffer space                       2
direct path write                      2
log file switch completion             1
latch free                             1
One easy way out of the problem is to use Oracle’s extended SQL trace data.
• Easy to activate
• Low overhead if used sensibly
• Produces a chronological record
• Contains all the performance data you need in most cases
• Doesn’t require any special tools to get started
The easiest way to activate extended SQL trace is to add a few statements to your application’s source code.
alter session set timed_statistics=true;
alter session set max_dump_file_size=unlimited;
alter session set tracefile_identifier='POX20031031a';
alter session set events '10046 trace name context forever, level 8';

/* code to be traced goes here */

alter session set events '10046 trace name context off';
There are two types of wait event: events issued within db calls, and events issued between db calls.
• Events issued between db calls are recognizable by name:
– SQL*Net message from client
– SQL*Net message to client
– single-task message
– pipe get
– rdbms ipc message
– pmon timer
– smon timer
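In the raw level-8 trace data, each wait event appears as a WAIT line whose nam field carries the event name. The lines below are an illustrative sketch only: the cursor numbers, ela values, and p1–p3 parameters are invented, and the units of ela differ by release (centiseconds through Oracle8i, microseconds from Oracle9i onward).

```
WAIT #1: nam='db file sequential read' ela= 13227 p1=4 p2=1618 p3=1
WAIT #1: nam='SQL*Net message to client' ela= 4 p1=1650815232 p2=1 p3=0
WAIT #1: nam='SQL*Net message from client' ela= 1262 p1=1650815232 p2=1 p3=0
```

The first line occurs within a db call; the last two occur at the boundary between db calls.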
Extended SQL trace is reliable for diagnosing the root cause of virtually any performance problem.
• Method is reliable for even more problem types by isolating the “un-measured” components:
– M – measurement intrusion effect
– E – quantization error
– N – time spent not executing
– U – un-instrumented Oracle code (systematic error + bugs)
– S – CPU consumption that is counted twice
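These un-measured components show up as the residual between a db call’s elapsed duration e and what the trace data accounts for (CPU time c plus the in-call wait durations). A minimal sketch, assuming e, c, and the ela values have already been parsed from a trace file; the numbers are invented for illustration:

```python
def unaccounted_for(e, c, elas):
    """Residual of a db call's elapsed time after subtracting the
    instrumented components; it contains M + E + N + U - S."""
    return e - (c + sum(elas))

# Invented example, all values in microseconds:
# e = 1,000,000 elapsed; c = 400,000 CPU; two waits totaling 550,000.
delta = unaccounted_for(1_000_000, 400_000, [500_000, 50_000])
print(delta)  # -> 50000
```

A large positive residual suggests N or U dominates; a negative residual suggests double-counted CPU (S).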
What you can do with the timing data is spectacular.
• Performance problems cannot hide from you
• No more solving the wrong problem
• No more stabs in the dark or trial-and-error “tuning”
• No more multi-month performance diagnosis projects
• No more CTD [Vaidyanatha and Deshpande (2000)]
You either solve the right problem quickly or prove that solving it is not worth the effort.
Why I use extended SQL trace data instead of Oracle’s V$ fixed views…
• I don’t use V$ data anymore, because it’s a mess:
– Can’t poll fast enough with SQL
– Too complicated to attach directly to the SGA
– Too many data sources (V$SESSTAT, V$SESSION_EVENT, V$LATCH, V$LOCK, V$FILESTAT, V$WAITSTAT, V$SQL, …)
– No notion of e; therefore, can’t isolate M, E, N, U, S
– No read consistency
– Statistics are unreliable (CPU time used by this session, SECONDS_IN_WAIT, …)
– Statistics are susceptible to overflow
– Hard (impossible?) to determine recursive SQL relationships
Where do Oracle’s timing statistics actually come from?
procedure dbcall {
    e0 = gettimeofday;  # mark the wall time
    c0 = getrusage;     # obtain resource usage statistics
    ...                 # execute the db call (may call wevent)
    c1 = getrusage;     # obtain resource usage statistics
    e1 = gettimeofday;  # mark the wall time
    e = e1 - e0;        # elapsed duration of dbcall
    c = (c1.utime + c1.stime) - (c0.utime + c0.stime);
                        # total CPU time consumed by dbcall
}
procedure wevent {
    ela0 = gettimeofday;  # mark the wall time
    ...                   # execute the wait event (syscall)
    ela1 = gettimeofday;  # mark the wall time
    ela = ela1 - ela0;    # elapsed duration of wevent
}
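The pseudocode above maps directly onto the POSIX gettimeofday and getrusage calls. A minimal runnable sketch in Python (Unix-only, since it uses the resource module); the work and syscall callables stand in for the real db call and wait event:

```python
import resource
import time

def dbcall(work):
    """Time a unit of work the way the pseudocode times a db call:
    wall-clock duration e, plus CPU duration c from the user- and
    system-mode CPU counters that getrusage reports."""
    e0 = time.time()                               # mark the wall time
    c0 = resource.getrusage(resource.RUSAGE_SELF)  # resource usage stats
    work()                                         # execute the "db call"
    c1 = resource.getrusage(resource.RUSAGE_SELF)  # resource usage stats
    e1 = time.time()                               # mark the wall time
    e = e1 - e0                                    # elapsed duration
    c = (c1.ru_utime + c1.ru_stime) - (c0.ru_utime + c0.ru_stime)
    return e, c

def wevent(syscall):
    """Time a blocking call the way the pseudocode times a wait event:
    wall-clock duration only."""
    ela0 = time.time()  # mark the wall time
    syscall()           # execute the "wait event"
    ela1 = time.time()  # mark the wall time
    return ela1 - ela0  # elapsed duration

# A sleep consumes wall time but almost no CPU, so e greatly exceeds c.
e, c = dbcall(lambda: time.sleep(0.05))
ela = wevent(lambda: time.sleep(0.05))
```

Running the last two lines illustrates the distinction the pseudocode draws: the dbcall sees both an elapsed and a CPU component, while the wevent sees elapsed time only.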
Understanding where Oracle timings come from motivates all sorts of interesting research projects…
• How to quantify the measurement intrusion effect
• How to bound quantization error
• How to bound un-instrumented call effects
• How to estimate the amount of kernel-mode time double-counted