Roundtable Discussion with Oracle Professionals
Ask an Expert and Be an Expert
Jože Senegačnik, Māris Elsiņš, Andrejs Vorobjovs
Jun 25, 2015
© 2013 Pythian
Jože Senegačnik
• Owner of Dbprof d.o.o.
• First experience with Oracle Version 4.1 in 1988
• 25+ years of experience with Oracle RDBMS
• Proud member of the OakTable Network (www.oaktable.net)
• Oracle ACE Director
• VP of Slovenian OUG (SIOUG) board
• CISA – Certified IS auditor
• Blog about Oracle: http://joze-senegacnik.blogspot.com
• Co-author of the OakTable book “Expert Oracle Practices” by Apress (Jan 2010)
• PPL(A) / IR(SE) – private pilot license, instrument rating
• Blog about flying: http://jsenegacnik.blogspot.com
• Blog about Building Ovens, Baking and Cooking: http://senegacnik.blogspot.com
Māris Elsiņš
Google: Elsins Oracle
Twitter, LinkedIn, Blog, Slideshare
Oracle [Apps] DBA – 8 years {PL/SQL Developer 3 years}
Oracle Certified Master – 10g {9i/10g/11g/11i/R12 OCP}
Speaker at Oracle Conferences – 14+ {since 2007}
Pythian – {since 2011}
FAN OF BAAG!
Roundtable Discussion!
Courage = Beer
Questions = Presents
Beer + Presents = Fun
Warm-up 1: troubleshooting!
• 10.2.0.3
• Suddenly, we started getting lots of “ORA-07445: exception encountered: core dump [qercoStart()+156]…”
• No code changes in the DB
• MOS: there are bugs for SQLs with:
– Using ROWNUM < x condition in the where clause
– Using ROWNUM condition and FULL OUTER joins
– Using ROWNUM condition with UNION ALL set operation
Warm-up 1: troubleshooting!
Māris Elsiņš
Warm-up 1: troubleshooting!
• There is a “ROWNUM < x”
• There’s no “FULL OUTER”
• There’s no “UNION ALL”
Warm-up 1: troubleshooting!
• Tried all workarounds listed in the MOS bugs:
– Flushing the shared pool, eh?
– Bouncing the database, eeeh??
– Setting "_complex_view_merging"=false, uhh!
Eh…
• None of the bugs matches our case. What do they have in common?
Warm-up 1: troubleshooting!
• Oracle introduced native FULL OUTER JOIN operation in 10.2.0.5. Before that, it was implemented using the UNION ALL operation.
• “OR” and “IN” + “OR Expansion” transformation = UNION ALL (CONCATENATION)
My query – Statistics changed and “OR Expansion” was applied
Warm-up 1: troubleshooting!
SQL> ALTER SESSION SET EVENTS '10053 trace name context forever,level 1';
SQL> alter session set tracefile_identifier=CR758708_2;
SQL> alter session set max_dump_file_size=unlimited;
SQL> explain plan for SELECT CBMD.CBMD_BASE_MDL_NUMBER, ....
$ more test_ora_17805_CR758708_2.trc
*************************************
PARAMETERS WITH DEFAULT VALUES
******************************
...
_px_broadcast_fudge_factor = 100
_ordered_nested_loop = true
_no_or_expansion = false
optimizer_index_cost_adj = 100
optimizer_index_caching = 0
Warm-up 1: troubleshooting!
• alter system set "_no_or_expansion" = true;
• And I deserve a beer!
Warm-up 2: troubleshooting!
Māris Elsiņš
Warm-up 2: troubleshooting!
• eBusiness Suite.
• Oracle Forms.
• frmweb memory footprint is under 200M.
• Sudden spikes to 1.2G
• 7 processes = Unusable system
Warm-up 2: troubleshooting!
• Initial troubleshooting efforts
– Forms FRD trace. Too hard to read, and the low-level tracing caused issues for end users.
– Gdb / Gcore memory dumps. Reading a 1.2 GB memory dump is “too exciting” (maybe I just can’t do it properly).
Warm-up 2: troubleshooting!
• Hypothesis and the plan:
– The Forms process grows because it receives a huge amount of data from the database. How to confirm?
– Find the query. How to find it?
– Check MOS
– Fix the bug
Warm-up 2: troubleshooting!
• R12.1.3 has proper instrumentation for Oracle sessions!
• exec dbms_monitor.CLIENT_ID_TRACE_ENABLE('BFONROUGE', waits=>false, binds=>true);
Warm-up 2: troubleshooting!
• Recursive and non-Recursive SQLs:
PARSE #140120325469424:c=0,e=39,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1514468521,tim=1377002345178916
PARSE #140120325223680:c=0,e=45,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2933345646,tim=1377002345179984
PARSE #140120325374992:c=0,e=40,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=0,tim=1377002345184564
PARSE #140120323759152:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=1,plh=2782854086,tim=1377002345185121
PARSE #140120325363448:c=0,e=36,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=1,plh=1057973958,tim=1377002345185491
PARSE #140120325362576:c=0,e=34,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=1,plh=749271464,tim=1377002345185890
PARSE #140120325535560:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=4,plh=2020579421,tim=1377002345186987
PARSE #140120325534544:c=1000,e=718,p=0,cr=0,cu=2,mis=1,r=0,dep=3,og=4,plh=0,tim=1377002345189805
PARSE #140120325373648:c=0,e=36,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1895590010,tim=1377002345192930
PARSE #140120325364640:c=0,e=106,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1388734953,tim=1377002345852910
PARSE #140120325526432:c=0,e=166,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4065726397,tim=1377002347109204
• dep=0 = non-recursive = called by the form.
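The dep field can also be filtered mechanically. The following is a minimal Python sketch, assuming the raw trace keeps the "PARSE #cursor:key=value,..." layout shown in the excerpt. The helper name parse_depth is hypothetical, and the three sample lines are copied from the trace lines on this slide.

```python
import re

# Matches "PARSE #<cursor>:...,dep=<n>,..." lines of an extended SQL trace file.
PARSE_RE = re.compile(r"^PARSE #(\d+):.*?\bdep=(\d+),")

def parse_depth(line):
    """Return (cursor, dep) for a PARSE line, or None for any other line.

    Hypothetical helper for illustration; not part of any Oracle tool.
    """
    m = PARSE_RE.match(line)
    return (m.group(1), int(m.group(2))) if m else None

# Sample lines copied from the trace excerpt on the slide.
sample = [
    "PARSE #140120325469424:c=0,e=39,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1514468521,tim=1377002345178916",
    "PARSE #140120325374992:c=0,e=40,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=0,tim=1377002345184564",
    "PARSE #140120323759152:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=1,plh=2782854086,tim=1377002345185121",
]

parsed = [p for p in map(parse_depth, sample) if p is not None]
# Keep only cursors parsed at depth 0, i.e. statements issued by the form itself.
top_level = [cursor for cursor, dep in parsed if dep == 0]
print(top_level)
```

Only the first sample cursor is kept here, because the other two were parsed at recursive depth 1 and 2.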
Warm-up 2: troubleshooting!
$ tkprof PROD_ora_14363.trc PROD_ora_14363.trc.tkp sort=fchcnt sys=no
$ grep "Parsing user id" PROD_ora_14363.trc.tkp | more
Parsing user id: 173
Parsing user id: 173 (recursive depth: 1)
Parsing user id: 173 (recursive depth: 2)
Parsing user id: 173 (recursive depth: 1)
Parsing user id: 173
Warm-up 2: troubleshooting!
$ more PROD_ora_14363.trc.tkp
********************************************************************************
SQL ID: 2f4375b2hn94s Plan Hash: 603069092
SELECT *
FROM ( SELECT l.address1, l.address1||decode(l.address2,null,null,';'||

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        3      0.00       0.00          0          0          0           0
Execute      3      0.01       0.01          0          0          0           0
Fetch     3840     33.82      34.76          0     619685          0      253415
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     3846     33.84      34.79          0     619685          0      253415
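A quick back-of-the-envelope check of these tkprof totals, written as a small Python sketch. The figures are copied from the Execute and Fetch rows of the output; the variable names are illustrative only.

```python
# Figures copied from the tkprof output for SQL ID 2f4375b2hn94s.
executions = 3
fetch_calls = 3840
rows_fetched = 253415
buffer_gets = 619685  # "query" column of the Fetch row

rows_per_execution = rows_fetched / executions
rows_per_fetch = rows_fetched / fetch_calls
gets_per_row = buffer_gets / rows_fetched

print(f"rows per execution:  {rows_per_execution:.0f}")
print(f"rows per fetch call: {rows_per_fetch:.1f}")
print(f"buffer gets per row: {gets_per_row:.2f}")
```

Roughly 84,000 rows per execution are returned to the client, which is consistent with the hypothesis that the Forms process grows because of the amount of data it receives.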
Warm-up 2: troubleshooting!
$ egrep "^\*\*\* MODULE|2f4375b2hn94s" PROD_ora_14363.trc | grep -B1 "2f4375b2hn94s"
*** MODULE NAME:(e:CSF:frm:CSXSRISR) 2013-08-21 03:46:26.026
PARSING IN CURSOR #140415761994912 len=2289 dep=0 uid=173 oct=3 lid=173 tim=1377071356826929 hv=3305776280 ad='262dad2180' sqlid='2f4375b2hn94s'
• Submitted an SR – still being worked on.
Warm-up 3: Performance! Measurement errors
Jože Senegačnik
I/O Bottleneck Problem
Response Time Component                         Time  % Elap    AvgEla
---------------------------------------- ----------- ------- ---------
CPU service                                 3934.97s  48.39%  0.000716
un-accounted for time                       1363.01s  16.76%
db file sequential read                     1122.00s  13.80%  0.032253
gc buffer busy                               451.73s   5.56%  0.011746
log buffer space                             451.64s   5.55%  0.123974
buffer busy waits                            176.79s   2.17%  0.029579
gc cr block 2-way                            156.49s   1.92%  0.003287
gc cr grant 2-way                            100.20s   1.23%  0.006541
latch: cache buffers chains                   98.92s   1.22%  0.005708
gc current grant 2-way                        69.68s   0.86%  0.006728
latch: library cache                          30.10s   0.37%  0.010030
row cache lock                                28.95s   0.36%  0.018727
gc current block 2-way                        26.72s   0.33%  0.003828
gc cr block busy                              19.35s   0.24%  0.006802
gc current grant busy                         15.30s   0.19%  0.004999
latch: row cache objects                      14.28s   0.18%  0.006165
gc cr block 3-way                             11.73s   0.14%  0.002952
gc current block 3-way                        11.34s   0.14%  0.003440
log file sync                                 10.71s   0.13%  0.315066
enq: SQ - contention                           9.14s   0.11%  0.060911
Timings for Single Block Reads – 3 node RAC
[Chart: Single Block Read Times From AWR. X axis: Snapshot ID, 20 minutes interval. Y axis: Single Block Read Time in Milliseconds, 0 to 50. Series: Inst 1, Inst 2, Inst 3.]
The Facts About The I/O Bottleneck Problem
• Facts:
– 3 node RAC
– Same storage
– Single block read time for Instance 1 was substantially different from read times for other instances during off hours
– 5 batch jobs during off hours
• The timings for Instance 1 are obviously not correct
• The sandwich syndrome (output from strace)
gettimeofday({1159440978, 931945}, NULL) = 0
pread(14, "\6\242\0\0\375\23\0\2+\254.\0\0\0\1\6\0054\0\0\1\0\5\0"..., 8192, 455057408) = 8192
gettimeofday({1159440978, 944159}, NULL) = 0
• Waiting in the run queue for CPU exaggerates all wait times of the process.
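The "sandwich" can be made concrete by computing the interval between the two gettimeofday calls that surround the pread. This is a minimal Python sketch, assuming strace prints the timestamps as {seconds, microseconds} as in the output on this slide. The function name elapsed_microseconds is hypothetical, and the pread buffer contents are abbreviated in the sample.

```python
import re

# Matches strace lines such as: gettimeofday({1159440978, 931945}, NULL) = 0
GETTIME_RE = re.compile(r"gettimeofday\(\{(\d+), (\d+)\}")

def elapsed_microseconds(strace_lines):
    """Wall-clock microseconds between the first and last gettimeofday call.

    Hypothetical helper for illustration only.
    """
    stamps = []
    for line in strace_lines:
        m = GETTIME_RE.search(line)
        if m:
            seconds, micros = int(m.group(1)), int(m.group(2))
            stamps.append(seconds * 1_000_000 + micros)
    if len(stamps) < 2:
        raise ValueError("need at least two gettimeofday calls")
    return stamps[-1] - stamps[0]

# Timestamps copied from the strace output; pread buffer abbreviated.
strace_output = [
    "gettimeofday({1159440978, 931945}, NULL) = 0",
    'pread(14, "..."..., 8192, 455057408) = 8192',
    "gettimeofday({1159440978, 944159}, NULL) = 0",
]

elapsed_us = elapsed_microseconds(strace_output)
print(f"recorded wait: {elapsed_us / 1000:.3f} ms")
```

The difference is 12214 microseconds, so about 12.2 ms is recorded for a single 8192-byte read. The interval covers everything between the two timestamps, so any time the process spent waiting for CPU in that window is counted as I/O wait as well.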
Warm-up 4: Performance! Database Connection Problems
Jože Senegačnik
Too many sessions
• 35000 sessions per instance
• 70000 at the database level (2-node RAC)
http://www.youtube.com/watch?v=xNDnVOCdvQ0