CHAPTER 22 Database Troubleshooting
Feb 23, 2016

kreeli

Page 1: Chapter 22

CHAPTER 22 Database Troubleshooting

Page 2: Chapter 22

Introduction to Database Troubleshooting

• DBAs need to be experts at database troubleshooting.
• There are a wide variety of topics covered in this chapter:
  • Assessing database availability issues quickly.
  • Identifying system performance issues with operating system utilities.
  • Querying data dictionary views to display resource-intensive SQL statements.
  • Using Oracle performance tools to identify resource-consuming SQL statements.
  • Identifying and resolving locking issues.
  • Troubleshooting open cursor issues.
  • Investigating issues with the undo and temporary tablespaces.
  • Auditing database activities.

Page 3: Chapter 22

Checking Server and Database Availability

• A successful connection with the following command confirms that:
  • The database is up and available
  • The listener is up
  • The network is up

$ sqlplus barts/l1sa@'dwdb1:1521/dwrep1'

• Use ping to see if the box can be reached over the network.
• Use telnet to establish server availability on a given port.
• Use tnsping to troubleshoot Oracle Net connectivity.
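The layered checks above can be sketched as a small helper that splits the connect string and prints one command per layer. This is only an illustration: `connectivity_checks` is a hypothetical name, the `-c 3` ping flag assumes a Linux-style ping, and passing `host:port/service` to tnsping assumes EZCONNECT naming is enabled on the client.

```shell
# Hypothetical helper: split host:port/service and print the checks to run.
# It only prints the commands; it does not execute them.
connectivity_checks() {
  ezconnect=$1
  host=${ezconnect%%:*}        # text before the first ':'
  rest=${ezconnect#*:}         # text after the first ':'
  port=${rest%%/*}             # text before the first '/'
  service=${rest#*/}           # text after the first '/'
  echo "ping -c 3 $host"                   # is the box reachable?
  echo "telnet $host $port"                # is the listener port open?
  echo "tnsping $host:$port/$service"      # does Oracle Net respond?
}

connectivity_checks 'dwdb1:1521/dwrep1'
```

Working from the network layer upward narrows down which layer is failing when the sqlplus connection itself does not succeed.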

Page 4: Chapter 22

Investigating Disk Fullness

• When first logging on to a box, check for a full mount point, which is one issue that will cause a database to hang or have problems.
• The df command with the human-readable -h switch assists with verifying disk fullness:

$ df -h

• The du command is also useful for investigating where space is being consumed:

$ du -sh
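When there are many mount points, it can help to filter the df output down to the ones that are nearly full. The following is a minimal sketch, not part of the original slides: `full_mounts` is a hypothetical name, it expects the POSIX `df -P` layout (capacity in column 5, mount point in column 6), and it does not handle mount points that contain spaces.

```shell
# Hypothetical helper: print mount points at or above a usage threshold.
# Reads df -P style output on stdin.
full_mounts() {
  threshold=$1
  awk -v t="$threshold" 'NR > 1 {      # skip the header line
    pct = $5
    sub(/%/, "", pct)                  # strip the trailing percent sign
    if (pct + 0 >= t + 0) print $6, $5
  }'
}

# Usage: df -P | full_mounts 90
```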

Page 5: Chapter 22

Locating the Alert Log and Trace Files

• Troubleshooting an unavailable database oftentimes begins with inspecting the alert.log.

• Here’s an OS function that can help you navigate to the alert.log location:

function bdump {
  # Change to the trace directory that matches the current ORACLE_SID
  if [ "$ORACLE_SID" = "O12C" ]; then
    cd /orahome/app/oracle/diag/rdbms/o12c/O12C/trace
  elif [ "$ORACLE_SID" = "O11R2" ]; then
    cd /orahome/app/oracle/diag/rdbms/o11r2/O11R2/trace
  fi
}
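The function above hardcodes one branch per database. A possible generalization, shown here only as a sketch, builds the path from the SID instead. `trace_dir` is a hypothetical name, and it assumes the database unique name is simply the lowercase form of the SID (as in both paths above) and that the ADR base is passed in rather than discovered.

```shell
# Hypothetical helper: build the ADR trace directory for a given SID.
# Assumes db unique name = lowercase SID; the ADR base is an argument.
trace_dir() {
  sid=$1
  adr_base=$2
  db_name=$(printf '%s' "$sid" | tr '[:upper:]' '[:lower:]')
  printf '%s/diag/rdbms/%s/%s/trace\n' "$adr_base" "$db_name" "$sid"
}

# Usage: cd "$(trace_dir "$ORACLE_SID" /orahome/app/oracle)"
```

If the unique name differs from the SID (for example, in a standby configuration), the lowercase assumption does not hold and the path has to be looked up instead.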

Page 6: Chapter 22

Removing Files

• Linux/Unix environments provide a wide variety of ways to identify the size and age of files.
• When troubleshooting, oftentimes it’s useful to find files over a certain size or age:

$ find . -type f -mtime +2 -name "*.trc"

• Then to delete files identified by the find command (be very careful when doing this):

$ find . -type f -mtime +2 -name "*.trc" | xargs rm
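Piping file names through xargs can misbehave when a name contains spaces or newlines. One common alternative, sketched here as a variation rather than as the slide author's method, is to let find hand the names to rm directly with `-exec ... {} +`. The function name `purge_old_traces` is hypothetical.

```shell
# Hypothetical helper: remove *.trc files older than N days under a directory.
# -exec ... {} + passes the names straight to rm, so unusual file names
# (spaces, newlines) are not split into separate arguments.
purge_old_traces() {
  dir=$1
  days=$2
  find "$dir" -type f -mtime +"$days" -name '*.trc' -exec rm -f -- {} +
}

# Usage: purge_old_traces . 2
```

Running the same find without the `-exec` clause first shows exactly which files would be removed.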

Page 7: Chapter 22

Viewing the Alert Log via OS Tools

• After navigating to the directory that contains the alert.log, you can view the most current messages by viewing the end of the file (the most current messages are written to the end of the file).
• To view the last 50 lines, use the tail command:

$ tail -50 alert_<SID>.log

• You can continuously view the most current entries by using the -f switch:

$ tail -f alert_<SID>.log

• You can also directly open the alert.log with an operating system editor (such as vi):

$ vi alert_<SID>.log
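Since the most recent entries are at the end of the file, tail combines naturally with grep to pull out only the error lines. This is a small sketch with a hypothetical function name, `recent_ora_errors`; it assumes errors of interest appear with the usual `ORA-` prefix.

```shell
# Hypothetical helper: show ORA- errors among the last N lines of an alert log.
recent_ora_errors() {
  logfile=$1
  lines=${2:-50}                 # default to the last 50 lines
  tail -n "$lines" "$logfile" | grep 'ORA-'
}

# Usage: recent_ora_errors alert_O12C.log 200
```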

Page 8: Chapter 22

Viewing the alert.log Using the ADRCI Utility

• If you're using Oracle Database 11g or higher, you can use the ADRCI utility to view the contents of the alert.log file.
• Run the following command from the operating system to start the ADRCI utility:

$ adrci

• You should be presented with a prompt:

adrci>

• Use the SHOW ALERT command to view the alert.log file:

adrci> show alert

• If there are multiple Oracle homes on the server, then you will be prompted to choose which alert.log you want to view.

Page 9: Chapter 22

Identifying System Bottlenecks Using vmstat

• Linux/Unix environments provide a wide variety of command line tools for troubleshooting system bottlenecks.

• The vmstat utility displays real-time performance information about processes, memory, paging, disk I/O, and CPU usage.

$ vmstat 2 10
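In the command above, the first number is the sampling interval in seconds and the second is the number of samples. As a rough sketch of how the output might be summarized, the hypothetical function below averages the CPU idle column. It assumes a Linux-style layout in which a header row names the column `id`, and it finds the column by name because positions vary between platforms. Note that the first sample row typically reports averages since boot, so it skews short runs.

```shell
# Hypothetical helper: average the "id" (CPU idle) column of vmstat output.
avg_idle() {
  awk '
    col == 0 {                       # still looking for the column header row
      for (i = 1; i <= NF; i++) if ($i == "id") col = i
      next
    }
    $1 ~ /^[0-9]+$/ { sum += $col; n++ }   # sample rows start with a number
    END { if (n > 0) printf "%.1f\n", sum / n }
  '
}

# Usage: vmstat 2 10 | avg_idle
```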

Page 10: Chapter 22

Identifying System Bottlenecks Using top

• Another tool for identifying resource-intensive processes is the top command. Use this utility to quickly identify which processes are the highest consumers of resources on the server.

• By default, top will repetitively refresh (every few seconds) information regarding the most CPU-intensive processes.

$ top

Page 11: Chapter 22

Mapping an Operating System Process to a SQL Statement

• Oftentimes when you have multiple databases running on a single server, it’s difficult to pinpoint which database and process may be consuming inordinate amounts of system resources.

• In these situations, use ps to identify the process and its associated database.

• Then use a SQL statement to identify what the process is doing within the database.

Page 12: Chapter 22

Mapping an Operating System Process to a SQL Statement

$ ps -e -o pcpu,pid,user,tty,args | sort -n -k 1 -r | head

• Once the process and associated database are identified, run a query similar to this one to identify details about the process:

select 'USERNAME : ' || s.username || chr(10) ||
       'OSUSER : '   || s.osuser   || chr(10) ||
       'PROGRAM : '  || s.program  || chr(10) ||
       'SPID : '     || p.spid     || chr(10) ||
       'SID : '      || s.sid      || chr(10) ||
       'SERIAL# : '  || s.serial#  || chr(10) ||
       'MACHINE : '  || s.machine  || chr(10) ||
       'TERMINAL : ' || s.terminal
from v$session s, v$process p
where s.paddr = p.addr
and p.spid = '&PID_FROM_OS';
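To tie a line of ps output back to a database, it helps to know how Oracle names its processes on Linux/Unix: background processes look like `ora_pmon_O12C`, and dedicated server processes look like `oracleO12C (LOCAL=NO)`. The sketch below extracts the SID from the args column under those conventions. `sid_from_args` is a hypothetical name, and the approach breaks down for SIDs that themselves contain an underscore.

```shell
# Hypothetical helper: derive the Oracle SID from a ps "args" value.
sid_from_args() {
  case $1 in
    ora_*_*)                         # background process: ora_<name>_<SID>
      printf '%s\n' "${1##*_}" ;;
    oracle*)                         # server process: oracle<SID> (LOCAL=NO)
      name=${1%% *}                  # drop everything after the first space
      printf '%s\n' "${name#oracle}" ;;
    *)
      return 1 ;;                    # not recognized as an Oracle process
  esac
}

sid_from_args 'ora_pmon_O12C'            # prints O12C
sid_from_args 'oracleO11R2 (LOCAL=NO)'   # prints O11R2
```

With the SID known, set ORACLE_SID accordingly, connect to that database, and supply the pid from ps as the PID_FROM_OS value in the query above.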

Page 13: Chapter 22

Monitoring Real-Time SQL Execution Statistics

• The V$SQL_MONITOR view provides near real-time statistics on resource-intensive SQL queries:

select * from (
select a.sid session_id, a.sql_id, a.status,
       a.cpu_time/1000000 cpu_sec, a.buffer_gets, a.disk_reads,
       b.sql_text sql_text
from v$sql_monitor a, v$sql b
where a.sql_id = b.sql_id
order by a.cpu_time desc)
where rownum <= 20;

Page 15: Chapter 22

Displaying Resource Intensive SQL

• To display queries currently executing:

select * from (
select a.sid session_id, a.sql_id, a.status,
       a.cpu_time/1000000 cpu_sec, a.buffer_gets, a.disk_reads,
       substr(b.sql_text,1,15) sql_text
from v$sql_monitor a, v$sql b
where a.sql_id = b.sql_id
and a.status = 'EXECUTING'
order by a.disk_reads desc)
where rownum <= 20;

Page 16: Chapter 22

Oracle provides several utilities for diagnosing database performance issues:

• Automatic Workload Repository (AWR)
• Automatic Database Diagnostic Monitor (ADDM)
• Active Session History (ASH)
• Statspack

Page 17: Chapter 22

Using AWR

• An AWR report is good for viewing system-wide performance and identifying the top resource-consuming SQL queries.

• Run the following script to generate an AWR report:

SQL> @?/rdbms/admin/awrrpt

• You can also generate an AWR report for a specific SQL statement by running the awrsqrpt.sql script.

• You will be prompted for the SQL_ID of the query of interest:

SQL> @?/rdbms/admin/awrsqrpt.sql

Page 18: Chapter 22

Using ADDM

• ADDM is useful as a first place to look for database performance issues.
• Use the DBMS_ADDM PL/SQL package or the Oracle-provided SQL script to view the output of ADDM.
• You need to provide as input (you’ll be prompted) a begin and end snapshot ID:

SQL> @?/rdbms/admin/addmrpt

Page 19: Chapter 22

Using ASH

• Useful for identifying short-lived performance problems.
• Use the Oracle-provided script to manually run this utility:

SQL> @?/rdbms/admin/ashrpt

Page 20: Chapter 22

Using Statspack

• Statspack does not require an extra license.
• Provides much of the same type of information as the AWR reports.
• First create the repository and the PERFSTAT user:

SQL> @?/rdbms/admin/spcreate.sql

• To enable the automatic gathering of Statspack statistics, run this script:

SQL> @?/rdbms/admin/spauto.sql

• After some snapshots have been gathered, you can run the following script as the PERFSTAT user to create a Statspack report:

SQL> @?/rdbms/admin/spreport.sql

Page 21: Chapter 22

Detecting and Resolving Locking Issues

• Oracle provides several data dictionary views to help diagnose locking issues:

  • V$LOCK
  • V$LOCKED_OBJECT
  • V$SQLAREA
  • V$SESSION

Page 22: Chapter 22

Resolving Open Cursor Issues

• Run a query such as the following to determine the number of open cursors each session has opened:

select a.value, c.username, c.machine, c.sid, c.serial#
from v$sesstat a, v$statname b, v$session c
where a.statistic# = b.statistic#
and c.sid = a.sid
and b.name = 'opened cursors current'
and a.value != 0
and c.username IS NOT NULL
order by 1,2;

• You may need to increase the value of OPEN_CURSORS to resolve issues.
• Another common issue is code that doesn’t correctly close a cursor.

Page 23: Chapter 22

Determining if Undo is Correctly Sized

• An undersized undo tablespace can cause ORA-01555 (snapshot too old) errors for long-running queries and out-of-space errors for transactions that cannot allocate undo.

• To see how often these errors have occurred over the past day, query V$UNDOSTAT:

select to_char(begin_time,'MM-DD-YYYY HH24:MI') begin_time,
       ssolderrcnt ORA_01555_cnt,
       nospaceerrcnt no_space_cnt,
       txncount max_num_txns,
       maxquerylen max_query_len,
       expiredblks blck_in_expired
from v$undostat
where begin_time > sysdate - 1
order by begin_time;

Page 24: Chapter 22

Viewing SQL that is Consuming Undo Space

• Sometimes you may face a situation where you keep running out of space in the UNDO tablespace.
• In these situations it's helpful to identify which users are consuming space in the undo tablespace.
• Run this query to report on basic information regarding space allocated on a per-user basis (the 16384 multiplier assumes a 16KB undo block size; adjust it to match your database):

select s.sid, s.serial#, s.osuser, s.logon_time,
       s.status, s.machine, t.used_ublk,
       t.used_ublk*16384/1024/1024 undo_usage_mb
from v$session s, v$transaction t
where t.addr = s.taddr;
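The undo_usage_mb expression in the query is plain arithmetic: blocks used, times bytes per block, divided down to megabytes. The sketch below repeats that calculation outside the database to show how the result depends on block size. `undo_mb` is a hypothetical name; the real block size for a tablespace can be read from DBA_TABLESPACES.BLOCK_SIZE.

```shell
# Hypothetical helper: convert a count of undo blocks to megabytes.
undo_mb() {
  used_ublk=$1                   # value of V$TRANSACTION.USED_UBLK
  block_size=$2                  # block size in bytes
  awk -v b="$used_ublk" -v s="$block_size" \
    'BEGIN { printf "%.2f\n", b * s / 1024 / 1024 }'
}

undo_mb 128 8192      # prints 1.00
undo_mb 128 16384     # prints 2.00
```

The same 128 blocks report as twice the space when a 16KB multiplier is applied to an 8KB tablespace, which is why the hardcoded constant should match the actual block size.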

Page 25: Chapter 22

Determining if Temporary Tablespace is Sized Correctly

• There is no exact formula to determine the optimal size of the temporary tablespace.
• You’ll have to monitor your temporary tablespace while there is a load on your database to determine its usage patterns.
• If you are using Oracle Database 11g or higher, run the following query to show both the allocated and free space within the temporary tablespace:

select tablespace_name,
       tablespace_size/1024/1024 mb_size,
       allocated_space/1024/1024 mb_alloc,
       free_space/1024/1024 mb_free
from dba_temp_free_space;

Page 26: Chapter 22

Viewing SQL that is Consuming Temporary Space

• When troubleshooting temporary tablespace issues, run a query such as this to show how resources are being consumed:

SELECT s.sid, s.serial#, s.username, p.spid, s.module, p.program,
       SUM(su.blocks) * tbsp.block_size/1024/1024 mb_used,
       su.tablespace
FROM v$sort_usage    su,
     v$session       s,
     dba_tablespaces tbsp,
     v$process       p
WHERE su.session_addr = s.saddr
AND   su.tablespace   = tbsp.tablespace_name
AND   s.paddr         = p.addr
GROUP BY s.sid, s.serial#, s.username, s.osuser, p.spid, s.module,
         p.program, tbsp.block_size, su.tablespace
ORDER BY s.sid;

Page 27: Chapter 22

Summary

• Database troubleshooting covers a wide variety of topics and subject areas.
• This chapter focused on some common database problems you’ll encounter.
• As a DBA you should be aware of what kinds of problems can happen, how to diagnose issues, and how to resolve them in a timely manner.