• Configure AIX to support Oracle with the best performance
• Set up Oracle to take advantage of AIX features
• Understand the joint effort to tune both Oracle and AIX and how they interact
The components of an Oracle server are :
• database = set of datafiles (data + redologs + control files)
• instance = memory structures allocated at startup + background processes (automatically forked at startup) associated with a database
• Database files
  - datafiles – contain data (tables, indexes…)
  - redologs – contain redo entries
  - control files – record the physical structure of the database
• Oracle memory structures = System Global Area (SGA) + Process Global Area (PGA)
• SGA
  - shared memory region that contains data and control information for one Oracle instance
  - allocated at instance startup and deallocated at instance shutdown
  - each instance has its own SGA
• PGA
  - memory buffer that contains data and control information for a server process
  - created by Oracle when a server process is started
  - contains the private SQL area
Oracle Server Architecture (cont’d)
IBM Systems & Technology Group – Technical Conference 2008
Virtual Memory Management
1 - AIX is started and applications load some computational pages into memory. As a UNIX system, AIX tries to take advantage of the free memory by using it as a file cache, to reduce the I/O on the physical drives.
2 - Activity increases and the DB needs more memory, but no free pages are available. LRUD (the AIX page stealer) starts to free some pages in memory.
3 - With the default settings, LRUD pages out some computational pages instead of removing pages from the file system cache.
Tuning goal : protect computational pages (programs, SGA, PGA) from being paged out and force LRUD to steal pages from the file system cache only.
Large Page Support – improves performance of prefetching
► On AIX part
● vmo -r -o lgpg_size=16777216 -o lgpg_regions=(SGA size / 16 MB)
● chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle (allows the Oracle user ID to use Large Pages)
● ldedit -b lpdata oracle (allows the oracle binary to use large page data)
● export LDR_CNTRL=LARGE_PAGE_TEXT=Y@LARGE_PAGE_DATA=M (set for the Oracle user ID before starting the Oracle instance and listener, to allow both large page text and large page data)
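The lgpg_regions value above is derived from the SGA size. A minimal sketch of that arithmetic in POSIX shell follows. The helper name lgpg_regions_for and the 10 GB SGA are illustrative assumptions, and rounding up to a whole region is also an assumption (the slide only gives "SGA size / 16 MB"). The vmo command is printed, not executed.

```shell
#!/bin/sh
# Sketch only: derive lgpg_regions from an SGA size (hypothetical helper).
LGPG_SIZE=16777216   # 16 MB large page size, as in the vmo command above

# Print the number of 16 MB regions needed to cover SGA_BYTES.
# Rounding up is an assumption, so that the whole SGA fits in large pages.
lgpg_regions_for() {
    sga_bytes=$1
    echo $(( (sga_bytes + LGPG_SIZE - 1) / LGPG_SIZE ))
}

# Example with an assumed 10 GB SGA: print the command rather than run it.
regions=$(lgpg_regions_for 10737418240)
echo "vmo -r -o lgpg_size=$LGPG_SIZE -o lgpg_regions=$regions"
```

Printing the command first makes it easy to review the value before applying it with vmo on a real system.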
► All Oracle memory structures were manually managed and were mostly static (db_block_buffers…)
9i : Dynamic memory resizing
► db_cache_size (dynamic parameter)
► sga_max_size (static parameter) – maximum size of the SGA for the lifetime of the instance.
► pga_aggregate_target (dynamic parameter) – specifies the target aggregate PGA memory available to all server processes attached to the instance.
► additional parameter : db_cache_advice (dynamic parameter) – enables or disables statistics gathering used for predicting behavior with different cache sizes.
10g : Automatic Shared Memory Management (ASMM)
► sga_target (dynamic) – if set, db_cache_size is automatically sized (as are shared_pool_size, large_pool_size and java_pool_size).
Can be increased up to sga_max_size.
To use Automatic Shared Memory Management, sga_target must be set to a nonzero value.
11g : Automatic Memory Management (AMM)
► memory_target (dynamic parameter) – specifies the total memory size to be used by the instance (SGA and PGA). Memory is exchanged between the SGA and the PGA according to need.
If sga_target and pga_aggregate_target are not set, the policy is to give 60% of memory_target to the SGA and 40% to the PGA.
► memory_max_target (static parameter) – specifies the maximum memory size for the instance.
To use Automatic Memory Management, memory_target must be set to a nonzero value.
See Metalink notes 443746.1 and 452512.1 explaining AMM and these new parameters.
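As a rough illustration of the 60% / 40% default policy described above, the following POSIX shell sketch computes the initial SGA and PGA shares for a given memory_target. The helper name amm_default_split and the 8000 MB value are hypothetical, and the real instance may round the sizes differently, so treat the numbers as an approximation only.

```shell
#!/bin/sh
# Sketch only: approximate the default AMM split (60% SGA / 40% PGA)
# that applies when sga_target and pga_aggregate_target are not set.

# amm_default_split MEMORY_TARGET_MB -> prints "sga=<n>M pga=<n>M"
amm_default_split() {
    memory_target_mb=$1
    sga_mb=$(( memory_target_mb * 60 / 100 ))   # integer arithmetic, truncates
    pga_mb=$(( memory_target_mb - sga_mb ))     # remainder goes to the PGA
    echo "sga=${sga_mb}M pga=${pga_mb}M"
}

# Example with an assumed memory_target of 8000 MB.
amm_default_split 8000
```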
► Create a Volume Group with an 8M, 16M or 32M PP size (the PP size will be the “stripe size”).
● AIX 5.2 : choose a Big VG and change the “t factor” : # mkvg -B -t <factor> -s <PPsize> ...
● AIX 5.3+ : choose a Scalable Volume Group : # mkvg -S -s <PPsize> ...
► Create the LV with the “Maximum range of physical volumes” option to spread the PPs across different hdisks in a round-robin fashion : # mklv -e x ...
jfs2 filesystem creation advice :
If you create a jfs2 filesystem on a striped (or PP-spread) LV, use the INLINE logging option. It avoids a “hot spot” by creating the log inside the filesystem (which is striped) instead of using a single PP stored on one hdisk.
# crfs -a logname=INLINE …
• Check whether the definition of your disk subsystem is present in the ODM.
• If the description shown in the output of “lsdev -Cc disk” contains the word “Other”, AIX does not have a correct definition of your disk device in the ODM and is using a generic device definition.
# lsdev -Cc disk
• In general, a generic device definition provides far from optimal performance because it does not properly customize the hdisk device.
Example : hdisks are created with queue_depth=1
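The “Other” check described above can be scripted. The sketch below is a hypothetical helper (find_generic_disks is not an AIX command) that reads lsdev-style output on standard input and prints the name of every disk whose description contains the word “Other”. The two sample lines are illustrative, not taken from a real system.

```shell
#!/bin/sh
# Sketch only: flag disks that use a generic ("Other") device definition.
# Reads `lsdev -Cc disk` style lines on stdin: name, status, location, description.
find_generic_disks() {
    # Scan from field 3 onward because the location column may be empty.
    awk '{ for (i = 3; i <= NF; i++) if ($i == "Other") { print $1; break } }'
}

# Illustrative sample input; on AIX you would run: lsdev -Cc disk | find_generic_disks
printf '%s\n' \
    "hdisk0 Available 00-08-00 SAS Disk Drive" \
    "hdisk1 Available 02-08-01 Other FC SCSI Disk Drive" | find_generic_disks
```

Any disk reported this way is a candidate for installing the vendor ODM definition and then checking attributes such as queue_depth.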
1. Contact your vendor or go to their web site to download the correct ODM definition for your storage subsystem. It sets up the hdisks properly for your hardware, for optimal performance.
2. If AIX is connected to the storage subsystem through several Fibre Channel cards for performance, don’t forget to install a multipath device driver or path control module :
- sdd or sddpcm for IBM DS6000/DS8000
- powerpath for EMC disk subsystems
- hdlm for Hitachi, etc.
Generic device definition : bad performance
IO : Filesystems Mount Options (DIO, CIO)
If Oracle data is stored in a filesystem, some mount options can improve performance :
• Direct IO (DIO) – introduced in AIX 4.3.
  - Data is transferred directly from the disk to the application buffer, bypassing the file buffer cache and hence avoiding double caching (filesystem cache + Oracle SGA).
  - Emulates a raw-device implementation.
  - To mount a filesystem in DIO : $ mount -o dio /data
• Concurrent IO (CIO) – introduced with jfs2 in AIX 5.2 ML1
  - Implicit use of DIO.
  - No inode locking : multiple threads can perform reads and writes on the same file at the same time.
  - Performance achieved using CIO is comparable to raw devices.
  - To mount a filesystem in CIO : $ mount -o cio /data
[Figure : bench throughput over run duration; higher tps indicates better performance.]
IO : Benefits of CIO for Oracle
• Benefits :
1. Avoids double caching : some data is already cached in the application layer (SGA)
2. Gives faster access to the backend disks and reduces CPU utilization
3. Disables the inode lock to allow several threads to read and write the same file (CIO only)
• Restrictions :
1. Because data transfers bypass the AIX buffer cache, jfs2 prefetching and write-behind can’t be used. These functions can be handled by Oracle.
⇒ (Oracle parameter) db_file_multiblock_read_count = 8, 16, 32, ... , 128 according to workload
2. When using DIO/CIO, IO requests made by Oracle must be aligned with the jfs2 block size to avoid a demoted IO (a return to normal IO after a Direct IO failure).
⇒ When you create a JFS2 filesystem, use the “mkfs -o agblksize=XXX” option to adapt the filesystem block size to the application’s needs.
Rule : IO request = n x agblksize
Examples : if DB blocksize > 4k, then jfs2 agblksize=4096
Redologs are always written in 512B blocks, so jfs2 agblksize must be 512
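The rule “IO request = n x agblksize” can be checked with simple modular arithmetic. The sketch below uses hypothetical helper names (is_io_aligned, report_io) and assumed request sizes. It only tests the size rule stated above and does not inspect a real filesystem.

```shell
#!/bin/sh
# Sketch only: check the rule "IO request = n x agblksize".

# is_io_aligned IO_SIZE AGBLKSIZE -> exit status 0 if IO_SIZE is a multiple of AGBLKSIZE
is_io_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# report_io IO_SIZE AGBLKSIZE -> prints whether the request would be demoted
report_io() {
    if is_io_aligned "$1" "$2"; then
        echo "$1 B request, agblksize $2: aligned"
    else
        echo "$1 B request, agblksize $2: not aligned (demoted IO)"
    fi
}

report_io 8192 4096   # assumed 8k DB block on a 4096-byte agblksize
report_io 512 4096    # redo log write on a 4096-byte agblksize
report_io 512 512     # redo log write on a 512-byte agblksize
```

This is why the redo logs are usually placed on their own filesystem created with agblksize=512, separate from the datafiles.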
• ASM is a database file system that provides cluster file system and volume manager capabilities. It is an alternative to conventional file system and LVM functions for Oracle datafiles.
• Integrated into the Oracle DB at no additional cost, for single-instance or RAC databases
• With ASM, the management of Oracle datafiles is the same for the DBA on all platforms (Unix, Linux, Windows)
• Datafiles are striped across all ASM disks : I/O is spread evenly to prevent hot spots and maximize performance
• Online add/drop of disk devices with automatic online redistribution of data
[Figure : ASM striping; labels : Storage (LUN 1 to LUN 4), AIX (hdisk1 to hdisk4), Oracle (ASM DISKGROUP)]
• ASM is implemented as a special kind of Oracle instance, with its own SGA and background processes (by default, less than 100M of SGA is allocated for the ASM instance). It generates very little activity except during rebalance operations.
• A single ASM instance can service one or more database instances.
• Can maintain redundant copies of data
• An ASM-managed database will have approximately the same performance as a database implemented on raw devices.
[Figure : ASM Instance, ASM DISKGROUP, Database on ASM]
create tablespace DATA
datafile '+DG/data01.dbf' size 15G;
With fast_path, I/Os are queued directly from the application into the LVM layer without any “aioservers kproc” operation.
• Better performance compared to non-fast_path
• No need to tune the min and max aioservers
• No aioserver processes, so “ps -k | grep aio | wc -l” is not relevant; use “iostat -A” instead
IO : AIO,DIO/CIO & Oracle Parameters
Values of the filesystemio_options initialization parameter :
• ASYNCH : enables asynchronous I/O on file system files (default)
• DIRECTIO : enables direct I/O on file system files (disables AIO)
• SETALL : enables both asynchronous and direct I/O on file system files
• NONE : disables both asynchronous and direct I/O on file system files
Since version 10g, Oracle opens data files located on a JFS2 file system with the O_CIO option if the filesystemio_options initialization parameter is set to either DIRECTIO or SETALL.
Advice : set this parameter to ‘setall’…
Note : set the disk_asynch_io parameter to ‘true’ as well
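One possible way to apply the advice on filesystemio_options and disk_asynch_io is sketched below as a shell function that prints the corresponding SQL statements. The function name print_io_params_sql is hypothetical, and the sketch assumes the instance uses an spfile. Both parameters are static, so the change only takes effect after the instance is restarted.

```shell
#!/bin/sh
# Sketch only: print the SQL that sets the two parameters discussed above.
# Assumption: the instance is started from an spfile.
print_io_params_sql() {
    cat <<'EOF'
-- Static parameters: restart the instance for the change to take effect.
ALTER SYSTEM SET filesystemio_options = 'SETALL' SCOPE = SPFILE;
ALTER SYSTEM SET disk_asynch_io = TRUE SCOPE = SPFILE;
EOF
}

# Print the statements for review; they could then be run in SQL*Plus as SYSDBA.
print_io_params_sql
```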
• Feature available on POWER5 and POWER6 systems running AIX 5.3 or 6.1
• Allows 2 instruction paths to share access to the POWER execution units on every clock cycle
• Instruction path = “virtual processor” = logical POWER processor (i.e. a server with 4 physical processors will have 8 logical processors with SMT enabled)
• SMT makes use of resources that would otherwise go unexploited and can achieve significant throughput gains – up to 40% greater performance
• No source code modification needed to use SMT
How to control SMT :
• smtctl [-m on | off] [-w boot | now]
• “-m on” enables the SMT mode, “-m off” disables it
• “-w boot” makes the SMT change effective on reboot
• “-w now” makes the SMT change effective immediately but not persistent across reboots
Monitoring & Tuning (cont’d) : System Monitoring Tools
• vmstat – useful for obtaining an overall picture of CPU, paging, and memory usage
• lvmstat – useful for getting logical volume I/O statistics
• iostat – reports statistics on disk activity (and also on terminals, CPU, adapters)
• lparstat – reports logical partition related information and statistics
• nmon – complete tool which gives information on all the components of the system
• filemon – uses the trace facility to report on the I/O activity of physical volumes, logical volumes, individual files, and the Virtual Memory Manager
• xmperf – lets you define monitoring environments to supervise the performance of local and remote systems
• netstat – monitors network activity
...
• Based on AWR, invoked automatically every time a new AWR snapshot is generated
• Powerful self-diagnostic engine which :
  - analyzes the system
  - identifies major problems
  - recommends corrective actions
• ADDM reports are generated by running $ORACLE_HOME/rdbms/admin/addmrpt.sql and providing 2 snapshot IDs. They can also be generated using Grid Control.
• In 11g, for RAC clusters, ADDM analyzes the system at the cluster level.
SQL Tuning Advisor (STA)
• Detects missing or stale statistics, missing indexes…
• Suggests new execution plan which can be applied to the system and used without any change in the application
• Using STA through Grid Control is advised, but it can also be invoked using the DBMS_SQLTUNE package.
• Implementing Oracle on AIX and getting the best from it is the result of a joint effort between the OS administrator and the Oracle DBA.
• The tuning process starts with a first-guess parameter setting, followed by iterations that progressively change the configuration to get the best results…
• Prefer JFS2 + Concurrent IO or ASM.
• Always use Asynchronous IO servers and, if available, use FastPath.
• Implement SMT if available on your system.
• Use AIX, Oracle and storage monitoring tools (AIX : nmon, topas – Oracle : AWR & Grid Control – Storage : TPC…)
• … and feel free to contact us for additional questions!!!
Addendum - Certification matrix : Oracle database Enterprise Edition on IBM AIX based Systems
When upgrading to Oracle 10gR2 on AIX 6.1, download and run an updated version of the rootpre script before proceeding with the Oracle Database install (refer to bug 6613550 for more details).
10 April 2008