Oracle Database 12c integrates various features to provide the highest level of availability for your data, even in a single-instance deployment. Protection against data corruption in an Oracle Database starts at the block level, and Oracle Flashback technology can be used to recover from human errors. Oracle Automatic Storage Management (ASM) complements this data protection at the storage level, while Oracle Real Application Clusters (RAC) One Node adds an easy way to recover from server failures and simplifies maintenance operations. Oracle RAC takes local high availability (HA) to its optimum by further increasing protection against server failures and adding scalability on demand. Application Continuity (AC) completes the picture by masking recoverable database failures from the application and thereby from the end user. This presentation focuses on the local HA features of the Oracle Database and gives an overview of how these features can be used to provide well-defined protection levels.
Oracle Database with Real Application Clusters (RAC) 12c High Availability Best Practices Markus Michalewicz Director of Product Management Oracle Real Application Clusters (RAC)
RMAN, Oracle Secure Backup – Backup to disk, tape or cloud
Enterprise Manager Cloud Control – Coordinated Site Failover
Application Continuity – Application HA
Global Data Services – Service Failover / Load Balancing
Standardize on Oracle RAC and Oracle Multitenant The new standard for Oracle Database Consolidation
[Figure: diagram labeled "Consolidation" and "Agility", showing Oracle GI with Oracle RAC One Node and Oracle GI with Oracle RAC]
Commonwealth Bank
§ The Commonwealth Bank is one of Australia’s leading providers of integrated financial services including retail, business and institutional banking, funds management, superannuation, insurance, investment and broking services. The Bank is one of the largest listed companies on the Australian Stock Exchange.
Introduction
§ In 2007 CBA set out to create an Oracle database shared service offering for the bank
• The offering has been highly successful by several measures
• Oracle-as-a-Service has continued to be developed through several iterations
§ Oracle as a Service (OaaS) v1 – went live May 2008
§ Host many Oracle database applications on a cluster of hardware
§ Processor consolidation
• Run each server hotter
• Take advantage of complementary workload peaks
§ Higher Availability
• Load balancing
• HA failover for component failure
• Standby DR
• Most apps do not implement these features – too expensive
§ Cost Reduction
§ Better Service
• Full time experts
• Always on-call
§ Reduced Risk
• Whole environment is managed
• Operated as a “business”
Oracle as a Service In A Nutshell
Reduce Risk, Improve Time to Market
§ For new Projects:
• Remove a phase from the project – infrastructure already in place
• Remove reliance on expensive/scarce SME resources for design and build
• No longer need to manage risk associated with procurement and build
• Time to instantiate a new Production quality environment: 3 months -> 2 minutes
§ Example: New ISV Application introduced into our Online Share Trading platform
• Required to test performance under the workload & data volume conditions projected in 2 years' time
                         Dedicated Infrastructure        OaaS
Implementation Time      3-4 months                      few hours
$ Cost to Project        Several hundred thousand        < $10K
On Project Completion    Under-utilized asset remains    Environment turned off
• Minimize the cost of HA
– Use HA features included with Oracle Database
– Utilize backups to protect against media and site failures
– Secure offsite tape storage (in the cloud) for DR
• Optionally
– Consolidate with Oracle Multitenant
– Improve HA with RAC One Node
– Self-Service provisioning with
§ ASM supports ALL data – database files, file systems, Clusterware files (OCR, Voting Disk)
§ Built-in mirroring protects from disk failures
§ Auto-repair of corrupt blocks using a valid mirror copy
[Figure: Automatic Storage Management stack. Application, 3rd Party FS and Database sit on top of the ASM Cluster & Single Node File System (ACFS), with ACFS Snapshot, and the ASM Disk Group, which holds DB Datafiles, OCR and Voting Files, Oracle Binaries and 3rd Party File Systems]
Policy-Managed Databases with Oracle RAC 12c Improved HA Management – New Failover Strategy
[Figure: database raccdb1 on cluster nodes running Oracle GI and Oracle RAC; dasher (raccdb1_4) and vixen (raccdb1_3) in srvpool frontoffice, comet (raccdb1_1) in srvpool backoffice; dancer (raccdb1_2) shown first in srvpool backoffice and then in srvpool frontoffice]
§ Servers “Move” to Replace Failed node
– Protects against cascade failures due to load
– Ensures workload isolation between pools
– Less important workloads
User selects product from application and purchases it from the web checkout
The user transaction arrives at the application infrastructure. It makes its way through the application tiers and results in a database transaction being created
The JDBC driver detects the failure and checks with an available node in the cluster, using “Transaction Guard”, whether the transaction committed or needs to be replayed
If the transaction needs to be replayed, “Application Continuity” will submit all of the in-flight work to a surviving node in the cluster and perform a commit. This all happens transparently to the application
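The decision made in the last two steps can be sketched as follows. This is a minimal Python model of the idea, not the JDBC driver's actual logic: the `Outcome` enum, the `recover` function and the `replay` callback are hypothetical names introduced purely for illustration.

```python
from enum import Enum


class Outcome(Enum):
    """Possible answers to "did the in-flight transaction commit?"."""
    COMMITTED = "committed"
    NOT_COMMITTED = "not committed"


def recover(outcome, inflight_calls, replay):
    """Decide how to continue after a recoverable failure.

    outcome        -- answer obtained from a surviving node (assumed input)
    inflight_calls -- calls recorded since the transaction started
    replay         -- callable that re-submits one call to a surviving node
    """
    if outcome is Outcome.COMMITTED:
        # The work is already durable; replaying it would apply it twice.
        return "committed"
    for call in inflight_calls:
        replay(call)
    replay("COMMIT")
    return "replayed"


if __name__ == "__main__":
    submitted = []
    calls = ["INSERT order", "UPDATE stock"]
    print(recover(Outcome.NOT_COMMITTED, calls, submitted.append))
    print(submitted)
```

The point of checking the outcome first is that a blind resubmission could commit the same purchase twice; only work known not to have committed is replayed.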
§ Flex ASM introduces new local resources: – At least one ASM listener
– One “proxy_advm” (per node)
§ Used for ACFS access to Flex ASM instances
§ Connections from a database instance to an ASM instance are based on SQLnet using listeners.
– The listener directs the connection to the least loaded ASM instance based on the load metric it maintains.
– The connection details are fetched from CSS global data
– The ASM instance to which the database instance connects is listed in the database alert log:
§ NOTE: ASMB connected to ASM instance +ASM1 (Flex mode; client id 0x10004)
– The userid and password supplied are also managed automatically. They are supplied while establishing the session, not while connecting.
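The "least loaded ASM instance" selection described above can be illustrated with a small sketch. It assumes the load metric is a single number per instance, which the slides do not specify; `pick_asm_instance` and the sample values are hypothetical.

```python
def pick_asm_instance(load_by_instance):
    """Return the name of the instance with the lowest load.

    load_by_instance -- dict mapping instance name to a numeric load value
                        (assumed shape; the real metric is not documented here)
    Ties are broken by instance name so the result is deterministic.
    """
    if not load_by_instance:
        raise ValueError("no ASM instance available")
    return min(sorted(load_by_instance), key=load_by_instance.__getitem__)


if __name__ == "__main__":
    loads = {"+ASM1": 3, "+ASM2": 1, "+ASM3": 1}
    print(pick_asm_instance(loads))
```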
[GRID]> crsctl stat res -t
-------------------------------------------------------------------------------
Name           Target  State        Server                   State details
-------------------------------------------------------------------------------
Local Resources
-------------------------------------------------------------------------------
Policy-Managed Databases with Oracle RAC 12c Database Services
[Figure: database raccdb1 on cluster nodes running Oracle GI | HUB and Oracle RAC; vixen and dasher in srvpool frontoffice, dancer in srvpool backoffice; instances raccdb1_2, raccdb1_3 and raccdb1_4]
[GRID]> srvctl status serverpool
Server pool name: frontoffice
Active Servers count: 2
Server pool name: backoffice
Active Servers count: 2
[RAC]> srvctl status service -d raccdb1
Service crmsvc is running on nodes: dasher,vixen
Service hrsvc is running on nodes: comet,dancer
§ Database Services
– Ensure that workload is hosted in the respective server pool with the cardinality defined as part of the policy definition.
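For scripted checks, the service placement shown in the `srvctl status service` output above can be turned into a dictionary. The parser below is a small sketch that only assumes the line formats visible in this presentation; `parse_service_status` is a hypothetical helper, not part of any Oracle tool.

```python
import re

# Line formats as they appear in the transcripts of this presentation.
RUNNING = re.compile(r"Service (\S+) is running on nodes:?\s*(\S+)")
STOPPED = re.compile(r"Service (\S+) is not running")


def parse_service_status(text):
    """Map each service name to the list of nodes it runs on."""
    placement = {}
    for line in text.splitlines():
        match = RUNNING.search(line)
        if match:
            placement[match.group(1)] = match.group(2).split(",")
            continue
        match = STOPPED.search(line)
        if match:
            placement[match.group(1)] = []
    return placement


if __name__ == "__main__":
    sample = (
        "Service crmsvc is running on nodes: dasher,vixen\n"
        "Service hrsvc is running on nodes: comet,dancer\n"
    )
    print(parse_service_status(sample))
```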
Policy-Managed Databases with Oracle RAC 12c Improved HA Management - Startup
[Figure: database raccdb1]
[GRID]> srvctl config serverpool
Server pool name: frontoffice
Importance: 10, Min: 2, Max 2
Server pool name: backoffice
Importance: 5, Min: 1, Max 1
Server pool name: Free
Importance: 0, Min: 0, Max -1
§ Defining a Service Startup Order
– Ensure services are started in specified groups and specified order.
– Leverages Serverpool Min and Importance properties
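How Min, Max and Importance interact when servers join the cluster can be modeled roughly as below. This is a simplified sketch, assuming pools are filled to their Min in order of Importance, then to their Max in the same order, with leftovers going to Free; it ignores server categories and any other placement rules, and `assign_servers` is a hypothetical name.

```python
def assign_servers(pools, server_count):
    """Return how many servers each pool receives.

    pools        -- list of dicts with keys name, importance, min, max
                    (max of -1 means unlimited)
    server_count -- number of servers available in the cluster
    """
    ordered = sorted(pools, key=lambda p: p["importance"], reverse=True)
    counts = {p["name"]: 0 for p in ordered}
    remaining = server_count

    # Pass 1: satisfy each pool's minimum, most important pool first.
    for pool in ordered:
        take = min(pool["min"], remaining)
        counts[pool["name"]] += take
        remaining -= take

    # Pass 2: grow pools towards their maximum, most important pool first.
    for pool in ordered:
        if pool["max"] < 0:
            room = remaining
        else:
            room = max(pool["max"] - counts[pool["name"]], 0)
        take = min(room, remaining)
        counts[pool["name"]] += take
        remaining -= take

    counts["Free"] = remaining
    return counts


if __name__ == "__main__":
    config = [
        {"name": "frontoffice", "importance": 10, "min": 2, "max": 2},
        {"name": "backoffice", "importance": 5, "min": 1, "max": 1},
    ]
    print(assign_servers(config, 3))
```

With the configuration shown on this slide and only two servers up, the model gives both servers to frontoffice and none to backoffice, which is why services in the more important pool start first.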
Policy-Managed Databases with Oracle RAC 12c Improved HA Management – New Failover Strategy
[Figure: database raccdb1 on cluster nodes running Oracle GI | HUB and Oracle RAC; dasher (raccdb1_4) and vixen (raccdb1_3) in srvpool frontoffice, comet (raccdb1_1) in srvpool backoffice]
[GRID]> srvctl config serverpool
Server pool name: frontoffice
Importance: 10, Min: 2, Max 2
Server pool name: backoffice
Importance: 5, Min: 1, Max 1
[GRID]> srvctl status serverpool
Server pool name: frontoffice
Active Servers count: 2
Server pool name: backoffice
Active Servers count: 1
§ Servers “Move” to Replace Failed node
– Protects against cascade failures due to load
– Ensures workload isolation between pools
– Less important workloads are shut down transactionally
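The "move" behaviour can be illustrated with a toy model. It assumes that a pool which drops below its Min after a failure takes a server from the least important pool that still has one; the real Clusterware placement logic is more involved, and `fail_server` as well as the choice of which node fails are hypothetical.

```python
def fail_server(assignment, pools, failed):
    """Return a new pool -> servers mapping after `failed` leaves the cluster.

    assignment -- dict mapping pool name to a list of server names
    pools      -- dict mapping pool name to {"importance": int, "min": int}
    failed     -- name of the server that went down
    """
    new = {name: [s for s in servers if s != failed]
           for name, servers in assignment.items()}
    by_importance = sorted(pools, key=lambda n: pools[n]["importance"],
                           reverse=True)
    for name in by_importance:
        while len(new[name]) < pools[name]["min"]:
            # Candidate donors: strictly less important pools that still
            # hold a server, least important first.
            donors = [d for d in reversed(by_importance)
                      if pools[d]["importance"] < pools[name]["importance"]
                      and new[d]]
            if not donors:
                break
            new[name].append(new[donors[0]].pop())
    return new


if __name__ == "__main__":
    pools = {"frontoffice": {"importance": 10, "min": 2},
             "backoffice": {"importance": 5, "min": 1}}
    start = {"frontoffice": ["dasher", "vixen"],
             "backoffice": ["comet", "dancer"]}
    after_one = fail_server(start, pools, "dasher")
    print(after_one)
    print(fail_server(after_one, pools, "dancer"))
```

Applying the function twice shows the more important pool keeping its two servers across successive failures while the less important pool runs empty.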
Policy-Managed Databases with Oracle RAC 12c Improved HA Management – Manage Last Service Standing
[Figure: database raccdb1 on cluster nodes running Oracle GI | HUB and Oracle RAC; dasher (raccdb1_4) and vixen (raccdb1_3) in srvpool frontoffice, comet (raccdb1_1) in srvpool backoffice]
[RAC]> srvctl config serverpool
Server pool name: frontoffice
Importance: 10, Min: 2, Max 2
Server pool name: backoffice
Importance: 5, Min: 1, Max 1
[RAC]> srvctl status service -db
Service crmsvc is running on nodes comet,vixen
Service hrsvc is not running
§ Business Critical Services survive multiple failures
– Most important pool always gets the servers
– Services preserved across multiple failures
– Less important workloads are shut down transactionally
Policy-Managed Databases with Oracle RAC 12c Improved HA Management – Dynamic Provisioning
[Figure: database raccdb1 on cluster nodes running Oracle GI | HUB and Oracle RAC; vixen (raccdb1_3) and dasher (raccdb1_4) in srvpool frontoffice, dancer (raccdb1_2) and comet (raccdb1_1) in srvpool backoffice]
[GRID]> srvctl modify serverpool -serverpool backoffice -max 1
[RAC]> srvctl config serverpool
Server pool name: frontoffice
Importance: 10, Min: 2, Max 4
Server pool name: backoffice
Importance: 5, Min: 1, Max 1
[RAC]> srvctl status service -db
Service crmsvc is running on nodes dasher,dancer,vixen
Service hrsvc is not running
§ Add Servers Just-In-Time to meet demand
– Server Pools sized via Min and Max properties
– Dynamically controlled by QoS Management
– Planned control via Clusterware or QoSM Policies
Policy-Management with Oracle RAC 12c Support for Multiple Policies tracking business objectives
§ More Information: – http://docs.oracle.com/cd/E16655_01/rac.121/e17886/pbmgmt.htm
§ New in Oracle Grid Infrastructure 12c
– Server Categories
§ Server Categories use server attributes to allow for an active use of differently sized servers
– Policy Sets
§ Policy Sets allow for dynamic adjustment to demand changes in an atomic transaction
Server Categories • NAME • ACL • EXPRESSION • …
Server Attributes • NAME • MEMORY_SIZE • CPU_COUNT • CPU_CLOCK_RATE • CPU_HYPERTHREADING • CPU_EQUIVALENCY • …
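Conceptually, a server category is a named filter over server attributes such as those listed above. The sketch below models that idea with plain Python predicates; it does not use the actual EXPRESSION syntax of Oracle Clusterware, and the `Server` class, the attribute values and the category rule are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """A few of the server attributes named on the slide (values are invented)."""
    name: str
    memory_size: int  # assumed unit: MB
    cpu_count: int


def servers_in_category(servers, predicate):
    """Return the names of all servers that satisfy the category's rule."""
    return [server.name for server in servers if predicate(server)]


if __name__ == "__main__":
    cluster = [
        Server("comet", memory_size=65536, cpu_count=16),
        Server("dancer", memory_size=65536, cpu_count=16),
        Server("dasher", memory_size=16384, cpu_count=4),
        Server("vixen", memory_size=16384, cpu_count=4),
    ]

    # Hypothetical category "big": at least 8 CPUs and 32 GB of memory.
    def is_big(server):
        return server.cpu_count >= 8 and server.memory_size >= 32768

    print(servers_in_category(cluster, is_big))
```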
[GRID]> crsctl modify policyset -attr "LAST_ACTIVATED_POLICY=NightTime"
CRS-2673: Attempting to stop 'ora.raccdb1.crmsvc.svc' on 'comet'
CRS-2673: Attempting to stop 'ora.raccdb1.crmsvc.svc' on 'dancer'
CRS-2677: Stop of 'ora.raccdb1.crmsvc.svc' on 'comet' succeeded
CRS-2673: Attempting to start 'ora.raccdb1.backup.svc' on 'comet'
CRS-2677: Stop of 'ora.raccdb1.crmsvc.svc' on 'dancer' succeeded
CRS-2672: Attempting to start 'ora.raccdb1.hrsvc.svc' on 'dancer'
Policy-Management with Oracle RAC 12c Setting up policy sets – Provision Server Pools and creating a PolicySet
§ Add another server pool “backup”
§ Set up policy set with 3 server pools & 3 policies as follows:
– DayTime:
§ frontoffice uses three servers (MIN_SIZE=3)
§ backoffice uses one server (MIN_SIZE=1)
§ backup does not run during daytime (MIN_SIZE=0)
– NightTime:
§ frontoffice uses one server (MIN_SIZE=1)
§ backoffice uses two servers (MIN_SIZE=2)
§ backup uses only one server (MIN_SIZE=1)
– Weekend:
§ frontoffice uses one server (MIN_SIZE=1)
§ backoffice uses one server (MIN_SIZE=1)
§ backup uses two servers (MIN_SIZE=2)
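The three policies above can be written down as plain data and sanity-checked. The sketch below uses the MIN_SIZE values from this slide and assumes the four-server cluster shown throughout the presentation; `fits` and `min_size_delta` are hypothetical helpers, not Clusterware functions.

```python
# MIN_SIZE per server pool for each policy, as listed on the slide.
POLICY_SET = {
    "DayTime": {"frontoffice": 3, "backoffice": 1, "backup": 0},
    "NightTime": {"frontoffice": 1, "backoffice": 2, "backup": 1},
    "Weekend": {"frontoffice": 1, "backoffice": 1, "backup": 2},
}


def fits(policy, server_count):
    """True if the cluster has enough servers to honor every MIN_SIZE."""
    return sum(POLICY_SET[policy].values()) <= server_count


def min_size_delta(current, target):
    """Change in MIN_SIZE per pool when switching between two policies."""
    return {pool: POLICY_SET[target][pool] - POLICY_SET[current][pool]
            for pool in POLICY_SET[current]}


if __name__ == "__main__":
    for name in POLICY_SET:
        print(name, fits(name, server_count=4))
    print(min_size_delta("DayTime", "NightTime"))
```

Each policy adds up to exactly four servers, so activating any of them redistributes the whole cluster without leaving a pool short of its minimum.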
[GRID]> srvctl add serverpool -serverpool backup -min 0 -max 2 -importance 20
[GRID]> srvctl status serverpool
Server pool name: frontoffice
Active Servers count: 3
Server pool name: backoffice
Active Servers count: 1
Server pool name: backup
Active Servers count: 0
Policy-Management with Oracle RAC 12c Using Policy Sets – part 2: check the result
[Figure: database raccdb1 on four nodes running Oracle GI | HUB and Oracle RAC; vixen in srvpool frontoffice, dasher in srvpool backoffice, comet and dancer in srvpool backup; instances raccdb1_1 to raccdb1_4]
[RAC]> srvctl status database -d raccdb1
Instance raccdb1_1 is running on node comet
Instance raccdb1_2 is running on node dancer
Instance raccdb1_3 is running on node vixen
Instance raccdb1_4 is running on node dasher
[RAC]> srvctl status service -d raccdb1
Service backup is running on nodes: comet,dancer
Service crmsvc is running on nodes: vixen
Service hrsvc is running on nodes: dasher
Policy-Management with Oracle RAC 12c Adding Server Categories to the picture – part 1
§ Assume you have 2 servers that have better IO
– Use these servers for backups whenever possible
§ Here comet and dancer have better IO by definition
§ What you need to do:
– Set up a server category that identifies the servers
– Add the use of the server category to the server pool
§ Define the server pools that utilize the category and during which policy activation it shall be used.
§ You need to restart the cluster stack on the servers that you modify in this fashion
[GRID]> su
Password:
[GRID]> crsctl set server label IOplus
...
#On dancer
[GRID]> crsctl set server label Ioplus
[GRID]> crsctl get server label
CRS-4972: Current SERVER_LABEL parameter value is Ioplus
[GRID]> crsctl status server comet dancer -f
Comet Dancer
[GRID]> crsctl modify policyset -attr "LAST_ACTIVATED_POLICY=DayTime"
[GRID]> srvctl status serverpool
Server pool name: frontoffice
Active Servers count: 3
Server pool name: backoffice
Active Servers count: 1
Server pool name: backup
Active Servers count: 0
[RAC]> srvctl status service -d raccdb1
Service backup is not running.
Service crmsvc is running on nodes: dasher,vixen,comet
Service hrsvc is running on nodes: dancer
Policy-Management with Oracle RAC 12c Using Policy Sets means changing policies on a push of a button
[Figure: database raccdb1 on four nodes running Oracle GI | HUB and Oracle RAC; dancer and dasher shown in srvpool frontoffice; instances raccdb1_1 to raccdb1_4]
[GRID]> date; crsctl modify policyset -attr "LAST_ACTIVATED_POLICY=DayTime"; date
Mon Sep 16 19:26:42 PDT 2013
CRS-2673: Attempting to stop 'ora.raccdb1.backup.svc' on 'dancer'
CRS-2673: Attempting to stop 'ora.raccdb1.backup.svc' on 'comet'
CRS-2677: Stop of 'ora.raccdb1.backup.svc' on 'dancer' succeeded
CRS-2677: Stop of 'ora.raccdb1.backup.svc' on 'comet' succeeded
CRS-2672: Attempting to start 'ora.raccdb1.crmsvc.svc' on 'dancer'
CRS-2672: Attempting to start 'ora.raccdb1.crmsvc.svc' on 'comet'
CRS-2676: Start of 'ora.raccdb1.crmsvc.svc' on 'dancer' succeeded
CRS-2676: Start of 'ora.raccdb1.crmsvc.svc' on 'comet' succeeded
Mon Sep 16 19:26:43 PDT 2013
=> Time to execute: 1 second!
Policy-Management with Oracle RAC 12c What-If evaluation of policy changes
[RAC]> srvctl status service -d raccdb1
Service backup is not running.
Service crmsvc is running on nodes: dancer,vixen,dasher
Service hrsvc is running on nodes: comet
[GRID]> crsctl eval activate policy Weekend
Stage Group 1:
-------------------------------------------------------------------------------
Stage Number  Required  Action
-------------------------------------------------------------------------------
     1        Y         Server 'comet' will be moved from pools [ora.frontoffice] to pools [ora.backup]
              Y         Server 'dancer' will be moved from pools [ora.frontoffice] to pools [ora.backup]
              Y         Resource 'ora.raccdb1.crmsvc.svc' (1/1) will be in state [OFFLINE]
              Y         Resource 'ora.raccdb1.crmsvc.svc' (2/1) will be in state [OFFLINE]
     2        Y         Resource 'ora.raccdb1.backup.svc' (1/1) will be in state [ONLINE|INTERMEDIATE] on server [comet]
              Y         Resource 'ora.raccdb1.backup.svc' (2/1) will be in state [ONLINE|INTERMEDIATE] on server [dancer]
What-If with Oracle RAC 12c What-If evaluation of policy changes – in various levels
[RAC]> srvctl status service -d raccdb1
Service backup is not running.
Service crmsvc is running on nodes: dancer,vixen,dasher
Service hrsvc is running on nodes: comet
[GRID]> crsctl eval activate policy Weekend -admin -l 'serverpools'
NAME = Free
ACTIVE_SERVERS =
NAME = Generic
ACTIVE_SERVERS =
NAME = ora.backoffice
ACTIVE_SERVERS = vixen
NAME = ora.backup
ACTIVE_SERVERS = comet dancer
NAME = ora.frontoffice
ACTIVE_SERVERS = dasher
[RAC]> srvctl status service -d raccdb1
Service backup is not running.
Service crmsvc is running on nodes: dancer,vixen,dasher
Service hrsvc is running on nodes: comet
[GRID]> crsctl eval activate policy Weekend -admin -l 'resources'
--------------------------------------------------------------------------------
Name           Target  State        Server                   Effect
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.mgmtdb
      1        ONLINE  ONLINE       dasher
ora.raccdb1.backup.svc
      1        ONLINE  ONLINE       comet                    Started
      2        ONLINE  ONLINE       dancer                   Started
ora.raccdb1.crmsvc.svc
      1        ONLINE  OFFLINE                               Stopped
      2        ONLINE  OFFLINE                               Stopped
      3        ONLINE  ONLINE       dasher
ora.raccdb1.db
      1        ONLINE  ONLINE       comet
      2        ONLINE  ONLINE       dancer
      3        ONLINE  ONLINE       vixen
      4        ONLINE  ONLINE       dasher
ora.raccdb1.hrsvc.svc
      1        ONLINE  ONLINE       vixen
      2        ONLINE  OFFLINE
--------------------------------------------------------------------------------
Policy-Management with Oracle RAC 12c Information on each step on the way
Before After
[GRID]> crsctl modify policyset -attr "LAST_ACTIVATED_POLICY=NightTime"
CRS-2673: Attempting to stop 'ora.raccdb1.crmsvc.svc' on 'dancer'
CRS-2673: Attempting to stop 'ora.raccdb1.crmsvc.svc' on 'comet'
CRS-2677: Stop of 'ora.raccdb1.crmsvc.svc' on 'dancer' succeeded
CRS-2677: Stop of 'ora.raccdb1.crmsvc.svc' on 'comet' succeeded
CRS-2672: Attempting to start 'ora.raccdb1.backup.svc' on 'dancer'
CRS-2672: Attempting to start 'ora.raccdb1.backup.svc' on 'comet'
CRS-2676: Start of 'ora.raccdb1.backup.svc' on 'dancer' succeeded
CRS-2676: Start of 'ora.raccdb1.backup.svc' on 'comet' succeeded
[GRID]> crsctl modify policyset -attr "LAST_ACTIVATED_POLICY=DayTime"
CRS-2673: Attempting to stop 'ora.raccdb1.backup.svc' on 'dancer'
CRS-2673: Attempting to stop 'ora.raccdb1.backup.svc' on 'comet'
CRS-2677: Stop of 'ora.raccdb1.backup.svc' on 'dancer' succeeded
CRS-2677: Stop of 'ora.raccdb1.backup.svc' on 'comet' succeeded
CRS-2672: Attempting to start 'ora.raccdb1.crmsvc.svc' on 'dancer'
CRS-2672: Attempting to start 'ora.raccdb1.crmsvc.svc' on 'comet'
CRS-2676: Start of 'ora.raccdb1.crmsvc.svc' on 'dancer' succeeded
CRS-2676: Start of 'ora.raccdb1.crmsvc.svc' on 'comet' succeeded
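The step-by-step messages above lend themselves to a simple check that every attempted stop or start was confirmed. The sketch below relies only on the four message codes and wordings that appear in these transcripts; `summarize` is a hypothetical helper, and other CRS messages are ignored.

```python
import re

# Meaning of the four message codes, as seen in the transcripts above.
CODES = {
    "CRS-2672": ("start", "attempting"),
    "CRS-2673": ("stop", "attempting"),
    "CRS-2676": ("start", "succeeded"),
    "CRS-2677": ("stop", "succeeded"),
}
LINE = re.compile(r"(CRS-\d{4}): .*?'([^']+)' on '([^']+)'")


def summarize(log_text):
    """Return (completed, pending) sets of (action, resource, server)."""
    attempted, completed = set(), set()
    for code, resource, server in LINE.findall(log_text):
        if code not in CODES:
            continue
        action, phase = CODES[code]
        target = attempted if phase == "attempting" else completed
        target.add((action, resource, server))
    return completed, attempted - completed


if __name__ == "__main__":
    sample = (
        "CRS-2673: Attempting to stop 'ora.raccdb1.crmsvc.svc' on 'comet'\n"
        "CRS-2677: Stop of 'ora.raccdb1.crmsvc.svc' on 'comet' succeeded\n"
        "CRS-2672: Attempting to start 'ora.raccdb1.backup.svc' on 'comet'\n"
    )
    done, pending = summarize(sample)
    print(sorted(done))
    print(sorted(pending))
```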