Objectives
  Definition of terms
  List functions and roles of data/database administration
  Describe role of data dictionaries and information repositories
  Compare optimistic and pessimistic concurrency control
  Describe problems and techniques for data security
  Describe problems and techniques for data recovery
  Describe database tuning issues and list areas where changes can be made to tune the database
  Describe importance and measures of data quality
  Describe importance and measures of data availability
Traditional Administration Definitions
  Data Administration: A high-level function that is responsible for the overall management of data resources in an organization, including maintaining corporate-wide definitions and standards
  Database Administration: A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery
Traditional Data Administration Functions
  Data policies, procedures, standards
  Planning
  Data conflict (ownership) resolution
  Managing the information repository
  Internal marketing of DA concepts
Evolving Approaches to Data Administration
  Blend data and database administration into one role
  Fast-track development – monitoring development process (analysis, design, implementation, maintenance)
  Procedural DBAs–managing quality of triggers and stored procedures
Data Warehouse Administration
  New role, coming with the growth in data warehouses
  Similar to DA/DBA roles
  Emphasis on integration and coordination of metadata/data across many data sources
  Specific roles:
    Support DSS applications
    Manage data warehouse growth
    Establish service level agreements regarding data warehouses and data marts
An alternative to proprietary packages such as Oracle, Microsoft SQL Server, or Microsoft Access
MySQL is an example of an open-source DBMS
Less expensive than proprietary packages
Source code available for modification
Database Security: Protection of the data against accidental or intentional loss, destruction, or misuse
Increased difficulty due to Internet access and client/server technologies
Static HTML files are easy to secure
  Standard database access controls
  Place Web files in protected directories on server
Dynamic pages are harder
  Control of CGI scripts
  User authentication
  Session security
  SSL for encryption
  Restrict number of users and open ports
  Remove unnecessary programs
W3C Web Privacy Standard
  Platform for Privacy Preferences (P3P)
  Addresses the following:
    Who collects data
    What data is collected and for what purpose
    Who is data shared with
    Can users control access to their data
    How are disputes resolved
    Policies for retaining data
    Where are policies kept and how can they be
Views and Integrity Controls
  Views
    Subset of the database that is presented to one or more users
    User can be given access privilege to view without allowing access privilege to underlying tables
  Integrity Controls
    Protect data from unauthorized use
    Domains–set allowable values
    Assertions–enforce database conditions
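The two ideas above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a full security setup: the employee table, its columns, and the salary range are hypothetical, and SQLite has no user privileges, so the example shows only that a view exposes a subset of a table and that a CHECK constraint can play the role of a domain restricting allowable values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept    TEXT NOT NULL,
        -- Domain-style control: only salaries in an allowable range.
        salary  INTEGER NOT NULL CHECK (salary BETWEEN 0 AND 500000)
    );
    -- View: a subset of the table that hides the salary column.
    CREATE VIEW employee_directory AS
        SELECT emp_id, name, dept FROM employee;
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ana', 'Sales', 52000)")
conn.commit()

# Users of the view see only the directory columns.
directory_rows = conn.execute("SELECT * FROM employee_directory").fetchall()

# A value outside the allowable range is rejected by the DBMS itself.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Raj', 'IT', -10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

In a multiuser DBMS the view would normally be paired with access privileges granted on the view but not on the underlying table.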
Passwords are flawed:
  Users share them with each other
  They get written down, could be copied
  Automatic logon scripts remove need to explicitly type them in
  Unencrypted passwords travel the Internet
Possible solutions:
  Two factor–e.g. smart card plus PIN
  Three factor–e.g. smart card, biometric, PIN
  Biometric devices–use of fingerprints, retinal scans, etc. for positive ID
  Third-party mediated authentication–using secret keys,
Mechanism for restoring a database quickly and accurately after loss or damage
Back-up Facilities
  Automatic dump facility that produces backup copy of the entire database
  Periodic backup (e.g. nightly, weekly)
  Cold backup–database is shut down during backup
  Hot backup–selected portion is shut down and backed up at a given time
  Backups stored in secure, off-site
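As a small-scale illustration of a dump facility that copies an entire database, Python's sqlite3 module provides Connection.backup, which copies a database while the connection stays open. The orders table and its contents are made up for the example, and in-memory databases stand in for real files and for an off-site copy.

```python
import sqlite3

# Hypothetical production database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total REAL)")
source.execute("INSERT INTO orders VALUES (1, 19.99)")
source.commit()

# Full backup copy of the entire database, taken while it remains open.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate loss of data in the source after the backup was taken.
source.execute("DELETE FROM orders")
source.commit()

# The backup still holds the state as of backup time.
lost_rows = source.execute("SELECT * FROM orders").fetchall()
restored_rows = backup_copy.execute("SELECT * FROM orders").fetchall()
```

A backup alone restores only the state at backup time; later transactions have to be recovered from the logs described next.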
Journalizing Facilities
  Audit trail of transactions and database updates
  Transaction log–record of essential data for each transaction processed against the database
  Database change log–images of updated data
    Before-image–copy before modification
    After-image–copy after modification
Recovery and Restart Procedures
  Disk Mirroring–switch between identical copies of databases
  Restore/Rerun–reprocess transactions against the backup
  Transaction Integrity–commit or abort all transaction changes
  Backward Recovery (Rollback)–apply before images
  Forward Recovery (Roll Forward)–apply after images (preferable to restore/rerun)
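Backward and forward recovery can be sketched with a toy change log that stores a before-image and an after-image for each update. This is a simplified model under stated assumptions: the "database" is a Python dict, the names ChangeRecord, apply_update, rollback, and roll_forward are hypothetical, and None is used to mean "no value", so stored None values are not supported.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Set


@dataclass
class ChangeRecord:
    txn_id: int
    key: str
    before: Optional[Any]  # before-image (None if the key did not exist)
    after: Optional[Any]   # after-image (None if the key was deleted)


def apply_update(db: Dict[str, Any], log: List[ChangeRecord],
                 txn_id: int, key: str, value: Any) -> None:
    """Journalize the change, then modify the database."""
    log.append(ChangeRecord(txn_id, key, db.get(key), value))
    db[key] = value


def rollback(db: Dict[str, Any], log: List[ChangeRecord], txn_id: int) -> None:
    """Backward recovery: apply one transaction's before-images, newest first."""
    for rec in reversed(log):
        if rec.txn_id != txn_id:
            continue
        if rec.before is None:
            db.pop(rec.key, None)
        else:
            db[rec.key] = rec.before


def roll_forward(backup: Dict[str, Any], log: List[ChangeRecord],
                 committed: Set[int]) -> Dict[str, Any]:
    """Forward recovery: start from a backup and apply after-images of
    committed transactions in log order."""
    db = dict(backup)
    for rec in log:
        if rec.txn_id not in committed:
            continue
        if rec.after is None:
            db.pop(rec.key, None)
        else:
            db[rec.key] = rec.after
    return db


backup = {"acct_A": 100, "acct_B": 50}
db = dict(backup)
log: List[ChangeRecord] = []

apply_update(db, log, 1, "acct_A", 70)   # transaction 1 (commits)
apply_update(db, log, 1, "acct_B", 80)
apply_update(db, log, 2, "acct_A", 0)    # transaction 2 (aborts)

rollback(db, log, txn_id=2)              # undo the aborted transaction
recovered = roll_forward(backup, log, committed={1})
```

Roll forward writes the logged after-images directly instead of re-executing each transaction's logic, which is one reason it is usually preferred to restore/rerun.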
Atomic
  Transaction cannot be subdivided
Consistent
  Constraints don’t change from before transaction to after transaction
Isolated
  Database changes not revealed to users until after transaction has completed
Durable
  Database changes are permanent
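Atomicity and consistency can be illustrated with a small funds transfer in Python's sqlite3 module. The account table, the transfer function, and the amounts are hypothetical; the sketch assumes the module's default transaction handling, where using the connection as a context manager commits on success and rolls back if an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move funds as one unit of work: both updates apply, or neither does."""
    try:
        with conn:  # commit on success, roll back on exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit was rolled back too.
        return False


ok = transfer(conn, "A", "B", 30)        # succeeds
failed = transfer(conn, "A", "B", 500)   # overdraws A, so nothing is applied
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

The second transfer credits B before the debit fails, yet B's balance is unchanged afterwards because the whole transaction is aborted.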
Aborted transactions
  Preferred recovery: rollback
  Alternative: Rollforward to state just prior to abort
Incorrect data
  Preferred recovery: rollback
  Alternative 1: rerun transactions not including inaccurate data updates
  Alternative 2: compensating transactions
System failure (database intact)
  Preferred recovery: switch to duplicate database
  Alternative 1: rollback
  Alternative 2: restart from checkpoint
Database destruction
  Preferred recovery: switch to duplicate database
  Alternative 1: rollforward
  Alternative 2: reprocess transactions
Concurrency Control
  Problem–in a multiuser environment, simultaneous access to data can result in interference and data loss
  Solution–Concurrency Control
    The process of managing simultaneous operations against a database so that data integrity is maintained and the operations do not interfere with each other in a multi-user environment
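The interference problem is often shown as a "lost update". The sketch below replays one bad interleaving step by step rather than using real threads, so the outcome is deterministic. The two users, the balance, and the withdrawal amounts are invented for illustration.

```python
# Shared record with no concurrency control.
balance = 1000

# Both users read the same starting value...
john_read = balance
marsha_read = balance

# ...then each writes back a result computed from a stale copy.
balance = john_read - 200     # John's withdrawal is written
balance = marsha_read - 300   # Marsha overwrites it; John's update is lost
uncontrolled_result = balance

# With the operations serialized, each one sees the other's result.
balance = 1000
balance = balance - 200       # John completes first
balance = balance - 300       # Marsha then works from the updated value
serialized_result = balance
```

The uncontrolled run ends at 700 instead of 500, which is the kind of data loss concurrency control is meant to prevent.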
The most common way of achieving serialization
Data that is retrieved for the purpose of updating is locked for the updater
No other user can perform update until unlocked
Database–used during database updates
Table–used for bulk updates
Block or page–very commonly used
Record–only requested row; fairly commonly
Types of locks:
  Shared lock–Read but no update permitted. Used when just reading to prevent another user from placing an exclusive lock on the record
  Exclusive lock–No access permitted. Used when preparing to update
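The compatibility rules for the two lock types can be sketched as a tiny lock table. This is an illustrative model only: the LockManager class and its method names are hypothetical, and a request that cannot be granted simply returns False instead of making the transaction wait as a real DBMS would.

```python
from collections import defaultdict
from typing import Dict, Set


class LockManager:
    """Toy record-level lock table with shared and exclusive locks."""

    def __init__(self) -> None:
        self._shared: Dict[str, Set[str]] = defaultdict(set)  # record -> readers
        self._exclusive: Dict[str, str] = {}                   # record -> writer

    def acquire_shared(self, txn: str, record: str) -> bool:
        # A shared lock is refused only if another transaction holds an exclusive lock.
        holder = self._exclusive.get(record)
        if holder is not None and holder != txn:
            return False
        self._shared[record].add(txn)
        return True

    def acquire_exclusive(self, txn: str, record: str) -> bool:
        # An exclusive lock is refused if anyone else holds any lock on the record.
        holder = self._exclusive.get(record)
        if holder is not None and holder != txn:
            return False
        if self._shared[record] - {txn}:
            return False
        self._exclusive[record] = txn
        return True

    def release_all(self, txn: str) -> None:
        for readers in self._shared.values():
            readers.discard(txn)
        for record in [r for r, t in self._exclusive.items() if t == txn]:
            del self._exclusive[record]


locks = LockManager()
r1 = locks.acquire_shared("T1", "row42")      # granted
r2 = locks.acquire_shared("T2", "row42")      # granted: readers can share
r3 = locks.acquire_exclusive("T2", "row42")   # refused: T1 is still reading
locks.release_all("T1")
r4 = locks.acquire_exclusive("T2", "row42")   # granted: T2 is the only holder
r5 = locks.acquire_shared("T1", "row42")      # refused: T2 holds an exclusive lock
```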
Deadlock
  An impasse that results when two or more transactions have locked common resources, and each waits for the other to unlock their resources
Figure 12-13 The problem of deadlock
  John and Marsha will wait forever for each other to release their locked resources!
May be difficult to determine all needed resources in advance
Deadlock Resolution:
  Allow deadlocks to occur
  Mechanisms for detecting and breaking them
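One common detection mechanism, shown here as an assumption rather than as the method any particular DBMS uses, is to keep a "waits-for" graph and look for a cycle in it. The function name find_deadlock and the dictionary layout are hypothetical; the John and Marsha case mirrors the figure above.

```python
from typing import Dict, List, Optional


def find_deadlock(waits_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return the transactions forming a cycle in the waits-for graph, or None."""
    on_path = set()   # transactions on the current search path
    finished = set()  # transactions already proven cycle-free

    def visit(txn: str, path: List[str]) -> Optional[List[str]]:
        if txn in finished:
            return None
        if txn in on_path:
            # Reached a transaction already on the path: that is a cycle.
            return path[path.index(txn):]
        on_path.add(txn)
        path.append(txn)
        for other in waits_for.get(txn, []):
            cycle = visit(other, path)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(txn)
        finished.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None


# John waits for a record Marsha has locked, and vice versa.
deadlocked = find_deadlock({"John": ["Marsha"], "Marsha": ["John"]})

# No cycle: Marsha is not waiting on anyone.
no_deadlock = find_deadlock({"John": ["Marsha"], "Marsha": []})
```

Once a cycle is found, the deadlock is broken by aborting one of the transactions in it and rolling back its changes so the others can proceed.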
Managing Data Quality
  Causes of poor data quality
    External data sources
    Redundant data storage
    Lack of organizational commitment
  Data quality improvement
    Perform data quality audit
    Establish data stewardship program (data steward is a liaison between IT and business units)
    Apply total quality management (TQM) practices
    Overcome organizational barriers
    Apply modern DBMS technology
    Estimate return on investment
Data Dictionaries and Repositories
  Data dictionary
    Documents data elements of a database
  System catalog
    System-created database that describes all database objects
  Information Repository
    Stores metadata describing data and data processing resources
  Information Repository Dictionary System (IRDS)
    Software tool managing/controlling access to
Set cache levels
Choose background processes
Input/Output (I/O) Contention
  Use striping
  Distribution of heavily accessed files
CPU Usage
  Monitor CPU load
Application tuning
  Modification of SQL code in applications