Distributed Data Base System RAVINDER CHAMOLI MSC[CS] OMIT, RISHIKESH
Nov 20, 2014
Distributed Data Base Management
In a distributed database system, the database is stored on several computers. A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network.
The computers in a distributed system communicate with one another through various communication media, such as high-speed networks or telephone lines.
[Figure: a distributed database system]
They do not share main memory or disks.
The computers in a distributed system may vary in size and function, ranging from workstations up to mainframe systems.
The computers in a distributed system are referred to by a number of different names, such as sites or nodes.
The term "site" is mainly used to emphasize the physical distribution of these systems.
[Figure: sites in a distributed system]
Reasons for Building Distributed Database Systems
Sharing data:
o The major advantage in building a distributed database system is the provision of an environment where users at one site may be able to access the data residing at other sites.
Autonomy:
o The primary advantage of sharing data by means of data distribution is that each site is able to retain a degree of control over the data that are stored locally.
Availability:
o If one site fails in a distributed system, the remaining sites may be able to continue operating. In particular, if data items are replicated at several sites, a transaction needing a particular data item may find that item at any of several sites.
o Thus, the failure of a site does not necessarily imply the shutdown of the system.
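The availability argument can be illustrated with a small sketch. It assumes a data item replicated at three sites, one of which has failed; the names `Site`, `SiteDown`, and `read_item` are hypothetical and chosen only for illustration.

```python
class SiteDown(Exception):
    """Hypothetical signal that a site cannot be reached."""


class Site:
    def __init__(self, name, data, up=True):
        self.name = name
        self.data = dict(data)
        self.up = up

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.data[key]


def read_item(key, replicas):
    """Try each replica in turn; return (value, site name) from the first live site."""
    for site in replicas:
        try:
            return site.read(key), site.name
        except SiteDown:
            continue  # this replica is unavailable, try the next one
    raise SiteDown("no replica of %r is reachable" % (key,))


sites = [
    Site("S1", {"balance": 100}, up=False),  # failed site
    Site("S2", {"balance": 100}),
    Site("S3", {"balance": 100}),
]
value, served_by = read_item("balance", sites)
print(value, served_by)
```

The read succeeds even though S1 is down, because the item is also stored at S2 and S3.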
THE PROPERTIES OF DISTRIBUTED DATABASE SYSTEMS
A distributed database system should make the impact of data distribution transparent.
Distributed database systems have two major properties:
o Distributed data independence
o Distributed transaction atomicity
Distributed Data Independence
The distributed data independence (DDI) property enables users to ask queries without specifying where the referenced relations, or copies or fragments of the relations, are located.
This principle is a natural extension of physical and logical data independence.
DISTRIBUTED TRANSACTION ATOMICITY
The distributed transaction atomicity property enables users to write transactions that access and update data at several sites just as they would write transactions over purely local data.
In particular, the effects of a transaction across sites should continue to be atomic: all changes persist if the transaction commits, and none persist if it aborts.
Types of Distributed Data Base
1. Homogeneous Distributed Data Base
2. Heterogeneous Distributed Data Base
Homogeneous Distributed Data Base
A homogeneous distributed database is the simplest form of a distributed database, in which there are several sites, each running its own applications on the same DBMS software.
All sites have identical DBMS software.
All sites are aware of one another and agree to cooperate in processing user requests.
Heterogeneous Distributed Data Base
In a heterogeneous distributed database system, different sites run under the control of different DBMS software.
A heterogeneous distributed database system is also referred to as a multi-database system or a federated database system (FDBS).
It relies on well-accepted standards for gateway protocols, which expose DBMS functionality to external applications.
The gateway protocols help the different sites communicate with one another.
Distributed Data Storage
Consider a relation ‘r’ that is to be stored in the database. There are two approaches to storing this relation in the distributed database:
o Replication: The system maintains several identical replicas (copies) of the relation, and stores each replica at a different site. The alternative to replication is to store only one copy of the relation ‘r’.
[Figure: replication]
o Fragmentation: The system partitions the relation into several fragments, and stores each fragment at a different site.
o Fragmentation and replication can be combined: A relation can be partitioned into several fragments, and there may be several replicas of each fragment.
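The combination of fragmentation and replication can be sketched as follows. This is a minimal illustration that assumes horizontal fragmentation (splitting the relation by rows on one attribute), which is only one possible way to partition a relation. The sample rows, the site names S1 to S3, and the helper functions are all hypothetical.

```python
# Relation r as a list of rows (hypothetical example data).
r = [
    {"id": 1, "city": "Rishikesh", "amount": 500},
    {"id": 2, "city": "Delhi", "amount": 700},
    {"id": 3, "city": "Rishikesh", "amount": 300},
]


def fragment_by(relation, attribute):
    """Partition the relation into fragments, one per value of the attribute."""
    fragments = {}
    for row in relation:
        fragments.setdefault(row[attribute], []).append(row)
    return fragments


def replicate(fragment, site_names):
    """Store an identical copy of the fragment at each named site."""
    return {site: list(fragment) for site in site_names}


fragments = fragment_by(r, "city")
placement = {
    "Rishikesh": replicate(fragments["Rishikesh"], ["S1", "S2"]),  # two replicas
    "Delhi": replicate(fragments["Delhi"], ["S3"]),                # single copy
}


def reconstruct(placement):
    """Rebuild r by taking one replica of every fragment and merging the rows."""
    rows = []
    for replicas in placement.values():
        first_site = sorted(replicas)[0]
        rows.extend(replicas[first_site])
    return sorted(rows, key=lambda row: row["id"])


rebuilt = reconstruct(placement)
print(rebuilt == r)
```

Taking one replica of each fragment and merging them gives back the original relation, which is the property any fragmentation scheme has to preserve.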
Transparency
The user of a distributed database system should not be required to know either where the data are physically located or how the data can be accessed at the specific local site. This characteristic is called DATA TRANSPARENCY.
Data transparency can take several forms:
o Fragmentation transparency
o Replication transparency
o Location transparency
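The three forms of transparency can be seen together in a small sketch. It assumes a system catalog that records, for each relation, its fragments and the sites holding a replica of each fragment. The `CATALOG` and `STORAGE` structures and the `scan` function are hypothetical, not the interface of any particular DBMS.

```python
# Hypothetical catalog: relation name -> list of fragments, each with its replica sites.
CATALOG = {
    "account": [
        {"sites": ["S1", "S2"]},  # fragment 0, replicated at S1 and S2
        {"sites": ["S3"]},        # fragment 1, stored only at S3
    ],
}

# Hypothetical site storage: site -> relation -> fragment index -> rows.
STORAGE = {
    "S1": {"account": {0: [("A-101", 500)]}},
    "S2": {"account": {0: [("A-101", 500)]}},
    "S3": {"account": {1: [("A-215", 700)]}},
}


def scan(relation):
    """Return all rows of a relation; the caller names no site, fragment, or replica."""
    rows = []
    for index, fragment in enumerate(CATALOG[relation]):
        site = fragment["sites"][0]                  # replication transparency: any replica will do
        rows.extend(STORAGE[site][relation][index])  # location transparency: the system finds the site
    return rows                                      # fragmentation transparency: fragments are merged


accounts = scan("account")
print(accounts)
```

The user asks only for "account"; which fragment, which replica, and which site are resolved by the system.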
Each site has its own local transaction manager, whose function is to ensure the ACID properties of those transactions that execute at that site.
The various transaction managers cooperate to execute global transactions.
System Structure
To understand how such a manager can be implemented, consider an abstract model of a transaction system, in which each site contains two subsystems: a transaction manager and a transaction coordinator.
The transaction manager manages the execution of those transactions (or subtransactions) that access data stored at a local site.
Note that each such transaction may be either a local transaction (that is, a transaction that executes at only that site) or part of a global transaction (that is, a transaction that executes at several sites).
The transaction coordinator coordinates the execution of the various transactions (both local and global) initiated at that site.
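The division of work between the two subsystems can be sketched roughly as below. This is an illustrative model only: it omits logging, concurrency control, and commit or abort handling, and the class and method names are hypothetical.

```python
class TransactionManager:
    """Runs the (sub)transactions that access data stored at one site."""

    def __init__(self, site, data):
        self.site = site
        self.data = data
        self.log = []  # ids of the transactions executed at this site

    def execute(self, txn_id, updates):
        for item, delta in updates.items():
            self.data[item] += delta
        self.log.append(txn_id)


class TransactionCoordinator:
    """Coordinates the transactions initiated at its own site."""

    def __init__(self, site, managers):
        self.site = site
        self.managers = managers  # site name -> TransactionManager

    def run(self, txn_id, work):
        # work maps a site name to the updates its subtransaction performs there.
        for site, updates in work.items():
            self.managers[site].execute(txn_id, updates)
        # A transaction is local if it executes only at the initiating site.
        return "local" if set(work) == {self.site} else "global"


managers = {
    "S1": TransactionManager("S1", {"A": 100}),
    "S2": TransactionManager("S2", {"B": 50}),
}
coordinator = TransactionCoordinator("S1", managers)
kind1 = coordinator.run("T1", {"S1": {"A": -10}})                    # touches S1 only
kind2 = coordinator.run("T2", {"S1": {"A": -20}, "S2": {"B": 20}})  # touches S1 and S2
print(kind1, kind2)
```

T1 is handled entirely by the transaction manager at S1, while T2 is broken into subtransactions that run under the transaction managers at S1 and S2.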
Distributed Query Processing
In a distributed system, we must take into account several matters beyond the cost of disk access, including:
o The cost of data transmission over the network.
o The potential gain in performance from having several sites process parts of the query in parallel.
The relative cost of data transfer over the network and data transfer to and from disk varies widely, depending on the type of network and on the speed of the disks.
Thus, in general, we cannot focus solely on disk costs or on network costs. Rather, we must find a good trade-off between the two.
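The trade-off can be made concrete with a rough cost model. The sketch below assumes that the cost of a plan is simply disk transfer time plus network transfer time, and compares two hypothetical plans for a selection on a remote relation. All sizes and transfer rates are invented for illustration, and the model ignores parallelism and latency.

```python
def transfer_time(num_bytes, bytes_per_second):
    return num_bytes / bytes_per_second


def plan_cost(disk_bytes, network_bytes, disk_rate, network_rate):
    """Estimated seconds: time to read from disk plus time to ship over the network."""
    return transfer_time(disk_bytes, disk_rate) + transfer_time(network_bytes, network_rate)


RELATION_BYTES = 100_000_000  # hypothetical 100 MB relation at the remote site
RESULT_BYTES = 1_000_000      # hypothetical 1 MB of rows matching the query
DISK_RATE = 100_000_000       # hypothetical 100 MB/s disk


def compare(network_rate):
    # Plan A: read the whole relation and ship all of it to the querying site.
    ship_all = plan_cost(RELATION_BYTES, RELATION_BYTES, DISK_RATE, network_rate)
    # Plan B: read the relation, filter at the remote site, ship only the matches.
    ship_result = plan_cost(RELATION_BYTES, RESULT_BYTES, DISK_RATE, network_rate)
    return ship_all, ship_result


slow = compare(1_000_000)      # 1 MB/s link
fast = compare(1_000_000_000)  # 1 GB/s link
print(slow, fast)
```

Under these assumed numbers, the network dominates on the slow link (about 101 s against 2 s), while on the fast link the two plans cost almost the same and the disk read dominates. This is why neither cost can be considered alone.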
The System Failure Modes
A distributed system may suffer from the same types of failure that a centralized system does (for example, software errors, hardware errors, or disk crashes). There are, however, additional types of failure that must be dealt with in a distributed environment.
The basic failure types are:
o Failure of a site
o Loss of messages
o Failure of a communication link
o Network partition
To provide high availability, a distributed database must detect failures, reconfigure itself so that computation may continue, and recover when a processor or a link is repaired.
This task is greatly complicated by the fact that it is hard to distinguish between network partitions and site failures.
Other Important Issues
Commit Protocols
o If we are to ensure atomicity, all the sites at which a transaction T executed must agree on the final outcome of the execution. T must either commit at all sites, or it must abort at all sites.
o To ensure this property, the transaction coordinator of T must execute a commit protocol.
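The slides do not name a specific commit protocol. The sketch below assumes two-phase commit, a widely used one, and shows only its basic voting logic: it does not model site failures, lost messages, timeouts, or logging. The `Participant` class and `two_phase_commit` function are hypothetical names.

```python
class Participant:
    """One site at which transaction T executed."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote on whether the local part of T can commit.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Run by the coordinator of T; returns the global decision."""
    votes = [p.prepare() for p in participants]  # phase 1: collect every vote
    decision = "commit" if all(votes) else "abort"
    for p in participants:                       # phase 2: send the decision to all sites
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision


all_ready = [Participant("S1"), Participant("S2")]
one_refuses = [Participant("S1"), Participant("S2", can_commit=False)]
outcome_ok = two_phase_commit(all_ready)
outcome_bad = two_phase_commit(one_refuses)
print(outcome_ok, outcome_bad)
```

A single "no" vote forces every site to abort, so T never ends up committed at some sites and aborted at others.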
Time stamping
o The principal idea behind time stamping is that each transaction is given a unique time stamp that the system uses in deciding the serialization order.
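One way to obtain unique time stamps without a central authority is sketched below. It assumes that each site keeps a local counter and appends its own site identifier, so that two sites can never issue the same pair. This scheme is an assumption made for illustration; the slides do not say how the time stamps are generated.

```python
import itertools


class TimestampGenerator:
    """Issues (local counter, site id) pairs; the site id breaks ties between sites."""

    def __init__(self, site_id):
        self.site_id = site_id
        self._counter = itertools.count(1)

    def next(self):
        return (next(self._counter), self.site_id)


site1 = TimestampGenerator(1)
site2 = TimestampGenerator(2)

stamps = {
    "T1": site1.next(),  # (1, 1)
    "T2": site2.next(),  # (1, 2)
    "T3": site1.next(),  # (2, 1)
}

# The serialization order is the order of the time stamps.
order = sorted(stamps, key=stamps.get)
print(order)
```

Tuples compare element by element, so transactions are ordered first by counter value and then by site identifier.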
Thanks