Motivation
Centralized DBMS in a network
Distributed DBMS environment
Introduction
A distributed database changes the way data is shared, moving conceptually from centralization to decentralization.
The development of computer networks promotes a decentralized mode of work.
The development of distributed systems should improve data sharing and the efficiency of data access.
Distributed systems should help resolve the “islands of information” problem.
12.3 Introduction
Concepts
A DDBMS helps avoid the “islands of information” problem…
A “distributed database” is a logically interrelated collection of shared data (and a description of this data), physically distributed over a computer network.
A “distributed DBMS” (DDBMS) is a software system that permits the management of the distributed database and makes the distribution transparent to users.
Fundamental principle: make distribution transparent to the user. The fact that fragments are stored on different computers is hidden from users.
Concepts (cont’d)
In a distributed DBMS, a single logical database is split into a number of fragments.
Each fragment is stored on one or more computers under the control of a separate DBMS, with the computers connected by a network.
Each site can independently process user requests that require access to local data and can also process data stored on other computers in the network.
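The idea of fragments hidden behind one logical database can be sketched in a few lines of Python. This is a minimal illustration only: the `Site` and `DistributedTable` classes, the site names, and the sample rows are all hypothetical and stand in for real DBMS instances connected by a network.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical site: one computer running its own local DBMS."""
    name: str
    rows: list = field(default_factory=list)  # the fragment stored here

    def select(self, predicate):
        # Local processing: only the fragment held at this site is scanned.
        return [row for row in self.rows if predicate(row)]


class DistributedTable:
    """Presents the fragments held at several sites as one logical table."""

    def __init__(self, sites):
        self.sites = sites

    def select(self, predicate):
        # The caller never names a site, so the distribution stays hidden.
        result = []
        for site in self.sites:
            result.extend(site.select(predicate))
        return result


london = Site("london", [{"id": 1, "city": "London"}])
paris = Site("paris", [{"id": 2, "city": "Paris"}, {"id": 3, "city": "Paris"}])
staff = DistributedTable([london, paris])
paris_ids = [row["id"] for row in staff.select(lambda r: r["city"] == "Paris")]
```

The user queries `staff` as if it were a single table; which computer holds which rows never appears in the request.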
Concepts (cont’d)
There are two kinds of application:
1) Local application: does not require data from other sites
2) Global application: requires data from other sites
A distributed DBMS needs to have at least one global application.
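The local/global distinction can be expressed as a small check. The sketch below assumes an application is described by the site where it runs and the sites holding the data it needs; the function names and the sample sites are illustrative, not part of any DBMS product.

```python
def classify_application(origin_site, data_sites):
    """Label an application by where the data it needs is stored."""
    # Local: every piece of data lives at the site where the application runs.
    if all(site == origin_site for site in data_sites):
        return "local"
    # Global: at least one piece of data must come from another site.
    return "global"


def has_global_application(applications):
    """True if at least one application needs data from another site."""
    return any(
        classify_application(origin, sites) == "global"
        for origin, sites in applications
    )


applications = [
    ("london", ["london"]),           # reads only its own site
    ("paris", ["paris", "london"]),   # also needs data held in London
]
labels = [classify_application(origin, sites) for origin, sites in applications]
```

Under the rule stated above, a system whose applications are all local would not count as a distributed DBMS, which is what `has_global_application` tests.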
DATABASE ARCHITECTURE
The architecture of a DB system is influenced by:
1. Networking
2. Parallel Processing
3. Distributing data
Homogeneous and Heterogeneous DDBMSs
Homogeneous DDBMS
In a homogeneous DDBMS, all sites use the same DBMS product.
Much easier to design and manage.
This design supports incremental growth by making it easy to add new sites to the DDBMS.
It allows increased performance by exploiting the parallel processing capability of multiple sites.
Heterogeneous DDBMSs
In a heterogeneous DDBMS, sites may run different DBMS products, which need not be based on the same underlying data model, so the system may be composed of RDBMS, ORDBMS and OODBMS products.
In a heterogeneous system, translations are required to allow communication between the different DBMSs.
To provide DBMS transparency, users must be able to make requests in the language of the DBMS at their local site.
Data may come from other sites that have different hardware, different DBMS products, or a combination of different hardware and DBMS products.
Locating those data and performing any necessary translation are tasks of the heterogeneous DDBMS.
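The translation step can be illustrated with a toy example. It assumes two made-up site types with different interfaces, one row-based and one object-based; `RelationalSite`, `ObjectSite`, and `global_select` are hypothetical names, and a real heterogeneous DDBMS would translate between query languages and data models rather than Python calls.

```python
from types import SimpleNamespace


class RelationalSite:
    """Hypothetical site that stores rows as dictionaries."""

    def __init__(self, rows):
        self.rows = rows

    def select_where(self, column, value):
        return [row for row in self.rows if row[column] == value]


class ObjectSite:
    """Hypothetical site that stores objects with attributes."""

    def __init__(self, objects):
        self.objects = objects

    def find(self, attribute, value):
        return [o for o in self.objects if getattr(o, attribute) == value]


def global_select(sites, column, value):
    """Run one request against every site, translating per site type."""
    results = []
    for site in sites:
        if isinstance(site, RelationalSite):
            results.extend(site.select_where(column, value))
        else:
            # Translate the request into the object site's own call, then
            # translate the answer back into the row form the user expects.
            results.extend(vars(o) for o in site.find(column, value))
    return results


sites = [
    RelationalSite([{"name": "Ann", "dept": "Sales"}]),
    ObjectSite([SimpleNamespace(name="Bob", dept="Sales"),
                SimpleNamespace(name="Cy", dept="IT")]),
]
sales_names = [row["name"] for row in global_select(sites, "dept", "Sales")]
```

The user phrases the request once, in one form, and receives rows in one form, regardless of which kind of site held the data.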
The Evolution of Distributed DBMS
DDBMS Advantages
Data are located near the “greatest demand” site.
Faster data access
Faster data processing
Growth facilitation
Improved communications
Reduced operating costs
User-friendly interface
Less danger of a single-point failure
Processor independence
DDBMS Disadvantages
Complexity of management and control
Security
Lack of standards
Increased storage requirements
Distributed Processing and Distributed Database
Distributed processing shares the database’s logical processing among two or more physically independent sites that are connected through a network.
A distributed database stores a logically related database over two or more physically independent sites connected via a computer network.
Important difference between DDBMS and distributed processing!
Figure: DDBMS; distributed processing of a centralised DBMS
Distributed processing of a centralised DBMS has the following characteristics:
• Much more tightly coupled than a DDBMS
• Database design is the same as for a standard DBMS
• No attempt to reflect organisational structure
• Much simpler than a DDBMS
• More secure than a DDBMS
• No local autonomy
DDBMS
Important difference between DDBMS and Parallel Database
Parallel database architectures: a) shared memory, b) shared disk, c) shared nothing
Distributed Processing Environment
Distributed Database Environment
Distributed Processing and Distributed Database
Distributed processing does not require a distributed database, but a distributed database requires distributed processing.
Distributed processing may be based on a single database located on a single computer. In order to manage distributed data, copies or parts of the database processing functions must be distributed to all data storage sites.
Both distributed processing and distributed databases require a network to connect all components.
What Is A Distributed DBMS?
A distributed database management system (DDBMS) governs the storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites.
What Is A Distributed DBMS?
Functions of a DDBMS
Application interface
Validation to analyze data requests
Transformation to determine request’s components
Query-optimization to find the best access strategy
Mapping to determine the data location
I/O interface to read or write data
Formatting to prepare the data for presentation
Security to provide data privacy
Backup and recovery
Database administration
Concurrency control
Transaction management
Characteristics of DDBMS
Collection of logically related shared data
Data split into fragments
Fragments can be replicated
Fragments/replicas may be allocated to sites
Sites are linked by a communication network
Data at each site is under the control of a DBMS
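Allocation and replication can be pictured as a catalog that maps each fragment to the sites holding a copy. The sketch below is an assumption-laden illustration: the fragment names, site names, and the "first reachable replica" choice are invented for the example and are not a prescribed strategy.

```python
# Hypothetical allocation catalog: fragment name -> sites that hold a replica.
allocation = {
    "staff_london": ["site_a"],
    "staff_paris": ["site_b", "site_c"],  # replicated at two sites
}


def pick_site(fragment, allocation, available_sites):
    """Return a reachable site holding the fragment, or raise if none is up."""
    for site in allocation[fragment]:
        if site in available_sites:
            return site
    raise LookupError(f"no available replica of {fragment!r}")


# With site_b down, the replicated fragment is still reachable via site_c.
chosen = pick_site("staff_paris", allocation, available_sites={"site_a", "site_c"})
```

A fragment with a single copy, such as `staff_london` here, becomes unreachable as soon as its only site fails, while a replicated fragment survives the loss of one site.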
Centralized Database Management System
Fully Distributed Database Management System
Figure 10.4
DDBMS Components
Computer workstations that form the network system.
Network hardware and software components that reside in each workstation.
Communications media that carry the data from one workstation to another.
Transaction processor (TP) receives and processes the application’s data requests.
Data processor (DP) stores and retrieves data located at the site. Also known as data manager (DM).
Distributed Database System Components
DDBMS Components
DDBMS protocol determines how the DDBMS will:
Interface with the network to transport data and commands between DPs and TPs.
Synchronize all data received from DPs (TP side) and route retrieved data to the appropriate TPs (DP side).
Ensure common database functions in a distributed system -- security, concurrency control, backup, and recovery.
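The division of labour between a transaction processor and the data processors can be sketched as follows. This is a simplified model under stated assumptions: the class names, the `catalog` dictionary mapping each data item to a site, and the sample keys are hypothetical, and the network, concurrency control, and recovery are left out entirely.

```python
class DataProcessor:
    """Hypothetical DP: stores and retrieves the data located at one site."""

    def __init__(self, site, data):
        self.site = site
        self.data = dict(data)

    def read(self, key):
        return self.data.get(key)


class TransactionProcessor:
    """Hypothetical TP: receives a request and routes it to the right DPs."""

    def __init__(self, catalog, data_processors):
        self.catalog = catalog  # data item -> name of the site that stores it
        self.dps = {dp.site: dp for dp in data_processors}

    def read(self, keys):
        # Map each item to its site, ask that site's DP, and merge the answers.
        result = {}
        for key in keys:
            site = self.catalog[key]
            result[key] = self.dps[site].read(key)
        return result


dp_a = DataProcessor("site_a", {"cust:1": "Ann"})
dp_b = DataProcessor("site_b", {"cust:2": "Bob"})
tp = TransactionProcessor({"cust:1": "site_a", "cust:2": "site_b"}, [dp_a, dp_b])
answer = tp.read(["cust:1", "cust:2"])
```

The application talks only to the TP; the DPs at each site never need to know about one another.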
Levels of Data & Process Distribution
Single-Site Processing, Single-Site Data (SPSD)
All processing is done on a single CPU or host computer.
All data are stored on the host computer’s local disk.
The DBMS is located on the host computer.
The DBMS is accessed by dumb terminals.
Typical of most mainframe and minicomputer DBMSs.
Typical of the 1st generation of single-user microcomputer databases.
Nondistributed (Centralized) DBMS
Levels of Data & Process Distribution
Multiple-Site Processing, Single-Site Data (MPSD)
Typically, MPSD requires a network file server on which conventional applications are accessed through a LAN.
A variation of the MPSD approach is known as a client/server architecture.
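The two levels named above differ only in how many sites do the processing while the data stays at one site. A small Python helper makes the rule explicit; the function name and the label returned for the multiple-site-data case are assumptions made for this sketch, since that case is not named in this section.

```python
def distribution_level(processing_sites, data_sites):
    """Classify a configuration by its number of processing and data sites."""
    if processing_sites < 1 or data_sites < 1:
        raise ValueError("need at least one processing site and one data site")
    if data_sites == 1:
        # All data on one site: the level depends only on where processing runs.
        return "SPSD" if processing_sites == 1 else "MPSD"
    # Data spread over several sites; label chosen for this sketch only.
    return "multiple-site data (not covered by SPSD or MPSD)"


mainframe = distribution_level(processing_sites=1, data_sites=1)
file_server_lan = distribution_level(processing_sites=5, data_sites=1)
```

A mainframe with dumb terminals falls under SPSD, while several workstations sharing one file server over a LAN fall under MPSD.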
Figure 10.7
Figure 10.8 Heterogeneous Distributed Database Scenario
Typically, distributed DBs:
Geographically distributed
Data sharing is the goal (may run into heterogeneity, autonomy)
Disconnected operation possible
1. Local Site Independence
2. Central Site Independence
3. Failure Independence
4. Location Transparency
5. Fragmentation Transparency
6. Replication Transparency
7. Distributed Query Processing
8. Distributed Transaction Processing
9. Hardware Independence
10. Operating System Independence
11. Network Independence
12. Database Independence