8/8/2019 Distributed Database Systems Lecture No 1
NEED FOR A DISTRIBUTED DATABASE SYSTEM
Organization size
Organization branches
Computerized information system at each site
The same data held at different locations
  - How to best utilize the resources?
  - How to maintain the integrity of data at different locations?
OUTLINE
Introduction
  - What is a distributed DBMS
  - Problems
  - Current state-of-affairs
Background
Distributed DBMS Architecture
Distributed Database Design (Briefly)
Distributed Query Processing (Briefly)
Distributed Transaction Management (Extensive)
Mobile Database Systems (research paper based)
Privacy, Trust, and Authentication (research paper based)
Peer to Peer Systems (research paper based)
FILE SYSTEMS
[Figure: three programs, each with its own data description and its own file]
program 1 / data description 1 / File 1
program 2 / data description 2 / File 2
program 3 / data description 3 / File 3
DATABASE MANAGEMENT
[Figure: application programs 1, 2, and 3 (each with data semantics) access a single database through the DBMS, which handles data description, manipulation, and control]
INTEGRATING DATABASES AND COMMUNICATION
[Figure: database technology (integration) combined with computer networks (distribution) yields distributed database systems (integration)]
DISTRIBUTED COMPUTING
A number of autonomous processing elements
(not necessarily homogeneous) that are
interconnected by a computer network and that
cooperate in performing their assigned tasks.
DISTRIBUTED COMPUTING
Synonymous terms
  - distributed data processing
  - multiprocessors/multicomputers
  - satellite processing
  - backend processing
  - dedicated/special purpose computers
  - timeshared systems
  - functionally modular systems
  - Peer to Peer Systems
WHAT IS DISTRIBUTED
Processing logic
Functions
Data
Control
WHAT IS A DISTRIBUTED DATABASE SYSTEM?
A distributed database (DDB) is a collection of multiple,
logically interrelated databases distributed over a
computer network.
A distributed database management system (DDBMS)
is the software that manages the DDB and provides an
access mechanism that makes this distribution
transparent to the users.
Distributed database system (DDBS) = DB +
Communication
WHAT IS NOT A DDBS?
A timesharing computer system
A loosely or tightly coupled multiprocessor system
A database system which resides at one of the nodes of a network of computers - this is a centralized database on a network node
CENTRALIZED DBMS ON A NETWORK
[Figure: Sites 1 to 5 connected by a communication network]
DISTRIBUTED DBMS ENVIRONMENT
[Figure: Sites 1 to 5 connected by a communication network]
SHARED-MEMORY ARCHITECTURE
Examples: symmetric multiprocessors (Sequent, Encore) and some mainframes (IBM 3090, Bull's DPS8)
[Figure: processors P1 ... Pn share one memory M and one disk D]
SHARED-NOTHING ARCHITECTURE
Examples: Teradata's DBC, Tandem, Intel's Paragon, NCR's 3600 and 3700
[Figure: each processor P1 ... Pn has its own memory (M1 ... Mn) and its own disk (D1 ... Dn)]
APPLICATIONS
Manufacturing - especially multi-plant manufacturing
Military command and control
Electronic fund transfers and electronic trading
Corporate MIS
Airline reservations
Hotel chains
Any organization which has a decentralized organization structure
DISTRIBUTED DBMS PROMISES
Transparent management of distributed, fragmented, and replicated data
Improved reliability/availability through distributed transactions
Improved performance
Easier and more economical system expansion
TRANSPARENCY
Transparency is the separation of the higher level semantics of a system from the lower level implementation issues.
Fundamental issue is to provide data independence in the distributed environment
  - Network (distribution) transparency
  - Replication transparency
  - Fragmentation transparency
      horizontal fragmentation: selection
      vertical fragmentation: projection
      hybrid
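The link between fragmentation and the relational operators can be sketched in a few lines of Python. This is only an illustration: the sample rows, the split predicates, and the helper names (`horizontal_fragment`, `vertical_fragment`, and the two rebuild functions) are assumptions made up for the example, not part of the lecture.

```python
# Toy EMP relation; the rows are invented sample data.
EMP = [
    {"ENO": "E1", "ENAME": "J. Doe", "TITLE": "Elect. Eng."},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "Syst. Anal."},
    {"ENO": "E3", "ENAME": "A. Lee", "TITLE": "Mech. Eng."},
]


def horizontal_fragment(relation, predicate):
    """Selection: keep whole rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, attributes):
    """Projection: keep only the named columns (the key must be among them)."""
    return [{a: row[a] for a in attributes} for row in relation]


def rebuild_from_horizontal(*fragments):
    """Horizontal fragments are recombined by union."""
    return [row for fragment in fragments for row in fragment]


def rebuild_from_vertical(left, right, key):
    """Vertical fragments are recombined by joining on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Two horizontal fragments defined by complementary predicates.
emp_h1 = horizontal_fragment(EMP, lambda r: r["ENO"] <= "E2")
emp_h2 = horizontal_fragment(EMP, lambda r: r["ENO"] > "E2")

# Two vertical fragments; both keep the key ENO so they can be rejoined.
emp_v1 = vertical_fragment(EMP, ["ENO", "ENAME"])
emp_v2 = vertical_fragment(EMP, ["ENO", "TITLE"])
```

Fragmentation transparency means the user keeps querying EMP and never has to name `emp_h1` or `emp_v2`; the system applies the union or join on the user's behalf. A hybrid scheme would apply one kind of split to the output of the other.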
TRANSPARENT ACCESS
SELECT ENAME, SAL
FROM EMP, ASG, PAY
WHERE DUR > 12
AND EMP.ENO = ASG.ENO
AND PAY.TITLE = EMP.TITLE
[Figure: sites Tokyo, Boston, Paris, Montreal, and New York connected by a communication network; each site stores some of the employee, assignment, and project fragments (for example Paris employees, Boston projects, New York projects with budget > 200000), and several fragments appear at more than one site]
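The query on this slide names only relations and attributes, never a site. As a rough sketch, it can be run with Python's standard sqlite3 module against a single in-memory database that stands in for the distributed one. The sample rows and salary figures are invented for the example, and the column types are assumptions; a real DDBMS would locate and combine the fragments behind the same query text.

```python
import sqlite3

# One in-memory database stands in for the whole distributed database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
CREATE TABLE ASG (ENO TEXT, PNO TEXT, RESP TEXT, DUR INTEGER,
                  PRIMARY KEY (ENO, PNO));
CREATE TABLE PAY (TITLE TEXT PRIMARY KEY, SAL INTEGER);

-- Invented sample data.
INSERT INTO EMP VALUES ('E1', 'J. Doe', 'Elect. Eng.');
INSERT INTO EMP VALUES ('E2', 'M. Smith', 'Syst. Anal.');
INSERT INTO ASG VALUES ('E1', 'P1', 'Manager', 24);
INSERT INTO ASG VALUES ('E2', 'P1', 'Analyst', 6);
INSERT INTO PAY VALUES ('Elect. Eng.', 40000);
INSERT INTO PAY VALUES ('Syst. Anal.', 34000);
""")

# The query from the slide, unchanged: no site names appear in it.
rows = conn.execute("""
SELECT ENAME, SAL
FROM EMP, ASG, PAY
WHERE DUR > 12
  AND EMP.ENO = ASG.ENO
  AND PAY.TITLE = EMP.TITLE
""").fetchall()

print(rows)
```

Only the employee whose assignment lasts longer than 12 months is returned, together with the salary looked up through the title.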
OUTLINE
Introduction
Background
Distributed DBMS Architecture
Distributed Database Design (Briefly)
Distributed Query Processing (Briefly)
Distributed Transaction Management (Extensive)
Building Distributed Database Systems (RAID)
Mobile Database Systems
Privacy, Trust, and Authentication
Peer to Peer Systems
DISTRIBUTED DATABASE - USER VIEW
[Figure: Distributed Database]
DISTRIBUTED DBMS - REALITY
[Figure: user queries and user applications reach DBMS software running at several sites, which are linked by a communication subsystem]
POTENTIALLY IMPROVED PERFORMANCE
Proximity of data to its points of use
  - Requires some support for fragmentation and replication
Parallelism in execution
  - Inter-query parallelism
  - Intra-query parallelism
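The two kinds of parallelism can be contrasted with a small Python sketch. Everything here is an assumption made for illustration: the fragment contents, the site names, and the helper functions are hypothetical, and a thread pool on one machine only mimics the structure of work that a DDBMS would spread across sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal fragments of ASG as (ENO, DUR) pairs, one per site.
FRAGMENTS = {
    "site1": [("E1", 24), ("E2", 6)],
    "site2": [("E3", 18), ("E4", 48)],
}


def count_long_assignments(rows, min_dur=12):
    """Local subquery: count assignments longer than min_dur months."""
    return sum(1 for _, dur in rows if dur > min_dur)


def intra_query(fragments):
    """One query is split into per-fragment subqueries that run
    concurrently; the partial counts are then added together."""
    with ThreadPoolExecutor() as pool:
        partial = list(pool.map(count_long_assignments, fragments.values()))
    return sum(partial)


def inter_query(fragments):
    """Two unrelated queries run at the same time, each on its own site."""
    queries = [
        lambda: count_long_assignments(fragments["site1"]),
        lambda: max(dur for _, dur in fragments["site2"]),
    ]
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(query) for query in queries]
        return [future.result() for future in futures]
```

In `intra_query` the answer to a single question is assembled from pieces computed side by side; in `inter_query` the queries have nothing to do with each other and simply do not wait for one another.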
SYSTEM EXPANSION
Issue is database scaling
Peer to Peer systems
Communication overhead
DISTRIBUTED DBMS ISSUES
Distributed Database Design
  - how to distribute the database
  - replicated & non-replicated database distribution
  - a related problem in directory management
Query Processing
  - convert user transactions to data manipulation instructions
  - optimization problem
  - min{cost = data transmission + local processing}
  - general formulation is NP-hard
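The cost expression above can be made concrete with a toy comparison of two execution strategies. This is a sketch under stated assumptions: the `Strategy` class, the tuple counts, and the unit costs are all hypothetical, and a real optimizer works from catalog statistics over a far larger space of plans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    tuples_shipped: int    # tuples sent over the network
    tuples_processed: int  # tuples handled by local processing


def cost(strategy, ship_cost=10.0, proc_cost=1.0):
    """cost = data transmission + local processing (unit costs are assumed)."""
    return (strategy.tuples_shipped * ship_cost
            + strategy.tuples_processed * proc_cost)


def cheapest(strategies, **unit_costs):
    """Pick the strategy with the minimum total cost."""
    return min(strategies, key=lambda s: cost(s, **unit_costs))


# Two hypothetical ways to join EMP with a remote ASG relation.
STRATEGIES = [
    Strategy("ship whole ASG to the EMP site",
             tuples_shipped=1000, tuples_processed=1400),
    Strategy("filter ASG locally, ship matches",
             tuples_shipped=50, tuples_processed=1450),
]

print(cheapest(STRATEGIES).name)
```

With an expensive network the second strategy wins (1950 against 11400), but if transmission were free the first would be cheaper, which is why the choice depends on the cost model and not on the query alone.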
DISTRIBUTED DBMS ISSUES
Concurrency Control
  - Synchronization of concurrent accesses
  - Consistency and isolation of transactions' effects
  - Deadlock management
Reliability
  - How to make the system resilient to failures
  - Atomicity and durability
Privacy/Security
  - Keep database access private
  - Protect against malicious activities
Trusted Collaborations (Emerging requirements)
  - Evaluate trust among users and database sites
  - Enforce policies for privacy
  - Enforce integrity
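One reason deadlock management is harder in a distributed setting is that a deadlock can span sites without being visible at any single one. The sketch below uses a wait-for graph, a common textbook device that the slide itself does not name; the transaction names, the two site graphs, and the helper functions are assumptions made for the example.

```python
def merge_wait_for(*local_graphs):
    """Combine per-site wait-for graphs into one global graph."""
    merged = {}
    for graph in local_graphs:
        for txn, blockers in graph.items():
            merged.setdefault(txn, set()).update(blockers)
    return merged


def find_deadlock(wait_for):
    """Return the transactions on some cycle of the graph, or None."""
    visiting, done, path = set(), set(), []

    def visit(txn):
        visiting.add(txn)
        path.append(txn)
        for other in wait_for.get(txn, ()):
            if other in visiting:
                # Back edge: everything from `other` onward forms a cycle.
                return path[path.index(other):]
            if other not in done:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        visiting.discard(txn)
        done.add(txn)
        return None

    for txn in list(wait_for):
        if txn not in done:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None


# Hypothetical situation: T1 waits for T2 at site 1, T2 waits for T1 at site 2.
site1 = {"T1": {"T2"}}
site2 = {"T2": {"T1"}}
global_graph = merge_wait_for(site1, site2)
```

Neither `site1` nor `site2` contains a cycle on its own, yet `find_deadlock(global_graph)` reports that T1 and T2 block each other, so some form of cross-site coordination is needed.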
RELATIONSHIP BETWEEN ISSUES
[Figure: diagram relating Directory Management, Query Processing, Distribution Design, Reliability, Concurrency Control, and Deadlock Management]
RELATED ISSUES
Operating System Support
  - operating system with proper support for database operations
  - dichotomy between general purpose processing requirements and database processing requirements
Open Systems and Interoperability
  - Distributed Multidatabase Systems
  - More probable scenario
  - Parallel issues
Network Behavior
OUTLINE
Introduction
Background
Distributed DBMS Architecture
  - Introduction to Database Concepts
      Architecture, Schema, Views
  - Alternatives in Distributed Database Systems
  - Datalogical Architecture
  - Implementation Alternatives
  - Component Architecture
Distributed Database Design (Briefly)
Distributed Query Processing (Briefly)
Distributed Transaction Management (Extensive)
Building Distributed Database Systems (RAID)
Mobile Database Systems
Privacy, Trust, and Authentication
Peer to Peer Systems
ARCHITECTURE OF A DATABASE SYSTEM
Background materials of database architecture
Defines the structure of the system
  - components identified
  - functions of each component defined
  - interrelationships and interactions between components defined
ANSI/SPARC ARCHITECTURE
[Figure: three levels. Users see external views, each defined by an external schema; the conceptual view is defined by the conceptual schema; the internal view is defined by the internal schema]
STANDARDIZATION
Reference Model
  - A conceptual framework whose purpose is to divide standardization work into manageable pieces and to show at a general level how these pieces are related to one another.
Approaches
  - Component-based
      Components of the system are defined together with the interrelationships between components.
      Good for design and implementation of the system.
  - Function-based
      Classes of users are identified together with the functionality that the system will provide for each class.
      The objectives of the system are clearly identified. But how do you achieve these objectives?
  - Data-based
      Identify the different types of describing data and specify the functional units that will realize and/or use data according to these views.
CONCEPTUAL SCHEMA DEFINITION
RELATION EMP [
    KEY = {ENO}
    ATTRIBUTES = {
        ENO : CHARACTER(9)
        ENAME : CHARACTER(15)
        TITLE : CHARACTER(10)
    }
]
RELATION PAY [
    KEY = {TITLE}
    ATTRIBUTES = {
        TITLE : CHARACTER(10)
        SAL : NUMERIC(6)
    }
]
CONCEPTUAL SCHEMA DEFINITION
RELATION PROJ [
    KEY = {PNO}
    ATTRIBUTES = {
        PNO : CHARACTER(7)
        PNAME : CHARACTER(20)
        BUDGET : NUMERIC(7)
    }
]
RELATION ASG [
    KEY = {ENO,PNO}
    ATTRIBUTES = {
        ENO : CHARACTER(9)
        PNO : CHARACTER(7)
        RESP : CHARACTER(10)
        DUR : NUMERIC(3)
    }
]
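For readers more used to SQL, the four relation definitions can be approximated as CREATE TABLE statements and loaded with Python's sqlite3 module. The mapping is an assumption, not the lecture's notation: KEY becomes PRIMARY KEY and the CHARACTER/NUMERIC types are carried over by name only, since SQLite accepts the declared lengths but does not enforce them.

```python
import sqlite3

# SQL rendering of the conceptual schema (an approximation of the
# RELATION ... [ KEY ... ATTRIBUTES ... ] notation used on the slides).
DDL = """
CREATE TABLE EMP (
    ENO   CHAR(9) PRIMARY KEY,
    ENAME CHAR(15),
    TITLE CHAR(10)
);
CREATE TABLE PAY (
    TITLE CHAR(10) PRIMARY KEY,
    SAL   NUMERIC(6)
);
CREATE TABLE PROJ (
    PNO    CHAR(7) PRIMARY KEY,
    PNAME  CHAR(20),
    BUDGET NUMERIC(7)
);
CREATE TABLE ASG (
    ENO  CHAR(9),
    PNO  CHAR(7),
    RESP CHAR(10),
    DUR  NUMERIC(3),
    PRIMARY KEY (ENO, PNO)  -- composite key, as in KEY = {ENO,PNO}
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# List the tables that now exist in the database catalog.
tables = sorted(
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
print(tables)
```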
INTERNAL SCHEMA DEFINITION
RELATION EMP [
    KEY = {ENO}
    ATTRIBUTES = {
        ENO : CHARACTER(9)
        ENAME : CHARACTER(15)
        TITLE : CHARACTER(10)
    }
]
INTERNAL_REL EMPL [
    INDEX ON E# CALL EMINX
    FIELD = {
        HEADER : BYTE(1)
        E# : BYTE(9)
        ENAME : BYTE(15)
        TIT : BYTE(10)
    }
]
EXTERNAL VIEW DEFINITION EXAMPLE 1
Create a BUDGET view from the PROJ relation
CREATE VIEW BUDGET(PNAME, BUD)
AS SELECT PNAME, BUDGET
FROM PROJ
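The BUDGET view can be tried out with Python's sqlite3 module. This is a minimal sketch: the PROJ rows are invented sample data, the column types are assumptions, and the column list in CREATE VIEW needs a reasonably recent SQLite (3.9 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROJ (PNO TEXT PRIMARY KEY, PNAME TEXT, BUDGET INTEGER);

-- Invented sample data.
INSERT INTO PROJ VALUES ('P1', 'Instrumentation', 150000);
INSERT INTO PROJ VALUES ('P2', 'Database Develop.', 135000);

-- The view from the slide: it exposes two columns and renames BUDGET to BUD.
CREATE VIEW BUDGET (PNAME, BUD)
AS SELECT PNAME, BUDGET
FROM PROJ;
""")

cursor = conn.execute("SELECT * FROM BUDGET ORDER BY PNAME")
budget_columns = [description[0] for description in cursor.description]
budget_rows = cursor.fetchall()

print(budget_columns)
print(budget_rows)
```

Users of the view see PNAME and BUD only; the project number PNO stays hidden at the external level.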
EXTERNAL VIEW DEFINITION EXAMPLE 2
Create a Payroll view from relations EMP and PAY
CREATE VIEW PAYROLL (ENO, ENAME, SAL)
AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
FROM EMP, PAY
WHERE EMP.TITLE = PAY.TITLE
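The payroll view joins two base relations, which can be checked with a short sqlite3 sketch. The rows and salary figures are invented, and the column types are assumptions; the view definition itself follows the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
CREATE TABLE PAY (TITLE TEXT PRIMARY KEY, SAL INTEGER);

-- Invented sample data.
INSERT INTO EMP VALUES ('E1', 'J. Doe', 'Elect. Eng.');
INSERT INTO EMP VALUES ('E2', 'M. Smith', 'Syst. Anal.');
INSERT INTO PAY VALUES ('Elect. Eng.', 40000);
INSERT INTO PAY VALUES ('Syst. Anal.', 34000);

CREATE VIEW PAYROLL (ENO, ENAME, SAL)
AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
FROM EMP, PAY
WHERE EMP.TITLE = PAY.TITLE;
""")

payroll_before = conn.execute("SELECT * FROM PAYROLL ORDER BY ENO").fetchall()

# A view stores no data of its own: changing PAY changes what PAYROLL shows.
conn.execute("UPDATE PAY SET SAL = 36000 WHERE TITLE = 'Syst. Anal.'")
payroll_after = conn.execute("SELECT * FROM PAYROLL ORDER BY ENO").fetchall()

print(payroll_before)
print(payroll_after)
```

The second result reflects the raise without the view being redefined, which is the sense in which an external view is derived from the conceptual schema.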