D D B B S S Y Y S S T T E E M M S S Distributed Database Systems University of Manitoba Asper School of Business 3500 DBMS Bob Travica Updated 2013 1 of 24
DDBB
SSYYSSTTEEMMSS
Distributed Database
Systems
University of ManitobaAsper School of Business
3500 DBMSBob Travica
Updated 2013
1 of 24
DDBB
SSYYSSTTEEMMSS
Distributed Databases (DDB) Concept
Britain
Germany
France
Italy
SELECT SalesFROM Britain.SalesUNIONSELECT SalesFROM France.SalesUNIONSELECT SalesFROM Italy.Sales
Multiple independent systems
Each has DBMS engine, queries, locking,
transactinos, etc.
Usually on different machines & locations
Can be different hardware, OS, software.
2 of 24
DDBB
SSYYSSTTEEMMSS
DatabaseZeus
DatabaseApollo
DatabaseAthena
United States
England
France
• There must be different databases as opposed to a central database (teleprocessing)
• Network is a part of DDB system
Distributed Database System (DDBS) Concept
3 of 24
DDBB
SSYYSSTTEEMMSS
Teleprocessing vs. DDBS
DB
Loc 1, central
Loc 2 Loc 3
DBLoc1
DBLoc2
Teleprocessing DB System
Distributed DB System
4 of 24
DDBB
SSYYSSTTEEMMSS
DDBS Rules
Golden Rule - Transparency: the user should not know or care that the database is distributed and how. User is independent of any constraint.
Specifically: No reliance on a central site. Continuous operation. Location independence. Fragmentation (Partitioning)
independence Replication independence.
Support to distributed query processing.
Support to distributed transaction management.
Hardware independence. Operating system
independence. Network independence. DBMS independence.
(C.J. Date)
5 of 24
DDBB
SSYYSSTTEEMMSS
DDBS Features and Benefits
Each database can continue to run even if a portion fails.
Performance gains in contrast to teleprocessing.
Data and hardware can be moved without affecting
operations or users.
Operations expansion possible
System expansion and upgrades possible
6 of 24
DDBB
SSYYSSTTEEMMSS
More on DDBS Benefits
Support to operations’ dispersion Work and data are
segmented on departments. Work and data are
geographically segmented. Improved local performance
Most updates and queries are performed locally.
Local control and responsibility over data.
Can still combine data across the system.
Scalability and expansion support
Localtransactions
Futureexpansion
Combinedata
7 of 24
DDBB
SSYYSSTTEEMMSS
DDBS Designs
Two typical designs
PartitionedReplicatedPartitioned-
Replicated HybridTeleprocessing
Technical & economic requirements highlow
PartitionReplication C
B
A
C
B
AC
B
A
C
B
A
C
B
A
Loc 1 Loc 2 Loc 3
8 of 24
Loc 1
Loc 2
Loc 3
DDBB
SSYYSSTTEEMMSS
Choose hardware and DBMS vendor, and network.
Set up network and DBMS connections.
Choose locations for data segments.
Create backup plan and strategy.
Design local views (customized queries).
Design stress tests (max. loads, failure cases).
Creating Distributed Database System
9 of 24
DDBB
SSYYSSTTEEMMSS
Query Processing in Partitioned DDBS
Networks (LANs, WANs) infleunce the
speed of data transfer.
Goal is to minimize data transfers
Each system must be capable of
evaluating queries.
Results depend heavily on how the
system joins tables.
Disk driveLAN
WAN
10 of 24
DDBB
SSYYSSTTEEMMSS
Query Optimization in Partitioned DBS
Example NY: tbl.Customers: 1 M rows (Marketing) LA: tbl.Product: 10 M rows (Production) Chicago: tbl.Sale: 20 M rows (main Retail op’s) Query in Chicago: List customer and product data for
blue products sold on 1-Mar-2001. Must join the 3 tables.
Database is partitioned according to spatial dispersion of core operations (above). Only sales data are at the locale where query is run (Chicago).
Bad query design : Transfer entire tables Product and Customer to
Chicago
11 of 24
DDBB
SSYYSSTTEEMMSS
Customer(C#, …) 1,000,000 rows
NY
Product(P#, Color…) 10,000 rows
Sale(S#, C#, Date) 20,000,000 rowsSaleItem(S#, P#,…) 50,000,000 rows
Chicago
LA
returns requestedcustomerrecords
Better query design
qry1:SELECT C#FROM SaleWHERE Date=3/1/2006;
qry2:SELECT P#FROM SaleItem INNER JOIN Sale ON SaleItem.S#=Sale.S# WHERE Date=3/1/2006;
qry3:SELECT * FROM Customer WHERE C# in qry1’s output;
qry4:SELECT * FROM ProductWHERE P# in qry2’s outputAND Color like “blue”;
requests customerrecords for certain
customer IDs
requests product records
for certain Product IDs
returns requestedproductrecords
12 of 24
DDBB
SSYYSSTTEEMMSS
Replicated DDBS Design
Rule of thumb: Keep the data close to the location at which they are used.
Goals: Minimize transmissions Improve performance
Replication Manager
Britain: Customers& Sales
Spain: Customers& Sales
Britain
Britain: Customers& Sales
Spain: Customers& Sales
Spain
Periodic (batch) updates
Decision support systems:
Use replicated data warehouse.
13 of 24
DDBB
SSYYSSTTEEMMSS
Design: Partitioned database
The concurrency issues that apply to centralized database
become even more difficult in a partitioned database
Challenges: Slow network traffic Site unavailable Concurrency (more people trying to change the same data at the
same time) => locks management
DDBS and Concurrent Access
14 of 24
DDBB
SSYYSSTTEEMMSS
Two-Phase Commit Ensuring data consistency across the
system. In updating data, DBMS initiating
update becomes Coordinator. Two phases:
1. Get Ready: Coordinator sends new data and a request for update to all DDB parts.
DB partitions store new data in logs, and report update status.
2. Execute: 2.1 If all DDB parts ready for update,
Coordinator requests COMMIT. 2.2 If any DDB part not ready,
Coordinator requests ROLLBACK, and the update procedure repeats.
DB Partition 1 (Master price list)Initiate Transaction (Reduce price of
cookies for 5%)
DB Partition 2(Store A Product tbl) Partition 3
(Store B Product tbl)
1. Prepare to update.All agree?
2. CommitOK?
OK!
Go!
Coordinator
15 of 24
DDBB
SSYYSSTTEEMMSS
Client-Server DDB
Server performs database tasks at request of clients.
Clients handle front-end tasks and small data tables that are not shared.
Database Server
Client Front-end
(Forms, Queries,Reports)
QueryRequest
Query Output
Database ServerUpdateRequest
Update Response
16 of 24
DDBB
SSYYSSTTEEMMSS
Three-Tier Client-Server
Clients
Middleware
DatabaseServers
Front-end
DatabasesTransactionsLegacy systems
Database linksBusiness rulesCode
17 of 24
DDBB
SSYYSSTTEEMMSS
Client-Server Architecture for Internet
Client Browsers
DBS Middleware
DatabaseServers
Web Server
Application Server
Other Systems
18 of 24
DDBB
SSYYSSTTEEMMSS
Technologies for DDB Used in E-Commerce
- Open Database Connectivity (ODBC)
- Microsoft’s Active Data Objects (ADO)
- Client--Web Server Connectors (CGI)
- Data transfer (XML, SOAP)
19 of 24
DDBB
SSYYSSTTEEMMSS
Open Database Connectivity: ODBC
Since 1992, SQL Access Group; Microsoft’s decisive support
Provides SQL access to different DBMSes
ODBC handles: Login to database. Send query. Interpret result. Exchange data.
Oracle Database
Application
ODBC driver
CLIENT
ODBC driver
SE
LEC
T …
O
utpu
t
SQLDatabase
ODBC driver
ODBC driver
SERVER
SERVER
20 of 24
DDBB
SSYYSSTTEEMMSS
Active Data Objects (ADO)
Relational Database
Server
Front end (report)
ADO objects
ClientS
ELE
CT
…
Out
put
• ADO - Microsoft object-based technology for accessing data
• Data access at records level (“cursor programming”) via classes:
• recordset - similar to table• rowset - looser structure
Non-relationalDatabase
Server
• Builds on concepts of OO and cursor (the space for storing records in DB systems)
• Used in Active Server Pages (ASP)*
21 of 24
DDBB
SSYYSSTTEEMMSS
Client—Web Server Connectors
- HTML input needs to be translated for servers down the stream
- Output needs to be put back into HTML format
- Common Gateway Interface (e.g., CGI programming in PERL)
22 of 24
DDBB
SSYYSSTTEEMMSS
Communication technology -Extensible Markup Language (XML) &
SOAP
- SQL can extract DB data into XML documents. So, any database schema can be described => document transfer, validation and format support (e.g., supplier documents). - XML requires additional software for extracting data (parser) and formatting display (style sheets).
- Note: Document Type Definition (Declaration)
23 of 24
DDBB
SSYYSSTTEEMMSS
Extensible Markup Language (XML) & SOAP
- SOAP (Simple Object Access Protocol) - protocol for exchanging structured (marked up) data in distributed environment
- SOAP uses XML technologies to define an extensible
messaging framework, defining message that can
be exchanged over a variety of underlying protocols
(started with HTTP, hence “Simple”; complex indeed).
(See: http://www.w3.org/TR/2003/REC-soap12-part1-20030624/#encapsulation )
- SOAP and XML provide foundations for Web services – distributed application software, stored across nets, running on different operating systems and devices, written in different programming languages, and made by different vendors (e.g., .NET)
( More... )
24 of 24