SFDV3007 Chapter 2: Distributed Data Management

SFDV3007 Chapter 2: Distributed Data Management. Overview of Chapter 2 Distributed information systems Client/server systems XML and its applications.

Dec 28, 2015


Neal Cobb

SFDV3007

Chapter 2: Distributed Data Management


Overview of Chapter 2

• Distributed information systems
• Client/server systems
• XML and its applications
• Distributed database systems


References

• Kifer chapters 15, 16, 23, 24, (25)
• Silberschatz chapters 10, 18, 19, 21
• Mannino chapter 17
• Date (An Introduction to Database Systems, 6th ed.) chapter 21
• INFO 323 (distributed processing)
• “Oracle DBA” = Oracle10g Administrator’s Guide, Part VII
• “Heterogeneous Connectivity” = Oracle10g Heterogeneous Connectivity Administrator’s Guide


Distributed information systems

2.1 A brief overview & definition of some terms


We are here…

• Distributed information systems
• Client/server systems
• XML and its applications
• Distributed database systems


Timeline
• CPU and memory related technology
• Offline storage
• Primary mode of user interaction (e.g. GUI mainstream since 1984)
• Programming languages
• Data storage & management
• Hardware (boxes)
• Degree of centralisation
• Primary data processing style


Databases form the “back end” of IS
• Controlled by DBMS.
• Integrity code ideally in database or at least in one location
  – includes triggers, stored procedures, …
  – applications cannot bypass
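A minimal sketch of "integrity code in the database", using Python's built-in `sqlite3` module as a stand-in DBMS. The `account` table, the 500 withdrawal limit, and the `withdraw` function are all hypothetical and not from the course material. The front end does no checking of its own, yet it cannot bypass the CHECK constraint or the trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- declarative rule
    );
    -- Procedural rule stored in the database (hypothetical limit of 500)
    CREATE TRIGGER no_large_withdrawal
    BEFORE UPDATE OF balance ON account
    WHEN OLD.balance - NEW.balance > 500
    BEGIN
        SELECT RAISE(ABORT, 'withdrawal limit exceeded');
    END;
    INSERT INTO account VALUES (1, 1000);
""")

def withdraw(amount):
    """A 'front end' that performs no validation of its own."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = 1",
                (amount,),
            )
        return "ok"
    except sqlite3.DatabaseError as exc:
        return f"rejected: {exc}"

# 200 passes; 600 trips the trigger; 400 passes; 450 would go negative.
results = [withdraw(200), withdraw(600), withdraw(400), withdraw(450)]
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(results, balance)
```

Because both rules live in one location, a second application written in another language would be subject to exactly the same checks.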


Applications form the “front end” of IS

• Data entry, validation, query formulation, data retrieval and processing, information display.

• Commonly written in:
  – 3GLs (COBOL, C++, Java)
  – RAD (Rapid Application Development) tools (Developer, Visual Studio, …)
• Most code external to DBMS.
• Originally all code was external to the DBMS, but this has shifted with the introduction of stored procedures.


The database and application layers


We can distribute processing

(Kifer §23.2; Silberschatz §18.1)

• Multiple independent, interconnected, cooperating computers.

• Processors may cooperate and/or data are distributed across various machines.

• Multiple processors on different machines cooperate to carry out a task.


Distributed Computing or Distributed Processing

Shares the database’s logical processing among physically independent, networked sites


We can distribute data

(Kifer pp. 687–688; Silberschatz §18.4)

• Data at multiple locations on the network.

• Compare with physical partitioning.

• Does not necessarily imply a “distributed database”.

• Often easier to distribute processing.


We can distribute the database

(Kifer §16.1; Silberschatz §19.1)

• Stored on and managed by computers at several sites on a network.

• Distribution of data and database processing.

• Ideally a single logical database.


What is a Distributed Database System?

Stores a logically related database across physically independent sites


We are here…

• Distributed information systems

• XML and its applications

• Distributed database systems


XML and its applications

2.3 Current developments in distributed data management


XML References

• Kifer ch. 15
• Silberschatz ch. 10
• XML in 10 Points <http://www.w3.org/XML/1999/XML-in-10-points>
• A Technical Introduction to XML <http://www.xml.com/>
• Web standard specifications
  – World Wide Web Consortium (W3C) <http://www.w3.org/>
  – Organization for the Advancement of Structured Information Standards (OASIS) <http://www.oasis-open.org/>


XML = Extensible Markup Language

(Kifer pp. 582–585; Silberschatz §10.1)

• Text-based markup language with user-definable tag sets.

• In HTML, the tag set is fixed.
• Used to:
  – create domain-specific markup languages
  – exchange data

• Not really intended for humans to read.
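To make "user-definable tag sets" concrete, here is a small sketch that invents a tag set for house listings and reads it back with Python's standard `xml.etree.ElementTree` parser. The element names (`catalogue`, `house`, `address`, `asking_price`) and the data are made up for illustration only.

```python
import xml.etree.ElementTree as ET

# A domain-specific tag set: none of these tags are predefined by XML itself.
doc = """
<catalogue>
  <house id="h1">
    <address>12 Example Street</address>
    <asking_price currency="NZD">450000</asking_price>
  </house>
  <house id="h2">
    <address>3 Sample Road</address>
    <asking_price currency="NZD">620000</asking_price>
  </house>
</catalogue>
""".strip()

root = ET.fromstring(doc)

# Any XML-aware program can walk the hierarchy without knowing the tags in advance.
prices = {
    house.get("id"): int(house.findtext("asking_price"))
    for house in root.iter("house")
}
print(prices)
```

The same document could be exchanged between two systems that agree only on the tag names, which is the "exchange data" use listed above.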


What is a markup language?

• Embedded “commands” within text.
• Examples: HTML, LaTeX, WordPerfect.
• Possible uses:
  – specifying semantics
  – specifying document structure
  – specifying visual formatting
  – structure/meaning vs. presentation


XML has a long history

1969: IBM introduces Generalised Markup Language (GML); first use of <tag> </tag>.
1986: Standard Generalised Markup Language (SGML) defines different document types (DTD).
1990–1999: HyperText Markup Language (HTML) v1.0–4.01; an SGML document type.
1996–1998: XML 1.0; a simplified form of SGML.
2000: XHTML; HTML as an XML document type.


Why XML?

(Kifer §15.1; Silberschatz §10.1)

• HTML more for display:
  – mixes structure, meaning, formatting
  – difficult to query/manipulate HTML documents
  – XML separates content from presentation
• No predefined tags (cf. HTML).
• Free-form data storage.
• Typically self-documenting.
• Plain text, hierarchical structure ⇒ very easy to process.


What XML is
• “SGML (Standard Generalized Markup Language) lite”.
• Plain text (easy to manipulate).
• Free-form.
• Extensible (define your own elements).
• Content-neutral markup, which allows multi-channel publishing into a variety of external container formats.
• Hierarchically structured.
• Extremely verbose (deliberately).


What XML isn’t
• A language
  – XML is not a language in the programming sense, at least. It’s really more of a mechanism for specifying different varieties of markup.
• A replacement for HTML
• Intended for human consumption
• A database
• A silver bullet (that is, a straightforward solution perceived to be extremely effective)


XML = content, XSL = presentation

(Kifer §15.4.2; Silberschatz §10.4.2; Example 2–10)

• Extensible Stylesheet Language.
• Specifies document formatting (appearance).
• Separate from XML documents.
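The separation of content from presentation can be illustrated without an XSL processor. Python's standard library does not include one, so in this sketch two plain functions play the role that stylesheets would play. The `staff` document and both renderers are hypothetical. The point is only that one XML source feeds several output formats.

```python
import xml.etree.ElementTree as ET

# Content only: no formatting information appears in the XML.
content = ET.fromstring(
    "<staff>"
    "<person><name>Ana</name><role>DBA</role></person>"
    "<person><name>Ben</name><role>Developer</role></person>"
    "</staff>"
)

def as_html(root):
    """One 'presentation': an HTML bullet list."""
    items = "".join(
        f"<li><b>{p.findtext('name')}</b> ({p.findtext('role')})</li>"
        for p in root.iter("person")
    )
    return f"<ul>{items}</ul>"

def as_text(root):
    """Another 'presentation' of the same content: plain text lines."""
    return "\n".join(
        f"{p.findtext('name')}: {p.findtext('role')}" for p in root.iter("person")
    )

html_view = as_html(content)
text_view = as_text(content)
print(html_view)
print(text_view)
```

In a real XSL setup each function would be a separate stylesheet file, so the appearance could change without touching the XML document.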


XML can be used for many things

• Domain specific markup languages:
  – XHTML (HTML as an XML document type)
  – SVG (Scalable Vector Graphics)
  – MathML (mathematical formulae)
  – ChemML (chemical industry)
• Dynamic document publication.
  – a single source and multiple target formats
• Data interchange.


XML can be used for many things

• Metadata.
  – Metadata specifications typically use RDF (Resource Description Framework), which is a major component of the Semantic Web.
• Aggregating data from multiple sources.
• Document storage and manipulation.
  – Document storage implies a database. Document manipulation implies the ability to transform XML documents into other forms.


But XML has its problems too…

• Extremely (deliberately) verbose ⇒ bloated files.

• Hierarchical structure not suited to all applications.

• TMA: Too Many Acronyms (XML, XSL, XSLT, DTD, CSS, HTML, XHTML, DOM, …argh!)

• Too many specifications (XML, XPath, XQuery, XML Schema, XPointer, XLink, XForms, XHTML, XML Encryption, web services, …ARGH!)
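The verbosity point can be checked with a rough measurement. This sketch serialises the same two made-up records as XML and as CSV and compares the byte counts. The records and tag names are assumptions, and the exact ratio will vary with the data, so treat the number as indicative only.

```python
import csv
import io
import xml.etree.ElementTree as ET

rows = [("h1", "12 Example Street", 450000), ("h2", "3 Sample Road", 620000)]

# XML: every value is wrapped in an opening and a closing tag.
root = ET.Element("houses")
for house_id, address, price in rows:
    house = ET.SubElement(root, "house", id=house_id)
    ET.SubElement(house, "address").text = address
    ET.SubElement(house, "asking_price").text = str(price)
xml_bytes = ET.tostring(root, encoding="utf-8")

# CSV: the same values with only commas and newlines as overhead.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_bytes = buffer.getvalue().encode("utf-8")

ratio = len(xml_bytes) / len(csv_bytes)
print(len(xml_bytes), len(csv_bytes), round(ratio, 1))
```

The trade-off is that the CSV carries no field names or structure, whereas the XML is self-describing.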


Distributed database systems

2.5 Theory & practice


We are here…

• Distributed information systems
• XML and its applications
• Distributed database systems


Recall: We can distribute data

(Kifer pp. 687–688; Silberschatz §18.4)

• Data at multiple locations on the network.

• Does not necessarily imply a “distributed database”.

• Architectures:
  – networked databases
  – federated database
  – “true” distributed database


The networked databases architecture

(Kifer §16.1: “Multiple local schemas”)


Networked databases have many problems

• Heterogeneous data management systems.

• Duplication ⇒ different versions of “same” data.

• Synonyms and homonyms.
• Data “islands”.
• Transfer difficulties.


There are many reasons for heterogeneity

• Historical:
  – “databases” dating back decades
  – “foreign” databases acquired by mergers
• Separate vs. monolithic:
  – performance
  – focus
• “Bottom-up” development.


An example of a distributed database

[Figure: four sites, each running a different DBMS]
• Dunedin: Warehouse, Inventory (SQL Server)
• Wellington: Sales (NI) & Staff (Sybase)
• Auckland: Marketing (Oracle10g)
• Christchurch: Sales (SI) & Service (DB2)


An example of a distributed query

[Figure: tables held at each site]
• Auckland (initiates query)
• Dunedin: Product
• Wellington: Customer (NI), Employee, Order (NI)
• Christchurch: Customer (SI), Order (SI)

SELECT Customer.Name, Product.Name, Employee.Name
FROM Customer, Product, Employee, Order
WHERE Order.Cust_ID = Customer.Cust_ID
  AND Order.Prod_ID = Product.Prod_ID
  AND Order.Emp_ID = Employee.Emp_ID;
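The query on this slide can be simulated on one machine. In the sketch below, each site is a separate in-memory SQLite database attached to a single connection that plays the Auckland site. This is only an analogy for a distributed query, not how Oracle executes one. The sample rows are invented, the table is called `Orders` because `Order` is a reserved word, and the customer IDs are assumed to be unique across the NI and SI fragments.

```python
import sqlite3

# The connection itself plays Auckland, the site that initiates the query.
conn = sqlite3.connect(":memory:")
for site in ("dunedin", "wellington", "christchurch"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {site}")

conn.executescript("""
    CREATE TABLE dunedin.Product       (Prod_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE wellington.Employee   (Emp_ID  INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE wellington.Customer   (Cust_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE wellington.Orders     (Cust_ID INTEGER, Prod_ID INTEGER, Emp_ID INTEGER);
    CREATE TABLE christchurch.Customer (Cust_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE christchurch.Orders   (Cust_ID INTEGER, Prod_ID INTEGER, Emp_ID INTEGER);

    INSERT INTO dunedin.Product       VALUES (10, 'Widget'), (11, 'Gadget');
    INSERT INTO wellington.Employee   VALUES (100, 'Kiri');
    INSERT INTO wellington.Customer   VALUES (1, 'Aroha');
    INSERT INTO wellington.Orders     VALUES (1, 10, 100);
    INSERT INTO christchurch.Customer VALUES (2, 'Sam');
    INSERT INTO christchurch.Orders   VALUES (2, 11, 100);
""")

# Reassemble the NI and SI fragments, then run the join from the slide.
rows = conn.execute("""
    WITH AllCustomer AS (SELECT * FROM wellington.Customer
                         UNION ALL SELECT * FROM christchurch.Customer),
         AllOrders   AS (SELECT * FROM wellington.Orders
                         UNION ALL SELECT * FROM christchurch.Orders)
    SELECT Customer.Name, Product.Name, Employee.Name
    FROM AllCustomer AS Customer,
         dunedin.Product AS Product,
         wellington.Employee AS Employee,
         AllOrders AS Orders
    WHERE Orders.Cust_ID = Customer.Cust_ID
      AND Orders.Prod_ID = Product.Prod_ID
      AND Orders.Emp_ID  = Employee.Emp_ID
    ORDER BY Customer.Name
""").fetchall()
print(rows)
```

In a real DDBMS the fragments sit on different machines, so the optimiser must also decide which rows to ship across the network. That cost does not appear in this single-process sketch.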


An example of a distributed transaction

[Figure: tables held at each site]
• Auckland (initiates transaction)
• Dunedin: Product
• Wellington: Customer (NI), Employee, Order (NI)
• Christchurch: Customer (SI), Order (SI)

SET TRANSACTION READ WRITE;
INSERT INTO Customer VALUES (...);
DELETE FROM Customer WHERE Credit_Limit < 1000;
COMMIT;


Oracle10g provides good support for DDB

Oracle10g Parallel Server
  – Oracle only, not really a true DDBMS

Oracle10g Heterogeneous Services
  – Oracle10g DDBMS
  – Underlying: just about anything as long as an interface exists (see slide **126**)


Oracle’s Heterogeneous Services

(Heterogeneous Connectivity; Oracle DBA)

• Distributed transactions (2PC).
• SQL & data dictionary translation, pass-through SQL.
• Procedural access.
• Global query optimisation.
• Site autonomy.
• Location transparency: synonyms & views.
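The "2PC" in the first bullet is two-phase commit. The following is a textbook-style sketch of the idea, not Oracle's implementation. The `Participant` class and its `can_commit` flag are hypothetical, and logging, timeouts, and recovery are all left out.

```python
class Participant:
    """A site taking part in a distributed transaction (hypothetical model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes or no. A real site would first force its log to disk.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every site votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    if all(votes):
        for p in participants:                   # phase 2: global commit
            p.commit()
        return "committed"
    for p in participants:                       # phase 2: global abort
        p.rollback()
    return "aborted"


happy = [Participant("Wellington"), Participant("Christchurch")]
unhappy = [Participant("Wellington"), Participant("Christchurch", can_commit=False)]
outcome_ok = two_phase_commit(happy)
outcome_bad = two_phase_commit(unhappy)
print(outcome_ok, outcome_bad)
```

The key property is atomicity across sites: either every participant ends up committed or every participant ends up aborted.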


Oracle’s Heterogeneous Services

(Heterogeneous Connectivity, Figure 2–2)


How Heterogeneous Services integrates

• Non-Oracle SQL DBMSs integrated via agents (e.g., Oracle Transparent Gateways).

• Non-SQL DBMSs via direct application programming interfaces (API).

• Call any API as remote procedures.


Heterogeneous Services is built on other features
• Networking through Oracle Net Services (layer over network protocols).
• Oracle Names global directory service (cf. NetWare Directory Services).
• Oracle10g Replication.
• Oracle Transparent Gateways.
• Generic connectivity (ODBC, OLE DB).


Heterogeneous Services architecture


Creating distributed databases in Oracle10g

(Oracle DBA, Figure 29–2)


Creating distributed databases in Oracle10g

(Oracle DBA ch. 29; SQL Reference ◃ “CREATE DATABASE LINK”)

Create a database link:

CREATE DATABASE LINK Christchurch USING 'Christchurch-NTS-07';

Use the database link:

SELECT Address, Asking_Price
FROM Happyhomes.House@Christchurch;


Location transparency in Oracle10g

(Oracle DBA, Figure 30–3)


Location transparency in Oracle10g

(Oracle DBA ch. 30; SQL Reference ◃ “CREATE SYNONYM”)

Use views and/or synonyms:

CREATE SYNONYM Chch_House
  FOR Happyhomes.House@Christchurch;

CREATE VIEW Houses AS
  SELECT * FROM House UNION
  SELECT * FROM Chch_House;
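The same idea can be tried locally with SQLite, which has no synonyms or database links. In this sketch an attached in-memory database stands in for the Christchurch site, and a view stands in for the synonym. SQLite only lets TEMP views reference tables in attached databases, hence `CREATE TEMP VIEW`. The house rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                          # the local site
conn.execute("ATTACH DATABASE ':memory:' AS christchurch")  # stand-in for the remote site

conn.executescript("""
    CREATE TABLE main.House         (Address TEXT, Asking_Price INTEGER);
    CREATE TABLE christchurch.House (Address TEXT, Asking_Price INTEGER);
    INSERT INTO main.House         VALUES ('1 Local Lane', 300000);
    INSERT INTO christchurch.House VALUES ('2 Remote Road', 400000);

    -- Plays the role of the synonym: hides where the table lives.
    CREATE TEMP VIEW Chch_House AS SELECT * FROM christchurch.House;

    -- Same shape as the Houses view on the slide.
    CREATE TEMP VIEW Houses AS
        SELECT * FROM main.House UNION
        SELECT * FROM Chch_House;
""")

# The application names neither the site nor the fragment.
rows = conn.execute(
    "SELECT Address, Asking_Price FROM Houses ORDER BY Address"
).fetchall()
print(rows)
```

If the Christchurch data later moved to another site, only the view definitions would change and the application query would stay the same.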


Replication in Oracle10g

Simple replication using materialised views
(SQL Reference ◃ “CREATE MATERIALIZED VIEW”)

CREATE MATERIALIZED VIEW Chch_Sellers
  REFRESH COMPLETE
  START WITH SYSDATE
  NEXT SYSDATE + 1/48
AS SELECT S.* FROM Seller@Christchurch S;
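Oracle date arithmetic is in days, so `SYSDATE + 1/48` means "one forty-eighth of a day from now". The short calculation below works that out and lists the first few refresh times. The start timestamp is arbitrary and chosen only for illustration.

```python
from datetime import datetime, timedelta

# 1/48 of a day, as in NEXT SYSDATE + 1/48
interval = timedelta(days=1 / 48)
minutes = interval.total_seconds() / 60

# First three refresh times from an arbitrary starting point.
start = datetime(2024, 1, 1, 9, 0, 0)
schedule = [start + i * interval for i in range(3)]

print(minutes)
for when in schedule:
    print(when.isoformat())
```

So the replica is rebuilt completely every 30 minutes, and between refreshes it may lag behind the Christchurch master table.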


Replication in Oracle10g

Oracle’s Advanced Replication
(Oracle10g Advanced Replication manual)

• Read/write replicas.
• DBA privileges required.
• Usual issues with keeping replicas synchronised.


Remote transactions in Oracle10g

(Oracle DBA ch. 29)

Access exactly one remote site

UPDATE Happyhomes.House@Christchurch
SET Name = 'The Palace'
WHERE Name = 'The Dump';


Distributed transactions in Oracle10g

(Oracle DBA ch. 29)

Access two or more remote or local sites

SELECT C.Name, C.Age, D.Name, D.Rating
FROM Happyhomes.House@Dunedin D,
     Happyhomes.House@Christchurch C
WHERE C.Name = D.Name;


The Evolution of Distributed Database Management Systems

• Distributed database management system (DDBMS)
  – Governs storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites


The Evolution of Distributed Database Management Systems (continued)

• Centralized database required that corporate data be stored in a single central site
• Dynamic business environment and centralized database’s shortcomings spawned a demand for applications based on data access from different sources at multiple locations


The Evolution of Distributed Database Management Systems

(continued)


DDBMS Advantages and Disadvantages

• Advantages include:
  – Data are located near “greatest demand” site
  – Faster data access
  – Faster data processing
  – Growth facilitation
  – Improved communications


DDBMS Advantages and Disadvantages (continued)

• Advantages include (continued):
  – Reduced operating costs
  – User-friendly interface
  – Less danger of a single-point failure
  – Processor independence


DDBMS Advantages and Disadvantages (continued)

• Disadvantages include:
  – Complexity of management and control
  – Security
  – Lack of standards
  – Increased storage requirements
  – Increased training cost


DDBMS Advantages and Disadvantages

(continued)


Characteristics of Distributed Management Systems
• Application interface
• Validation
• Transformation
• Query optimization
• Mapping
• I/O interface


Characteristics of Distributed Management Systems (continued)

• Formatting
• Security
• Backup and recovery
• DB administration
• Concurrency control
• Transaction management
