L15-16_PPT_IVSem

Jul 07, 2018


Lecture 15-16

Database System Architectures

Distributed Database
Parallel Database


Introduction

A distributed database system consists of loosely coupled sites that share no physical components, whereas the processors and disks of a parallel database system are tightly coupled and constitute a single database system.


Scope

Distributed databases are widely used in large-scale data processing. Today's world relies on the Internet, e-banking systems, weather forecasting, and similar applications in which large amounts of data are processed. When that data is spread across many places, the processing becomes distributed data processing; the scope of parallel databases and distributed data processing is therefore very bright.


Research

A great deal of research is going on in parallel and distributed databases.


Database System Architectures

Centralized and Client-Server Systems

Server System Architectures

Parallel Systems

Distributed Systems

Network Types


Centralized Systems

Run on a single computer system and do not interact with other computer systems.

General-purpose computer system: one to a few CPUs and a number of device controllers that are connected through a common bus that provides access to shared memory.

Single-user system (e.g., personal computer or workstation): desktop unit, single user, usually has only one CPU and one or two hard disks; the OS may support only one user.

Multi-user system: more disks, more memory, multiple CPUs, and a multi-user OS. Serves a large number of users who are connected to the system via terminals. Often called server systems.


A Centralized Computer System (figure)


Client-Server Systems

Server systems satisfy requests generated at m client systems. (Figure: general structure of a client-server system.)


Client-Server Systems (Cont.)

Database functionality can be divided into:

Back-end: manages access structures, query evaluation and optimization, concurrency control and recovery.

Front-end: consists of tools such as forms, report writers, and graphical user interface facilities.

The interface between the front-end and the back-end is through SQL or through an application program interface.


Client-Server Systems (Cont.)

Advantages of replacing mainframes with networks of workstations or personal computers connected to back-end server machines:

flexibility in locating resources and expanding facilities

better user interfaces

easier maintenance


Server System Architecture

Server systems can be broadly categorized into two kinds: transaction servers, which are widely used in relational database systems, and

data servers, used in object-oriented database systems.


Transaction Servers

Clients send requests to the server.

Transactions are executed at the server.

Results are shipped back to the client.

Requests are specified in SQL, and communicated to the server through a remote procedure call (RPC) mechanism.

Transactional RPC allows many RPC calls to collectively form a transaction.

Open Database Connectivity (ODBC) is a C language application program interface standard from Microsoft for connecting to a server, sending SQL requests, and receiving results.

The JDBC standard is similar to ODBC, for Java; a minimal sketch follows below.
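
A minimal JDBC sketch of the pattern just described — connect to a server, send an SQL request, receive results. The driver URL, credentials, and table are placeholders for illustration, not part of the original slides.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials; a matching JDBC driver must be on the classpath.
        String url = "jdbc:postgresql://dbserver:5432/university";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             // Send an SQL request to the server and receive the results.
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM instructor")) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }
        }
    }
}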


Transaction Server Process Structure

A typical transaction server consists of multiple processes accessing data in shared memory.

Server processes

These receive user queries (transactions), execute them, and send results back.

Processes may be multithreaded, allowing a single process to execute several user queries concurrently.

Typically multiple multithreaded server processes

Lock manager process

More on this later

Database writer process

Output modified buffer blocks to disks continually.


Transaction Server Processes (Cont.)

Log writer process

Server processes simply add log records to the log record buffer.

The log writer process outputs log records to stable storage.

Checkpoint process

Performs periodic checkpoints

Process monitor process

Monitors other processes, and takes recovery actions if any of the other processes fail

E.g., aborting any transactions being executed by a server process and restarting it


Transaction System Processes (Cont.) (figure)


Transaction System Processes (Cont.)

Shared memory contains shared data:

Buffer pool

Log buffer

Cached query plans (reused if the same query is submitted again)

To ensure that no two processes access the same data structure at the same time, database systems implement mutual exclusion using either

Operating system semaphores

Atomic instructions such as test-and-set (a sketch follows below)

To avoid the overhead of interprocess communication for lock request/grant, each database process operates directly on the lock table

instead of sending requests to the lock manager process

Lock manager process still used for deadlock detection
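
A minimal sketch of mutual exclusion built on an atomic test-and-set style primitive, using Java's AtomicBoolean.compareAndSet as the atomic instruction; the class is illustrative only and stands in for the latch that guards a shared structure such as the lock table.

import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative spinlock guarding a shared in-memory structure such as the lock table.
// compareAndSet plays the role of the atomic test-and-set instruction.
class TestAndSetLatch {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void acquire() {
        // Spin until the atomic test-and-set succeeds (Thread.onSpinWait needs Java 9+).
        while (!held.compareAndSet(false, true)) {
            Thread.onSpinWait();
        }
    }

    void release() {
        held.set(false);
    }
}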


Data Servers

Used in high-speed LANs, in cases where

The clients are comparable in processing power to the server

The tasks to be executed are compute intensive.

Data are shipped to clients, where processing is performed, and the results are then shipped back to the server.

This architecture requires full back-end functionality at the clients.

Used in many object-oriented database systems

Issues:

Page-shipping versus item-shipping

Locking

Data Caching

Lock Caching


Data Servers (Cont.)

Page-shipping versus item-shipping

Smaller unit of shipping ⇒ more messages

Worth prefetching related items along with requested item

Locking

Overhead of requesting and getting locks from the server is high due to message delays

Can grant locks on requested and prefetched items; with page shipping, a transaction is granted a lock on the whole page.

Locks on a prefetched item can be called back by the server, and returned by the client transaction if the prefetched item has not been used.

Locks on the page can be deescalated to locks on items in the page when there are lock conflicts. Locks on unused items can then be returned to the server.


    Data Servers (Cont.)Data Servers (Cont.)a a ac ng

    Data can be cached at client even in between transactions

    But check that data is up-to-date before it is used ( cache coherency )

    Check can be done when requesting lock on data item

    Lock Caching

    Transactions can acquire cached locks locally, without contactingserver

    request. Client returns lock once no local transaction is using it.

    Similar to deescalation, but across transactions.


Parallel Systems

Parallel database systems consist of multiple processors and multiple disks connected by a fast interconnection network.

A coarse-grain parallel machine consists of a small number of powerful processors.

A massively parallel or fine-grain parallel machine utilizes thousands of smaller processors.

Two main performance measures:

throughput --- the number of tasks that can be completed in a given time interval

response time --- the amount of time it takes to complete a single task from the time it is submitted


Speed-Up and Scale-Up

Speedup: a fixed-sized problem executing on a small system is given to a system which is N times larger.

Measured by:

speedup = small system elapsed time / large system elapsed time

Speedup is linear if the equation equals N.

Scaleup: increase the size of both the problem and the system.

An N-times larger system is used to perform an N-times larger job.

Measured by:

scaleup = small system small problem elapsed time / large system large problem elapsed time

Scaleup is linear if the equation equals 1 (a small worked example follows below).
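
A small worked example with invented numbers: suppose a job takes 100 seconds on the small system and 25 seconds on a system 4 times as large, and the 4-times larger job also takes 100 seconds on the 4-times larger system.

\[ \text{speedup} = \frac{\text{small system elapsed time}}{\text{large system elapsed time}} = \frac{100\,\mathrm{s}}{25\,\mathrm{s}} = 4 = N \quad \text{(linear speedup)} \]

\[ \text{scaleup} = \frac{\text{small system, small problem elapsed time}}{\text{large system, large problem elapsed time}} = \frac{100\,\mathrm{s}}{100\,\mathrm{s}} = 1 \quad \text{(linear scaleup)} \]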


Speedup (figure)


Scaleup (figure)


Batch and Transaction Scaleup

Batch scaleup:

A single large job; typical of most decision-support queries and scientific simulation.

Use an N-times larger computer on an N-times larger problem.

Transaction scaleup:

Numerous small queries submitted by independent users to a shared database; typical of transaction-processing and timesharing systems.

N-times as many users submitting requests (hence, N-times as many requests) to an N-times larger database, on an N-times larger computer.

Well-suited to parallel execution.


Factors Limiting Speedup and Scaleup

Speedup and scaleup are often sublinear due to:

Startup costs: The cost of starting up multiple processes may dominate computation time, if the degree of parallelism is high.

Interference: Processes accessing shared resources (e.g., system bus, disks, or locks) compete with each other, thus spending time waiting on other processes rather than performing useful work.

Skew: Increasing the degree of parallelism increases the variance in service times of parallelly executing tasks. Overall execution time is determined by the slowest of the parallelly executing tasks.


Interconnection Network Architectures

Bus. System components send data on and receive data from a single communication bus;

Does not scale well with increasing parallelism.

Mesh. Components are arranged as nodes in a grid, and each component is connected to all adjacent components.

Communication links grow with the number of components, and so it scales better.

But it may require 2√n hops to send a message to a node (or √n with wraparound connections at the edge of the grid).

Hypercube. Components are numbered in binary; components are connected to one another if their binary representations differ in exactly one bit.

n components are connected to log(n) other components and can reach each other via at most log(n) links; this reduces communication delays (a small sketch follows below).
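
A small sketch (class and method names are illustrative) of the hypercube connectivity rule: two components are directly connected when their binary numbers differ in exactly one bit, and the number of differing bits gives the number of links on a shortest path.

public class Hypercube {
    // Two nodes are neighbours iff their numbers differ in exactly one bit.
    static boolean directlyConnected(int a, int b) {
        return Integer.bitCount(a ^ b) == 1;
    }

    // Minimum number of links between two nodes = number of differing bits
    // (at most log2(n) for an n-node hypercube).
    static int hops(int a, int b) {
        return Integer.bitCount(a ^ b);
    }

    public static void main(String[] args) {
        System.out.println(directlyConnected(0b000, 0b001)); // true
        System.out.println(hops(0b000, 0b111));              // 3 links in an 8-node cube
    }
}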


Interconnection Architectures (figure)


Parallel Database Architectures

Shared memory -- processors share a common memory

Shared disk -- processors share a common disk

Shared nothing -- processors share neither a common memory nor a common disk

Hierarchical -- hybrid of the above architectures


Parallel Database Architectures (figure)


Shared Memory

Processors and disks have access to a common memory, typically via a bus or through an interconnection network.

Extremely efficient communication between processors — data in shared memory can be accessed by any processor without having to move it using software.

Downside – the architecture is not scalable beyond 32 or 64 processors, since the bus or the interconnection network becomes a bottleneck.

Widely used for lower degrees of parallelism (4 to 8).


Shared Disk

All processors can directly access all disks via an interconnection network, but the processors have private memories.

The architecture provides a degree of fault-tolerance — if a processor fails, the other processors can take over its tasks, since the database is resident on disks that are accessible from all processors.

Examples: IBM Sysplex and DEC clusters (now part of Compaq) running Rdb (now Oracle Rdb) were early commercial users.

Downside: the bottleneck now occurs at the interconnection to the disk subsystem.

Shared-disk systems can scale to a somewhat larger number of processors, but communication between processors is slower.


Shared Nothing

A node consists of a processor, memory, and one or more disks. Processors at one node communicate with processors at other nodes using an interconnection network. A node functions as the server for the data on the disk or disks the node owns.

Examples: Teradata, Tandem, Oracle nCUBE

Data accessed from local disks (and local memory accesses) do not pass through the interconnection network, thereby minimizing the interference of resource sharing.

Shared-nothing multiprocessors can be scaled up to thousands of processors without interference.

Main drawback: cost of communication and of non-local disk access; sending data involves software interaction at both ends.


Hierarchical

Combines characteristics of shared-memory, shared-disk, and shared-nothing architectures.

Top level is a shared-nothing architecture – nodes connected by an interconnection network, and they do not share disks or memory with each other.

Each node of the system could be a shared-memory system with a few processors.

Alternatively, each node could be a shared-disk system, and each of the systems sharing a set of disks could be a shared-memory system.

The complexity of programming such systems can be reduced by distributed virtual-memory architectures

Also called non-uniform memory architecture (NUMA)


Distributed Systems

Data spread over multiple machines (also referred to as sites or nodes).

Network interconnects the machines

Data shared by users on multiple machines


Distributed Databases

Homogeneous distributed databases

Same software/schema on all sites; data may be partitioned among sites

Goal: provide a view of a single database, hiding details of distribution

Heterogeneous distributed databases

Different software/schema on different sites

Goal: integrate existing databases to provide useful functionality

Differentiate between local and global transactions

A local transaction accesses data in the single site at which the transaction was initiated.

A global transaction either accesses data in a site different from the one at which the transaction was initiated, or accesses data in several different sites.


Trade-offs in Distributed Systems

Sharing data – users at one site are able to access data residing at other sites.

Autonomy – each site is able to retain a degree of control over data stored locally.

Higher system availability through redundancy — data can be replicated at remote sites, and the system can function even if a site fails.

Disadvantage: added complexity required to ensure proper coordination among sites.

Software development cost.

Greater potential for bugs.

Increased processing overhead.


Implementation Issues for Distributed Databases

Atomicity is needed even for transactions that update data at multiple sites.

The two-phase commit protocol (2PC) is used to ensure atomicity.

Basic idea: each site executes a transaction until just before commit, and then leaves the final decision to a coordinator (a simplified coordinator sketch follows below).

Each site must follow the decision of the coordinator, even if there is a failure.

2PC is not always appropriate: other transaction models, based on persistent messaging and workflows, are also used.

Distributed concurrency control (and deadlock detection) are required.

Data items may be replicated to improve data availability.

Details of the above in Chapter 22
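
A highly simplified sketch of the coordinator side of 2PC. The Site interface and its method names are hypothetical, and the sketch omits the logging to stable storage, timeouts, and failure handling that a real implementation of the protocol requires.

import java.util.List;

// Hypothetical participant interface; a real implementation also logs every
// vote and decision to stable storage before acting on it.
interface Site {
    boolean prepare();   // phase 1: can this site commit?
    void commit();       // phase 2: coordinator decided commit
    void abort();        // phase 2: coordinator decided abort
}

class TwoPhaseCommitCoordinator {
    // Returns true if the transaction was committed at all sites.
    boolean execute(List<Site> sites) {
        // Phase 1: collect votes; a single "no" forces a global abort.
        boolean allPrepared = sites.stream().allMatch(Site::prepare);
        // Phase 2: every site must follow the coordinator's decision.
        if (allPrepared) {
            sites.forEach(Site::commit);
        } else {
            sites.forEach(Site::abort);
        }
        return allPrepared;
    }
}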


Network Types

Local-area networks (LANs) – composed of processors that are distributed over small geographical areas, such as a single building or a few adjacent buildings.

Wide-area networks (WANs) – composed of processors distributed over a large geographical area.


Network Types (Cont.)

WANs with continuous connection (e.g., the Internet) are needed for implementing distributed database systems.

In WANs with discontinuous connection:

Data is replicated.

Updates are propagated to replicas periodically.

Copies of data may be updated independently.

Non-serializable executions can thus result. Resolution is application dependent.


Parallel Databases

Introduction

I/O Parallelism

Interquery Parallelism

Intraquery Parallelism

Intraoperation Parallelism

Interoperation Parallelism

Design of Parallel Systems


Introduction

Parallel machines are becoming quite common and affordable

Prices of microprocessors, memory and disks have dropped sharply

Recent desktop computers feature multiple processors, and this trend is projected to accelerate

Databases are growing increasingly large

Large volumes of transaction data are collected and stored for later analysis

Multimedia objects like images are increasingly stored in databases

Large-scale parallel database systems are increasingly used for:

storing large volumes of data

processing time-consuming decision-support queries

providing high throughput for transaction processing


Parallelism in Databases

Data can be partitioned across multiple disks for parallel I/O.

Individual relational operations (e.g., sort, join, aggregation) can be executed in parallel:

data can be partitioned and each processor can work independently on its own partition.

Queries are expressed in a high-level language (SQL, translated to relational algebra), which makes parallelization easier.

Different queries can be run in parallel with each other. Concurrency control takes care of conflicts.

Thus, databases naturally lend themselves to parallelism.


I/O Parallelism

Reduce the time required to retrieve relations from disk by partitioning the relations on multiple disks.

Horizontal partitioning – tuples of a relation are divided among many disks such that each tuple resides on one disk.

Partitioning techniques (number of disks = n):

Round-robin:

Send the ith tuple inserted in the relation to disk i mod n.

Hash partitioning:

Choose one or more attributes as the partitioning attributes.

Choose a hash function h with range 0 … n − 1.

Let i denote the result of the hash function h applied to the partitioning attribute value of a tuple. Send the tuple to disk i (a sketch of both rules follows below).
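
A minimal sketch of the two placement rules above; the method names and the use of Java's hashCode as the hash function h are illustrative choices, not from the slides.

public class IoPartitioning {
    // Round-robin: the i-th tuple inserted into the relation goes to disk i mod n.
    static int roundRobinDisk(long insertionIndex, int nDisks) {
        return (int) (insertionIndex % nDisks);
    }

    // Hash partitioning: a hash function with range 0..n-1 applied to the
    // partitioning attribute of the tuple chooses the disk.
    static int hashDisk(Object partitioningAttribute, int nDisks) {
        return Math.floorMod(partitioningAttribute.hashCode(), nDisks);
    }

    public static void main(String[] args) {
        System.out.println(roundRobinDisk(7, 3));   // 7 mod 3 -> disk 1
        System.out.println(hashDisk("Physics", 3)); // disk determined by the hash value
    }
}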


I/O Parallelism (Cont.)

Partitioning techniques (cont.):

Range partitioning:

Choose an attribute as the partitioning attribute.

A partitioning vector [v0, v1, ..., vn−2] is chosen. Let v be the partitioning attribute value of a tuple. Tuples such that vi ≤ v < vi+1 go to disk i + 1. Tuples with v < v0 go to disk 0 and tuples with v ≥ vn−2 go to disk n − 1.

E.g., with a partitioning vector [5, 11], a tuple with partitioning attribute value of 2 will go to disk 0, a tuple with value 7 will go to disk 1, while a tuple with value 20 will go to disk 2 (expressed as a small sketch below).
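
The slide's partitioning vector [5, 11] expressed as a small lookup sketch (class and method names are illustrative):

public class RangePartitioning {
    // Returns the disk for value v given partitioning vector [v0, v1, ..., v_{n-2}]:
    // v < v0 -> disk 0, v_{i} <= v < v_{i+1} -> disk i+1, v >= v_{n-2} -> disk n-1.
    static int diskFor(int v, int[] partitionVector) {
        int disk = 0;
        while (disk < partitionVector.length && v >= partitionVector[disk]) {
            disk++;
        }
        return disk;
    }

    public static void main(String[] args) {
        int[] vector = {5, 11};
        System.out.println(diskFor(2, vector));   // disk 0
        System.out.println(diskFor(7, vector));   // disk 1
        System.out.println(diskFor(20, vector));  // disk 2
    }
}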


Comparison of Partitioning Techniques

Evaluate how well partitioning techniques support the following types of data access:

1. Scanning the entire relation.

2. Locating a tuple associatively – point queries.

E.g., r.A = 25.

3. Locating all tuples such that the value of a given attribute lies within a specified range – range queries.

E.g., 10 ≤ r.A < 25.


Comparison of Partitioning Techniques (Cont.)

Round-robin:

Advantages

Best suited for sequential scan of the entire relation on each query.

All disks have almost an equal number of tuples; retrieval work is thus well balanced between disks.

Disadvantages

Range queries are difficult to process.

No clustering -- tuples are scattered across all disks.


Comparison of Partitioning Techniques (Cont.)

Hash partitioning:

Good for sequential access

Assuming the hash function is good, and the partitioning attributes form a key, tuples will be equally distributed between disks.

Retrieval work is then well balanced between disks.

Good for point queries on the partitioning attribute

Can look up a single disk, leaving the others available for answering other queries.

Index on the partitioning attribute can be local to a disk, making lookup and update more efficient.

No clustering, so difficult to answer range queries.


Comparison of Partitioning Techniques (Cont.)

Range partitioning:

Provides data clustering by partitioning attribute value.

Good for sequential access

Good for point queries on the partitioning attribute: only one disk needs to be accessed.

For range queries on the partitioning attribute, one to a few disks may need to be accessed

Remaining disks are available for other queries.

Good if result tuples are from one to a few blocks.

If many blocks are to be fetched, they are still fetched from one to a few disks, and the potential parallelism in disk access is wasted

Example of execution skew.


Partitioning a Relation across Disks

If a relation contains only a few tuples that will fit into a single disk block, then assign the relation to a single disk.

Large relations are preferably partitioned across all the available disks.

If a relation consists of m disk blocks and there are n disks available in the system, then the relation should be allocated min(m, n) disks.


Handling of Skew

The distribution of tuples to disks may be skewed — that is, some disks have many tuples, while others may have fewer tuples.

Types of skew:

Attribute-value skew.

Some values appear in the partitioning attributes of many tuples; all the tuples with the same value for the partitioning attribute end up in the same partition.

Can occur with range-partitioning and hash-partitioning.

Partition skew.

With range-partitioning, a badly chosen partition vector may assign too many tuples to some partitions and too few to others.

Less likely with hash-partitioning if a good hash function is chosen.


Handling Skew in Range-Partitioning

To create a balanced partitioning vector (assuming the partitioning attribute forms a key of the relation):

Sort the relation on the partitioning attribute.

Construct the partition vector by scanning the relation in sorted order as follows.

After every 1/nth of the relation has been read, the value of the partitioning attribute of the next tuple is added to the partition vector.

n denotes the number of partitions to be constructed (a sketch follows below).

Duplicate entries or imbalances can result if duplicates are present in the partitioning attributes.
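
A sketch of the construction just described, assuming the relation's partitioning-attribute values are already available in sorted order in an array; the names and the array-based representation are illustrative.

import java.util.ArrayList;
import java.util.List;

public class BalancedPartitionVector {
    // sortedKeys: partitioning-attribute values in sorted order; n: number of partitions.
    // After every 1/n of the tuples, the next tuple's partitioning-attribute value
    // is added to the partition vector, giving n-1 cut points.
    static List<Integer> buildVector(int[] sortedKeys, int n) {
        List<Integer> vector = new ArrayList<>();
        int step = Math.max(1, sortedKeys.length / n);
        for (int i = 1; i < n && i * step < sortedKeys.length; i++) {
            vector.add(sortedKeys[i * step]);
        }
        return vector;
    }

    public static void main(String[] args) {
        int[] keys = {1, 2, 3, 5, 8, 9, 12, 15, 20, 22, 30, 41};
        System.out.println(buildVector(keys, 3)); // [8, 20] -> three ranges of 4 tuples each
    }
}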


Handling Skew Using Virtual Processor Partitioning

Skew in range partitioning can be handled elegantly using virtual processor partitioning:

Create a large number of partitions (say, 10 to 20 times the number of processors).

Assign virtual processors to partitions either in round-robin fashion or based on the estimated cost of processing each virtual partition (a sketch follows below).

Basic idea:

If any normal partition would have been skewed, it is very likely the skew is spread over a number of virtual partitions.

Skewed virtual partitions get spread across a number of processors, so work gets distributed evenly!
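
A sketch of the round-robin mapping from virtual partitions to real processors; the counts and method names are illustrative.

public class VirtualProcessorPartitioning {
    // Map each of the many virtual partitions to one of the (fewer) real
    // processors in round-robin fashion, so a skewed range of values is
    // spread over several processors.
    static int realProcessorFor(int virtualPartition, int nRealProcessors) {
        return virtualPartition % nRealProcessors;
    }

    public static void main(String[] args) {
        int nVirtual = 16;   // e.g. several times the number of real processors
        int nReal = 4;
        for (int vp = 0; vp < nVirtual; vp++) {
            System.out.println("virtual partition " + vp + " -> processor "
                    + realProcessorFor(vp, nReal));
        }
    }
}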


Interquery Parallelism

Queries/transactions execute in parallel with one another.

Increases transaction throughput; used primarily to scale up a transaction-processing system to support a larger number of transactions per second.

Easiest form of parallelism to support, particularly in a shared-memory parallel database, because even sequential database systems support concurrent processing.

More complicated to implement on shared-disk or shared-nothing architectures

Locking and logging must be coordinated by passing messages between processors.

Data in a local buffer may have been updated at another processor.

Cache-coherency has to be maintained — reads and writes of data in the buffer must find the latest version of the data.


Cache Coherency Protocol

Example of a cache coherency protocol for shared-disk systems (a sketch follows below):

Before reading/writing to a page, the page must be locked in shared/exclusive mode.

On locking a page, the page must be read from disk.

Before unlocking a page, the page must be written to disk if it was modified.

More complex protocols with fewer disk reads/writes exist.

Cache coherency protocols for shared-nothing systems are similar. Each database page is assigned a home processor. Requests to fetch the page or write it to disk are sent to the home processor.
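
A sketch of the shared-disk protocol above. The LockManager, Disk, and page representations are hypothetical stand-ins for the DBMS's own lock and buffer managers; the point is only the order of operations: lock, read on lock, write back before unlock if modified.

// Hypothetical interfaces; a real system would use the DBMS's lock and buffer managers.
interface LockManager {
    void lock(int pageId, boolean exclusive);
    void unlock(int pageId);
}

interface Disk {
    byte[] readPage(int pageId);
    void writePage(int pageId, byte[] data);
}

class SharedDiskCache {
    private final LockManager locks;
    private final Disk disk;

    SharedDiskCache(LockManager locks, Disk disk) {
        this.locks = locks;
        this.disk = disk;
    }

    // Before reading/writing a page it is locked; on locking it is read from disk.
    byte[] pin(int pageId, boolean exclusive) {
        locks.lock(pageId, exclusive);
        return disk.readPage(pageId);
    }

    // Before unlocking, a modified page is written back to disk.
    void unpin(int pageId, byte[] page, boolean modified) {
        if (modified) {
            disk.writePage(pageId, page);
        }
        locks.unlock(pageId);
    }
}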


Intraquery Parallelism

Execution of a single query in parallel on multiple processors/disks; important for speeding up long-running queries.

Intraoperation Parallelism – parallelize the execution of each individual operation in the query.

Interoperation Parallelism – execute the different operations in a query expression in parallel.

The first form scales better with increasing parallelism, because the number of tuples processed by each operation is typically more than the number of operations in a query.
