Top Banner
1 Scalable Distributed Database System: SD-SQL Server Soror SAHRI Ceria, Paris-Dauphine University [email protected] Journées Académiques Microsoft Paris, April 2006
31
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: JA2006.ppt

1

Scalable Distributed Database System: SD-SQL Server

Soror SAHRICeria, Paris-Dauphine University

[email protected]

Journées Académiques MicrosoftParis, April 2006

Page 2: JA2006.ppt

2

Overview

Objective SD-SQL Server Architecture Application Interface Implementation Performance Analysis Conclusion

Page 3: JA2006.ppt

3

Objective

Most of DBSs have distributed/parallel versions SQL Server, Oracle, DB2…

DBSs do not provide dynamic partitioning of relational tables All require manual (distributed) repartitioning when tables scale-up

Page 4: JA2006.ppt

4

Objective

http://ceria.dauphine.fr/CERIA-publications.html Research Report, December 2005

[Oracle Database 10g]

Page 5: JA2006.ppt

5

Objective

Scalable Distributed Partitioning of Relational Tables

Scalable Distributed Database System (SD-DBS)

Litwin, Rich & Schwartz “Architecture for a scalable Distributed DBSs”

2nd Intl. Workshop on Cooperative Internet Computing, 2002

Page 6: JA2006.ppt

6

SD-SQL Server

SD-SQL Server is the first prototype SD-DBSGeneralizes SQL Server

Shared Nothing Architecture Embeds SQL Server 2000 (at present) Up to 250 nodes (at present)

Page 7: JA2006.ppt

7

SD-SQL Server Architecture

LinkedSQL

Servers

D1 D2 Di Di+1

S S PC D1_T

_D1_T _D1_T

User/Application

T

sd_create_table

User/Application

sd_insert

_D1_TSplit

NDBs

SD-SQLserver

SD-SQLserver

SD-SQLclient

SD-SQL Server

Managers

SD-SQLpeer

Page 8: JA2006.ppt

8

Nodes, NDBs and SDBs

DB1 SDB

Node1 Node2 Node3 Nodei

DB1

……

DB1 DB1

DB2 SDB

DB2 DB2

MDB

Page 9: JA2006.ppt

9

NDB Types

An SD-SQL Server NDB can be: Client NDB

Carries only images Application interfaces

Server NDB Carries only the segments

Peer NDB Both functions

Primary NDB First created for an SDB

Page 10: JA2006.ppt

10

Scalable Tables

A scalable (distributed) table is a collection of segments Segments are SQL tables

A scalable table has, initially, only one primary segment At some server or peer NDB

The number of segments in a scalable table is variable Overflowing segments split

Page 11: JA2006.ppt

11

Splitting

A split occurs when an insert overflows the segment capacity

Splits produce other segments for a scalable table. Each is located at a different NDB

Within the SDB

If there is not enough NDBs, splits dynamically append new ones

Page 12: JA2006.ppt

12

SDB DB1SDB DB1SDB DB1

Splitting

Scalable Table T

sd_insert

N1 N2 N4N3

NDBDB1

NDBDB1

NDBDB1

sd_insert

NDBDB1

Ni

sd_create_node

sd_insert

N3

NDBDB1

sd_create_node_database

NDBDB1

…….

sd_create_node_database

Page 13: JA2006.ppt

13

Splitting

S

b+1

S S1

pb+1-p

p=INT(b/2)

C( S)= { c: c < h = c (b+1-p)}

C( S1)={c: c > = c (b+1-p)}

Check Constraint?

b

SELECT TOP Pi * INTO Ni.Si FROM S ORDER BY C ASCSELECT TOP Pi * WITH TIES INTO Ni.S1 FROM S ORDER BY C ASC

Page 14: JA2006.ppt

14

Images

Images hide the scalable table segments An image is a distributed updateable partitioned

view of a scalable table Union-all view with check constraints

An image presents the scalable table partitioning It do not address any new segments resulted from a

split

Page 15: JA2006.ppt

15

Images

Primary image Resides at the creation node Has the name of the scalable table

Secondary images Reside at other client or peer NDBs of the SDB Have a specific name, other than that of the

table To avoid name conflict

Scalable Views Carry images at several levels

Page 16: JA2006.ppt

16

Images

DB1 SDB

N1.DB1 N2.DB1 N3.DB1

T Scalable Table

PrimaryImage

CREATE VIEW T AS SELECT * FROM N1.DB1._Ni_TCREATE VIEW T AS SELECT * FROM N1.DB1._Ni_T UNION ALL SELECT * FROM N2.DB1._Ni_T UNION ALL SELECT * FROM N3.DB1._Ni_T

Page 17: JA2006.ppt

17

Application Interface

Applications manipulate SD-SQL Server through its command interface.

SD-SQL Server commands start with ‘sd_’ To distinguish from SQL Server commands for static

tables.

INSERT sd_insert

CREATE TABLE sd_create_table

Page 18: JA2006.ppt

18

Application Interface:The SkyServer Benchmark

Use of the SkyServer DB as benchmark http://research.microsoft.com/~gray/SDSS Use of the PhotoObj table as a scalable table. PhotoObj has 158,426 tuples (about 260 MB)

Page 19: JA2006.ppt

19

Application Interface: Node Management

Node Creation sd_create_node ‘Dell1’ /* Server by default */ sd_create_node ‘Ceria’, ‘client’

Node Alteration sd_alter_node ‘Ceria’, ‘ADD server’ /* Becomes peer*/

Node Removal sd_drop_node ‘Ceria’

Page 20: JA2006.ppt

20

Application Interface: NDBs &SDBs Management

SDB Creation sd_create_scalable_database

‘SkyServer’, ‘Dell1’, ‘Server’, 2/* Creates the primary SkyServer NDB as well at Dell1*/

NDB Creation sd_create_node_database

‘SkyServer’, ‘Ceria’, ‘Client’ SDB Removal

sd_drop_scalable_database ‘SkyServer’

Page 21: JA2006.ppt

21

Application Interface: Scalable Tables Management

Scalable Table Creation sd_create_table ‘PhotoObj (objid BIGINT PRIMARY

KEY…)’, 10000 Scalable Table Alteration

sd_alter_table ‘PhotoObj ADD t INT’, 1000 sd_create_index ‘run_index ON Photoobj (run)’ sd_drop_index ‘PhotoObj.run_index’

Scalable Table Removal sd_drop_table ‘PhotoObj’

Page 22: JA2006.ppt

22

Application Interface: Scalable Tables Management

Insert sd_insert ‘INTO PhotoObj SELECT * FROM

Ceria5.Skyserver-S.PhotoObj’ Search

sd_select ‘* FROM PhotoObj’ sd_select ‘TOP 5000 * INTO PhotoObj1 FROM

PhotoObj’, 500

Page 23: JA2006.ppt

23

Application Interface

See “Architecture and Interface of Scalable Distributed Database System SD-SQL Server”, IASTED-DBA 2006

Page 24: JA2006.ppt

24

Implementation

Commands are distributed stored procedures Processed as distributed transactions at Repeatable

Read isolation level See

“ Prototyping a Scalable Distributed Database System SD-SQL Server”, WDAS 2006

Page 25: JA2006.ppt

25

Performance Analysis (1)

0

500

1000

1500

2000

39500 79000 158000

Segment Capacity (tuples)

Exe

cuti

on

Tim

e (m

s)

SD-SQL Server Peer With IA SD-SQL Server Peer

SQL Server Peer SD-SQL Server Client With IA

SD-SQL Server Client SQL Server Client

Query (Q1) execution time

(Q1) sd_select ‘COUNT (*) FROM PhotoObj’

Page 26: JA2006.ppt

26

Performance Analysis (2)

9862.4

11578.6610898

13036.5 14071.66

11712

0

2000

4000

6000

8000

10000

12000

14000

16000

2 3 4

Number of Segments

Execu

tio

n T

ime (

ms)

IA IC

Query (Q2) with image checking only (IC) and with image adjustment (IA)

(Q2) sd_select ‘TOP 10000 x.objid FROM photoobj x, photoobj y WHERE x.obj=y.obj AND x.objid>y.objid’

Page 27: JA2006.ppt

27

Performance Analysis (3)

sd_select ‘COUNT (*) FROM PhotoObj’

326250

220156

93

343256

226164

106

436

356

93

203283

16 76123

203 220

0100200300400500

1 2 3 4 5

Number of Segments

Exe

cutio

n T

ime

(ms)

SQL Server-Distr SD-SQL Server

SQL Server-Centr. SD-SQL Server LSV

Comparison between SD-SQL Server and SQL Server

Page 28: JA2006.ppt

28

Conclusion

Scalable distributed partitioning of relational tables is now reality with SD-SQL Server No manual repartitioning

Unlike in any other DBS we know about

Performance analysis proves Efficiency of our design Immediate utility of SD-SQL Server

Witold Litwin
Acknowledgments svp comem chez Riad + Jim Gray pour SkyServerCurrent et future work ? N'oubliez pas EGov svp, vous etes payee par ce projet.
Page 29: JA2006.ppt

29

Future Works

Management of replication, data failures.. Non-availability of a server….

Management of the data merging

Application on other DBMSs Oracle, DB2….

Witold Litwin
Acknowledgments svp comem chez Riad + Jim Gray pour SkyServerCurrent et future work ? N'oubliez pas EGov svp, vous etes payee par ce projet.
Page 30: JA2006.ppt

30

Work partly supported by

CEE Project EGov MS Research CEE Project ICONS IBM Almaden Res. Cntr., CA HP Laboratories, CA

Page 31: JA2006.ppt

31

Thank You

Questions?

Soror SAHRICeria, Paris-Dauphine University

[email protected]

Journées Académiques MicrosoftParis, April 2006