Inside Database System, Takashi HOSHINO, Cybozu Labs

Inside database

Jun 22, 2015


Takashi Hoshino

An introduction to database management systems.
Used in an internal study meeting named "DATABASE NO KIMOCHI WO SHIRU KAI" at Cybozu.
Transcript
Page 1: Inside database

Inside Database System

Takashi HOSHINO
Cybozu Labs

Page 2: Inside database

Overview

• Control/Data Flow
• DBMS
  – Query Processor
  – Storage Engine
    • Transaction Management
    • Buffer Cache Management
    • Data Structures
• Storage

Page 3: Inside database

Control/Data Flow

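[Figure: layered stack of Application, DBMS, OS, and Storage. The Application and the DBMS exchange SQL/Records; the DBMS reads and writes blocks (RW/Blocks) through the OS to Storage.]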

Page 4: Inside database

DBMS

• Query Processor
• Storage Engine

Page 5: Inside database

Query Processor

Hector 2010

SQL query
→ parse → parse tree
→ convert → logical query plan (l.q.p.)
→ apply laws → “improved” l.q.p.
→ estimate result sizes (using statistics) → l.q.p. + sizes
→ consider physical plans → {P1, P2, …}
→ estimate costs → {(P1,C1), (P2,C2), …}
→ pick best → Pi
→ execute → answer

Page 6: Inside database

Query Plan Example

Hector 2010

[Figure: query plan tree with a projection on B, D, the selections R.A = “c” and S.E = 2, and a natural join of R and S.]
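
The operators in this plan can be sketched in a few lines of Python. This is only an illustration: the schemas R(A, B, C) and S(C, D, E), the sample rows, and the helper names `natural_join`, `select`, and `project` are assumptions, not taken from the slide. It shows that filtering before or after the join gives the same answer, which is why the query processor is free to choose between such plans.

```python
# Hypothetical relations R(A, B, C) and S(C, D, E); rows are made up.
R = [
    {"A": "a", "B": 1, "C": 10},
    {"A": "c", "B": 2, "C": 10},
    {"A": "c", "B": 3, "C": 20},
]
S = [
    {"C": 10, "D": "x", "E": 2},
    {"C": 20, "D": "y", "E": 2},
    {"C": 20, "D": "z", "E": 3},
]

def natural_join(left, right):
    """Pair rows that agree on every attribute name both sides share."""
    out = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[k] == r[k] for k in shared):
                out.append({**l, **r})
    return out

def select(rows, pred):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in rows if pred(row)]

def project(rows, attrs):
    """Keep only the listed attributes, in order, as tuples."""
    return [tuple(row[a] for a in attrs) for row in rows]

# Plan 1: join first, then apply both selections.
plan1 = project(
    select(natural_join(R, S), lambda t: t["A"] == "c" and t["E"] == 2),
    ["B", "D"],
)

# Plan 2: apply each selection to its own relation, then join.
plan2 = project(
    natural_join(
        select(R, lambda t: t["A"] == "c"),
        select(S, lambda t: t["E"] == 2),
    ),
    ["B", "D"],
)

print(plan1, plan2)
```

Plan 2 joins two rows with two rows instead of three with three; on real tables that difference in intermediate size is what the cost estimate captures.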

Page 7: Inside database

Which Plan is Good?

Hector 2010

[Figure: three alternative join trees over R, S, and T: (R ⋈ S) ⋈ T, (T ⋈ R) ⋈ S, and (S ⋈ T) ⋈ R.]
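
One way to compare such join orders is to estimate the size of the first intermediate result and pick the smallest, in the spirit of the {(P1,C1), (P2,C2), …} → pick best step. The sketch below is a simplification under stated assumptions: the cardinalities, the "one in N" selectivities, and the cost measure (intermediate result size only) are all hypothetical.

```python
from itertools import permutations

# Assumed table sizes (rows) and join selectivities expressed as
# "one matching pair in N"; all numbers are invented for illustration.
card = {"R": 1000, "S": 100, "T": 10}
one_in = {
    frozenset(("R", "S")): 100,
    frozenset(("R", "T")): 10,
    frozenset(("S", "T")): 20,
}

def intermediate_size(a, b):
    """Estimated number of rows produced by joining a and b first."""
    return card[a] * card[b] // one_in[frozenset((a, b))]

def candidate_plans():
    """Yield ((first, second, last), cost) for each distinct first join."""
    seen = set()
    for a, b, c in permutations(card):
        key = (frozenset((a, b)), c)
        if key in seen:
            continue  # (a join b) and (b join a) count as the same plan here
        seen.add(key)
        yield (a, b, c), intermediate_size(a, b)

plans = list(candidate_plans())
best = min(plans, key=lambda plan_cost: plan_cost[1])
print(plans)
print("best:", best)
```

With these numbers, joining S and T first yields the smallest intermediate result, so that plan is picked. A real optimizer would also weigh physical operators and I/O cost.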

Page 8: Inside database

Storage Engine

• Transaction Management
• Buffer Cache Management
• Data Structures

Page 9: Inside database

Transaction Management

• Preserve the ACID properties of data
  – Atomicity
  – Consistency
  – Isolation
  – Durability
• Concurrency Control
• Logging & Recovery

Page 10: Inside database

Concurrency Control by Locking

• Target resources
  – Database
  – Table
  – Block
  – Record
• Locking algorithm
  – Shared/exclusive lock
  – Intention lock for fine granularity

Page 11: Inside database

Shared/Exclusive Lock

• S: shared lock for read
• X: exclusive lock for write

[Figure: Trn 1, Trn 2, and Trn 3 can hold an S lock on the same resource at the same time; only one of them can hold an X lock.]
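
The S/X rule can be sketched as a small lock table. This is a minimal, non-blocking illustration: the class name `LockTable` and its methods are hypothetical, and a real lock manager would queue waiters and detect deadlocks instead of simply returning False.

```python
class LockTable:
    """Tracks which transaction holds which lock mode on each resource."""

    def __init__(self):
        self._held = {}  # resource -> {transaction: "S" or "X"}

    def try_lock(self, txn, resource, mode):
        """Grant the lock if it is compatible with other holders."""
        holders = self._held.setdefault(resource, {})
        others = [m for t, m in holders.items() if t != txn]
        if mode == "S":
            ok = all(m == "S" for m in others)  # readers share with readers
        else:
            ok = not others                     # a writer needs exclusivity
        if ok and holders.get(txn) != "X":
            holders[txn] = mode                 # never downgrade an X lock
        return ok

    def unlock(self, txn, resource):
        """Release whatever lock txn holds on the resource."""
        self._held.get(resource, {}).pop(txn, None)


locks = LockTable()
print(locks.try_lock("Trn 1", "record", "S"))  # shared with nobody yet
print(locks.try_lock("Trn 2", "record", "S"))  # readers can share
print(locks.try_lock("Trn 3", "record", "X"))  # refused while readers exist
```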

Page 12: Inside database

Intention Lock

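Lock compatibility (O: compatible, _: conflict)

      IS   IX   S    X
IS    O    O    O    _
IX    O    O    _    _
S     O    _    O    _
X     _    _    _    _

http://dev.mysql.com/doc/refman/5.5/en/innodb-lock-modes.html

[Figure: lock hierarchy example with IX and IS locks on coarser-grained resources and X and S locks on the finer-grained resources beneath them.]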
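
The compatibility rules for IS, IX, S, and X can be written down as a lookup table. The sketch below assumes the matrix documented at the MySQL URL above; the helper names `can_grant` and `locks_needed` are hypothetical. It also shows the usual protocol of taking intention locks on every ancestor before locking the target itself.

```python
# For each requested mode, the set of already-held modes it can coexist with.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}

# Intention mode to take on ancestors for a given target mode.
INTENTION = {"S": "IS", "X": "IX"}

def can_grant(requested, held_modes):
    """True if `requested` conflicts with none of the modes held by others."""
    return all(held in COMPATIBLE[requested] for held in held_modes)

def locks_needed(path, mode):
    """Locks to acquire, top down, to lock the last element of `path`."""
    *ancestors, target = path
    return [(a, INTENTION[mode]) for a in ancestors] + [(target, mode)]

print(can_grant("IX", ["IS", "IX"]))  # intention locks coexist
print(can_grant("S", ["IX"]))         # a table-level S conflicts with IX
print(locks_needed(["database", "table", "record"], "X"))
```

Because a writer of a single record only places IX on the table, another transaction can still lock a different record, while a request for a table-level S or X lock sees the conflict without scanning record locks.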

Page 13: Inside database

Logging with Redo Log

Hector 2010

T1: Read(A,t); t ← t×2; Write(A,t);
    Read(B,t); t ← t×2; Write(B,t);
    Output(A); Output(B)

memory: A: 8 → 16, B: 8 → 16
DB: A: 8 → 16, B: 8 → 16 (after output)
LOG: <T1, start> <T1, A, 16> <T1, B, 16> <T1, commit> <T1, end>
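
The T1 example can be replayed with a small redo-logging sketch. It assumes the usual redo rule (new values and the commit record reach the log before any block is output to the DB) and uses plain dicts and a list in place of memory, disk, and the log file; the function names are hypothetical.

```python
def run_t1(memory, log):
    """Execute T1 in memory only, logging the new value of each item."""
    log.append(("T1", "start"))
    for item in ("A", "B"):
        t = memory[item] * 2          # Read(item, t); t <- t * 2
        memory[item] = t              # Write(item, t)
        log.append(("T1", item, t))   # redo record carries the NEW value
    log.append(("T1", "commit"))

def output_all(memory, db, log):
    """Copy the updated items to the DB, then mark the transaction ended."""
    for item in ("A", "B"):
        db[item] = memory[item]       # Output(item)
    log.append(("T1", "end"))

def recover(db, log):
    """Redo the writes of committed transactions; ignore the rest."""
    committed = {rec[0] for rec in log if rec[1:] == ("commit",)}
    for rec in log:
        if len(rec) == 3 and rec[0] in committed:
            _, item, value = rec
            db[item] = value
    return db

db = {"A": 8, "B": 8}
memory = dict(db)
log = []
run_t1(memory, log)

# Crash after commit but before Output: the log alone restores A and B.
print(recover(dict(db), log))
# Crash before commit: no redo, the DB keeps the old values.
print(recover(dict(db), log[:3]))
```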

Page 14: Inside database

Buffer Cache Management

• Allowance of dirty cache
  – No: write through
  – Yes: write back
• Eviction strategy
  – LRU: least recently used
  – …
• Prefetch
  – Sequential
  – …
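
A write-back cache with LRU eviction can be sketched with an `OrderedDict`. This is an illustration under simplifying assumptions: the class name `BufferCache` is hypothetical, a dict stands in for the storage device, and there is no pinning, prefetching, or concurrency control.

```python
from collections import OrderedDict

class BufferCache:
    """Write-back block cache with least-recently-used eviction."""

    def __init__(self, storage, capacity):
        self.storage = storage        # block number -> data (stands in for disk)
        self.capacity = capacity
        self.frames = OrderedDict()   # block number -> [data, dirty flag]

    def _frame(self, block_no):
        if block_no in self.frames:
            self.frames.move_to_end(block_no)      # mark as most recently used
        else:
            if len(self.frames) >= self.capacity:
                victim, (data, dirty) = self.frames.popitem(last=False)
                if dirty:
                    self.storage[victim] = data    # write back on eviction
            self.frames[block_no] = [self.storage[block_no], False]
        return self.frames[block_no]

    def read(self, block_no):
        return self._frame(block_no)[0]

    def write(self, block_no, data):
        frame = self._frame(block_no)
        frame[0], frame[1] = data, True            # dirty: storage not updated yet

    def flush(self):
        """Write every dirty block back to storage."""
        for block_no, frame in self.frames.items():
            if frame[1]:
                self.storage[block_no] = frame[0]
                frame[1] = False


disk = {0: "a", 1: "b", 2: "c"}
cache = BufferCache(disk, capacity=2)
cache.write(0, "A")   # only the cached copy changes
cache.read(1)
cache.read(2)         # cache is full, so block 0 is evicted and written back
print(disk)
```

A write-through variant would update `storage` inside `write` and never hold dirty frames.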

Page 15: Inside database

Data Structures

[Figure: data structures kept by the storage engine: Dictionary, Tables, Indexes, Logs, and Statistics.]

Page 16: Inside database

Inside Data Block

Hector 2010

[Figure: layout of a data block containing a Header, records R1 to R4, and Free space.]
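
One common way to arrange a header, records, and free space inside a block is a slotted layout. The sketch below is an assumption about such a layout, not a reconstruction of the figure: the class name `Block`, the 4-byte header, and the slot format are all hypothetical.

```python
import struct

class Block:
    """Slotted block: header and slot directory grow from the front,
    record bytes grow from the back, free space sits in between."""

    HEADER = struct.Struct("<HH")  # slot count, offset where records begin
    SLOT = struct.Struct("<HH")    # record offset, record length

    def __init__(self, size=128):
        self.buf = bytearray(size)
        self.HEADER.pack_into(self.buf, 0, 0, size)

    def _header(self):
        return self.HEADER.unpack_from(self.buf, 0)

    def free_space(self):
        count, data_start = self._header()
        return data_start - (self.HEADER.size + count * self.SLOT.size)

    def insert(self, record):
        """Store a record and return its slot number."""
        count, data_start = self._header()
        if len(record) + self.SLOT.size > self.free_space():
            raise ValueError("block full")
        offset = data_start - len(record)
        self.buf[offset:data_start] = record
        slot_pos = self.HEADER.size + count * self.SLOT.size
        self.SLOT.pack_into(self.buf, slot_pos, offset, len(record))
        self.HEADER.pack_into(self.buf, 0, count + 1, offset)
        return count

    def get(self, slot):
        """Return the record bytes stored in the given slot."""
        slot_pos = self.HEADER.size + slot * self.SLOT.size
        offset, length = self.SLOT.unpack_from(self.buf, slot_pos)
        return bytes(self.buf[offset:offset + length])


block = Block(64)
r1 = block.insert(b"R1-data")
r2 = block.insert(b"R2")
print(block.get(r1), block.get(r2), block.free_space())
```

Referring to records by slot number lets the block move record bytes around (for example, to compact free space) without changing the record's external address.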

Page 17: Inside database

Structures for Index

[Figure: two index structures side by side: Tree, and Hash (keys placed by a Hash Function).]

Page 18: Inside database

B+tree Example

Hector 2010

Root: [100]
Internal nodes: [30] [120, 150, 180]
Leaves: [3, 5, 11] [30, 35] [100, 101, 110] [120, 130] [150, 156, 179] [180, 200]
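
Lookups in this example tree can be sketched as follows. The code hard-codes the keys shown above and assumes the common convention that a key equal to a separator lives in the right subtree and that leaves are chained left to right; the class `Node` and the function names are hypothetical, and insertion and node splitting are left out.

```python
from bisect import bisect_right

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None marks a leaf
        self.next = None          # next leaf to the right (leaves only)

# Leaves of the example tree, chained left to right.
leaves = [Node(k) for k in (
    [3, 5, 11], [30, 35], [100, 101, 110],
    [120, 130], [150, 156, 179], [180, 200],
)]
for left, right in zip(leaves, leaves[1:]):
    left.next = right

root = Node([100], [
    Node([30], leaves[:2]),
    Node([120, 150, 180], leaves[2:]),
])

def find_leaf(node, key):
    """Descend from the root to the leaf that would hold `key`."""
    while node.children is not None:
        node = node.children[bisect_right(node.keys, key)]
    return node

def contains(key):
    return key in find_leaf(root, key).keys

def range_scan(lo, hi):
    """Return all keys in [lo, hi] by walking the leaf chain."""
    out = []
    leaf = find_leaf(root, lo)
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return out
            if k >= lo:
                out.append(k)
        leaf = leaf.next
    return out

print(contains(156), contains(157))
print(range_scan(35, 130))
```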

Page 19: Inside database

Hash Example

Hector 2010

INSERT: h(a) = 1, h(b) = 2, h(c) = 1, h(d) = 0

Bucket 0: d
Bucket 1: a, c
Bucket 2: b
Bucket 3: (empty)

Next insert: h(e) = 1, so e also maps to bucket 1.
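
The insert sequence can be replayed with a small bucketed hash index. Two things here are assumptions rather than facts from the slide: each bucket block holds at most two keys, and a full bucket gets an overflow block chained behind it. The class name `HashIndex` is hypothetical, and the hash function is just a lookup of the h values listed above.

```python
BUCKET_CAPACITY = 2  # assumed number of keys per block

class HashIndex:
    def __init__(self, h, n_buckets=4):
        self.h = h
        # Each bucket is a chain of blocks; the first is the primary block.
        self.buckets = [[[]] for _ in range(n_buckets)]

    def insert(self, key):
        blocks = self.buckets[self.h(key)]
        if len(blocks[-1]) >= BUCKET_CAPACITY:
            blocks.append([])          # chain a new overflow block
        blocks[-1].append(key)

    def lookup(self, key):
        return any(key in block for block in self.buckets[self.h(key)])


h_values = {"a": 1, "b": 2, "c": 1, "d": 0, "e": 1}
index = HashIndex(h_values.__getitem__)
for key in "abcde":
    index.insert(key)

for number, blocks in enumerate(index.buckets):
    print(number, blocks)
```

Under these assumptions, e lands in an overflow block of bucket 1 because a and c already fill the primary block.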

Page 20: Inside database

Tree vs Hash for Indexing

• Tree
  – O(log N) for single record retrieval
  – Efficient range scan is available
• Hash
  – O(1) on average for single record retrieval
  – Range scan is not supported
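
The range-scan difference can be seen with Python's built-in structures standing in for the two index types: a sorted list searched with `bisect` plays the role of the tree, and a `dict` plays the role of the hash index. The key set and row values are made up.

```python
from bisect import bisect_left, bisect_right

keys = [3, 5, 11, 30, 35, 100, 101, 110, 120, 130, 150, 156, 179, 180, 200]
sorted_index = sorted(keys)                  # ordered, like tree leaves
hash_index = {k: f"row-{k}" for k in keys}   # unordered, like hash buckets

def range_sorted(lo, hi):
    """Two binary searches find the boundaries; the slice is the answer."""
    start = bisect_left(sorted_index, lo)
    stop = bisect_right(sorted_index, hi)
    return sorted_index[start:stop]

def range_hash(lo, hi):
    """No key order to exploit: every key must be examined."""
    return sorted(k for k in hash_index if lo <= k <= hi)

print(range_sorted(30, 110))
print(range_hash(30, 110))
print(hash_index[156])   # point lookup is where the hash index shines
```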

Page 21: Inside database

Storage

[Figure: storage stack. OS Functionalities for Storage IO: File System, Buffer Cache Manager, Logical Unit/Software RAID Manager, SCSI Protocol Stack/HBA Drivers. Devices: Hard Disk Drive, Solid State Drive, and RAID Storage/Storage Unit, shown with Controller and Cache components.]

Page 22: Inside database

Hard Disk Drive

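[Figure: disk platter with a Track and a Sector labeled.]

$T_{\mathrm{access}} = T_{\mathrm{head\,seek}} + T_{\mathrm{rotation}} + T_{\mathrm{transfer}}$

[Figure: two plots of IO response versus lseek size: small lseek, and large lseek (smoothed).]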
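
A quick calculation shows how the seek, rotation, and transfer terms add up for one random read. All drive parameters below (8 ms average seek, 7200 rpm, 100 MB/s transfer rate, 4 KiB request) are assumed round numbers, not measurements, and rotational delay is taken as half a revolution on average.

```python
def access_time_ms(seek_ms, rpm, transfer_mb_per_s, io_bytes):
    """Estimate one random access as seek + rotation + transfer, in ms."""
    rotation_ms = 0.5 * 60_000 / rpm  # half a revolution on average
    transfer_ms = io_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotation_ms + transfer_ms

total = access_time_ms(seek_ms=8.0, rpm=7200, transfer_mb_per_s=100, io_bytes=4096)
print(f"{total:.2f} ms per random 4 KiB read")
```

With these numbers the total is roughly 12.2 ms, almost all of it head movement and rotation, which is why a DBMS works hard to turn random I/O into sequential I/O.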

Page 23: Inside database

Summary

• DBMS
  – Query Processor
  – Storage Engine
• Storage

Page 24: Inside database

References

• Database System Implementation, lecture notes at Stanford University
  – http://infolab.stanford.edu/~ullman/dbsi.html
• MySQL InnoDB Internal
  – http://www.innodb.com/wp/wp-content/uploads/2009/05/innodb-file-formats-and-source-code-structure.pdf
• MySQL Reference Manual
  – http://dev.mysql.com/doc/

Page 25: Inside database

For Further Study

• Fundamentals of Database Systems
  – http://www.amazon.com/Fundamentals-Database-Systems-Ramez-Elmasri/dp/0136086209
• Books recommended by Leo’s Chronicle
  – http://leoclock.blogspot.com/2009/01/blog-post_07.html

Page 26: Inside database

Fundamentals of Database Systems
