Scaling HDFS with a Strongly Consistent Relational Model for Metadata Kamal Hakimzadeh, Hooman Peiro Sajjad, Jim Dowling (mahh, shps, jdowling)@kth.se DAIS 2014


Apr 12, 2017

Page 1: Scaling HDFS with a Strongly Consistent Relational Model for Metadata

Scaling HDFS with a Strongly Consistent Relational Model for Metadata

Kamal Hakimzadeh, Hooman Peiro Sajjad, Jim Dowling

(mahh, shps, jdowling)@kth.se

DAIS 2014

Page 2: Scaling HDFS with a Strongly Consistent Relational Model for Metadata

I-node File Systems

Kamal Hakimzadeh, DAIS 2014

[Figure: a file set of i-nodes, each holding file info and pointers to blocks.]

Page 3: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


Hadoop Distributed File System (HDFS)

[Figure: the NameNode (NN) keeps the i-nodes (file info and pointers) for all files; the blocks are stored on DataNodes (DN) running on commodity machines.]

Page 4: Scaling HDFS with a Strongly Consistent Relational Model for Metadata

High Availability in HDFS 2.0

DN DN DN DN

Active NN, Standby NN: Master-Slave Replication of NN State.

JN JN JN: Shared NN log stored in a quorum of journal nodes.

ZK ZK ZK: Agreement on the Active Master.

Checkpoint NN: Faster Recovery, Cut Journal Log.


Page 5: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


NameNode Limitations and Tradeoffs

1. 60 GB JVM heap for NN

• Compression, larger blocks

2. Operation reorder in failures

3. Single writer concurrency model

4. HA consensus overhead

100 M files ≈ 10 PB

65 M files ≈ 21 PB

Eventually Consistent

Poor throughput

Page 6: Scaling HDFS with a Strongly Consistent Relational Model for Metadata

Move Metadata into a Distributed Database

DN DN DN DN

Stateless NN

NDB (up to 48 nodes)

MySQL Cluster

• Distributed, Replicated, In-Memory Database

• Transaction support

• Read-committed isolation level

• Row-level locks

• 17.6 M tx/sec.


Page 7: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


Metadata Consistency

Objective: Strongly Consistent Metadata

1. Transaction per Metadata Operation

2. Read-committed Isolation Level

3. Row-level Locking

Serializable Isolation Level ≈ Strongly Consistent Model

HDFS Uses System Level Lock = Single Writer Concurrency Model
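
The contrast between a system-level lock and row-level locking can be illustrated with a minimal in-memory sketch. This is not the authors' implementation and does not use NDB; the class names `GlobalLockStore` and `RowLockStore` and the paths are hypothetical, and plain Python locks stand in for database row locks.

```python
import threading
from collections import defaultdict


class GlobalLockStore:
    """Single-writer model: one lock guards every metadata row."""

    def __init__(self):
        self._lock = threading.Lock()
        self.rows = {}

    def update(self, key, value):
        # Every update waits for the same lock, whatever row it touches.
        with self._lock:
            self.rows[key] = value


class RowLockStore:
    """Row-level model: each metadata row has its own lock."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()  # protects the lock table itself
        self.rows = {}

    def lock_for(self, key):
        with self._locks_guard:
            return self._locks[key]

    def update(self, key, value):
        # Only updates to the same row wait for each other.
        with self.lock_for(key):
            self.rows[key] = value


store = RowLockStore()
held = store.lock_for("/user/a")
held.acquire()                # another operation is busy with /user/a
store.update("/user/b", 1)    # an unrelated row is still writable
held.release()
```

With `GlobalLockStore`, the second update would have to wait until the first operation finished, which is the single-writer behaviour the slide attributes to HDFS.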

Page 8: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


HDFS Metadata

Page 9: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


Order of Locks in the DAG of Metadata

Metadata Operations:

1. Path Operation

2. Block Operation

3. Lease Operation

Conflicting Lock Order

Total Order Locking

Locking Issues

1. Range Queries

2. Semantically Related Objects

3. Lock Upgrade

Implicit Sub-tree lock

Strongest Required Lock
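
Total order locking can be sketched as follows: if every operation acquires its locks in the same global order, two operations that need the same objects cannot wait on each other in a cycle. This is a minimal illustration under assumptions, not the authors' code; the helper `run_in_total_order` and the lock names are hypothetical, and sorting by name stands in for whatever order the real system defines over the metadata graph.

```python
import threading


def run_in_total_order(locks, keys, action):
    """Acquire the locks for `keys` in sorted order, run `action`, then release."""
    ordered = sorted(set(keys))
    acquired = []
    try:
        for key in ordered:
            locks[key].acquire()
            acquired.append(key)
        return action()
    finally:
        # Release in reverse acquisition order.
        for key in reversed(acquired):
            locks[key].release()


locks = {name: threading.Lock() for name in ("block", "inode", "lease")}
counter = {"n": 0}


def bump():
    counter["n"] += 1


def worker(keys):
    for _ in range(200):
        run_in_total_order(locks, keys, bump)


# The two workers name the same objects in opposite orders; without the
# sort inside run_in_total_order this pattern could deadlock.
t1 = threading.Thread(target=worker, args=(["inode", "block", "lease"],))
t2 = threading.Thread(target=worker, args=(["lease", "block", "inode"],))
t1.start()
t2.start()
t1.join(10)
t2.join(10)
```

The sketch takes every lock up front and never upgrades one later, which is in the spirit of the "Strongest Required Lock" item above.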

Page 10: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


Scale of Capacity

48 Nodes NDB Cluster, 12 TB

• NDB: 3 TB, replication factor 2

• File: 2 blocks, 3 replicas

HDFS: 100M files

Our Solution: 4.1B files

Factor of 40
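
The quoted factor follows directly from the two file counts on this slide; a quick check, using only those two numbers (the variable names are illustrative):

```python
hdfs_files = 100_000_000        # "HDFS: 100M files"
our_files = 4_100_000_000       # "Our Solution: 4.1B files"

factor = our_files / hdfs_files
print(factor)  # 41.0, which the slide rounds to "Factor of 40"
```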

Page 11: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


Row-level lock throughput impact

Open Operation (Shared lock) Create Operation (Exclusive Lock)

Page 12: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


Improvement: Snapshotting

Page 13: Scaling HDFS with a Strongly Consistent Relational Model for Metadata


Page 14: Scaling HDFS with a Strongly Consistent Relational Model for Metadata
