Thanks to our Sponsors! To connect to wireless 1. Choose Uguest in the wireless list 2. Open a browser. This will open a Uof U website 3. Choose Login

Dec 24, 2015

Sophia Bishop
Transcript
Page 1:

Thanks to our Sponsors!

To connect to wireless:
1. Choose Uguest in the wireless list
2. Open a browser. This will open a U of U website
3. Choose Login

Page 2:

Introduction to HBase

Giri Vislawath
Senior Software Developer

Overstock.com

[email protected]

Page 3:

Agenda

• What is HBase?
  – What HBase is NOT?
• Relational Database vs HBase
• HBase
  – Architecture
  – Data Model
  – Logical & Physical View
  – Design Considerations
  – Setup
  – Clients
• Demo
• Q & A

Page 4:

What is HBase?

• Open source Apache project
• Non-relational, distributed database
• Runs on top of HDFS
• Modeled after Google's BigTable technology
• Written in Java
• NoSQL (Not Only SQL) database
• Consistent and partition tolerant
• Runs on commodity hardware
• Large database (terabytes to petabytes)
• Low latency random read / write to HDFS
• Many companies are using HBase
  – Facebook, Twitter, Adobe, Mozilla, Yahoo!, Trend Micro, and StumbleUpon

Page 5:

HBase is NOT

• A direct replacement for RDBMS
• ACID (Atomicity, Consistency, Isolation, and Durability) compliant
  – HBase provides row-level atomicity
  – A scan is NOT a consistent view of a table (nor is it isolated)
  – All visible data is also durable data

Page 6:

Relational Database vs HBase

• Hardware
  – RDBMS: Expensive enterprise multiprocessor systems
  – HBase: Same as Hadoop (commodity hardware)
• Fault Tolerance
  – RDBMS: Configured with high availability. Server downtime is intolerable.
  – HBase: Built into the architecture. Individual node failure does not impact overall performance.
• Database Size
  – RDBMS: Can hold up to TBs (terabytes)
  – HBase: Can hold PBs (petabytes)
• Data Layout
  – RDBMS: Row and column oriented
  – HBase: Column oriented (data is stored per column family)

Page 7:

Relational Database vs HBase

• Data Type
  – RDBMS: Rich data types
  – HBase: Bytes
• Transactions
  – RDBMS: Fully ACID compliant
  – HBase: ACID on a single row only
• Indexes
  – RDBMS: PK, FK and other indexes
  – HBase: Sorted row key (not a real index)
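Because HBase stores plain bytes and keeps rows sorted by row key, the sort order is byte-wise, not numeric. The sketch below shows the effect with the standard library only; it does not use the HBase client, and the class and method names (RowKeyOrder, sortAsRowKeys, padded) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: models byte-wise row-key ordering with the
// standard library. RowKeyOrder and its methods are hypothetical names.
class RowKeyOrder {

    // Lexicographic comparison of two byte arrays, each byte read as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length; // a key that is a prefix of another sorts first
    }

    // Returns the keys in the order a byte-sorted store would hold them.
    static List<String> sortAsRowKeys(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((k1, k2) -> compareBytes(
                k1.getBytes(StandardCharsets.UTF_8),
                k2.getBytes(StandardCharsets.UTF_8)));
        return sorted;
    }

    // Left-pads a numeric id with zeros so byte order matches numeric order.
    static String padded(long id, int width) {
        return String.format("%0" + width + "d", id);
    }

    public static void main(String[] args) {
        // prints [10, 100, 9]
        System.out.println(sortAsRowKeys(Arrays.asList("9", "10", "100")));
        // prints [0009, 0010, 0100]
        System.out.println(sortAsRowKeys(
                Arrays.asList(padded(9, 4), padded(10, 4), padded(100, 4))));
    }
}
```

Fixed-width keys are one common way to make the sorted row key behave like an ordered index for numeric ids.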

Page 8:

HBase Architecture

[Architecture diagram: Client, ZooKeeper, Master, Region Server 1, Region Server 2, Region Server 3, HDFS / Hadoop]

Page 9:

HBase – Fault Tolerance

• What if a region server dies?
  – The HBase master reassigns its regions to another region server.
• What if the master dies?
  – The backup master will take over.
• What if the backup master dies?
  – You are dead.
• Replication of Data
  – HBase achieves this using the HDFS replication mechanism.
• Failure Detection
  – ZooKeeper is used for identifying failed region servers.

Page 10:

HBase Data Model

• No Schema
• Table
  – Row key must be unique
  – Rows are formed by one or more columns
  – Columns are grouped into Column Families
  – Column Families must be defined at table creation time
  – Any number of Columns per column family
  – Columns can be added on the fly
  – Columns can be NULL
    • NULL columns are NOT stored (free of cost)
    • Columns only exist when inserted (sparse)
• Cell
  – Row Key, Column Family, Qualifier, Timestamp / Version
• Data represented in byte array
  – Table name, Column Family name, Column name
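One way to picture this model is as a sorted map of maps: row key, then column (family:qualifier), then timestamp, then value. The sketch below is a toy in-memory model using only the standard library, with Strings instead of byte arrays for readability. SparseTable and its methods are hypothetical names, not part of HBase.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of the data model above; not the HBase client API.
class SparseTable {

    // row key -> "family:qualifier" -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Stores one cell version. Columns that are never written take no space.
    void put(String row, String family, String qualifier, long timestamp, String value) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            columns = new TreeMap<>();
            rows.put(row, columns);
        }
        String column = family + ":" + qualifier;
        NavigableMap<Long, String> versions = columns.get(column);
        if (versions == null) {
            versions = new TreeMap<>(Collections.<Long>reverseOrder());
            columns.put(column, versions);
        }
        versions.put(timestamp, value);
    }

    // Returns the newest version of a cell, or null if the cell was never written.
    String getLatest(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return null;
        }
        NavigableMap<Long, String> versions = columns.get(family + ":" + qualifier);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    // Counts stored cell versions across all rows.
    int cellCount() {
        int count = 0;
        for (NavigableMap<String, NavigableMap<Long, String>> columns : rows.values()) {
            for (NavigableMap<Long, String> versions : columns.values()) {
                count += versions.size();
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SparseTable users = new SparseTable();
        users.put("1234", "info", "fn", 12345678L, "John");
        users.put("5678", "tweet", "msg", 20120825L, "xyz");
        users.put("5678", "tweet", "msg", 20130916L, "zzz");
        System.out.println(users.getLatest("5678", "tweet", "msg")); // prints zzz
        System.out.println(users.getLatest("1234", "tweet", "msg")); // prints null
        System.out.println(users.cellCount());                       // prints 3
    }
}
```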

Page 11:

HBase – Logical View of Data

RDBMS View

ID (pk)  First Name  Last Name  tweet  Timestamp
1234     John        Smith      hello  20130710
5678     Joe         Brown      xyz    20120825
5678     Joe         Brown      zzz    20130916

Logical HBase View

Row key  Value (Column Family, Qualifier, Version)
1234     info{'lastName': 'Smith', 'firstName': 'John'}
         pwd{'tweet': 'hello' @ts 20130710}
5678     info{'lastName': 'Brown', 'firstName': 'Joe'}
         pwd{'tweet': 'xyz' @ts 20120825, 'tweet': 'zzz' @ts 20130916}

Page 12:

HBase – Physical View of Data

info column family

Row key  Column Family:Column  Timestamp  Value
1234     info:fn               12345678   John
1234     info:ln               12345678   Smith
5678     info:fn               12345679   Joe
5678     info:ln               12345679   Brown

tweet column family

Row key  Column Family:Column  Timestamp  Value
1234     tweet:msg             12345678   Hello
5678     tweet:msg             12345679   xyz
5678     tweet:msg             12345999   zzz

KEY (ROW KEY, CF, QUALIFIER, TIMESTAMP) => VALUE

Page 13:

HBase – Logical to Physical View

Logical View

Row   C1   C2   C3   C4   C5   C6   C7
ROW1  V1        V3             V6
ROW2  V4   V6        V7
ROW3            V6             V5
ROW4  V10       V11            V2

Physical View

HFile for CF1
ROW1:CF1:C1:V1
ROW1:CF1:C3:V3
ROW2:CF1:C1:V4
ROW2:CF1:C2:V6
ROW2:CF1:C4:V7
ROW3:CF1:C3:V6
ROW4:CF1:C1:V10
ROW4:CF1:C3:V11

HFile for CF2
ROW1:CF2:C6:V6
ROW3:CF2:C6:V5
ROW4:CF2:C6:V2
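The logical-to-physical mapping on this slide can be sketched as a grouping step: cells are split by column family, and each family's cells are written out in sorted key order. This is a simplified model in plain Java, not how HBase writes HFiles internally. StoreFiles, Cell and toPhysical are hypothetical names, and sorting on the joined "row:family:qualifier" string is a shortcut that assumes keys without ':' characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model of "one sorted file per column family"; not HBase internals.
class StoreFiles {

    // One non-empty logical cell. Empty cells are simply never created.
    static final class Cell {
        final String row;
        final String family;
        final String qualifier;
        final String value;

        Cell(String row, String family, String qualifier, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Groups cells by column family and lists each group in sorted key order.
    static Map<String, List<String>> toPhysical(List<Cell> cells) {
        Map<String, TreeMap<String, String>> byFamily = new TreeMap<>();
        for (Cell c : cells) {
            TreeMap<String, String> sorted = byFamily.get(c.family);
            if (sorted == null) {
                sorted = new TreeMap<>();
                byFamily.put(c.family, sorted);
            }
            sorted.put(c.row + ":" + c.family + ":" + c.qualifier, c.value);
        }
        Map<String, List<String>> files = new TreeMap<>();
        for (Map.Entry<String, TreeMap<String, String>> family : byFamily.entrySet()) {
            List<String> lines = new ArrayList<>();
            for (Map.Entry<String, String> kv : family.getValue().entrySet()) {
                lines.add(kv.getKey() + ":" + kv.getValue());
            }
            files.put(family.getKey(), lines);
        }
        return files;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
                new Cell("ROW2", "CF1", "C1", "V4"),
                new Cell("ROW1", "CF2", "C6", "V6"),
                new Cell("ROW1", "CF1", "C3", "V3"),
                new Cell("ROW1", "CF1", "C1", "V1"));
        // prints {CF1=[ROW1:CF1:C1:V1, ROW1:CF1:C3:V3, ROW2:CF1:C1:V4], CF2=[ROW1:CF2:C6:V6]}
        System.out.println(toPhysical(cells));
    }
}
```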

Page 14:

Design Considerations

• Row Key design
  – To leverage the HBase system, row-key design is very important
  – Row key must be designed based on how you access data
  – Salting the row key (prefix)
  – Must be designed to make sure data is uniformly distributed (avoid hotspotting)
• Column Family design
  – Designed based on grouping of like information (user base info, user tweets)
  – Short name for column family (every cell stored in an HFile contains the name, in bytes)
  – Two to three column families per table
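Salting, mentioned under row key design, can be sketched as a deterministic prefix derived from the key itself, so that keys that would otherwise sort next to each other (for example, keys starting with a date) spread over several buckets. The sketch below is one possible scheme, not a recommendation from the talk. SaltedRowKey, salt, the two-digit prefix format and the bucket count are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// One possible salting scheme, for illustration only.
class SaltedRowKey {

    // Prefixes the key with a two-digit bucket number computed from its bytes.
    // The same key always gets the same prefix, so point lookups still work.
    static String salt(String rowKey, int buckets) {
        if (buckets < 1 || buckets > 100) {
            throw new IllegalArgumentException("buckets must be between 1 and 100");
        }
        int hash = Arrays.hashCode(rowKey.getBytes(StandardCharsets.UTF_8));
        int bucket = Math.floorMod(hash, buckets); // always in [0, buckets)
        return String.format("%02d-%s", bucket, rowKey);
    }

    public static void main(String[] args) {
        // Keys that share a date prefix would sort together without salting.
        for (String key : new String[] {"20130710-1234", "20130710-5678", "20130711-1234"}) {
            System.out.println(salt(key, 16));
        }
    }
}
```

The trade-off is on the read side: a range scan over the original key order has to be issued once per bucket and the results merged.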

Page 15:

HBase - Setup

• HBase is written in Java
• HBase Shell is based on JRuby's IRB (interactive Ruby shell)
• Download HBase from https://hbase.apache.org/
• Latest stable version (at the time of this talk) is 0.94.17
• HBase
  – Standalone
    • $HBASE_HOME/bin/start-hbase.sh
    • $HBASE_HOME/bin/stop-hbase.sh
    • $HBASE_HOME/bin/hbase shell
  – Single Node Cluster mode (pseudo)
    • Cloudera VM (on VMPlayer or VirtualBox) (www.cloudera.com)

Page 16:

HBase – Clients

• Program / API based clients
  – Java, REST, Thrift, Avro
• Batch Clients
  – MapReduce (Pig, Hive)
• Shell
  – Command Line Interface
  – Supports client and administrative operations
• Web-based UI
  – HUI (HBase cluster UI)

Page 17:

HBase – Shell (commands)

Command                                  Description
list                                     Shows the list of tables
create 'users', 'info'                   Creates the users table with a single column family named info
put 'users', 'row1', 'info:fn', 'John'   Inserts data into the users table, column family info
get 'users', 'row1'                      Retrieves a row for a given row key
scan 'users'                             Iterates through the users table
disable 'users'                          Deletes a table (the table must be disabled first)
drop 'users'

CRUD explained
CREATE = PUT
READ = GET
UPDATE = PUT
DELETE = DELETE

Page 18:

HBase – Java API (examples)

Get
    // Read one row by row key
    Get get = new Get(Bytes.toBytes(String.valueOf(uid)));
    Result result = table.get(get);

Put
    // Write two columns of the info family in one row
    Put p = new Put(Bytes.toBytes("" + user.getUid()));
    p.add(Bytes.toBytes("info"), Bytes.toBytes("fn"), Bytes.toBytes(user.getFirstName()));
    p.add(Bytes.toBytes("info"), Bytes.toBytes("ln"), Bytes.toBytes(user.getLastName()));
    table.put(p);

Delete (column, column family)
    // Delete one version of info:fn, identified by its timestamp
    Delete d = new Delete(Bytes.toBytes("" + user.getUid()));
    d.deleteColumn(Bytes.toBytes("info"), Bytes.toBytes("fn"), timestamp1);
    table.delete(d);

Batch Operations
    List of Get, Put or Delete operations

Scan
    Iterate over a table. Prefer range / filtered scans; a full scan is an expensive operation.
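Why a range scan is cheaper than a full scan follows from the sorted row keys: a scan bounded by a start row and a stop row only has to visit the rows in between. The sketch below models this with a TreeMap standing in for a table; it does not call the HBase client. As in an HBase scan, the start row is treated as inclusive and the stop row as exclusive. RangeScanSketch, scan and prefixScan are hypothetical names, and the row keys are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Models a bounded scan over sorted row keys; not the HBase client API.
class RangeScanSketch {

    // Returns row keys in [startRow, stopRow): start inclusive, stop exclusive.
    static List<String> scan(NavigableMap<String, String> table, String startRow, String stopRow) {
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).keySet());
    }

    // Scans all rows whose key starts with the prefix by deriving a stop row
    // that sorts after every key with that prefix (assumes keys never contain
    // the character \uffff).
    static List<String> prefixScan(NavigableMap<String, String> table, String prefix) {
        return scan(table, prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> tweets = new TreeMap<>();
        tweets.put("1234-20130710", "hello");
        tweets.put("5678-20120825", "xyz");
        tweets.put("5678-20130916", "zzz");
        tweets.put("9999-20130101", "abc");
        // prints [5678-20120825, 5678-20130916]
        System.out.println(prefixScan(tweets, "5678-"));
        // prints [1234-20130710, 5678-20120825]
        System.out.println(scan(tweets, "1234-", "5678-20130101"));
    }
}
```

This is also why the row key should be designed around the access pattern: a key of the form user id followed by date lets one user's rows be read with a single bounded scan.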

Page 19:
Page 20:

References

• HBase: The Definitive Guide by Lars George
• HBase in Action by Nick Dimiduk and Amandeep Khurana

Page 21:

Thank You