
Netflix's Big Leap from Oracle to Cassandra

Jan 22, 2018



Roopa Tangirala
Transcript
Page 1: Netflix's Big Leap from Oracle to Cassandra

NETFLIX’S BIG LEAP FROM ORACLE TO C*

ROOPA TANGIRALA
Engineering Manager, Netflix

Page 2: Netflix's Big Leap from Oracle to Cassandra

WHO AM I?

Engineering Manager @ Netflix
Twitter - @roopatangirala
Email - [email protected]
LinkedIn - https://www.linkedin.com/pub/roopa-tangirala/3/960/2b

Page 3: Netflix's Big Leap from Oracle to Cassandra

OVERVIEW

• Brief History

• Set up

• Migration Strategy

• Migration Challenges

• Examples of real use cases

• Lessons learnt

Page 4: Netflix's Big Leap from Oracle to Cassandra

1997

Page 5: Netflix's Big Leap from Oracle to Cassandra

DATACENTER

Page 6: Netflix's Big Leap from Oracle to Cassandra

BACKEND

Page 7: Netflix's Big Leap from Oracle to Cassandra

ORACLE DATAMODEL LIMITATIONS

Page 8: Netflix's Big Leap from Oracle to Cassandra

NO HORIZONTAL SCALING

Page 9: Netflix's Big Leap from Oracle to Cassandra

LICENSE COST

Page 10: Netflix's Big Leap from Oracle to Cassandra

EVERY TWO WEEKS!!

Page 11: Netflix's Big Leap from Oracle to Cassandra

2008

Page 12: Netflix's Big Leap from Oracle to Cassandra

MOVE TO CLOUD

Page 13: Netflix's Big Leap from Oracle to Cassandra

REQUIREMENTS

• HIGHLY AVAILABLE

• MULTI DATACENTER SUPPORT

• PREDICTABLE PERFORMANCE AT SCALE

Page 14: Netflix's Big Leap from Oracle to Cassandra

WHY C* ?

• Massively scalable architecture

• Multi-datacenter, multi-directional replication

• Linear scale performance

• Transparent fault detection and recovery

• Flexible, dynamic schema data

• Guaranteed data safety

• Tunable data consistency

Page 15: Netflix's Big Leap from Oracle to Cassandra

BACKEND NOW

Page 16: Netflix's Big Leap from Oracle to Cassandra

MICRO SERVICES

• Horizontal, Homogeneous, Commoditized

Page 17: Netflix's Big Leap from Oracle to Cassandra

DOWNTIME

Page 18: Netflix's Big Leap from Oracle to Cassandra

ALMOST DAILY PUSHES

Page 19: Netflix's Big Leap from Oracle to Cassandra

ACTIVE ACTIVE

Page 20: Netflix's Big Leap from Oracle to Cassandra

GLOBAL PRESENCE

Page 21: Netflix's Big Leap from Oracle to Cassandra

MIGRATION STRATEGY

Page 22: Netflix's Big Leap from Oracle to Cassandra

BABY STEPS

Page 23: Netflix's Big Leap from Oracle to Cassandra

NEW FEATURES FIRST TO CLOUD

Page 24: Netflix's Big Leap from Oracle to Cassandra

DATA MODEL REVIEW

ORACLE                    CASSANDRA

DB/Schema                 keyspace
Table                     column family
Row                       Row
column                    column
• name                    • name
• value                   • value
                          • timestamp

Page 25: Netflix's Big Leap from Oracle to Cassandra

SCHEMALESS DESIGN

• Row-oriented

• Number of columns/Names can differ

Row xyz: name = Paul, zip = 95123

Row abc: name = Adam, zip = 94538, sex = Male

Row cde: name = XYZ
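
A minimal sketch of the idea on this slide, assuming that xyz, abc and cde are row keys and that each row carries only the columns it needs. The `users` dictionary and the `columns_of` helper are hypothetical names used for illustration; this is a plain in-memory model, not a Cassandra client call.

```python
# Hypothetical in-memory model of one column family:
# row key -> {column name: value}. Rows need not share the same columns.
users = {
    "xyz": {"name": "Paul", "zip": "95123"},
    "abc": {"name": "Adam", "zip": "94538", "sex": "Male"},
    "cde": {"name": "XYZ"},
}


def columns_of(row_key):
    """Return the sorted column names present in a row (empty if the row is missing)."""
    return sorted(users.get(row_key, {}))
```

Asking for the columns of row abc and row cde returns lists of different lengths, which is the point of the slide: the number and names of columns can differ from row to row.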

Page 26: Netflix's Big Leap from Oracle to Cassandra

UNDERSTAND WRITE PATH

client -> Commit log (disk) and Memtable (memory)

Memtable -> Flush -> sstable, sstable, sstable
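
The write path above can be sketched as a toy, single-node model. `ToyNode`, `memtable_limit` and the in-memory lists are assumptions made for illustration; real Cassandra flushes on memory thresholds and recycles commit log segments rather than clearing a list.

```python
class ToyNode:
    """Toy write path: append to the commit log, update the memtable, flush to an SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stands in for the append-only commit log on disk
        self.memtable = {}     # in memory: key -> (timestamp, value)
        self.sstables = []     # immutable snapshots, oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value, timestamp):
        # Durability first: the write is logged before it is applied in memory.
        self.commit_log.append((key, value, timestamp))
        current = self.memtable.get(key)
        if current is None or timestamp >= current[0]:
            self.memtable[key] = (timestamp, value)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as a new SSTable and start with an empty memtable."""
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs replay
```

Writes never modify an existing SSTable in this model; they only add new ones, which is why the number of SSTables grows until compaction merges them.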

Page 27: Netflix's Big Leap from Oracle to Cassandra

UNDERSTAND READ PATH

client -> memtable, sstable, sstable, sstable

Row cache/key cache
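
A rough sketch of the read path, assuming each stored cell is a `(timestamp, value)` pair and that the newest timestamp wins. The `read` function is a hypothetical helper; it leaves out the row cache and key cache named on the slide.

```python
def read(key, memtable, sstables):
    """Return the value with the newest timestamp across the memtable and all SSTables."""
    candidates = []
    if key in memtable:
        candidates.append(memtable[key])
    for table in sstables:  # every SSTable may hold a version of the key
        if key in table:
            candidates.append(table[key])
    if not candidates:
        return None
    timestamp, value = max(candidates, key=lambda cell: cell[0])
    return value
```

In this model a read has to consult the memtable plus every SSTable that might contain the key, so each additional SSTable adds work to the read.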

Page 28: Netflix's Big Leap from Oracle to Cassandra

LOGIC IN APPLICATION

• Stored procedures

• Functions

• Triggers

• Referential integrity constraints

Page 29: Netflix's Big Leap from Oracle to Cassandra

[Diagram: the APP sends dual WRITES to ORACLE (DATACENTER) and C* (AWS) and serves READS; a FORKLIFT moves data from ORACLE to C*; a CONSISTENCY CHECKER compares the two stores.]

Page 30: Netflix's Big Leap from Oracle to Cassandra

MAIN APPROACH

• DUAL WRITES

• FORKLIFT OLD DATASET

• CONSISTENCY CHECKER

• MIGRATE READS TO C* FIRST

• SWITCH OFF WRITES TO DC
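
The steps above can be sketched with two dictionaries standing in for the Oracle table and the C* column family. `DualWriter`, `forklift` and `find_mismatches` are hypothetical names; the talk does not show Netflix's actual tooling, so treat this as an illustration of the sequence only.

```python
class DualWriter:
    """Write to both stores; read from whichever store is currently primary."""

    def __init__(self, oracle, cassandra):
        self.oracle = oracle              # dict standing in for the Oracle table
        self.cassandra = cassandra        # dict standing in for the C* column family
        self.read_from_cassandra = False  # flipped once the checker reports no mismatches

    def write(self, key, value):
        self.oracle[key] = value
        self.cassandra[key] = value

    def read(self, key):
        store = self.cassandra if self.read_from_cassandra else self.oracle
        return store.get(key)


def forklift(oracle, cassandra):
    """Copy rows that predate dual writes without overwriting dual-written data."""
    for key, value in oracle.items():
        cassandra.setdefault(key, value)


def find_mismatches(oracle, cassandra):
    """Consistency checker: keys that are missing or different on either side."""
    keys = set(oracle) | set(cassandra)
    return sorted(k for k in keys if oracle.get(k) != cassandra.get(k))
```

Reads move to C* only after the checker comes back empty, and writes to the datacenter store are switched off last, so there is a fallback until the final step.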

Page 31: Netflix's Big Leap from Oracle to Cassandra

Relationships – Better in App Layer

Page 32: Netflix's Big Leap from Oracle to Cassandra

ORACLE SEQUENCE

• Use UUIDs for unique keys

• Distributed Sequence Generator for Ordering

• C* counters – not so much
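
As a small illustration of the first bullet, Python's standard `uuid` module can generate keys without a central sequence. The two helper names are hypothetical; the sketch does not cover the distributed sequence generator, which is still needed where strict ordering matters.

```python
import uuid


def new_unique_key():
    """Random (version 4) UUID: unique without coordination, but carries no ordering."""
    return uuid.uuid4()


def new_time_based_key():
    """Version 1 UUID: embeds a timestamp and is generated locally on each host."""
    return uuid.uuid1()
```

Unlike an Oracle sequence, neither helper needs a round trip to a shared counter, so key generation does not become a bottleneck as nodes are added.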

Page 33: Netflix's Big Leap from Oracle to Cassandra

Heavy Transactional Use Case

RDBMS

Page 34: Netflix's Big Leap from Oracle to Cassandra

CHALLENGES

Page 35: Netflix's Big Leap from Oracle to Cassandra

SECURITY

Page 36: Netflix's Big Leap from Oracle to Cassandra

DENORMALIZE

DENORMALIZE

DENORMALIZE

DENORMALIZE

DENORMALIZE

Page 37: Netflix's Big Leap from Oracle to Cassandra

Roman Riding

Page 38: Netflix's Big Leap from Oracle to Cassandra

Model Around Queries

Page 39: Netflix's Big Leap from Oracle to Cassandra

Engineering Effort

Page 40: Netflix's Big Leap from Oracle to Cassandra

Know limitations

Page 41: Netflix's Big Leap from Oracle to Cassandra

SOURCE OF TRUTH

Page 42: Netflix's Big Leap from Oracle to Cassandra

REAL EXAMPLES

Page 43: Netflix's Big Leap from Oracle to Cassandra

API

• High concurrency

• Range scans

• ~1MB of data

• Caused heap issues at Cassandra level

• Very high read latency

Page 44: Netflix's Big Leap from Oracle to Cassandra

Range scan replaced with inverted index

[Diagram: the client (1) reads row Idx_1 of the active_Scripts_idx column family, whose columns hold script paths such as /tvui/{ui}/pathEvaluator and /odp/{ui}/pathEvaluator, and then (2) reads the matching rows of the scripts column family. Rows shown in scripts: 0000/odp/{ui}/pathEvaluator_2, 0000/xbox/{ui}/pathEvaluator_1 and 0000/tvui/{ui}/pathEvaluator_1, each with the columns active, Scripts_1, Text data, false, allocation.]

Page 45: Netflix's Big Leap from Oracle to Cassandra

Inverted Index considerations

• Column name can be used as a row key placeholder

• Hotspots!!

• Sharding
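
One way to picture these considerations is an index whose column names are the row keys of the data rows, spread over several index rows so that no single row becomes a hotspot. `ShardedIndex`, `shard_for` and `NUM_SHARDS` are assumptions for illustration; the slides do not say how the real index was sharded.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(row_key):
    """Stable shard number for a row key (crc32 is deterministic across runs)."""
    return zlib.crc32(row_key.encode("utf-8")) % NUM_SHARDS


class ShardedIndex:
    """Inverted index: each index row maps column names (data row keys) to empty values."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # index row key -> {data row key: placeholder value}

    def add(self, row_key):
        index_row = f"{self.name}_{shard_for(row_key)}"
        self.rows.setdefault(index_row, {})[row_key] = ""

    def all_keys(self):
        """Read every shard and return the indexed row keys in sorted order."""
        keys = []
        for shard in range(NUM_SHARDS):
            keys.extend(self.rows.get(f"{self.name}_{shard}", {}))
        return sorted(keys)
```

A lookup becomes a handful of point reads on the index rows followed by point reads on the data rows, instead of a range scan across the whole column family.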

Page 46: Netflix's Big Leap from Oracle to Cassandra

VIEWING HISTORY

Page 47: Netflix's Big Leap from Oracle to Cassandra

Growth of Viewing History

Page 48: Netflix's Big Leap from Oracle to Cassandra

Problem

Growth Pattern: “Hockey-stick”

Retention Policy: Retain forever

Access Pattern: Retrieve all for a customer

Scaling and performance challenges as the data grows

Page 49: Netflix's Big Leap from Oracle to Cassandra

Goals

Small Storage Footprint

Consistent Read/Write Performance

Infinite Scalability

Page 50: Netflix's Big Leap from Oracle to Cassandra

Old Data Model

Page 51: Netflix's Big Leap from Oracle to Cassandra

Old Data Model - Pros

Simple Wide Rows

No additional processing

Fast Write

Fast Read for Majority

Page 52: Netflix's Big Leap from Oracle to Cassandra

Old Data Model - Cons

Read latency increases as number of viewing records increases

Cassandra Internal

• Number of SSTables

• Compaction

• Read Repair

Lesson learned: Wide Row Limitation

Page 53: Netflix's Big Leap from Oracle to Cassandra

New Data Model

Split Viewing History into two Column Families

1. Recent
• Small records with frequent updates
• Cassandra tuning: compaction, read repair, etc.

2. Compressed
• Large historical records with rare updates
• Rollup
• Compression
• Cassandra: rare compaction, no read repair
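
A minimal sketch of the split, assuming viewing records are JSON-serializable and that a rollup runs once the recent list passes a fixed size. `ViewingHistory`, `RECENT_LIMIT` and the zlib/JSON encoding are illustrative choices, not details given in the talk.

```python
import json
import zlib

RECENT_LIMIT = 3  # hypothetical number of recent records kept before a rollup


class ViewingHistory:
    def __init__(self):
        self.recent = {}      # customer_id -> list of small, frequently updated records
        self.compressed = {}  # customer_id -> one compressed blob of older records

    def record_view(self, customer_id, record):
        views = self.recent.setdefault(customer_id, [])
        views.append(record)
        if len(views) > RECENT_LIMIT:
            self._rollup(customer_id)

    def _rollup(self, customer_id):
        """Merge recent records into the compressed blob and empty the recent list."""
        history = self._load_compressed(customer_id) + self.recent[customer_id]
        payload = json.dumps(history).encode("utf-8")
        self.compressed[customer_id] = zlib.compress(payload)
        self.recent[customer_id] = []

    def _load_compressed(self, customer_id):
        blob = self.compressed.get(customer_id)
        if blob is None:
            return []
        return json.loads(zlib.decompress(blob).decode("utf-8"))

    def all_views(self, customer_id):
        """Retrieve everything for a customer: compressed history plus recent records."""
        return self._load_compressed(customer_id) + self.recent.get(customer_id, [])
```

Most writes touch only the small recent list, while the large historical data changes only at rollup time, which matches the "frequent updates" versus "rare updates" split on the slide.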

Page 54: Netflix's Big Leap from Oracle to Cassandra

New Data Model cont’d

Page 55: Netflix's Big Leap from Oracle to Cassandra

RESULTS

Page 56: Netflix's Big Leap from Oracle to Cassandra

Think Data Archival

Data stores grow exponentially

Have a process in place to archive data

• Moving to a separate column family

• Moving to a separate cluster (non SSD)

• Setting right expectations w.r.t latencies with historical data

Cassandra TTLs

Page 57: Netflix's Big Leap from Oracle to Cassandra

Cinematch Rating Service

• First Model

• Second Model

[Figure: rows keyed by Movie_id (12345 and 4355), shown once with a single BLOB per row and once with individual numeric column/value pairs per row.]

Page 58: Netflix's Big Leap from Oracle to Cassandra

WHAT WORKED?

[Figure: rows 12345 and 4355, each shown with numeric columns, a Name column and a BLOB column.]

Page 59: Netflix's Big Leap from Oracle to Cassandra

BLOB vs COLUMN/VALUE

Try the column/value approach first; it may already satisfy the avg/95th/99th percentile requirements.

Column/value pros:

• Write payload can be smaller

• Query by specific columns

• Read path does not require reading the entire row

Blob considerations:

• Read percentage is very high (90s)

• Read latencies are very important

• All the data is read most of the time
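
The trade-off can be illustrated with a toy comparison of the two write paths, assuming the row's data is JSON-serializable. `update_columns` and `update_blob` are hypothetical helpers, and the returned values stand in for the payload sent to the store.

```python
import json


def update_columns(row, column, value):
    """Column/value layout: a write touches only the column that changed."""
    row[column] = value
    return {column: value}  # payload is just the changed column


def update_blob(blob, column, value):
    """Blob layout: read, modify and rewrite the whole serialized row."""
    data = json.loads(blob)
    data[column] = value
    return json.dumps(data, sort_keys=True)  # payload is the full blob
```

With the blob layout every update rewrites all of the row's data, which is acceptable mainly when reads dominate and nearly all of the data is needed on each read.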

Page 60: Netflix's Big Leap from Oracle to Cassandra

LESSONS LEARNT

• Get the model right

• Baby Steps

• Huge Engineering effort

• Think Compression

• Performance test

• DB just for Data

Page 61: Netflix's Big Leap from Oracle to Cassandra

WE ARE HIRING !! – CHECK OUT JOBS.NETFLIX.COM