Transcript

  • Slide 1/34

    From distributed caches to in-memory data grids

    TechTalk by Max A. Alexejev

    [email protected]

  • Slide 2/34

    Memory Hierarchy

    (Diagram: the storage pyramid from CPU registers through the L1, L2 and L3 caches, RAM, Flash/SSD and HDD down to tapes and remote systems, with access latency growing and cost per byte falling at each level.)

  • Slide 3/34

    Software caches

    Improve response times by reducing data access latency

    Offload persistent storages

    Only work for IO-bound applications!

  • Slide 4/34

    Caches and data location

    (Diagram: caches are either Local or Remote; remote caches can be Hierarchical, Distributed or Shared, which brings in a distribution algorithm and a consistency protocol.)

  • Slide 5/34

    Ok, so how do we grow beyond one node?

    Data replication

  • Slide 6/34

    Pros and Cons of replication

    Pro: best read performance.

    Con: limited by single machine size; in case of master-master replication, requires a complex consistency protocol.

  • Slide 7/34

    Ok, so how do we grow beyond one node?

    Data distribution

  • Slide 8/34

    Pros and Cons of data distribution

    Pro: can scale horizontally beyond single machine size; reads and writes performance scales horizontally.

    Con: no fault tolerance for cached data; increased latency of reads.

  • Slide 9/34

    What do high-load applications need from a cache?

    (Diagram: low latency + linear horizontal scalability = distributed cache.)

  • Slide 10/34

    Cache access patterns: Cache Aside

    For reading data:
    1. Application asks for some data for a given key.
    2. Check the cache.
    3. If data is in the cache, return it to the user.
    4. If data is not in the cache, fetch it from the DB, put it in the cache, return it to the user.

    For writing data:
    5. Application writes some new data or updates existing.
    6. Write it to the cache.
    7. Write it to the DB.

    Overall:
    Increases reads performance.
    Offloads DB reads.
    Introduces race conditions for writes.
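
    Below is a minimal cache-aside sketch in Java to make the flow above concrete. The Database interface and the in-process map are hypothetical stand-ins for the real DB and cache; it is an illustration, not code from the talk.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical backing store, standing in for the "DB" on the slide.
    interface Database {
        String load(String key);
        void save(String key, String value);
    }

    class CacheAsideClient {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Database db;

        CacheAsideClient(Database db) { this.db = db; }

        // Read path: check the cache, fall back to the DB, populate the cache.
        String read(String key) {
            String value = cache.get(key);
            if (value != null) {
                return value;              // cache hit
            }
            value = db.load(key);          // cache miss: fetch from the DB
            if (value != null) {
                cache.put(key, value);     // put it in the cache for next time
            }
            return value;
        }

        // Write path: the application updates cache and DB itself, which is
        // where the race conditions mentioned above come from.
        void write(String key, String value) {
            cache.put(key, value);
            db.save(key, value);
        }
    }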

  • Slide 11/34

    Cache access patterns: Read Through

    For reading data:
    1. Application asks for some data for a given key.
    2. Check the cache.
    3. If data is in the cache, return it to the user.
    4. If data is not in the cache, the cache itself fetches it from the DB, saves the retrieved value and returns it to the user.

    Overall:
    Reduces reads latency.
    Offloads read load from the underlying storage.
    May have locking behavior, thus helping with the dogpile effect.
    Requires smarter cache nodes.
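
    A sketch of the read-through idea, assuming a hypothetical CacheLoader callback: the cache node, not the client, fetches missing keys, and per-key locking (here computeIfAbsent) is what helps against the dogpile effect.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Hypothetical loader callback: the cache, not the client, talks to the DB.
    interface CacheLoader<K, V> {
        V load(K key);
    }

    class ReadThroughCache<K, V> {
        private final ConcurrentMap<K, V> store = new ConcurrentHashMap<>();
        private final CacheLoader<K, V> loader;

        ReadThroughCache(CacheLoader<K, V> loader) { this.loader = loader; }

        // On a miss the cache invokes the loader itself; computeIfAbsent also
        // blocks concurrent readers of the same key, so only one hits the DB.
        V get(K key) {
            return store.computeIfAbsent(key, loader::load);
        }
    }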

  • Slide 12/34

    Cache access patterns: Write Through

    For writing data:
    1. Application writes some new data or updates existing.
    2. Write it to the cache.
    3. Cache will then synchronously write it to the DB.

    Overall:
    Slightly increases writes latency.
    Provides natural invalidation.
    Removes race conditions on writes.
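
    A minimal write-through sketch (with a hypothetical CacheWriter callback): put updates the cache and synchronously writes to the backing store, so the two never diverge.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical writer callback, representing the DB behind the cache.
    interface CacheWriter<K, V> {
        void write(K key, V value);
    }

    class WriteThroughCache<K, V> {
        private final Map<K, V> store = new ConcurrentHashMap<>();
        private final CacheWriter<K, V> writer;

        WriteThroughCache(CacheWriter<K, V> writer) { this.writer = writer; }

        V get(K key) { return store.get(key); }

        // The cache writes to the DB before acknowledging the put: slightly
        // higher write latency, but no race between cache and DB.
        void put(K key, V value) {
            writer.write(key, value);   // synchronous write to the store
            store.put(key, value);      // then update the cached copy
        }
    }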

  • Slide 13/34

    Cache access patterns: Write Behind

    For writing data:
    1. Application writes some new data or updates existing.
    2. Write it to the cache.
    3. Cache adds the write request to its internal queue.
    4. Later, the cache asynchronously flushes the queue to the DB on a periodic basis.
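
    A rough write-behind sketch, reusing the hypothetical CacheWriter interface from the write-through example above: puts return after updating the cache and enqueueing the write, and a background task periodically flushes the queue to the DB. Batching and error handling are omitted.

    import java.util.AbstractMap.SimpleEntry;
    import java.util.Map;
    import java.util.concurrent.*;

    class WriteBehindCache<K, V> {
        private final Map<K, V> store = new ConcurrentHashMap<>();
        private final BlockingQueue<Map.Entry<K, V>> queue = new LinkedBlockingQueue<>();
        private final CacheWriter<K, V> writer;   // hypothetical writer callback

        WriteBehindCache(CacheWriter<K, V> writer, long flushPeriodMillis) {
            this.writer = writer;
            ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
            // Periodically drain queued writes to the DB, as described above.
            flusher.scheduleAtFixedRate(this::flush, flushPeriodMillis,
                    flushPeriodMillis, TimeUnit.MILLISECONDS);
        }

        V get(K key) { return store.get(key); }

        // Fast writes: update the cache and queue the DB write for later.
        void put(K key, V value) {
            store.put(key, value);
            queue.add(new SimpleEntry<>(key, value));
        }

        private void flush() {
            Map.Entry<K, V> entry;
            while ((entry = queue.poll()) != null) {
                writer.write(entry.getKey(), entry.getValue());
            }
        }
    }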

  • Slide 14/34

    A variety of products on the market…

    Memcached, Hazelcast, Terracotta, EhCache, Oracle Coherence, Riak, Redis, MongoDB, Cassandra, GigaSpaces, Infinispan, …

  • Slide 15/34

    Let's sort 'em out!

    KV caches: Memcached, EhCache, …

    NoSQL: Redis, Cassandra, MongoDB, …

    Data Grids: Oracle Coherence, GemFire, GigaSpaces, GridGain, Hazelcast, Infinispan, …

    Some products are really hard to sort, like Terracotta in both DSO and Express modes.

  • Slide 16/34

    Why don't we have any distributed in-memory RDBMS?

    A master / multi-slaves configuration is, in fact, an example of replication: it helps with reads distribution, but does not help with writes and does not scale beyond a single master.

    Sharding helps with reads and writes for datasets with good data affinity, but does not work nicely with joins semantics.

  • Slide 17/34

    Key-Value caches

    Memcached and EhCache are good examples to look at.

    Keys and values are arbitrary binary data.

  • Slide 18/34

    Memcached

    Developed for LiveJournal in 2003.

    Has client libraries in Java, Ruby, Python and many other languages.

    Nodes are independent and don't communicate with each other.
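
    Since the nodes never talk to each other, it is the client library that spreads keys across them. A small illustration using the spymemcached Java client (host names are placeholders, not from the talk):

    import java.io.IOException;
    import net.spy.memcached.AddrUtil;
    import net.spy.memcached.MemcachedClient;

    public class MemcachedExample {
        public static void main(String[] args) throws IOException {
            // The client gets the whole node list; the servers stay unaware
            // of each other. The key hash decides which node stores a value.
            MemcachedClient client = new MemcachedClient(
                    AddrUtil.getAddresses("cache1:11211 cache2:11211 cache3:11211"));

            client.set("users:123", 3600, "{\"name\":\"Max\"}");  // 1 hour TTL
            System.out.println(client.get("users:123"));

            client.shutdown();
        }
    }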

  • Slide 19/34

    EhCache

    Initially named Easy Hibernate Cache.

    Java-centric, mature product with open-source and commercial editions.

    The open-source version provides only replication capabilities; distributed caching requires a commercial license for both EhCache and the Terracotta TSA.

  • Slide 20/34

    NoSQL Systems

    A whole bunch of different products with both persistent and non-persistent storage options. Let's call them caches and storages, accordingly.

    Built to provide good horizontal scalability.

    Try to fill the feature gap between pure KV stores and full-blown RDBMS.

  • Slide 21/34

    Case study: Redis

    Written in C, supported by VMware.

    Client libraries for C, Java, Scala, PHP, Erlang, etc.

    Single-threaded async implementation.

    Has configurable persistence.

    Works with KV pairs, where the key is a string and the value may be a number, a string or an object (strings, hashes, lists, sets, sorted sets).

    Supports transactions.

        hset users:goku powerlevel 9000
        hget users:goku powerlevel
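
    The same hash commands via the Jedis Java client, wrapped in MULTI/EXEC to illustrate the transactions point (connection details are placeholders):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.Response;
    import redis.clients.jedis.Transaction;

    public class RedisExample {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                // Queue both commands and execute them atomically.
                Transaction tx = jedis.multi();
                tx.hset("users:goku", "powerlevel", "9000");
                Response<String> level = tx.hget("users:goku", "powerlevel");
                tx.exec();
                System.out.println(level.get());   // prints 9000
            }
        }
    }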

  • Slide 22/34

    Use cases: Redis

    Good for fixed lists, tagging, ratings, counters, analytics and queues.

  • Slide 23/34

    Case study: Cassandra

    Written in Java, developed at Facebook.

    Inspired by Amazon Dynamo replication mechanics, but uses a column-based data model.

    Good for logs processing, index storage, voting, job storage, etc.

    Bad for transactional processing.

    Want to know more? Ask Alexey!

  • Slide 24/34

    In-Memory Data Grids

    A new generation of caching products, trying to combine the benefits of replicated and distributed schemes.

  • Slide 25/34

    IMDG: Evolution

    Data Grids: reliable storage and live data balancing among grid nodes.

    Computational Grids: reliable job execution, scheduling and load balancing.

    Modern IMDGs combine both.

  • Slide 26/34

    IMDG: Caching concepts

    Implements a KV cache interface.

    Provides indexed search by values.

    Provides a reliable distributed locks interface.

    Caching scheme (partitioned or distributed) may be specified per cache or cache service.

    Provides events subscription for entries.
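
    Many IMDGs expose exactly this kind of KV interface; the JSR-107 (javax.cache) API shown below is one vendor-neutral way it looks in Java. This is an illustration of the concept, not necessarily the API the talk had in mind; the cache name and types are arbitrary.

    import javax.cache.Cache;
    import javax.cache.CacheManager;
    import javax.cache.Caching;
    import javax.cache.configuration.MutableConfiguration;

    public class GridCacheExample {
        public static void main(String[] args) {
            // Picks up whatever JCache provider (Hazelcast, Coherence,
            // Infinispan, ...) is on the classpath.
            CacheManager manager = Caching.getCachingProvider().getCacheManager();

            MutableConfiguration<Long, String> config =
                    new MutableConfiguration<Long, String>()
                            .setTypes(Long.class, String.class);

            Cache<Long, String> users = manager.createCache("users", config);
            users.put(1L, "Max");
            System.out.println(users.get(1L));
        }
    }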

  • Slide 27/34

    IMDG: Under the hood

    All data is split into a number of sections, called partitions.

    A partition, rather than an entry, is the atomic unit of data migration when the grid rebalances. The number of partitions is fixed for the cluster lifetime.

    Indexes are distributed among grid nodes.

    Clients may or may not be part of the grid cluster.

  • Slide 28/34

    IMDG under the hood: Request routing

    For get() and put() requests:
    1. The cluster member that makes the request calculates the key hash code.
    2. The partition number is calculated from this hash code.
    3. The node is identified by the partition number.
    4. The request is then routed to the identified node, executed, and the results are sent back to the client member that initiated the request.

    For filter queries:
    5. The cluster member initiating the request sends it to all storage-enabled nodes in the cluster.
    6. The query is executed on every node using distributed indexes, and partial results are sent to the requesting member.
    7. The requesting member merges the partial results locally.
    8. The final result set is returned from the filter method.
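
    A rough sketch of the key-to-node routing described in steps 1-3. The partition count and node list are arbitrary; a real grid also maintains a partition-to-owner table that changes as the cluster rebalances.

    import java.util.List;

    class PartitionRouter {
        private final int partitionCount;   // fixed for the cluster lifetime
        private final List<String> nodes;   // storage-enabled members

        PartitionRouter(int partitionCount, List<String> nodes) {
            this.partitionCount = partitionCount;
            this.nodes = nodes;
        }

        // Steps 1-2: hash the key and map the hash to a partition number.
        int partitionFor(Object key) {
            return Math.floorMod(key.hashCode(), partitionCount);
        }

        // Step 3: identify the owning node by partition number (simplified:
        // partitions are assigned round-robin over the member list).
        String nodeFor(Object key) {
            return nodes.get(partitionFor(key) % nodes.size());
        }
    }

    For example, new PartitionRouter(271, List.of("node-a", "node-b", "node-c")).nodeFor("users:42") routes the same key to the same node for as long as membership stays stable.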

  • Slide 29/34

    IMDG: Advanced use-cases

    Messaging

    Map-Reduce calculations

    Cluster-wide singleton

    And more…

  • Slide 30/34

    GC tuning for large grid nodes

    An easy way to go: rolling restarts of storage-enabled cluster nodes. Cannot be used in every project.

    A complex way to go: fine-tune the CMS collector to ensure that it always keeps up cleaning garbage concurrently under the normal production workload.

    An expensive way to go: use the off-heap storage provided by some vendors.
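
    For the 'fine-tune CMS' route, a starting point in that era looked roughly like the flags below. The heap size and thresholds are illustrative only, and CMS itself has since been removed from modern JDKs.

    # Illustrative CMS settings for a large-heap, storage-enabled grid node
    java \
      -Xms32g -Xmx32g \
      -XX:+UseConcMarkSweepGC \
      -XX:+UseParNewGC \
      -XX:CMSInitiatingOccupancyFraction=70 \
      -XX:+UseCMSInitiatingOccupancyOnly \
      -jar grid-node.jar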

  • Slide 31/34

    IMDG: Market players

    Oracle Coherence: commercial, free for evaluation use.

    GigaSpaces: commercial.

    GridGain: commercial.

    Hazelcast: open-source.

    Infinispan: open-source.

  • Slide 32/34

    Terracotta

    The company behind EhCache, Quartz and the Terracotta Server Array.

    Acquired by Software AG.

  • Slide 33/34

    Terracotta Server Array

    All data is split into a number of sections, called stripes.

    Stripes consist of 2 or more Terracotta nodes. One of them is the Active node; the others have Passive status.

    All data is distributed among stripes and replicated inside stripes.

    Open-source limitation: only one stripe. Such a setup supports HA, but does not distribute cache data, i.e. it is not horizontally scalable.

  • Slide 34/34

    Q&A session

    And thank you for coming!