
Los Angeles R users group - Dec 14 2010 - Part 3

Jan 20, 2015

Transcript
Page 1: Los Angeles R users group - Dec 14 2010 - Part 3

Scott Gonyea

Epoch, LLC.

@acts_as

github.com/aitrus

[email protected]

NoSQL

Tuesday, December 14, 2010

Page 2: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL


Page 3: Los Angeles R users group - Dec 14 2010 - Part 3

{ {{ {{{ NoSQL }}} }} }

* GONG *

Page 4: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

So... What is it?

Page 5: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

• It’s “Not SQL”
• Less Structured; Sometimes Un-Structured
• Usually Implies:
  • Key-Value Store
  • Document Datastore
  • Graph Databases (and Their Variants)
• Often not much more than that

Page 6: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Recap and Comparison

Page 7: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

SELECT persons.*
FROM persons
LEFT JOIN automobiles ON persons.id = automobiles.person_id
WHERE automobiles.color = 'RED'

/* Wait! We can NORMALIZE this some more! */

SELECT persons.*
FROM persons
LEFT JOIN automobiles ON persons.id = automobiles.person_id
LEFT JOIN colors ON colors.id = automobiles.color_id
WHERE colors.name = 'RED-34'

/* Wait! ... */

Page 8: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

SELECT persons.*
FROM persons
LEFT JOIN automobiles ON persons.id = automobiles.person_id
WHERE automobiles.color = 'RED'

/* Wait! We can NORMALIZE this some more! */

SELECT persons.*, automobiles.*                               /* Selects */
FROM persons
LEFT JOIN automobiles ON persons.id = automobiles.person_id   /* Joins */
LEFT JOIN colors ON colors.id = automobiles.color_id
WHERE colors.name = 'RED-34'                                  /* Conditions */

/* Wait! ... */

Page 9: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Normalization

Persons

id  First Name  Last Name
1   Seymore     Butts
2   Amanda      Hugginkiss

PersonsVehicles (Join Table)

id  vehicle_id  person_id  color_id
1   7384        1          212
2   7231        2          212

Colors

id   name
212  RED-34
213  BLUE-32

Automobiles

id     make   model  Year
23192  Honda  Civic  2003
19763  Chevy  Tahoe  1998

Vehicles (Relationship: “Belongs To”)

id    identifier         automobile_id
7231  1FALP62W4WH128703  23192
7384  4WH1287031FALP62W  19763
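To make the cost of normalization concrete, here is a minimal Ruby sketch that walks the same tables by hand, roughly what the joins in the earlier query do. The rows come from the slide; the constant names and the `owners_of_color` helper are hypothetical, chosen only for illustration.

```ruby
# Rows copied from the slide's tables. Constant names are illustrative.
PERSONS = [
  { id: 1, first_name: "Seymore", last_name: "Butts" },
  { id: 2, first_name: "Amanda",  last_name: "Hugginkiss" }
]
PERSONS_VEHICLES = [
  { id: 1, vehicle_id: 7384, person_id: 1, color_id: 212 },
  { id: 2, vehicle_id: 7231, person_id: 2, color_id: 212 }
]
COLORS = [
  { id: 212, name: "RED-34" },
  { id: 213, name: "BLUE-32" }
]

# Hypothetical helper: full names of everyone linked to a vehicle of this color.
# Each step below corresponds to one hop through a table.
def owners_of_color(color_name)
  color_ids = COLORS.select { |color| color[:name] == color_name }
                    .map { |color| color[:id] }
  person_ids = PERSONS_VEHICLES
    .select { |link| color_ids.include?(link[:color_id]) }
    .map { |link| link[:person_id] }
  PERSONS.select { |person| person_ids.include?(person[:id]) }
         .map { |person| "#{person[:first_name]} #{person[:last_name]}" }
end

p owners_of_color("RED-34")
```

Three lookups for one question: that is the trade normalization makes, and the one the stores on the following slides try to avoid.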

Page 10: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Key-Value Stores

Key: “John”

Value at “John”: “Smith”

AKA: Dictionary, Hash Table, etc.

KV Database

Give me what’s at “John”

“John” => “Smith”
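The same idea in a few lines of Ruby, as a sketch only: `TinyKV` is a hypothetical in-memory class, not a real database client, and it simply wraps a Ruby Hash.

```ruby
# TinyKV is a hypothetical, in-memory stand-in for a key-value database.
class TinyKV
  def initialize
    @data = {} # a Ruby Hash is itself a dictionary / hash table
  end

  # Store a value under a key, replacing any previous value.
  def set(key, value)
    @data[key] = value
  end

  # Fetch the value at a key; unknown keys return nil.
  def get(key)
    @data[key]
  end
end

db = TinyKV.new
db.set("John", "Smith")
puts db.get("John") # give me what's at "John"
```

A real key-value database adds persistence and network access on top, but the interface is about this small.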

Page 11: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Key-Value Stores

• Very Fast
• Simple
• Key-Value Pairs are Self-Contained
  ∴ Easier Replication
  - Twitter, Facebook, Google, etc.
  - Also, Your Computer

Page 12: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Document[-Oriented] Stores

Very De-Normalized

“John Smith”
  string             first_name   John
  string             last_name    Smith
  embedded document  automobiles
    doc_id             make         Honda
    string             model        Civic
    int                year         2003
    string             color        Red
  string             employer     Epoch, LLC.

Less De-Normalized

“John Smith”
  string             first_name   John
  string             last_name    Smith
  embedded document  automobiles
    doc_ref            vehicle      id(fd18d0af6c053886d)
    string             color        Red
  string             employer     Epoch, LLC.

fd18d0af6c053886d
  string             make         Honda
  string             model        Civic
  int                year         2003
  embedded document  features
    string             brakes       anti-lock
    bool               air_cond     TRUE
  geo_coord          made_at      39.975542, -82.992096
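A rough Ruby sketch of the two layouts, using nested hashes as documents. The field values come from the slide; modeling `automobiles` as an array, the constant names, and the `models_for` helper are assumptions made for illustration.

```ruby
# "Very de-normalized": the automobile lives inside the person document.
EMBEDDED = {
  "first_name"  => "John",
  "last_name"   => "Smith",
  "automobiles" => [
    { "make" => "Honda", "model" => "Civic", "year" => 2003, "color" => "Red" }
  ],
  "employer"    => "Epoch, LLC."
}

# "Less de-normalized": the person only holds a reference to a vehicle document.
VEHICLES = {
  "fd18d0af6c053886d" => {
    "make"     => "Honda",
    "model"    => "Civic",
    "year"     => 2003,
    "features" => { "air_cond" => true },
    "made_at"  => [39.975542, -82.992096]
  }
}
REFERENCED = {
  "first_name"  => "John",
  "last_name"   => "Smith",
  "automobiles" => [{ "vehicle" => "fd18d0af6c053886d", "color" => "Red" }],
  "employer"    => "Epoch, LLC."
}

# Hypothetical helper: list the models a person drives.
# Embedded entries are read directly; referenced ones need a second lookup.
def models_for(person, vehicles)
  person["automobiles"].map do |auto|
    doc = auto.key?("vehicle") ? vehicles.fetch(auto["vehicle"]) : auto
    doc["model"]
  end
end

p models_for(EMBEDDED, VEHICLES)
p models_for(REFERENCED, VEHICLES)
```

Both calls return the same model list. The embedded form answers in one read; the referenced form pays one extra lookup but stores the vehicle details only once.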

Page 13: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Document[-Oriented] Stores

• Analogous to Printed Documents
• Varying Levels of De-normalization
• Documents Still Relatively Self-Contained
  ∴ Easier Replication, Too
  - Twitter, Facebook, Google, etc.

Page 14: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Graph Databases

[Figure: example graph with six nodes, numbered 1 to 6. Source: http://upload.wikimedia.org/wikipedia/commons/5/5b/6n-graf.svg]

Six Degrees of Separation

or: How you realized you have no friends :-(

Page 15: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Graph Databases

• Relationships Derived Through “Distance”
• Some Use-Cases:
  • Geographic
  • Networking
  • Routing / Shortest Path
  • Social
  • Molecular Modeling
  • Kevin Bacon Jokes
• Also Quite Trendy:
  - Twitter, Facebook, Google [Maps, Mail, & Your Life]
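"Distance" here just means the number of hops between two nodes. A minimal sketch in plain Ruby, using breadth-first search over a six-node graph: the edge list is an assumption for illustration, and `distance` is a hypothetical helper, not part of any graph database API.

```ruby
# Undirected edges for a six-node graph (assumed for illustration).
EDGES = [[1, 2], [1, 5], [2, 3], [2, 5], [3, 4], [4, 5], [4, 6]]

# Adjacency list: node => array of neighbors.
ADJACENCY = Hash.new { |hash, node| hash[node] = [] }
EDGES.each do |a, b|
  ADJACENCY[a] << b
  ADJACENCY[b] << a
end

# Breadth-first search. Returns the number of hops between two nodes,
# or nil when no path exists.
def distance(from, to)
  return 0 if from == to
  seen  = { from => 0 }
  queue = [from]
  until queue.empty?
    node = queue.shift
    ADJACENCY.fetch(node, []).each do |neighbor|
      next if seen.key?(neighbor)
      seen[neighbor] = seen[node] + 1
      return seen[neighbor] if neighbor == to
      queue << neighbor
    end
  end
  nil
end

puts distance(1, 6)
```

With these edges, node 6 is three hops from node 1 (for example 1, 5, 4, 6). Kevin Bacon numbers are the same computation on a much larger graph.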

Page 16: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Let’s Explore: redis

Page 17: Los Angeles R users group - Dec 14 2010 - Part 3

NoSQL

Let’s Explore: redis

• Versatile Key/Value Store
• Type Aware (Optional)
  ‣ Nums, Strings, Lists, Sets, Hashes
• Atomic Operations
• Subscriptions
• Transactional (When needed)
• Customizable Durability

Page 18: Los Angeles R users group - Dec 14 2010 - Part 3

redis

Needs a Better R Package

> install.packages("rredis")
> require('rredis')
> redisConnect()

> key <- 'zero'
> value <- 'hero'

> redisSet(key, value)
[1] TRUE

> redisGet(key)
[1] "hero"

Page 19: Los Angeles R users group - Dec 14 2010 - Part 3

redis

doRedis (rredis + foreach)

redis Subscriptions

# Uses Redis' publish/subscribe messaging to distribute workloads
require('doRedis')

registerDoRedis('jobs')

# Braces keep the whole expression in the loop body;
# %dopar% binds tighter than * and /
foreach(j=1:1000,.combine=sum,.multicombine=TRUE) %dopar% {
  4*sum((runif(1000000)^2 + runif(1000000)^2)<1)/10000000
}

removeQueue('jobs')

Page 20: Los Angeles R users group - Dec 14 2010 - Part 3

redis

Sets, From Ruby

ruby

require "redis"

redis = Redis.new
users = %w{albert bernard charles}

users.each {|usr| redis.sadd "users", usr}

redis.smembers "users"
# => ["charles", "bernard", "albert"]

redis

redis> smembers "users"
1. "charles"
2. "bernard"
3. "albert"

Page 21: Los Angeles R users group - Dec 14 2010 - Part 3

redis

Set Intersections, Unions, Rand

admins = %w{bernard frank alice}

admins.each do |adm|
  redis.sadd "admins", adm
end

redis.sinter("users", "admins")
# => ["bernard"]

redis.sunion "admins", "users"
# => ["alice", "bernard", "albert", "frank", "charles"]

redis.srandmember "users"
# => "albert"
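For comparison, the same set operations can be tried without a Redis server using Ruby's standard `Set` class. This is a local sketch of the set semantics only, not the redis client API; the constant names are illustrative.

```ruby
require "set"

USERS  = Set.new(%w[albert bernard charles])
ADMINS = Set.new(%w[bernard frank alice])

both   = USERS & ADMINS # intersection, the idea behind SINTER
either = USERS | ADMINS # union, the idea behind SUNION
random = USERS.to_a.sample # one random member, the idea behind SRANDMEMBER

p both.to_a
p either.to_a.sort
p random
```

The difference is where the data lives: a local `Set` belongs to one process, while the Redis sets are shared by every client that connects.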

Page 22: Los Angeles R users group - Dec 14 2010 - Part 3

redis

Atomic String Operations

redis.set "foo", "bar"
# => "OK"

redis.append "foo", "baz"
# => 6

redis.get "foo"
# => "barbaz"

Page 23: Los Angeles R users group - Dec 14 2010 - Part 3

redis

Atomic Operations

# Strings
redis.set "foo", "bar"     # => "OK"
redis.append "foo", "baz"  # => 6
redis.get "foo"            # => "barbaz"

# Fixnums
redis.set "one", 1         # => "OK"
redis.incr "one"           # => 2
redis.incrby "one", 3      # => 5
redis.get "empty"          # => nil
redis.incr "empty"         # => 1
redis.get "empty"          # => "1"
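Why "atomic" matters: an increment is a read, an add, and a write, and two clients doing that at once can lose an update. Below is a minimal in-process sketch of the same guarantee using a Mutex. `CounterStore` is a hypothetical model written for illustration, not how Redis is implemented and not the redis client API.

```ruby
# CounterStore is a hypothetical in-memory model of atomic counters.
class CounterStore
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  # Read-modify-write under a lock, so concurrent callers never lose an update.
  # A missing key counts as 0, matching the "empty" example on the slide.
  def incrby(key, amount)
    @lock.synchronize do
      value = @data.fetch(key, "0").to_i + amount
      @data[key] = value.to_s # stored as a string, like get "empty" => "1"
      value
    end
  end

  def incr(key)
    incrby(key, 1)
  end

  def get(key)
    @lock.synchronize { @data[key] }
  end
end

store = CounterStore.new
threads = 8.times.map { Thread.new { 1000.times { store.incr("hits") } } }
threads.each(&:join)
puts store.get("hits")
```

Eight threads each add 1000, and the final count is 8000 because every increment runs as one indivisible step.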