NoSQL for the SQL Server Pro Lynn Langit Feb 2013 – SDC, Sweden
Jan 11, 2016

NoSQL for the SQL Server Pro

Lynn Langit

Feb 2013 – SDC, Sweden

Is NoSQL just Hadoop?

• HUGE Hype factor over last few years

Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license
• enables applications to work with thousands of nodes and petabytes of data
• was inspired by Google's MapReduce and Google File System (GFS) papers

Hadoop in the Enterprise

Working with Hadoop
Common Tools / Languages
• Java (JDK) / Eclipse
• MapReduce
  – Map (query/format)
  – Reduce (aggregate)
  – plug-in for Eclipse (Java)
• Pig (ETL -- Java)
• Hive (HQL Query)
• HBase tables
• Others
  – Mahout (analyze)
  – Karmasphere (analyze)
  – R (analyze)
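To make the Map (query/format) and Reduce (aggregate) split concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are illustrative assumptions, and a real cluster would run these phases in parallel across nodes.

```python
from collections import defaultdict

def map_phase(line):
    """Map: format one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: aggregate all values collected for one key."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data big hype", "data everywhere"])
print(counts)
```

The same shape (emit key/value pairs, group by key, aggregate per key) is what tools such as Pig and Hive generate on the user's behalf.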

Demo - HDInsight - Cluster Allocation

What is the relationship?

NoSQL BigData

BigData = Exponentially More Data
• Retail Example -> ‘Feedback Economy’
  – Number of transactions
  – Number of behaviors (collected every minute)

[Chart: Purchases, Locations, and Phone data over time, 12:00 to 2:30; vertical axis 0 to 2500]

BigData = ‘Next State’ Questions

• What could happen?
• Why didn’t this happen?
• When will the next new thing happen?
• What will the next new thing be?
• What happens?

Collecting Behavioral data

Demo - HDInsight - MapReduce

Hitting (Relational) Walls

• CA – Highly-available consistency
• CP – Enforced consistency
• AP – Eventual consistency
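As a rough illustration of what "eventual consistency" means on the AP side, the sketch below keeps two in-memory replicas that accept writes independently and converge only when they sync. The `Replica` class, the `sync` function, the version-number scheme (highest version wins), and the sample key are all assumptions made for this example, not how any particular product implements replication.

```python
class Replica:
    """One node holding key -> (version, value); the highest version wins."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

def sync(a, b):
    """Exchange entries in both directions so each replica keeps the newest version."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)

east, west = Replica(), Replica()
east.write("cart:42", ["book"], version=1)
stale = west.read("cart:42")   # west has not seen the write yet
sync(east, west)
fresh = west.read("cart:42")   # after syncing, both replicas agree
```

Between the write and the sync, a reader on `west` gets an out-of-date answer. That window is the trade-off an AP system accepts in exchange for staying available.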

So many NoSQL options
• More than just the Elephant in the room
• More than 120 types of NoSQL databases

Flavors of NoSQL

Key / Value Database
• Schema-less
• State (Persistent or Volatile)
• Examples
  – AWS DynamoDB
  – Riak
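A minimal sketch of the key/value idea, assuming nothing beyond the Python standard library: values are opaque to the store (schema-less), and state is either volatile (memory only) or persistent (written to a JSON file). The `KeyValueStore` class, its `put`/`get` methods, and the file name are hypothetical and are not the API of DynamoDB or Riak.

```python
import json
import os
import tempfile

class KeyValueStore:
    """Schema-less store: any JSON-serializable value can sit under any key."""

    def __init__(self, path=None):
        self.path = path          # None means volatile (in-memory only)
        self.items = {}
        if path and os.path.exists(path):
            with open(path) as f:
                self.items = json.load(f)

    def put(self, key, value):
        self.items[key] = value
        if self.path:             # persistent mode: rewrite the file on every put
            with open(self.path, "w") as f:
                json.dump(self.items, f)

    def get(self, key, default=None):
        return self.items.get(key, default)

# Volatile: values of different shapes share one store.
cache = KeyValueStore()
cache.put("session:1", {"user": "lynn"})
cache.put("counter", 7)

# Persistent: a second instance opened on the same file sees the data.
path = os.path.join(tempfile.mkdtemp(), "store.json")
KeyValueStore(path).put("sku:100", {"name": "widget", "price": 9.5})
reloaded = KeyValueStore(path).get("sku:100")
```

Lookups are by key only. There is no query over the contents of a value, which is the main thing a SQL Server professional gives up with this flavor.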

Column Database

• Wide, sparse column sets
• Examples:
  – Cassandra
  – HBase
  – BigTable
  – GAE HR DS
  – Azure Tables
  – SQL 2012 Tabular Model

More about Column Databases

• Type A
  – Column-families
  – Non-relational
  – Sparse
  – Examples: HBase, Cassandra, xVelocity (SQL 2012 Tabular)
• Type B
  – Column-stores
  – Relational
  – Dense
  – Example: SQL Server 2012 Columnstore index
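The sparse versus dense distinction between the two types can be pictured with plain Python data structures. This is only a mental model under assumed sample data; the names `column_family`, `column_store`, and `total_for_region` are invented for the sketch and say nothing about how HBase or a Columnstore index lays out bytes on disk.

```python
# Type A (column family, sparse): each row stores only the columns it has.
column_family = {
    "user:1": {"name": "Ann", "city": "Malmo"},
    "user:2": {"name": "Bo", "twitter": "@bo", "last_login": "2013-02-01"},
}

# Type B (column store, dense): one array per column, every row has a slot.
column_store = {
    "order_id": [1, 2, 3, 4],
    "region":   ["SE", "SE", "NO", "SE"],
    "amount":   [120, 80, 200, 50],
}

def total_for_region(store, region):
    """Aggregate by scanning only the two columns the query touches."""
    return sum(a for r, a in zip(store["region"], store["amount"]) if r == region)

se_total = total_for_region(column_store, "SE")

# The sparse model has no fixed column list; it is the union of what rows contain.
columns_used = sorted({c for row in column_family.values() for c in row})
```

In the dense layout an aggregate reads a couple of contiguous columns and ignores the rest. In the sparse layout, adding a new attribute to one row costs nothing for the others.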

Demo - Document Database (MongoDB)

• document-oriented (collection of JSON documents) w/ semi-structured data
  – Encodings include BSON, JSON, XML…
  – binary forms (PDF, Microsoft Office documents such as Word, Excel…)
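To show what "a collection of JSON documents with semi-structured data" looks like in practice, here is a small stand-alone sketch using Python dictionaries. It does not use a MongoDB driver; the `collection` data and the `find` and `find_by_path` helpers are hypothetical, with the dot-separated path only loosely modeled on how document databases address nested fields.

```python
import json

# Documents in one collection need not share the same fields.
collection = [
    {"_id": 1, "name": "Ann", "tags": ["sql", "bi"]},
    {"_id": 2, "name": "Bo", "address": {"city": "Malmo"}},
    {"_id": 3, "name": "Cy", "tags": ["nosql"], "address": {"city": "Lund"}},
]

def find(docs, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]

def find_by_path(docs, path, value):
    """Match on a nested field addressed with a dotted path, e.g. 'address.city'."""
    results = []
    for doc in docs:
        node = doc
        for part in path.split("."):
            node = node.get(part) if isinstance(node, dict) else None
        if node == value:
            results.append(doc)
    return results

in_lund = find_by_path(collection, "address.city", "Lund")
as_json = json.dumps(in_lund[0], sort_keys=True)   # documents serialize straight to JSON
```

Unlike a relational table, a missing `address` is simply absent from the document rather than stored as a NULL column.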

Demo - Graph Database (Neo4j)
• a lot of many-to-many relationships
• recursive self-joins
• when your primary objective is quickly finding connections, patterns and relationships between the objects within lots of data
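Finding a connection between two objects is a traversal problem, which a relational schema would express as repeated self-joins. The sketch below shows the idea with a breadth-first search over an adjacency map in plain Python. The `follows` data and the `shortest_path` function are assumptions for illustration and do not represent Neo4j's query language or API.

```python
from collections import deque

# Directed "follows" relationships: a many-to-many edge set.
follows = {
    "ann": {"bo", "cy"},
    "bo": {"dee"},
    "cy": {"dee"},
    "dee": {"eve"},
    "eve": set(),
}

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest list of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph.get(node, ())):  # sorted for a stable result
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

path = shortest_path(follows, "ann", "eve")
print(path)
```

Each extra hop here is one more loop iteration. In SQL, each extra hop is another self-join on the relationship table, which is where the relational approach starts to strain.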

So which type of NoSQL? Back to CAP…

Consistency / Availability / Partitioning

• CP = NoSQL/column: Hadoop, BigTable, HBase, MemcacheDB
• CA = SQL/RDBMS: SQL Server, Oracle, MySQL
• AP = NoSQL/document or key/value: DynamoDB, CouchDB, Cassandra, Voldemort

Which type of NoSQL for which type of data?

Type of Data | Type of NoSQL solution | Example
Log files | Wide Column | HBase
Product Catalogs | Key Value on disk | DynamoDB
User profiles | Key Value in memory | Redis
Startups | Document | MongoDB
Social media connections | Graph | Neo4j
LOB w/Transactions | NONE! Use RDBMS | SQL Server

Cloud-hosted NoSQL up to 50x CHEAPER

The reality…two pivots

Storage Methods
• SQL (RDBMS)
• NoSQL

Storage Locations
• On premises
• Cloud-hosted

NoSQL (Cloud) BLOB Storage Buckets
• Amazon – S3 or Glacier
  – The gold standard
• Google – Cloud Storage
  – Free for developers
• Microsoft Azure BLOBS
• DropBox, Box…

Cloud-hosted RDBMS
• AWS RDS – SQL Server, MySQL, Oracle
  – Medium cost
  – Solid feature set, e.g. backup, snapshot
  – Use existing tooling
• Google – MySQL
  – Lowest cost
  – Most limited RDBMS functionality
• Microsoft – SQL Azure
  – Highest cost

Demo - AWS RDS

• SQL Server, MySQL or Oracle
• Essential to understand pricing models

Cloud Offerings – RDBMS AND NoSQL

Offering | AWS | Google | Microsoft
Cloud RDBMS | RDS – all major | MySQL | SQL Azure
NoSQL buckets | S3 or Glacier | Cloud Storage | Azure Blobs
NoSQL databases | DynamoDB | H/R Data on GAE | Azure Tables
Streaming ML or (Mahout) | Custom EC2 | Prospective Search & Prediction API | StreamInsight
Document or Graph | MongoDB on EC2 | Freebase | MongoDB on Windows Azure
Hadoop | Elastic MapReduce using S3 & EC2 | none | HDInsight
Dremel/Warehousing | Redshift | BigQuery | none

Data Scientists…

Comparing…

Karmasphere Studio for AWS

Hadoop Connector to Excel

Google BigQuery
• Hadoop-like (Dremel) based service
• For massive amounts of data
• SQL-like query language

Dremel Realized => Impala

• Interactive Hadoop?

Other types of cloud data services

Hosting public datasets
• Pay to read
• Earn revenue by offering for read

Cleaning / matching (your) data
• ETL – Microsoft Data Explorer, Google Refine
• Data Quality – Windows Azure Data Market, InfoChimps, DataMarket.com

NoSQL To-Do List

Understand CAP & types of NoSQL databases
• Use NoSQL when business needs dictate
• Use the right type of NoSQL for your business problem

Try out NoSQL on the cloud
• Quick and cheap for behavioral data
• Mashup cloud datasets
• Good for specialized use cases, e.g. dev, test, training environments

Learn NoSQL access technologies
• New query languages, e.g. MapReduce, R, Infer.NET
• New query tools (vendor-specific) – Google Refine, Amazon Karmasphere, Microsoft Excel connectors, etc.

The Changing Data Landscape

[Diagram labels: NoSQL, RDBMS, Other Services]

www.TeachingKidsProgramming.org
• Free Courseware (recipes)
• Do a Recipe, Teach a Kid (Ages 10+)
• Java or Microsoft SmallBasic

Toward Data Craftsmanship…

Follow me @LynnLangit

RSS my blog www.LynnLangit.com

Hire me
• To help build your BI/Big Data solution
• To teach your team next gen BI
• To learn more about using NoSQL solutions