Transcript

©2013 DataStax Confidential. Do not distribute without consent.

Extreme Data Velocity

Continuous Availability

Operational Simplicity

Michael Shaler

Senior Director, Business Development

What is Big Data’s payoff?

DataStax: CRN’s “10 Coolest Big Data Startups”

Cassandra: InfoWorld’s Technology of the Year

1,000+ production deployments and 300 customers

$84M in funding from industry-leading investors

BHAG

We are the first viable alternative to Oracle for modern online applications.

We seek to be the first and best choice in databases.

No, Seriously…

Real-world Use Cases

Internet of Things Database Requirements

• “UTC subject predicate”: Time series data and metadata are the lingua franca of sensors/device data communications

• FAST AND ALWAYS ON: High-velocity ingest rates from geographically dispersed inputs with variable schemas/data models is the norm—and unless you tell them to do so, sensors never, ever sleep…

• HOT AND COLD: Real-time data and analytics vs. data reservoir/data factory needs vary.

• DHTs: Wide-row column-oriented distributed hash tables are the optimal home for IoT operational datastores

• AND: Other key functionality needed includes indexed search, along with both batch and real-time analytics—with data-in-flight and data-at-rest security an emerging need

• SPOILER ALERT: DataStax Enterprise supports all of the above
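
The wide-row, time-series layout described in the bullets above can be illustrated with a minimal in-memory sketch. It assumes one partition per (sensor ID, UTC day) with readings kept in timestamp order inside each partition. The class, method, and field names are hypothetical, and this is not the Cassandra or DataStax Enterprise API.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timezone


class WideRowStore:
    """Toy wide-row store: one partition per (sensor, UTC day), columns sorted by time."""

    def __init__(self):
        # partition key -> sorted list of (timestamp, predicate, value) columns
        self._partitions = defaultdict(list)

    @staticmethod
    def _partition_key(sensor_id, ts):
        # Assumption: timestamps are already in UTC, so the day bucket is a UTC day.
        return (sensor_id, ts.strftime("%Y-%m-%d"))

    def write(self, ts, sensor_id, predicate, value):
        """Store one "UTC subject predicate" reading, keeping the row time-ordered."""
        row = self._partitions[self._partition_key(sensor_id, ts)]
        bisect.insort(row, (ts, predicate, value))

    def read_range(self, sensor_id, day, start, end):
        """Return readings for one sensor and day with start <= timestamp < end."""
        row = self._partitions.get((sensor_id, day), [])
        return [col for col in row if start <= col[0] < end]


utc = timezone.utc
store = WideRowStore()
# Readings may arrive out of order; the row stays sorted by timestamp.
store.write(datetime(2013, 6, 1, 0, 15, tzinfo=utc), "meter-42", "kwh", 0.31)
store.write(datetime(2013, 6, 1, 0, 0, tzinfo=utc), "meter-42", "kwh", 0.27)
first_hour = store.read_range(
    "meter-42",
    "2013-06-01",
    datetime(2013, 6, 1, 0, 0, tzinfo=utc),
    datetime(2013, 6, 1, 1, 0, tzinfo=utc),
)
```

Bucketing by day keeps any single row from growing without bound, and a time-range query touches only the partitions for the days it covers.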

Time Series Analytics: 70B readings

Smart Grid Proof of Concept: Analyze 2 years of Smart Meter data for 1M households

Improvements in demand forecasting could yield EBITDA > $100M per GW saved
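
The figure of roughly 70B readings is consistent with a simple back-of-the-envelope calculation. The sketch below assumes one reading per meter every 15 minutes; that interval is an assumption and is not stated on the slide.

```python
HOUSEHOLDS = 1_000_000        # from the slide
YEARS = 2                     # from the slide
DAYS_PER_YEAR = 365           # ignoring leap days
READINGS_PER_DAY = 24 * 4     # assumed: one reading every 15 minutes

total_readings = HOUSEHOLDS * YEARS * DAYS_PER_YEAR * READINGS_PER_DAY
print(f"{total_readings / 1e9:.2f}B readings")  # prints "70.08B readings"
```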

• $5M CAPEX

• 10 man/months delivery (Deploy, DevOps, Tuning)

• Ongoing OPEX of > $1M

• $450K OPEX

• 2 DevOps running 15 AWS nodes

• Faster performance in 2 weeks

• …All in the cloud

Major Changes: The Evolving Data Center

[Diagram labels]

LOB App: Oracle

LOB App: MySQL

LOB App: SQL Server

“What’s Happening?” Hyper Velocity, Transactional: NoSQL

Data Warehouse: Teradata/Exadata

“What Happened?” Massive Volume

Bit Bucket: Hadoop

The Application World *HAS* Changed

Common Use Cases

• Big data OLTP and write intensive systems

• Time series data management

• High velocity device data consumption and analysis

• Healthcare systems input and analysis

• Media streaming (music, movies, etc.)

• Online Web retail (shopping carts, user transactions, etc.)

• Online gaming (real-time messaging, etc.)

• Real time data analytics

• Social media input and analysis

• Web click-stream analysis

• Buyer event and behavior analytics

• Fraud detection and analysis

• Risk analysis and management

• Supply chain analytics

• Web product searches

• Internal document search (law firms, etc.)

• Real estate/property searches

• Social media match ups

• Web & application log management / analysis

Continuous Availability Commentary

[Diagram: nodes A1 to A3, B1 to B3, C1 to C3, and D1 to D3 across four sites: London, Virginia, Santa Clara, Sydney]

Cassandra: Architecture as Foundation
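
As a rough illustration of how a set of nodes spread over several data centers can place data, the sketch below hashes each key onto a token ring and walks clockwise, keeping the first node it meets in each data center. The node-to-site assignment, the SHA-256-based token function, and the one-replica-per-site policy are all assumptions made for illustration; this is not Cassandra's actual partitioner or replication strategy.

```python
import bisect
import hashlib

# Hypothetical assignment of the twelve nodes to the four sites.
NODES = {
    "London": ["A1", "A2", "A3"],
    "Virginia": ["B1", "B2", "B3"],
    "Santa Clara": ["C1", "C2", "C3"],
    "Sydney": ["D1", "D2", "D3"],
}


def token(value: str) -> int:
    """Map a string to a ring position (SHA-256 digest read as an integer)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


def build_ring(nodes):
    """Return a token-sorted list of (token, data_center, node)."""
    return sorted((token(n), dc, n) for dc, members in nodes.items() for n in members)


def replicas_for(key, ring):
    """Walk clockwise from the key's token, keeping the first node seen per data center."""
    tokens = [t for t, _, _ in ring]
    start = bisect.bisect_right(tokens, token(key)) % len(ring)
    chosen = {}
    for i in range(len(ring)):
        _, dc, node = ring[(start + i) % len(ring)]
        chosen.setdefault(dc, node)
    return chosen


ring = build_ring(NODES)
placement = replicas_for("meter-42", ring)

# Losing a whole site still leaves a replica in each of the other sites.
survivors = {dc: node for dc, node in placement.items() if dc != "Virginia"}
```

Because every site holds a copy under this policy, a client can in principle read or write against any surviving site when one goes offline.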

The New DR: Simian Army “Dystopia as a Service”

Heterogeneous Workloads: Active Everywhere

[Diagram labels: Write, Analyze, Read, Search]

Our Product Solution

• DataStax Enterprise powers the big data apps that transform business.

• Extreme Data Velocity

• Continuous Availability

• Operational Simplicity

©2012 DataStax

33M streaming customers

2T API calls/year

~1,200 servers

55 AWS clusters

12 developers

4 operators

0 new data centers

Operational Simplicity

“Our primary operational data store is now Cassandra, not Oracle.”

Performance: NoSQL Leadership

Source: Tilmann Rabl et al. (University of Toronto), “Solving Big Data Challenges for Enterprise Application Performance Management,” VLDB 2012 (August 2012, Istanbul)

Cassandra vs. HBase:

• 10x more read throughput

• 100x faster read latency

• 8x more write throughput

• 8x faster scan latency

• 4x more scan throughput

Performance: NoSQL Leadership

YCSB Load Process

YCSB Read-write mix

YCSB Read-mostly

YCSB Write-mostly

From STB to the Scalable Cloud Message Bus

Enabling a richer active consumer experience across multiple devices, multiple platforms

Even in pre-production environment prior to tuning, achieved near-linear scalability

Instagram Scales Engaged Networks

• Transitioned from Redis (in-memory cache) to Cassandra in Amazon Web Services EC2

• Doubled cluster—and then doubled again—to support 150MM users on new infrastructure

• Continue to scale in spite of Justin Bieber storms, video formats, new features, new markets

Our Vision

DataStax is driving Cassandra to be the first viable alternative to the Oracle database for companies who are transforming the way they interact with customers.

Getting ahead of exploding growth

• Sign big, new contracts all the time (ESPN)

• 200M unique users per month

• 40TB of data

Flexible architecture

• “Couldn’t shoehorn RDBMS technology”

Very small operations team

• 3 people

• 20 clusters

• 100s of nodes

Why We Exist

Today’s applications must be always available and lightning fast as they scale to previously unimaginable levels.

Cassandra delivers both with a beautifully simple and elegant architecture.

“We need a real-time, massively scalable architecture, where no one node is a single point of failure, that can easily span multiple data centers and cloud availability zones, and that’s Cassandra.”

What We Do Best

Cassandra was designed to do things that are impossible in other databases when it comes to availability and performance. Forget about losing a machine here or there: Cassandra delivers a world where you can lose an entire datacenter and still perform as your customers expect.

“We have to be ready for disaster recovery all the time. It’s really great that Cassandra allows for active-active multiple data centers where we can read and write anywhere.”

Jay Patel, Technical Architect at eBay (describing why they switched from legacy relational architecture)

The Modern “Application”

Fraud Detection and Prevention

What It Means In Real Life

Cassandra Summit SF 2013

Real Growth In Production

We are the first viable alternative to Oracle for modern online applications.

Thank You

We power the big data apps that transform business.

DataStax OpsCenter 4.0

Security in Cassandra

FEATURES AND BENEFITS

Internal Authentication: manages login IDs and passwords inside the database

+ Ensures only authorized users can access a database system using internal validation

+ Simple to implement and easy to understand

+ No learning curve from the relational world

Object Permission Management: controls who has access to what and who can do what in the database

+ Provides granular control over who can add/change/delete/read data

+ Uses familiar GRANT/REVOKE from relational systems

+ No learning curve
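
The GRANT/REVOKE model can be pictured as a mapping from (user, resource) pairs to sets of permissions. The sketch below is a toy in-memory model; the class name, the permission names, and the resource and user names are hypothetical, and it does not reproduce Cassandra's authorizer or its CQL syntax.

```python
from collections import defaultdict


class PermissionTable:
    """Toy object-permission store keyed by (user, resource)."""

    # Illustrative permission names only.
    VALID = {"SELECT", "MODIFY", "CREATE", "DROP"}

    def __init__(self):
        self._grants = defaultdict(set)

    def grant(self, permission, resource, user):
        """Give a user one permission on one resource."""
        if permission not in self.VALID:
            raise ValueError(f"unknown permission: {permission}")
        self._grants[(user, resource)].add(permission)

    def revoke(self, permission, resource, user):
        """Remove a permission; revoking something never granted is a no-op."""
        self._grants[(user, resource)].discard(permission)

    def is_allowed(self, permission, resource, user):
        """Check whether the user currently holds the permission on the resource."""
        return permission in self._grants.get((user, resource), set())


acl = PermissionTable()
acl.grant("SELECT", "readings", "analyst")
acl.grant("MODIFY", "readings", "ingest")
can_read = acl.is_allowed("SELECT", "readings", "analyst")
can_write = acl.is_allowed("MODIFY", "readings", "analyst")
acl.revoke("SELECT", "readings", "analyst")
can_read_after = acl.is_allowed("SELECT", "readings", "analyst")
```

Access is denied by default here: a user can do only what has been explicitly granted and not since revoked.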

Client to Node Encryption: protects data in flight to and from a database cluster

+ Ensures data cannot be captured/stolen en route to a server

+ Data is safe both in flight from/to a database and on the database; complete coverage is ensured

Advanced Security in DataStax Enterprise

FEATURES AND BENEFITS

External Authentication: uses external security software packages to control security

+ Only authorized users have access to a database system using external validation

+ Uses most trusted external security packages (Kerberos, LDAP), mainstays in government and finance

+ Single sign on to all data domains

Transparent Data Encryption: encrypts data at rest

+ Protects sensitive data at rest from theft and from being read at the file system level

+ No changes needed at application level

+ Can encrypt both Cassandra and Hadoop data

Data Auditing: provides a trail of who did and looked at what, and when

+ Supplies admins with an audit trail of all accesses and changes

+ Granular control to audit only what’s needed

+ Uses log4j interface to ensure performance and efficient audit operations
