
An Introduction to Data Gravity by John Tkaczewski of FileCatalyst

Jul 18, 2015


Introduction to Data Gravity

By: John Tkaczewski

President of FileCatalyst

March 4, 2015


Data Gravity

• A term first coined by Dave McCrory circa 2010

• Data is difficult to move around

• As data grows, it attracts more and more apps, services and other tools


Why is the data “stuck”?


Throughput and latency

• As throughput and latency to the data increase, the gravitational pull of the data mass also increases

• Which forces the apps and services to move closer to the data


If the model stopped here… all apps and services would end up in a single giant online BLOB (the cloud) to be closer to the data


There are other forces that keep some data away…


Forces that push away

• Privacy

• Security

• Cost

• Features, Convenience


There is a balance between the gravity and the “Forces that push away”


Real Life Scenario

USB Thumb Drive vs. Amazon S3

• Unlimited flexible growing storage

• Easy Sharing with the rest of the world

• Security

• Convenience

• Fast Access to Data

• Practically Free

• Can be physically moved


Data Gravity on the Cloud

• Make inbound data as light as possible

• Make outbound data as heavy as possible

• Cost in vs. cost out

• Make the context of the data proprietary (see the example of a picture on Flickr at http://datagravity.org/)


Data Gravity as a computational theory

• Borrows from gravitational theory

• Similarities with the way nations negotiate trade tariffs and trade agreements between countries and cities (ref)

• Shannon’s law: how much information can be squeezed down a wire

• Von Neumann bottleneck: how fast the data can move from persistent storage to memory to CPU cache to CPU
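
To make the Shannon limit in the list above concrete, the Shannon-Hartley theorem bounds the capacity of a noisy channel at C = B · log2(1 + S/N), where B is the bandwidth in hertz and S/N is the linear signal-to-noise ratio. The Python sketch below simply evaluates that formula; the function names are illustrative and do not come from the talk.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert a signal-to-noise ratio in decibels to a linear power ratio."""
    return 10 ** (snr_db / 10)


def shannon_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley upper bound on the error-free bit rate of a channel.

    bandwidth_hz -- channel bandwidth in hertz (must be positive)
    snr_linear   -- signal-to-noise ratio as a linear power ratio (not dB)
    """
    if bandwidth_hz <= 0 or snr_linear < 0:
        raise ValueError("bandwidth must be positive and SNR non-negative")
    return bandwidth_hz * math.log2(1.0 + snr_linear)
```

Doubling the bandwidth doubles the bound, while improving the signal-to-noise ratio only helps logarithmically, which is one reason a wire cannot be made arbitrarily "wide" just by cleaning up the signal.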


How does accelerated file transfer fit in all of this?


Traditional File Transfers

FTP, SFTP, HTTP, WebDAV, SMTP, CIFS, etc.

• All use TCP

• Provides reliability, error checking, ordered packets in a stream

• Congestion control built in

• Internet could not survive without it

• Works well for most internet traffic: email, web browsing, small ad-hoc transfers


Problems with TCP

• Flow control limits the transmission window, which causes dead air on high-latency links

• Backs off very aggressively in response to network congestion, and this behavior cannot be tuned in the application layer

• The result is less than ideal performance on wireless, satellite, or long-haul links

• Can be tuned, but is still not ideal for many-to-one or one-to-many transfers
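
The "dead air" effect can be put into numbers. A TCP sender may have at most one window of unacknowledged data in flight, so a single connection cannot exceed roughly window size divided by round-trip time, whatever the link speed. The Python sketch below works through that arithmetic; the function names and example figures are illustrative assumptions, not values from the talk.

```python
def bandwidth_delay_product_bytes(link_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of this speed and RTT full."""
    return link_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes: float, rtt_s: float) -> float:
    """Ceiling on single-connection throughput: one full window per round trip."""
    if rtt_s <= 0:
        raise ValueError("round-trip time must be positive")
    return window_bytes * 8 / rtt_s


def link_utilization(window_bytes: float, link_bps: float, rtt_s: float) -> float:
    """Fraction of the link (0.0 to 1.0) that a single window-limited flow can use."""
    return min(1.0, window_limited_throughput_bps(window_bytes, rtt_s) / link_bps)
```

For a 64 KiB window and a 100 ms round trip, the ceiling works out to roughly 5.2 Mbit/s no matter how fast the underlying link is. Larger windows (TCP window scaling) raise that ceiling, which is the tuning the last bullet refers to.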


File Transfer Acceleration

• Ideal for bulk file transfer

• Predictable - Can send at a perfect rate

• Not affected by latency or packet loss

• Congestion Control implemented in application layer

• Tunable congestion control aggression

• Instantly detect link capacity
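
As a rough sketch of what "sending at a set rate" with application-layer, tunable congestion control could look like, the Python below paces packets at a fixed interval and scales the rate back in proportion to observed loss. This is not FileCatalyst's protocol: every function name, the `backoff` and `probe` parameters, and the adjustment rule itself are hypothetical and chosen only for illustration.

```python
from typing import List


def packet_interval_s(target_bps: float, packet_bytes: int) -> float:
    """Gap between packet sends that yields the target bit rate."""
    if target_bps <= 0:
        raise ValueError("target rate must be positive")
    return packet_bytes * 8 / target_bps


def send_schedule(target_bps: float, packet_bytes: int, count: int) -> List[float]:
    """Send-time offsets (seconds from start) for `count` evenly paced packets."""
    interval = packet_interval_s(target_bps, packet_bytes)
    return [i * interval for i in range(count)]


def next_rate_bps(current_bps: float, loss_fraction: float,
                  backoff: float = 0.5, probe: float = 0.05,
                  max_bps: float = 1e9) -> float:
    """Hypothetical tunable rate update (an assumption, not a real protocol).

    On loss, reduce the rate by backoff * loss_fraction; a smaller `backoff`
    makes the sender less willing to yield. With no loss, probe upward by
    `probe`, capped at max_bps.
    """
    if loss_fraction > 0:
        return current_bps * (1 - backoff * min(loss_fraction, 1.0))
    return min(max_bps, current_bps * (1 + probe))
```

Because the send times depend only on the target rate and not on acknowledgements arriving, round-trip latency does not throttle the sender the way a TCP window does; how hard to back off under loss becomes an application-level setting.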


Overall the effects of Data Gravity are reduced (like Anti-Gravity)


• Data gravity still exists but is reduced by eliminating the latency component

• The gravity continues to exist towards every storage location

• With faster-moving data, the owner now has more choices about where to store it.


Cloud growth vs. geographical location of the users


• It’s not always possible to make cloud services available near all the users

• File Transfer Acceleration can help to reach those faraway users at a lower cost than building a new data center


Future…

• Cloud services will continue to expand (money maker)

• Local and personal storage will continue to be needed, but merely as a cache for what’s in the cloud

• Throughput will continue to increase but the latency will stay the same (speed of light++ anyone??)

• The need for faster file transfers will continue to grow as the cloud, data and links get bigger.
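
The point about latency staying the same follows from physics: a signal cannot cover a distance faster than light, so the round-trip time between two sites has a hard floor set by how far apart they are. The Python sketch below computes that floor; the function name is illustrative, and the fiber refractive index of about 1.47 is an approximate, assumed figure.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s


def min_rtt_ms(distance_km: float, refractive_index: float = 1.0) -> float:
    """Lower bound on round-trip time over a straight path of this length.

    refractive_index -- 1.0 for vacuum; roughly 1.47 is an assumed,
                        approximate value for optical fiber, where light
                        travels correspondingly slower.
    """
    if distance_km < 0 or refractive_index < 1.0:
        raise ValueError("distance must be >= 0 and refractive index >= 1")
    one_way_s = distance_km * refractive_index / SPEED_OF_LIGHT_KM_S
    return 2 * one_way_s * 1000
```

For a 6,000 km path the vacuum floor is about 40 ms round trip, and real routes through fiber and network equipment are longer still. No increase in link throughput changes that number.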


Thank you.