Mohammad Reza Gerami [email protected] [email protected]
Page 1: Big-Data-AryaTadbirNetworkDesigners

Mohammad Reza Gerami [email protected]

[email protected]

Page 4: Big-Data-AryaTadbirNetworkDesigners

• ‘Big Data’ is similar to ‘small data’, but bigger

• …but bigger data requires different approaches:

  • Techniques, tools and architecture

• …with an aim to solve new problems

  • …or old problems in a better way

Page 6: Big-Data-AryaTadbirNetworkDesigners

Characteristics of Big Data: 1-Scale (Volume)

• Data Volume

Exponential increase in collected/generated data

Page 7: Big-Data-AryaTadbirNetworkDesigners

Big Data in Today’s Business and Technology Environment

2.7 zettabytes of data exist in the digital universe today. (Source)

235 terabytes of data had been collected by the U.S. Library of Congress by April 2011. (Source)

The Obama administration is investing $200 million in big data research projects. (Source)

IDC estimates that by 2020, business transactions on the internet (business-to-business and business-to-consumer) will reach 450 billion per day. (Source)

Facebook stores, accesses, and analyzes 30+ petabytes of user-generated data. (Source)

Akamai analyzes 75 million events per day to better target advertisements. (Source)

94% of Hadoop users perform analytics on large volumes of data, which was not possible before; 88% analyze data in greater detail; and 82% can now retain more of their data. (Source)

Page 8: Big-Data-AryaTadbirNetworkDesigners

Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data. (Source)

More than 5 billion people are calling, texting, tweeting and browsing on mobile phones worldwide. (Source)

Decoding the human genome originally took 10 years to process; now it can be achieved in one week. (Source)

In 2008, Google was processing 20,000 terabytes of data (20 petabytes) a day. (Source)

The largest AT&T database boasts titles including the largest volume of data in one unique database (312 terabytes) and the second largest number of rows in a unique
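
The figures on this slide mix units (terabytes, petabytes, per-hour counts). The sketch below shows one way to put them on a common scale. It assumes decimal (SI) units, where 1 TB = 10**12 bytes; the helper names `convert` and `per_second` are made up for this illustration and do not come from the slides.

```python
# Decimal (SI) byte units. Assumption: the slide figures use powers of 10.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def convert(value, from_unit, to_unit):
    """Convert a data size from one decimal byte unit to another."""
    return value * UNITS[from_unit] / UNITS[to_unit]


def per_second(count_per_hour):
    """Turn an hourly event count into an average per-second rate."""
    return count_per_hour / 3600


# Google, 2008: 20,000 TB a day expressed in petabytes.
google_pb_per_day = convert(20_000, "TB", "PB")

# Walmart: 1 million transactions an hour as an average rate.
walmart_tps = per_second(1_000_000)

print(f"Google: {google_pb_per_day:g} PB/day")
print(f"Walmart: about {walmart_tps:.0f} transactions/second")
```

Run as written, this confirms the slide's own conversion (20,000 TB is 20 PB) and shows that 1 million transactions an hour averages out to roughly 278 per second.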

Page 9: Big-Data-AryaTadbirNetworkDesigners

The Rapid Growth of Unstructured Data

YouTube users upload 48 hours of new video every minute of the day. (Source)

571 new websites are created every minute of the day. (Source)

Brands and organizations on Facebook receive 34,722 Likes every minute of the day. (Source)

100 terabytes of data are uploaded daily to Facebook. (Source)

According to Twitter’s own research in early 2012, it sees roughly 175 million tweets every day, and has more than 465 million accounts. (Source)

30 billion pieces of content are shared on Facebook every month. (Source)

Data production will be 44 times greater in 2020 than it was in 2009. (Source)
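
Per-minute figures and a multi-year growth factor are hard to compare directly. As a rough sketch, the per-minute rates above can be scaled to a day, and the "44 times" figure can be turned into an implied yearly growth factor. The assumption here is steady compound growth between 2009 and 2020, which the slide does not state; the function names are illustrative only.

```python
MINUTES_PER_DAY = 24 * 60


def per_day(per_minute):
    """Scale a per-minute rate to a full day."""
    return per_minute * MINUTES_PER_DAY


def annual_growth_factor(total_factor, years):
    """Yearly factor that compounds to total_factor over the given years."""
    return total_factor ** (1 / years)


video_hours_per_day = per_day(48)   # YouTube uploads: 69,120 hours of video
likes_per_day = per_day(34_722)     # Facebook brand Likes: 49,999,680

# 44x between 2009 and 2020, assuming steady compound growth.
yearly = annual_growth_factor(44, 2020 - 2009)

print(f"{video_hours_per_day:,} hours of video per day")
print(f"{likes_per_day:,} Likes per day")
print(f"Implied growth: about {(yearly - 1) * 100:.0f}% per year")
```

Under that assumption, a 44-fold increase over 11 years works out to roughly 41% growth per year.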

Page 10: Big-Data-AryaTadbirNetworkDesigners

The Rapid Growth of Unstructured Data

In late 2011, IDC Digital Universe published a report indicating that some 1.8 zettabytes of data would be created that year. (Source)

In other words, the amount of data in the world today is equal to:

• Every person in the US tweeting three tweets per minute for 26,976 years.

• Every person in the world having more than 215m high-resolution MRI scans a day.

• More than 200bn HD movies, which would take a person 47m years to watch.

Page 12: Big-Data-AryaTadbirNetworkDesigners

Social media and networks (all of us are generating data)

Scientific instruments (collecting all sorts of data)

Mobile devices (tracking all objects all the time)

Sensor technology and networks (measuring all kinds of data)

Page 13: Big-Data-AryaTadbirNetworkDesigners

• No single standard definition…

Big Data

Page 16: Big-Data-AryaTadbirNetworkDesigners

What to do with these data?

Page 17: Big-Data-AryaTadbirNetworkDesigners

How much data?

640K ought to be enough for anybody.

Page 18: Big-Data-AryaTadbirNetworkDesigners

Why Big Data

• Key enablers of the emergence and growth of Big Data are:

– Increase of storage capacities

– Increase of processing power

– Availability of data

– Every day we create 2.5 quintillion bytes of data; 90% of the data in the world today has been created in the last two years alone
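
The last point packs two numbers together. A minimal sketch of what they imply, under two assumptions that are not in the slide: "quintillion" is the short-scale 10**18, and older data is kept rather than deleted, so the share of new data translates directly into growth of the total.

```python
BYTES_PER_EXABYTE = 10**18  # decimal (SI) exabyte

# "2.5 quintillion bytes" per day, short scale (assumption).
daily_bytes = 2.5 * 10**18
daily_exabytes = daily_bytes / BYTES_PER_EXABYTE


def implied_growth(new_share, years):
    """Growth implied when new_share of all data is younger than `years`.

    If 90% of the data is new, the older 10% has grown into the whole,
    so the total grew by a factor of 1 / (1 - 0.9) over the period.
    """
    total_factor = 1 / (1 - new_share)
    yearly_factor = total_factor ** (1 / years)
    return total_factor, yearly_factor


total, yearly = implied_growth(0.90, 2)

print(f"{daily_exabytes:g} exabytes created per day")
print(f"Total grew about {total:.0f}x in two years, about {yearly:.2f}x per year")
```

Read this way, "90% in the last two years" means the world's data grew about tenfold in two years, or a little more than threefold each year.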

Page 19: Big-Data-AryaTadbirNetworkDesigners

Big Data Analytics

• Examining large amounts of data

• Extracting appropriate information

• Identification of hidden patterns, unknown correlations

• Competitive advantage

• Better business decisions: strategic and operational

• Effective marketing, customer satisfaction, increased revenue
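
To make "unknown correlations" concrete, here is a small sketch that scans every pair of columns in a data set and reports the strongly correlated ones, using the Pearson correlation coefficient. The column names and values are invented toy data, and a real analytics platform would run this kind of scan over far larger data with dedicated tooling.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for column pairs with |r| >= threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical toy data, for illustration only.
columns = {
    "ad_spend": [10, 20, 30, 40, 50],
    "site_visits": [110, 205, 290, 410, 500],
    "returns": [7, 3, 9, 4, 6],
}

pairs = strongest_pairs(columns)
for a, b, r in pairs:
    print(f"{a} vs {b}: r = {r:.3f}")
```

On this toy data only the `ad_spend` and `site_visits` pair passes the threshold. A strong correlation is a lead worth investigating, not proof that one variable drives the other.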

Page 20: Big-Data-AryaTadbirNetworkDesigners

Applications for Big Data Analytics

Homeland Security

Finance

Smarter Healthcare

Multi-channel sales

Telecom

Manufacturing

Traffic Control

Trading Analytics

Fraud and Risk

Log Analysis

Search Quality

Retail: Churn, NBO

Page 21: Big-Data-AryaTadbirNetworkDesigners

Healthcare

• 80% of medical data is unstructured and clinically relevant

• Data resides in multiple places, such as individual EMRs, lab and imaging systems, physician notes, medical correspondence, claims, etc.

• Leveraging Big Data

  • Build sustainable healthcare systems

  • Collaborate to improve care and outcomes

  • Increase access to healthcare

Page 22: Big-Data-AryaTadbirNetworkDesigners

Market Size

Source: Wikibon, Taming Big Data

By 2015, 4.4 million IT jobs in Big Data; 1.9 million of them in the US alone

Page 23: Big-Data-AryaTadbirNetworkDesigners

Potential Talent Pool - Big Data

India will require a minimum of 1 lakh (100,000) data scientists in the next couple of years, in addition to data analysts and data managers, to support the Big Data space.

Page 25: Big-Data-AryaTadbirNetworkDesigners

Future of Big Data

Page 26: Big-Data-AryaTadbirNetworkDesigners

Big Data Analytics Technologies

NoSQL: non-relational, or at least non-SQL, database solutions such as HBase (also a part of the Hadoop ecosystem), Cassandra, MongoDB, Riak, CouchDB, and many others.

Hadoop: an ecosystem of software packages, including MapReduce, HDFS, and a whole host of others.
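
MapReduce is named above without explanation, so here is a minimal single-process sketch of the programming model, using the classic word-count task. This is not Hadoop's API: the function names are made up for the illustration, and on a real cluster the map and reduce steps run in parallel across many machines, with the framework handling the shuffle.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

Because each map call looks at one document and each reduce call looks at one key, both steps can be spread over many workers, which is what makes the model suit a scale-out architecture.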

Page 27: Big-Data-AryaTadbirNetworkDesigners

Main Big Data Technologies

Hadoop NoSQL Databases Analytic Databases

Hadoop

• Low cost, reliable scale-out architecture

• Distributed computing

• Proven success in Fortune 500 companies

• Exploding interest

NoSQL Databases

• Huge horizontal scaling and high availability

• Highly optimized for retrieval and appending

• Types

  • Document stores

  • Key Value stores

  • Graph databases
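
The three NoSQL types listed above differ mainly in how data is shaped and queried. The sketch below models each one with plain Python structures. It is an analogy under simplifying assumptions, not the API of any real database, and all names and records are invented.

```python
import json

# Key-value store: the store only understands the key; the value is an
# opaque blob that the application must decode itself.
key_value = {"user:1": json.dumps({"name": "Ada"})}
ada = json.loads(key_value["user:1"])

# Document store: each record is a nested, self-describing document,
# so the store can query on fields inside it.
documents = [
    {"_id": 1, "name": "Ada", "tags": ["admin", "dev"]},
    {"_id": 2, "name": "Lin", "tags": ["dev"]},
]


def find_by_tag(docs, tag):
    """Return the ids of documents whose 'tags' field contains tag."""
    return [doc["_id"] for doc in docs if tag in doc.get("tags", [])]


# Graph database: nodes plus edges; queries walk the relationships.
follows = {"Ada": {"Lin"}, "Lin": {"Sam"}, "Sam": set()}


def reachable(graph, start):
    """Return every node reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


print(ada["name"], find_by_tag(documents, "dev"), sorted(reachable(follows, "Ada")))
```

The trade-off runs from simplest to richest: key lookups are the easiest to scale horizontally, document queries need the store to understand the record, and graph traversals depend on relationships between records.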

Analytic RDBMS

• Optimized for bulk-load and fast aggregate query workloads

• Types

  • Column-oriented

  • MPP

  • In-memory
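
One reason column-oriented databases suit aggregate queries is that an aggregate usually touches only a few columns. The sketch below contrasts a row layout with a column layout for the same toy table. The table, its column names and the helper functions are invented for the illustration, and real engines add compression and vectorized execution on top.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


def total_by(columns, key, measure):
    """Sum `measure` grouped by `key`, reading only those two columns."""
    totals = {}
    for group, value in zip(columns[key], columns[measure]):
        totals[group] = totals.get(group, 0.0) + value
    return totals


columns = to_columns(rows)
totals = total_by(columns, "region", "amount")
print(totals)
```

Here the aggregate never reads `order_id`. In a wide table with hundreds of columns, skipping the unused ones is where much of the speed-up comes from.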

Page 28: Big-Data-AryaTadbirNetworkDesigners

Thank you
