Big Data in the Future Workforce Prof Dr Abdullah Gani. SMIEEE, FASc

Jul 17, 2020

Page 1: Big Data in Future Workforce - British Council · Processing Technologies Platform : OpenStack, Operating System: Linux, Windows. Big Data Challenges 1. ... Securing big data 7. Organizational

Big Data in the Future Workforce

Prof Dr Abdullah Gani, SMIEEE, FASc


Table of Contents

• How data size is measured

• What is Big Data and its characteristics

• Where does big data come from?

• What processes are involved

• Applications/Use cases

• Big Data Future and Ecosystem

• Job opportunities

• Salary scale


Data Size

Unit         Size
Bit          1 bit
Byte         8 bits
Kilobyte     1000 bytes
Megabyte     1000^2 bytes
Gigabyte     1000^3 bytes
Terabyte     1000^4 bytes
Petabyte     1000^5 bytes
Exabyte      1000^6 bytes
Zettabyte    1000^7 bytes
Yottabyte    1000^8 bytes
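
The unit ladder above can be turned into a small conversion helper. This is only an illustrative sketch: the names `UNITS`, `to_bytes` and `humanize` are hypothetical and not part of the original slides, and it assumes the decimal convention (each step is a factor of 1000) used in the table.

```python
# Decimal byte units, in the order of the Data Size table.
# Each step up the list is a factor of 1000 (assumed convention).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value, unit):
    """Convert a value expressed in a decimal unit (e.g. 'PB') to bytes."""
    exponent = UNITS.index(unit)  # B -> 0, KB -> 1, MB -> 2, ...
    return value * 1000 ** exponent


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


if __name__ == "__main__":
    print(to_bytes(1, "PB"))        # one petabyte in bytes
    print(humanize(20 * 10 ** 15))  # a byte count written back in PB
```

For example, `to_bytes(1, "ZB")` gives the number of bytes in one zettabyte, which helps when comparing figures quoted in different units.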


How much data?

• Google processes 20 PB a day (2008)

• Wayback Machine has 3 PB + 100 TB/month (3/2009)

• Facebook has 2.5 PB of user data + 15 TB/day (4/2009)

• eBay has 6.5 PB of user data + 50 TB/day (5/2009)

• CERN’s Large Hadron Collider (LHC) generates 15 PB a year

"640K ought to be enough for anybody."
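
To get a feel for these figures, one can ask how long it takes for a stored volume to double at the quoted daily growth. The sketch below is a rough illustration only: it assumes the daily growth stays constant (which the slide does not claim) and uses decimal units (1 PB = 1000 TB). The function name `days_to_double` is hypothetical.

```python
TB_PER_PB = 1000  # decimal units, as in the Data Size table


def days_to_double(stored_pb, growth_tb_per_day):
    """Days until the stored volume doubles, assuming constant daily growth."""
    return stored_pb * TB_PER_PB / growth_tb_per_day


if __name__ == "__main__":
    # Figures taken from the list above (2009 values).
    print(f"Facebook: {days_to_double(2.5, 15):.0f} days")
    print(f"eBay:     {days_to_double(6.5, 50):.0f} days")
```

Under these assumptions both companies would double their 2009 user data in well under a year.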


1. Volume

• Data volume

  • 44x increase from 2009 to 2020

  • From 0.8 zettabytes (ZB) to 35 ZB

• Data volume is increasing exponentially


Exponential increase in collected/generated data
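
The 44x figure can be translated into an implied yearly growth rate. The sketch below assumes smooth compound growth between 2009 and 2020, which is a simplification; the function name `annual_growth_rate` is hypothetical.

```python
def annual_growth_rate(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    start_zb, end_zb = 0.8, 35        # zettabytes, from the slide
    years = 2020 - 2009               # 11 years
    factor = end_zb / start_zb        # close to the quoted 44x
    rate = annual_growth_rate(start_zb, end_zb, years)
    print(f"growth factor: {factor:.2f}x")
    print(f"implied annual growth: {rate:.1%}")
```

Under this assumption the data volume grows by roughly 40% every year, which is what "exponential" means in practice here.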


2. Variety

• Various formats, types, and structures

• Text, numerical, images, audio, video, sequences, time series, social media data, multi-dimensional arrays, etc.

• Static data vs. streaming data

• A single application can be generating/collecting many types of data


To extract knowledge ➔ all these types of data need to be linked together


3. Velocity

• Data is generated fast and needs to be processed fast

• Online Data Analytics

• Late decisions ➔ missing opportunities

• Examples

  • E-promotions: based on your current location, your purchase history, and what you like ➔ send promotions right now for the store next to you

  • Healthcare monitoring: sensors monitoring your activities and body ➔ any abnormal measurement requires immediate reaction
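
The healthcare monitoring example can be sketched as a check that runs on each reading as it arrives, instead of on a stored batch. The code below is a minimal illustration, not a method described in the slides: it assumes a rolling z-score rule, and the function name `detect_abnormal`, the window size, the threshold and the sample heart-rate values are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_abnormal(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent baseline.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` normal readings.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
                continue  # keep the outlier out of the baseline
        recent.append(value)


if __name__ == "__main__":
    # Made-up heart-rate stream (beats per minute) with one spike.
    heart_rate = [72, 74, 73, 75, 71, 73, 140, 74, 72]
    for index, bpm in detect_abnormal(heart_rate):
        print(f"reading {index}: {bpm} bpm needs immediate attention")
```

Because the function is a generator, it reacts to each reading as soon as it is seen, which is the point of velocity: the decision is made while the data is still fresh.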


4. Veracity

• The quality or trustworthiness of the data

• E.g. GPS


Big Data Sources

• Social media and networks (all of us are generating data)

• Scientific instruments (collecting all sorts of data)

• Mobile devices (tracking all objects all the time)

• Sensor technology and networks (measuring all kinds of data)


Processing Technologies

Platform: OpenStack

Operating System: Linux, Windows


Big Data Challenges

1. Dealing with data growth

   • Storage

   • Unstructured data

2. Generating insights in a timely manner

3. Recruiting and retaining big data talent

   • Demand for big data experts, and big data salaries, have increased dramatically

4. Integrating disparate data sources

5. Validating data

6. Securing big data

7. Organizational resistance


Use Cases 1


◼ The New York Times

⚫ Large-scale image conversions

⚫ 100 Amazon EC2 instances, 4 TB of raw TIFF data

⚫ 11 million PDFs in 24 hours for $240

◼ Facebook

⚫ Internal log processing

⚫ Reporting, analytics and machine learning

⚫ Cluster of 1110 machines, 8800 cores and 12 PB raw storage

⚫ Open source contributor (Hive)

◼ Twitter

⚫ Store and process tweets, logs, etc.

⚫ Open source contributors (hadoop-lzo)

⚫ Large scale machine learning


Use Cases 2


◼ Yahoo!

⚫ 100,000 CPUs in 25,000 computers

⚫ Content/ads optimization, search index

⚫ Machine learning (e.g. spam filtering)

⚫ Open source contributor (Pig)

◼ Microsoft

⚫ Natural language search (through Powerset)

⚫ 400 nodes in EC2, storage in S3

⚫ Open source contributor to HBase

◼ Amazon

⚫ Elastic MapReduce service

⚫ On-demand elastic Hadoop clusters for the cloud


The Model Has Changed…

• The Model of Generating/Consuming Data has Changed

Old Model: Few companies are generating data, all others are consuming data

New Model: all of us are generating data, and all of us are consuming data


Analysis

Data on its own is useless unless you can make sense of it!

WHAT IS ANALYTICS?

The scientific process of transforming data into insight for making better decisions, offering new opportunities for a competitive advantage


Data Visualization

• The presentation of data in a pictorial or graphical format.

• For centuries, people have depended on visual representations such as charts and maps to understand information more easily and quickly.


Skill Set of Big Data

Data Management

• Data collection, storage, cleaning, filtering, integration …

Large-scale Parallel Data Processing

• Parallel computing

Statistics and Machine Learning

• Data modeling, inference, prediction, pattern recognition …

Interface and Data Visualization

• HCI design, visualization, story-telling …


Big Data – Future


Government’s Initiative


BDA Outcomes


Prediction of Workforce


Salary

Position                          Salary (US)
Data Analyst                      50-75
Data Scientist                    85-170
Data Science/Analytics Manager    90-240
Big Data Engineer                 70-165


Conclusion

• Big Data is real and not hype

• It comes with opportunities of value creation

• Get ready with knowledge and skills of BDA

• Good luck


Thank you…

Director,

Centre for Mobile Cloud Computing,

University of Malaya,

Kuala Lumpur

Email: [email protected]

Director,

Centre for Data Science and Analytics,

Taylor's University,

Lakeside Campus

Subang Jaya

Selangor.

Email: [email protected]