Understanding Big Data By: Paul Kenosky
By: Paul Kenosky. Big Data Define Big Data Challenges Increase in Technology Characteristics of Big Data Fraud Detection Social Media Hadoop.

Dec 26, 2015

Understanding Big Data

By: Paul Kenosky

What will be covered!

Big Data Define

Big Data Challenges

Increase in Technology

Characteristics of Big Data

Fraud Detection

Social Media

Hadoop

BigInsights

Define

Understanding Big Data: Big data applies to information that can't be processed or analyzed using traditional processes or tools.

Wikipedia: Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

Challenges

Businesses face big data challenges more and more in today's world

They are overloaded with information that can be beneficial to the organization

However they do not know how to make use of the raw and unstructured data

Big Data Technology

Interconnectivity: More and more systems, people, and technology are becoming interconnected

Inexpensive: Integrated circuits are continually becoming cheaper to produce and buy

This allows intelligence to be added to many devices that once seemed too costly

Big Data on the rise!

Example: railway cars have hundreds of sensors.

Sensors can track things such as conditions experienced by the rail car, the state of individual parts, and GPS-based data for shipment

With the rise of technology these rail cars are becoming more advanced, and sensors are added to collect data on parts that are prone to wear, so they can be replaced before they fail

Data is also stored about the rails, railroad crossing sensors, weather patterns that cause rail movements, cargo location, cargo arrival, and cargo departure times

Processing all this data using a traditional relational system would be impractical if not impossible
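
As a rough illustration of the "replace before they fail" idea, here is a minimal Python sketch. The `Reading` record, the 0.0 to 1.0 wear scale, and the `WEAR_LIMIT` threshold are all assumptions made up for this example; the slides do not describe how real rail systems score wear.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    car_id: str
    part: str
    wear: float  # assumed scale: 0.0 = new, 1.0 = failed


WEAR_LIMIT = 0.8  # hypothetical threshold, not taken from the slides


def parts_to_replace(readings, limit=WEAR_LIMIT):
    """Return sorted (car_id, part) pairs whose latest wear reading meets the limit."""
    latest = {}
    for r in readings:
        # Later readings for the same part overwrite earlier ones.
        latest[(r.car_id, r.part)] = r.wear
    return sorted(key for key, wear in latest.items() if wear >= limit)


readings = [
    Reading("car-1", "brake pad", 0.55),
    Reading("car-1", "wheel bearing", 0.91),
    Reading("car-2", "brake pad", 0.79),
    Reading("car-1", "brake pad", 0.83),  # newer reading for the same part
]
print(parts_to_replace(readings))
```

A single Python process handles four readings easily; the point of the slide is that hundreds of sensors on every car, reporting continuously, quickly outgrow this kind of approach.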

Characteristics of Big Data

Volume: The amount of data being stored today is increasing at an overwhelming rate

Booking a flight, posting to Facebook, sending a text, and more

Variety: Represents all types of data

Velocity: How quickly data is arriving, stored, and analyzed

Fraud Detection Background

Transactions: Online auctions, insurance claims

A big data platform can present opportunities to increase detection success

Patterns of fraud can come and go in hours, days, or weeks.

If the data behind a fraud detection model is not available with low latency, the damage is already done by the time a pattern is discovered

Fraud Detection Questions

An estimated 20% of the available information that could be useful for fraud detection is being used

Why not load the other 80 percent of data into the traditional analytic warehouse? Too expensive

Would it not pay for itself? How can we be sure this new information will be valuable before making a costly business decision?

Use BigInsights to provide an elastic and cost-effective repository to establish what of the remaining 80 percent of the information is useful for fraud modeling.

Fraud Detection

IBM teamed up with a large credit card issuer to improve their fraud detection model.

They discovered they could improve the speed of detection and have more accurate results using the new model

A process that once took three weeks was improved to just a few hours.

They also found that about half of the 80% was actually beneficial information that could be used
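
To put the percentages from these slides together, here is a small worked calculation in Python. It assumes "about half" means exactly half and that the 20% and 80% figures refer to the same pool of information, which the slides imply but do not state precisely.

```python
# Percentages taken from the slides; integer math keeps the result exact.
used_pct = 20                        # share of useful information already in use
unused_pct = 100 - used_pct          # the "other 80 percent"
newly_useful_pct = unused_pct // 2   # "about half of the 80%" turned out beneficial

total_useful_pct = used_pct + newly_useful_pct
growth_factor = total_useful_pct / used_pct

print(f"usable for fraud modeling: {total_useful_pct}% ({growth_factor:.0f}x the original)")
```

Under those assumptions, the information available for fraud modeling roughly triples.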

The Social Media Pattern

Organizations can apply the big data usage pattern to social media to find out what is being said about the company and its competitors

This information can be used to significantly improve decision making

IBM has built a solution called Cognos Consumer Insights (CCI) to accelerate an organization's use of this pattern

CCI allows an organization to see what people are saying, how topics are trending in social media, and other factors that affect the organization
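
The simplest form of "finding out what is being said" is counting how often a company and its competitors are mentioned. The Python sketch below shows only that idea. It is not how CCI works; the function name, the sample posts, and the company names "Acme" and "Globex" are all invented for illustration.

```python
from collections import Counter


def mention_counts(posts, names):
    """Count how many posts mention each name (case-insensitive substring match)."""
    counts = Counter()
    for post in posts:
        text = post.lower()
        for name in names:
            if name.lower() in text:
                counts[name] += 1
    return counts


# Hypothetical sample data.
posts = [
    "Love the new Acme packaging",
    "acme delayed my order again",
    "Globex has better prices",
]
counts = mention_counts(posts, ["Acme", "Globex"])
print(counts.most_common())
```

Substring matching is crude (it would also match "Acme" inside a longer word), and a count says nothing about whether a mention is positive or negative, which is part of why dedicated tooling exists.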

Why are they unhappy with my company?

Although you can find out what people are saying, a more important question is why they are saying it and behaving this way

An organization needs to look beyond that data to answer the question

Sales, promotions, loyalty programs, merchandising mix, competitor actions, and even weather can come into play.

Example

A company introduced a different kind of packaging for one of its products.

Customers were giving negative feedback on the new packaging

Months later the company discovered the problem and switched the packaging to an eco-friendly package.

This in turn increased sales and customer happiness

Example 2

One of the book's authors is a prolific Facebook poster

Traveling on airlines is essential to his job, and after a number of flight delays he posted his frustration with these airlines on his Facebook wall

The airline found these posts on his Facebook wall and contacted him

Although the book doesn't mention what the airline did to compensate or fix the problem, it does show one thing: the company was listening

Hadoop

Hadoop is a top-level Apache project and is open source

It is designed to scan through large data sets and produce its results through a highly scalable, distributed batch processing system

Data is redundantly stored in multiple places across clusters

The programming model is built to expect failures, and it automatically resolves them by running portions of the program on various servers.

Hardware components might fail, but due to the redundancy Hadoop can provide fault tolerance
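
The redundancy idea can be sketched in a few lines of Python. This is a toy, single-process simulation, not Hadoop code: the function names, the round-robin placement, and the node names are assumptions for illustration (real HDFS placement is rack-aware and far more involved). The replica count of 3 matches the usual HDFS default.

```python
from collections import Counter

REPLICATION = 3  # copies kept of every block


def place_blocks(blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin (toy policy)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        i: [nodes[(i + r) % len(nodes)] for r in range(replication)]
        for i in range(len(blocks))
    }


def count_words(block):
    """The per-block task: count the words in one chunk of text."""
    return Counter(block.split())


def run_job(blocks, placement, failed_nodes):
    """Run the task for each block on the first healthy node that holds a copy."""
    total = Counter()
    ran_on = {}
    for i, block in enumerate(blocks):
        healthy = [n for n in placement[i] if n not in failed_nodes]
        if not healthy:
            raise RuntimeError(f"block {i} lost: every replica is on a failed node")
        ran_on[i] = healthy[0]       # in a real cluster the task would run there
        total += count_words(block)  # partial results are merged into one answer
    return total, ran_on


blocks = ["big data big", "data hadoop", "big hadoop hadoop"]
nodes = ["n0", "n1", "n2", "n3"]
placement = place_blocks(blocks, nodes)

# Two of the four nodes fail, yet every block still has a surviving copy.
total, ran_on = run_job(blocks, placement, failed_nodes={"n1", "n2"})
print(total, ran_on)
```

With three copies of each block spread over four nodes, losing any two nodes still leaves at least one copy of every block, so the job finishes with the same answer it would have produced on a healthy cluster.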

InfoSphere BigInsights

Hadoop can be complex to install, configure, and administer

IBM takes this complexity away with the BigInsights installer

BigInsights makes it simpler for people to use Hadoop and build big data applications.

It enhances this open source technology to withstand the demands of your enterprise, adding administrative, discovery, development, provisioning, and security features, along with best-in-class analytical capabilities from IBM Research.

The result is that you get a more developed and user-friendly solution for complex, large scale analytics.

Special Thanks to

http://www-01.ibm.com/software/data/infosphere/biginsights/index.html

http://en.wikipedia.org/wiki/Big_data

http://www.decalsplanet.com/item-10485-black-pot-of-gold.html

http://drshocker.blogspot.com/2007_03_01_archive.html

http://www.mytinyphone.com/wallpaper/31448/

https://www.facepunch.com/showthread.php?t=1332655

What Big Data Says About You

Short YouTube video that explains Big Data

Some interesting stories the speaker went over

Extra Story

Bats flying around airports

Noise was produced and airports filtered this noise out

Weather patterns

Airplane movement

15 years later scientists got together

Collecting data on bat migration

Throwing this data away

One man's garbage is another man's treasure

Extra Story

Gates Foundation

Eradicate polio in Nigeria

Satellite maps

Found villages no one knew of

Government did not know these people were there

No maps showed these villages

Gates gave out GPS phones to polio eradication workers

Combining satellites, vaccines, and cell phones is not something that comes to mind when thinking of big data

Problems are caused by misinformation or by getting the information too late

Any Questions?