INLS 151 mon march 30 today’s line-up online privacy brief intro to “big data” disclosure on Reddit – John Martin pass back midterms Pecha Kucha presentation format Data to Story project work time

Jan 15, 2016


Alaina Brooks
Page 1

INLS 151 mon march 30

today’s line-up

online privacy

brief intro to “big data”

disclosure on Reddit – John Martin

pass back midterms

Pecha Kucha presentation format

Data to Story project work time

Page 2

What do we mean by privacy?

• Louis Brandeis (1890)
– “right to be let alone”
– protection from institutional threat: government, press

• Alan Westin (1967)
– “right to control, edit, manage, and delete information about themselves and decide when, how, and to what extent information is communicated to others”

Page 3

Privacy vs. security

• Security helps enforce privacy policies
• Can be at odds with each other
– e.g., invasive screening to make us more “secure” against terrorism

Privacy: what information goes where?

Security: protection against unauthorized access

Page 4

DEFINITION: “Big Data”

Big Data is used in the singular and refers to a collection of data sets so large and complex, it’s impossible to process them with the usual databases and tools. Because of its size and associated numbers, Big Data is hard to capture, store, search, share, analyze and visualize. The phenomenon came about in recent years due to the sheer amount of machine data being generated today – thanks to mobile devices, tracking systems, RFID, sensor networks, social networks, Internet searches, automated record keeping, video archives, e-commerce, etc. – coupled with the additional information derived by analyzing all this information, which on its own creates another enormous data set. Companies pursue Big Data because it can be revelatory in spotting business trends, improving research quality, and gaining insights in a variety of fields, from IT to medicine to law enforcement and everything in between and beyond.

Page 5

Massive Messy Data

• Big Data analysis requires collecting
– massive amounts of
– messy data.

• Messy data: The data is not in a uniform format as one would see in a traditional database, and it is not annotated (semantically tagged).
– A technological breakthrough was to find ways to manipulate and analyze such data.

• Massive amounts: think of every tweet ever tweeted. They are all in the Library of Congress.
– 400 million tweets a day in 2013.

Page 6

Patterns We Would Not Notice

• Big Data analytics can reveal important patterns that would otherwise go unnoticed.

• Taking the antidepressant Paxil together with the anti-cholesterol drug Pravachol could result in diabetic blood sugar levels. Discovered by
– (1) using a symptomatic footprint characteristic of very high blood sugar levels obtained by analyzing thirty years of reports in an FDA database, and
– (2) then finding that footprint in the Bing searches using an algorithm that detected statistically significant correlations. People taking both drugs also tended to enter search terms (“fatigue” and “headache,” for example) that constitute the symptomatic footprint.
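
The slide does not say which algorithm the study used. As a hedged illustration of what a "statistically significant correlation" can mean, the sketch below applies a Pearson chi-square test to a 2x2 table of searchers. This is a swapped-in textbook technique, not necessarily the study's method, and every count in it is invented purely for illustration.

```python
# Illustrative sketch only: a Pearson chi-square test on a 2x2 table.
# The function name and all counts below are hypothetical, not from the study.

CRITICAL_VALUE_5_PERCENT = 3.841  # chi-square cutoff for 1 degree of freedom, p = 0.05


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for the table [[a, b], [c, d]].

    a: searched both drugs AND symptom terms   b: both drugs, no symptom terms
    c: other users who searched symptom terms  d: other users, no symptom terms
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


if __name__ == "__main__":
    # Invented counts: 10% of two-drug searchers vs. 5% of everyone else
    # entered symptom terms such as "fatigue" or "headache".
    statistic = chi_square_2x2(100, 900, 500, 9500)
    print(f"chi-square = {statistic:.2f}")
    print("significant at 5%:", statistic > CRITICAL_VALUE_5_PERCENT)
```

A statistic above the cutoff says only that the two groups search differently more often than chance would explain; it does not show that the drug pair causes the symptoms.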

Page 7

DEFINITION: “Cookie”

A cookie is a small amount of data generated by a website and saved by your browser. Its purpose is to remember information about you, similar to a preference file created by a software application. Cookies are also used to store user preferences for a specific site. For example, search engines like Google or Bing store your searches. Financial websites sometimes use cookies to store recently viewed stock quotes. If a website needs to store a lot of personal information, it may use a cookie to remember who you are, but will load the information from its server.

Browser cookies come in two different flavors: "session" and "persistent." Session cookies are temporary and are deleted when the browser is closed. These types of cookies are often used by e-commerce sites to store items placed in your ‘shopping cart,’ and can serve many other purposes as well. Persistent cookies are designed to store data for an extended period of time. Each persistent cookie is created with an expiration date, which may be anywhere from a few days to several years in the future. Once the expiration date is reached, the cookie is automatically deleted.
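
To make the two flavors concrete, here is a minimal sketch using Python's standard `http.cookies` module. The cookie names (`cart_id`, `theme`), their values, and the 30-day lifetime are assumptions chosen for illustration. The only difference between the two cookies is whether a lifetime attribute is present.

```python
# Illustrative sketch: a session cookie versus a persistent cookie.
# Cookie names, values, and the 30-day lifetime are hypothetical.
from http.cookies import Morsel, SimpleCookie

THIRTY_DAYS_IN_SECONDS = 30 * 24 * 60 * 60


def build_cookies() -> SimpleCookie:
    cookies = SimpleCookie()

    # Session cookie: no Expires or Max-Age attribute,
    # so the browser discards it when it is closed.
    cookies["cart_id"] = "abc123"
    cookies["cart_id"]["path"] = "/"

    # Persistent cookie: Max-Age tells the browser how long to keep it.
    cookies["theme"] = "dark"
    cookies["theme"]["path"] = "/"
    cookies["theme"]["max-age"] = THIRTY_DAYS_IN_SECONDS
    return cookies


def is_persistent(morsel: Morsel) -> bool:
    """A cookie is persistent if it carries Max-Age or Expires."""
    return bool(morsel["max-age"] or morsel["expires"])


if __name__ == "__main__":
    cookies = build_cookies()
    print(cookies.output())  # the Set-Cookie headers a server would send
    for name, morsel in cookies.items():
        print(name, "persistent" if is_persistent(morsel) else "session")
```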

Page 8

DEFINITION: “RFID”

RFID stands for Radio Frequency IDentification, a technology that uses tiny computer chips smaller than a grain of sand to track items at a distance. RFID chips have been hidden in the packaging of Gillette razor products and in other products you might buy at a local Wal-Mart, Target, or Costco - and they are already being used to “spy” on people. Each tiny chip is hooked up to an antenna that picks up electromagnetic energy beamed at it from a reader device. When it picks up the energy, the chip sends back its unique identification number to the reader device, allowing the item to be remotely identified. These chips can beam back information anywhere from a couple of inches to up to 20 or 30 feet away.

Shown at left is a magnified image of an actual RFID tag found in Gillette Mach3 razor blades. The chip appears as the tiny black square. The coil of wires surrounding the chip is the antenna, which transmits your information to a reader device, which can be located anywhere!

Page 9

DEFINITION: “RFID” (continued)

This technology is rapidly evolving and becoming more sophisticated. Now RFID chips can even be printed, meaning the dot on a printed letter "i" could be used to track you. Companies are even experimenting with making the product packages themselves serve as antennas. RFID chips can be well hidden. For example, they can be sewn into the seams of clothes, sandwiched between layers of cardboard, and molded into plastic or rubber. Unlike a bar code, these chips can be read from a distance, right through your clothes, wallet, backpack or purse -- without your knowledge or consent -- by anybody with the right reader device. Many large corporations, including Philip Morris, Procter and Gamble, and Wal-Mart, have begun experimenting with RFID chip technology and have recently placed an order for up to 500 million RFID tags from a company called Alien Technology.

Page 10

Speaking of miniaturization… (a slight digression)

• Smartphones and tablets outsold desktop and laptop computers in 2014; there were 170 million smartphones in the U.S. in 2014*

• The phone in your pocket has more programmable memory, more storage and more capability than several large IBM computers.

• It takes dozens of microprocessors running 100 million lines of code to get a premium car out of the driveway, and this software is only going to get more complex. In fact, the cost of software and electronics accounts for 30-40% of the price.

*Statista

Page 11

What is collecting all this data?

Web Browsers

• Microsoft’s Internet Explorer
• Mozilla’s Firefox (non-profit foundation, used to be Netscape)
• Google’s Chrome
• Apple’s Safari
• Time-Warner’s AOL Explorer

Search Engines

• Google’s
• Microsoft’s
• Yahoo’s
• IAC Search’s

Page 12

What is collecting all this data?

Smartphones & Apps

Apple’s iPhone (Apple O/S)

Samsung, HTC, Nokia, Motorola (Android O/S)

RIM Corp’s BlackBerry (BlackBerry O/S)

Tablet Computers & Apps

Apple’s iPad

Samsung’s Galaxy

Amazon’s Kindle Fire

Page 13

What is collecting all this data?

Games Boxes and GPS Systems

Internet Service Providers

Page 14

What is collecting all this data?

Smart TVs and Blu-Ray Players with built-in Internet connectivity

Movie Rental Sites

Page 15

What is collecting all this data?

Banking & Phone Systems

Can you hear me now? (Heh heh heh!)

Hospitals & Other Medical Systems

Pharmacies

Laboratories

Imaging Centers

Emergency Medical Services (EMS)

Hospital Information Systems

Doc-in-a-Box

Electronic Medical Records

Blood Banks

Birth & Death Records

Page 16

What is collecting all this data?

A real pain in the apps! What are they collecting?

• Restaurant reservations (Open Table)
• Weather in L.A. in 3 days (Weather+)
• Side effects of medications (MedWatcher)
• 3-star hotels in New Orleans (Priceline)
• Which PC should I buy and where (PriceCheck)

Page 17

Who is collecting all of this data?

Government Agencies

Big Pharmaceutical Companies

Page 18

Who is collecting all this data?

Consumer Products Companies

Big Box Stores

Page 19

Who is collecting what?

Credit Card Companies

What data are they getting?

Restaurant check

Grocery Bill

Airline ticket

Hotel Bill

Page 20

Why are they collecting all this data?

Target Marketing

• To send you catalogs for exactly the merchandise you typically purchase.
• To suggest medications that precisely match your medical history.
• To “push” television channels to your set instead of your “pulling” them in.
• To send advertisements on those channels just for you!

Targeted Information

• To know what you need before you even know you need it based on past purchasing habits!
• To notify you of your expiring driver’s license or credit cards or last refill on a Rx, etc.
• To give you turn-by-turn directions to a shelter in case of emergency.

Page 21

Examples of big data…..

Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data, the equivalent of 167 times the information contained in all the books in the US Library of Congress.

FICO Credit Card Fraud Detection System protects 2.1 billion active accounts world-wide.

The volume of business data worldwide, across all companies, doubles every 1.2 years, according to estimates.
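
The figures on this slide can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check: it assumes decimal units (1 petabyte = 1,000 terabytes) and steady exponential growth, and the function names are made up for this example.

```python
# Back-of-the-envelope checks on the slide's numbers.
# Assumptions (not from the slide): decimal units, steady exponential growth.

TERABYTES_PER_PETABYTE = 1000


def implied_library_size_tb(database_pb: float = 2.5, multiple: float = 167) -> float:
    """Size of the Library of Congress book collection implied by the comparison."""
    return database_pb * TERABYTES_PER_PETABYTE / multiple


def growth_factor(years: float, doubling_time_years: float = 1.2) -> float:
    """How many times larger the data volume is after `years` of doubling."""
    return 2 ** (years / doubling_time_years)


if __name__ == "__main__":
    print(f"Implied book collection: about {implied_library_size_tb():.0f} TB")
    print(f"Growth over 6 years: about {growth_factor(6):.0f} times")
```

In other words, the comparison implies a book collection of roughly 15 terabytes, and a 1.2-year doubling time compounds to about a 32-fold increase in six years.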

Page 22

Examples of Big Data

With a smart meter, a utility company goes from collecting one data point a month per customer (using a meter reader in a truck or car) to receiving about 3,000 data points for each customer each month, because smart meters send usage information up to four times an hour.

One small Midwestern utility is using smart meter data to structure conservation programs that analyze existing usage to forecast future use, price usage based on demand and share that information with customers who might decide to forestall doing that load of wash until they can pay for it at the nonpeak price.
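
The "3,000 data points" figure follows from the reporting rate. A minimal sketch of that arithmetic, assuming the meter reports at the maximum rate of four readings an hour around the clock (the function name is made up for this example):

```python
# Sketch of the smart-meter arithmetic, assuming four readings every hour, all month.

HOURS_PER_DAY = 24


def readings_per_month(readings_per_hour: int = 4, days_in_month: int = 30) -> int:
    """Number of usage readings one meter sends in a month."""
    return readings_per_hour * HOURS_PER_DAY * days_in_month


if __name__ == "__main__":
    for days in (30, 31):
        print(f"{days}-day month: {readings_per_month(days_in_month=days)} readings")
```

A 30-day month gives 2,880 readings and a 31-day month gives 2,976, which the slide rounds to 3,000.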

Page 23

Examples of Big Data

Global Positioning System (GPS) satellite technology now allows trucking firms to track their trucks - and the merchandise inside them. Practically anything you can attach an RFID tag to can be tracked. How a company uses that information – to re-route trucks to create efficient routes, alert customers to deliveries, and forecast and price services – depends on the ability to manage and analyze data effectively.

Page 24

Big Brother Needs Big Data

In March 2012, the Obama Administration announced the Big Data Research and Development Initiative, $200 million in new R&D investments, which will explore how Big Data could be used to address important problems facing the government. The initiative was composed of 84 different Big Data programs spread across six departments.

http://tinyurl.com/85oytkj

Page 25

What are some impacts of Big Data?

• Decisions like your credit score and your insurance rates may be based on the analysis of big data, for good or bad.

• After Haiti’s 2010 earthquake, Columbia University researchers tracked the movements of 2 million refugees by the SIM cards in their cell phones and were able to determine where health risks would likely develop.

Page 26

Is Big Data good or bad for consumers?

• How would you feel about paying more for the same product than the person checking out in front of you?

• The real challenge: are you willing to get better value and more innovation for some loss of privacy?

• Since there is no way to stop the accumulation of Big Data, should its use be regulated by the Federal government?

Page 27

On the surface, big data appears to help companies improve their business and make them more efficient, however the problem many people are worried about is whether their personal and private information is secure.

-Alex

malicious intent…

-Emily

The amount of people who are not disturbed by this violation of their privacy and claim "they have nothing to hide" is alarming and dangerous.

-Cherish

Page 28

How Can You Avoid Big Data?

• Pay cash for everything!
• Never go online!
• Don’t use a telephone!
• Don’t use Kroger or Harris Teeter cards!
• Don’t fill any prescriptions!
• Never leave your house!

Page 29

Pecha Kucha presentation

https://www.youtube.com/watch?v=wGaCLWaZLI4