LITTLE Issues with big Data and AI
Jim Isaak
2015 SSIT Vice President
2010 Computer Society President
Nov. 2017 v2
Society on Social Implications of Technology
What’s coming?
A quick history of how big data has become, and why the 21st century is not the same as the previous millennium
And then some of the “So What?”
– Challenges the public needs to consider
– That technologists need to consider
– That Policy makers need to consider
– But first, a word from our Sponsor:
11/20/20172
Impacts of Technology on Society
www.IEEESSIT.org
When I was a boy … (1972)
Computers typically had 32k bytes of RAM
And 2.5 MB disk drives
And took forever to do things we consider commonplace now
Moore’s Law – density doubles every 2 years (more speed, at ½ the price)
But now…
Intel’s latest “desktop” chip is 4 GHz (1,000,000,000× faster than my 1970s system)
Consider a person walking at 4 mph
now moving at 6× the speed of light
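The walking analogy is easy to check with a little arithmetic. The sketch below assumes a speed-up of one billion, which is the factor that turns 4 mph into roughly six times the speed of light; the function name is illustrative only.

```python
SPEED_OF_LIGHT_MPH = 670_616_629  # speed of light expressed in miles per hour


def scaled_speed_in_c(walking_mph: float, speedup: float) -> float:
    """Return the sped-up walking pace as a multiple of the speed of light."""
    return walking_mph * speedup / SPEED_OF_LIGHT_MPH


# Assumed speed-up factor: one billion (1e9).
multiple = scaled_speed_in_c(4, 1e9)
print(f"{multiple:.2f}x the speed of light")
```

A factor of a thousand more or less moves the answer by the same factor, so the analogy is very sensitive to the number of zeros.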
My local storage has gone from two novels
to the Library of Congress
Bytes (8 bits each):
Item                                         Size           Bytes
Short novel                                  1 Megabyte     1,000,000
A pickup truck filled with books             1 Gigabyte     1,000,000,000
The Library of Congress – print collection   10 Terabytes   10,000,000,000,000
Note: storage is measured in bytes, communications in bits …
“Broadband” networks typically run at 500 kilobits per second and up:
roughly 50,000 bytes per second, or about 20 seconds per book
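The bits-versus-bytes conversion can be sketched as follows. The 10 bits per byte figure is an assumption (a rough allowance for framing overhead) that reproduces the 20-second estimate; with a bare 8 bits per byte the same book takes 16 seconds. The names are illustrative.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: int,
                     bits_per_byte: int = 8) -> float:
    """Time to move size_bytes over a link rated in bits per second."""
    return size_bytes * bits_per_byte / link_bits_per_second


SHORT_NOVEL_BYTES = 1_000_000   # one short novel, per the table
BROADBAND_BPS = 500_000         # 500 kilobits per second

raw = transfer_seconds(SHORT_NOVEL_BYTES, BROADBAND_BPS)
# Assumption: about 10 bits on the wire for every byte of payload.
with_overhead = transfer_seconds(SHORT_NOVEL_BYTES, BROADBAND_BPS, bits_per_byte=10)

print(f"raw: {raw:.0f} s, with overhead: {with_overhead:.0f} s")
```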
A comparison in human terms
It takes me six seconds to get an ingredient from the fridge (load from RAM)
(Ideally a single cycle for a processor)
For a 4 GHz processor, a rotational delay of 2 ms => 1.5 years!
Seek time (4 ms) plus rotational delay => 4.5 years, and 100 ms => 77 years
How many items can I get from the fridge while waiting for one from “the store”?
(and don’t even consider net latency!)
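A minimal sketch of the fridge analogy: one processor cycle is stretched to six human seconds, and each disk delay is scaled by the same amount. It assumes the 4 ms figure is seek time added on top of the 2 ms rotational delay, and uses a 365.25-day year; the results land close to the rounded figures quoted above.

```python
CLOCK_HZ = 4e9                      # 4 GHz processor
FRIDGE_SECONDS = 6                  # human time standing in for one cycle
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def human_years(delay_seconds: float) -> float:
    """Scale a hardware delay to human time: one cycle equals one fridge trip."""
    cycles = delay_seconds * CLOCK_HZ
    return cycles * FRIDGE_SECONDS / SECONDS_PER_YEAR


delays = [
    ("rotation, 2 ms", 0.002),
    ("seek 4 ms + rotation 2 ms", 0.006),   # assumed reading of the slide
    ("100 ms", 0.100),
]
for label, delay in delays:
    print(f"{label}: {human_years(delay):.1f} years")
```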
The new challenge/limitation
Watts –
– Power requirements
– And BTUs of heat generated by thousands of processors/disk drives
It’s why Google, Amazon, et al. are placing data centers near hydro power and cooling options, or at least cheap power
And – how are you going to use that much stuff?
An example – Bluffdale Utah, NSA
65 megawatts (1 MW powers 600+ homes; for comparison, Kennecott Utah Copper: 200 MW)
Aug. 2016 water use: 6.6M gallons for cooling
Est. 3–12 exabytes (3 exabytes = 3,000,000,000,000,000,000 bytes)
(i.e. within the address space of one 64-bit processor)
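The scale claims can be checked with simple arithmetic. This is a sketch using decimal exabytes (10^18 bytes) and the 600 homes per megawatt rule of thumb quoted above; the variable names are illustrative.

```python
EXABYTE = 10**18                      # decimal exabyte, in bytes

address_space_bytes = 2**64           # bytes addressable with a 64-bit pointer
address_space_eb = address_space_bytes / EXABYTE

low_estimate = 3 * EXABYTE
high_estimate = 12 * EXABYTE
fits_in_64_bits = high_estimate <= address_space_bytes

homes_equivalent = 65 * 600           # 65 MW at 600 homes per MW

print(f"64-bit address space: {address_space_eb:.1f} EB")
print(f"12 EB fits in one address space: {fits_in_64_bits}")
print(f"65 MW is roughly {homes_equivalent:,} homes")
```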
The 21st century realization (Google, et al)
1. All data has value – and you don’t know what will be useful in the future (Bluffdale center: storing pocket litter)
2. Critically missing in traditional systems: fault tolerance, massive scalability, malleable schemas, flexible queries
=> Community development, e.g. Hadoop
Going Viral (getting Real-Time)
“Google Flu Trends appeared to detect regional outbreaks of influenza 7–10 days before conventional Centers for Disease Control and Prevention surveillance systems” Clinical Infectious Diseases (2009) doi: 10.1086/630200
Simple concept: track search trends related to symptoms, relate them to location, and identify potential hot spots. This specific concept has since been picked up and applied with more focused algorithms
The Google experience did not work as well as might be desired – big data hubris
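The “simple concept” can be sketched in a few lines: compare each region's latest weekly count of symptom searches with its own recent baseline and flag the outliers. This is not Google's method; the region names, counts and threshold are invented for illustration, and a naive rule like this is exactly the kind that can overreact to noise.

```python
from statistics import mean, stdev


def find_hot_spots(weekly_counts: dict, threshold: float = 2.0) -> list:
    """Flag regions whose latest week sits far above their own baseline."""
    hot = []
    for region, counts in weekly_counts.items():
        baseline, latest = counts[:-1], counts[-1]
        spread = stdev(baseline)
        # z-score of the latest week against the earlier weeks
        if spread > 0 and (latest - mean(baseline)) / spread > threshold:
            hot.append(region)
    return sorted(hot)


# Hypothetical weekly counts of flu-symptom searches per region.
searches = {
    "Region A": [100, 104, 98, 102, 101, 180],
    "Region B": [50, 52, 49, 51, 50, 51],
}
hot_spots = find_hot_spots(searches)
print(hot_spots)
```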
An associate of mine was researching social media streams to track potential ‘hot spots’ for civil unrest, terrorism, etc. She now works for NSA
… Now Trending …
Patients Like Me
“We're unleashing the power of data for good by empowering people to take control of their health because we believe real-world evidence can change the healthcare system”
Can trigger “instant” medical studies based on 400,000+ participants with 2500+ medical conditions
Lithium, Bi-Polar and ALS – 16 patients in journal article – but PLM found 69 in a day.
https://www.ted.com/talks/jamie_heywood_the_big_idea_my_brother_inspired
Now let’s see “Applications”
Summer 2016: CEO of Cambridge Analytica (11 minutes)
March 17: CEO of Cambridge Analytica (30 minutes)
Composite from these two presentations (25 minutes, not online)
Election 2016
Democratic DB – every voter, likelihood of voting, feedback from surveys on candidate preferences … do everything you can to get the expected supporters out.
Trump “Project Alamo” w/ Cambridge Analytica: Facebook “psych” survey + profiles + external data on 220,000,000 Americans w/ 4000+ data points each (“voter registration records, gun ownership records, credit card purchase histories, and internet account identities”) => personally targeted ads to either:
– Gain support (funding, voting)
– Suppress turnout of targeted groups
What Data Sources?
Facebook profile
OCEAN like personality test
Credit Cards
Credit Record
Browser Searches
Email “terms”
Church attendance
CATV viewing
Car registration
Home ownership
Magazine subscriptions
“and the beat goes on…”
OCEAN personality analysis
Openness, which refers to how readily an individual will take on new experiences or accept non-conventional ideas, levels of creativity …
Conscientiousness, which applies to attention to detail, vigilance, organization and a desire to complete a …
Extraversion, which relates to assertiveness, enjoyment of human interactions and risk-taking.
Agreeableness, which tends to be indicative of co-operation, kindness and consideration for others.
Neuroticism, which reflects levels of anxiety, ability to deal with stress and maintaining calmness under pressure.
“They [the Trump campaign] were using 40–50,000 different variants of ad every day that were continuously measuring responses and then adapting and evolving based on that response,” – Martin Moore, director of King’s College’s Centre for the Study of Media, Communication and Power, told The Guardian in early December.
Predictive analysis
Predictive analysis: finding and quantifying hidden patterns in the data using complex mathematical models that can be used to predict future outcomes.
“Amazon customers like you ….”
Think “Minority Report” … without the prescient mediums
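A toy version of “customers like you”: score items by how much their buyers overlap with your own purchases. This is a minimal sketch, not Amazon's algorithm; the customers and baskets are invented for illustration.

```python
from collections import Counter


def recommend(target: str, baskets: dict, top_n: int = 2) -> list:
    """Suggest items bought by customers who share purchases with target."""
    mine = baskets[target]
    scores = Counter()
    for customer, items in baskets.items():
        if customer == target:
            continue
        overlap = len(mine & items)      # how similar this customer is to target
        if overlap == 0:
            continue
        for item in items - mine:        # only items target does not own yet
            scores[item] += overlap
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase histories.
baskets = {
    "you": {"tent", "stove"},
    "customer_1": {"tent", "stove", "lantern"},
    "customer_2": {"tent", "sleeping bag"},
    "customer_3": {"novel"},
}
suggestions = recommend("you", baskets)
print(suggestions)
```

Real systems use far richer models, but the pattern is the same: past behavior of similar people is used to predict what you will do next.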
From the man who “Liked” OCEAN
Dr. Michal Kosinski found that just a few Facebook “Likes” could match you to your OCEAN profile with high probability. 3 million Facebook profiles (1/1000)
10+ Likes and the model knows a personality as well as co-workers do
100+ as well as family/friends
250+ better than a spouse
• Michal’s Keynote on Privacy
11/20/201723
Society on Social Implications of Technology
Every friend you “like”
Sexual orientation 88%
Gender, political views, race (95%)
Age, IQ,
Birds of a feather – friends like friends
∑ trivial data points => non-trivial
– Facebook + credit-card + search…
Also language use …
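Why a sum of trivial data points becomes non-trivial can be sketched with a naive Bayes style update: each weak signal nudges the odds a little, and the nudges multiply. The prior and the likelihood ratio of 1.5 are invented for illustration, and the signals are assumed to be independent, which real data rarely is.

```python
import math


def combine_signals(prior: float, likelihood_ratios: list) -> float:
    """Multiply prior odds by each signal's likelihood ratio (independence assumed)."""
    log_odds = math.log(prior / (1 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical: a 10% base rate, and signals that are each only 1.5x
# more common among people who have the trait.
one_signal = combine_signals(0.10, [1.5])
many_signals = combine_signals(0.10, [1.5] * 20)

print(f"one weak signal:  {one_signal:.2f}")
print(f"twenty of them:   {many_signals:.4f}")
```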
To summarize
Your face may disclose:
Humans can judge gender, age, introversion, …
Political views, sexual orientation
Gay, liberal, atheism – capital crimes some places
5 pictures sufficient to get ‘gay’ at 92%
Also captured:
– Location data, continuous
– Sensors – heart rate
I fed a sample from my web page into the Cambridge tool
Test 1 – 10yr old text
22 yr old male
89% liberal
69% hard working
19% contemplative
51% team oriented
22% laid back
60% leader potential
INTJ “Jungian style”
Test 2 - Recent text
30 year old male
38% conservative
67% hardworking
27% contemplative
35% competitive
22% laid back
34% leader potential
ISTJ style
Save the Rhinos
Noseong Park, Edoardo Serra, and V.S. Subrahmanian document their predictive analytics software to save rhinos (IEEE Intelligent Systems, August 2015)
Tracking, and then predicting rhino movement, and poacher movement can help target drone and ranger patrols to save more rhinos
Ends with the caveat that your rhinos may differ
What if?
We used all of the Cambridge Analytica and other available data …
And analyzed which persons were most likely to:
– Commit suicide (most common form of gun violence)
– Attack a church congregation
– Initiate a terrorist attack
“Subject 47 has bought 3 assault rifles in the last week and 300 clips of ammo”
What would/should we do?
AI – coming of age
Less than “the movies” view, but more than folks expect
Past the tipping point, so it’s hard to see where it can lead
“Alexa …”
Recent in AI: Deep Learning
Watson has spoken
– It’s not just a game show any more
– It’s natural language in context
– It’s open ended responses to open ended questions (Siri, Hello Barbie etc.)
And the AI folks are on board
– Deep Learning to go beyond understanding data to modeling “you”
Prof. Pedro Domingos, UW, in his book “The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World”
Open Source Tools Emerging
For big data management
Analytics
AI methods
And emerging open-data sources
“Data wants to be free”
=> Letting a thousand flowers bloom
Bit Rot
Vint Cerf, “father of the Internet”, raises the concern
Consider the media (floppy disc), and the associated reading device(s), and the encoding technique (PC-DOS files in ASCII with data for WordStar) and the required environment (DOS 2.0)
Will we be able to access the data?
Provenance
Credibility – or is it just the number of times the lie is re-told?
– This is one rationale for ‘citations’ in academic literature
– And for “reproducibility” in the scientific method … But
For Big Data can there be quality control, authority, chain of evidence, credible source, validation..???
There will be “data jamming” attacks
The Right to be Forgotten
In 1998 the Spanish newspaper La Vanguardia published an announcement regarding the forced sale of properties
A property belonged to Mario Costeja González, who was named
In 2009, Costeja contacted the newspaper to complain that when his name was entered in the Google search engine it led to the announcements
In 2010 …
Jurisdictions
He took his concerns to the Spanish Agency of Data Protection
From there it went to the EU Advocate General
Then to the EU Court of Justice
2014: Google’s online form for EU citizens or EFTA nationals to request the removal of links if the data linked is “inadequate, irrelevant or no longer relevant, or excessive in relation to the purposes for which they were processed”
POP Quiz
Can you name two politicians who would like some of their history “forgotten”?
Or more challenging, can you name one who would not like this to happen?
The Proxy Did It!
O’Neil, Cathy; Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy; Random House, 2016 (also Discover Mag, Oct 2016 issue)
“models and algorithms encode human prejudice”
E-Scores vs FICO
Fair Isaac credit scores are based on YOUR personal financial history, but cannot be used in sales/marketing (just hiring, promotions, loans, etc.)
eScores are proxies for FICO in some ways, matching you into “buckets” and affecting YOUR job, credit, even your time on hold to get service
“e-scores are arbitrary, unaccountable, unregulated, and often unfair” (O’Neil)
Time on hold?????
Managing call center traffic
(“please dial 1 if you are rich, dial 2 to go on hold, dial 3 to talk to someone in India, and 4 if you just would like to dial in more numbers.”)
Ditto for credit card web sites – before you even “look”, your browsing and purchasing patterns are being evaluated.
These may not be your Friend
“People Like You…” (zip, job, search…)
e-Scores to “Score”?
CreditScoreDating.com “at least the customers know what they are getting into and why” (O’Neil)
Job Applicants are “researched” on the web BEFORE any contact from the company – eScores, Facebook, etc.
“The law stipulates employers must alert job seekers when credit issues disqualify them…” (O’Neil) Right….
Show of Hands:
Your wait time should be based on e-score proxies: folks like you …
Your wait time should be based on specific knowledge about you … (“tends to complain a lot, let him wait a bit longer, play the subliminal message tape”)
How Big is BIG?
Microsoft and U. Washington have developed a system to store binary data in DNA sequences.
– All of the data on the 2016 Internet could fit into a shoe box
– Much lower energy, less risk of bit rot, but … right now, real slow read/write times
Oct 28, 2016 WSJ insert “Fast Forward Tech”
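To make “binary data in DNA sequences” concrete, here is the simplest possible mapping: two bits per base. This is a hedged sketch only, not the Microsoft/U. Washington scheme; practical encodings add error correction and avoid long runs of the same base. The function names are illustrative.

```python
BASES = "ACGT"  # A=00, C=01, G=10, T=11 (an arbitrary assignment)


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    out = bytearray()
    for start in range(0, len(strand), 4):
        value = 0
        for base in strand[start:start + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


strand = bytes_to_dna(b"Hi")
print(strand, dna_to_bytes(strand))
```

At four bases per byte, the density comes from how small a base is, not from any cleverness in the code.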
• Memristors (HP/SanDisk term) / ReRAM: not as dense as DNA storage, but significantly faster, denser and lower power than SDRAM.
Jan 2017, IEEE Consumer Electronics Magazine
Opting Out
https://www.privacyrights.org/
http://www.stopdatamining.me/opt-out-list/
From the SSIT Blog
An Asian firm, “Deep Knowledge” has appointed a virtual director to their Board. In this case it is a construct designed to detect trends that the human directors might miss.
One suspects that Apple might want a model of Steve Jobs around for occasional consultation, if not back in control again
AI and Ethics
The Partnership on AI Ethics
http://www.partnershiponai.org/
IBM, Google, Microsoft, Amazon, Facebook
IEEE Standards – Autonomous Systems Ethics
http://standards.ieee.org/news/2016/ieee_autonomous_systems.html
Resources
http://www.bigbrotherawards.org/ (European – Privacy International)
“Saving Rhinos with Predictive Analytics” IEEE Computer Society “Edge”
IEEE Computer Magazine, April 2016, Special Issue on Big Data
http://bigdata.ieee.org/
https://sites.google.com/site/io/underneath-the-covers-at-google-current-systems-and-future-directions
https://applymagicsauce.com/ Cambridge Univ. Evaluation tool
SSIT
IEEE’s Forum for Academic, Practical and Policy dialog on the Impact of Technology on Society
Engineers and Technologists who care about how their products, discoveries, and services will affect humanity
• Conferences world wide
• Quarterly publication
• Ongoing social media interactions
• Perennial issues to consider as technology happens
Major topics include:
Privacy, Security, Health, Ethics, Equity, Quality of Life
As affected by technology such as:
NanoTech, Genomics, networks, computing, RFID, drones
Social Media – Public Dialog
Blog and comments
LinkedIn Group
Facebook Group
YouTube Channel
WWW.IEEESSIT.ORG
Questions?
Answers???
Thank You
Alpha: the first step towards Omega
1992: DEC introduces Alpha, the first 64-bit commercial computer chip …
64 bits can directly address 16 EB (exabytes, 16 billion GB) of “real” memory … and Alpha was the fastest chip, so it could seriously index lots of data
The Alpha app: AltaVista – 1995, the first web index
1997 – IBM introduces 16 GB disk array
Donald Knuth, Stanford
Volume 3 (first ed. 1973)
Sorting and Searching, Second Edition (Reading, Massachusetts: Addison-Wesley, 1998), xiv+780pp.+foldout. ISBN 0-201-89685-0
Advisor and mentor to two students: Larry Page and Sergey Brin decided to implement a full version – 1998. They call it “Google”
A side note on performance
Computer cycle times from MIPS to GIPS (instructions per second)
Disk rotation latency (half turn average)
Seek Time (1/3 of disc surface average)
Solid State Drives change the game again
Add DNA and Intel’s new chip 3Dxxx?
Delay source            Delay     Instructions in that time
Rotation, 4,200 RPM     7.14 ms   7 million instructions
Rotation, 15,000 RPM    2 ms      2 million instructions
Seek                    100 ms    100 million instructions
Seek                    4 ms      4 million instructions
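The rotation figures follow from the half-turn average, and the instruction counts from an assumed rate of one billion instructions per second (1 GIPS). A minimal sketch; note that 7.14 ms is the half-turn time of a 4,200 RPM drive, and the function names are illustrative.

```python
def half_turn_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def instructions_in(delay_ms: float, instructions_per_second: float = 1e9) -> float:
    """Instructions a processor could have executed during the delay.

    The default rate of 1 GIPS is an assumption that matches the table.
    """
    return delay_ms / 1000 * instructions_per_second


for rpm in (4_200, 15_000):
    delay = half_turn_ms(rpm)
    print(f"{rpm:>6} RPM: {delay:.2f} ms, "
          f"about {instructions_in(delay) / 1e6:.0f} million instructions")
```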
Emerging “Tricks”
For highly compact storage (not fast)
DNA tools are being developed: massive storage, slow access
• Intel 3-D “Optane” memory:
• Pushing “flash ram” capabilities into higher speed, more dense devices
• Data tools are now doing “memory first” operations, expecting terabytes of RAM
Data for Good “movement”
DataKind.org
– Harnessing the power of data science in the service of humanity
– DataKind is a unique way to build your skills and network with top data scientists around the world
The Data for Good Exchange is part of a long Bloomberg tradition of advocacy for using data science and human capital to solve problems at the core of society