http://poloclub.gatech.edu/cse6242
CSE6242 / CX4242: Data & Visual Analytics
Analytics Building Blocks
Duen Horng (Polo) Chau, Assistant Professor; Associate Director, MS Analytics; Georgia Tech
Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos
What is Data & Visual Analytics?
No formal definition!
Polo’s definition: the interdisciplinary science of combining computation techniques and interactive visualization to transform and model data to aid discovery, decision making, etc.
What are the “ingredients”?
Need to worry (a lot) about: storage, complex system design, scalability of algorithms, visualization techniques, interaction techniques, statistical tests, etc.
It wasn’t this complex before the big data era. Why?
What is big data? Why should you care? (“Big data” is a buzzword; so is “IoT”, the Internet of Things.)
• large amount of info
• Healthcare (EHRs; mobile health sensor data)
• personal history; sickness
• Sports (baseball, basketball, soccer)
• Navigation and maps
• Business (“need analysis”: infer what clients want)
• Social media
• Sensor networks
• Stock market (high-frequency trading)
• Supply chain logistics
(From previous class) What is big data? Why care?
• Many companies’ businesses are based on big data (Google, Facebook, Amazon, Apple, Symantec, LinkedIn, and many more)
• Web search
  • Rank webpages (PageRank algorithm)
  • Predict what you’re going to type
• Advertisement (e.g., on Facebook)
  • Infer users’ interests; show relevant ads
  • Infer what you like, based on what your friends like
• Recommendation systems (e.g., Netflix, Pandora, Amazon)
• Online education
• Health IT: patient records (EMR)
• Bio and chemical modeling
• Finance
• Cybersecurity
• Internet of Things (IoT)
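To make the "Rank webpages (PageRank algorithm)" bullet concrete, here is a minimal sketch of PageRank by power iteration. It is an illustration only, not how any search engine actually implements it: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the four-page toy graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank scores by power iteration.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical toy representation of a web graph).
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Every other page splits its rank equally among the pages it links to.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Assumed toy web: A links to B and C, B and D link to C, C links back to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page C receives links from three other pages and page D receives none, so C ends up ranked highest and D lowest. The scores always sum to 1, so they can be read as the share of "attention" each page gets.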
Good news! Many jobs! Most companies are looking for “data scientists”
“The data scientist role is critical for organizations looking to extract insight from information assets for ‘big data’ initiatives and requires a broad combination of skills that may be fulfilled better as a team.”
Source: Gartner (http://www.gartner.com/it-glossary/data-scientist)
Breadth of knowledge is important.
This course helps you learn some important skills.
NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks. Shashank Pandit, Duen Horng (Polo) Chau, Samuel Wang, Christos Faloutsos. WWW 2007
NetProbe: The Problem
Find bad sellers (fraudsters) on eBay who don’t deliver their items
[Diagram labels: Buyer, $$$, Seller]
Auction fraud is #3 online crime in 2010 (source: www.ic3.gov)
Apolo: Making Sense of Large Network Data by Combining Rich User Interaction and Machine Learning. Duen Horng (Polo) Chau, Aniket Kittur, Jason I. Hong, Christos Faloutsos. CHI 2011.