-
Introduction Overview Administrivia Process Overview Q & A
Conclusion References Files Vita
Big Data: Data Analysis Boot Camp
Introduction and Overview
Chuck Cartledge, PhD
29 March 2019
© Old Dominion University
-
Table of contents (1 of 1)
1 Introduction
   The global view
2 Overview
   The world from 50,000 feet.
   Text
3 Administrivia
   Miscellaneous and necessary things
4 Process Overview
5 Q & A
6 Conclusion
7 References
8 Files
9 Vita
-
The global view
Big Data: Data Analysis Boot Camp
We will cover aspects common to all Big Data investigations, including: defining Big Data, surveying tools and techniques for processing Big Data, and visualizing selected aspects of Big Data.
The emphasis of the camp is to understand what Big Data data analysis is, beyond the marketing hype of the 3Vs of volume, variety, and velocity.
Image from [1].
More detailed information at: https://www.odu.edu/cepd/bootcamps/data-analysis
-
The world from 50,000 feet.
Things we’ll be covering over the next three days:
Friday
1 Administrivia
2 What is BD?
3 What is R?
4 Looking at the built-in iris and Titanic datasets
Saturday
1 Visualizing data with different packages
2 Exploring cluster analysis (of different types)
3 Linear regression and some variants
4 Classification techniques
5 Text analysis
6 Serial vs. parallel processing
Sunday
1 R limitations
2 R and Hadoop
3 R and SQL and No-SQL DBMs
4 Hands-on with real-world crime data
5 Wrap-up
-
Text
Not required reading, but referenced throughout our time together.
Learning Predictive Analytics with R (LPAR)
Big Data Analytics with R (BDAR)
Not necessary, but really helpful.
-
Text
Code samples
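There are lots. And, they look like this:
# Load the package that provides the 1970 U.S. city crime data
library(cluster.datasets)
data(all.us.city.crime.1970)
crime <- all.us.city.crime.1970
# Plot columns 5 through 10 of the data frame against each other
plot(crime[5:10])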
Available in a separate file embedded in each presentation.
-
Miscellaneous and necessary things
All things related to paperwork.
Parking – front and back without permits
Breaks – yes, we’ll have them.
Lunch – yes, places nearby: right at main light to “fast food”
Text books – recommended but not necessary; they have good ideas and techniques
Non-credit option
Credit option – two additional assignments
Hours – 9 AM to 5 PM with a break for lunch
Sunday access – yes, check in with security
Soft copies – all presentations and software are available
Computer logins and passwords – will be coordinated
Break room – across hall
Bathrooms – around elevator
Other things as well.
-
Miscellaneous and necessary things
Soft copies available from Internet
All information (presentations, scripts, and data) is available on your VM desktop (static)
All information is available via the I’net (dynamic)
Errata updated nightly
I’m not a web designer, nor do I play one on TV.
http://www.cs.odu.edu/~ccartled/Teaching/
-
Miscellaneous and necessary things
Same image.
http://www.cs.odu.edu/~ccartled/Teaching/
-
How do Data Wrangling, Analysis, and Visualization fit together?
Notionally, there are three distinct phases in data analysis.
1 Data wrangling – getting the raw data into a usable form
2 Data analysis – evaluating and understanding the data
3 Data visualization – presenting the analytical results in an intelligible manner
Management continues across all phases. The other phases may overlap.
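The three phases can be sketched in a few lines of R. This is only a minimal illustration under stated assumptions, not material taken from the boot camp: it uses the iris data set that ships with base R, and the renamed columns and the variable names flowers and mean_by_species are hypothetical choices made for this example.

```r
# Illustrative sketch only; iris ships with base R, all other names are made up.

# 1. Data wrangling: get the raw data into a usable form.
flowers <- iris
# Normalize column names, e.g. "Petal.Length" becomes "petal_length".
names(flowers) <- tolower(gsub(".", "_", names(flowers), fixed = TRUE))
# Drop any rows that contain missing values.
flowers <- flowers[complete.cases(flowers), ]

# 2. Data analysis: evaluate and understand the data.
# Mean petal length for each species.
mean_by_species <- aggregate(petal_length ~ species, data = flowers, FUN = mean)
print(mean_by_species)

# 3. Data visualization: present the result in an intelligible manner.
barplot(mean_by_species$petal_length,
        names.arg = as.character(mean_by_species$species),
        ylab = "Mean petal length (cm)",
        main = "Iris petal length by species")
```

In practice the phases rarely stay this tidy: a surprising plot often sends you back to the wrangling step, which is one way the phases can overlap.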
-
Q & A time.
Q: What is the square root of 4b^2?
A: To be or not to be.
-
What have we covered?
Where we are.
Where we’re going.
How we’ll get there.
Now!! On to exploring the world of Big Data!
-
References (1 of 1)
[1] Vangie Beal, Big Data, https://www.webopedia.com/TERM/B/big_data.html, 2017.
-
Files of interest
1 Code snippets
-
Who am I?
Father
Husband (only 42 years, but it seems longer)
PhD, Computer Science, 2014
CAPT, USN retired 2004 (31+ years)
Professional software developer (38 years)
A perennial student
1st computer: 1970, donated ICBM guidance computer, machine code, paper/mylar tape, and drum memory
Interests: autonomic systems, real-time applications, distributed processing, long-term preservation of digital data, Big Data