-
Data Science: Data Visualization Boot Camp
What is R?
Chuck Cartledge, PhD
24 January 2020
-
Intro. What is R? RStudio R Basics Hands-on Q & A Conclusion
References Files
Table of contents (1 of 1)
1 Intro.
2 What is R?
   The language
   Availability
3 RStudio
   Basic how-tos (left side)
   Basic how-tos (right side)
4 R Basics
   Types of numbers
   Variables
   Operations and functions
5 Hands-on
6 Q & A
7 Conclusion
8 References
9 Files
-
What are we going to cover?
We’re going to talk about:
What is the language R?
What GUI do I use to write and execute R programs?
What are some basic variable types in R?
-
The language
The official definition.
“R is a language and environment for statistical computing and
graphics. It is a GNU project which is similar to the S language and
environment which was developed at Bell Laboratories (formerly
AT&T, now Lucent Technologies) by John Chambers and colleagues.
R can be considered as a different implementation of S. There are
some important differences, but much code written for S runs
unaltered under R.
R provides a wide variety of statistical (linear and nonlinear
modeling, classical statistical tests, time-series analysis,
classification, clustering, . . . ) and graphical techniques, and is
highly extensible. The S language is often the vehicle of choice for
research in statistical methodology, and R provides an Open Source
route to participation in that activity.”
CRAN Staff [2]
-
Availability
R is available for almost all major operating systems.
Linux (and its variants)
(Mac) OS X
Windows
Get the R environment and a command line interface.
Download from: https://cloud.r-project.org/
Source code is available for custom OSs: https://github.com/wch/r-source
-
Basic how-tos (left side)
A complete IDE
A complete, integrated R development environment.
1 Text editor
2 R console
3 Variable list and contents
4 Tabbed display for different uses
See the software overview and design document for version and
download information.
-
Basic how-tos (left side)
Editor
“Smart” editor
CTRL + O to open a file
CTRL + S to save a file
CTRL + A to highlight contents
CTRL + Enter to transfer contents to Console
Multiple files can be opened at once
-
Basic how-tos (left side)
Console
Interprets R commands
Commands come from the editor, other panels, or manual entry
Execution errors appear here
Output of the print function appears here
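As a small illustration of what shows up in the console, the sketch below prints a value and then triggers an error on purpose. The variable names and the failing expression are illustrative only and are not taken from the boot camp files.

```r
# Output of print() appears in the console.
greeting <- "hello from the editor"
print(greeting)

# An execution error also appears in the console. tryCatch() is used here
# only so that the script keeps running after the error is raised.
result <- tryCatch(
  -1:1 + "a",                              # adding a number to a string fails
  error = function(e) conditionMessage(e)  # keep the error text as a string
)
print(result)
```

Running the same lines without tryCatch() would show the error message directly in the console and stop the script at that line.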
-
Basic how-tos (right side)
Variables
Displays contents of the selected environment (including variables)
Displays history of console commands
Can save and load data from data files
-
Basic how-tos (right side)
Tabbed display
Displays files in the current directory
Displays plots from the console
Allows packages to be added, or removed from the console
Provides help/man pages for R functions and packages
-
Basic how-tos (right side)
Starting an R script in the background
The image shows a Windows environment.
A *nix environment command is:
Rscript backend.R &
-
Basic how-tos (right side)
Basic help with functions [1]
1 Based on subject: help.search("data input")
2 Based on pattern matching: apropos("lm")
3 Looking for a specific item: find("lm")
4 About a specific item: ?lm or ??lm
5 Example of a function: example(lm)
6 Source code for a function: lm
7 Demonstration of a function: demo(persp)
8 Vignette (long-form documentation) from a package: vignette("moveline", package="grid")
9 Contents of a library: library(help=spatial)
10 Install a new library: install.packages("Kfn")
11 Which data are included in a package: data(package="ggplot2")
12 Which data are included in all packages: data(package = .packages(all.available = TRUE))
13 Find an overview of R packages: https://cran.r-project.org/web/views/
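A few of the help calls above return ordinary R values, so they can be explored from a script as well as interactively. The sketch below is one possible way to try them. It assumes only the standard "stats" and "datasets" packages, which ship with R, and the variable names are illustrative.

```r
# Pattern matching: names of visible objects that contain "lm".
matches <- apropos("lm")
print(head(matches))

# Which attached package provides lm()?
where <- find("lm")
print(where)

# Data sets shipped with a package. "datasets" is used here because it is
# part of a standard R installation, so no extra install is needed.
d <- data(package = "datasets")
print(head(d$results[, c("Item", "Title")]))
```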
-
Types of numbers
Lots of different number types
We’ll dive into each type shortly.
Other things:
Each builds on another.
Each may have attributes.
Each has a type.
Each has a class.
And, you can create your own.
-
Types of numbers
Definition of types (1 of 3)
Character: surrounded by " ("hi") or ' ('bye'). Special characters are escaped with \
Complex: a combination of a real and an imaginary number in the form a + bi
Data frame: a table or two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column
Date: number of days relative to January 1, 1970 (Unix dates)
Difftime: represents the amount of time between pairs of dates or date-times
Double: numbers can be specified in decimal (0.1234), scientific (1.23e4), or hexadecimal (0xcafe)
Factor: conceptually, factors take on a limited number of different values; such variables are often referred to as categorical variables
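The definitions above can be tried directly in the console. The following is a minimal sketch with made-up values; none of the variable names come from the boot camp scripts.

```r
ch  <- "say \"hi\""                       # character, with escaped quotes
cx  <- 3 + 2i                             # complex: a + bi
df  <- data.frame(id = 1:3, name = c("a", "b", "c"))  # data frame
d1  <- as.Date("2020-01-24")              # Date
d0  <- as.Date("1970-01-01")
gap <- d1 - d0                            # difftime between two dates
dbl <- c(0.1234, 1.23e4, 0xcafe)          # decimal, scientific, hexadecimal
fac <- factor(c("low", "high", "low"))    # factor with two levels

print(as.numeric(d1))   # days since January 1, 1970
print(class(gap))
print(typeof(dbl))
print(levels(fac))
```

Calling class() and typeof() on each object is a quick way to see the "each has a type, each has a class" point from the earlier slide.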
-
Types of numbers
Definition of types (2 of 3)
Integer: written like doubles but followed by an upper-case ell (L) (1234L, 1e4L, or 0xcafeL)
List: objects which can contain elements of different types, such as numbers, strings, vectors, and even another list
Logical: can have only one of two values (T[RUE] or F[ALSE])
NULL: represents the null object in R; used mainly to represent a list of zero length
Numeric: the default computational data type
POSIXct: Portable Operating System Interface (POSIX) is a family of cross-platform standards; “ct” stands for calendar time
POSIXlt: Portable Operating System Interface (POSIX) is a family of cross-platform standards; “lt” stands for local time
Raw: data is stored as raw bytes
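A short sketch of the types defined above, again with made-up values. The time zone is fixed to UTC only so that the result does not depend on the machine's settings.

```r
i   <- 1234L                               # integer (note the trailing L)
lst <- list(n = 1.5, s = "text", v = 1:3, inner = list(TRUE))  # list
lg  <- TRUE                                # logical
nul <- NULL                                # the null object
num <- 42                                  # numeric, stored as a double
t_ct <- as.POSIXct("2020-01-24 09:30:00", tz = "UTC")  # POSIXct
t_lt <- as.POSIXlt(t_ct)                   # POSIXlt version of the same time
rw  <- charToRaw("R")                      # raw bytes

print(typeof(i))
print(length(nul))
print(typeof(num))
print(unclass(t_ct))    # POSIXct is stored as seconds since 1970-01-01 UTC
print(t_lt$hour)        # POSIXlt exposes named components such as hour
print(rw)
```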
-
Types of numbers
Definition of types (3 of 3)
Scalar: an individual value (actually a vector of length 1)
Tibble: a modern take on data frames. Tibbles keep the features that
have stood the test of time, and drop the features that used to be
convenient but are now frustrating
Vector: a basic data structure in R. It contains elements of the
same type.
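The scalar and vector definitions can be checked with base R alone, as in the sketch below. Tibbles are left out because they need an add-on package, and the values shown are illustrative.

```r
s <- 5                       # a "scalar" is really a vector of length 1
v <- c(1.5, 2, 3)            # a vector; every element has the same type
mixed <- c(1, "two", TRUE)   # mixing types coerces everything to character

print(is.vector(s))
print(length(s))
print(typeof(v))
print(typeof(mixed))
```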
-
Variables
Variable types (part 1 of 2)[3]
1 Variable names:
Names are case sensitive
Names cannot begin with numbers or special symbols
Names cannot have internal spaces
2 Scalars (simple values):variable
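The naming rules and scalar assignment can be sketched as follows. The variable names are hypothetical, and make.names() is used only to show how R would repair names that break the rules.

```r
count <- 3        # assignment with <-
Count <- 30       # a different variable: names are case sensitive
my_value <- 2.5   # an underscore (or dot) takes the place of a space

print(count)
print(Count)

# make.names() turns invalid names into valid ones.
fixed <- make.names(c("2fast", "my value", "ok_name"))
print(fixed)
```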
-
Variables
Variable types (part 2 of 2)[3]
1 Data frames (each column must have the same number of
values):L3
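A minimal data frame sketch, assuming made-up column names and values. It also shows what happens when columns do not have the same number of values.

```r
people <- data.frame(
  name   = c("Ann", "Bob", "Cy"),
  age    = c(31, 45, 27),
  member = c(TRUE, FALSE, TRUE)
)
print(people)
print(dim(people))     # number of rows, number of columns
print(people$age)      # one column, returned as a vector

# Columns of unequal length are rejected with an error.
bad <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) "error: column lengths differ"
)
print(bad)
```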
-
Operations and functions
Operation
The basic data type is a vector.
It is easy to create a vector; one way is as a sequence
x
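The slide's own example did not survive extraction, so the following is only a sketch of creating vectors as sequences. The names y and z and the specific ranges are assumptions.

```r
x <- 1:10                    # the integers 1 through 10
y <- seq(0, 1, by = 0.25)    # a sequence with a fixed step
z <- x * 2                   # arithmetic is applied element by element

print(x)
print(y)
print(z)
print(sum(x))
```

Because operations work on whole vectors, most simple calculations need no explicit loop.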
-
Operations and functions
Functions are supported
1 Have the same naming conventions as variables
2 Have three parts:
   1 Optional pass parameters (named, evaluated, unnamed)
   2 Text of the function
   3 The environment where and while the function executes
3 The last value evaluated is returned.
4 Statements are grouped by “curly braces” and can be separated by semicolons.
functionName
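The points above can be sketched with a small function. The name rescale and its parameters are hypothetical and are not the example from the original slide, which was lost in extraction.

```r
# A hypothetical function: map values onto the range [lower, upper].
rescale <- function(values, lower = 0, upper = 1) {
  span <- max(values) - min(values)
  scaled <- (values - min(values)) / span
  lower + scaled * (upper - lower)    # the last value evaluated is returned
}

print(rescale(c(2, 4, 6)))               # unnamed argument, defaults used
print(rescale(c(2, 4, 6), upper = 10))   # one named argument

# The three parts of a function can be inspected directly.
print(names(formals(rescale)))   # the parameters
print(body(rescale))             # the text of the function
print(environment(rescale))      # the environment the function belongs to
```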
-
Some simple exercises to get familiar with R and RStudio
1 Create a variable and assign it the value 3
2 Print your variable
3 Create a function that takes one parameter and returns the square of that value
4 Use your function to compute the square of 45
5 Print the value of the passed parameter inside the function
6 Open the file library.R and explain what the function dumpObject does
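One possible approach to exercises 1 through 5 is sketched below for checking your own work. The names my_var, square, and answer are arbitrary choices. Exercise 6 depends on the course file library.R and is not covered here.

```r
my_var <- 3                    # exercise 1
print(my_var)                  # exercise 2

square <- function(value) {    # exercise 3
  print(value)                 # exercise 5: show the passed parameter
  value^2                      # the last value evaluated is returned
}

answer <- square(45)           # exercise 4
print(answer)
```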
-
Q & A time.
Q: Do you know what the death rate around here is?
A: One per person.
-
What have we covered?
Covered a little bit of R’s background
Looked at RStudio, a cross-platform GUI for working with R
Looked at some R basics (variable types and functions)
Next: what is data visualization anyway?
-
References (1 of 1)
[1] Michael J. Crawley, The R Book, John Wiley & Sons, 2012.
[2] CRAN Staff, What is R?, https://www.r-project.org/about.html, 2017.
[3] Simon Walkowiak, Big Data Analytics with R, Packt Publishing Ltd., 2016.
-
Files of interest
1 Software installation
-
Software in Support of the Old Dominion University College of
Continuing Education and Professional Development
Big Data: Data Visualization Boot Camp
Chuck Cartledge
November 24, 2019
Contents
1 Introduction
2 Discussion
3 Conclusion
A Software on each workstation
B Software installation checkout
C Files
1 Introduction
This report is a work in progress. It lists the software needed and
used in support of the Old Dominion University (ODU) College of
Continuing Education and Professional Development (CEPD) Big Data:
Data Visualization boot camp.
2 Discussion
Software will be needed on each virtual machine for the boot
camp. This draft report contains a list of needed software, R
scripts to install necessary libraries, and simple R scripts to
test the installation (see Section B).
-
3 Conclusion
After installing all the software identified in this report on
their personal computers, students will be able to replicate all
boot camp activities.
A Software on each workstation
This section contains the assumptions about the operating system
environment and the software load-out for each workstation.
1. Operating system: Windows 7
2. Software
(a) R
• Version: 3.3.2
• Available from: https://cran.r-project.org/bin/windows/base/
(b) R Packages
An install script is available to programmatically download the needed
libraries (see Section B). The list of libraries/packages includes:
• bitops
• cluster.datasets
• clusterSim
• colorspace
• colourlovers
• dplyr
• ellipse
• gcookbook
• geosphere
• getopt
• ggmap
• ggplot2
• ggpubr
• gnm
• grDevices
• grid
• gridBase
• gridExtra
• httr
• jpeg
• kernlab
• KernSmooth
• knitr
• magrittr
• mapdata
• maps
• methods
• modeest
• mvtnorm
• NISTunits
• oec
• OpenStreetMap
• pdftools
• plotrix
• plyr
• png
• purrr
• RColorBrewer
• RCurl
• readr
• readxl
• reshape
• rgdal
• rgl
• rglwidget
• rJava
• rjson
• scales
• sf
• sp
• sphereplot
• tidyr
• tm
• USAboundaries
• UScensus2000tract
• utils
• vcd
• vcdExtra
• xlsx
• xlsxjars
• XML
(c) R-Studio
• Version: 0.99.903
• Available from: https://www.rstudio.com/products/rstudio/download/
(d) wget
• Version: 1.*
• Available from: https://eternallybored.org/misc/wget/
The PATH environment variable should be updated to include the
location of the R interpreter.
-
B Software installation checkout
There is an extensive list of software to be installed to
support the boot camp. After the software is installed, it is
necessary to configure the software and test that it is installed
correctly. A number of detailed procedural files and R scripts are
included in this document (see Section C) to facilitate the
installation checkout. The R script files can be run in RStudio,
or any other R environment that supports setting the current
working directory.
The checkout is:
1. Associate the file extension “.R” with the RStudio
program.
2. Set the current RStudio working directory to the location of
installLibraries.R and run the installLibraries.R script. There
should be no errors.
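The contents of installLibraries.R are not reproduced in this document, so the following is only a sketch of how an install-and-check step might be written. The helper name missing_packages and the three-package subset are assumptions for illustration, and the install call is commented out because it needs network access.

```r
# Hypothetical helper: report which of the requested packages are missing.
missing_packages <- function(needed) {
  installed <- rownames(installed.packages())
  setdiff(needed, installed)
}

# A short, illustrative subset of the package list in Section A.
needed <- c("ggplot2", "dplyr", "maps")
to_install <- missing_packages(needed)
print(to_install)

# Installation needs network access, so it is left commented out here.
# if (length(to_install) > 0) {
#   install.packages(to_install, repos = "https://cloud.r-project.org/")
# }

# Checkout: TRUE marks packages that load, FALSE marks ones still missing.
ok <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(ok)
```

Installing only what is missing keeps repeated runs of the script fast, which helps when the checkout is repeated on many workstations.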
-
C Files
A collection of miscellaneous files mentioned in the report.
• installLibraries.R – an R script to install all necessary
libraries/packages from “the cloud”
A complete collection of files (presentations, data, scripts,
etc.) can be downloaded from the boot camp web site using this *nix
command:
wget -np -r
https://www.cs.odu.edu/~ccartled/Teaching/2019-Spring/DataVisualization/
or, this Windows command
wget -r -np -nH --cut-dirs=3 -R index.*
https://www.cs.odu.edu/~ccartled/Teaching/2019-Spring/DataVisualization/
The Windows version of wget sometimes leaves “trashy” files
behind, like “index.html@C=D;O=A” and so on. These files are not
part of the boot camp web page, and can be removed or ignored.
None of the boot camp scripts use or process these files. The *nix
version of wget does not leave trashy files.
These commands are also located in:
https://www.cs.odu.edu/~ccartled/Teaching/2020-Spring/DataVisualization/Errata/wget.txt
rm(list=ls())
getNeededPackageList