Overview of Statistical Software: SPSS, Stata, SAS, R
Debby Kermer
Data Services
George Mason University
Software
SPSS   v 25     spss.com
Stata  v 15     stata.com
SAS    v 9.4    sas.com
R      v 3.5.1  r-project.org
Pros and Cons
            SPSS       Stata     SAS        R
Use         High       Low       High       Growing
Jobs        Some       Academic  Many       More
Cost        Expensive  Depends   Expensive  Free
Learning    Easy       Middle    Hard       Very Hard
Extensible  Scripts    Users     Built-in   Users
What can it do well?
SPSS
ANOVA, Factor Analysis, Discriminant Analysis
License modules separately: Trends, Missing Data, Tables
Stata
Regression, diagnostics, and robust regression; Analysis of Survey Data, Time Series, SEM
Freely downloadable packages
SAS
Data Management; Complex Models; Mixed Model Analysis
License components separately: SAS/GIS, SAS/STAT, SAS/ACCESS
R
Anything, if you can find a [well written] package
Download additional packages from CRAN for free
Who Uses it?
SPSS
Academic: Social Scientists (the “SS”), and non-scientists
Non-Academic: Companies that just want to do neat things
Stata
Academic: Economics, Public Policy, Biomedical Researchers
Non-Academic: Groups that often work with academics
SAS
Academic: Statistics, Medicine
Non-Academic: Government, and corporations that are serious about data
R
Academic: Statistics, various
Non-Academic: Small companies with big plans, and others serious about data
Which to Pick?
SPSS
Easy to start, limited capability
Best for those with infrequent and/or minimal needs
Stata
Easy syntax, highly extensible
Best for academics doing cutting-edge research
SAS
Hard to learn, highly capable
Best for managing huge and/or complex datasets
R
Hard to learn, highly extensible
Best for those who program and know what they are doing
Job Prospects
R vs SAS vs Python
http://www.burtchworks.com/2016/07/13/sas-r-python-survey-2016-tool-analytics-pros-prefer/
Survey of selected “quantitative professionals”, 2016
Use in Academia
Number of scholarly articles on Google Scholar, 2015
http://r4stats.com/articles/popularity/
Use in Industry
Number of analytics jobs on Indeed.com, February 2014
http://r4stats.com/articles/popularity/
Companies using it
http://blog.datacamp.com/statistical-language-wars-the-infograph/
Use
Interface
[Screenshots: SPSS, Stata, SAS, R]
GUI
[Screenshots: SPSS, Stata, SAS Studio, R with Deducer & R Cmdr]
Syntax
Contingency table for variables q1 and q2, with only n, row %, and χ2 test
SPSS
CROSSTABS /TABLES=q1 BY q2 /STATISTICS=CHISQ /CELLS=COUNT ROW.
Stata
tabulate q1 q2, row chi2
SAS
PROC FREQ data=test;
table q1*q2 / NOCOL NOPERCENT CHISQ;
RUN;
R
mytable <- table(q1, q2)
mytable
prop.table(mytable, 1)
chisq.test(mytable)
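The R version can be tried end to end with a small made-up dataset. This is only a sketch: the data frame `survey` and its values are invented for illustration, and the three steps (counts, row percentages, χ2 test) mirror the R lines above.

```r
# Hypothetical survey data: 100 respondents, two categorical questions
survey <- data.frame(
  q1 = rep(c("yes", "no"), times = c(60, 40)),
  q2 = c(rep(c("a", "b"), times = c(40, 20)),   # answers for q1 == "yes"
         rep(c("a", "b"), times = c(15, 25)))   # answers for q1 == "no"
)

# Cell counts (n)
mytable <- with(survey, table(q1, q2))
mytable

# Row percentages: margin 1 means each row sums to 100
row_pct <- prop.table(mytable, 1) * 100
round(row_pct, 1)

# Chi-squared test of independence (Yates correction is applied to 2x2 tables by default)
chi <- chisq.test(mytable)
chi
```

Unlike the one-line commands in the other packages, each piece of output is requested separately and can be stored as its own object.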
Learning Curve
http://guides.nyu.edu/quant/statsoft#s-lib-ctab-6295863-7
Important Differences
Working with multiple files
SPSS Multiple datasets allowed, active data can be specified
Stata One dataset at a time, allows multiple instances
SAS Data always specified, no datasets in memory
R Data always specified, multiple objects in memory
Directories & Data Files
SPSS   cd "directory"            filename.sav
Stata  cd "directory"            filename
SAS    libname name "directory"  name.filename
R      setwd("directory")        filename.RData   (use / or \\ in paths)
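For the R row, a minimal sketch of setting the working directory and then saving and loading an .RData file by name alone. The directory `project_data`, the file `scores.RData`, and the example paths are hypothetical; a temporary folder is used so nothing permanent is written.

```r
# Hypothetical project folder inside R's temporary directory
data_dir <- file.path(tempdir(), "project_data")
dir.create(data_dir, showWarnings = FALSE)

old_dir <- setwd(data_dir)   # setwd() returns the previous directory

# Made-up data, saved relative to the working directory
scores <- data.frame(id = 1:3, q1 = c(2, 5, 4))
save(scores, file = "scores.RData")

rm(scores)
load("scores.RData")         # restores the object under its original name

# Path separators: forward slashes work everywhere;
# backslashes must be doubled inside R strings
unix_style <- "C:/Users/me/data"
win_style  <- "C:\\Users\\me\\data"

setwd(old_dir)               # return to where we started
```

Note that load() restores objects under the names they were saved with, so one .RData file can hold several datasets.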
Labeled/Categorical Variables
SPSS   separate  VALUE LABELS assigns labels to levels
Stata  shared    label define creates a 'label'
SAS    shared    PROC step creates label 'formats'
R      separate  defining a 'factor' creates labels for levels
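In R, the labels are attached when the factor is created. A short sketch, using made-up codes and labels:

```r
# Hypothetical coded responses: 1 = Low, 2 = Medium, 3 = High
q1 <- c(1, 2, 2, 3, 1)

# Defining the factor assigns a label to each level of this one variable
q1_f <- factor(q1, levels = c(1, 2, 3), labels = c("Low", "Medium", "High"))

levels(q1_f)      # the labels, in level order
table(q1_f)       # counts are reported by label
as.integer(q1_f)  # the underlying level positions
```

Because the labels live inside each factor, a second variable with the same coding needs its own factor() call; this is the "separate" behavior in the table above.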
Missing Values
       Symbol  Treated as                > #    < #
SPSS   .       no value or user defined  FALSE  FALSE
Stata  .       highest possible value    TRUE   FALSE
SAS    .       lowest possible value     FALSE  TRUE
R      NA      no value, comparable      TRUE   TRUE
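One way to read the R row: a comparison with NA returns NA rather than TRUE or FALSE, and plain logical subsetting then keeps an NA entry whichever direction the comparison goes. A small sketch with a made-up vector:

```r
# Hypothetical scores with one missing value
x <- c(3, 8, NA, 12)

cmp <- x > 5               # comparison yields NA for the missing element
cmp

kept <- x[x > 5]           # logical subsetting carries the NA along
kept

strict <- x[which(x > 5)]  # which() keeps only positions that are TRUE
strict

is.na(x)                   # the explicit test for missingness
mean(x, na.rm = TRUE)      # most summaries need na.rm = TRUE to skip NA
```

Wrapping the condition in which(), or using subset(), is a common way to keep missing cases from being selected.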
Code Characteristics
       Code File    Code Prompt  Command End    Case Sensitive  Code Comment
SPSS   Syntax File  [nothing]    .              No              *
Stata  Do file      .            [line break]   Yes             *
SAS    Program      [line #]     ;              No              *
R      R Script     > or +       [interpreted]  Yes             #
Data Files
Files
       Data           Syntax     Output        Others
SPSS   .sav           .sps       .spo / .spv   .por
Stata  .dta           .do        .smcl / .log  .dct
SAS    .sas7bdat      .sas       .lst / .log   .sas7???
R      .RData / .rda  .R / .txt  .txt          .R??
Opening other File Types in…
SPSS
Can open Stata and SAS files directly
Stata
Use usespss, R, or Stat/Transfer (commercial)
SAS
Can import SPSS and Stata files directly
R
Use packages foreign or haven to convert
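As a rough sketch of the R route, the foreign package can write and read Stata files. The data and file name below are hypothetical, and the file is written to a temporary folder so the example is self-contained. It assumes foreign is installed, which is the case for most R installations since it is a recommended package.

```r
library(foreign)

# Made-up data, written out as a Stata .dta file
scores <- data.frame(id = 1:3, q1 = c(2, 5, 4))
dta_path <- file.path(tempdir(), "scores.dta")
write.dta(scores, dta_path)

# Read the Stata file back into an R data frame
from_stata <- read.dta(dta_path)
str(from_stata)
```

foreign also provides read.spss() for .sav files. The haven package offers read_dta(), read_sav(), and read_sas() for the same purpose and may be the better choice for files from recent Stata versions, which read.dta() does not handle.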
Resources
Help transitioning, links to help for each software
http://dataservices.gmu.edu/resources/software
Single Statistical Software Initiative
https://wikis.uit.tufts.edu/confluence/display/SSSI/Home