Ann Arbor ASA Up and Running Series: SAS Sponsored by the Ann Arbor Chapter of the American Statistical Association and the Department of Statistics of the University of Michigan

Apr 02, 2015


Ann Arbor ASA Up and Running Series:

SAS

Sponsored by the Ann Arbor Chapter of the American Statistical Association and the Department of Statistics of the University of Michigan


Contents

• Starting SAS
• User Interface
• Libraries
• Syntax
• Getting Data into SAS
• Examining Data
• Manipulating Data
• Descriptive Statistics
• Graphing Data
• Statistics in SAS


Starting SAS

Start SAS 9.3 (English)


User Interface

• Log: comments, warnings, etc.
• Program Editor: write and submit commands
• Output (not seen)
• Explorer / Results


Libraries

• SAS requires a Library (a folder on disk) in which to save data
– Libraries are assigned with the LIBNAME statement
• Four Libraries are defined by default at the start of SAS
– Maps
– SASHELP: holds help info and sample datasets
– SASUSER: holds settings, etc.
– WORK: default temporary Library for each session
   • All data stored in WORK is deleted at the end of each SAS session
• Creating permanent Libraries is recommended for data you want to keep


Libraries

• Create a folder called ‘my_files’ on your desktop.

• Run this command in SAS: LIBNAME a "C:\Users\uniquename\Desktop\my_files";

• Refer to datasets in that folder with the prefix ‘a.’ (i.e., ‘a.datasetname’).

• TIP: Use memorable names for libraries, rather than ‘a’ (e.g., ‘raw’, ‘final’, ‘time1’, etc)
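A minimal sketch of the tip above, not part of the original slides. It assumes the my_files folder already exists and uses sashelp.class (a sample dataset shipped with SAS) as a stand-in for the class2 dataset that appears in later slides. The library name raw, the dataset names, and the path are placeholders.

```sas
/* Assumption: replace 'uniquename' with your own login name. */
LIBNAME raw "C:\Users\uniquename\Desktop\my_files";

/* Two-level name: raw.class2 is written to my_files and     */
/* is still there after the SAS session ends.                */
DATA raw.class2;
    SET sashelp.class;
RUN;

/* One-level name: class_temp goes to the temporary WORK     */
/* library and is deleted when the session ends.             */
DATA class_temp;
    SET sashelp.class;
RUN;
```

After running this, the Explorer window should show class2 under the raw library and class_temp under Work.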


Syntax

• SAS divides commands into two groups
– DATA step: create/alter datasets
– PROC (Procedures): perform statistical analyses or generate reports
• Some exceptions to the rule:
– DATA step can be used to generate reports
– PROC IMPORT creates a data set
– PROC SORT alters data sets (without telling you!)
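As an illustration of the two groups, here is a short sketch that pairs one DATA step with one PROC step. It is not from the original slides; it assumes the sashelp.class sample dataset, and work.teens is a hypothetical dataset name.

```sas
/* DATA step: creates a new dataset from an existing one.      */
DATA work.teens;
    SET sashelp.class;
    IF age >= 13;        /* subsetting IF keeps only these rows */
RUN;

/* PROC step: reports on the dataset without changing it.      */
PROC MEANS DATA=work.teens;
    VAR height weight;
RUN;
```

Each step ends at its RUN statement, so the two can be submitted together or one at a time.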


Getting data into SAS

• PROC IMPORT
– Allows the reading of standard file types
– Allows the reading of plain text, with user-specified delimiters (i.e., the characters which separate the data)
– WARNING: SAS changed PROC IMPORT for Excel and Access files in 64-bit SAS
• DATA step
– Allows the reading of non-standard file types, complex file structures, and unusual delimiters.
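A hedged sketch of the PROC IMPORT route for plain-text files. The file names class.csv and class_pipe.txt are hypothetical, as are the output dataset names; both files are assumed to have variable names in the first row.

```sas
/* Comma-separated file */
PROC IMPORT DATAFILE="C:\Users\uniquename\Desktop\my_files\class.csv"
            OUT=work.class_csv
            DBMS=CSV
            REPLACE;          /* overwrite work.class_csv if it exists */
    GETNAMES=YES;             /* first row holds the variable names    */
RUN;

/* Plain text with a user-specified delimiter (here a pipe) */
PROC IMPORT DATAFILE="C:\Users\uniquename\Desktop\my_files\class_pipe.txt"
            OUT=work.class_pipe
            DBMS=DLM
            REPLACE;
    DELIMITER='|';
    GETNAMES=YES;
RUN;
```

Check the Log window after each import; it reports how many rows and variables were read.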


DATA step

• SAS syntax can be used to read in raw data files (.txt, .csv files), specifying which variables to read in, which ones are text/numeric, combining multiple rows into one case, etc.

• However, this is a more advanced topic.
– Follow up with an Intro class from CSCAR, or by going through examples from the literature (e.g., ‘The Little SAS Book’).
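For orientation only, a minimal sketch of reading a raw file with a DATA step. The file class.csv and its column layout (name, sex, age, height, weight, with a header row) are assumptions made for illustration and do not come from the slides.

```sas
DATA work.class_raw;
    /* DLM=','   : values are separated by commas             */
    /* DSD       : treat consecutive commas as missing values */
    /* FIRSTOBS=2: skip the header row                        */
    /* TRUNCOVER : do not jump to the next line on short rows */
    INFILE "C:\Users\uniquename\Desktop\my_files\class.csv"
           DLM=',' DSD FIRSTOBS=2 TRUNCOVER;

    /* Declare text variables and their lengths before INPUT  */
    LENGTH name $ 20 sex $ 1;

    /* $ marks text variables; the rest are read as numeric   */
    INPUT name $ sex $ age height weight;
RUN;
```

Unlike PROC IMPORT, nothing is guessed here: the INPUT statement states which variables to read and which are text.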


Examining Data

• VIEWTABLE Window
– Select dataset icon in Explorer
• PROC CONTENTS
– Produces a listing of data set information, including the variables and their properties
• PROC PRINT
– Prints a subset of variables or cases to the output window


PROC CONTENTS

• In the Editor window, type:

PROC CONTENTS data=a.class2;
run;

• Highlight the syntax
• Submit for processing
– Click on the ‘running-man’ icon, or
– Right-click on the selected syntax and choose Submit Selection


PROC PRINT

• In the Editor window, type:

PROC PRINT data=a.class2;
run;

• Submit for processing
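PROC PRINT can also be limited to a subset of variables or cases. The sketch below assumes a.class2 contains the variables name, age and height, as the later slides suggest; the cut-off values are arbitrary.

```sas
/* First 5 cases, three variables only */
PROC PRINT DATA=a.class2 (OBS=5);
    VAR name age height;
RUN;

/* Only the cases that meet a condition */
PROC PRINT DATA=a.class2;
    WHERE age > 13;
    VAR name age height;
RUN;
```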


Manipulating Data

• Usually done within a data step
– Match data sets using a shared key variable
– Create new variables, or drop/rename existing variables
– Take one or more subsets of the data
– Sort the data by specific variable(s).
• Overwrite existing or create new datasets
– PROC SORT
– Adding/Removing variables
– Merging Datasets


PROC SORT

• In the Editor window, type:

PROC SORT data=a.class2 out=a.class2sorted;
by age descending weight height;
run;

• Submit for processing

• WARNING: PROC SORT alters the input dataset unless told otherwise
– Store the sorted data in a new dataset with out=newdatasetname
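Two details of the BY statement are easy to miss, so here is a small sketch, assuming a.class2 holds age, weight and height. The output dataset names are hypothetical. DESCENDING applies only to the variable that immediately follows it.

```sas
/* age ascending, weight descending, height ascending */
PROC SORT DATA=a.class2 OUT=work.sorted_mixed;
    BY age DESCENDING weight height;
RUN;

/* To sort every key in descending order, repeat the keyword */
PROC SORT DATA=a.class2 OUT=work.sorted_desc;
    BY DESCENDING age DESCENDING weight DESCENDING height;
RUN;
```

Both steps use OUT=, so a.class2 itself keeps its original row order.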


Adding/Removing variables

• Create new data set, compute new variables, remove unwanted variables

DATA a.class2metric (drop=weight height sex age);
set a.class2;
height_cm=height*2.54;
weight_kg=weight/2.2;
label height_cm='Height in CM'
      weight_kg='Weight in Kilograms';
run;

PROC PRINT data=a.class2metric;
run;

• Submit for processing


Merging Datasets

• Data sets must be sorted by the same key variable(s)

proc sort data=a.class2;
by name;
run;

proc sort data=a.class2metric;
by name;
run;

data a.classmerged;
merge a.class2 a.class2metric;
by name;
run;

• Submit for processing
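When merging, it can help to check which rows found a match. The sketch below is an optional variation, assuming both datasets are already sorted by name as on this slide. The names in_orig, in_metric and work.merge_check are made up for illustration.

```sas
DATA work.merge_check;
    /* IN= creates temporary 0/1 flags showing which input   */
    /* dataset contributed to the current row.               */
    MERGE a.class2       (IN=in_orig)
          a.class2metric (IN=in_metric);
    BY name;

    /* Keep only names that appear in both datasets */
    IF in_orig AND in_metric;
RUN;
```

The IN= flags are not written to the output dataset; the Log shows how many rows were kept.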


Descriptive Statistics

• PROC FREQ
– Produces a table of counts and percentages
– For cross-tabulations, statistical tests can also be performed; e.g., independence testing
• PROC MEANS
– Produces descriptive statistics such as mean, standard deviation, minimum, maximum


PROC FREQ

• In the Editor window, type:

proc freq data=a.class2;
tables age*sex;
run;

• Submit for processing
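A sketch of the independence test mentioned on the Descriptive Statistics slide, using the same age-by-sex table. The CHISQ option is standard PROC FREQ syntax. Whether the test is appropriate for a table with small cell counts is a separate question.

```sas
/* One-way table: counts and percentages for a single variable */
PROC FREQ DATA=a.class2;
    TABLES sex;
RUN;

/* Two-way table plus chi-square test of independence */
PROC FREQ DATA=a.class2;
    TABLES age*sex / CHISQ;
RUN;
```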


PROC MEANS

• In the Editor window, type:

proc means data=a.class2;
var age weight height;
run;

• Submit for processing
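The statistics shown and the grouping can both be controlled. The following sketch assumes a.class2 also contains the categorical variable sex, which is used elsewhere in these slides.

```sas
/* Request specific statistics, rounded to 2 decimals, */
/* reported separately for each level of sex.          */
PROC MEANS DATA=a.class2 N MEAN STD MIN MAX MAXDEC=2;
    CLASS sex;
    VAR age weight height;
RUN;
```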


Graphing Data: PROC GPLOT

• Simple bivariate scatterplot
• Separate lines
• Multiple variables scatterplot
• Options


PROC GPLOT

• Simple bivariate scatterplot:

proc gplot data=a.class2;
symbol1 value=dot interpol=rl;
plot weight*height;
run;

• Submit for processing


PROC GPLOT

• To graph separate lines for each level of a categorical variable, type:

proc gplot data=a.class2;
symbol1 value=dot interpol=rl;
plot weight*height = sex;
run;

• Submit for processing


PROC GPLOT

• Multiple variables on the same graph:

proc gplot data=a.class2;
symbol1 value=dot interpol=rl color=blue;
symbol2 value=dot interpol=rl color=red;
plot weight * age;
plot2 height * age;
run;
quit;

• Submit for processing


PROC GPLOT: SYMBOL options

value=___
• Any character enclosed in single quotes
• Special characters
– dot
– plus sign
– star
– square
– ...and many others

interpol=___
• RL / RQ / RC
– linear / quadratic / cubic regression curves
• JOIN
– connects consecutive points (line graph)
• BOX
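A short sketch that combines two of the options listed above (value=star and interpol=join). It assumes a.class2 contains weight and height; work.class2_by_height is a hypothetical name. Because JOIN connects points in the order they appear in the dataset, the data are sorted by the x-axis variable first.

```sas
/* SYMBOL statements are global and persist between steps, */
/* so clear any earlier definitions before setting new ones. */
GOPTIONS RESET=SYMBOL;

PROC SORT DATA=a.class2 OUT=work.class2_by_height;
    BY height;
RUN;

SYMBOL1 VALUE=star INTERPOL=join COLOR=blue;

PROC GPLOT DATA=work.class2_by_height;
    PLOT weight*height;
RUN;
QUIT;
```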


Statistics in SAS

• PROC CORR
– Correlational analyses
• PROC REG
– Statistical Regression
• PROC UNIVARIATE
– To assess normality of regression residuals


PROC CORR

• Compute bivariate correlation coefficients

proc corr data = a.class2;
var age;
with height weight;
run;
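For comparison, a sketch of two common variations, assuming the same variables in a.class2. Leaving out the WITH statement gives the full correlation matrix among the VAR variables, and the SPEARMAN option requests rank correlations instead of the default Pearson coefficients.

```sas
/* Full correlation matrix among the three variables */
PROC CORR DATA=a.class2;
    VAR age height weight;
RUN;

/* Rank-based (Spearman) correlations for the same layout */
/* as the example above: age against height and weight.   */
PROC CORR DATA=a.class2 SPEARMAN;
    VAR age;
    WITH height weight;
RUN;
```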


PROC REG

• Run a regression on merged ‘class’ dataset
– Save residuals and predicted values in an output dataset
– Request residual plot

proc reg data=a.classmerged;
model height_cm=age weight / partial;
output out=reg_data p=predict r=resid rstudent=rstudent;
plot rstudent. * height_cm;
run;
quit;

• Notes:
– The quit command terminates the regression procedure; otherwise it keeps running.
– The output data set will be in the WORK library, since no library was specified.


PROC UNIVARIATE

• Assess normality of regression residuals stored in the output dataset from PROC REG:

proc univariate data=reg_data;
var rstudent;
histogram;
qqplot / normal (mu=est sigma=est);
run;
quit;
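The plots can be supplemented with formal tests. This sketch assumes the reg_data dataset and the rstudent variable created by the PROC REG step. The NORMAL option on the PROC statement adds a table of normality tests to the output, and the NORMAL option on HISTOGRAM overlays a fitted normal curve.

```sas
PROC UNIVARIATE DATA=reg_data NORMAL;
    VAR rstudent;
    HISTOGRAM rstudent / NORMAL;   /* histogram with normal curve */
RUN;
```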


QUESTIONS


Winter 2013 Training from CSCAR
http://cscar.research.umich.edu/workshops/

Introduction to SAS® - January 28,30, February 1,4,6,8, 2013

Intermediate Topics in SPSS: Data Management and Macros - February 5,7, 2013

Intermediate Topics in SPSS: Advanced Statistical Models - February 12,14, 2013

Intermediate SAS® - February 25,27, March 1, 2013

Regression Analysis - March 11,13,15, 2013

Applications of Hierarchical Linear Models - March 18,20,22, 2013

Statistical Analysis with R - March 19,21, 2013

Introduction to NVivo - April 3, 2013

Applied Structural Equation Modeling - April 10,11,12, 2013


Further Resources

• The Little SAS Book: A Primer
• UCLA site: http://www.ats.ucla.edu/stat/
– software tutorials, classes and lectures on statistical methods; an incredible site!
• SAS Documentation: http://support.sas.com/documentation/
– Documentation also found in ‘SAS help’ files.


Other Winter 2013 Workshops from Ann Arbor ASA

R  -  January 31, 1-3 PM

Angell Hall Computing Classroom B

(also known as MH444-B)

For more information go to: http://community.amstat.org/annarbor/home


Chapter Meetings open to all

PLACE   Starbucks State & Liberty, lower level
TIME    6:00pm – 6:45pm

DATE     TOPIC
24-JAN   Business Meeting
1-APR    Business Meeting and Election of Officers

For more information go to: http://community.amstat.org/annarbor/home
