INTRODUCTION TO SAS PROGRAMMING Professional Seminar Northwestern Polytechnic University By Dr. Michael M Cheng

Dec 16, 2015

Page 1: Professional Seminar Northwestern Polytechnic University By Dr. Michael M Cheng.

INTRODUCTION TO SAS

PROGRAMMING

Professional Seminar

Northwestern Polytechnic University

By

Dr. Michael M Cheng

Page 2

Quiz

Select one of the following choices.

What is SAS?

a. SAS is a highly contagious disease found in the winter time in Asia.

b. SAS is sardines and salmon.

c. SAS is software that computes statistics only.

d. SAS is a 4th generation computer language capable of performing full-featured computer programming.

e. None of the above.

Page 3

SAS (SAS System)

A computer software system that consists of several products that provide data retrieval, management, and analysis capabilities in addition to programming (SAS Institute, Inc.)

SAS is a problem solving tool.

Page 4

Heuristic Problem Solving

Image Mode 1

Linguistic Mode 1

Image Mode 2

Linguistic Mode 2

The interaction between image mode and linguistic mode is called Heuristic Problem Solving.

Page 5

Psychology of Communication By George Miller

- Coding
- Decoding
- Channel Capacity
- Magic number 7 plus or minus 2

For example: 2121568931

Page 6

Psychology of Communication By George Miller

- Coding
- Decoding
- Channel Capacity
- Magic number 7 plus or minus 2

For example: ??????????

Page 7

Psychology of Communication By George Miller

- Coding
- Decoding
- Channel Capacity
- Magic number 7 plus or minus 2

For example: 212-156-8931

Page 8

SAS program source code is composed of many SAS statements: some are for the PROC step, some are for the DATA step, and some can be used in either step.

Page 9

SAS Syntax and SAS Data Sets

SAS statements begin with an identifying keyword and end with a semicolon;

SAS statements are free-format.

A SAS data set is a collection of data values arranged in a rectangular table.

The columns in the table are called variables. The rows in the table are called observations (or records). There are two kinds of variables:

- character variables
- numeric variables
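As a small illustration of the free-format rule, the sketch below writes the same DATA step in two layouts. The data set names DEMO1 and DEMO2 and the two data lines are made up for this example. NAME is read as a character variable (note the $) and AGE as a numeric variable.

```sas
/* Conventional layout: one statement per line. */
DATA DEMO1;
   INPUT NAME $ AGE;
   CARDS;
JOHN 12
ALICE 13
;
RUN;

/* Same statements packed onto one line. SAS reads them identically, */
/* because statements are delimited by semicolons, not by line ends. */
DATA DEMO2; INPUT NAME $ AGE; CARDS;
JOHN 12
ALICE 13
;
RUN;

PROC PRINT DATA=DEMO1; RUN;
PROC PRINT DATA=DEMO2; RUN;
```

Both PROC PRINT listings should show the same two observations. Only the data lines after CARDS are layout-sensitive.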

Page 10

                  VARIABLES

                  NAME      SEX  AGE  HEIGHT  WEIGHT
                  ----------------------------------
observation 1     JOHN      M     12    59.0    99.5
observation 2     JAMES     M     12    57.3    83.0
observation 3     ALFRED    M     14    69.0   112.5
     .              .       .      .      .       .
     .              .       .      .      .       .
observation 19    ALICE     F     13    56.5    84.0

Page 11

DATA CLASS;
   INPUT NAME $1-8 SEX $11 AGE 13-14 HEIGHT 16-19 WEIGHT 21-25;
   CARDS;
data lines
;

PROC PRINT DATA=CLASS;

PROC MEANS DATA=CLASS;
   VAR HEIGHT WEIGHT;
RUN;

Page 12

Creating SAS data sets

Raw data

DATA CLASS;
   INPUT NAME $1-8 SEX $11 AGE 13-14 HEIGHT 16-19 WEIGHT 21-25;
   CARDS;

SAS data set: CLASS

Page 13

A listing of the raw data

NAME      SEX  AGE  HEIGHT  WEIGHT
JOHN      M     12    59.0    99.5
JAMES     M     12    57.3    83.0
ALFRED    M     14    69.0   112.5
WILLIAM   M     15    66.5   112.0
JEFFREY   M     13    62.5    84.0
RONALD    M     15    67.0   133.0
THOMAS    M     11    57.5    85.0
PHILIP    M     16    72.0   150.0
ROBERT    M     12    64.8   128.0
HENRY     M     14    63.5   102.5
JANET     F     15    62.5   112.5
JOYCE     F     15    67.0   133.0
JUDY      F     14    64.3    90.0
CAROL     F     14    62.8   102.5
JANE      F     12    59.8    84.5
LOUISE    F     12    56.3    77.0
BARBARA   F     13    65.3    98.0
MARY      F     15    66.5   112.0
ALICE     F     13    56.5    84.0

Page 14

CARDS;   /* data lines */
JOHN      M 12 59.0  99.5
JAMES     M 12 57.3  83.0
ALFRED    M 14 69.0 112.5
WILLIAM   M 15 66.5 112.0
JEFFREY   M 13 62.5  84.0
RONALD    M 15 67.0 133.0
THOMAS    M 11 57.5  85.0
PHILIP    M 16 72.0 150.0
ROBERT    M 12 64.8 128.0
HENRY     M 14 63.5 102.5
JANET     F 15 62.5 112.5
JOYCE     F 15 67.0 133.0
JUDY      F 14 64.3  90.0
CAROL     F 14 62.8 102.5
JANE      F 12 59.8  84.5
LOUISE    F 12 56.3  77.0
BARBARA   F 13 65.3  98.0
MARY      F 15 66.5 112.0
ALICE     F 13 56.5  84.0
;

Page 15

PROC PRINT DATA=CLASS;

                  SAS

OBS  NAME      SEX  AGE  HEIGHT  WEIGHT
  1  JOHN      M     12    59.0    99.5
  2  JAMES     M     12    57.3    83.0
  3  ALFRED    M     14    69.0   112.5
  4  WILLIAM   M     15    66.5   112.0
  5  JEFFREY   M     13    62.5    84.0
  6  RONALD    M     15    67.0   133.0
  7  THOMAS    M     11    57.5    85.0
  8  PHILIP    M     16    72.0   150.0
  9  ROBERT    M     12    64.8   128.0
 10  HENRY     M     14    63.5   102.5
 11  JANET     F     15    62.5   112.5
 12  JOYCE     F     15    67.0   133.0
 13  JUDY      F     14    64.3    90.0
 14  CAROL     F     14    62.8   102.5
 15  JANE      F     12    59.8    84.5
 16  LOUISE    F     12    56.3    77.0
 17  BARBARA   F     13    65.3    98.0
 18  MARY      F     15    66.5   112.0
 19  ALICE     F     13    56.5    84.0

Page 16

PROC MEANS DATA=CLASS;
   VAR HEIGHT WEIGHT;

                                   SAS

VARIABLE   N         MEAN     STANDARD      MINIMUM      MAXIMUM    STD ERROR
                              DEVIATION       VALUE        VALUE      OF MEAN

WEIGHT    19   100.026316   22.7739335   50.5000000   150.000000   5.22469867
HEIGHT    19    62.336842    5.1270752   51.3000000    72.000000   1.17623173

Page 17

THE PROC STEP

The PROC (or PROCEDURE) statement is used to call a SAS procedure.

SAS procedures are computer programs that read SAS data sets, compute statistics, print results, and create SAS data sets.

For example:

PROC MEANS SUM MAXDEC=2 DATA=CLASS;
PROC CONTENTS DATA=CLASS;
PROC SORT DATA=CLASS;
   BY SEX DESCENDING WEIGHT;
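The following sketch shows how PROC steps can be chained. It is not from the seminar. The data set names CLASS_DEMO and CLASS_SORTED are hypothetical, and only four of the class records are used to keep it short. The OUT= option on PROC SORT is used so the sorted copy is a new data set.

```sas
/* A four-record subset of the class data, read with list input. */
DATA CLASS_DEMO;
   INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;
   CARDS;
JOHN M 12 59.0 99.5
JAMES M 12 57.3 83.0
JANET F 15 62.5 112.5
JANE F 12 59.8 84.5
;
RUN;

/* Sort by SEX, then by WEIGHT from heaviest to lightest. */
/* OUT= writes the result to a new data set and leaves    */
/* CLASS_DEMO in its original order.                      */
PROC SORT DATA=CLASS_DEMO OUT=CLASS_SORTED;
   BY SEX DESCENDING WEIGHT;
RUN;

PROC PRINT DATA=CLASS_SORTED;
RUN;

/* BY-group statistics need input sorted by the BY variable. */
PROC MEANS DATA=CLASS_SORTED N MEAN MAXDEC=2;
   BY SEX;
   VAR HEIGHT WEIGHT;
RUN;
```

The printed order should be JANET, JANE, JOHN, JAMES: females first, and the heavier student first within each sex.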

Page 18

Data Transformations

Assignment statement

Assignment statements are used to create new variables and to modify the values of existing variables. SAS evaluates an expression and assigns the result to a variable.

variable = expression;

For example: X = 1 + 2;

Page 19

Example:

1. Read three variables (YEAR, REVENUE, and EXPENSE) into a SAS data set.
2. Add a variable named INCOME, which is the difference between REVENUE and EXPENSE.
3. Change the values of YEAR from 2 digits to 4 digits.

DATA PROFITS;
   INPUT YEAR REVENUE EXPENSE;
   INCOME = REVENUE - EXPENSE;
   YEAR = YEAR + 2000;
   CARDS;
00 5650 1050
01 6280 1140
;
PROC PRINT;

               SAS

OBS  YEAR  REVENUE  EXPENSE  INCOME
  1  2000     5650     1050    4600
  2  2001     6280     1140    5140

Page 20

SAS functions

Selected functions that compute simple statistics.

SUM    sum
MEAN   arithmetic mean
VAR    variance
MIN    minimum value
MAX    maximum value
STD    standard deviation

Page 21

Example:

Given: Temperature data at a specific location are recorded every hour on the hour for several days. Each record in a file represents one day and contains the date and the 24 recorded temperatures for that date.

Objective: Create a SAS data set that contains the date, the 24 hourly temperatures, the average temperature, the minimum temperature, and the maximum temperature for each day.

DATA TEMP;
   INPUT DATE $1-7 @11 (T1-T24) (2.);
   AVGTEMP=MEAN(OF T1-T24);
   MINTEMP=MIN(OF T1-T24);
   MAXTEMP=MAX(OF T1-T24);
   CARDS;
data lines

Program data vector: DATE T1 . . . AVGTEMP MINTEMP MAXTEMP
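A scaled-down version of the temperature example may make the input layout easier to follow. This sketch assumes only four readings per day instead of 24. The data set name TEMP4 and both data lines are invented for illustration. The second record leaves one reading blank to show how the statistical functions handle a missing value.

```sas
DATA TEMP4;
   /* DATE is in columns 1-7. Starting at column 11, four */
   /* two-digit readings are packed together.             */
   INPUT DATE $1-7 @11 (T1-T4) (2.);
   AVGTEMP = MEAN(OF T1-T4);   /* missing readings are ignored */
   MINTEMP = MIN(OF T1-T4);
   MAXTEMP = MAX(OF T1-T4);
   CARDS;
01JAN15   50525557
02JAN15   48  5351
;
RUN;

PROC PRINT DATA=TEMP4;
RUN;
```

For the first record the readings are 50, 52, 55, and 57, so AVGTEMP should be 53.5. For the second record T2 is blank and is read as missing, so MEAN averages the three remaining readings (about 50.67) rather than returning a missing value.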

Page 22

The RETAIN statement

SAS normally resets all variables in the program data vector to missing before each execution of the DATA step. A RETAIN statement can be used to:

- Retain variable values from the last execution of the DATA step
- Give initial values to the variables

Example: Accumulate totals and count observations.

DATA ADD;
   RETAIN COUNT 0 TOTAL 0;
   INPUT SCORE;
   COUNT=COUNT+1;
   TOTAL=TOTAL+SCORE;
   CARDS;
10
5
3
7
.
6
4
;
PROC PRINT;

Program data vector: COUNT TOTAL SCORE

Page 23

The SUM statement

The SUM statement is a special assignment statement that accumulates values from one observation to the next. It retains the values of the created variable and treats a missing value as zero.

Example: Accumulate totals and count observations.

DATA ADD;
   INPUT SCORE;
   COUNT + 1;
   TOTAL + SCORE;
   CARDS;
10
5
3
7
.
6
4
;
PROC PRINT;
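To see why the missing score in the data matters, the sketch below runs both accumulation styles side by side on the same seven scores. The data set name COMPARE and the variable names TOTAL_A and TOTAL_B are made up for this comparison.

```sas
DATA COMPARE;
   RETAIN TOTAL_A 0;            /* retained, updated by assignment     */
   INPUT SCORE;
   TOTAL_A = TOTAL_A + SCORE;   /* the + operator propagates missing   */
   COUNT + 1;                   /* sum statement: starts at 0, retained */
   TOTAL_B + SCORE;             /* sum statement: missing adds nothing */
   CARDS;
10
5
3
7
.
6
4
;
RUN;

PROC PRINT DATA=COMPARE;
RUN;
```

TOTAL_A should read 10, 15, 18, 25 and then stay missing from the fifth observation on, because a missing value plus anything is missing and the missing result is itself retained. TOTAL_B should read 10, 15, 18, 25, 25, 31, 35, and COUNT should run from 1 to 7.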

Page 24

CONDITIONAL EXECUTION OF SAS STATEMENTS

IF-THEN/ELSE Statements

Use the IF-THEN statement when you want to execute a SAS statement conditionally on some expression.

Numeric Comparison

IF CODE=1 THEN RESPONSE='GOOD';
IF CODE=2 THEN RESPONSE='FAIR';
IF CODE=3 THEN RESPONSE='POOR';

For efficiency, use ELSE statements.

IF CODE=1 THEN RESPONSE='GOOD';
ELSE IF CODE=2 THEN RESPONSE='FAIR';
ELSE IF CODE=3 THEN RESPONSE='POOR';
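Placed inside a complete DATA step, the IF-THEN/ELSE chain might look like the sketch below. The data set name SURVEY, the four data lines, and the final catch-all ELSE are additions for illustration and are not part of the seminar slides. The LENGTH statement is there because a character variable otherwise takes its length from the first value assigned to it.

```sas
DATA SURVEY;
   LENGTH RESPONSE $ 7;   /* room for the longest value, 'UNKNOWN' */
   INPUT CODE;
   IF CODE=1 THEN RESPONSE='GOOD';
   ELSE IF CODE=2 THEN RESPONSE='FAIR';
   ELSE IF CODE=3 THEN RESPONSE='POOR';
   ELSE RESPONSE='UNKNOWN';   /* catch-all for any other code */
   CARDS;
1
3
2
9
;
RUN;

PROC PRINT DATA=SURVEY;
RUN;
```

Without the LENGTH statement, RESPONSE would get a length of 4 from 'GOOD', and 'UNKNOWN' would be truncated to 'UNKN'.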

Page 25

Character comparison

DATA CLASS;
   INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;
   IF SEX='M' THEN SEX='MALE';
   ELSE SEX='FEMALE';
   CARDS;

Page 26

Comparison operators

LT   <    less than
GT   >    greater than
EQ   =    equal to
LE   <=   less than or equal to
GE   >=   greater than or equal to
NE   ^=   not equal to
NL        not less than
NG        not greater than

Logical operators

OR    |   or, either
AND   &   and
NOT   ^   not, negation
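A short sketch of these operators in use follows. The data set name GROUPS, the variable GROUP, and the four data lines are hypothetical. The first condition uses the mnemonic forms and the second uses the equivalent symbols.

```sas
DATA GROUPS;
   INPUT NAME $ SEX $ AGE;
   /* Mnemonic operators: GE, AND, EQ */
   IF AGE GE 13 AND SEX EQ 'F' THEN GROUP='TEEN F';
   /* The same kind of test written with symbols: >=, &, = */
   ELSE IF AGE >= 13 & SEX = 'M' THEN GROUP='TEEN M';
   ELSE GROUP='OTHER';
   CARDS;
JOHN M 12
ALFRED M 14
JANET F 15
JANE F 12
;
RUN;

PROC PRINT DATA=GROUPS;
RUN;
```

With these data lines, ALFRED should be classified TEEN M, JANET TEEN F, and JOHN and JANE OTHER.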

Page 27

DO and END statements

Execution of a DO statement specifies that all statements between the DO and its matching END statement are to be executed.

For example:

DATA EMPLOY;
   INPUT NAME $1-8 DEPTNO 10-12 COM 14-17 SALARY 19-23;
   IF DEPTNO=201 THEN DO;
      DEPT='SALES';
      GROSSPAY = COM+SALARY;
   END;
   ELSE DO;
      DEPT='ADMIN';
      GROSSPAY = SALARY;
   END;
   CARDS;

Page 28

JOHNSON  201 1500 18000
MOSSER   101      21000
LARKIN   101      24000
GARRETT  201 4800 18000
;

PROC PRINT output

                      SAS

OBS  NAME     DEPTNO   COM  SALARY  DEPT   GROSSPAY
  1  JOHNSON     201  1500   18000  SALES     19500
  2  MOSSER      101     .   21000  ADMIN     21000
  3  LARKIN      101     .   24000  ADMIN     24000
  4  GARRETT     201  4800   18000  SALES     22800

Page 29

PROC SORT DATA=RATE_A; BY ZIP;
PROC SORT DATA=RATE_B; BY ZIP;
PROC SORT DATA=RATE_C; BY ZIP;

DATA TMTL;
   MERGE RATE_A(IN=A) CTL_TBL(IN=B);
   BY ZIP;
   IF A & B;

DATA TMMR;
   MERGE RATE_B(IN=A) CTL_TBL(IN=B);
   BY ZIP;
   IF A & B;

DATA TMCR;
   MERGE RATE_C(IN=A) CTL_TBL(IN=B);
   BY ZIP;
   IF A & B;
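The slide gives the merge code without data, so here is a self-contained sketch of the same pattern on two tiny tables. The data set names RATES, CONTROL, and MATCHED, the variables RATE and REGION, and every data value are invented for illustration. IN=A and IN=B create temporary flags that are 1 when the current observation has a contribution from that data set.

```sas
DATA RATES;
   INPUT ZIP $ RATE;
   CARDS;
94538 1.10
94539 1.25
95014 1.40
;
RUN;

DATA CONTROL;
   INPUT ZIP $ REGION $;
   CARDS;
94538 EAST
95014 SOUTH
95035 SOUTH
;
RUN;

/* A match-merge requires both inputs sorted by the BY variable. */
PROC SORT DATA=RATES;   BY ZIP; RUN;
PROC SORT DATA=CONTROL; BY ZIP; RUN;

DATA MATCHED;
   MERGE RATES(IN=A) CONTROL(IN=B);
   BY ZIP;
   IF A & B;   /* subsetting IF: keep ZIPs found in both inputs */
RUN;

PROC PRINT DATA=MATCHED;
RUN;
```

MATCHED should contain two observations, for ZIP 94538 and 95014. ZIP 94539 appears only in RATES and 95035 only in CONTROL, so the subsetting IF drops both.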

Page 30

Conclusion

1. SAS is a 4th generation computer language.

2. SAS is a problem solving tool.

3. It makes your life easier (less stressful).

Page 31

THE END