Next Presentation: An Easy Route to a Missing Data Report with ODS+PROC FREQ+A Data Step
Presenter: Mike Zdeb
Mike is an assistant professor in the epidemiology & biostatistics department at the U@Albany School of Public Health in Rensselaer, NY. He's been a SAS user for 20+ years and has presented papers at SUGI, SAS Global Forums, NESUG, and numerous local user groups. Mike has written a SAS Press book, Mapping Made Easy Using SAS, and has been a reviewer for a number of SAS Press books.
INTRODUCTION
# ONE OF THE FIRST STEPS IN DATA ANALYSIS ... A DECISION ON HOW TO HANDLE MISSING VALUES
DELETE OBSERVATIONS WITH EXCESS MISSING DATA
DELETE VARIABLES WITH EXCESS MISSING DATA
SUBSTITUTE IMPUTED VALUES FOR MISSING VALUES
TAKE NO ACTION AT ALL IF THE AMOUNT OF MISSING DATA IS INSIGNIFICANT AND NOT LIKELY TO AFFECT THE ANALYSIS
# DECISION FACILITATED BY KNOWING THE AMOUNT OF MISSING DATA
# USE AN ODS OUTPUT STATEMENT, PROC FREQ, AND SOME DATA STEP PROGRAMMING
PRODUCE A MISSING DATA REPORT SHOWING THE PERCENTAGE OF MISSING DATA FOR EACH VARIABLE IN A DATA SET
IDENTIFY AND DROP ALL VARIABLES/OBSERVATIONS WITH EITHER ALL OR A HIGH PERCENTAGE OF MISSING VALUES
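As a rough sketch of the first ingredient, the step below shows how an ODS OUTPUT statement can route PROC FREQ's one-way tables into a data set. OneWayFreqs is the ODS table name PROC FREQ uses for one-way frequency tables. The output data set name FREQS and the use of SASHELP.CLASS as input are assumptions made only for illustration, not the presenter's code.

```sas
/* Capture PROC FREQ's one-way tables in a data set.          */
/* WORK.FREQS is a hypothetical name chosen for this sketch.  */
ods output onewayfreqs=freqs;

proc freq data=sashelp.class;
   tables _all_ / missing;   /* MISSING treats missing values as a level */
run;

/* Look at the captured columns before writing a data step against them */
proc contents data=freqs;
run;
```

Inspecting the captured data set first is a deliberate choice: its column names depend on the variables in the TABLES request, so any data step that reshapes it should be written against what PROC CONTENTS reports.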
MAKE SOME DATA FOR EXAMPLES
# USE SASHELP.CLASS ...
* EXAMPLE 1;
data class;
   set sashelp.class;
   if ranuni(987) le .5;
   if ranuni(987) le .2 then call missing(weight);
   if ranuni(987) le .1 then call missing(sex, age);
run;
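The example relies on CALL MISSING accepting a mix of character and numeric variables in one call (SEX is character, AGE is numeric). A minimal sketch of that behavior follows. The variable names and values are hypothetical and chosen only for this illustration.

```sas
data _null_;
   /* Hypothetical values, one character and one numeric variable */
   name = 'Alfred';
   age  = 14;

   /* One call sets the character variable to blank and the numeric one to . */
   call missing(name, age);

   /* Write both variables to the log to confirm they are now missing */
   put name= age=;
run;
```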
REVIEW
# PROC FREQ DEFAULT
proc freq data=class;
run;
# TABLES OF ALL VARIABLES
# COUNTS (NO PERCENTAGES) OF MISSING DATA
# ADD A TABLES STATEMENT AND THE MISSING OPTION ...
proc freq data=class;
   tables _all_ / missing;
run;
# COUNTS AND PERCENTAGES OF MISSING DATA
# BETTER IF NON-MISSING DATA ALL IN ONE GROUP
# USE A FORMAT TO CREATE GROUPS OF MISSING AND NON-MISSING DATA
proc format;
   value nm  low-high = 'OK'
             other    = 'MISSING';
   value $ch ' '      = 'MISSING'
             other    = 'OK';
run;
# LOW-HIGH IN NM FORMAT AVOIDS HAVING TO SPECIFY THE RANGE OF ALL POSSIBLE MISSING NUMERIC VALUES ... REMEMBER THERE ARE 28 OF THEM (._ , . , .A THROUGH .Z)
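A quick way to check the LOW-HIGH claim is to push a few missing and non-missing values through the format and read the labels in the log. The sketch below repeats the NM format so it runs on its own. The test values and the variable names X and LABEL are assumptions for illustration only.

```sas
/* Same NM format as in the slide, repeated so this check is self-contained */
proc format;
   value nm low-high = 'OK'
            other    = 'MISSING';
run;

data _null_;
   length label $ 7;
   /* Ordinary missing, three special missing values, then three real numbers */
   do x = ., .a, .z, ._, -5, 0, 12.5;
      label = put(x, nm.);   /* apply the format and keep the resulting label */
      put x= label=;
   end;
run;
```

The four missing values should all be labelled MISSING and the three real numbers OK, because LOW in a numeric format range does not include missing values.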