
23/08/11

SAS

SAS is a complete programming language with report formatting and with statistical and mathematical capabilities. A SAS program has two parts.

1st Part
• The first part is the DATA step. It describes your input data and lets you modify that data with arithmetic operations and logical decisions. The DATA step formats your data into a special file called a SAS data set.

The second part of SAS is a library of canned routines called PROCEDURES.
• The procedures can only use SAS data sets as input.
• The DATA step that creates a data set must therefore come before the procedure that uses it.
• A procedure is executed by coding a special SAS statement called a PROC statement.

The core of the SAS language (Base SAS) consists of:

• the DATA step: a programming language that you use to manage your data.
• procedures: tools for data analysis and reporting.
• the macro facility: a tool for extending and customizing SAS software programs and for reducing text in your programs.
• the DATA step debugger: a tool that helps you find logic problems in DATA step programs.
• the Output Delivery System (ODS): a system that delivers output in a variety of easy-to-access formats, such as SAS data sets, listing files, or Hypertext Markup Language (HTML).
• the SAS windowing environment: an interactive, graphical user interface that enables you to easily run and test your SAS programs.

A DATA step consists of a group of statements in the SAS language that can
• read data from external files
• write data to external files
• read SAS data sets and data views
• create SAS data sets and data views.

A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

SAS Macro Facility

Base SAS software includes the SAS Macro Facility, a powerful programming tool for extending and customizing your SAS programs, and for reducing the amount of code that you must enter to do common tasks. Macros are SAS files that contain compiled macro program statements and stored text. You can use macros to automatically generate SAS statements and commands, write messages to the SAS log, accept input, or create and change the values of macro variables.
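As a minimal sketch of generating SAS statements with a macro, the hypothetical macro below wraps a PROC PRINT step so that the data set name and the number of rows become parameters. The macro name print_top and its parameters ds and n are illustrative assumptions; sashelp.class is a sample data set that normally ships with SAS.

```sas
/* Hypothetical macro: print the first N rows of any data set. */
%macro print_top(ds=, n=5);
  proc print data=&ds (obs=&n);
    title "First &n rows of &ds";
  run;
  title;   /* clear the title again */
%mend print_top;

/* Each call expands into a complete PROC PRINT step. */
%print_top(ds=sashelp.class, n=3)
```

The same three lines of procedure code are written once and reused for any data set, which is the "reducing the amount of code" point made above.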

SAS ARCHITECTURE

SAS Tier - Application Layer: all the servers: Metadata Server, Workspace Server, Stored Process Server and OLAP Server (SAS ETL processes).
Mid Tier - Data Management Layer: all data management and reporting services and tools (Base SAS, IOM).
Client Tier - Presentation Layer: all easy (GUI) reporting components (SASEG, SAS Miner, etc.).

Examples:

data x;
  input name $ age sal;
cards;
prasad 30 20000
venkat 40 60000
ramu 50 80000
;
run;

proc sql;
  create table x1 (name char, age int, sal int);
quit;

proc sql;
  insert into x1
    values('prasad1',30,3000)
    values('prasad2',40,4000)
    values('prasad3',50,500000);

  insert into x1
    set name='venkat', age=20, sal=10000
    set age=80, name='prasad5', sal=80000;
quit;

proc freq data=x1;
run;

Reading Raw Data from External Files

Statement:

data y;
  infile 'C:\Users\home\Desktop\ukbabau.txt';
  input a b c;
run;

proc means data=y;
run;

INPUT STYLES

1. LIST INPUT

Data values are separated by at least one blank space. With plain list input, a character value cannot contain embedded spaces.

Example:

DATA LISTINP;
  INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1 68 144 M 23
2 78 202 M 34
3 62 99 F 37
4 61 101 F 45
;
run;

To read a name that contains a blank, give the character variable a column range:

data x;
  input name $1-18 age sal;
cards;
ramuso munamrayana 20 30000
meenahi sinha      30 60000
;
run;

2. COLUMN INPUT

In the raw data, each data value is available in specific columns.

Statement:

/* data set name colinp added so that the step runs on its own */
data colinp;
  input pid 1-3 name $7-17 age 20-21 color $25-29;
cards;
100   kiran kumar  89   white
101   pavan        67   black
102   kranthi      89   white
;
run;

Example

DATA COLINPUT;
  INPUT ID 1 HEIGHT 2-3 WEIGHT 4-6 GENDER $ 7 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
run;

3. FORMATTED INPUT

In formatted input these symbols play the main role:

+n   relative column pointer: moves the pointer n columns forward
@n   absolute column pointer: moves the pointer to column n
w.   informat width (for example 3. or $11.): the number of columns to read

syntax:

/* widths match the data: name is 11 columns, age 2, color 5 */
data forin;
  input @1 pid 3. @7 name $11. @20 age 2. @25 color $5.;
cards;
100   kiran kumar  89   white
101   pavan        67   black
102   kranthi      89   white
;
run;

4. ABSOLUTE INPUT

Using this method we can read both standard and nonstandard data.
@n   column pointer: moves the pointer to column n
w.   informat width: the number of columns to read

syntax:

data absol;
  input @1 pid 3. @7 name $11. @20 age 2. @25 color $5.;
cards;
100   kiran kumar  89   white
101   pavan        67   black
102   kranthi      89   white
;
run;

5. MIXED INPUT
If one INPUT statement uses more than one input technique, it is called mixed input.

data mixed;
  input pid 3. name $7-17 @20 age 2. @25 color $5.;
cards;
100   kiran kumar  89   white
101   pavan        67   black
102   kranthi      89   white
;
run;

6. NAMED INPUT
Sometimes the data values are stored together with the variable name (name=value). In this case we use named input.

data named;
  input name=$ age= sal=;
cards;
name=prasad age=20 sal=3000
name=giri age=30 sal=4000
age=40 sal=50000 name=ramu
;
run;

Using infile options

INFILE options control how the raw data lines are read.

DLM: indicates the delimiter(s) used in the raw data.

DATA COMMAS;
  INFILE DATALINES DLM=',';
  INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68,144,M,23
2,78,202,M,34
3,62,99,F,37
4,61,101,F,45
;

DATA COMMAS3;
  INFILE datalines DLM=', & $';
  INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68&144,M,23
2,78,202,M,34
3$62,99,F,37
4,61,101$F,45
;
run;

/*Missover*/
If a value is missing at the end of a data line, SAS by default goes to the next data line to read it. To avoid that behaviour we can use MISSOVER, which sets the remaining variables to missing instead.

data mis;
  infile cards missover;
  input sal age name $;
cards;
2000 20 ramu
3000 40
5000 60 mahesh
6000 65
4000 55 rani
;
run;
proc print;
run;

/*dsd*/
1. Use DSD when the data values in the raw data are separated by commas (DSD makes the comma the default delimiter).
2. SAS treats two consecutive delimiters as a missing value.
3. Quotation marks around a data value are removed, and a delimiter inside a quoted value is kept as part of the value.

DATA ds;
  INFILE DATALINES dsd;
  INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68,144,M,23
2,78,202,M,34
3,62,99,F,37
4,61,101,F,45
;

DATA ds1;
  INFILE DATALINES dsd;
  INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68,144,M,23
2,,202,M,34
3,62,99,F,37
4,,101,F,45
;
proc print;
run;

data scores;
  infile datalines dsd;
  input Name : $9. Score Team : $25. Div $;
datalines;
Joseph,76,"Red Racers, Washington",AAA
Mitchel,82,"Blue Bunnies, Richmond",AAA
Sue Ellen,74,"Green Gazelles, Atlanta",AA
;

/*Truncover*/
It works like MISSOVER. In addition, when a data line is shorter than the width that the INPUT statement asks for, TRUNCOVER keeps whatever was read instead of setting the variable to missing, so it handles values of varying length.

data tr;
  infile cards truncover;
  input age sal name $20.;
cards;
12 3000 ramash
23 4000 uiytrrewsdfg
45 5690 ramashankarrajutyrewyuu
38 5000 uiopuytrd
;
run;
proc print;
run;

/*scanover*/

Used together with the @'string' pointer: SAS scans the data lines until it finds the string and starts reading the values right after it. Lines that do not contain the string are skipped.

data sca;
  infile cards scanover;
  input @'ab' name $ loc $ sal;
cards;
ab appolo hyd 20000
ab image hyd 30000
appolo sec 40000
care ban 50000
care hyd 60000
;
run;
proc print;
run;

data sca1;
  infile cards scanover;
  input @'ab' a b;
cards;
10 20 30
30 70 ab 78 89 90
40 ab 49 80
;
run;
proc print;
run;

/*stopover*/
When a data line does not contain values for all the variables in the INPUT statement, STOPOVER raises an error and stops the DATA step.

data st;
  infile cards stopover;
  input t1 t2 t3 t4 t5;
cards;
10 22 33 88 99
44 55
99 88 77 66 60
;
run;

DATA LISTINP;
  INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1 68 144 M 23
2 78 202 M 34
3 62 99 F 37
4 61 101 F 45
;

PROC PRINT;
RUN;
*---------------------------------------------;

*--------------- EXAMPLE 2.1 ----------------;
DATA COMMAS;
  INFILE DATALINES DLM=',';
  INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68,144,M,23
2,78,202,M,34
3,62,99,F,37
4,61,101,F,45
;

PROC PRINT;
RUN;
*---------------------------------------------;

*--------------- EXAMPLE 2.2 ----------------;
DATA COMMAS;
  INFILE DATALINES DSD;
  INPUT X Y TEXT $;   /* TEXT holds character data, so it needs the $ */
DATALINES;
1,2,XYZ
3,,STRING
4,5,"TESTING"
6,,"ABC,XYZ"
;

PROC PRINT;
RUN;
*---------------------------------------------;

*---------------- EXAMPLE 3.1 --------------------;
DATA INFORMS;
  INFORMAT LASTNAME $20. DOB MMDDYY8. GENDER $1.;
  INPUT ID LASTNAME DOB HEIGHT WEIGHT GENDER AGE;
  FORMAT DOB MMDDYY8.;
DATALINES;
1 SMITH 1/23/66 68 144 M 26
2 JONES 3/14/60 78 202 M 32
3 DOE 11/26/47 62 99 F 45
4 WASHINGTON 8/1/70 66 101 F 22
;

PROC PRINT;
RUN;
*-------------------------------------------------;

*---------------- EXAMPLE 3.2 --------------;
DATA COLONS;
  INPUT ID LASTNAME : $20. DOB : MMDDYY8. HEIGHT WEIGHT GENDER : $1. AGE;
  FORMAT DOB MMDDYY8.;
DATALINES;
1 SMITH 01/23/66 68 144 M 26
2 JONES 3/14/60 78 202 M 32
3 DOE 11/26/47 62 99 F 45
4 WASHINGTON 8/1/70 66 101 F 22
;

PROC PRINT;
  TITLE 'Example 3.2';
RUN;
*--------------------------------------------;

*--------------- EXAMPLE 4 --------------;
/* The & modifier lets NAME contain single blanks; */
/* the value ends at two or more consecutive blanks. */
DATA AMPERS;
  INPUT NAME & $25. AGE GENDER : $1.;
DATALINES;
RASPUTIN  45 M
BETSY ROSS  62 F
ROBERT LOUIS STEVENSON  75 M
;

PROC PRINT;
  TITLE 'Example 4';
RUN;
*-----------------------------------------;

*---------------------- EXAMPLE 5.1 ----------------------;
DATA COLINPUT;
  INPUT ID 1 HEIGHT 2-3 WEIGHT 4-6 GENDER $ 7 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;

PROC PRINT;
  TITLE 'Example 5.1';
RUN;
*----------------------------------------------------------;

*--------------------- EXAMPLE 5.2 -----------------------;
DATA COLINPUT;
  INPUT ID 1 HEIGHT 2-3 WEIGHT 4-6 GENDER $ 7 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;

PROC PRINT;
  TITLE 'Example 5.2';
RUN;
*----------------------------------------------------------;

*--------------- EXAMPLE 5.3 -------------;
DATA COLINPUT;
  INPUT ID 1 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;

PROC PRINT;
  TITLE 'Example 5.3';
RUN;
*------------------------------------------;

*--------------- EXAMPLE 5.4 -------------;
DATA COLINPUT;
  INPUT AGE 8-9 ID 1 WEIGHT 4-6 HEIGHT 2-3 GENDER $ 7;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;

PROC PRINT;
  TITLE 'Example 5.4';
RUN;
*------------------------------------------;

*------------ EXAMPLE 6.1 ---------;
DATA POINTER;
  INPUT @1 ID 3. @5 GENDER $1. @7 AGE 2. @10 HEIGHT 2. @13 DOB MMDDYY6.;
  FORMAT DOB MMDDYY8.;
DATALINES;
101 M 26 68 012366
102 M 32 78 031460
103 F 45 62 112647
104 F 22 66 080170
;

PROC PRINT;
  TITLE 'Example 6.1';
RUN;
*-----------------------------------;

*------------- EXAMPLE 7.1 -------------;
/* #1 and #2 are line pointers: each observation uses two data lines. */
DATA POINTER;
  INPUT #1 @1 ID 3. @5 GENDER $1. @7 AGE 2. @10 HEIGHT 2. @13 DOB MMDDYY6.
        #2 @5 SBP 3. @9 DBP 3. @13 HR 3.;
  FORMAT DOB MMDDYY8.;
DATALINES;
101 M 26 68 012366
101 120  80  68
102 M 32 78 031460
102 162  92  74
103 F 45 62 112647
103 134  86  74
104 F 22 66 080170
104 116  72  67
;

PROC PRINT;
  TITLE 'Example 7.1';
RUN;
*----------------------------------------;

*------------- EXAMPLE 7.2 ---------;
/* Four lines per observation; only line 2 is read. */
DATA SKIPSOME;
  INPUT #2 @1 ID 3. @12 SEX $6. #4;
DATALINES;
101 256 RED 9870980
101 898245 FEMALE 7987644
101 BIG 9887987
101 CAT 397 BOAT 68
102 809 BLUE 7918787
102 732866 MALE   6856976
102 SMALL 3884987
102 DOG 111 CAR 14
;

PROC PRINT;
  TITLE 'Example 7.2';
RUN;
*------------------------------------;

*----------------- EXAMPLE 8.1 -----------------;
/* The same columns can be read more than once: ST, WT and YR are parts of PARTID. */
DATA PARTS;
  INPUT @1 PARTID $14. @1 ST $2. @6 WT 3. @13 YR 2.
        @16 PARTDESC $24. @41 QUANT 4.;
DATALINES;
NY101110060172 LEFT-HANDED WHIZZER       233
MA102085112885 FULL-NOSE BLINK TRAP     1423
CA112216111291 DOUBLE TONE SAND BIT       45
NC222845071970 REVERSE SPIRAL RIPSHANK   876
;

PROC PRINT;
  TITLE 'Example 8.1';
RUN;
*------------------------------------------------;

*------------- EXAMPLE 9.1 ----------;
DATA LONGWAY;
  INPUT ID 1-3 Q1 4 Q2 5 Q3 6 Q4 7 Q5 8
        Q6 9-10 Q7 11-12 Q8 13-14 Q9 15-16 Q10 17-18
        HEIGHT 19-20 AGE 21-22;
DATALINES;
1011132410161415156823
1021433212121413167221
1032334214141212106628
1041553216161314126622
;

PROC PRINT;
  TITLE 'Example 9.1';
RUN;
*-------------------------------------;

*------------- EXAMPLE 9.2 ----------;
DATA SHORTWAY;
  INPUT ID 1-3 @4 (Q1-Q5)(1.) @9 (Q6-Q10 HEIGHT AGE)(2.);
DATALINES;
1011132410161415156823
1021433212121413167221
1032334214141212106628
1041553216161314126622
;

PROC PRINT;
  TITLE 'Example 9.2';
RUN;
*-------------------------------------;

*------------- EXAMPLE 9.3 -----------;
DATA PAIRS;
  INPUT @1 ID 3.
        @6 (QN1-QN5)(1. +3)
        @7 (QC1-QC5)($1. +3)
        @26 (HEIGHT AGE)(2. +1 2.);
DATALINES;
101  1A  3A  4B  4A  6A  68 26
102  1A  3B  2B  2A  2B  78 32
103  2B  3D  2C  4C  4B  62 45
104  1C  5C  2D  6A  6A  66 22
;

PROC PRINT;
  TITLE 'Example 9.3';
RUN;
*--------------------------------------;

*------------- EXAMPLE 10 ------------;
/*Single trailing: the trailing single @ tells the program not to go to the next data line for the next INPUT statement in the data step*/
/*or*/
/*To hold the line pointer*/

DATA TRAILING;
  INPUT @6 TYPE $1. @;
  IF TYPE = '1' THEN INPUT AGE 1-2;
  ELSE IF TYPE = '2' THEN INPUT AGE 3-4;
  DROP TYPE;
DATALINES;
23   1
  44 2
;
PROC PRINT DATA=TRAILING;
RUN;

DATA MIXED;
  INPUT @20 TYPE $1. @;
  IF TYPE = '1' THEN INPUT ID 1-3 AGE 4-5 WEIGHT 6-8;
  ELSE IF TYPE = '2' THEN INPUT ID 1-3 AGE 10-11 WEIGHT 15-17;
DATALINES;
00134168           1
00245155           1
003      23   220  2
00467180           1
005      35   190  2
;

PROC PRINT;
RUN;

data med;
  input pid visit drug $ @;
  output;
  input visit drug $ @;
  output;
  input visit drug $ @;
  output;
  input visit drug $ @;
  output;
cards;
100 1 5mg 2 5mg 3 10mg 4 15mg
101 1 10mg 2 10mg 3 15mg 4 15mg
;
run;
proc print;
run;
*--------------------------------------;

*----------- EXAMPLE 11.1 --------;
DATA LONGWAY;
  INPUT X Y;
DATALINES;
1 2
3 4
5 6
6 9
10 12
13 14
;

PROC PRINT;
  TITLE 'Example 11.1';
RUN;
*----------------------------------;

*----------------------- EXAMPLE 11.2 --------------------;
/*double trailing: line contains*/
/*more than one observation.*/

DATA SHORTWAY;
  INPUT X Y @@;
DATALINES;
1 2 3 4 5 6
6 9 10 12 13 14
;

PROC PRINT;
RUN;

Single trailing:

1. Use a trailing @ at the end of the INPUT statement to hold the record in the input buffer for the execution of the next INPUT statement.

2. Use an IF statement on the portion that is read in to test for a condition.

3. If the condition is met, use another INPUT statement to read the remainder of the record to create an observation.

4. If the condition is not met, the record is released and control passes back to the top of the DATA step.

5. The same technique allows reading parts of the data more than once.

data red_team;
  input Team $ 13-18 @;
  if Team='red';
  input IdNumber 1-4 StartWeight 20-22 EndWeight 24-26;
datalines;
1023 David  red    189 165
1049 Amelia yellow 145 124
1219 Alan   red    210 192
1246 Ravi   yellow 194 177
1078 Ashley red    127 118
1221 Jim    yellow 220   .
;

proc print data=red_team;
run;

data value    informat     format
28/02/2003    ddmmyy10.    ddmmyy10. or ddmmyys10.
28-02-2003    ddmmyy10.    ddmmyyd10.
28:02:2003    ddmmyy10.    ddmmyyc10.
28.02.2003    ddmmyy10.    ddmmyyp10.
28 02 2003    ddmmyy10.    ddmmyyb10.

Eg:2

data value    informat     format
28/02/03      ddmmyy8.     ddmmyy8. or ddmmyys8.
28-02-03      ddmmyy8.     ddmmyyd8.
28:02:03      ddmmyy8.     ddmmyyc8.
28.02.03      ddmmyy8.     ddmmyyp8.
28 02 03      ddmmyy8.     ddmmyyb8.

Eg:

data value    informat     format
02/28/2003    mmddyy10.    ddmmyy10.

invalid formats

SAS has no informat or format for these orderings: ddyymm10., yyddmm10., mm/yy/dd. (The ordering yy/mm/dd is valid; it is read and written with yymmdd10.)

date value    informat     format

eg:

23oct2003     date9.       date9.
23oct03       date7.       date7.
dec2003       monyy7.      monyy7.
dec03         monyy5.      monyy5.
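The tables above pair each informat (how a value is read) with a format (how it is displayed). A minimal sketch, assuming nothing beyond Base SAS, shows why both are needed: SAS stores a date as a plain number (days counted from 1 January 1960), and the format only changes how that number is written to the log.

```sas
data _null_;
  d = '23oct2003'd;               /* date literal, stored as a number */
  put 'stored value: ' d;         /* no format: the raw number        */
  put 'date9.      : ' d date9.;
  put 'ddmmyy10.   : ' d ddmmyy10.;
  put 'monyy7.     : ' d monyy7.;
run;
```

The variable name d is illustrative; any numeric variable that holds a SAS date behaves the same way.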

/* the values are in 01oct2010 style, so they are read with date9. */
data x;
  input a :date9.;
  format a ddmmyy10.;
cards;
01oct2010
03feb2011
04jan2000
;
run;
proc print;
run;

julian date   informat    format
2003032       julian7.    julian7.
2003 ==> year
032  ==> day number within the year

data x;
  input a :julian7.;
  format a julian7.;
cards;
2003032
2004058
2005098
;
run;
proc print;
run;

old date format:

worddate writes the date with the month name, and weekdate also writes the day of the week. Typical widths are worddate15. to worddate18. and weekdate24. to weekdate30.; when the width is too small for the full text, SAS abbreviates the names.

data x;
  input a :ddmmyy10.;
  format a worddate18.;
cards;
01-02-2003
02-04-2010
05-08-2011
;
run;
proc print;
run;

data x;
  input a :ddmmyy10.;
  format a weekdate30.;
cards;
01-02-2003
02-04-2010
05-08-2011
;
run;
proc print;
run;

/* the colon modifier reads each value up to the next blank */
data x;
  input eid sal :comma6. pf :dollar11.;
  format sal comma6. pf dollar11.;
cards;
100 23,000 $1,345,000
101 34,000 $1,234,678
;
run;
proc print;
run;

time

data value              informat      format
10:12:23AM              time10.       timeampm12.
02:23:52PM              time10.       timeampm12.
22:10:20                time8.        time8.
10:12:20                time8.        time8.
12oct2003:02:23:30pm    datetime20.   dateampm22.
12oct2003:15:23:30      datetime18.   datetime18.

data x;
  input id stime :time10. etime :time8.;
  format stime timeampm12. etime time8.;
cards;
200 10:23:34am 15:34:19
201 02:23:34pm 18:23:34
;
run;
proc print;
run;

27th August 2011

Proc sort

Used to sort the observations of a data set. By default it sorts in ascending order; for descending order, specify the DESCENDING option before the variable in the BY statement.

/*duplicate data value*/
The same data value is repeated in a variable. This is called a duplicate data value (DDV). Duplicate data values are found based on the required (BY) variable.
/*duplicate observation*/
The same observation is repeated in the data. Duplicate observations are found based on all variables in the data.
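A minimal sketch of the two ideas above, using a small hypothetical data set emp (the data set names emp, by_sal and one_per_id are illustrative assumptions): the first step sorts in descending order, the second keeps one observation per BY value with NODUPKEY.

```sas
data emp;
  input id sal;
datalines;
100 2000
101 4000
100 2000
100 7000
;
run;

/* descending order: DESCENDING goes before the BY variable */
proc sort data=emp out=by_sal;
  by descending sal;
run;

/* duplicate data values: keep the first observation for each id */
proc sort data=emp out=one_per_id nodupkey;
  by id;
run;

proc print data=by_sal;
run;
proc print data=one_per_id;
run;
```

With this input, one_per_id holds two observations (ids 100 and 101), because NODUPKEY compares only the BY variable.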

/*nodupkey and noduprecs*/
NODUPKEY removes observations that repeat the BY values. NODUPRECS removes an observation when it is identical to the previous observation in the sorted output. When the BY statement lists all the variables, both options give the same result.

data x;
  input id sal age;
cards;
100 2000 24
100 7000 26
100 2000 24
100 2000 24
100 3000 25
101 4000 56
101 4000 56
101 4000 56
99 5999 67
88 8999 90
100 7000 26
99 7890 57
88 4000 28
77 5000 89
88 4999 45
44 8999 80
44 7000 68
44 8908 56
56 7890 89
56 8908 45
56 8908 45
56 8908 45
56 8908 45
78 4389 65
45 5678 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 1000 90
;
run;

proc sort data=x out=x1 noduprecs;
  by id;
run;

proc sort data=x out=x1 noduprecs;
  by id sal age;
run;

The PROC PRINT examples below use the same data set x.

/*Double*/
We can leave a blank line between observations.

proc print data=x double;
run;

/*noobs*/
Using the noobs option, we can remove the Obs column from the output.

proc print data=x noobs;
run;

/*heading:*/
Using this option, we can change the column heading direction (horizontal/vertical) for reporting.

proc print data=x heading=vertical;
run;

/*Width*/
Using this option, we can control the column width and so the gap between the columns (width=minimum/full).

proc print data=x width=full;
run;

/*var*/
Using this statement, we can report the required variables in a specific order.

proc print data=x;
  var sal id;
run;

/*sum*/
Using this statement, we can print a column-wise total for reporting.

data x;
  input id sal age;
cards;
100 2000 24
100 7000 26
100 2000 24
77 5000 89
88 4999 45
44 8999 80
;
run;
proc print data=x;
  sum sal;
run;

/*id*/
Using this statement, we can replace the Obs column.

data x;
  input id sal age;
cards;
100 2000 24
100 7000 26
100 2000 24
100 2000 24
100 3000 25
101 4000 56
101 4000 56
101 4000 56
99 5999 67
88 8999 90
100 7000 26
99 7890 57
88 4000 28
77 5000 89
88 4999 45
44 8999 80
44 7000 68
44 8908 56
56 7890 89
56 8908 45
56 8908 45
56 8908 45
56 8908 45
78 4389 65
45 5678 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 1000 90
;
run;
proc print data=x;
  id id;
  sum sal;
run;

/*label statement*/
Using this statement, we can change the variable names shown in the report. A LABEL statement written in a procedure block gives temporary labels; a LABEL statement written in a DATA step gives permanent labels.

/* temporary labels, using the data set x created for the id example */
proc print data=x label;
  label id='Empid' sal='Empsal';
run;

/*Permanent labels*/
If we write the LABEL statement in the DATA step, the labels are permanent.

data x;
  input id sal age;
  label id='Empid'
        sal='Empsal'
        age='Empage';
cards;
100 2000 24
100 7000 26
100 2000 24
100 2000 24
100 3000 25
101 4000 56
101 4000 56
101 4000 56
99 5999 67
88 8999 90
100 7000 26
99 7890 57
88 4000 28
77 5000 89
;
run;
proc print data=x label;
run;

data x;
  input id sal age;
cards;
100 2000 24
100 7000 26
100 2000 24
100 2000 24
100 3000 25
101 4000 56
101 4000 56
101 4000 56
99 5999 67
88 8999 90
100 7000 26
99 7890 57
88 4000 28
77 5000 89
88 4999 45
44 8999 80
44 7000 68
44 8908 56
56 7890 89
56 8908 45
56 8908 45
56 8908 45
56 8908 45
78 4389 65
45 5678 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 6785 78
45 1000 90
;
run;
proc sort data=x out=x1;
  by id;
run;

/* BY groups with a subtotal of sal for each id */
proc print data=x1;
  by id;
  /*id id;*/
  sum sal;
run;

/* the same report with id as both BY and ID variable */
proc print data=x1;
  by id;
  id id;
  sum sal;
run;

/* bank: a data set with an id variable; counts the observations per id */
proc sort data=bank out=x1;
  by id;
run;
data tr(keep=id ct);
  set x1;
  by id;
  if first.id then ct=0;
  ct+1;
  if last.id;
run;
proc print;
run;
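The step above needs a data set named bank, which these notes do not define. As a self-contained sketch, the hypothetical bank data below (variables id and amount are assumptions made for illustration) shows how FIRST.id and LAST.id build one summary row per id.

```sas
/* Hypothetical stand-in for the bank data set. */
data bank;
  input id amount;
datalines;
101 500
102 200
101 300
103 900
101 100
102 400
;
run;

proc sort data=bank out=bank_sorted;
  by id;
run;

data counts(keep=id ct total);
  set bank_sorted;
  by id;
  if first.id then do;   /* reset the counters at the start of each id */
    ct = 0;
    total = 0;
  end;
  ct + 1;                /* sum statements keep their values across rows */
  total + amount;
  if last.id;            /* write one row per id */
run;

proc print data=counts noobs;
run;
```

For this input the result has three rows: id 101 with ct 3 and total 900, id 102 with ct 2 and total 600, and id 103 with ct 1 and total 900.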

/*we can hold values across observations*/
or
/*we can do cumulative totals.*/
data x(keep=a ct);
  input a b;
  /*retain ct 0;*/
  /*ct=ct+a;*/
  ct+a;
cards;
10 20
20 80
30 90
40 80
. 10
20 10
;
run;
proc print data=x;
run;
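The commented-out lines in the step above hint at the longer form of the sum statement. A minimal sketch, with the hypothetical names running and total, spells it out: RETAIN holds the value across observations, and the SUM function skips missing values the way the sum statement does.

```sas
data running;
  input a;
  retain total 0;          /* keep total from one observation to the next */
  total = sum(total, a);   /* SUM ignores a missing a                     */
datalines;
10
20
.
40
;
run;

proc print data=running;
run;
```

Here total is 10, 30, 30 and 70. Writing total = total + a instead would turn total into a missing value from the third observation onwards, which is why the sum statement (total + a;) is the usual shorthand.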

/* SUM function ignores missing values; the + operator returns missing */
data x;
  input a b;
  k=sum(a,b);
  k1=a+b;
cards;
10 20
20 .
30 90
;
run;
proc print data=x;
run;

data x;
  input id sal;
cards;
10 2000
11 3000
12 4000
;
run;
data x1;
  input id sal;
cards;
14 8000
15 9000
11 8000
;
run;
data c;
  set x x1;
run;
data c1;
  set c;
  if sal > 3000;
run;
proc print;
run;


DATA MANAGEMENT

Where:

data company;
  input company $ @8 area $9. invest strengh;
cards;
satyam vizag     2000000 567
tcs    hyderabad 5000000 956
cts    hyderabad 6000000 345
tcs    madras    5000000 678
wipro  hyderabad 7000000 789
wipro  bangalore 6700000 456
satyam bangalore 6700000 453
wipro  pune      6700000 789
hsbc   hyderabad 8900000 673
icici  bangalore 8000000 893
hdfc   pune      9000000 796
;
run;
proc print data=company;
run;

data satyam;
  set company;
  where company in('hdfc','tcs');
run;
proc print;
run;

proc print data=company;
  where company='satyam' and invest ge 6000000;
run;

proc print data=company;
  where company eq 'satyam' or company='tcs';
run;

proc print data=company;
  where invest between 6000000 and 8000000;
run;

/* AND is needed here: with OR the condition is true for every observation */
proc print data=company;
  where company ne 'satyam' and company ne 'tcs';
run;

data x;
  input id sal;
cards;
100 3000
101 4000
103 5000
102 6000
108 9000
105 6000
;
run;
data x1;
  input id loc $;
cards;
100 hyd
104 sec
110 ban
111 mum
108 hyd
102 ban
;
run;
proc sort data=x;
  by id;
run;
proc sort data=x1;
  by id;
run;

data lm;
  merge x(in=a) x1(in=b);
  by id;
  if a or b;
run;
proc print;
run;

/*Left join*/
data lm;
  merge x(in=a) x1(in=b);
  by id;
  if a;
run;
proc print;
run;

/*Right join*/
data rm;
  merge x(in=a) x1(in=b);
  by id;
  if b;
run;
proc print;
run;

/*inner join*/
data fm;
  merge x(in=a) x1(in=b);
  by id;
  if a and b;
run;
proc print;
run;

/*Full join*/
data lm;
  merge x(in=a) x1(in=b);
  by id;
  if a or b;
run;
proc print;
run;
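The same joins can also be written in PROC SQL, which these notes already use for creating tables. The sketch below is an illustration rather than part of the original examples: the data set names pay and place and the output table names are assumptions, and the data is a shortened copy of the id/sal and id/loc values above.

```sas
data pay;
  input id sal;
datalines;
100 3000
101 4000
102 6000
108 9000
;
run;

data place;
  input id loc $;
datalines;
100 hyd
102 ban
104 sec
108 hyd
;
run;

proc sql;
  /* inner join: only ids present in both tables */
  create table inner_sql as
    select p.id, p.sal, l.loc
    from pay as p inner join place as l
      on p.id = l.id
    order by p.id;

  /* full join: every id from either table */
  create table full_sql as
    select coalesce(p.id, l.id) as id, p.sal, l.loc
    from pay as p full join place as l
      on p.id = l.id;
quit;
```

Unlike MERGE with a BY statement, PROC SQL does not need the inputs to be sorted first. COALESCE picks the id from whichever table has it, which mirrors what the BY variable does in the DATA step merge.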

data management:

It is one type of data reading concept. Using this concept we accept data from different SAS files and arrange the data in a specific order.

The data management concept works in 2 ways:
1. Adding
2. Combining

Adding can be done in 2 ways:
1. Appending
2. Concatenation

Appending: adding the data of one or more data sets to an existing data set is called appending.
Append procedure: using this procedure we can do appending and concatenation.

/* concatenation with the SET statement */
data x;
  input pid age gender $;
cards;
100 34 female
101 56 male
;
run;
data x1;
  input pid age gender $;
cards;
200 56 male
201 34 female
;
run;
data x2;
  input pid age gender $;
cards;
300 58 male
301 39 female
;
run;
data x3;
  set x x1 x2;
run;
proc print;
run;

/*one to one merge*/
data demo;
  input id age sex $ weight;
cards;
100 45 female 67
101 34 male 34
102 23 female 45
103 34 male 67
;
run;
data medi;
  input id drug $ sdate :date9.;
cards;
102 col5mg 12oct2003
100 col10mg 15nov2003
101 col15mg 14dec2003
;
run;
proc sort data=demo;
  by id;
run;
proc sort data=medi;
  by id;
run;
data onetoone;
  merge demo medi;
  by id;
run;
proc print;
run;

/*one to many*/
/* uses the data set demo created above */
data lab;
  input id test $ units;
cards;
100 hr 79
101 hr 78
102 hr 75
103 hr 76
100 sbp 178
102 sbp 178
103 sbp 145
100 dbp 89
101 dbp 88
102 dbp 85
103 dbp 89
;
run;
proc sort data=demo;
  by id;
run;
proc sort data=lab;
  by id;
run;
data onemany;
  merge demo lab;
  by id;
run;
proc print;
run;

/*many to one*/
/* same demo and lab data sets, listed in the opposite order */
data onemany;
  merge lab demo;
  by id;
run;
proc print;
run;

/*To run the merge concept based on matching*/
/*and non-matching observations.*/

data medi;
  input stno drug $ sdate edate;
  informat sdate edate date9.;
  format sdate edate date9.;
cards;
210 col5mg 12jan2003 19jan2003
190 col10mg 14jan2003 20jan2003
203 col5mg 15jan2003 20jan2003
211 col10mg 15feb2003 18feb2003
178 col5mg 14mar2003 21mar2003
;
run;
data exadvent;
  input stno exad $ sdate :date9.;
  format sdate date9.;
cards;
203 eardis 19jan2003
211 eyedis 16feb2003
;
run;
proc sort data=medi;
  by stno;
run;
proc sort data=exadvent;
  by stno;
run;

/*To report who got expected adverse events and their medicine information.*/

data exadmed;
  merge exadvent(in=var) medi;
  by stno;
  if var=1;
run;
proc print data=exadmed;
run;

proc append:

data demo1;
  input pid age sex $;
cards;
100 34 female
101 56 male
;
run;
data demo2;
  input pid age sex $;
cards;
200 56 male
201 34 female
;
run;
proc append base=demo1 data=demo2;
run;

data demo1;
  input pid age sex $ loc $;
cards;
100 34 female hyd
101 56 male sec
;
run;
data demo2;
  input pid age sex $ grade;
cards;
200 56 male 89
201 34 female 90
;
run;

Force: if the transaction data set contains additional variables that are not in the base data set, PROC APPEND does not run. To run it we must use the FORCE option; the additional variables are dropped.

proc append base=demo1 data=demo2 force;
run;
proc print;
run;

Update:

UPDATE replaces master data values with transaction data values based on a matching (BY) variable.

data oldemp1;
  input eid sal;
cards;
100 4000
102 2000
103 3000
;
run;
data newemp2;
  input eid sal;
cards;
103 6000
100 7000
;
run;
proc sort data=oldemp1;
  by eid;
run;
proc sort data=newemp2;
  by eid;
run;
data emp3;
  update oldemp1 newemp2;
  by eid;
run;
proc print;
run;

When the data sets have non-matching observations, UPDATE adds the non-matching transaction observations to the output, as in appending. By default a missing value in the transaction data set does not replace the master value.

data oldemp1;
  input eid sal;
cards;
100 4000
102 2000
103 3000
104 9000
;
run;
data newemp2;
  input eid sal;
cards;
103 6000
100 7000
104 .
110 8000
120 9000
;
run;
proc sort data=oldemp1;
  by eid;
run;
proc sort data=newemp2;
  by eid;
run;
data emp3;
  update oldemp1 newemp2;
  by eid;
run;
proc print;
run;
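The transaction data above contains a missing sal for eid 104. As a sketch of how that case can be controlled, the example below compares the default behaviour with the UPDATEMODE=NOMISSINGCHECK option of the UPDATE statement. The data set names master, trans, keep_old and take_missing are hypothetical, and both inputs are typed in eid order so no sort is needed.

```sas
data master;
  input eid sal;
datalines;
100 4000
104 9000
;
run;

data trans;
  input eid sal;
datalines;
100 7000
104 .
;
run;

/* default: a missing transaction value leaves the master value alone */
data keep_old;
  update master trans;
  by eid;
run;

/* NOMISSINGCHECK: the missing value overwrites the master value */
data take_missing;
  update master trans updatemode=nomissingcheck;
  by eid;
run;

proc print data=keep_old;
run;
proc print data=take_missing;
run;
```

In keep_old, eid 104 keeps sal 9000; in take_missing it becomes missing. Which behaviour is wanted depends on whether a blank in the transaction file means "no change" or "erase the value".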


27/08/11

/* x must contain a numeric variable age */
data x2;
  set x;
  length group $11;   /* long enough for the longest value, "young adult" */
  select;
    when (age <= 10) group = "child";
    when (11 <= age <= 19) group = "teenager";
    when (20 <= age <= 29) group = "young adult";
    when (30 <= age <= 45) group = "adult";
    when (46 <= age <= 59) group = "middle age";
    otherwise group = "senior";
  end;
run;
proc print;
run;

data x;
  input age sal;
cards;
1 2000
2 3000
3 7000
4 8000
;
run;
data x1;
  set x;
  select (Age);
    when (1) Limit = 110;
    when (2) Limit = 120;
    when (3) Limit = 130;
    otherwise limit = 200;
  end;
run;
proc print;
run;

data x2;
  set x1;
  if limit=110 or limit=120 then group='senior';
  else group='junior';
run;
proc print;
run;

data x3;
  set x1;
  if limit in(110,120) then group='senior';
  else group='junior';
run;
proc print;
run;

data x;
input age sal;
cards;
1 2000
2 3000
3 7000
4 8000
;
run;

data x4;
set x;
where age=4;
run;
proc print; run;

data x4(where=(age=2));
set x;
/*where age=4 ;*/
run;
proc print; run;

proc print data=x(where=(age=3));
run;

proc print data=x;
where age=1;
run;

Do :

data x;
input age sal;
cards;
1 2000
2 3000
3 7000
4 8000
5 8000
9 2000
10 5000
11 8000
12 9000
;
run;

data x2;
set x;
if age in (1,4,5) then do;
bonus=sal*0.2;
hike=sal+bonus;
end;
else do;
bonus=sal*0.5;
hike=sal+bonus;
end;
run;
proc print; run;

Loop concept:

A DO loop runs a group of statements multiple times. An iterative loop needs three things:
1. a loop variable with a starting value
2. a stopping condition
3. an increment

Output:

The OUTPUT statement writes the current observation to the data set being created.

data x;
do i=1 to 10;
end;
run;
proc print; run;

data x1;
do i=1 to 10;
output;
end;
run;
proc print; run;

data x2;
do i=1 to 10;
k=i+1;
output;
end;
run;
proc print; run;

data x3;
do i=1 to 10;
output;
k=i+1;
end;
run;
proc print; run;

data x4;
do i=1 to 10;
output;
i=i+1;
end;
run;
proc print; run;

data x5;
do i=1 to 10;
i=i+1;
output;
end;
run;
proc print; run;
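Where OUTPUT sits inside the loop decides which values are written. The sketch below repeats three of the patterns above with the rows each one should produce noted in comments; the data set names are hypothetical and chosen only for this illustration.

```sas
/* No OUTPUT in the loop: only the implicit output at the end of the step
   runs, so one observation is written, with i=11 (the value after the loop). */
data no_output;
   do i=1 to 10;
   end;
run;

/* OUTPUT before the extra increment: writes i = 1, 3, 5, 7, 9. */
data out_first;
   do i=1 to 10;
      output;
      i=i+1;
   end;
run;

/* Extra increment before OUTPUT: writes i = 2, 4, 6, 8, 10. */
data out_last;
   do i=1 to 10;
      i=i+1;
      output;
   end;
run;

proc print data=no_output; run;
proc print data=out_first; run;
proc print data=out_last; run;
```

Once a step contains an explicit OUTPUT statement, SAS no longer writes the automatic observation at the end of the step.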

data one;
x=10;
output;
x=5;
output;
run;

data one two;
x=10;
output one;
x=5;
output two;
run;
proc print; run;

data demo;
i=100;
do while(i<=120);
cid=i;
output;
i=i+1;
end;
drop i;
run;
proc print; run;

data compound;
amount=5000;
rate=.075;
yearly=amount*rate;
qty+((qty+amount)*rate/4);
output;
qty+((qty+amount)*rate/4);
output;
qty+((qty+amount)*rate/4);
output;
qty+((qty+amount)*rate/4);
output;
run;
proc print; run;

data comp;
amount=5000;
rate=.075;
yearly=amount*rate;
do qtr=1 to 4;
quarterly+(quarterly+amount)*rate/4;
end;
run;
proc print; run;

data invest;
do year=2001 to 2003;
capital+5000;
capital+(capital*.075);
output;
end;
run;
proc print; run;

data xxx;
do month='jan','feb','mar';
output;
end;
run;
proc print; run;

data xxx;
do month=1,2,3,4,5,6;
output;
end;
run;
proc print; run;

data x;
do i=1 to 20 by 2;
output;
end;
run;
proc print; run;

data x;
do i=14 to 2 by -2;
output;
end;
run;
proc print; run;

data inv;
do until(capital>1000000);
year+1;
capital+5000;
capital+(capital*.075);
output;
end;
run;
proc print; run;

data inv1;
do while(capital<=1000000);
year+1;
capital+5000;
capital+(capital*.075);
output;
end;
run;
proc print; run;
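The two investment loops above give the same result, but DO UNTIL and DO WHILE differ in when the condition is tested. A small sketch, with made-up data set names and a starting value chosen so that the difference shows:

```sas
/* DO UNTIL tests the condition at the bottom of the loop,
   so the body always runs at least once: one observation, n=6. */
data until_demo;
   n=5;
   do until(n>=5);
      n=n+1;
      output;
   end;
run;

/* DO WHILE tests the condition at the top of the loop,
   so the body never runs here: zero observations. */
data while_demo;
   n=5;
   do while(n<5);
      n=n+1;
      output;
   end;
run;

proc print data=until_demo; run;
proc print data=while_demo; run;
```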

30/08/11

Functions: A function takes one or more arguments (variables or data values), performs an operation, and returns a result. The result can be stored in another variable.

There are 3 types:
1. Numeric
2. Character
3. Date and time

INT: returns the integer part of a value.
ROUND: rounds a value to the nearest integer, or to the nearest multiple of a rounding unit given as a second argument.
ABS: returns the absolute value, so negative values become positive.
MOD: returns the remainder of a division.

data x;
input id age weight;
cards;
100 23 78.76
101 24 56.45
102 34 -67.33
104 35 78.55
106 23 56.44
106 23 67.74
;
run;

data x1;
set x;
wint=int(weight);
wround=round(weight);
wabs=abs(weight);
run;
proc print; run;

data one;
x=mod(20,5);
run;
proc print; run;

/* keep the even-numbered observations */
data x;
set sashelp.class;
i=mod(_n_,2);
if i=0;
run;
proc print; run;

/* keep the odd-numbered observations */
data x;
set sashelp.class;
i=mod(_n_,2);
if i=1;
run;
proc print; run;
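A few single-value calls make the behaviour of these functions easier to check by hand. This is only a sketch; the data set name num_demo and the variable names are assumptions made for the example.

```sas
data num_demo;
   w = -67.33;
   w_int    = int(w);            /* -67   : the fractional part is dropped    */
   w_round  = round(w);          /* -67   : nearest integer                   */
   w_round1 = round(78.76, 0.1); /* 78.8  : nearest multiple of 0.1           */
   w_abs    = abs(w);            /* 67.33 : sign removed                      */
   rem      = mod(20, 6);        /* 2     : remainder of 20 divided by 6      */
run;

proc print data=num_demo; run;
```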

SUM(): adds its arguments within each observation (a row-wise sum). Unlike the + operator, SUM ignores missing values.

data x;
input a b;
x=sum(a,b);
y=a+b;
cards;
12 34
56 78
45 67
89 90
23 .
;
run;
proc print; run;
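The last data line above (23 and a missing value) is where SUM and + part ways. A minimal sketch of just that case, using a hypothetical data set name:

```sas
data sum_demo;
   a = 23;
   b = .;
   x = sum(a, b);   /* 23 : SUM skips the missing argument            */
   y = a + b;       /* .  : arithmetic with a missing value is missing */
run;

proc print data=sum_demo; run;
```

The log for the second assignment also carries a note that missing values were generated.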


String functions:

1. LENGTH(): returns the length of a string as a numeric value. Embedded blanks are counted; trailing blanks are not.

INDEX(): returns the position of the first occurrence of a character or substring in a string, or 0 if it is not found.

SCAN(): returns the nth word of a string.

SUBSTR(): returns part of a string. It takes 3 arguments: 1. variable name 2. starting position 3. number of characters. If the third argument is omitted, the rest of the string is returned.

/* name is read from columns 1-13, so id is aligned to start in column 15 */
data x;
input name $13. id;
x=length(name);
xi=index(name,'k');
xw=index(name,'an');
xs=scan(name,2);
xst=substr(name,1,4);
xst1=substr(name,3);
cards;
shahrukh khan 100
salman khan   101
ameer khan    102
prasad babu   103
;
run;
proc print; run;
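For one of the names above, the results of these functions can be worked out by counting characters. The sketch below notes them in comments; the data set and variable names are hypothetical.

```sas
data str_demo;
   name   = 'salman khan';
   len    = length(name);       /* 11        : the embedded blank is counted */
   pos_k  = index(name, 'k');   /* 8         : first 'k' is in position 8    */
   pos_an = index(name, 'an');  /* 5         : first 'an' starts at 5        */
   word2  = scan(name, 2);      /* khan      : second blank-delimited word   */
   first4 = substr(name, 1, 4); /* salm      : 4 characters from position 1  */
   rest   = substr(name, 3);    /* lman khan : position 3 to the end         */
run;

proc print data=str_demo; run;
```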

Concatenation (combine): joins strings together. The operator is ||.

/* fname is read from columns 1-23, so lname is aligned to start in column 24 */
data x1;
length k $ 24;
input fname $23. lname $;
k=fname||lname;
cards;
ramudfghjjkkllliutreww k
prasaddfghhjjkloiyy    b
sasideqazxcvbnmkiu     r
;
run;
proc print; run;

COMPRESS(): removes specific characters from a string. It takes 2 arguments, the string and the characters to remove, and works character by character.

Note: if the 2nd argument is omitted, COMPRESS removes the blank spaces from the string.

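/* fname is read from columns 1-23 and contains an embedded blank */
data x1;
input fname $23. lname $;
x=compress(fname,'j');
xgap=compress(fname);
cards;
ramudfghj jkkllliutreww k
prasaddfg hhjjkloiyy    b
sasideqazx cvbnmkiu     r
;
run;
proc print; run;

TRANSLATE(): replaces specific characters in a string. The arguments are the source, the replacement character, and the character to be replaced, so translate(name,'a','x') changes every 'x' in name to 'a'.

data x2;
input name $ sal;
x=translate(name,'a','x');
cards;
prasad 2000
ramesh 3000
rajesh 4000
;
run;
proc print; run;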

PROPCASE(): capitalizes the first letter of each word in a string and lowercases the rest.

data x2;
input name $ sal;
xp=propcase(name);
cards;
prasad 2000
ramesh 3000
rajesh 4000
;
run;
proc print; run;

Semantic errors: calling a function with the wrong number of arguments produces an error. SAS detects it when it compiles the step, before any data is processed. This kind of error is called a semantic error.

31/08/11

CAT(): concatenates strings without removing leading or trailing blanks.

CATS(): concatenates strings and removes leading and trailing blanks.

CATT(): concatenates strings and removes only trailing blanks.

CATX(): concatenates strings, removes leading and trailing blanks, and inserts a separator between them.

data x;
x=' abc';
x1=' abc ';
y=' 123 ';
k=cat(x1,y);
ks=cats(x,y);
kt=catt(x1,y);
kx=catx(',',x1,y);
run;
proc print; run;
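The difference between the four functions comes down to which blanks survive. A sketch with the expected values written in comments; the names cat_demo, a and b are assumptions, and the LENGTH statement is added only to keep the result variables short.

```sas
data cat_demo;
   length k ks kt kx $ 20;
   a = ' abc ';             /* one leading and one trailing blank       */
   b = ' 123 ';
   k  = cat(a, b);          /* ' abc  123 ' : all blanks kept            */
   ks = cats(a, b);         /* 'abc123'     : leading and trailing gone  */
   kt = catt(a, b);         /* ' abc 123'   : only trailing blanks gone  */
   kx = catx(',', a, b);    /* 'abc,123'    : stripped, comma inserted   */
run;

proc print data=cat_demo; run;
```

Trailing blanks in the stored results are not visible in the listing, because character variables are padded with blanks to their full length anyway.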


TRANWRD(): replaces all occurrences of a word (substring) in a string. Replacing it with a blank removes it.

data x1;
name='prasad babu mr';
x=tranwrd(name,'mr','mrs');
run;
proc print; run;

QUOTE(): adds double quotation marks to a string.

data x2;
input name $;
x=quote(name);
cards;
abc
xyz
;
run;
proc print; run;

DEQUOTE(): removes matching quotation marks from a string.

data x2;
input name $;
x=quote(name);
y=dequote(x);
cards;
abc
xyz
;
run;
proc print; run;

Date/time formats and functions

data demo;
input id svdate:date9. svtime:time8.;
format svdate date9. svtime time8.;
cards;
100 12jan2010 12:23:34
101 13feb2003 13:23:34
102 14feb2009 11:23:34
103 15mar2011 10:23:23
;
run;

data x;
set demo;
svday=day(svdate);
svmonth=month(svdate);
svyear=year(svdate);
svhour=hour(svtime);
scmin=minute(svtime);
svsec=second(svtime);
run;
proc print; run;

DATEPART(): extracts the date value from a datetime variable.

TIMEPART(): extracts the time value from a datetime variable.

data x;
input id svdtime:datetime18.;
format svdtime datetime18.;
cards;
100 12jan2003:12:23:24
101 13feb2005:13:23:34
;
run;

data x1;
format xdate date9. xtime time8.;
set x;
xdate=datepart(svdtime);
xtime=timepart(svdtime);
run;
proc print; run;

INTCK(): returns the difference between two date values as a count of intervals (day, month, or year). By default it counts the interval boundaries crossed between the two dates.

data x;
input id (sdate edate) (:date9.);
format sdate edate date9.;
days=intck('day',sdate,edate);
months=intck('month',sdate,edate);
year=intck('year',sdate,edate);
cards;
100 12jan2003 14dec2003
101 15jan2003 14oct2004
;
run;
proc print; run;
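Because INTCK counts boundaries rather than elapsed time by default, two dates one day apart can be a whole "year" apart. A short sketch; the data set and variable names are made up for the example.

```sas
data intck_demo;
   d1 = '31dec2003'd;
   d2 = '01jan2004'd;
   format d1 d2 date9.;
   days   = intck('day',   d1, d2);   /* 1 */
   months = intck('month', d1, d2);   /* 1 : one month boundary crossed */
   years  = intck('year',  d1, d2);   /* 1 : one year boundary crossed  */
   /* Nearly a full year apart, but no 1 January lies between the dates. */
   same_year = intck('year', '12jan2003'd, '14dec2003'd);   /* 0 */
run;

proc print data=intck_demo; run;
```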

INTNX(): increments a date by a given number of intervals.

data x;
input id (sdate edate) (:date9.);
format sdate edate newdate newmonth newyear date9.;
newdate=intnx('day',sdate,10);
newmonth=intnx('month',sdate,2);
newyear=intnx('year',sdate,2);
cards;
100 12jan2003 14dec2003
101 15jan2003 14oct2004
;
run;
proc print; run;

INTNX has three required arguments and one optional argument, commonly used as follows for SAS date values.

INTNX(interval, start-from, increment <,alignment>);

interval is the unit of measure (days, weeks, months, quarters, years, etc.) by which start-from is incremented.
start-from is a SAS date value to be incremented.
increment is the integer number of intervals by which start-from is incremented (negative values = earlier dates).
alignment is where the result is aligned within the interval. Possible values are "beginning", "middle", "end", and (new in Version 9) "sameday". The default value is "beginning".

data x;
input id (sdate edate) (:date9.);
format sdate edate newdate newmonth newyear date9.;
newdate=intnx('day',sdate,10,'B');
newmonth=intnx('month',sdate,2);
newyear=intnx('year',sdate,2);
cards;
100 12jan2003 14dec2003
101 15jan2003 14oct2004
;
run;
proc print; run;
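The alignment argument matters most for intervals longer than a day. The following sketch moves one date forward by two months under three alignments; the data set name and variable names are hypothetical.

```sas
data intnx_demo;
   start = '12jan2003'd;
   m_begin = intnx('month', start, 2);             /* 01MAR2003 : default 'beginning' */
   m_end   = intnx('month', start, 2, 'end');      /* 31MAR2003 : last day of month   */
   m_same  = intnx('month', start, 2, 'sameday');  /* 12MAR2003 : same day of month   */
   format start m_begin m_end m_same date9.;
run;

proc print data=intnx_demo; run;
```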


A negative increment moves the date backward:

data x;
input id (sdate edate) (:date9.);
format sdate edate newdate newmonth newyear date9.;
newdate=intnx('day',sdate,-10);
newmonth=intnx('month',sdate,-2);
newyear=intnx('year',sdate,-2);
cards;
100 12jan2003 14dec2003
101 15jan2003 14oct2004
;
run;
proc print; run;

Procedures:

Proc Means

PROC MEANS: generates summary statistics for numeric variables.

data market;
input pno area $ product $ stock sale;
cards;
101 vizag lux 10 9
109 hyd lux 16 12
101 hyd cintol 78 76
110 hyd cintol 78 76
110 hyd cintol 12 10
108 vizag rin 15 10
108 vizag lux 23 18
101 vij lux 20 19
190 hyd rin 10 09
190 hyd lux 10 10
110 vizag lifebuoy 13 11
145 vizag lifebuoy 13 11
110 vizag cintol 19 17
101 vizag lux 13 12
108 vizag lifebuoy 10 11
190 hyd lifebuoy 10 09
101 vij lifebuoy 13 12
110 vizag lifebuoy 12 14
;
run;

proc means data=market sum mean n;
run;

VAR: names the analysis variables. Analysis variables must be numeric.

proc means data=market;
var sale;
run;

proc means data=market;
var sale stock;
run;

CLASS: names the grouping variable, also called the classification variable. A class variable can be either character or numeric.

proc means data=market;
class area;
var sale stock;
run;


/*sales wise analysis*/
proc means data=market;
var sale;
run;

/*sales wise analysis based on product*/
proc means data=market;
class product;
var sale;
run;

/*sales wise analysis based on area*/
proc means data=market;
class area;
var sale;
run;

/*All types of analysis*/
PRINTALLTYPES: prints the analysis for every combination of the classification variables (overall, by area, by product, and by area and product) for the analysis variables.

proc means data=market printalltypes;
class area product;
var sale;
run;

01/09/11

data one;
input res $ @@;
cards;
y n y n n n y y n n n
;
run;

proc means data=one;
class res;
run;

proc freq data=one;
table res;
run;

data medi3;
input group $ week $ drug $ sub;
cards;
g100 week1 col5mg 90
g200 week1 col5mg 90
g300 week1 col5mg 90
g100 week2 col5mg 70
g200 week2 col10mg 80
g300 week3 col15mg 68
g200 week3 col15mg 64
g300 week4 col15mg 60
g200 week4 col15mg 60
g300 week4 col5mg 64
;
run;
proc print data=medi3; run;

proc freq data=medi3;
table group drug week;
run;

proc freq data=medi3;
table group / nopercent norow nocol;
weight sub;
run;

proc freq data=medi3;
table group*drug / nopercent norow nocol;
run;

proc freq data=medi3;
table group*drug / nopercent norow nocol;
weight sub;
run;

proc means data=medi3 sum;
class group drug;
var sub;
run;

proc freq data=medi3;
table group*week / nopercent norow nocol;
weight sub;
run;

proc sort data=medi3;
by group;
run;

proc freq data=medi3;
table week*drug / nopercent norow nocol;
by group;
run;

proc freq data=medi3;
table week*drug / nopercent norow nocol;
by group;
weight sub;
run;

data medi4;
input group $ week $ drug $ sub;
cards;
g100 week1 col5mg 90
. week1 col5mg 90
g300 week1 col5mg 90
g100 week2 col5mg 70
g200 week2 col10mg 80
. week3 col15mg 68
g200 week3 col15mg 64
g300 week4 col15mg 60
g200 week4 col15mg 60
g300 week4 col5mg 64
;
run;
proc print data=medi4; run;

proc freq data=medi4;
table group / missing out=medi5;
run;

proc format;
value $gp ' '='miss';
run;

proc freq data=medi4;
table group / missing out=medi6;
format group $gp.;
run;

data hdr;
input pid (dos1-dos3) ($);
cards;
100 y n y
101 y y y
102 y n n
103 y y n
104 n y y
105 n n y
106 y y n
107 y y y
;
run;
proc print data=hdr; run;

/*Who has taken dos1*/
proc freq data=hdr;
table dos1;
where dos1='y';
run;

/*Who has taken only dos1*/
proc freq data=hdr;
table dos1;
where dos1='y' and dos2='n' and dos3='n';
run;

/*Who has taken dos1 and dos2*/
proc freq data=hdr;
table dos1*dos2 / nopercent norow nocol;
where dos1='y' and dos2='y' and dos3='n';
run;

02/09/11

PROC FORMAT: creates user-defined informats and formats.

INVALUE statement: creates a user-defined informat, which is used when reading data.

proc format;
invalue ds 'l'=0.05
           'm'=0.1
           'h'=0.15;
run;

data medi;
input stno week $ drug $ dose:ds.;
cards;
100 week1 col l
101 week1 col l
102 week1 col m
100 week2 col h
101 week2 col m
102 week2 col h
;
run;
proc print; run;

VALUE statement: creates a user-defined format, which is used for reporting.

proc format;
value $wk 'w1'='week1'
          'w2'='week2';
value dg 1='col5mg'
         2='col10mg'
         3='col15mg';
run;

data medi1;
input stno week $ drug;
format week $wk. drug dg.;
cards;
100 w1 1
101 w1 2
100 w2 2
102 w2 3
101 w2 1
102 w2 3
;
run;
proc print data=medi1; run;

proc format;
value gen 2='Female'
          1='Male';
run;

data x4;
input name $ sal sex;
format sex gen.;
cards;
raju 2000 1
rani 3000 2
ramu 4000 1
;
run;
proc print; run;

proc format;
value $wk 'w1'='week1'
          'w2'='week2'
          'w3'='week3';
value dg 1='col5mg'
         2='col10mg'
         3='col15mg';
run;

/* Format matching is case-sensitive, so the week values are entered in lowercase to match $wk. */
data x;
input id week $ dose;
format week $wk. dose dg.;
cards;
100 w1 1
101 w1 2
100 w2 2
101 w2 1
100 w3 2
101 w3 3
;
run;
proc print; run;

proc format;
invalue sal low-1000=1000
            1000<-<3000=3000
            3000-5000=5000
            5001<-high=_same_;
run;

data emp;
input eid salary:sal.;
cards;
100 500
234 700
345 1000
456 2300
345 1690
478 4000
567 5600
678 7900
908 9800
876 12000
;
run;
proc print; run;

PICTURE statement:

creates a template for printing numeric values.

proc format;
picture rpt low-<150='999 normal stage'
            150-<180='999 control stage'
            180-high='999 uncontrol stage';
run;

data x;
input id sbp;
format sbp rpt.;
cards;
100 89
101 145
102 190
103 160
104 155
105 4
;
run;
proc print; run;

/*INPUT(source, informat.) */
/*Character to numeric*/
data x;
input id sal $;
sal_num=input(sal,best.);
cards;
100 20000
101 3000
103 40000
;
run;
proc print; run;

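PUT(source, format.) converts a value to a character string using a format.

/* This step expects medi to contain a gender variable and a gn. format to exist; neither is defined in these notes. */
/* $wk. was created with a VALUE statement, so it is a format and is applied with PUT. */
data demo2;
set medi;
g1=put(gender,gn.);
w1=put(week,$wk.);
d1=input(drug,ds.);
run;
proc print data=demo2; run;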
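A self-contained sketch of both conversions may help here. It reuses the idea of the gen. format from the earlier example; the data set conv_demo and its variables are assumptions introduced only for this illustration.

```sas
proc format;
   value gen 1='Male' 2='Female';
run;

data conv_demo;
   sex      = 1;
   sal_char = '20000';
   /* PUT: numeric to character, using a format. */
   sex_text = put(sex, gen.);        /* 'Male' */
   /* INPUT: character to numeric, using an informat. */
   sal_num  = input(sal_char, 8.);   /* 20000  */
   /* PUT with a numeric format gives a right-aligned character value. */
   sal_text = put(sal_num, 8.);
run;

proc print data=conv_demo; run;
proc contents data=conv_demo; run;
```

PROC CONTENTS confirms the types: sex_text and sal_text are character, sal_num is numeric.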


03/9/11

SQL: Structured Query Language

SQL can be used to work with any relational database. In SAS, SQL statements run inside PROC SQL and produce their results in the SAS environment. SQL is built on four parts:

1. DDL: data definition language
2. DML: data manipulation language
3. DCL: data control language
4. Query language

DDL: creates a table with variables but without observations (a null data set).

DML: inserts data into an existing table, updates data values, and deletes observations from an existing table.

DCL: controls access to and processing of the data.

Query language: retrieves data for reporting and storage.

CREATE statement: creates a table with variables and without observations.

INSERT statement: inserts data into an existing table. It works with either a VALUES clause or a SET clause. The VALUES clause assigns values by variable position; the SET clause assigns values by variable name.

SELECT statement: a query statement that retrieves data for reporting and storage.

proc sql;
create table prasad(name char, age int, sal int);
quit;

proc sql;
insert into prasad values('prasad',20,20000);
insert into prasad values('raju',30,80000);
insert into prasad set name='mahesh',age=30,sal=40000;
insert into prasad set age=35,sal=50000,name='shahrukh';
quit;

eg: several rows in one INSERT statement

proc sql;
create table prasad(name char(10), age int, sal int);
quit;

proc sql;
insert into prasad
values('prasad',20,20000)
values('raju',30,80000)
values('mahi',35,6000);
insert into prasad
set name='mahesh',age=30,sal=40000
set age=35,sal=50000,name='shahrukh'
set age=40,sal=60000,name='salmankhan';
quit;

proc sql;
create table x(Name char, Sex char, Age int, Height int, Weight int);
insert into x
select * from sashelp.class;
quit;

proc sql;
create table x1(Name char, Sex char, Age int, Height int, Weight int);
insert into x1
select * from sashelp.class
where sex="F";
quit;

/*Reporting*/
proc sql;
select * from sashelp.class
where age>=15;
quit;

/*Storage*/
proc sql;
create table x3 as
select * from sashelp.class
where age>=15;
quit;

data x;
input age sal;
cards;
10 2000
21 3000
35 7000
47 8000
20 9000
16 1000
14 7000
19 9000
55 9000
65 9000
58 9000
43 7654
34 8765
22 8765
33 9086
43 9874
24 4563
48 2342
23 7896
;
run;

data x2;
set x;
/* Without a LENGTH statement, group takes the length of the first value assigned and longer labels are truncated. */
length group $ 11;
select;
when (age <= 10) group = "child";
when (11 <= age <= 19) group = "teenager";
when (20 <= age <= 29) group = "young adult";
when (30 <= age <= 45) group = "adult";
when (46 <= age <= 59) group = "middle age";
otherwise group = "senior";
end;
run;

proc sql;
create table x5 as
select *,
case
when (age <= 10) then "child"
when (11 <= age <= 19) then "teenager"
when (20 <= age <= 29) then "young adult"
when (30 <= age <= 45) then "adult"
when (46 <= age <= 59) then "middle age"
else "senior"
end as group
from x;
quit;
proc print; run;

proc compare base=x2 compare=x5;
var group;
with group;
run;

UPDATE statement:

modifies the data values of an existing variable.

Syntax: update <table name> set <variable name>=<expression>

ALTER statement:

modifies the structure of an existing table.

Modifications:
1. adding a new variable
2. modifying or dropping a column
3. assigning constraints
4. deleting constraints

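proc sql;
alter table emp add anualsal num;
quit;

proc sql;
update emp set anualsal=salary*12;
quit;

Several columns can be added in one ALTER statement. Run this form instead of the one above, not after it, because anualsal can be added only once.

proc sql;
alter table emp add anualsal num, bonus num, netsal num;
update emp set anualsal=salary*12;
update emp set bonus=salary*0.5;
update emp set netsal=anualsal+bonus;
quit;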
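The examples above only add columns. Modifying and dropping a column, and updating only some rows, could look like the sketch below. The table emp_demo and its values are made up for this illustration so that the emp table used in the notes is left untouched.

```sas
/* Hypothetical table for the illustration. */
data emp_demo;
   input eid salary;
   datalines;
100 2000
101 3000
;
run;

proc sql;
   /* Add two numeric columns in one statement. */
   alter table emp_demo add bonus num, netsal num;

   /* Update every row. */
   update emp_demo set bonus = salary*0.5;

   /* Update only the rows that satisfy the WHERE clause. */
   update emp_demo set netsal = salary + bonus
      where salary >= 3000;

   /* Modify a column attribute, here its display format. */
   alter table emp_demo modify netsal format=comma10.;

   /* Drop a column that is no longer needed. */
   alter table emp_demo drop bonus;
quit;

proc print data=emp_demo; run;
```

After the step, netsal is filled only for eid 101, and bonus no longer exists in the table.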



proc sql;
create table x2 as
select * from sashelp.shoes;
quit;

ORDER BY: reports the data in ascending or descending order. The default is ascending order.

For descending order, put the DESCENDING or DESC option after the variable name in the ORDER BY clause.

proc sql;
create table x3 as
select * from sashelp.class
order by age;
quit;

proc sql;
create table x3 as
select * from sashelp.class
order by age desc;
quit;

WHERE clause:

creates a subset of the data for reporting and storage.

/*Reporting*/
proc sql;
select * from sashelp.class
where age>=15;
quit;

/*Storage*/
proc sql;
create table x3 as
select * from sashelp.class
where age>=15;
quit;

/*To store the age>=15 subset, in descending order of the weight variable*/
proc sql;
create table x3 as
select * from sashelp.class
where age>=15
order by weight desc;
quit;

proc sql;
create table x1 as
select name, age from sashelp.class;
quit;
proc print; run;

data emp;
input eid salary sale;
cards;
100 2000 500
101 3000 300
102 4000 600
104 4000 900
105 5000 400
106 6000 500
107 7000 400
108 8000 400
109 6000 300
;
run;

data x4;
set emp;
if sale < 500 then nsal=2000;
else if 501<sale<700 then nsal=4000;
else nsal=5000;
run;
proc print; run;

proc sql;
create table x4 as
select *,
case
when sale lt 500 then 2000
when sale gt 501 and sale lt 700 then 4000
else 5000
end as nsal
from emp;
quit;

data x;
input eid salary sale;
cards;
100 2000 500
101 3000 300
102 4000 600
104 4000 900
105 5000 400
106 6000 500
107 7000 400
108 8000 400
109 6000 300
;
run;

proc sql;
create table x4 as
select *,
salary+case
when sale ge 500 then 2000
else 100
end as nsal
from x;
quit;

proc sql;
create table x6 as
select *,
case
when sale ge 500 then 2000
else 100
end as nsal
from x;
quit;

proc sql;
create table x7 as
select *, salary+nsal as net
from x6;
quit;

proc compare base=x4 compare=x7;
var nsal;
with net;
run;

data x;
input eid salary sale;
cards;
100 2000 500
101 3000 300
102 4000 600
104 4000 900
105 5000 400
106 6000 500
107 7000 400
108 8000 400
109 6000 300
;
run;

proc sql;
alter table x add nsal num;
quit;

proc sql;
update x set nsal=salary+case
when sale ge 500 then 2000
when sale ge 400 and sale lt 500 then 1500
else 100
end;
quit;

data med;
input pid sbp;
cards;
100 156
101 176
102 140
103 180
104 145
105 167
;
run;

proc sql;
create table case2 as
select *,
case
when sbp>=170 then '15mg'
when sbp>=150 and sbp<170 then '10mg'
else '5mg'
end as drug,
case
when sbp>=170 then 3
when sbp<170 then 2
else 3
end as dailydose
from med;
quit;

Set operators:

Using set operators, we can combine the results of two queries (tables) for reporting.

Types:

1. union all
2. union
3. intersect
4. except

1. union all: appends the result of the second query to the first in sequential order, that is, it produces all rows from both queries, duplicates included.

OUTER UNION concatenates the query results.

UNION produces all unique rows from both queries.

data lab1;
input stno test$ units;
date='12jan2003'd;
format date date9.;
cards;
100 hr 78
101 hr 90
100 dbp 89
100 sbp 156
101 dbp 90
101 sbp 178
102 hr 78
run;
proc print data=lab1; run;

data lab2;
input stno test$ units;
date='13jan2003'd;
format date date9.;
cards;
102 hr 78
103 hr 90
103 dbp 89
103 sbp 156
102 dbp 90
102 sbp 178
102 sbp 178
102 sbp 178
run;

proc sql;
create table unall as
select * from lab1
union all
select * from lab2;
quit;

/* Data step equivalent of union all */
data u;
set lab1 lab2;
run;
proc print; run;

proc sql;
create table un as
select * from lab1
union
select * from lab2;
quit;
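OUTER UNION is described above but not demonstrated. A minimal sketch, assuming two small hypothetical tables (visit1 and visit2) that share only the stno column: without CORR the columns of the second query are placed alongside those of the first, while CORR overlays the columns that have the same name.

```sas
/* Hypothetical tables with partly different columns */
data visit1;
input stno test $;
cards;
100 hr
101 dbp
;
run;
data visit2;
input stno units;
cards;
102 78
103 90
;
run;

proc sql;
/* columns are not overlaid: stno appears twice in the output */
select * from visit1
outer union
select * from visit2;

/* CORR overlays the same-named column stno */
select * from visit1
outer union corr
select * from visit2;
quit;
```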

INTERSECT produces rows that are common to both query results.

data ex;
input stno ad $ date:date9.;
format date date9.;
cards;
100 eyedis 12jan2003
105 eardis 12jan2003
102 eyedis 12jan2003
run;
data unex;
input stno ad $ date:date9.;
format date date8.;
cards;
103 nervous 12jan2003
104 coma 12jan2003
105 eardis 12jan2003
run;

proc sql;
select * from ex
intersect
select * from unex;
quit;

EXCEPT produces rows that are part of the first query only.

proc sql;
select * from unex
except
select * from ex;
quit;

/* Rows that appear in only one of the two tables */
proc sql;
(select * from ex except select * from unex)
union
(select * from unex except select * from ex);
quit;

Joins:
1. Simple join
2. Inner join
3. Outer join
   1. Left join or left outer join
   2. Right join or right outer join
   3. Full join or full outer join
4. Natural join
5. Self join

Simple join:

Reports the matching observations from the required data sets. The tables are listed in the from clause separated by commas, and the where clause gives the matching condition.

data event;
input stno exad$ exdate:date9.;
format exdate date9.;
cards;
230 eyedis 12feb2005
456 skinprb 15feb2005
345 cold 16mar2005
run;
data unevent;
input stno unexad$ uexdate:date9.;
format uexdate date9.;
cards;
230 nervous 28feb2005
156 eardis 17mar2005
245 diabets 18mar2005
run;

proc sql;
select * from event as ex, unevent as uex
where ex.stno=uex.stno;
quit;

Inner join:

Works like a simple join, but it is written with the inner join keywords between the two tables, and the on clause is used instead of the where clause.

proc sql;
create table innerjoin as
select * from event as ex inner join unevent as uex
on ex.stno=uex.stno;
quit;

Left join:

Reports all observations from the left-side table and only the observations from the right-side table that match the condition.

proc sql;
create table leftjoin as
select * from event as ex left join unevent as uex
on ex.stno=uex.stno;
quit;

Right join:

Reports all observations from the right-side table and only the observations from the left-side table that match the condition.

proc sql;
create table rightjoin as
select * from event as ex right join unevent as uex
on ex.stno=uex.stno;
quit;

Full join:

Reports all observations from both tables and matches the rows where the condition holds.

proc sql;
create table fulljoin as
select * from event as ex full join unevent as uex
on ex.stno=uex.stno;
quit;

Natural join:

Reports the matching observations from the required data sets without an explicit condition. The tables are matched on the columns that have the same name (here stno).

proc sql;
create table natural as
select * from event natural join unevent;
quit;

Self join:
If we join a table with itself, it is called a self join.

data trt9;
input stno bsbp drug$ asbp;
cards;
190 167 col5mg 178
123 178 col15mg 167
198 167 col10mg 146
237 172 col10mg 134
run;

/* Note: this query compares two columns within each row of one table; it does not reference the table twice. */
proc sql;
select *
from trt9
where bsbp>asbp;
quit;
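A self join in the sense of the definition above references the same table twice under two aliases. A minimal sketch, assuming a hypothetical staff table in which mgrid holds the eid of each employee's manager:

```sas
/* Hypothetical table: mgrid points to the eid of the manager */
data staff;
input eid name $ mgrid;
cards;
1 ravi .
2 sita 1
3 kiran 1
4 meena 2
;
run;

proc sql;
/* e and m are two aliases of the same table */
select e.eid, e.name as employee, m.name as manager
from staff as e left join staff as m
on e.mgrid=m.eid
order by e.eid;
quit;
```

The left join keeps the employee who has no manager; an inner join would drop that row.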

Cartesian product:
A join without a where or on condition pairs every row of the first table with every row of the second table.

proc sql;
create table cartsign as
select * from event as ex, unevent as uex;
quit;

/* Equivalent SQL and data step coding for joins (uses the event and unevent data sets created above) */
proc sql;
create table simplejoin as
select * from event as ex, unevent as uex
where ex.stno=uex.stno;
quit;
proc sql;
create table innerjoin as
select * from event as ex inner join unevent as uex
on ex.stno=uex.stno;
quit;

proc sort data=event out=x; by stno; run;
proc sort data=unevent out=x1; by stno; run;

data inner;
merge x(in=a) x1(in=b);
by stno;
if a and b;
run;
proc print; run;

proc sql;
create table leftjoin as
select * from event as ex left join unevent as uex
on ex.stno=uex.stno;
quit;
data left;
merge x(in=a) x1(in=b);
by stno;
if a;
run;
proc print; run;

proc sql;
create table rightjoin as
select * from event as ex right join unevent as uex
on ex.stno=uex.stno;
quit;
data right;
merge x(in=a) x1(in=b);
by stno;
if b;
run;
proc print; run;

proc sql;
create table fulljoin as
select * from event as ex full join unevent as uex
on ex.stno=uex.stno;
quit;
data full;
merge x(in=a) x1(in=b);
by stno;
if a or b;
run;
proc print; run;
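One difference between the two codings is worth a sketch. With select * and create table, PROC SQL keeps only the first of the two same-named key columns (and writes a warning), so rows that exist only in the right-hand table end up with a missing key, whereas the data step merge fills the key from either side. Using coalesce on the key brings the SQL result in line with the merge. The table names ev and unev below are hypothetical.

```sas
/* Hypothetical two-row versions of the event tables */
data ev;
input stno exad $;
cards;
230 eyedis
456 skinprb
;
run;
data unev;
input stno unexad $;
cards;
230 nervous
156 eardis
;
run;

proc sql;
create table fulljoin_key as
/* take the key from whichever side has it */
select coalesce(a.stno, b.stno) as stno, a.exad, b.unexad
from ev as a full join unev as b
on a.stno=b.stno
order by 1;
quit;
proc print data=fulljoin_key; run;
```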


Aggregate functions:

Using aggregate functions, we can do arithmetic summaries in SQL. They support both column-wise and row-wise analysis: with one argument the function summarizes a column across rows, and with several arguments it works across the columns of each row.

data clinical9;
input center$ trail$ sub adsub;
cards;
appolo phase1 67 12
nims phase1 75 14
nims phase1 80 40
care phase1 34 10
care phase2 40 20
appolo phase2 267 22
nims phase2 178 14
care phase2 234 30
appolo phase1 245 50
appolo phase2 260 50
run;

proc sql;
create table aggregate as
select center,trail,sum(sub) as total
from clinical9
group by center, trail;
quit;

/* Data step equivalent of the grouped sum */
data x;
length key $14.;
set clinical9;
key=compress(center||trail);
run;
proc sort data=x out=x1; by key; run;
data x2;
set x1;
by key;
if first.key then ct=0;
ct+sub;
if last.key;
run;
proc print; run;

proc sql;
create table one as
select trail,count(trail) as obs
from clinical9
group by trail;
quit;

/* Data step equivalent of the grouped count */
proc sort data=clinical9 out=y; by trail; run;
data one1;
set y;
by trail;
if first.trail then ct=0;
ct+1;
if last.trail;
keep trail ct;
run;
proc print; run;

data clinical;
infile cards truncover;
input center $ trail $ sub adsub;
cards;
appolo phase1 67 12
nims phase1 75 14
nims phase1 80 40
care phase1 34 10
care phase2 40 20
appolo phase2 267 22
nims phase2 178 14
care phase2 234 30
appolo phase1 245 50
appolo phase2 260 50
. 90
run;

coalesce: replaces missing values for reporting by returning the first non-missing argument.
1. Numeric missing values are replaced with a numeric value.
2. Character missing values are replaced with a character value.

proc sql;
select coalesce(center,'miss') as center, trail,
coalesce(sub,0) as sub, adsub
from clinical;
quit;

distinct: reports the unique values of the required variable.

proc sql;
select distinct(center) as centlist
from clinical9;
quit;

/* Sequence of clauses in a query */

where
group by
having
order by
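The clause sequence above can be sketched in a single query. The table name trials and the threshold of 200 are assumptions made for illustration.

```sas
/* Hypothetical data, same layout as clinical9 without adsub */
data trials;
input center $ trail $ sub;
cards;
appolo phase1 67
appolo phase2 267
nims phase1 75
nims phase2 178
care phase2 234
care phase2 40
;
run;

proc sql;
select center, sum(sub) as total
from trials
where trail='phase2'      /* 1. filter the rows */
group by center           /* 2. form the groups */
having sum(sub) > 200     /* 3. filter the groups */
order by total desc;      /* 4. sort the result */
quit;
```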

Having clause:

Works like the where clause, but it is applied to the groups formed by group by. If we want to restrict a grouped analysis by a condition, we can use the having clause.

proc sql;
create table x as
select center,trail,sum(sub) as total
from clinical
group by center
having center not in('care');
quit;

/* The same restriction with where, which filters the rows before they are grouped */
proc sql;
create table x1 as
select center,trail,sum(sub) as total
from clinical
where center not in('care')
group by center;
quit;

count(): used for frequency analysis. If we use * as the argument, count reports the number of observations produced by the current query.

proc sql;
select count(*) as norows from clinical9;
quit;
proc sql;
select trail,count(trail) as obs
from clinical9
group by trail;
quit;
proc sql;
select center,trail,count(center) as obs
from clinical9
group by center,trail;
quit;
proc sql;
select center,trail,count(center),sum(sub) as obs
from clinical9
group by center,trail;
quit;
proc sql;
select center,trail,count(center) as obs,sum(sub) as totsub
from clinical9
group by center,trail;
quit;
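The difference between count(*) and count(variable) only shows up when values are missing: count(*) counts rows, while count(variable) counts non-missing values. A minimal sketch, assuming a hypothetical visits table that contains missing values:

```sas
/* Hypothetical data with one missing sub and one missing center */
data visits;
infile cards truncover;
input center $ sub;
cards;
appolo 67
nims 75
nims .
. 34
;
run;

proc sql;
select count(*) as n_rows,                   /* all rows */
       count(center) as n_center,            /* non-missing center values */
       count(sub) as n_sub,                  /* non-missing sub values */
       count(distinct center) as n_distinct  /* unique non-missing centers */
from visits;
quit;
```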

Views:

Views work like SAS data sets, but they store only the query or program, not the data. We can create views from existing SAS data sets.

data clinical;
input center$ trail$ sub adsub;
cards;
appolo phase1 67 12
nims phase1 78 14
care phase1 34 10
appolo phase2 267 22
nims phase2 178 14
care phase2 234 40
appolo phase2 267 22
nims phase2 300 14
care phase2 200 40
run;

proc sql;
create view appolo as
select * from clinical
where center='appolo';
quit;
proc print data=appolo; run;

/* Data step view: the view= name must match the data set name */
data one/view=one;
set sashelp.class;
where age<=14;
run;
proc print; run;

proc sql;
create view two as
select * from sashelp.class;
quit;
proc print; run;

Describe statement: reports the structure of a table or view.

proc sql;
describe table sashelp.class;
quit;
proc contents data=sashelp.class; run;

data emp;
input eid salary sale;
cards;
100 2000 500
101 3000 300
102 4000 600
104 4000 900
105 5000 400
106 6000 500
107 7000 400
108 8000 400
109 6000 300
run;

proc sql;
create table emp1 as
select * from emp
where sale>=500;
quit;
proc print; run;

proc sql;
create view emp2 as
select * from emp;
quit;

/* A view can be built on another view */
proc sql;
create view emp3 as
select * from emp2
where sale>500;
quit;
proc print; run;

proc sql;
create table emp4 as
select * from emp2
where sale<500;
quit;
proc print; run;

data trt;
input stno bsbp drug$ asbp;
cards;
190 167 col5mg 178
123 178 col5mg 167
198 167 col10mg 146
237 172 col10mg 134
run;

proc sql;

select * from trt
where bsbp>asbp;
quit;

/* dsd with a blank delimiter: two blanks in a row (or a leading blank) mark a missing value */
data co;
infile cards dsd dlm=' ';
input pid age gender$ race$;
cards;
100 56 female asian
101 56 male african
 34 female african
103  male asian
104 56 female asian
104 45 female asian
run;
proc print; run;

proc sql;
select * from sashelp.class;
quit;

proc sql;
select name,sex
from sashelp.class
where name like 'J%' and sex='F';
quit;

/* assumes the libref devi has already been assigned */
proc sql;
create table devi.class as
select * from sashelp.class;
quit;
proc print; run;

proc sql;
drop table devi.class;
quit;

proc sql;
select sex,count(*) as count
from sashelp.class
group by sex
order by count desc;
quit;

proc sql;
select weight,
case
when weight=. then ''
when .<weight<80 then 'light'
when 80<=weight<=110 then 'medium'
else 'heavy'
end as weight_class
from sashelp.class;
quit;

proc sql;
select sex,
case sex
when 'M' then 'boy'
when 'F' then 'girl'
end as boy_or_girl
from sashelp.class;
quit;

/* into: stores the query results in macro variables */
proc sql;
select round(mean(height),0.01), round(mean(weight),0.01)
into :avg_height_boys, :avg_weight_boys
from sashelp.class
where sex='M';
quit;

Drop statement: removes tables or views from the SAS environment.

/* emp is the data set created above */
proc sql;
create table emp1 as
select * from emp
where sale>=500;
quit;
proc sql;
drop table emp1;
quit;

/*subquery*/data suplier;input scode $ sname $ addr $ ;cards;s01 raja hyd s02 rani sec s03 raghu hyd


s04 radha banrun;data product;input pcode $ pname $ pcolor $;cards;p1 item1 redp2 item2 greenp3 item3 redp4 item4 bluerun;data trasaction;input trno scode $ pcode $;cards;1 s01 p32 s02 p13 s03 p44 s04 p25 s01 p46 s02 p37 s03 p38 s04 p16 s01 p2run; /*list out all suppliers names who have supply product code'p2'*/proc sql;create table sn as select scode,sname,addr from suplier where scode in (select scode from trasaction where pcode='p2');quit;/*list out of the suppliers details who supplied red color products*/proc sql;create table sd asselect * from suplier where scode in(select scode from trasaction where pcode in(select pcode from product where pcolor='red'));quit;

data x;
input eid sal;
cards;
100 2000
101 3000
102 5000
104 8900
99 1000
88 12000
run;

/*Highest sal*/
proc sql;
create table Highest as
select max(sal) as highestsal from x;
quit;

/*second max sal*/
proc sql;
select max(sal) from x
where sal < (select max(sal) from x);
quit;

proc sql;
create table detail as
select * from x
where sal = (select max(sal) from x
             where sal < (select max(sal) from x));
quit;
proc print; run;

/*Nth max sal (here N=5): exactly N-1 distinct salaries are higher than the Nth highest*/

proc sql;
SELECT DISTINCT (a.sal) FROM x A
WHERE (5-1) = (SELECT COUNT (DISTINCT (b.sal))
               FROM x B WHERE a.sal<b.sal);
quit;

/*SQL pass through facility*/
/*Retrieve data from a db2 table*/

proc sql noprint;
connect to DB2 (database=XXXX user=xxxxxx password=xxxxxx);
create table x as
select * from connection to DB2
(select name,age,sal,loc from db2.emp order by name);
disconnect from DB2;
quit;

06/09/11.

MACROS

Using the macro language, we can customize SAS programs and reduce the amount of SAS code we write.

Using the macro language, we can develop reusable applications. The macro language is a character-based (text) language. To develop a macro application in SAS, we need 2 things: the macro processor (compiler) and the macro language.

Macro language:
I. It is a part of SAS.
II. It is used to interact with the macro processor.

Macro triggers (% and &): used to identify macro language elements.

Percent sign (%): This is called a macro reference. Every macro statement starts with %.

Ampersand (&): This is called a macro variable reference. It is used to report the value of a macro variable.

Macro coding can be written outside and inside of the macro block.

%macro <macroname>;
SAS coding (data step block, proc block, open code)
macro coding
%mend;

Catalog: whenever we run a macro definition, SAS compiles it and stores the compiled code in a catalog. The catalog entry has the same name as the macro.

Macro call: calls the required macro for execution.

Example:

%macro pr;
proc print;
run;
%mend;

data x;
input pid age;
cards;
100 89
101 90
;
run;
%pr;

data x1;
input pid drug $;
cards;
100 5mg
101 6mg
run;

%let name1=ramu;
%put &name1;

data x;
input name $ sal;
if name="&name1";
cards;
ramu 10000
rani 20000
mahi 30000
raju 40000
run;
%pr;

%macro x(dsname,var1,var2);
proc print data=&dsname;
var &var1 &var2;
run;
%mend;
%x(SAShelp.class,name,age);
%x(SAShelp.shoes,product,region);

Concepts in macros:
1. Macro variable creation
2. Passing arguments to macros
3. Macro quoting functions
4. Macro options
5. Macro expressions
6. Macro interface functions

1. Macro variable creation:
Macro variables are of 3 types:
1. Global macro variables
2. Local macro variables
3. Automatic macro variables
All macro variable values are stored as character (text).

%let a=10;
%let b=20;
%let c=%eval(&a+&b);
%put &c;

07/09/11

Global macro variables can be created anywhere in the program (inside or outside of a macro block) and can be used anywhere in the program.

%global statement:
Syntax: %global <macro variable>;

%let statement:
Assigns the required value to a macro variable, or creates a user-defined macro variable.

%let name=prasad;
%put name is &name;

%let name=prasad;

%macro cre;
%global cname source;
%let cname=raju;
%let source=oracle;
%let id=&name;
%put &id;
%mend;
%cre;
%put &cname;
%put &source;

%put statement: prints the required text and macro variable values in the log window.

Local macro variables:
Can be created inside a macro block and can be used only within that macro block.

Local macro variable values are stored in local symbol tables.

%macro loc;
%local cname1 source1;
%let cname1=raju;
%let source1=oracle;
%put var1 is &cname1 and var2 is &source1;
%mend;
%loc;
/* Outside the macro these references do not resolve; the log shows a warning */
%put &cname1;
%put &source1;

Automatic macro variables:
Automatic macro variables are created by SAS, and their values are assigned by SAS; they are system defined. Users can reference automatic macro variables, but many of them, such as SYSDATE and SYSTIME, are read-only and cannot be reassigned. Their values are stored in the global symbol table.
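A short sketch of how automatic macro variables are typically listed and used. The title text is an illustrative assumption; the variables themselves (SYSDATE9, SYSDAY, SYSTIME, SYSVER) are supplied by SAS.

```sas
/* list every automatic macro variable and its value in the log */
%put _automatic_;

/* use a few of them in text */
%put Session started on &sysdate9 (&sysday) at &systime;
%put SAS version: &sysver;

/* double quotes are needed so that the reference resolves inside the title */
title "Report generated on &sysdate9";
proc print data=sashelp.class(obs=3);
run;
title;
```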

%put &sysdate;
%put &systime;

2. Passing arguments to macros:

Arguments: macro arguments (parameters) are written after the macro name, within parentheses.

%macro pr(dname);
proc print data=&dname;
run;
%mend;
%pr(SAShelp.clASS);
%pr(SAShelp.shoes);

We can pass arguments in two ways:
1. Positional parameters (arguments)
2. Keyword parameters (arguments)

1. Positional parameters:

Arguments are passed based on their position in the macro call.

%macro srt(ename,new,svar);
proc sort data=&ename out=&new;
by &svar;
run;
%mend;
%srt(SAShelp.clASS,x,sex);

data demo;
input pid age gender $;
cards;
100 34 male
101 23 female
102 45 male
101 45 male
101 45 male
102 34 male
103 56 female
100 34 male
101 23 female
102 45 male
101 45 male
101 45 male
101 45 male
102 34 male
103 56 female
run;

%macro dup(dname,ename,var);
proc sort data=&dname out=&ename;
by &var;
run;
%mend;

/* nodupkey is passed as part of the second argument, so it lands after out= in the proc sort statement */
%dup(demo,dup1 nodupkey,pid);

%dup(demo,dupobs nodupkey,pid age gender);

2. Keyword parameters:
Arguments are passed to the macro using the parameter names, so they can be given in any order.

/* demo is the data set created above */
%macro dup(dname=,ename=,var=);
proc sort data=&dname out=&ename;
by &var;
run;
%mend;

%dup(dname=demo,ename=dupobs1 nodupkey,var=pid age gender);

/* same call with the arguments in a different order */
%dup(ename=dupobs2 nodupkey,var=pid age gender,dname=demo);
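Keyword parameters can also carry default values, and positional and keyword parameters can be mixed as long as the positional ones come first. A minimal sketch; the macro name prt and its defaults are assumptions made for illustration.

```sas
/* dname is positional; nobs= and vars= are keyword parameters with defaults */
%macro prt(dname, nobs=5, vars=_all_);
proc print data=&dname(obs=&nobs);
var &vars;
run;
%mend;

/* uses both defaults */
%prt(sashelp.class)
/* overrides both defaults */
%prt(sashelp.class, nobs=3, vars=name age)
/* overrides only vars= */
%prt(sashelp.shoes, vars=region product sales)
```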

08/09/11

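To join a macro variable reference to the text that follows it, put a . (dot) after the macro variable name; the dot marks the end of the reference. (Two references can also be placed side by side, as in &a&b.)

When to use single and double ampersands:

A single ampersand is a direct reference. Double ampersands are used for indirect references, where part of the macro variable name comes from another macro variable.

%macro Ex2_refvar;
%let mo = 12;
%let yr1 = 2002;
%let yr2 = 2003;
%do y = 1 %to 2;
%do day = 1 %to 2;
/* &yr&y looks for a macro variable named yr, which does not exist */
%put &mo./&day./&yr&y;
/* &&yr&y resolves first to &yr1 or &yr2 and then to its value */
%put &mo./&day./&&yr&y;
%end;
%end;
%mend Ex2_refvar;
%Ex2_refvar;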

More examples of delimiters and indirect references:

%macro y;
%let mo = December;
%let yr = 2002;
%do day = 1 %to 3;
%put &mo &day, yr;
%put &mo &day, &yr.;
%end;
%mend;
%y;

%let yr=2002;
%let day=13;
/* &dayDecember is read as one (undefined) macro variable name */
%put &dayDecember&yr;
/* the dot ends the reference to day */
%put &day.December&yr;
%let a=&yr.&day;
%put &a.;

%let section1=internet;
%let section2=networking;
%let section3=operating system;
%let section4=programing language;
%let section5=web design;

%let n=5;
%put &&section&n;

%let whatever=section;
%put &&&whatever&n;
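Indirect references are most useful inside a loop, where the numeric suffix comes from the loop counter. A minimal sketch, assuming a hypothetical macro named list_sections: on the first pass && becomes & and &i becomes the counter value, and on the second pass the resulting &section1, &section2, and so on are resolved.

```sas
%let section1=internet;
%let section2=networking;
%let section3=operating system;

%macro list_sections(n);
%local i;
%do i=1 %to &n;
/* &&section&i -> &section1 -> internet */
%put Section &i is &&section&i;
%end;
%mend;
%list_sections(3)
```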

3. Macro quoting functions:

Using quoting functions, we can mask special characters at compilation time. There are 2 of them.

%str(): masks all special characters except the macro triggers (% and &), unmatched quotation marks, and unmatched parentheses.

%nrstr(): masks all special characters, including the macro triggers, at compilation time.

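%macro subset(new,ename,con);
data &new;
set &ename;
where &con;
run;
proc print;
run;
%mend;
%subset(x,sashelp.class,age<=18);
%subset(x,sashelp.class,%str(sex='F'));

%subset(x2,sashelp.class, %nrstr(age>12 & name='Mary'));

%subset(x,sashelp.class,%str(name='Jeffrey'));

/* Without %str, name='Jeffrey' is read as a keyword parameter called name, which causes an error */
%subset(x,sashelp.class,name='Jeffrey');
%print;
%subset(x1,sashelp.class,age>12);
%print;
%subset(x2,sashelp.class,age>12 & name='Mary');

/* The commas are read as argument separators, so %substr receives too many arguments */
%let month=%substr(jan,feb,mar,5,3);
%put &month;

%let month=%substr(%str(jan,feb,mar),5,3);
%put &month;

/* &feb and %salereport are treated as a macro variable reference and a macro call */
%let reptxt=jan&feb %salereport;
%put &reptxt;

/* %str does not mask & and %, so the result is the same */
%let reptxt=%str(jan&feb %salereport);
%put &reptxt;

/* %nrstr masks & and %, so the text is stored as written */
%let reptxt=%nrstr(jan&feb %salereport);
%put &reptxt;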
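Three further cases where quoting is needed, as a sketch: a semicolon inside a value, an unmatched quotation mark, and text that contains & and %. The macro variable names step, owner, and company are illustrative assumptions.

```sas
/* a semicolon inside %str() is stored as text instead of ending the %let */
%let step=%str(proc print data=sashelp.class; run;);
&step

/* an unmatched quotation mark is marked with a preceding % */
%let owner=%str(Mary%'s report);
%put &owner;

/* %nrstr keeps & and % as plain text */
%let company=%nrstr(R&D %costs);
%put &company;
```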

Macro expressions:
There are 3 types:
1. Text expressions
2. Arithmetic expressions
3. Logical expressions

1. Text expression: macro coding is also called a text expression.

2. Arithmetic expression: used to run arithmetic operations in macros.

%let a=10;
%let b=20;
/* without %eval the text 10+20 is stored, not the sum */
%let c=&a+&b;
%put &c;

%eval(): performs integer arithmetic on macro variables.

%let c1=%eval(&a+&b);
%put &c1;

%sysevalf(): if the macro variables hold values with a decimal point (floating-point values), we use %sysevalf for the arithmetic.

%let a1=10.34;
%let b1=20.60;
%let s=%sysevalf(&a1+&b1);
%put &s;

Macro functions (string functions):

They require operands (macro variables or text).

%length(): reports the length of a macro variable value.

%let dnames=demo lab med;
%let len=%length(&dnames);
%put &len;

%index(): reports the position of a specific character (or string) within a string.

%let dnames=demo lab med;
%let pos=%index(&dnames,l);
%put &pos;

%scan(): gets the required word from a string, that is, it extracts the nth word of the string.

%let dnames=demo lab med;
%let rw=%scan(&dnames,2);
%put &rw;

%upcase(): converts the value to capital letters.

%let dnames=demo lab med;
%let cap=%upcase(&dnames);
%put &cap;

%lowcase(): converts the value to small letters.

%substr(): gets part of a string from a macro variable.

%let dnames=demo lab med;
%let sub=%substr(&dnames,1,5);
%put &sub;

%sysfunc(): using this function, we can call DATA step functions in macro code.

%let a=23.34;
%let b=20.98;
%let c=%sysevalf(&a+&b);
%let in=%sysfunc(int(&c));
%put &c;
%put &in;

or

%let a=23.34;
%let b=20.98;
%let c1=%sysfunc(int(%sysevalf(&a+&b)));
%put &c1;
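The difference between %eval, %sysevalf, and %sysfunc can be seen side by side in a short sketch. The optional second argument of %sysevalf (a conversion type) and of %sysfunc (a format) is used here; the macro variable names today and avg are illustrative.

```sas
/* integer arithmetic: the fraction is dropped */
%put %eval(7/2);
/* floating-point arithmetic */
%put %sysevalf(7/2);
/* conversion types round the floating-point result */
%put %sysevalf(7/2, ceil);
%put %sysevalf(7/2, integer);

/* %sysfunc accepts a format as an optional second argument */
%let today=%sysfunc(today(), date9.);
%put &today;
%let avg=%sysfunc(mean(10, 20, 35), 8.2);
%put &avg;
```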

09/09/11

/*To develop merge application in macros*/data main;input eid sal bonus;cards;100 2000 .2101 3000 .3102 3500 .4105 5000 .5run;data tr;input eid dept $;cards;100 testing101 dwh102 mainframe103 testing 105 dwh106 BI108 BIrun;%macro srt(ename,new,svar);proc sort data=&ename out=&new;by &svar;run;%mend;%macro merge(new,enames,mvar);data &new;merge &enames;by &mvar;run;%mend;%srt(main,y,eid);%srt(tr,y1,eid);%merge(mdata,%str(main(in=a) tr(in=b)), eid;if a=1 & b=1);%merge(ldata,%str(main(in=a) tr(in=b)), eid;if a=1);%merge(rdata,%str(main(in=a) tr(in=b)), eid;if b=1);%merge(fdata,%str(main(in=a) tr(in=b)), eid;if a=1 or b=1);


/*TO DEVELOP JOINS IN MACROS*/

%macro join(e1,e2,jo,con,mvar);proc sql;select * from &e1 &jo &e2 &con &e1..&mvar=&e2..&mvar; quit;%mend;%join(main,tr,%str(,),where,eid);%join(main,tr,left join,on,eid);%join(main,tr,right join,on,eid);data emp;input eid sal;cards;100 2300110 4500230 5600run;data emp2;input eid istage:percent4.;cards;230 30%110 20%100 10%run;data emp3;input eid sal;cards;108 2000110 9000run;%macro dmanage(rq,dname,enames,var);%if %upcase(&rq)=MERGE %then %do;data &dname;merge &enames;by &var;run;%end;%else %if %upcase(&rq)=UPDATE %then %do;data &dname;update &enames;by &var;run;%end;%else %if %upcase(&rq)=MODIFY %then %do;data &dname;modify &enames;


by &var;run;%end;%else %do;proc sort data=&dname out=&enames;by &var;%end;run;%mend;%dmanage( ,emp,z,eid);%dmanage( ,emp2,z1,eid);%dmanage( ,emp3,z4,eid);%dmanage(update,k,z z4,eid);%dmanage( modify,z,z z1,%str(eid;sal=sal+sal*istage));data medi;input pid drug$ sbp;cards;100 col5mg 156101 col10mg 167108 col10mg 145102 col5mg 150109 col10mg 160run;data med;input pid drug$ sbp;cards;100 col5mg 156101 col10mg 167run;data success;input pid sbp;cards;101 145100 150run;%macro sprint(rq,ename,new,var);%if %upcase(&rq)=SORT %then %do;proc sort data=&ename out=&new;by &var;run;proc print data=&new;run;%end;%else %do;proc print data=&ename;run;%end;%mend;proc print data=medi1;run;%sprint(sort,medi,medi1,pid);


%sprint(sort,med,med1,pid);%sprint(sort,success,success1,pid);%macro dmanage(rq,dname,enames,var);%if %upcase(&rq)=MERGE %then %do;data &dname;merge &enames;by &var;run;%end;%else %if %upcase(&rq)=UPDATE%then %do;data &dname;update &enames;by &var;run;%end;%else %if %upcase(&rq)=MODIFY %then %do;data &dname;modify &enames;by &var;run;%end;%else %do;proc sort data=&enames out=&dname;by &var;run;%end;%mend;

%macro print(dname);proc print data=&dname;run;%mend;/*%macro dmanage(rq,dname,enames,var);*/

%dmanage( ,medi2,medi,pid);%print(medi2);%dmanage( ,medi4,med,pid);%print(medi4);%dmanage( ,success1,success,pid);%print(success1);

%dmanage(update,medi5,medi4 success1,pid);

%dmanage(merge,medi6,medi2 medi4 success1,pid);%print(medi5);

options mprint;%dmanage( ,medi1,medi,pid);


%dmanage( ,success1,success,pid);%dmanage(update,medi2,medi1 success1,pid);%print(medi2);/*%DO %WHILE LOOP*/%global dname;%let dname=emp1 emp2 emp3 emp4;%macro dh;%local dat i;%let i=1;%do %while(&i<=5);%let dat=%scan(&dname,&i);proc print data=&dat;run;%let i=%eval(&i+1);%end;%mend;%dh;options mprint mlogic symbolgen;%macro dh1;%local dat i;%let i=1;%let dat=%scan(&dname,&i);%do %while(&dat ne );proc print data=&dat;run;%let i=%eval(&i+1);%let dat=%scan(&dname,&i);%end;%mend;%dh1;/*%DO %WHILE LOOP*/

data emp1;input eid sal;cards;100 2000101 3000102 4000103 5000104 6000105 8000run;data emp2;input eid sal;cards;200 2000201 3000202 4000203 5000204 6000


205 8000run;data emp3;input eid sal;cards;300 2000301 3000302 4000303 5000304 6000305 8000run;data emp4;input eid sal;cards;400 2000401 3000402 4000403 5000404 6000405 8000run;%global dname;%let dname=emp1 emp2 emp3 emp4 emp5 emp6;%macro dh;%local dat i;%let i=1;%do %while(&i<=7);%let dat=%scan(&dname,&i);proc print data=&dat;run;%let i=%eval(&i+1);%end;%mend;%dh;options mprint mlogic symbolgen;%macro dh1;%local dat i;%let i=1;%let dat=%scan(&dname,&i);%do %while(&dat ne );proc print data=&dat;run;%let i=%eval(&i+1);%let dat=%scan(&dname,&i);%end;%mend;%dh1;Macro options:Macro options is a type of global options.

The option settings are in effect whenever we run a macro application (MERROR and SERROR are on by default; MPRINT, MLOGIC, and SYMBOLGEN are off by default). Macro options can be changed with the options statement, which should be written outside of the macro block.

Merror: displays a warning message in the log window whenever a macro call is not resolved. Using this option, we can find out whether the required catalog entry (macro) exists or not.

Serror: prints a warning message in the log window whenever a macro variable reference is not resolved.

Mprint: prints the SAS code generated by the macro in the log, so we can trace the required macro call and find errors in the generated SAS coding.

Symbolgen: traces macro variable values, that is, it prints messages in the log window showing how each macro variable resolves.

Mlogic: traces the logical expressions and the flow of execution of the macro.
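A minimal sketch of switching the tracing options on for one macro call and off again afterwards. The macro name show is an assumption made for illustration.

```sas
/* turn the tracing options on (outside of the macro block) */
options mprint mlogic symbolgen;

%macro show(dsn, n);
proc print data=&dsn(obs=&n);
run;
%mend;
%show(sashelp.class, 3)

/* turn them off again to keep the log short */
options nomprint nomlogic nosymbolgen;
```

With the options on, the log shows the generated proc print step (mprint), the start and end of the macro with its parameter values (mlogic), and the resolution of &dsn and &n (symbolgen).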


10/09/11

In macros, we need to put a . (dot) after a macro variable name to join it to other text.

http://www.stat.berkeley.edu/classes/s100/sas.pdf

Concatenation of macro variables:
%let surname=kolla;
%let name=lava kumar;
%let concat=&surname.&name.;
%put &concat;

GO TO block:
The goto statement works with a statement label and runs the group of statements that follows the label.

Label statement: marks the start of a group of statements. To run a goto statement conditionally, we use an if condition.

data x;
input eid sale;
cards;
100 2000
101 3000
102 4000
104 5678
105 7890
106 5678
107 8908
108 4567
109 8654
110 9000
run;

data x1;
set x;
if 2000<=sale<=3000 then goto la1;
else if 3000<sale<=5000 then goto la2;
else if sale>5000 then goto la3;

la1: salary=2000; bonus=salary*0.5; netsalary=salary+bonus; return;
la2: salary=3000; bonus=salary*0.5; netsalary=salary+bonus; return;
la3: salary=4000; bonus=salary*0.5; netsalary=salary+bonus; return;

run;

data x;
input eid sale;
cards;
100 2000
101 3000
102 4000
104 5678
105 7890
106 5678
107 8908
108 4567
109 8654
110 9000
;
run;
data x2;
set x;
if 2000<=sale <=3000 then do;
salary=2000;
bonus=salary*0.5;
netsalary=salary+bonus;
end;
else if 3000<sale <=5000 then do;
salary=3000;
bonus=salary*0.5;
netsalary=salary+bonus;
end;
else if sale> 5000 then do;
salary=4000;
bonus=salary*0.5;
netsalary=salary+bonus;
end;
run;

data emp;
input eid sal;
cards;
100 2000
101 3000
102 4000
104 5678
105 7890
106 5678
;
run;
data dept;
input eid loc $;
cards;
102 chicago
101 delhi
106 hyd
104 paris
105 newyork
100 london
;
run;
%macro gto(rq,dname,enames,var);
%if %upcase(&rq)=SORT %then %goto srt;
%else %if %upcase(&rq)=MERGE %then %goto merge;
%srt:
proc sort data=&enames out=&dname;
by &var;
run;
%goto ext;
%merge:
data &dname;
merge &enames;
by &var;
run;
%goto ext;
%ext:
%mend;
%gto(sort,emps,emp,eid);
%gto(sort,depts,dept,eid);
%gto(merge,mr,emps depts,eid);

Macro interface functions:

Interface functions are of 2 types.
1. Dataset interface functions (data step).
2. Macro interface functions.

Dataset interface functions:

call symput: it is a CALL routine. Using it, we can create macro variables from dataset variables during data step execution.
Syntax: call symput("macro variable",dataset varname);
Note: if the dataset has multiple values, call symput by default stores the last data value in the macro variable.

data x;
input name $ sal;
call symput("name1",name);
cards;
raju 2000
mahi 4000
suri 5000
;
run;
%put &name1;

data x;
input name $ sal;
CALL SYMPUT('v'||LEFT(_N_), name);
cards;
raju 2000
mahi 4000
suri 5000
;
run;
%put &v1;
%put &v2;
%put &v3;
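The same idea can be sketched with CALL SYMPUTX, which strips leading and trailing blanks from the value and accepts a numeric value directly. The macro variable name nrows is an assumption made for this example.

```sas
/* Sketch: same data as above; NROWS is an illustrative name. */
data _null_;
    input name $ sal;
    call symputx(cats('v', _n_), name);  /* creates v1, v2, v3 */
    call symputx('nrows', _n_);          /* overwritten on every row, so it ends as the row count */
    cards;
raju 2000
mahi 4000
suri 5000
;
run;
%put &v1 &v2 &v3 (&nrows rows);
```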

data x;
input name $ sal;
anual+sal;
cards;
raju 2000
mahi 4000
suri 5000
;
run;

DATA _NULL_;
SET x END=LAST;
IF LAST THEN CALL SYMPUT('N',anual);
RUN;
%put &n;

Symget: used when we want to get the value of a macro variable at the data step level.

data x2;
anualsal=symget('n');
run;

call execute: using call execute, we can call the required catalog entry (macro call) from within a data step.
syntax: call execute('%macro call');

data x;
input name $ sal;
anual+sal;
cards;
raju 2000
mahi 4000
suri 5000
;
run;

DATA _NULL_;
SET x END=eof;
IF eof THEN CALL execute('%gto(merge,mr2,emps depts,eid)');
RUN;

2. Macro interface function.

%sysfunc(): using this function, we can call data step functions in macros.

%let a=20.23;
%let b=34.78;
%let c=%sysfunc(int(%sysevalf(&a+&b)));
%put &c;
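A few more %sysfunc calls, as a small sketch. The macro variable names are made up for this example; the optional second argument of %sysfunc is a format applied to the result.

```sas
/* Sketch: TODAY, UPPER and ROOT are illustrative macro variable names. */
%let today = %sysfunc(today(), date9.);     /* format the returned date value */
%let upper = %sysfunc(upcase(lava kumar));  /* character arguments are written without quotes */
%let root  = %sysfunc(sqrt(16));
%put today=&today upper=&upper root=&root;
```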

dataset functions:

exist(): using this function, we can check whether the required SAS file exists. If it exists the function returns 1, otherwise 0.
syntax: exist('datasetname');

data _null_;
if exist('emp')=1 then put 'dataset exists';
else put 'dataset does not exist';
run;

open(): using this function, we can open a dataset internally. It returns an identifier for the open dataset.
syntax: open('datasetname');

attrn(): using this function, we can count the number of rows and the number of variables from the result of open.

close(): using this function, we can close an open dataset.
syntax: close(open-dataset-identifier);

data _null_;
if exist('sashelp.class')=1 then do;
op=open('sashelp.class');
NV=attrn(op,'nvars');
NO=attrn(op,'nobs');
CL=close(op);
put 'no of observation' no;
put 'no of variables ' nv;
end;
else put 'dataset does not exist';
run;

%macro dex(dname);
%if %sysfunc(exist(&dname))=1 %then %do;
%let op=%sysfunc(open(&dname));
%let NV=%sysfunc(attrn(&op,nvars));
%let NO=%sysfunc(attrn(&op,nobs));
%let CL=%sysfunc(close(&op));
%put no of observation &no;
%put no of variables &nv;
%end;
%else %put &dname does not exist;
%mend;
%dex(sashelp.class);
%dex(sashelp.shoes);

/* To create a macro variable with an sql block */

select and into clause: using these two together, we can create a macro variable from a dataset variable. Here the SAS system by default stores the 1st data value (first occurrence) in the macro variable.

data med;
input gid $ visit drug $;
cards;
G100 1 col105mg
G200 1 col10mg
g300 1 col15mg
G100 2 col28mg
G200 2 col30mg
g300 2 col20mg
;
run;
proc sql noprint;
select drug into:medicine from med;
quit;
%put &medicine;

/* To create multiple macro variables */
proc sql noprint;
select drug into:medicine1-:medicine6 from med;
quit;
%put &medicine1;
%put &medicine2;
%put &medicine3;
%put &medicine4;
%put &medicine5;
%put &medicine6;
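Besides one value or a numbered range, the INTO clause can also collect every value into a single macro variable with SEPARATED BY. The sketch below assumes the med dataset created above; the names druglist and ndrug are illustrative.

```sas
/* Sketch: assumes WORK.MED from above; DRUGLIST and NDRUG are made-up names. */
proc sql noprint;
    select distinct drug
        into :druglist separated by ', '
        from med;
quit;
%let ndrug = &sqlobs;   /* automatic macro variable set by the last query */
%put &ndrug drugs: &druglist;
```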

proc sql noprint;
select count(*) into:n from med;
select drug into:medicine1-:medicine%sysfunc(left(&n)) from med;
quit;
%put &n;
%put &medicine1;
%put &medicine2;
%put &medicine3;
%put &medicine4;
%put &medicine5;
%put &medicine6;
%put &medicine7; /* med has only 6 rows, so medicine7 is not created and this gives a warning */

symdel: using the %symdel statement, we can delete a macro variable from the SAS environment.

%symdel medicine6;

_Global_: using _global_ in a %put statement, we can list the global macro variables with their values.

%put _global_;

_local_: using _local_ in a %put statement, we can list the local macro variables with their values.

%macro mvar;
%let a=10;
%let b=20;
%let c=%eval(&a+&b);
%put _local_;
%mend;
%mvar;

_user_: using _user_ in a %put statement, we can list the user-defined macro variables.
1. If we write it inside a macro block, it reports both global and local macro variables.
2. If we write it outside a macro block, it reports only global macro variables.

%put _user_;

_automatic_: using _automatic_ in a %put statement, we can list the automatic macro variables.

%put _automatic_;
%put &sysdate;
%put &systime;
%put &sysdsn;


11/09/11


data multdat;
input id $ date mmddyy10. pr1 pr2 pr3 t1 t2 t3;
format date mmddyy10.;
cards;
A1 11/01/2004 223 204 195 30 28 27
B7 11/01/2004 211 192 183 31 28 26
;
run;
proc print;run;

data unidat;
set multdat;
keep id date time pressure temp;
time=1;pressure=pr1;temp=t1;output;
time=2;pressure=pr2;temp=t2;output;
time=3;pressure=pr3;temp=t3;output;
run;
proc print;run;

/* Reading the raw data directly: each input reads the next two values, */
/* so the data lines are arranged here as pressure/temp pairs.          */
data uni;
format date mmddyy10.;
input id $ date mmddyy10. @;
time=1;input pressure temp @;output;
time=2;input pressure temp @;output;
time=3;input pressure temp @;output;
cards;
A1 11/01/2004 223 30 204 28 195 27
B7 11/01/2004 211 31 192 28 183 26
;
run;
proc print;run;

data r;
x=22/7;idvar='a';y1=round(x,1);y_1=round(x,0.1);y_01=round(x,0.01);output;
x=33/7;idvar='b';y1=round(x,1);y_1=round(x,0.1);y_01=round(x,0.01);output;
run;

proc transpose data=r out=tr;
var x y1 y_1 y_01;
run;
proc print;run;

proc transpose data=r out=tr name=orig_var prefix=y_;
var x y1 y_1 y_01;
id idvar;
run;
proc print;run;

proc sort data=multdat out=x;
by id date;
run;
proc transpose data=x out=p(drop=_name_ rename=(col1=p));
by id date;
var pr1 pr2 pr3;
run;
proc transpose data=x out=t(drop=_name_ rename=(col1=t));
by id date;
var t1 t2 t3;
run;
data um;
merge p t;
by id date;
run;
proc print;run;

Proc transpose: using this procedure, we can convert variables into rows and rows into variables.

Id statement: names the variable whose data values become the names of the transposed variables.

Var statement: names the variables whose values are transposed into observations (data values).

Prefix: can be used to add the required text at the start of the transposed variable names.

data lab;
input id test $ units;
cards;
100 hr 90
101 hr 89
100 dbp 98
101 dbp 97
100 sbp 156
101 sbp 167
;
run;
proc sort data=lab out=x;
by id;
run;
proc transpose data=x out=tr1 name=details prefix=test_;
by id;
var units;
id test;
run;
proc print data=tr1;
run;

Name: gives a name to the new variable that contains the names of the transposed variables (the variables listed on the var statement).

If you do not enter this option, SAS automatically includes a variable called _name_. You can drop it with (drop=_name_) placed immediately after out=<tr_dataset_name>, or rename it with the rename=(_name_=new_name) option.

Prefix: provides the initial characters for the names of the new variables; the value of the variable listed on the ID statement is appended to it. For example, prefix=p lists the new variables as p1, p2, p3, ... Sometimes _ is a convenient choice for the first character of the transposed variable.

13/09/11

Proc datasets

proc datasets: using this procedure we can
1. Rename the datasets.
2. Exchange data between the datasets.
3. Copy the SAS files from one library to another library.
4. Modify the datasets: a) assign constraints b) delete constraints.
5. Append the data values from one dataset to another.
6. Report descriptive information for the required library.

1. Rename the datasets

change statement: can be used to rename a dataset.

data emp;
input eid sal jod:ddmmyy10.;
cards;
100 2000 30-01-2010
101 3000 23-02-2009
103 3500 25-03-2008
;
run;
proc datasets lib=work;
change emp=employee;
quit;

2. Exchange data between the datasets.

exchange statement: can be used to exchange data between two datasets (the two datasets swap names).
Note: if we want to exchange the data between two SAS files, both SAS files must be available in the same library.

data emp1;
input eid sal jod:ddmmyy10.;
cards;
100 2000 30-01-2010
101 3000 23-02-2009
103 3500 25-03-2008
;
run;
proc datasets lib=work;
exchange employee=emp1;
quit;

3. Copy the SAS files from one library to another library.

copy statement: can be used to copy (transfer) SAS files between libraries.

proc datasets lib=sashelp;
copy in=sashelp out=work;
quit;

proc datasets lib=sashelp;
copy in=sashelp out=work;
select class shoes;
quit;

memtype option: can be used to copy only SAS files of the required member type. Values of memtype are: all, data, view, cat.

We can copy the required SAS files with the select and exclude statements.
The select statement lists the required SAS files.
The exclude statement lists the SAS files that are not required.

proc datasets lib=sashelp;
copy in=sashelp out=work memtype=data;
select class shoes;
quit;
proc datasets lib=sashelp;
copy in=sashelp out=work memtype=cat;
quit;
proc datasets lib=sashelp;
copy in=sashelp out=work memtype=view;
quit;

proc datasets lib=sashelp;
copy in=sashelp out=work memtype=data;
exclude class shoes;
quit;

14/09/11

4. Modify the datasets: the modify statement can be used to change the formats.

data emp;
input eid sal jod:ddmmyy10.;
format jod date9.;
cards;
100 2000 30-01-2010
101 3000 23-02-2009
103 3500 25-03-2008
;
run;
proc datasets lib=work;
modify emp;
format jod worddate20.;
quit;
proc print;run;
proc datasets lib=work;
modify emp;
format jod worddate20.;
rename jod=joindate;
quit;
proc print;run;

data emp;
input eid sal jod:ddmmyy10.;
format jod date9.;
cards;
100 2000 30-01-2010
101 3000 23-02-2009
103 3500 25-03-2008
;
run;
proc datasets;
modify emp;
format jod weekdate22.;
rename jod=joindate;
quit;
proc print;run;

5. Append statement:

Using the append statement, we can append (load) the values from one dataset to another. In this form the append statement can be used only inside the datasets procedure.

data x;
input a b;
cards;
10 45
34 56
;
run;
data y;
input a b;
cards;
100 350
120 567
;
run;
proc datasets;
append base=x data=y;
quit;

data x;
input a b;
cards;
10 45
34 56
;
run;
data y;
input a b c;
cards;
100 350 45
120 567 789
;
run;
proc datasets;
append base=x data=y force;
quit;

data x;
input a b d;
cards;
10 45 300
34 56 450
;
run;
data y;
input a b c;
cards;
100 350 45
120 567 789
;
run;
proc datasets;
append base=x data=y force;
quit;

6. To report descriptive information for the required library.

Details option: can be used to report descriptive information for the required library.

proc datasets lib=work details;
quit;

contents statement: reports descriptive information for the required dataset.

proc datasets lib=work;
contents data=emp;
quit;

Delete statement: deletes the required dataset from the library.

proc datasets lib=work;
delete emp;
quit;

17/09/11

To get data in pyramid shape:

Data _Null_ ;
Length Text $ 200 ;
Max = 10 ;
Do I = 1 To Max ;
 Text = Repeat( Strip(Put( I , Best32. )) || ' ' , I - 1 ) ;
 Put Text;
End ;
Run ;

constraints: can be used to load only the required (valid) data into tables.

We can assign constraints in 2 ways.

1. column constraints
2. table constraints

integrity constraint types:

Unique: can be used to load the data without duplicate data values.

proc sql;
create table demo(pid num unique,age num, gender char,race char);
quit;

/* the last row repeats pid 100, so the unique constraint rejects the insert */
proc sql;
insert into demo
values(100,21,"f","asian")
values(101,22,"m","african")
values(102,23,"f","mangoli")
values(100,23,"f","mangoli");
quit;
proc sql;
describe table demo;
quit;

not null: can be used to load the data without missing values (numeric type and character type).

proc sql;
create table demo(pid num not null,age num, gender char,race char);
quit;

proc sql;
insert into demo
values(100,21,"f","asian")
values(101,22,"m","african")
values(102,23,"f","mangoli")
values(100,23,"f","mangoli")
values(100,23,"f","mangoli");
quit;
proc sql;
describe table demo;
quit;

check: can be used to load the data based on a condition.

proc sql;
create table demo(pid num,age num, gender char check(gender="f"),race char );
quit;

proc sql;
insert into demo
values(100,21,"f","asian")
values(101,22,"f","african")
values(102,23,"f","mangoli")
values(100,23,"f","mangoli")
values(100,23,"f","mangoli");
quit;

proc sql;
create table demo(pid num,age num, gender char check(gender="f"),race char );
quit;

/* the second row has gender "m", so the check constraint rejects the insert */
proc sql;
insert into demo
values(100,21,"f","asian")
values(101,22,"m","african")
values(102,23,"f","mangoli")
values(100,23,"f","mangoli")
values(100,23,"f","mangoli");
quit;
/*proc sql;*/
/*describe table demo;*/
/*quit;*/

primary key: can be used to load the data without duplicate data values, without duplicate observations, and without missing values.

Table constraints: using a table constraint, we can avoid duplicate observations.

primary key: using primary key as a table constraint, we can avoid duplicate observations at loading time.

constraint statement: can be used to assign the required constraint to the required variable.

syntax: constraint <integrityname> <type> <variable>

proc sql;
create table demo1(pid num,age num, gender char ,race char, constraint pk primary key(pid));
quit;
proc sql;
describe table demo1;
quit;

proc sql;
insert into demo1
values(100,21,"f","asian")
values(101,22,"f","african")
values(102,23,"f","mangoli");
quit;

/* pid 100 is already loaded, so the primary key rejects these rows */
proc sql;
insert into demo1
values(100,23,"f","mangoli")
values(100,23,"f","mangoli");
quit;

proc sql;
create table lab(pid num,test char,units num,constraint fk foreign key(pid) references demo1);
quit;
proc sql;
insert into lab
values(100,'hr',76)
values(101,'sbp',156);
quit;
proc append base=demo1 data=lab force;
run;
data x;
set demo1 lab;
run;

19/09/11

proc sql;
create table demo(pid num ,age num, gender char,race char);
quit;
proc datasets lib=work;
modify demo;
ic create uk=unique(pid);
quit;
proc sql;
describe table demo;
quit;
proc sql;
insert into demo
values(100,21,"f","asian")
values(101,22,"m","african")
values(102,23,"f","mangoli");
quit;

/* a constraint is checked against the rows already in the table, */
/* so ck is refused while pid 101 still has gender "m"             */
proc datasets lib=work;
modify demo;
ic create nt=not null(gender);
ic create ck=check(where=(gender='f'));
quit;

/* pid 100 to 102 are already loaded, so the unique constraint uk rejects these inserts */
proc sql;
insert into demo
values(100,21,"f","asian")
values(101,22,"f","african")
values(102,23,"f","mangoli")
values(103,23,"f","mangoli");
quit;

proc sql;
insert into demo
values(100,21,"f","asian")
values(101,22,"f","african")
values(102,23,"f","mangoli")
values(103,23,"f","mangoli")
values(104,23,"m","mangoli");
quit;
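The proc datasets list earlier mentions deleting constraints but the notes give no example. A minimal sketch, assuming the demo table and the constraint name uk created above:

```sas
/* Sketch: assumes WORK.DEMO and the constraint UK from the steps above. */
proc datasets lib=work nolist;
    modify demo;
    ic delete uk;       /* remove one constraint by name */
    ic delete _all_;    /* remove every remaining constraint on the table */
quit;
proc sql;
    describe table demo;   /* the log shows which constraints are left */
quit;
```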

proc report: to choose which columns appear in the report, use the column statement.

data _null_;
file 'C:\Users\home\Desktop\New folder (3)\uk.txt';
put @1 'patientname' @18 'medicine' @28 'No of visits' @43 'No of patients';
run;
proc sort data=trtment out=x;
by visit;
run;
data _null_;
set x;
by visit;
file 'C:\Users\home\Desktop\New folder (3)\uk.txt' mod;
put @1 gid @22 drug @38 visit @56 sub;
if first.visit then ct=0;
ct+sub;
if last.visit then put @58 'visit' visit 'total' ct;
run;

20/09/11

proc report: using this procedure, we can run the required analysis and generate reports in the required format. It is a powerful reporting tool. With it we can do the analysis of the frequency procedure, the means procedure, the tabulate procedure and the print procedure.

Report window: the report procedure generates the report in the report window.

columns statement: it requires a variable list; these variables play the main role in the analysis and the report.

Define statement: can be used to tell the SAS system how to use the required variable in analysis and reporting.

order, group, across options: the main use of these options is to arrange the data in the required order for reporting.

break statement: can be used to give summary breaks in the middle of the report, based on a group variable. The break statement works with 2 options.
1) After: gives the break after the group.
2) Before: gives the break before the group.

ol - overline
ul - underline
dol - double overline
dul - double underline

summarize option: can be used to report the required analysis at the break.

Rbreak statement: can be used to give a summary break at the end or the beginning of the report, based on the after or before option.

compute block: using this block, we can do new analysis for reporting.
1. To generate a new data value for reporting.
2. To create a new variable for reporting.
The compute block ends with endcomp. The compute block also works with the after and before options.

data boats;
input name $ 1-12 port $ 14-20 locomotion $ 22-26 type $ 28-30 price 32-36;
cards;
silent lady  maalea  sail  sch 75.00
american II  maalea  sail  yac 32.95
aloha anai   lahaina sail  cat 62.00
ocean spirit maalea  power cat 22.00
anuenue      maalea  sail  sch 47.50
hana lei     maalea  power cat 28.99
leilani      maalea  power yac 19.99
kalakaua     maalea  power cat 29.50
reef runner  lahaina power yac 29.95
blue dolphin maalea  sail  cat 42.95
;
run;
data natparks;
input name $ 1-21 type $ region $ museums camping;
cards;
dinosaur              nm west 2 6
ellis island          nm east 1 0
everglades            np east 5 2
grand canyon          np west 5 3
great smoky mountains np east 3 10
hawaii volcanoes      np west 2 2
lava beds             nm west 1 1
statue of liberty     nm east 1 0
theodore roosevelt    np . 2 2
yellowstone           np west 9 11
yosemite              np west 2 13
;
run;
proc print;run;
proc report data=natparks nowd headskip;
run;
proc report data=natparks nowindows headline;
column museums camping;
run;
/* a variable takes one usage, so only analysis is kept here */
proc report data=natparks nowindows headline;
column museums camping;
define museums/analysis sum;
define camping/analysis sum;
run;
proc report data=natparks nowindows headline;
column museums camping;
define museums/analysis mean;
define camping/analysis mean;
run;
proc report data=natparks nowindows headline missing;
column region name museums camping;
define region/order;
define camping/analysis 'camp/grounds';
run;
/* both class variables are groups, so the rows collapse by region and type */
proc report data=natparks nowindows headline missing;
column region type museums camping;
define region/group;
define type/group;
run;
proc report data=natparks nowindows headline missing;
column region type,( museums camping);
define region/group;
define type/across;
run;
proc report data=natparks nowindows headline missing;
column name region museums camping;
define region/order;
break after region/summarize ol skip;
rbreak after/summarize ol skip;
run;

data trtment;
input gid $ drug $ visit sub;
cards;
g1234 col5mg 1 90
g2345 col5mg 1 89
g4567 col5mg 1 78
g1234 col5mg 2 50
g2345 col6mg 2 79
g4567 col6mg 2 38
g1234 col6mg 3 70
g2345 col6mg 3 89
g1234 col7mg 4 90
g2345 col7mg 4 89
;
run;
proc report data=trtment nowindows;
column gid drug sub;
define gid/group;
define sub/sum;
break after gid/ol ul summarize;
rbreak after/dol dul summarize;
compute after gid;
gid='total';
endcomp;
compute after;
gid='gtotal';
endcomp;
run;

data x;
input eid sal;
cards;
100 2000
101 2500
102 3000
;
run;
proc report data=x nowd;
columns eid sal;
define eid/'employeeid' width=10 display;
define sal/'empsalary' width=10;
run;
proc report data=x nowd;
columns eid sal anualsal;
define eid/'employeeid' width=10 display;
define sal/'empsalary' width=10 display;
define anualsal/'anualsal' computed;
compute anualsal;
anualsal=sal*12;
endcomp;
run;

data company;
input cname $ details $ amount;
cards;
satyam invest 6700
tcs invest 6800
satyam profit 3400
tcs profit 2300
wipro invest 5600
wipro profit 3400
;
run;
proc report data=company headline nowindows;
columns cname (details,amount);
define cname/group;
define details/across;
break after cname/dul;
run;

data medi;
input pid bsbp drug $ asbp;
cards;
100 178 col5mg 165
101 156 col5mg 159
102 178 col10mg 168
103 177 col10mg 177
104 180 col15mg 182
105 169 col15mg 134
;
run;
proc report data=medi headline;
columns pid drug bsbp asbp status;
define pid/group;
define status/computed;
break after pid/ol ul;
compute status/character length=19;
if _c3_ < _c4_ then status='drug is not working';
else if _c3_ > _c4_ then status='drug is working';
else status='change the drug';
endcomp;
run;


Task 1:

data x;
input eid sal;
cards;
100 2000
100 2500
101 2500
101 2500
100 2000
100 2000
100 2000
101 2500
102 3000
103 4500
110 5000
105 5000
108 7000
109 4000
110 3000
111 4000
112 8000
113 5500
;
run;

Create a new dataset containing the duplicate observations:

proc sort data=x out=dups;
by eid;
run;
data x2;
set dups;
by eid;
if not (first.eid and last.eid);
run;
proc print;run;

or (new):

proc sort data=x out=x1 nodupkey dupout=dups;
by eid;
run;

Output Delivery System

ods pdf file='C:\Users\home\Desktop\New folder (3)/uk.pdf';
proc report data=sashelp.class nowd;
run;
ods pdf close;

ods html file='C:\Users\home\Desktop\New folder (3)/uk.html';
proc report data=sashelp.class nowd;
run;
ods html close;

ods rtf file='C:\Users\home\Desktop\New folder (3)/uk.rtf';
proc report data=sashelp.class nowd;
run;
ods rtf close;

Data x;
input cust $ name $ age sex $ proc $ charge hrp_charge rent_charge;
cards;
100 raja 24 M 890 900 1000 2000
200 kaja 67 M 900 987 2000 8000
100 rani 89 m 300 800 600 876
300 hani 56 F 800 908 123 8908
;
run;

/* As posted, the steps below refer to CHARGES, HRP_ALLOW, PPOALLOW, DOB_DATE, ADATE, */
/* KEY5, KEY6 and the macro variables &CHARGES2 etc., none of which are created above. */
PROC SQL;
CREATE TABLE x1 AS SELECT *,
 PUT(SUM(CHARGES),BEST.) AS CHARGES1,
 PUT(SUM(HRP_ALLOW),BEST.) AS HRP_ALLOW1,
 PUT(SUM(PPOALLOW),BEST.) AS PPOALLOW1
 FROM x GROUP BY cust;
QUIT;

PROC SQL;
CREATE TABLE x2 AS SELECT *,
 SUM(CHARGES) AS CHARGES2,
 SUM(HRP_ALLOW) AS HRP_ALLOW2,
 SUM(PPOALLOW) AS PPOALLOW2
 FROM x1;
QUIT;

DATA TEMP1_REP;
length key1 $150. key2 $150. KEY3 $150. KEY4 $150. KEY5 $150.;
SET x2;
DOB_DATE1=put(input(DOB_DATE,anydtdte19.),mmddyys10.);
ADATE1=put(input(ADATE,anydtdte19.),mmddyys10.);
KEY1=CAT(cust,'^2',NAME,'^3',sex);
KEY2=CAT(proc,'0');
KEY3=CAT(charge,'^2',hrp_charge,'^3',rent_charge);
KEY4=CAT(cust,'^2',CHARGES,'^3',HRP_ALLOW,'^4',PPOALLOW);
RUN;

options papersize=letter mlogic mprint symbolgen nodate nonumber;
footnote;
ods pdf file='C:\Users\home\Desktop\New folder (3)/uk2.pdf' UNIFORM notoc;
/*ods pdf option UNIFORM;*/
proc report data=x2;
COLUMNS KEY1 KEY2 KEY3 KEY4 KEY5 KEY6;

define KEY1 /GROUP ' ' noprint style(column)={cellwidth=2 just=left};
define KEY2 /GROUP ' ' noprint;
define KEY3 /GROUP ' ' noprint;
define KEY4 /GROUP ' ' noprint;
define KEY5 /GROUP ' ' noprint;
define KEY6 /GROUP ' ' noprint;

break before KEY1 / SKIP;

compute before KEY1 / style={just=left};
cust=Substr(KEY1,1,index(key1,'^2')-1);
NAME=Substr(KEY1,index(key1,'^2')+2,(index(key1,'^3')-index(key1,'^2'))-2);
SEX=substr(key1,index(key1,'^3')+2);
LINE @1 'cust:' cust $9. @25 'NAME:' NAME $22. @40 'SEX:' PAT_SEX $1.;
ENDCOMP;

COMPUTE BEFORE KEY2;
proc=substr(key2,index(key2,'0'));
LINE @1 'proc:' proc $12.;
ENDCOMP;

COMPUTE BEFORE KEY3;
charge=Substr(key3,1,index(key3,'^2')-1);
hrp_charge=Substr(key3,index(key3,'^2')+2,(index(key3,'^3')-index(key3,'^2'))-2);
rent_charge=Substr(key3,index(key3,'^3')+2,(index(key3,'^4')-index(key3,'^3'))-2);
LINE @1 'charge:' charge $9. @22 'hrp_charge:' hrp_charge $9. @39 'rent_charge' rent_charge $9.;
ENDCOMP;

COMPUTE BEFORE KEY4;
cust=Substr(key4,1,index(key4,'^2')-1);
CHARGES=Substr(key4,index(key4,'^2')+2,(index(key4,'^3')-index(key4,'^2'))-2);
HRP_ALLOW=Substr(key4,index(key4,'^3')+2,(index(key4,'^4')-index(key4,'^3'))-2);
PPOALLOW=Substr(key4,index(key4,'^4')+2);
LINE @1 'TOTALS FOR :' cust $5.;
LINE @1 'CHARGES:' CHARGES $7. @20 'HRP_ALLOW:' HRP_ALLOW $7. @40 'PPO_ALLOW:' PPOALLOW $8.;
ENDCOMP;

COMPUTE AFTER;
LINE @1 "GRAND TOTALS" @15 "CHARGES: &CHARGES2" @40 "HRP_ALLOW: &HRP_ALLOW2" @65 "PPO_ALLOW: &PPOALLOW2";
ENDCOMP;

run;
ods pdf close;

I want a report like the one below, and it must be in the form of a PDF file. Please send the code very soon.

cust:100 name:raja sex:M
proc:890
charge:900 hrp_charge:1000 rent_charge:2000
cust:100 name:rani sex:F
proc:300
charge:908 hrp_charge:123 rent_charge:8908
Total for 100 charge:1808 hrp_charge:1123 rent_charge:110908
cust:200 name:kaja sex:M
proc:900
charge:987 hrp_charge:2000 rent_charge:8000
Total for 200 charge:987 hrp_charge:2000 rent_charge:8000
cust:300 name:hani sex:F
proc:800
charge:908 hrp_charge:123 rent_charge:8908
Total for 300 charge:908 hrp_charge:123 rent_charge:8908
Grand total charge:3595 hrp_charge:3723 rent_charge:19784
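One possible way to get a layout close to the one requested is a data _null_ step with by-group processing, written while an ods pdf destination is open. This is only a sketch, not the posted code repaired: it assumes the dataset x and its variable names from the data step above, the file name report.pdf is a placeholder, and the totals it prints come from the data as entered.

```sas
/* Sketch only: assumes WORK.X as read above; 'report.pdf' is a placeholder path. */
proc sort data=x out=xs;
    by cust;
run;

ods pdf file='report.pdf' notoc;
title;
data _null_;
    set xs end=last;
    by cust;
    file print;                          /* PUT output goes to the open ODS destination */
    if first.cust then do;               /* reset the per-customer totals */
        c_tot = 0; h_tot = 0; r_tot = 0;
    end;
    c_tot + charge;  h_tot + hrp_charge;  r_tot + rent_charge;   /* customer totals */
    g_c + charge;    g_h + hrp_charge;    g_r + rent_charge;     /* grand totals */
    put 'cust:' cust @12 'name:' name @28 'sex:' sex
      / 'proc:' proc
      / 'charge:' charge @18 'hrp_charge:' hrp_charge @40 'rent_charge:' rent_charge;
    if last.cust then
        put 'Total for ' cust 'charge:' c_tot @30 'hrp_charge:' h_tot @52 'rent_charge:' r_tot /;
    if last then
        put 'Grand total ' 'charge:' g_c @30 'hrp_charge:' g_h @52 'rent_charge:' g_r;
run;
ods pdf close;
```

The sum statements (c_tot + charge;) retain their values across observations, which is why the per-customer totals are reset at first.cust.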

Sending email

FILENAME myemail EMAIL
  from=("[email protected]")
  to=("[email protected]")
  cc=("[email protected]")
  subject="An automatic email sent from SAS"
  attach="C:\Users\home\Desktop\New folder (3)\uk.pdf";

data _null_;
file myemail;
put "Your report is now available online." / /
    "Thank you and have a great day." / /
    " " / /
    "Sincerely," / /
    "Venkat Prasad Sandu" / /
    " " / /
    "This is an automated email sent by SAS on behalf of Venkat Prasad Sandu";
run;

Proc report

data test;
input Region $ Sales AgentID;
cards;
E 12 1
E 14 1
E 17 2
E 12 1
E 14 3
E 12 2
E 18 3
N 18 4
N 16 4
N 12 5
N 17 4
N 25 4
S 12 7
S 12 8
S 13 7
S 12 8
W 27 9
;
run;

options nodate nonumber;
title;
ods pdf file='C:\Users\home\Desktop\New folder (3)\rep.pdf';

proc report data=test nowd
  style=[frame=void rules=none]
  style(header)=[background=white];
column Region Sales AgentID;
define region / group;
define sales / analysis sum;
define AgentID / analysis n "Number of Agents";
rbreak after / summarize OL UL;
run;

ods pdf close;

Task 2:

%let name=prasad;
output: 'prasad' (need prasad in inverted commas).

Solution:

%let name=prasad;
%let ab=%str(%'&name%');
%put &ab;

Task 3:


data trans;
/* the colon modifier reads the date with list input, so it does not depend on exact column positions */
input CustomerID $ transactiondate :mmddyy10. amount category $;
cards;
9801234 10/01/1998 123.98 toys
9802234 12/10/1997 80.34 books
9802234 12/20/1997 100.00 apparel
9805556 08/01/1996 22.90 toys
9805556 09/10/1996 25.50 apparel
9805556 10/11/1996 18.90 books
9801134 11/11/1999 12.11 toys
;
run;

1. What is the total and average amount spent by category?
2. Which category has the highest average purchase?
3. What is the average number of categories that customers purchase?
4. What is the average and total amount by customer?
5. What is the average number of days between purchases (as of today)?


/*(1) Total and average amount spent by category */
proc sql;
create table one as
select category, sum(amount) as Total, avg(amount) as Average
from trans
group by category
order by category;
quit;

/*(2) Which category has the highest average purchase */
proc sql outobs=1;
create table two as
select category, avg(amount) as Average
from trans
group by category
order by 2 desc;
quit;

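/*(3) What is the average number of categories that customers purchase */
/* The inner query counts distinct categories per customer; the outer query averages those counts */
proc sql;
create table three as
select avg(n_cat) as Avg_cat
from (select CustomerID, count(distinct category) as n_cat
      from trans
      group by CustomerID);
quit;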

/*(4) What is the average and total amount by customer */
proc sql;
create table four as
select CustomerID, sum(amount) as Total, avg(amount) as Average
from trans
group by CustomerID
order by CustomerID;
quit;

/*(5) What is the average number of days between purchases (as of today) */
proc sort data=trans;
by CustomerID transactiondate;
run;

data findavg(drop=mult_trans daysinbet td1 td2 amount category transactiondate);
set trans;
by CustomerID transactiondate;
retain td1 td2;
if first.CustomerID then do;
  /* first purchase: remember its date and reset the counters */
  daysinbet=0; td1=transactiondate; mult_trans=0; avgdays=0;
end;
else do;
  /* later purchase: days since the first purchase divided by the number of gaps */
  td2=transactiondate; daysinbet=td2-td1; mult_trans+1; avgdays=daysinbet/mult_trans;
end;
if last.CustomerID then output;
run;

proc print;
run;

/*Read in the data*/
data COUNTY;
input COUNTY_ID $ STATE_NAME $ COUNTY_NAME $;
cards;
1 Texas Collin
2 Texas Dallas
3 Georgia DeKalb
;
run;

/*Read in the data*/
/* CATEGORY_NAME is read from columns 1-11, so the data lines are padded to that width */
data AGE_DISTRIBUTION_DESC;
length CATEGORY_DESCRIPTION $23;
input CATEGORY_NAME $ 1-11 CATEGORY_DESCRIPTION &;
cards;
AGE_0_10    < 10 years
AGE_10_20   Between 10 and 20 years
AGE_20_40   Between 20 and 40 years
AGE_40_PLUS > 40 years
;
run;

data AGE_DISTRIBUTION;
input COUNTY_ID $ AGE_0_10 AGE_10_20 AGE_20_40 AGE_40_PLUS;
cards;
1 100 20 40 60
2 10 10 40 50
3 45 100 56 67
;
run;

/*Sort the data for next merge step*/
proc sort data=COUNTY;
by county_id;
run;

proc sort data=AGE_DISTRIBUTION;
by county_id;
run;

/*Transpose the data by County_id*/
proc transpose data=AGE_DISTRIBUTION out=trans(rename=(_name_=category_name col1=total_num));
by county_id;
run;

/*Merge the 2 datasets to get the County Info at one place*/
data countyinfo;
merge county trans;
by county_id;
run;

/*Sort the data for next merge step*/


proc sort data=countyinfo;
by CATEGORY_NAME;
run;

proc sort data=AGE_DISTRIBUTION_DESC;
by CATEGORY_NAME;
run;

/*Merge the 2 datasets to add the category descriptions to the County Info*/
data final;
merge countyinfo AGE_DISTRIBUTION_DESC;
by CATEGORY_NAME;
run;

proc sort data=final;
by county_name;
run;

/*Macro to Create the Report in HTML and Excel Version */
%macro report(type);
filename report "C:\Users\home\Desktop\html - Copy\results.&type";
ods listing close;
ods html body=report;

/* First Report */
title;
proc report data=final nowd split='*'
  style(header)=[foreground=blue background=grey] /*style for the header*/
  style(column)=[foreground=black background=white]; /*style for the Columns*/
columns county_name ("Age Distribution" CATEGORY_DESCRIPTION total_num);
define county_name / group 'County*Name';
define CATEGORY_DESCRIPTION / 'Category' style(header)=[foreground=red background=grey];
define total_num / analysis 'Total Number' style(header)=[foreground=red background=grey];
/*break after county_name/skip ;*/
run;

/* Second Report */
proc tabulate data=final format=3.
  style=[background=white foreground=black]; /*style for the data area*/
class county_name / style=[background=grey foreground=red]; /*style for the column name*/
class category_description / order=data style=[background=grey foreground=red]; /*style for the column name*/
var total_num;
table category_description='Category' all*{style=[background=white font_style=italic foreground=black]},
      county_name='County'*total_num=' ';
keylabel all='Total' sum=' ';
keyword all / style=[font_weight=extra_light background=white foreground=black font_style=italic];
classlev category_description county_name / style=[background=grey foreground=black];
run;

/* Third Report */
proc tabulate data=final format=3.
  style=[background=white foreground=black];
class state_name / descend style=[background=grey foreground=red];
class category_description / order=data style=[background=grey foreground=red];
var total_num;
table category_description='Category' all*{style=[background=white font_style=italic foreground=black]},
      state_name='State'*total_num=' ';
keylabel all='Total' sum=' ';
classlev category_description state_name / style=[background=grey foreground=black];
keyword all / style=[font_weight=extra_light background=white foreground=black font_style=italic];
run;

ods html close;
ods listing;
%mend report;

%report(html); /*HTML Version*/
%report(xls); /*Excel Version*/