23/08/11
SAS
A complete programming language with report formatting and with statistical and mathematical capabilities.
1st part:
• The first part, called the DATA step, describes your input data and allows you to modify that data with arithmetic operations and logical decisions. The DATA step formats your data into a special file called a SAS data set.
The second part of SAS is a library of canned routines called PROCEDURES.
• The procedures can only use SAS data sets as input.
• A SAS data set must exist before a procedure can use it, so the DATA step that creates it must come before the PROC step.
• The procedures are executed by coding a special statement in SAS called a PROC statement.
The core of the SAS language (Base SAS):
DATA step: a programming language that you use to manage your data.
SAS procedures: tools for data analysis and reporting.
Macro facility: a tool for extending and customizing SAS software programs and for reducing text in your programs.
DATA step debugger: a tool that helps you find logic problems in DATA step programs.
Output Delivery System (ODS): a system that delivers output in a variety of easy-to-access formats, such as SAS data sets, listing files, or Hypertext Markup Language (HTML).
SAS windowing environment: an interactive, graphical user interface that enables you to easily run and test your SAS programs.
A DATA step consists of a group of statements in the SAS language that can:
• read data from external files
• write data to external files
• read SAS data sets and data views
• create SAS data sets and data views.
A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
SAS Macro Facility
Base SAS software includes the SAS Macro Facility, a powerful programming tool for extending and customizing your SAS programs, and for reducing the amount of code that you must enter to do common tasks. Macros are SAS files that contain compiled macro program statements and stored text. You can use macros to automatically generate SAS statements and commands, write messages to the SAS log, accept input, or create and change the values of macro variables.
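A minimal sketch of the macro facility at work: a macro variable, a macro that generates a PROC PRINT step, and a message written to the log. The names DSN and PRINTTOP are made up for illustration; sashelp.class is the sample data set that ships with SAS.

```sas
/* DSN and PRINTTOP are hypothetical names used only for this sketch */
%let dsn = sashelp.class;            /* create a macro variable */

%macro printtop(data=, n=5);
  /* write a message to the SAS log */
  %put NOTE: printing the first &n observations of &data;
  /* generate a PROC PRINT step from the parameters */
  proc print data=&data(obs=&n);
  run;
%mend printtop;

/* call the macro; &dsn resolves to sashelp.class */
%printtop(data=&dsn, n=3)
```

The same macro can be called again with another data set name, which is how macros reduce the amount of code you type for common tasks.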
SAS ARCHITECTURE
SAS Tier - Application Layer: all the servers: Metadata Server, Workspace Server, Stored Process Server and OLAP Server (SAS ETL processes).
Mid Tier - Data Management Layer: all data management and reporting services and tools (Base SAS, IOM).
Client Tier - Presentation Layer: all easy-to-use (GUI) reporting components (SAS Enterprise Guide, SAS Enterprise Miner, etc.).
Examples:
data x;
input name $ age sal;
cards;
prasad 30 20000
venkat 40 60000
;
run;
INFILE options control how SAS reads the raw data.
DLM: specifies the delimiter(s) that separate the values in the raw data.
DATA COMMAS;
INFILE DATALINES DLM=',';
INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68,144,M,23
2,78,202,M,34
3,62,99,F,37
4,61,101,F,45
;
/*several delimiters at once: blank, comma, & and $*/
DATA COMMAS3;
INFILE DATALINES DLM=', & $';
INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68&144,M,23
2,78,202,M,34
3$62,99,F,37
4,61,101$F,45
;
run;
/*Missover*/
If a record has fewer values than the INPUT statement expects, SAS by default moves to the next record to read the remaining values. To avoid that behaviour use MISSOVER: the variables without values are set to missing.
data mis;
infile cards missover;
input sal age name $;
cards;
2000 20 ramu
3000 40
5000 60 mahesh
6000 65
4000 55 rani
run;
proc print;
run;
/*dsd*/
1. Use DSD when the data values in the raw data are separated by commas (the default DSD delimiter).
2. SAS treats two consecutive delimiters as a missing value.
3. SAS reads a value enclosed in quotation marks as one value, even if it contains the delimiter, and removes the quotation marks.
DATA ds;
INFILE DATALINES dsd;
INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68,144,M,23
2,78,202,M,34
3,62,99,F,37
4,61,101,F,45
;
/*consecutive commas are read as missing HEIGHT*/
DATA ds1;
INFILE DATALINES dsd;
INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1,68,144,M,23
2,,202,M,34
3,62,99,F,37
4,,101,F,45
;
proc print;
run;
/*quoted values that contain a comma*/
data scores;
infile datalines dsd;
input Name : $9. Score Team : $25. Div $;
datalines;
Joseph,76,"Red Racers, Washington",AAA
Mitchel,82,"Blue Bunnies, Richmond",AAA
Sue Ellen,74,"Green Gazelles, Atlanta",AA
;
/*Truncover*/
It works like MISSOVER, but it also manages values of varying length: when a record ends before a formatted field is complete, TRUNCOVER keeps the characters that are present instead of setting the variable to missing.
data tr;
infile cards truncover;
input age sal name $20.;
cards;
12 3000 ramash
;
run;
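In-stream data lines (cards/datalines) are padded with blanks, so the difference between MISSOVER and TRUNCOVER is easier to see with an external file. The sketch below is an illustration under assumptions: the fileref RAW and the three records are made up, and it assumes a host that writes variable-length records (the usual case on Windows and UNIX).

```sas
/* RAW is a hypothetical temporary fileref; the records are made-up sample data */
filename raw temp;

data _null_;
  file raw;
  put '12 3000 ramesh';
  put '15 4000 venkateswara rao';
  put '18 5000';
run;

/* MISSOVER: a $20. field that runs past the end of the record is set to missing */
data mis;
  infile raw missover;
  input age sal name $20.;
run;

/* TRUNCOVER: the same field keeps whatever characters are present */
data tr;
  infile raw truncover;
  input age sal name $20.;
run;

proc print data=mis;
  title 'MISSOVER';
run;
proc print data=tr;
  title 'TRUNCOVER';
run;
title;
```

With these records NAME should be missing on every observation of MIS, because no record is long enough to fill the 20-column field, while TR should keep the names from the first two records.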
/*scanover*/
SCANOVER is used for scanning the record for a string: with @'string' in the INPUT statement, SAS starts reading the values just after that string.
data sca;
infile cards scanover;
input @'ab' name $ loc $ sal;
cards;
ab appolo hyd 20000
ab image hyd 30000
appolo sec 40000
care ban 50000
care hyd 60000
run;
proc print;
run;
data sca1;
infile cards scanover;
input @'ab' a b;
cards;
10 20 30
30 70 ab 78 89 90
40 ab 49 80
run;
proc print;
run;
/*stopover*/
When a record does not contain values for all the variables in the INPUT statement, STOPOVER treats this as an error and stops the DATA step.
data st;
infile cards stopover;
input t1 t2 t3 t4 t5;
cards;
10 22 33 88 99
44 55
99 88 77 66 60
run;
DATA LISTINP;
INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1 68 144 M 23
2 78 202 M 34
3 62 99 F 37
4 61 101 F 45
;
*---------------- EXAMPLE 3.1 --------------------;
DATA INFORMS;
INFORMAT LASTNAME $20. DOB MMDDYY8. GENDER $1.;
INPUT ID LASTNAME DOB HEIGHT WEIGHT GENDER AGE;
FORMAT DOB MMDDYY8.;
DATALINES;
1 SMITH 1/23/66 68 144 M 26
2 JONES 3/14/60 78 202 M 32
3 DOE 11/26/47 62 99 F 45
4 WASHINGTON 8/1/70 66 101 F 22
;
*---------------- EXAMPLE 3.2 --------------;
DATA COLONS;
INPUT ID LASTNAME : $20. DOB : MMDDYY8. HEIGHT WEIGHT GENDER : $1. AGE;
FORMAT DOB MMDDYY8.;
DATALINES;
1 SMITH 01/23/66 68 144 M 26
2 JONES 3/14/60 78 202 M 32
3 DOE 11/26/47 62 99 F 45
4 WASHINGTON 8/1/70 66 101 F 22
;
PROC PRINT;
TITLE 'Example 3.2';
RUN;
*--------------------------------------------;
*--------------- EXAMPLE 4 --------------;
/*& modifier: a value may contain single blanks; two or more blanks end the value*/
DATA AMPERS;
INPUT NAME & $25. AGE GENDER : $1.;
DATALINES;
RASPUTIN  45 M
BETSY ROSS  62 F
ROBERT LOUIS STEVENSON  75 M
;
PROC PRINT;
TITLE 'Example 4';
RUN;
*-----------------------------------------;
*---------------------- EXAMPLE 5.1 ----------------------;
DATA COLINPUT;
INPUT ID 1 HEIGHT 2-3 WEIGHT 4-6 GENDER $ 7 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT;
TITLE 'Example 5.1';
RUN;
*----------------------------------------------------------;
*--------------------- EXAMPLE 5.2 -----------------------;
DATA COLINPUT;
INPUT ID 1 HEIGHT 2-3 WEIGHT 4-6 GENDER $ 7 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT;
TITLE 'Example 5.2';
RUN;
*----------------------------------------------------------;
*--------------- EXAMPLE 5.3 -------------;
/*column input can read only some of the columns*/
DATA COLINPUT;
INPUT ID 1 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT;
TITLE 'Example 5.3';
RUN;
*------------------------------------------;
*--------------- EXAMPLE 5.4 -------------;
/*column input can read the columns in any order*/
DATA COLINPUT;
INPUT AGE 8-9 ID 1 WEIGHT 4-6 HEIGHT 2-3 GENDER $ 7;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT;
TITLE 'Example 5.4';
RUN;
*------------------------------------------;
*------------ EXAMPLE 6.1 ---------;
DATA POINTER;
INPUT @1 ID 3. @5 GENDER $1. @7 AGE 2. @10 HEIGHT 2. @13 DOB MMDDYY6.;
FORMAT DOB MMDDYY8.;
DATALINES;
101 M 26 68 012366
102 M 32 78 031460
103 F 45 62 112647
104 F 22 66 080170
;
PROC PRINT;
TITLE 'Example 6.1';
RUN;
*-----------------------------------;
*------------- EXAMPLE 7.1 -------------;
/*two data lines per observation: #1 and #2 are line pointers*/
DATA POINTER;
INPUT #1 @1 ID 3. @5 GENDER $1. @7 AGE 2. @10 HEIGHT 2. @13 DOB MMDDYY6.
      #2 @5 SBP 3. @9 DBP 3. @13 HR 3.;
FORMAT DOB MMDDYY8.;
DATALINES;
101 M 26 68 012366
101 120  80  68
102 M 32 78 031460
102 162  92  74
103 F 45 62 112647
103 134  86  74
104 F 22 66 080170
104 116  72  67
;
PROC PRINT;
TITLE 'Example 7.1';
RUN;
*----------------------------------------;
*------------- EXAMPLE 7.2 ---------;
/*four data lines per observation; only line 2 is read*/
DATA SKIPSOME;
INPUT #2 @1 ID 3. @12 SEX $6. #4;
DATALINES;
101 256 RED 9870980
101 898245 FEMALE 7987644
101 BIG 9887987
101 CAT 397 BOAT 68
102 809 BLUE 7918787
102 732866 MALE   6856976
102 SMALL 3884987
102 DOG 111 CAR 14
;
PROC PRINT;
TITLE 'Example 7.2';
RUN;
*--------------------------------------;
*------------- EXAMPLE 10 ------------;
/*Single trailing: the trailing single @ tells the program not to go to the next data line for the next INPUT statement in the data step*/
/*or*/
/*To hold line pointer*/
DATA TRAILING;
INPUT @6 TYPE $1. @;
IF TYPE = '1' THEN INPUT AGE 1-2;
ELSE IF TYPE = '2' THEN INPUT AGE 3-4;
DROP TYPE;
DATALINES;
23   1
  44 2
;
PROC PRINT DATA=TRAILING;
RUN;
DATA MIXED;
INPUT @20 TYPE $1. @;
IF TYPE = '1' THEN INPUT ID 1-3 AGE 4-5 WEIGHT 6-8;
ELSE IF TYPE = '2' THEN INPUT ID 1-3 AGE 10-11 WEIGHT 15-17;
DATALINES;
00134168           1
00245155           1
003      23   220  2
00467180           1
005      35   190  2
;
PROC PRINT;
RUN;
/*one data line holds four visits; each INPUT ... @ reads one visit and OUTPUT writes it*/
data med;
input pid visit drug $ @;
output;
input visit drug $ @;
output;
input visit drug $ @;
output;
input visit drug $ @;
output;
cards;
100 1 5mg 2 5mg 3 10mg 4 15mg
101 1 10mg 2 10mg 3 15mg 4 15mg
run;
proc print;
run;
*--------------------------------------;
PROC PRINT;
TITLE 'Example 11.1';
RUN;
*----------------------------------;
*----------------------- EXAMPLE 11.2 --------------------;
/*double trailing: line contains*/
/*more than one observation.*/
DATA SHORTWAY;
INPUT X Y @@;
DATALINES;
1 2 3 4 5 6 6 9 10 12 13 14
;
PROC PRINT;
RUN;
Single trailing:
1. Use a trailing @ at the end of the INPUT statement to hold the record in the input buffer for the execution of the next INPUT statement.
2. Use an IF statement on the portion that is read in to test for a condition.
3. If the condition is met, use another INPUT statement to read the remainder of the record to create an observation.
4. If the condition is not met, the record is released and control passes back to the top of the DATA step.
5. Reading the parts of data more than once.
data red_team;
input Team $ 13-18 @;
if Team='red';
input IdNumber 1-4 StartWeight 20-22 EndWeight 24-26;
datalines;
1023 David  red    189 165
1049 Amelia yellow 145 124
1219 Alan   red    210 192
1246 Ravi   yellow 194 177
1078 Ashley red    127 118
1221 Jim    yellow 220   .
;
proc print data=red_team;
run;
data values     informat     format
28/02/2003      ddmmyy10.    ddmmyy10. or ddmmyys10.
28-02-2003      ddmmyy10.    ddmmyyd10.
28:02:2003      ddmmyy10.    ddmmyyc10.
28.02.2003      ddmmyy10.    ddmmyyp10.
28 02 2003      ddmmyy10.    ddmmyyb10.
Eg:2
data values     informat     format
28/02/03        ddmmyy8.     ddmmyy8. or ddmmyys8.
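A small sketch of the table above in code: one informat reads the dates, and the separator in the output is chosen by the format. The data set name DATEFMT and the variable names are made up for illustration. Note that a day that does not exist, such as 30 February, is read as a missing value with an invalid-data note in the log.

```sas
/* DATEFMT and its variable names are hypothetical */
data datefmt;
  input d : ddmmyy10.;       /* the informat accepts different separators */
  slash  = d;
  dash   = d;
  colon  = d;
  period = d;
  blank  = d;
  /* the letter before the width selects the separator that is printed */
  format d date9.
         slash  ddmmyys10.
         dash   ddmmyyd10.
         colon  ddmmyyc10.
         period ddmmyyp10.
         blank  ddmmyyb10.;
  datalines;
28/02/2003
28-02-2003
28.02.2003
;
run;

proc print data=datefmt;
run;
```

All three observations hold the same SAS date value; only the display differs from column to column.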
data x;
input id stime :time10. etime :time8.;
format stime timeampm12. etime time8.;
cards;
200 10:23:34am 15:34:19
201 02:23:34pm 18:23:34
run;
proc print;
run;
27th August 2011
Proc sort
Used for sorting observations. By default it sorts in ascending order; for descending order, specify the DESCENDING option before the variable in the BY statement.
/*duplicate data value*/
The same data value is repeated in a variable, so it is called a duplicate data value (DDV). Duplicate data values are found based on the required (BY) variable.
/*duplicate observation*/
The same observation is repeated in the data. Duplicate observations are found based on all the variables in the data.
/*sample data for the nodupkey and noduprecs options*/
data x;
input id sal age;
cards;
100 2000 24
100 7000 26
100 2000 24
100 2000 24
100 3000 25
101 4000 56
101 4000 56
101 4000 56
99 5999 67
88 8999 90
;
run;
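The sorting options described above can be sketched as follows. The data set EMP and the output data set names are made up for illustration; the OUT= option keeps the original data set unchanged.

```sas
/* EMP and the EMP_* output names are hypothetical */
data emp;
  input id sal age;
  datalines;
100 2000 24
100 7000 26
100 2000 24
101 4000 56
101 4000 56
99 5999 67
;
run;

/* descending order of sal */
proc sort data=emp out=emp_desc;
  by descending sal;
run;

/* NODUPKEY: keep the first observation for each value of the BY variable */
proc sort data=emp out=emp_nodupkey nodupkey;
  by id;
run;

/* NODUPRECS: drop an observation that is identical to the previous one;
   sorting by every variable puts the duplicates next to each other */
proc sort data=emp out=emp_noduprecs noduprecs;
  by id sal age;
run;
```

With this data EMP_NODUPKEY has 3 observations (one per id) and EMP_NODUPRECS has 4 (the two repeated observations are removed).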
/*we can hold values across observations*/
or
/*we can do cumulative totals.*/
data x(keep=a ct);
input a b;
/*retain ct 0;*/
/*ct=ct+a;*/
ct+a;
cards;
10 20
20 80
30 90
40 80
. 10
20 10
run;
proc print data=x;
run;
/*sum() ignores missing values; the + operator returns missing*/
data x;
input a b;
k=sum(a,b);
k1=a+b;
cards;
10 20
20 .
30 90
run;
proc print data=x;
run;
data x;
input id sal;
cards;
10 2000
11 3000
12 4000
run;
data x1;
input id sal;
cards;
14 8000
15 9000
11 8000
run;
data c;
set x x1;
run;
data c1;
set c;
if sal >3000;
run;
proc print;
run;
data company;
input company $ @8 area $9. invest strengh;
cards;
satyam vizag     2000000 567
tcs    hyderabad 5000000 956
cts    hyderabad 6000000 345
tcs    madras    5000000 678
wipro  hyderabad 7000000 789
wipro  bangalore 6700000 456
satyam bangalore 6700000 453
wipro  pune      6700000 789
hsbc   hyderabad 8900000 673
icici  bangalore 8000000 893
hdfc   pune      9000000 796
run;
proc print data=company;
run;
data satyam;
set company;
where company in('hdfc','tcs');
run;
proc print;
run;
proc print data=company;
where company='satyam' and invest ge 6000000;
run;
proc print data=company;
where company eq 'satyam' or company='tcs';
run;
proc print data=company;
where invest between 6000000 and 8000000;
run;
/*to exclude both companies use AND; with OR the condition is true for every observation*/
proc print data=company;
where company ne 'satyam' and company ne 'tcs';
run;
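data x;
input id sal;
cards;
100 3000
101 4000
103 5000
102 6000
108 9000
105 6000
run;
data x1;
input id loc $;
cards;
;
run;
/*MERGE with BY needs both data sets sorted by id*/
proc sort data=x;
by id;
run;
proc sort data=x1;
by id;
run;
/*Inner join*/
data fm;
merge x(in=a) x1(in=b);
by id;
if a and b;
run;
proc print;
run;
/*Full join*/
data lm;
merge x(in=a) x1(in=b);
by id;
if a or b;
run;
proc print;
run;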
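The IN= variables can also be used for left and right joins. The sketch below builds all four results in one DATA step; the data sets SALARY and PLACE and the output names are made up for illustration, and both inputs are already in id order.

```sas
/* SALARY, PLACE and the *_J output names are hypothetical */
data salary;
  input id sal;
  datalines;
100 3000
101 4000
102 6000
;
run;

data place;
  input id loc $;
  datalines;
101 hyd
102 pune
104 vizag
;
run;

data inner_j left_j right_j full_j;
  merge salary(in=a) place(in=b);
  by id;
  if a and b then output inner_j;   /* ids in both data sets       */
  if a then output left_j;          /* every id from SALARY        */
  if b then output right_j;         /* every id from PLACE         */
  output full_j;                    /* every id from either one    */
run;
```

With this data INNER_J has 2 observations (101, 102), LEFT_J and RIGHT_J have 3 each, and FULL_J has 4.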
data management:
Data management is a data-reading concept: it accepts data from different SAS files and arranges the data in a specific order.
Data management is based on 2 concepts:
1. Adding 2. Combining
Adding can be done in 2 ways:
1. Appending 2. Concatenation
Appending: adding the data of one or more data sets to an existing data set is called appending.
Append procedure: using this procedure we can do appending and concatenation.
/*concatenation with the SET statement*/
data x;
input pid age gender $;
cards;
100 34 female
101 56 male
run;
data x1;
input pid age gender $;
cards;
200 56 male
201 34 female
run;
data x2;
input pid age gender $;
cards;
300 58 male
301 39 female
run;
data x3;
set x x1 x2;
run;
proc print;
run;
/*one to one merge*/
data demo;
input id age sex $ weight;
cards;
100 45 female 67
101 34 male 34
102 23 female 45
103 34 male 67
run;
data medi;
input id drug $ sdate:date9.;
format sdate date9.;
cards;
102 col5mg 12oct2003
100 col10mg 15nov2003
101 col15mg 14dec2003
run;
proc sort data=demo;
by id;
run;
proc sort data=medi;
by id;
run;
data onetoone;
merge demo medi;
by id;
run;
proc print;
run;
/*one to many*/
data demo;
input id age sex $ weight;
cards;
100 45 female 67
101 34 male 34
102 23 female 45
103 34 male 67
run;
data lab;
input id test $ units;
cards;
100 hr 79
101 hr 78
102 hr 75
103 hr 76
100 sbp 178
102 sbp 178
103 sbp 145
100 dbp 89
101 dbp 88
102 dbp 85
103 dbp 89
run;
proc sort data=demo;
by id;
run;
proc sort data=lab;
by id;
run;
data onemany;
merge demo lab;
by id;
run;
proc print;
run;
/*many to one*/
/*demo and lab are the data sets created and sorted in the one to many example*/
data manyone;
merge lab demo;
by id;
run;
proc print;
run;
/*To run the merge concept based on matching*/
/*and non-matching observations.*/
/*To report who got expected adverse events and their medicine information.*/
data exadmed;
merge exadvent(in=var) medi;
by stno;
if var=1;
run;
proc print data=exadmed;
run;
proc append:
data demo1;
input pid age sex $;
cards;
100 34 female
101 56 male
run;
data demo2;
input pid age sex $;
cards;
200 56 male
201 34 female
run;
proc append base=demo1 data=demo2;
run;
data demo1;
input pid age sex $ loc $;
cards;
100 34 female hyd
101 56 male sec
run;
data demo2;
input pid age sex $ grade;
cards;
200 56 male 89
201 34 female 90
run;
Force: if the transaction data set contains additional variables that are not in the base data set, PROC APPEND does not append the data. To append anyway, use the FORCE option; the additional variables are dropped.
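A minimal sketch of the FORCE option, using made-up data sets BASE1 (with LOC) and ADD1 (with GRADE) that mirror the demo1/demo2 layout above.

```sas
/* BASE1 and ADD1 are hypothetical data sets for this sketch */
data base1;
  input pid age sex $ loc $;
  datalines;
100 34 female hyd
101 56 male sec
;
run;

data add1;
  input pid age sex $ grade;
  datalines;
200 56 male 89
201 34 female 90
;
run;

/* GRADE exists only in ADD1, so FORCE is needed;
   GRADE is dropped and LOC is missing for the appended observations */
proc append base=base1 data=add1 force;
run;

proc print data=base1;
run;
```

After the step BASE1 holds 4 observations with the variables pid, age, sex and loc; the log carries a warning about the dropped variable.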
Update:
The UPDATE statement replaces master data values with transaction data values based on a matching (BY) variable.
data oldemp1;
input eid sal;
cards;
100 4000
102 2000
103 3000
run;
data newemp2;
input eid sal;
cards;
103 6000
100 7000
run;
proc sort data=oldemp1;
by eid;
run;
proc sort data=newemp2;
by eid;
run;
data emp3;
update oldemp1 newemp2;
by eid;
run;
proc print;
run;
27/08/11
data x2;
set x;
/*set the length first so that longer values are not truncated*/
length group $11;
select;
when (age <= 10) group = "child";
when (11 <= age <= 19) group = "teenager";
when (20 <= age <= 29) group = "young adult";
when (30 <= age <= 45) group = "adult";
when (46 <= age <= 59) group = "middle age";
otherwise group = "senior";
end;
run;
proc print;
run;
data x;
input age sal;
cards;
1 2000
2 3000
3 7000
4 8000
5 8000
9 2000
10 5000
11 8000
12 9000
;
run;
data x2;
set x;
if age in(1,4,5) then do;
bonus=sal*0.2;
hike=sal+bonus;
end;
else do;
bonus=sal*0.5;
hike=sal+bonus;
end;
run;
proc print;
run;
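loop concept:
A loop can be used to run the required statements multiple times. It must have 3 parts:
1. a loop variable with a starting value
2. a condition
3. an increment
output:
The OUTPUT statement can be used to write the current observation to the data set being created.
/*no OUTPUT in the loop: one observation, written at the end of the step (i=11)*/
data x;
do i=1 to 10;
end;
run;
proc print;
run;
/*OUTPUT in the loop: 10 observations, i=1 to 10*/
data x1;
do i=1 to 10;
output;
end;
run;
proc print;
run;
/*k is calculated before OUTPUT, so every observation has k=i+1*/
data x2;
do i=1 to 10;
k=i+1;
output;
end;
run;
proc print;
run;
/*k is calculated after OUTPUT, so each observation shows the k of the previous pass (missing in the first)*/
data x3;
do i=1 to 10;
output;
k=i+1;
end;
run;
proc print;
run;
/*i is also increased inside the loop: i=1,3,5,7,9*/
data x4;
do i=1 to 10;
output;
i=i+1;
end;
run;
proc print;
run;
/*i is increased before OUTPUT: i=2,4,6,8,10*/
data x5;
do i=1 to 10;
i=i+1;
output;
end;
run;
proc print;
run;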
data one;
x=10;
output;
x=5;
output;
run;
/*OUTPUT can name the data set that receives the observation*/
data one two;
x=10;
output one;
x=5;
output two;
run;
proc print;
run;
data demo;
i=100;
do while(i<=120);
cid=i;
output;
i=i+1;
end;
drop i;
run;
proc print;
run;
data compound;
amount=5000;
rate=.075;
yearly=amount*rate;
qty+((qty+amount)*rate/4);
output;
qty+((qty+amount)*rate/4);
output;
qty+((qty+amount)*rate/4);
output;
qty+((qty+amount)*rate/4);
output;
run;
proc print;
run;
/*the same calculation with a loop*/
data comp;
amount=5000;
rate=.075;
yearly=amount*rate;
do qtr=1 to 4;
quarterly+(quarterly+amount)*rate/4;
end;
run;
proc print;
run;
data invest;
do year=2001 to 2003;
capital+5000;
capital+(capital*.075);
output;
end;
run;
proc print;
run;
data xxx;
do month='jan','feb','mar';
output;
end;
run;
proc print;
run;
data xxx;
do month=1,2,3,4,5,6;
output;
end;
run;
proc print;
run;
data x;
do i=1 to 20 by 2;
output;
end;
run;
proc print;
run;
data x;
do i=14 to 2 by -2;
output;
end;
run;
proc print;
run;
data inv;
do until(capital>1000000);
year+1;
capital+5000;
capital+(capital* .075);
output;
end;
run;
proc print;
run;
data inv1;
do while(capital<=1000000);
year+1;
capital+5000;
capital+(capital* .075);
output;
end;
run;
proc print;
run;
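The two investment loops above give the same result, so the difference between DO WHILE and DO UNTIL does not show there. A small sketch with made-up data set names (WHILE_DEMO, UNTIL_DEMO) where the condition is already decided before the first pass:

```sas
/* WHILE_DEMO and UNTIL_DEMO are hypothetical names */
data while_demo;
  n = 10;
  do while (n < 5);    /* tested before each pass: false at once, the body never runs */
    n = n + 1;
    output;
  end;
run;

data until_demo;
  n = 10;
  do until (n >= 5);   /* tested after each pass: the body runs once */
    n = n + 1;
    output;
  end;
run;

proc print data=until_demo;
run;
```

WHILE_DEMO has 0 observations and UNTIL_DEMO has 1 observation (n=11), because a DO UNTIL loop always executes at least once.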
30/08/11
functions:
A function takes some arguments (variables or data values), performs an action and returns a result. The result can be stored in another variable.
There are 3 types:
1. Numeric
2. Character
3. Date and time
int: returns the integer part of a value.
round: rounds a value to the nearest integer or to the nearest rounding unit (for example, a number of decimal places).
abs (absolute): converts all data values to positive values.
mod: returns the remainder of a division.
data x;
input id age weight;
cards;
100 23 78.76
101 24 56.45
102 34 -67.33
104 35 78.55
106 23 56.44
106 23 67.74
run;
data x1;
set x;
wint=int(weight);
wround=round(weight);
wabs=abs(weight);
run;
proc print;
run;
data one;
x=mod(20,5);
run;
proc print;
run;
/*even-numbered observations*/
data x;
set sashelp.class;
i=mod(_n_,2);
if i=0;
run;
proc print;
run;
/*odd-numbered observations*/
data x;
set sashelp.class;
i=mod(_n_,2);
if i=1;
run;
proc print;
run;
SUM(): can be used to do a row-wise sum. Unlike the + operator, SUM() ignores missing values.
data x;
input a b;
x=sum(a,b);
y=a+b;
cards;
12 34
56 78
45 67
;
run;
string functions:
1. length(): using this function we can find the length of a string (the number of characters, including embedded blanks but not trailing blanks). The length function returns a numeric value.
index(): returns the position of a character or a word in a string (the position of the first match, 0 if it is not found).
scan(): using the scan function we can get the required word from a string, that is, the nth word of the string.
substr(): using this function we can get a part of a string. It takes 3 arguments:
1. variable name 2. starting position 3. number of characters (if omitted, the rest of the string is returned).
data x;
input name $13. id;
x=length(name);
xi=index(name,'k');
xw=index(name,'an');
xs=scan(name,2);
xst=substr(name,1,4);
xst1=substr(name,3);
cards;
shahrukh khan 100
salman khan   101
ameer khan    102
prasad babu   103
;
run;
proc print;
run;
concatenation (combine): used for combining strings. The symbol is ||.
data x1;
length k $24.;
input fname $23. lname $;
k=fname||lname;
cards;
ramudfghjjkkllliutreww   k
prasaddfghhjjkloiyy      b
sasideqazxcvbnmkiu       r
run;
proc print;
run;
compress(): can be used to remove specific characters from a string. It takes 2 arguments and works character by character.
note: if we omit the 2nd argument in the compress function, it removes the blank spaces from the string.
data x1;
input fname $23. lname $;
x=compress(fname,'j');
xgap=compress(fname);
cards;
ramudfghj jkkllliutreww  k
prasaddfg hhjjkloiyy     b
sasideqazx cvbnmkiu      r
run;
proc print;
run;
translate(): can be used to replace the required character in a string.
/*translate(source, to, from): here every 'x' in name is replaced with 'a'*/
data x2;
input name $ sal;
x=translate(name,'a','x');
cards;
prasad 2000
ramesh 3000
rajesh 4000
run;
proc print;
run;
propcase():
Capitalizes the first letter of each word in a string.
data x2;
input name $ sal;
xp=propcase(name);
cards;
prasad 2000
ramesh 3000
rajesh 4000
run;
proc print;
run;
semantic errors:
If we pass the wrong number of arguments to a function, SAS reports an error when it compiles the step. This kind of error is called a semantic error.
31/08/11
cat(): concatenates strings without removing leading and trailing blanks.
cats(): concatenates strings and removes leading and trailing blanks.
catt(): concatenates strings and removes only trailing blanks.
catx(): concatenates strings, removes leading and trailing blanks and inserts separators.
data x;
x=' abc';
x1=' abc ';
y=' 123 ';
k=cat(x1,y);
ks=cats(x,y);
kt=catt(x1,y);
kx=catx(',',x1,y);
run;
proc print;
run;
tranwrd(): removes or replaces all occurrences of a word in a string.
data x1;
name='prasad babu mr';
x=tranwrd(name,'mr','mrs');
run;
proc print;
run;
quote(): adds double quotation marks to a string.
data x2;
input name $;
x=quote(name);
cards;
abc
xyz
;
run;
proc print;
run;
dequote(): removes the quotation marks.
data x2;
input name $;
x=quote(name);
y=dequote(x);
cards;
abc
xyz
;
run;
proc print;
run;
date/time format
day(), month(), year(), hour(), minute(), second(): return one part of a date value or a time value.
data demo;
input id svdate:date9. svtime:time8.;
format svdate date9. svtime time8.;
cards;
100 12jan2010 12:23:34
101 13feb2003 13:23:34
102 14feb2009 11:23:34
103 15mar2011 10:23:23
run;
data x;
set demo;
svday=day(svdate);
svmonth=month(svdate);
svyear=year(svdate);
svhour=hour(svtime);
svmin=minute(svtime);
svsec=second(svtime);
run;
proc print;
run;
datepart(): can be used to get the date value from a datetime variable.
timepart(): can be used to get the time value from a datetime variable.
data x;
input id svdtime:datetime18.;
format svdtime datetime18.;
cards;
100 12jan2003:12:23:24
101 13feb2005:13:23:34
run;
data x1;
format xdate date9. xtime time8.;
set x;
xdate=datepart(svdtime);
xtime=timepart(svdtime);
run;
proc print;
run;
intck(): can be used to report the difference between two date values in day, month or year intervals.
data x;
input id (sdate edate) (:date9.);
format sdate edate date9.;
days=intck('day',sdate,edate);
months=intck('month',sdate,edate);
year=intck('year',sdate,edate);
cards;
100 12jan2003 14dec2003
101 15jan2003 14oct2004
run;
proc print;
run;
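One point worth checking with intck(): by default it counts the interval boundaries that are crossed, not the elapsed time. A small sketch with a made-up data set GAP, where the two dates are only one day apart:

```sas
/* GAP and its variable names are hypothetical */
data gap;
  d1 = '31dec2003'd;
  d2 = '01jan2004'd;
  days    = intck('day',   d1, d2);                /* 1 */
  months  = intck('month', d1, d2);                /* 1: one month boundary is crossed */
  years   = intck('year',  d1, d2);                /* 1: one year boundary is crossed  */
  years_c = intck('year',  d1, d2, 'continuous');  /* 0: no full year has passed       */
  format d1 d2 date9.;
run;

proc print data=gap;
run;
```

The optional fourth argument 'continuous' (available in SAS 9.2 and later) measures whole intervals from the start date instead of counting boundaries.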
INTNX(): can be used to increment dates by intervals.
INTNX has three required arguments and one optional argument, commonly used as follows for SAS date values.
INTNX(interval, start-from, increment <,alignment>);
interval is the unit of measure (days, weeks, months, quarters, years, etc.) by which start-from is incremented.
start-from is a SAS date value to be incremented.
increment is the integer number of intervals by which start-from is incremented (negative values = earlier dates).
alignment is where start-from is aligned within interval prior to being incremented. Possible values are "beginning", "middle", "end", and (new in Version 9) "sameday". The default value is "beginning".
data x;
input id (sdate edate) (:date9.);
format sdate edate newdate newmonth newyear date9.;
newdate=intnx('day',sdate,10,'B');
newmonth=intnx('month',sdate,2);
newyear=intnx('year',sdate,2);
cards;
100 12jan2003 14dec2003
101 15jan2003 14oct2004
run;
proc print;
run;
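The effect of the alignment argument can be sketched with one start date. The data set SHIFT and its variable names are made up for illustration; 'b', 'e' and 's' are the short forms of beginning, end and sameday.

```sas
/* SHIFT and its variable names are hypothetical */
data shift;
  start   = '15jan2003'd;
  m_begin = intnx('month', start,  1);        /* 01FEB2003 (default: beginning) */
  m_end   = intnx('month', start,  1, 'e');   /* 28FEB2003 */
  m_same  = intnx('month', start,  1, 's');   /* 15FEB2003 */
  m_back  = intnx('month', start, -1, 'b');   /* 01DEC2002 */
  format start m_begin m_end m_same m_back date9.;
run;

proc print data=shift;
run;
```

Because the default alignment is the beginning of the interval, newmonth and newyear in the example above fall on the first day of the month and of the year.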
proc means: can be used to generate summary statistics.
data market;
input pno area $ product $ stock sale;
cards;
101 vizag lux 10 9
109 hyd lux 16 12
101 hyd cintol 78 76
110 hyd cintol 78 76
110 hyd cintol 12 10
108 vizag rin 15 10
108 vizag lux 23 18
101 vij lux 20 19
190 hyd rin 10 09
190 hyd lux 10 10
110 vizag lifebuoy 13 11
145 vizag lifebuoy 13 11
110 vizag cintol 19 17
101 vizag lux 13 12
108 vizag lifebuoy 10 11
190 hyd lifebuoy 10 09
101 vij lifebuoy 13 12
110 vizag lifebuoy 12 14
run;
proc means data=market sum mean n;
run;
var: requires analysis variables. Analysis variables must be numeric.
proc means data=market;
var sale;
run;
proc means data=market;
var sale stock;
run;
class: requires a grouping variable, also called a classification variable. A class variable can be either character or numeric.
proc means data=market;
class area;
var sale stock;
run;
/*sales wise analysis*/proc means data=market;var sale;run;
/*sales wise analysis based on product*/proc means data=market;class product;var sale;run;/*sales wise analysis based on area*/proc means data=market;class area;var sale;run;/*All types of analysis*/printalltypes:can be used to generate all types of analysis based classification variables and analysis variables.
proc means data=market printalltypes;class area product;var sale;run;
01/09/11
data one;
input res $ @@;
cards;
y n y n n n y y n n n
;
run;

proc means data=one;
class res;
run;

proc freq data=one;
table res;
run;

data medi4;
input group $ week $ drug $ sub;
cards;
g100 week1 col5mg 90
g200 week1 col5mg 90
g300 week1 col5mg 90
g100 week2 col5mg 70
g200 week2 col10mg 80
g200 week2 col10mg 80
. week3 col15mg 68
g200 week3 col15mg 64
g300 week4 col15mg 60
g200 week4 col15mg 60
g300 week4 col5mg 64
;
run;

proc print data=medi4;
run;

/*missing: count missing values of group as a level of their own*/
proc freq data=medi4;
table group / missing out=medi5;
run;

proc format;
value $gp ' '='miss';
run;

proc freq data=medi4;
table group / missing out=medi6;
format group $gp.;
run;

data hdr;
input pid (dos1-dos3) ($);
cards;
100 y n y
101 y y y
102 y n n
103 y y n
104 n y y
105 n n y
106 y y n
107 y y y
;
run;

proc print data=hdr;
run;

/*Who has taken dos1*/
proc freq data=hdr;
table dos1;
where dos1='y';
run;

/*Who has taken only dos1*/
proc freq data=hdr;
table dos1;
where dos1='y' and dos2='n' and dos3='n';
run;

/*Who has taken dos1 and dos2 (but not dos3)*/
proc freq data=hdr;
table dos1*dos2 / nopercent norow nocol;
where dos1='y' and dos2='y' and dos3='n';
run;
02/09/11
Proc format: can be used to create user-defined informats and formats.

Invalue statement: can be used to create user-defined informats for reading data.

proc format;
invalue ds 'l'=0.05
           'm'=0.1
           'h'=0.15;
run;

data medi;
input stno week $ drug $ dose :ds.;
cards;
100 week1 col l
101 week1 col l
102 week1 col m
100 week2 col h
101 week2 col m
102 week2 col h
;
run;

proc print;
run;

Value statement: can be used to create user-defined formats for reporting.

Picture statement: can be used to create a template for printing numbers.

proc format;
picture rpt low-<150='999 normal stage'
            150-<180='999 control stage'
            180-high='999 uncontrol stage';
run;

data x;
input id sbp;
format sbp rpt.;
cards;
100 89
101 145
102 190
103 160
104 155
105 4
;
run;

proc print;
run;

INPUT(source, informat.): converts a character value to numeric.

/*Character to numeric*/
data x;
input id sal $;
sal_num=input(sal,best.);
cards;
100 20000
101 3000
103 40000
;
run;

proc print;
run;

PUT(source, format.): writes a value through a format; the result is always character.

/*The format gn. and the informats $wk. and ds. must already be defined with proc format, and medi must contain the variable gender*/
data demo2;
set medi;
g1=put(gender,gn.);
w1=input(week,$wk.);
d1=input(drug,ds.);
run;

proc print data=demo2;
run;
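As a rough sketch of how PUT and INPUT pair with user-defined formats and informats, the step below defines its own format and informat first, so it runs on its own. The format name gn, the data set demo and its values are made up for illustration; only the ds informat comes from the notes above.

```sas
/* Hypothetical format and informat, defined here so the example is self-contained */
proc format;
  value gn 1='male' 2='female';          /* numeric format, used by PUT */
  invalue ds 'l'=0.05 'm'=0.1 'h'=0.15;  /* numeric informat, used by INPUT */
run;

data demo;
  input pid gender dose $;
  g1 = put(gender, gn.);   /* numeric -> character through a format */
  d1 = input(dose, ds.);   /* character -> numeric through an informat */
  cards;
100 1 l
101 2 m
102 1 h
;
run;

proc print data=demo;
run;
```

PUT always returns a character value, while INPUT returns numeric or character depending on the informat, which is why g1 is character and d1 is numeric here.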
03/9/11
/*CASE expression: two derived columns in one query*/
data med;
input pid sbp;
cards;
100 156
101 176
102 140
103 180
104 145
105 167
;
run;

proc sql;
create table case2 as
select *,
  case
    when sbp>=170 then '15mg'
    when sbp>=150 and sbp<170 then '10mg'
    else '5mg'
  end as drug,
  case
    when sbp>=170 then 3
    when sbp<170 then 2
    else 3
  end as dailydose
from med;
quit;
SQL: Structured Query Language

Using SQL, we can work with data held in any database, and we can generate results inside the SAS environment. SQL is built on four groups of statements:

1. DDL - data definition language
2. DML - data manipulation language
3. DCL - data control language
4. Query language

DDL: using DDL, we can create a table with variables but without observations (a null data set).

DML: using DML, we can insert data into an existing table, update data values, and delete observations from an existing table.

DCL: using DCL, we can control how the data is processed.

Query language: using the query language, we can retrieve data for reporting and storage.

create statement: can be used to create a table with variables and without observations.

insert statement: can be used to insert data into an existing table. The insert statement works with either a values clause or a set clause. The values clause assigns values by variable position; the set clause assigns values by variable name.

select statement: is a query statement. It can be used to retrieve data for reporting and storage.
proc sql;
create table prasad(name char, age int, sal int);
quit;

proc sql;
insert into prasad values('prasad',20,20000);
insert into prasad values('raju',30,80000);
insert into prasad set name='mahesh', age=30, sal=40000;
insert into prasad set age=35, sal=50000, name='shahrukh';
quit;

/*Several rows in one insert statement*/
proc sql;
create table prasad(name char(10), age int, sal int);
quit;

proc sql;
insert into prasad
  values('prasad',20,20000)
  values('raju',30,80000)
  values('mahi',35,6000);
insert into prasad
  set name='mahesh', age=30, sal=40000
  set age=35, sal=50000, name='shahrukh'
  set age=40, sal=60000, name='salmankhan';
quit;

/*Insert the result of a query*/
proc sql;
create table x(Name char, Sex char, Age int, Height int, Weight int);
insert into x
  select * from sashelp.class;
quit;

proc sql;
create table x1(Name char, Sex char, Age int, Height int, Weight int);
insert into x1
  select * from sashelp.class
  where sex="F";
quit;
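The DML description also mentions updating and deleting observations. A minimal sketch of both, using a hypothetical table named staff so that the tables above are left untouched:

```sas
proc sql;
  /* Hypothetical table, created only for this illustration */
  create table staff (name char(10), age int, sal int);
  insert into staff
    values('prasad', 20, 20000)
    values('raju',   30, 80000)
    values('mahi',   35,  6000);

  /* UPDATE changes values in the rows selected by WHERE */
  update staff set sal = sal + 1000 where age >= 30;

  /* DELETE removes the rows selected by WHERE */
  delete from staff where sal < 10000;

  /* Report what is left */
  select * from staff;
quit;
```

Without a WHERE clause, UPDATE changes every row and DELETE removes every row, so the condition is usually worth writing first.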
/*Create a table from a query*/
proc sql;
create table x2 as
select * from sashelp.shoes;
quit;

Order by: can be used to report the data in ascending or descending order. The default is ascending order. For descending order, write descending or desc after the variable name in the order by clause.

proc sql;
create table x3 as
select * from sashelp.class
order by age;
quit;

proc sql;
create table x3 as
select * from sashelp.class
order by age desc;
quit;

where clause: can be used to create a subset of the data for reporting and storage.

/*Reporting*/
proc sql;
select * from sashelp.class
where age>=15;
quit;

/*Storage*/
proc sql;
create table x3 as
select * from sashelp.class
where age>=15;
quit;

/*Store the age>=15 subset in descending order of weight*/
proc sql;
create table x3 as
select * from sashelp.class
where age>=15
order by weight desc;
quit;

/*Select only the required variables*/
proc sql;
create table x1 as
select name, age from sashelp.class;
quit;

proc print;
run;
data emp;
input eid salary sale;
cards;
100 2000 500
101 3000 300
102 4000 600
104 4000 900
105 5000 400
106 6000 500
107 7000 400
108 8000 400
109 6000 300
;
run;

/*DATA step version*/
data x4;
set emp;
if sale < 500 then nsal=2000;
else if 501<sale<700 then nsal=4000;
else nsal=5000;
run;

proc print;
run;

/*Equivalent CASE expression in SQL*/
proc sql;
create table x4 as
select *,
  case
    when sale lt 500 then 2000
    when sale gt 501 and sale lt 700 then 4000
    else 5000
  end as nsal
from emp;
quit;

data x;
input eid salary sale;
cards;
100 2000 500
101 3000 300
102 4000 600
104 4000 900
105 5000 400
106 6000 500
107 7000 400
108 8000 400
109 6000 300
;
run;

/*CASE expression used inside an arithmetic expression*/
proc sql;
create table x4 as
select *,
  salary + case
             when sale ge 500 then 2000
             else 100
           end as nsal
from x;
quit;

proc sql;
create table x6 as
select *,
  case
    when sale ge 500 then 2000
    else 100
  end as nsal
...
...
when (age <=10) group = "child      ";
when (11 <= age <= 19) group = "teenager   ";
when (20 <= age <= 29) group = "young adult";
when (30 <= age <= 45) group = "adult      ";
when (46 <= age <= 59) group = "middle age ";
otherwise group = "senior     ";
end;
run;

/*Equivalent CASE expression in SQL*/
proc sql;
create table x5 as
select *,
  case
    when (age <=10) then "child"
    when (11 <= age <= 19) then "teenager"
    when (20 <= age <= 29) then "young adult"
    when (30 <= age <= 45) then "adult"
    when (46 <= age <= 59) then "middle age"
    else "senior"
  end as group
from x;
quit;

proc print;
run;

/*Check that both methods give the same group values*/
proc compare base=x2 compare=x5;
var group;
with group;
run;

Update statement: can be used to modify the data values of an existing variable.

Syntax: update <table name> set <variable name>=<expression>

alter statement: can be used to modify an existing table. Modifications:
1. adding a new variable
2. adding or dropping a column
3. assigning constraints
4. deleting constraints

proc sql;
alter table emp add anualsal num;
quit;

proc sql;
update emp set anualsal=salary*12;
quit;

/*Several columns at once*/
proc sql;
alter table emp add anualsal num, bonus num, netsal num;
update emp set anualsal=salary*12;
update emp set bonus=salary*0.5;
update emp set netsal=anualsal+bonus;
quit;

data x;
input eid salary sale;
cards;
100 2000 500
101 3000 300
102 4000 600
104 4000 900
105 5000 400
106 6000 500
107 7000 400
108 8000 400
109 6000 300
;
run;

proc sql;
alter table x add nsal num;
quit;

proc sql;
update x set nsal=salary+case
  when sale ge 500 then 2000
  when sale ge 400 and sale lt 500 then 1500
...
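...
run;

proc print data=lab1;
run;

data lab2;
input stno test $ units;
date='13jan2003'd;
format date date9.;
cards;
102 hr 78
103 hr 90
103 dbp 89
103 sbp 156
102 dbp 90
102 sbp 178
102 sbp 178
102 sbp 178
;
run;

/*UNION ALL appends the two query results and keeps duplicate rows*/
proc sql;
create table unall as
select * from lab1
union all
select * from lab2;
quit;

/*DATA step concatenation of the same two data sets*/
data u;
set lab1 lab2;
run;

proc print;
run;

/*UNION removes duplicate rows*/
proc sql;
create table un as
select * from lab1
union
select * from lab2;
quit;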
INTERSECT produces rows that are common to both query results.

data ex;
input stno ad $ date :date9.;
format date date9.;
cards;
100 eyedis 12jan2003
105 eardis 12jan2003
102 eyedis 12jan2003
;
run;

data unex;
input stno ad $ date :date9.;
format date date8.;
cards;
103 nervous 12jan2003
104 coma 12jan2003
105 eardis 12jan2003
;
run;

proc sql;
select * from ex
intersect
select * from unex;
quit;

EXCEPT produces rows that are part of the first query only.

proc sql;
select * from unex
except
select * from ex;
quit;

/*Rows that appear in only one of the two tables*/
proc sql;
(select * from ex
 except
 select * from unex)
union
(select * from unex
 except
 select * from ex);
quit;
Joins:
1. simple join
2. inner join
3. outer join
   1. left join or left outer join
   2. right join or right outer join
   3. full join or full outer join
4. natural join
5. self join

simple join: reports the matching observations from the required data sets.

data event;
input stno exad $ exdate :date9.;
format exdate date9.;
cards;
230 eyedis 12feb2005
456 skinprb 15feb2005
345 cold 16mar2005
;
run;

data unevent;
input stno unexad $ uexdate :date9.;
format uexdate date9.;
cards;
230 nervous 28feb2005
156 eardis 17mar2005
245 diabets 18mar2005
;
run;

proc sql;
select * from event as ex, unevent as uex
where ex.stno=uex.stno;
quit;

inner join: works like a simple join, but is written with the inner join keyword between two tables, and the on clause is used instead of the where clause.

proc sql;
create table innerjoin as
select * from event as ex inner join unevent as uex
on ex.stno=uex.stno;
quit;

left join: reports all observations from the left-side table and only the observations from the right-side table that match the condition.

proc sql;
create table leftjoin as
select * from event as ex left join unevent as uex
on ex.stno=uex.stno;
quit;

right join: reports all observations from the right-side table and only the observations from the left-side table that match the condition.

proc sql;
create table rightjoin as
select * from event as ex right join unevent as uex
on ex.stno=uex.stno;
quit;

full join: reports all observations from both tables and matches the rows where the condition holds.

proc sql;
create table fulljoin as
select * from event as ex full join unevent as uex
on ex.stno=uex.stno;
quit;

natural join: reports the matching observations from the required data sets without writing any condition; the tables are matched on the columns they have in common.
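The notes give no code for the natural join, so here is a minimal sketch. It reuses the event and unevent data from the simple join example (repeated so the step runs on its own); the output table name naturaljoin is made up for illustration.

```sas
data event;
  input stno exad $ exdate :date9.;
  format exdate date9.;
  cards;
230 eyedis 12feb2005
456 skinprb 15feb2005
345 cold 16mar2005
;
run;

data unevent;
  input stno unexad $ uexdate :date9.;
  format uexdate date9.;
  cards;
230 nervous 28feb2005
156 eardis 17mar2005
245 diabets 18mar2005
;
run;

proc sql;
  /* stno is the only column the two tables share, so it becomes the join key */
  create table naturaljoin as
  select *
  from event natural join unevent;
quit;

proc print data=naturaljoin;
run;
```

Because the key is chosen from the column names, a natural join can change meaning silently when a table gains a new column with a shared name; an explicit on clause is the safer choice in production code.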
self join: if we join a table with itself, it is called a self join.

data trt9;
input stno bsbp drug $ asbp;
cards;
190 167 col5mg 178
123 178 col15mg 167
198 167 col10mg 146
237 172 col10mg 134
;
run;

/*This query compares two columns of the same table*/
proc sql;
select *
from trt9
where bsbp > asbp;
quit;
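A self join in the stricter sense lists the same table twice under two aliases. The sketch below is only an illustration on the trt9 data (repeated so it runs on its own): it pairs subjects that share the same bsbp value. The aliases a and b and the column names stno1 and stno2 are made up.

```sas
data trt9;
  input stno bsbp drug $ asbp;
  cards;
190 167 col5mg 178
123 178 col15mg 167
198 167 col10mg 146
237 172 col10mg 134
;
run;

proc sql;
  /* The table is joined with itself under two aliases.           */
  /* a.stno < b.stno keeps each pair once and drops self-matches. */
  select a.stno as stno1, b.stno as stno2, a.bsbp
  from trt9 as a, trt9 as b
  where a.bsbp = b.bsbp and a.stno < b.stno;
quit;
```

With this data the only pair is subjects 190 and 198, which both have bsbp 167.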
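Cartesian product: if no join condition is given, every row of the first table is combined with every row of the second table.

proc sql;
create table cartsign as
select * from event as ex, unevent as uex;
quit;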
/*Equivalent SQL and datastep coding for joins*/
/*Uses the event and unevent data sets created for the simple join example*/

proc sql;
create table simplejoin as
select * from event as ex, unevent as uex
where ex.stno=uex.stno;
quit;

proc sql;
create table innerjoin as
select * from event as ex inner join unevent as uex
on ex.stno=uex.stno;
quit;

/*merge requires both data sets to be sorted by the by variable*/
proc sort data=event out=x;
by stno;
run;

proc sort data=unevent out=x1;
by stno;
run;

data inner;
merge x(in=a) x1(in=b);
by stno;
if a and b;
run;

proc print;
run;

proc sql;
create table leftjoin as
select * from event as ex left join unevent as uex
on ex.stno=uex.stno;
quit;

data left;
merge x(in=a) x1(in=b);
by stno;
if a;
run;

proc print;
run;

proc sql;
create table rightjoin as
select * from event as ex right join unevent as uex
on ex.stno=uex.stno;
quit;

data right;
merge x(in=a) x1(in=b);
by stno;
if b;
run;

proc print;
run;

proc sql;
create table fulljoin as
select * from event as ex full join unevent as uex
on ex.stno=uex.stno;
quit;

data full;
merge x(in=a) x1(in=b);
by stno;
if a or b;
run;

proc print;
run;
aggregate functions: using aggregate functions, we can do arithmetic summaries in SQL, both column-wise and row-wise.

proc sql;
create table one as
select trail, count(trail) as obs
from clinical9
group by trail;
quit;

/*Equivalent DATA step coding*/
proc sort data=clinical9 out=y;
by trail;
run;

data one1;
set y;
by trail;
if first.trail then ct=0;
ct+1;
if last.trail;
keep trail ct;
run;

proc print;
run;

data clinical;
infile cards truncover;
input center $ trail $ sub adsub;
cards;
appolo phase1 67 12
nims phase1 75 14
nims phase1 80 40
care phase1 34 10
care phase2 40 20
appolo phase2 267 22
nims phase2 178 14
care phase2 234 30
appolo phase1 245 50
appolo phase2 260 50
. 90
;
run;

coalesce: can be used to replace missing values for reporting.
1. Numeric missing values are replaced with a numeric value.
2. Character missing values are replaced with a character value.

proc sql;
select coalesce(center,'miss') as center, trail,
       coalesce(sub,0) as sub, adsub
from clinical;
quit;

distinct: can be used to report the unique values of the required variable.

proc sql;
select distinct(center) as centlist
from clinical9;
quit;

/*Sequence of clauses in a query*/
where
group by
having
order by

having clause: works like the where clause, but is applied to the groups. If we want to filter a grouped analysis on a condition, we use the having clause.

proc sql;
create table x as
select center, trail, sum(sub) as total
from clinical
group by center
having center not in('care');
quit;

/*The same subset using where, applied before grouping*/
proc sql;
create table x1 as
select center, trail, sum(sub) as total
from clinical
where center not in('care')
group by center;
quit;

count(): can be used for frequency analysis. If we use * as the argument, count reports the number of observations returned by the current query.

proc sql;
select count(*) as norows from clinical9;
quit;

proc sql;
select trail, count(trail) as obs
from clinical9
group by trail;
quit;

proc sql;
select center, trail, count(center) as obs
from clinical9
group by center, trail;
quit;

proc sql;
select center, trail, count(center), sum(sub) as obs
from clinical9
group by center, trail;
quit;

proc sql;
select center, trail, count(center) as obs, sum(sub) as totsub
from clinical9
group by center, trail;
quit;
Views:

Views work like SAS data sets, but store only the query or DATA step code, not the data. We can create views from existing SAS data sets.

data clinical;
input center $ trail $ sub adsub;
cards;
appolo phase1 67 12
nims phase1 78 14
care phase1 34 10
appolo phase2 267 22
nims phase2 178 14
care phase2 234 40
appolo phase2 267 22
nims phase2 300 14
care phase2 200 40
;
run;

/*SQL view*/
proc sql;
create view appolo as
select * from clinical
where center='appolo';
quit;

proc print data=appolo;
run;

/*DATA step view: the name after view= must match the data set name*/
data one / view=one;
set sashelp.class;
where age<=14;
run;

proc print;
run;

proc sql;
create view two as
select * from sashelp.class;
quit;

proc print;
run;

describe statement: can be used to report the structure of a table or view.
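The describe statement is named above without an example. A small sketch, assuming the sashelp.class data set is available; the view name teens is made up for illustration:

```sas
proc sql;
  /* Hypothetical view used only to have something to describe */
  create view teens as
    select name, age
    from sashelp.class
    where age >= 13;

  /* Writes the query stored in the view to the log */
  describe view teens;

  /* Writes a CREATE TABLE statement for the table to the log */
  describe table sashelp.class;
quit;
```

Both forms write to the log rather than to the output window, so the result is easy to miss on a first run.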
/*subquery*/
data suplier;
input scode $ sname $ addr $;
cards;
s01 raja hyd
s02 rani sec
s03 raghu hyd
s04 radha ban
;
run;

data product;
input pcode $ pname $ pcolor $;
cards;
p1 item1 red
p2 item2 green
p3 item3 red
p4 item4 blue
;
run;

data trasaction;
input trno scode $ pcode $;
cards;
1 s01 p3
2 s02 p1
3 s03 p4
4 s04 p2
5 s01 p4
6 s02 p3
7 s03 p3
8 s04 p1
6 s01 p2
;
run;

/*List the suppliers who have supplied product code 'p2'*/
proc sql;
create table sn as
select scode, sname, addr from suplier
where scode in
  (select scode from trasaction where pcode='p2');
quit;

/*List the details of the suppliers who supplied red products*/
proc sql;
create table sd as
select * from suplier
where scode in
  (select scode from trasaction
   where pcode in
     (select pcode from product where pcolor='red'));
quit;
data x;
input eid sal;
cards;
100 2000
101 3000
102 5000
104 8900
99 1000
88 12000
;
run;

/*Highest sal*/
proc sql;
create table Highest as
select max(sal) as highestsal from x;
quit;

/*Second max sal*/
proc sql;
select max(sal) from x
where sal < (select max(sal) from x);
quit;

/*Details of the employee with the second max sal*/
proc sql;
create table detail as
select * from x
where sal = (select max(sal) from x
             where sal < (select max(sal) from x));
quit;

proc print;
run;

/*Nth max sal (here N=5): exactly N-1 distinct salaries are higher*/
proc sql;
select distinct a.sal
from x as a
where (5-1) = (select count(distinct b.sal)
               from x as b
               where a.sal < b.sal);
quit;
/*SQL pass through facility*/
/*Retrieve data from a db2 table*/

proc sql noprint;
connect to DB2 (database=XXXX user=xxxxxx password=xxxxxx);
create table x as
select * from connection to DB2
  (select name, age, sal, loc
   from db2.emp
   order by name);
disconnect from DB2;
quit;
06/09/11.
MACROS

Using the macro language, we can customize and reduce SAS code, and we can develop reusable applications. The macro language is a character-based language. To develop a macro application in SAS we need two things: the macro processor (compiler) and the macro language.

Macro language:
I. It is a part of SAS.
II. It is used to interact with the macro processor.

macro triggers (%, &): are used to identify macro language elements.

percent sign (%): this is called the macro reference. Every macro statement starts with %.

ampersand (&): this is called the macro variable reference. It is used to report the value of a macro variable.

Macro coding can be written outside or inside a macro block.

%macro <macroname>;
SAS coding (data step blocks, proc blocks, open code)
macro coding
%mend;

catalog: whenever we run a macro definition, SAS compiles it and stores the compiled code in a catalog. The catalog entry has the same name as the macro.

Macro call: calls the required macro for execution.

Example:

%macro pr;
proc print;
run;
%mend;

data x;
input pid age;
cards;
100 89
101 90
;
run;

%pr;

data x1;
input pid drug $;
cards;
100 5mg
101 6mg
;
run;

%let name1=ramu;
%put &name1;

data x;
input name $ sal;
...
Concepts in macros:
1. macro variable creation
2. passing arguments to macros
3. macro quoting functions
4. macro options
5. macro expressions
6. macro interface functions

1. macro variable creation:
Macro variables are of 3 types:
1. global macro variable
2. local macro variable
3. automatic macro variable

All macro variables hold their values as character text.

%let a=10;
%let b=20;
%let c=%eval(&a+&b);
%put &c;
07/09/11
Global macro variable: can be created anywhere in the program (inside or outside a macro block) and can be used anywhere in the program.

%global statement:
syntax: %global <macro variable>;

%let statement: can be used to assign the required value to a macro variable, or to create a user-defined macro variable.

%let name=prasad;
%put name is &name;

%put statement: can be used to print the required text and macro variable values in the log window.

Local macro variable: can be created inside a macro block and can be used only in the current macro block. Local macro variable values are stored in the local symbol table.

%macro loc;
%local cname1 source1;
%let cname1=raju;
%let source1=oracle;
%put var1 is &cname1 and var2 is &source1;
%mend;

%loc;

/*Outside the macro the local variables no longer exist, so these two references are not resolved*/
%put &cname1;
%put &source1;

Automatic macro variable: automatic variables are created by SAS and their values are assigned by SAS; they are system defined. The user can use automatic macro variables but cannot reassign a value to them. Their values are stored in the global symbol table.

%put &sysdate;
%put &systime;

2. passing arguments to macros:
arguments: macro arguments are written after the macro name, within brackets.

%macro pr(dname);
proc print data=&dname;
run;
%mend;

%pr(sashelp.class);
%pr(sashelp.shoes);

We can pass arguments in two ways:
1. positional parameters (arguments)
2. keyword parameters (arguments)

1. positional parameters: can be used to pass arguments based on their position.
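The two ways of passing arguments can be sketched side by side. The macro names prt and prtk and the parameter nobs are made up for this illustration; sashelp.class and sashelp.shoes are assumed to be available.

```sas
/* Positional parameters: values are matched to parameters by their order */
%macro prt(dname, nobs);
  proc print data=&dname(obs=&nobs);
  run;
%mend prt;

%prt(sashelp.class, 5);

/* Keyword parameters: values are matched by name and can have defaults */
%macro prtk(dname=sashelp.class, nobs=3);
  proc print data=&dname(obs=&nobs);
  run;
%mend prtk;

%prtk(nobs=5);                 /* dname falls back to its default */
%prtk(dname=sashelp.shoes);    /* nobs falls back to its default */
```

Keyword parameters make long calls easier to read and allow arguments to be left out, at the cost of a little more typing.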
/*Resolving a macro variable whose name is built from another macro variable*/
%macro Ex2_refvar;
%let mo = 12;
%let yr1 = 2002;
%let yr2 = 2003;
%do y = 1 %to 2;
  %do day = 1 %to 2;
    /*&yr&y looks for a macro variable named yr, which does not exist*/
    %put &mo./&day./&yr&y;
    /*&&yr&y first becomes &yr1 or &yr2 and is then resolved again*/
    %put &mo./&day./&&yr&y;
  %end;
%end;
%mend Ex2_refvar;

%Ex2_refvar;

3. macro quoting functions:

Using quoting functions, we can mask special characters at compilation time. There are 2 types.

%str(): using this function, we can mask all special characters except macro triggers, unmatched quotation marks and unmatched brackets.

%nrstr(): using this function, we can mask all special characters, including macro triggers, at compilation time.
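A short sketch of the difference between the two quoting functions. The macro variable names step and txt are made up for illustration.

```sas
/* %str masks the semicolons, so they become part of the value */
/* instead of ending the %let statement                        */
%let step = %str(proc print data=sashelp.class; run;);
%put &step;

/* %nrstr also masks & and %, so &name is kept as plain text */
/* and is not treated as a macro variable reference          */
%let txt = %nrstr(value of &name);
%put &txt;
```

Without %str the first %let would end at the first semicolon and store only "proc print data=sashelp.class"; without %nrstr the second would try to resolve a macro variable called name.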
macro expressions: are of 3 types:
1. text expression
2. arithmetic expression
3. logical expression

1. text expression: macro coding is also called a text expression.

2. arithmetic expression: can be used to run arithmetic operations in macros.

/*%let stores text, so c holds 10+20, not 30*/
%let a=10;
%let b=20;
%let c=&a+&b;
%put &c;

%eval(): can be used to do integer arithmetic with macro variables.

%let c1=%eval(&a+&b);
%put &c1;

%sysevalf(): if the macro variables hold values with a decimal point (float values), we use %sysevalf for the arithmetic operation.

%let a1=10.34;
%let b1=20.60;
%let s=%sysevalf(&a1+&b1);
%put &s;

Macro functions (or string functions):
They require operands (macro variables or text).

%length(): using this function, we can report the length of a macro variable value.

%scan(): using this function, we can get the required word from a string; it extracts the nth word of the string.

%let dnames=demo lab med;
%let rw=%scan(&dnames,2);
%put &rw;

%upcase(): returns the value in capital letters.

%let dnames=demo lab med;
%let cap=%upcase(&dnames);
%put &cap;

%lowcase(): returns the value in small letters.

%substr(): we can get part of a string from a macro variable.

%let dnames=demo lab med;
%let sub=%substr(&dnames,1,5);
%put &sub;

%sysfunc(): using this function, we can call DATA step functions in macros.

%let a=23.34;
%let b=20.98;
%let c=%sysevalf(&a+&b);
%let in=%sysfunc(int(&c));
%put &c;
%put &in;
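%length and %lowcase are described above without examples. A minimal sketch follows; the variable names len and low are made up, and %lowcase is assumed to be available as an autocall macro, which is the usual default setup.

```sas
%let dnames = Demo Lab Med;

/* %length counts the characters in the value, including embedded blanks */
%let len = %length(&dnames);
%put length is &len;

/* %lowcase converts the value to small letters */
%let low = %lowcase(&dnames);
%put &low;
```

Here the value has 10 letters and 2 blanks, so the first %put writes "length is 12".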
...
205 8000
;
run;

data emp3;
input eid sal;
cards;
300 2000
301 3000
302 4000
303 5000
304 6000
305 8000
;
run;

data emp4;
input eid sal;
cards;
400 2000
401 3000
402 4000
403 5000
404 6000
405 8000
;
run;

%global dname;
%let dname=emp1 emp2 emp3 emp4 emp5 emp6;

/*Loop over the list with a fixed count. The list has 6 names, so the 7th pass scans an empty name; dh1 below stops at the end of the list instead*/
%macro dh;
%local dat i;
%let i=1;
%do %while(&i<=7);
  %let dat=%scan(&dname,&i);
  proc print data=&dat;
  run;
  %let i=%eval(&i+1);
%end;
%mend;

%dh;

options mprint mlogic symbolgen;

/*Loop until %scan returns an empty word*/
%macro dh1;
%local dat i;
%let i=1;
%let dat=%scan(&dname,&i);
%do %while(&dat ne );
  proc print data=&dat;
  run;
  %let i=%eval(&i+1);
  %let dat=%scan(&dname,&i);
%end;
%mend;

%dh1;

Macro options:

Macro options are a type of global option. They are in effect by default whenever we run a macro application. Macro options can be changed with the options statement, which should be written outside the macro block.

merror: displays a warning message in the log window whenever a macro call is not resolved. Using this option, we can find out whether the required catalog entry (macro) exists.

serror: prints a warning message in the log window whenever a macro variable is not resolved.

mprint: prints the SAS code generated by the macro call, so we can trace errors in that code.

symbolgen: can be used to trace macro variable values; it prints a message in the log window showing how each macro variable is resolved.

mlogic: can be used to trace logical expressions and the flow of macro execution.
%put &concat;

GO TO block: the go to statement works together with a label statement and runs the required group of statements.

label statement: this statement marks a group of statements. To run a 'go to' statement we use a conditional if.

data x;
input eid sale;
cards;
100 2000
101 3000
102 4000
104 5678
105 7890
106 5678
107 8908
108 4567
109 8654
110 9000
;
run;

data x1;
set x;
if 2000<=sale<=3000 then goto la1;
else if 3000<sale<=5000 then goto la2;
else if sale>5000 then goto la3;
...
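The labelled statements that the goto branches jump to are not shown in the notes. The sketch below is one hypothetical way to complete the pattern: the labels la1 to la3 follow the notes, while the variable grade, its values and the three-row data set are made up for illustration.

```sas
data x;
  input eid sale;
  cards;
100 2000
101 4000
102 9000
;
run;

data x1;
  set x;
  length grade $6;
  if 2000 <= sale <= 3000 then goto la1;
  else if 3000 < sale <= 5000 then goto la2;
  else if sale > 5000 then goto la3;
  return;                        /* no branch taken: grade stays blank */
  la1: grade = 'low';    return; /* return stops the step from falling */
  la2: grade = 'medium'; return; /* through into the next label        */
  la3: grade = 'high';   return;
run;

proc print data=x1;
run;
```

Each return writes the current observation and starts the next iteration, which is what keeps a row that jumped to la1 from also running the statements under la2 and la3. The same result is usually easier to read with if/else or select/when.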
Interface functions are of 2 types:
1. DATA step interface functions
2. macro interface functions

1. DATA step interface functions:

call symput: it is a call routine. Using it, we can create macro variables from data set variables while the DATA step executes.

Syntax: call symput("macro variable", dataset varname);

Note: if the data set has multiple values, call symput by default leaves the last data value in the macro variable.

data x;
input name $ sal;
call symput("name1",name);
cards;
raju 2000
mahi 4000
suri 5000
;
run;

%put &name1;

/*One macro variable per observation: v1, v2, v3*/
data x;
input name $ sal;
call symput('v'||left(_n_), name);
cards;
raju 2000
mahi 4000
suri 5000
;
run;

%put &v1;
%put &v2;
%put &v3;

/*Store a running total in a macro variable*/
data x;
input name $ sal;
anual+sal;
cards;
raju 2000
mahi 4000
suri 5000
;
run;

data _null_;
set x end=last;
if last then call symput('N',anual);
run;

%put &n;

symget: used to get the value of a macro variable at the DATA step level.

data x2;
anualsal=symget('n');
run;

call execute: using call execute, we can call the required catalog entry (macro call) from a DATA step.

syntax: call execute('%macro call');

data _null_;
set x end=eof;
if eof then call execute('%gto(merge,mr2,emps depts,eid)');
run;

2. macro interface functions:

%sysfunc(): using this function, we can call DATA step functions in macros.
Dataset functions:

exist(): reports whether the required SAS file exists. It returns 1 if the file exists, otherwise 0.
Syntax: exist('datasetname');

data _null_;
  if exist('emp')=1 then put 'dataset exists';
  else put 'dataset does not exist';
run;

open(): opens a dataset internally and returns a dataset identifier.
Syntax: open('datasetname');

attrn(): using the identifier returned by open(), returns numeric attributes such as the number of rows (nobs) and the number of variables (nvars).

close(): closes a dataset that was opened with open().
Syntax: close(dataset-id);

data _null_;
  if exist('sashelp.class')=1 then do;
    op=open('sashelp.class');
    nv=attrn(op,'nvars');
    no=attrn(op,'nobs');
    cl=close(op);
    put 'no of observations ' no;
    put 'no of variables ' nv;
  end;
  else put 'dataset does not exist';
run;

/* The same check written as a macro with %sysfunc */
%macro dex(dname);
  %if %sysfunc(exist(&dname))=1 %then %do;
    %let op=%sysfunc(open(&dname));
    %let nv=%sysfunc(attrn(&op,nvars));
    %let no=%sysfunc(attrn(&op,nobs));
    %let cl=%sysfunc(close(&op));
    %put no of observations &no;
    %put no of variables &nv;
  %end;
  %else %put &dname does not exist;
%mend;
%dex(sashelp.class);
%dex(sashelp.shoes);

/*To create a macro variable with sql block*/
select and into clause: using these two together, we can create a macro variable from a dataset variable. By default the SAS system stores the first data value (first occurrence) in the macro variable.
data med;
  input gid $ visit drug $;
  cards;
G100 1 col105mg
G200 1 col10mg
g300 1 col15mg
G100 2 col28mg
G200 2 col30mg
g300 2 col20mg
;
run;

proc sql noprint;
  select drug into :medicine from med;
quit;
%put &medicine;

/*To create multiple macro variables*/
proc sql noprint;
  select drug into :medicine1-:medicine6 from med;
quit;
%put &medicine1;
%put &medicine2;

/* Let the row count decide how many macro variables are created */
proc sql noprint;
  select count(*) into :n from med;
  select drug into :medicine1-:medicine%sysfunc(left(&n)) from med;
quit;
%put &n;
%put &medicine1;
%put &medicine2;
%put &medicine3;
%put &medicine4;
%put &medicine5;
%put &medicine6;
%put &medicine7;   /* med has only 6 rows, so medicine7 is not created and the log shows a warning */

%symdel: using this statement, we can delete a macro variable from the SAS environment.
%symdel medicine6;

_global_: using _global_ in a %put statement, we can list the global macro variables with their values.
%put _global_;

_local_: using _local_ in a %put statement, we can list the local macro variables with their values.

%macro mvar;
  %let a=10;
  %let b=20;
  %let c=%eval(&a+&b);
  %put _local_;
%mend;
%mvar;

_user_: using _user_ in a %put statement, we can list the user-defined macro variables.
1. If we write it inside a macro block, it reports both global and local macro variables.
2. If we write it outside a macro block, it reports the global macro variables.
%put _user_;

_automatic_: using _automatic_ in a %put statement, we can list the automatic macro variables.
%put _automatic_;
%put &sysdate;
%put &systime;
%put &sysdsn;
Concatenation of macro variables:
%let surname=kolla;
%let name=lava kumar;
%let concat=&surname.&name.;   /* the period marks the end of each macro variable name */
%put &concat;
proc sort data=multdat out=x;
  by id date;
run;

proc transpose data=x out=p(drop=_name_ rename=(col1=p));
  by id date;
  var pr1 pr2 pr3;
run;

proc transpose data=x out=t(drop=_name_ rename=(col1=t));
  by id date;
  var t1 t2 t3;
run;

data um;
  merge p t;
  by id date;
run;

proc print;
run;

Proc transpose: using this procedure, we can convert variables into rows and rows into variables.

Id statement: names the variable whose data values become the names of the transposed variables.

Var statement: names the variables whose values are transposed into observations (data values).

Prefix: can be used to add required text to the names of the transposed variables.

data lab;
  input id test $ units;
  cards;
100 hr 90
101 hr 89
100 dbp 98
101 dbp 97
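The lab example breaks off after its data lines. One plausible way to finish it with the ID and VAR statements described above is sketched here; the dataset names lab_s and lab_wide are assumptions, and the four data rows are repeated so the sketch runs on its own.

```sas
data lab;
  input id test $ units;
  cards;
100 hr 90
101 hr 89
100 dbp 98
101 dbp 97
;
run;

/* BY-group processing needs the data sorted by id */
proc sort data=lab out=lab_s;     /* lab_s is a hypothetical name */
  by id;
run;

/* One row per id; the values of test (hr, dbp) become variable names */
proc transpose data=lab_s out=lab_wide(drop=_name_);
  by id;
  id test;
  var units;
run;

proc print data=lab_wide;
run;
```

The resulting dataset has one observation per id with the variables hr and dbp holding the units values.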
Name: gives a name to the new variable that contains the names of the transposed variables (the variables listed on the var statement).
If you do not enter this option, SAS automatically includes a variable called _name_. You can drop it with (drop=_name_) placed immediately after out=<tr_dataset_name>, or rename it with the rename=(_name_=new_name) option.

Prefix: provides the initial characters for the names of the new variables, to which the value of the variable listed on the ID statement is appended. For example, prefix=p names the new variables p1, p2, p3, ... Sometimes _ is a convenient choice for the first character of the transposed variables.

13/09/11
Proc datasets

proc datasets: using this procedure we can
1. Rename the datasets.
2. Exchange data between the datasets.
3. Copy the SAS files from one library to another library.
4. Modify the datasets: a) assign constraints b) delete constraints.
5. Append the data values from one dataset to another.
6. Report descriptive information for the required library.
1. Rename the datasets

change statement: can be used to rename a dataset.

data emp;
  input eid sal jod:ddmmyy10.;
  cards;
100 2000 30-01-2010
101 3000 23-02-2009
103 3500 25-03-2008
;
run;

proc datasets lib=work;
  change emp=employee;
quit;

2. Exchange data between the datasets

exchange statement: can be used to exchange the names of two datasets, so that each name then refers to the other's data.
Note: to exchange data between two SAS files, both files must be available in the same library.

data emp1;
  input eid sal jod:ddmmyy10.;
  cards;
100 2000 30-01-2010
101 3000 23-02-2009
103 3500 25-03-2008
;
run;

proc datasets lib=work;
  exchange employee=emp1;
quit;

3. Copy the SAS files from one library to another library

copy statement: can be used to copy (transfer) SAS files between libraries.

proc datasets lib=sashelp;
  copy in=sashelp out=work;
  select class shoes;
quit;

memtype option: can be used to copy only the required SAS file type (member type). Values of memtype include: all, data, view, cat.

We can copy only the required SAS files by using the select and exclude statements. The select statement lists the required SAS files; the exclude statement lists the files that are not required.
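The memtype option and the select statement are described above without an example. A small sketch, assuming we only want two data files copied from sashelp into work, might look like this:

```sas
/* Copy only members of type DATA, and only the two files named on SELECT */
proc datasets lib=work nolist;    /* nolist suppresses the directory listing */
  copy in=sashelp out=work memtype=data;
  select class shoes;
run;
quit;

/* Check the result */
proc print data=work.class;
run;
```

Replacing the SELECT statement with an EXCLUDE statement would copy every data file in the library except the ones listed, which can be slow for a large library such as sashelp.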
5. Append the data values from one dataset to another

Using the append statement, we can append (load) the values from one dataset to another. This APPEND statement belongs to proc datasets; the same operation is also available as a separate procedure, PROC APPEND.

data x;
  input a b;
  cards;
10 45
34 56
;
run;

data y;
  input a b;
  cards;
100 350
120 567
;
run;

proc datasets;
  append base=x data=y;
quit;

/* y has an extra variable c, so the force option is needed; c is not added to x */
data x;
  input a b;
  cards;
10 45
34 56
;
run;

data y;
  input a b c;
  cards;
100 350 45
120 567 789
;
run;

proc datasets;
  append base=x data=y force;
quit;

/* x and y differ in their third variable (d versus c) */
data x;
  input a b d;
  cards;
10 45 300
34 56 450
;
run;

data y;
  input a b c;
  cards;
100 350 45
120 567 789
;
run;

proc datasets;
  append base=x data=y force;
quit;

6. To report descriptive information for the required library

details option: can be used to report descriptive information for the required library.

proc datasets lib=work details;
quit;

contents statement: reports descriptive information for the required dataset.

proc datasets lib=work;
  contents data=emp;
quit;

delete statement: deletes the required dataset from the library.

proc datasets lib=work;
  delete emp;
quit;
17/09/11
To get data in pyramid shape:

data _null_;
  length text $ 200;
  max = 10;
  do i = 1 to max;
    /* repeat(string, i-1) returns i copies of the string */
    text = repeat(strip(put(i, best32.)) || ' ', i - 1);
    put text;
  end;
run;
Constraints: can be used to make sure that only valid data is loaded into tables.

We can assign constraints in 2 ways:
1. Column constraints
2. Table constraints

Integrity constraint types:

Unique: can be used to load the data without duplicate data values.

proc sql;
  create table demo
    (pid num unique, age num, gender char, race char);
quit;

proc sql;
  insert into demo
    values(100,21,"f","asian")
    values(101,22,"m","african");
quit;

Not null: can be used to load the data without missing values (numeric type and character type).

proc sql;
  drop table demo;   /* demo already exists from the previous example */
  create table demo
    (pid num not null, age num, gender char, race char);
quit;

proc sql;
  insert into demo
    values(100,21,"f","asian")
    values(101,22,"m","african");
quit;

Primary key: can be used to load the data without duplicate data values, without duplicate observations and without missing values.

Table constraints: using a table constraint, we can avoid duplicate observations.

Primary key: using primary key as a table constraint, we can avoid duplicate observations at loading time.

Constraint statement: can be used to assign the required constraint to the required variable.
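The constraint statement is described above without code. The sketch below shows one way a primary key could be declared as a table constraint in PROC SQL, and later removed with PROC DATASETS; the table name demo2 and the constraint name pk_pid are hypothetical.

```sas
proc sql;
  /* demo2 and pk_pid are hypothetical names chosen for this sketch */
  create table demo2
    (pid num,
     age num,
     gender char(1),
     race char(10),
     constraint pk_pid primary key(pid));
  insert into demo2
    values(100,21,'f','asian')
    values(101,22,'m','african');
quit;

/* Delete the constraint again through proc datasets (modify + ic delete) */
proc datasets lib=work nolist;
  modify demo2;
  ic delete pk_pid;
run;
quit;
```

While the constraint is in place, an insert that repeats an existing pid or leaves pid missing is rejected with an error in the log.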
/* Write the heading line to the text file */
data _null_;
  file 'C:\Users\home\Desktop\New folder (3)\uk.txt';
  put @1 'patientname' @18 'medicine' @28 'No of visits' @43 'No of patients';
run;

proc sort data=trtment out=x;
  by visit;
run;

/* Append the detail lines and a total line after each visit */
data _null_;
  set x;
  by visit;
  file 'C:\Users\home\Desktop\New folder (3)\uk.txt' mod;
  put @1 gid @22 drug @38 visit @56 sub;
  if first.visit then ct=0;
  ct+sub;
  if last.visit then put @58 'visit' visit 'total' ct;
run;
20/09/11
proc report: using this procedure, we can run the required analysis and generate reports in the required format. It is a powerful reporting tool: with it we can do the kind of analysis done by the freq, means, tabulate and print procedures.

Report window: by default the report procedure generates the report in the report window (nowindows or nowd sends it to the normal output instead).

Columns statement: requires a variable list; these variables play the main role in the analysis and the report.

Define statement: tells the SAS system how to use the required variable in the analysis and the report.

Order, group, across options: the main use of these options is to arrange the data in the required order for reporting.

Break statement: can be used to give summary breaks in the middle of the report, based on a group variable. The break statement works with 2 options:
1) After: gives the break after each group.
2) Before: gives the break before each group.

ol - overline
ul - underline
dol - double overline
dul - double underline

Summarize option: can be used to report the required analysis (summary values) at the break.

Rbreak statement: can be used to give a summary break at the end or the beginning of the report, based on the after or before option.

Compute block: using this block, we can do new analysis for reporting:
1. To generate a new data value for reporting.
2. To create a new variable for reporting.
A compute block ends with endcomp. A compute block also works with the after and before options.

data boats;
  input name $ 1-12 port $ 14-20 locomotion $ 22-26 type $ 28-30 price 32-36;
  cards;
silent lady  maalea  sail  sch 75.00
american II  maalea  sail  yac 32.95
aloha anai   lahaina sail  cat 62.00
ocean spirit maalea  power cat 22.00
anuenue      maalea  sail  sch 47.50
hana lei     maalea  power cat 28.99
leilani      maalea  power yac 19.99
kalakaua     maalea  power cat 29.50
reef runner  lahaina power yac 29.95
blue dolphin maalea  sail  cat 42.95
;
run;

data natparks;
  input name $ 1-21 type $ region $ museums camping;
  cards;
dinosaur              nm west 2 6
ellis island          nm east 1 0
everglades            np east 5 2
grand canyon          np west 5 3
great smoky mountains np east 3 10
hawaii volcanoes      np west 2 2
lava beds             nm west 1 1
statue of liberty     nm east 1 0
theodore roosevelt    np . 2 2
yellowstone           np west 9 11
yosemite              np west 2 13
;
run;

proc print;
run;

proc report data=natparks nowd headskip;
run;

proc report data=natparks nowindows headline;
  column museums camping;
run;

/* A variable has one usage only, so analysis is used here without display */
proc report data=natparks nowindows headline;
  column museums camping;
  define museums/analysis sum;
  define camping/analysis sum;
run;

proc report data=natparks nowindows headline;
  column museums camping;
  define museums/analysis mean;
  define camping/analysis mean;
run;

proc report data=natparks nowindows headline missing;
  column region name museums camping;
  define region/order;
  define camping/analysis 'camp/grounds';
run;

/* Group by region and type; museums and camping are summed within each group */
proc report data=natparks nowindows headline missing;
  column region type museums camping;
  define region/group;
  define type/group;
run;

proc report data=natparks nowindows headline missing;
  column region type,(museums camping);
  define region/group;
  define type/across;
run;

proc report data=natparks nowindows headline missing;
  column name region museums camping;
  define region/order;
  break after region/summarize ol skip;
  rbreak after/summarize ol skip;
run;

data trtment;
  input gid $ drug $ visit sub;
  cards;
g1234 col5mg 1 90
g2345 col5mg 1 89
g4567 col5mg 1 78
g1234 col5mg 2 50
g2345 col6mg 2 79
g4567 col6mg 2 38
g1234 col6mg 3 70
g2345 col6mg 3 89
g1234 col7mg 4 90
g2345 col7mg 4 89
;
run;

proc report data=trtment nowindows;
  column gid drug sub;
  define gid/group;
  define sub/sum;
  break after gid/ol ul summarize;
  rbreak after/dol dul summarize;
  compute after gid;
    gid='total';
  endcomp;
  compute after;
    gid='gtotal';
  endcomp;
run;

data x;
  input eid sal;
  cards;
100 2000
101 2500
102 3000
;
run;

proc report data=x nowd;
  columns eid sal;
  define eid/'employeeid' width=10 display;
  define sal/'empsalary' width=10;
run;

proc report data=x nowd;
  columns eid sal anualsal;
  define eid/'employeeid' width=10 display;
  define sal/'empsalary' width=10 display;
  define anualsal/'anualsal' computed;
  compute anualsal;
    anualsal=sal*12;
  endcomp;
run;

data company;
  input cname $ details $ amount;
  cards;
satyam invest 6700
tcs invest 6800
satyam profit 3400
tcs profit 2300
wipro invest 5600
wipro profit 3400
;
run;

proc report data=company headline nowindows;
  columns cname (details,amount);
  define cname/group;
  define details/across;
  break after cname/dul;
run;

data medi;
  input pid bsbp drug $ asbp;
  cards;
100 178 col5mg 165
101 156 col5mg 159
102 178 col10mg 168
103 177 col10mg 177
104 180 col15mg 182
105 169 col15mg 134
;
run;

/* _c3_ is the third report column (bsbp), _c4_ the fourth (asbp) */
proc report data=medi headline;
  columns pid drug bsbp asbp status;
  define pid/group;
  define status/computed;
  break after pid/ol ul;
  compute status/character length=19;
    if _c3_ < _c4_ then status='drug is not working';
    else if _c3_ > _c4_ then status='drug is working';
    else status='change the drug';
  endcomp;
run;
Create new dataset containing duplicate obs:

/* Keeps every observation whose eid occurs more than once */
proc sort data=x out=dups;
  by eid;
run;

data x2;
  set dups;
  by eid;
  if not (first.eid and last.eid);
run;

proc print;
run;

or (new):

/* x1 keeps the first observation per eid; dups receives the removed duplicates */
proc sort data=x out=x1 nodupkey dupout=dups;
  by eid;
run;

Output Delivery System

ods pdf file='C:\Users\home\Desktop\New folder (3)/uk.pdf';
proc report data=sashelp.class nowd;
run;
ods pdf close;

ods html file='C:\Users\home\Desktop\New folder (3)/uk.html';
proc report data=sashelp.class nowd;
run;
ods html close;
data x;
  input cust $ name $ age sex $ proc $ charge hrp_charge rent_charge;
  cards;
100 raja 24 M 890 900 1000 2000
200 kaja 67 M 900 987 2000 8000
100 rani 89 m 300 800 600 876
300 hani 56 F 800 908 123 8908
;
run;

/* The original query referred to CHARGES, HRP_ALLOW and PPOALLOW, which are not in x; */
/* the variables read above (charge, hrp_charge, rent_charge) are used instead.         */
proc sql;
  create table x1 as
  select *,
         put(sum(charge),best.) as charges1,
         put(sum(hrp_charge),best.) as hrp_allow1,
         put(sum(rent_charge),best.) as ppoallow1
  from x
  group by cust;
quit;

proc sql;
  create table x2 as
  select *,
         sum(charge) as charges2,
         sum(hrp_charge) as hrp_allow2,
         sum(rent_charge) as ppoallow2
  from x1;
quit;

I want a report like the one below, and it must be in a PDF file. Please send the code very soon.

cust:100 name:raja sex:M proc:890
charge:900 hrp_charge:1000 rent_charge:2000
cust:100 name:rani sex:F proc:300
charge:908 hrp_charge:123 rent_charge:8908
Total for 100 charge:1808 hrp_charge:1123 rent_charge:110908
cust:200 name:kaja sex:M proc:900
charge:987 hrp_charge:2000 rent_charge:8000
Total for 200 charge:987 hrp_charge:2000 rent_charge:8000
cust:300 name:hani sex:F proc:800
charge:908 hrp_charge:123 rent_charge:8908
Total for 300 charge:908 hrp_charge:123 rent_charge:8908
Grand total charge:3595 hrp_charge:3723 rent_charge:19784
Sending email

filename myemail email
  from=("[email protected]")
  to=("[email protected]")
  cc=("[email protected]")
  subject="An automatic email sent from SAS"
  attach="C:\Users\home\Desktop\New folder (3)\uk.pdf";

data _null_;
  file myemail;
  put "Your report is now available online." / /
      "Thank you and have a great day." / /
      " " / /
      "Sincerely," / /
      "Venkat Prasad Sandu" / /
      " " / /
      "This is an automated email sent by SAS on behalf of Venkat Prasad Sandu";
run;

Proc report

data test;
  input Region $ Sales AgentID;
  cards;
E 12 1
E 14 1
E 17 2
E 12 1
E 14 3
E 12 2
E 18 3
N 18 4
N 16 4
N 12 5
N 17 4
N 25 4
S 12 7
S 12 8
S 13 7
S 12 8
W 27 9
;
run;

options nodate nonumber;
title;
ods pdf file='C:\Users\home\Desktop\New folder (3)\rep.pdf';
proc report data=test nowd
    style=[frame=void rules=none]
    style(header)=[background=white];
  column Region Sales AgentID;
  define region / group;
  define sales / analysis sum;
  define AgentID / analysis n "Number of Agents";
  rbreak after / summarize ol ul;
run;
ods pdf close;

Task 2:
%let name=prasad;
Output: 'prasad' (need prasad in inverted commas).

Solution:
%let name=prasad;
%let ab=%str(%'&name%');   /* %' produces a quote mark inside %str */
%put &ab;
Task 3:

data trans;
  input CustomerID $ transactiondate :mmddyy10. amount category $;
  cards;
9801234 10/01/1998 123.98 toys
9802234 12/10/1997 80.34 books
9802234 12/20/1997 100.00 apparel
9805556 08/01/1996 22.90 toys
9805556 09/10/1996 25.50 apparel
9805556 10/11/1996 18.90 books
9801134 11/11/1999 12.11 toys
;
run;

1. Total and average amount spent by category
2. Which category has the highest average purchase
3. What is the average number of categories that customers purchase
4. What is the average and total amount by customer
5. What is the average number of days between purchases (as of today)

/*(1)Total and average amount spent by category*/
proc sql;
  create table one as
  select category, sum(amount) as Total, avg(amount) as Average
  from trans
  group by category
  order by category;
quit;

/*(2)Which category has the highest average purchase*/
proc sql outobs=1;
  create table two as
  select category, avg(amount) as Average
  from trans
  group by category
  order by 2 desc;
quit;

/*(3)What is the average number of categories that customers purchase */
proc sql;
  /* number of distinct categories for each customer */
  create table three as
  select CustomerID, count(distinct category) as Avg_cat
  from trans
  group by CustomerID;
  /* average of those counts over all customers */
  select avg(Avg_cat) as Avg_categories
  from three;
quit;

/*(4)What is the average and total amount by customer */
proc sql;
  create table four as
  select CustomerID, sum(amount) as Total, avg(amount) as Average
  from trans
  group by CustomerID
  order by CustomerID;
quit;

/*(5)What is the average number of days between purchases (as of today) */
proc sort data=trans;
  by CustomerID transactiondate;
run;

data findavg(drop=mult_trans daysinbet td1 td2 amount category transactiondate);
  set trans;
  by CustomerID transactiondate;
  retain td1 td2;
  if first.CustomerID then do;
    daysinbet=0;
    td1=transactiondate;   /* date of the first purchase */
    mult_trans=0;
    avgdays=0;
  end;
  else do;
    td2=transactiondate;
    daysinbet=td2-td1;     /* days since the first purchase */
    mult_trans+1;          /* number of gaps between purchases so far */
    avgdays=daysinbet/mult_trans;
  end;
  if last.CustomerID then output;
run;

proc print;
run;
/*Read in the data*/
data COUNTY;
  input COUNTY_ID $ STATE_NAME $ COUNTY_NAME $;
  cards;
1 Texas Collin
2 Texas Dallas
3 Georgia DeKalb
;
run;

/*Read in the data*/
data AGE_DISTRIBUTION_DESC;
  length CATEGORY_DESCRIPTION $23;
  input CATEGORY_NAME $ 1-11 CATEGORY_DESCRIPTION &;
  cards;
AGE_0_10    < 10 years
AGE_10_20   Between 10 and 20 years
AGE_20_40   Between 20 and 40 years
AGE_40_PLUS > 40 years
;
run;

data AGE_DISTRIBUTION;
  input COUNTY_ID $ AGE_0_10 AGE_10_20 AGE_20_40 AGE_40_PLUS;
  cards;
1 100 20 40 60
2 10 10 40 50
3 45 100 56 67
;
run;

/*Sort the data for next merge step*/
proc sort data=COUNTY;
  by county_id;
run;

proc sort data=AGE_DISTRIBUTION;
  by county_id;
run;

/*Transpose the data by County_id*/
proc transpose data=AGE_DISTRIBUTION
    out=trans(rename=(_name_=category_name col1=total_num));
  by county_id;
run;

/*Merge the 2 datasets to get the County Info at one place*/
data countyinfo;
  merge county trans;
  by county_id;
run;

/*Sort the data for next merge step*/
proc sort data=countyinfo;
  by CATEGORY_NAME;
run;

proc sort data=AGE_DISTRIBUTION_DESC;
  by CATEGORY_NAME;
run;

/*Merge the 2 datasets to get the County Info at one place*/
data final;
  merge countyinfo AGE_DISTRIBUTION_DESC;
  by CATEGORY_NAME;
run;

proc sort data=final;
  by county_name;
run;

/*Macro to Create the Report in HTML and Excel Version */
%macro report(type);
filename report "C:\Users\home\Desktop\html - Copy\results.&type";
ods listing close;
ods html body=report;

/* First Report */
title;
proc report data=final nowd split='*'
    style(header)=[foreground=blue background=grey]    /*style for the header*/
    style(column)=[foreground=black background=white]; /*style for the Columns*/
  columns county_name ("Age Distribution" CATEGORY_DESCRIPTION total_num);
  define county_name / group 'County*Name';
  define CATEGORY_DESCRIPTION / 'Category'
         style(header)=[foreground=red background=grey];
  define total_num / analysis 'Total Number'
         style(header)=[foreground=red background=grey];
  /*break after county_name/skip ;*/
run;

/* Second Report */
proc tabulate data=final format=3.
    style=[background=white foreground=black];   /*style for the data area*/
  class county_name / style=[background=grey foreground=red];   /*style for the column name*/
  class category_description / order=data
        style=[background=grey foreground=red];   /*style for the column name*/
  var total_num;
  table category_description='Category'
        all*{style=[background=white font_style=italic foreground=black]},
        county_name='County'*total_num=' ';
  keylabel all='Total' sum=' ';
  keyword all / style=[font_weight=extra_light background=white foreground=black font_style=italic];
  classlev category_description county_name / style=[background=grey foreground=black];
run;

/* Third Report */
proc tabulate data=final format=3. style=[background=white foreground=black];