Working with Data in Windows HRP223 – 2010 October 4th, 2010 Copyright © 1999-2010 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to maximum extent possible under the law.
Dec 19, 2015


Working with Data in Windows

HRP223 – 2010

October 4th, 2010

Copyright © 1999-2010 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to maximum extent possible under the law.


Sources of Data

• Toy data
– For statistics classes, you may be able to type the data directly into a SAS code file in EG, as in TLSB for EG.

• Excel
– For small amounts of HIPAA-safe data you can use Excel with validation.

• Text files with columns of numbers and text
– Exports created by databases frequently provide a text file full of data and a program for loading it into SAS.

• SAS
– Native SAS datasets created by somebody else.
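For the toy-data case, typing data directly into a SAS program might look like the sketch below. The dataset name, variable names, and values are made up for illustration.

```sas
/* Toy data typed directly into a program.            */
/* work.toy, id, sex, and score are hypothetical names. */
data work.toy;
    input id sex $ score;   /* $ marks sex as a character variable */
    datalines;
1 F 85
2 M 72
3 F 91
;
run;

/* List the rows to confirm they were read as intended. */
proc print data=work.toy;
run;
```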


Recognize File Types

• Windows adds a period and a suffix that is a couple of letters long to the names of files to indicate what program uses the file. By default, the suffix is hidden.

Follow these steps to show file extensions (suffixes) in Vista.

(Screenshot with five numbered steps; step 3 is labeled “Uncheck”.)

Show File Extensions (Suffixes) in XP

(Screenshot with five numbered steps; step 3 is labeled “Uncheck”.)


Types of Files

.pdf Adobe portable document format

.zip archives full of compressed data

.xls Excel prior to 2007

.xlsx Excel 2007 and later

.csv comma separated values (text which Excel likes)

.txt text files

.sas SAS code files

.egp Enterprise Guide projects

.sas7bdat SAS data files

.htm or .html web pages


SAS and EG files

• .sas files are text files full of instructions that a programmer can write and/or edit.

• .egp files are not.


Searching

• Because the contents of .egp files are incomprehensible (without special tools), you will have trouble searching for things inside of projects.

• This affects me when I can’t remember the name of a project and want to find it by searching for key words in the code (like the principal investigator’s name or the name of the source data file).
– I cannot find a tool to search the contents of all the .egp files on my hard drive.


Files in Enterprise Guide

• You can (and should) save SAS code files outside of the EG project to make it easy to search.

• Most people create EG projects that reference data files that live outside of EG.
– SAS datasets
– Excel files
– Text files full of data

(Screenshot callouts: “Converted to SAS format” and “Native Excel format”.)


Shortcuts

• Windows indicates a “shortcut” to a file that lives elsewhere with an arrow in the bottom left corner of an icon.

• EG uses the same symbol to denote a shortcut to a file outside of the project.


What is in an EGP file?

• An EG project file (.egp) contains information and instructions, but it will also have links to a lot of external files.

(Screenshot callouts: “Shortcut to a file NOT in the project.”, shown twice, and “This is part of the project.”)


EG and Code

• You can write and store your “code” instructions to SAS inside of the EG project, or you can create a shortcut to a code file that lives outside of EG.

(Screenshot callouts: “Right click and choose New > Program”; “Look at the process flow”; “No shortcut icon”.)


External SAS files

• You can easily save a code file outside of the project by choosing Save Program As… from the File menu, or by clicking Save or Save As… on the program tab (when the code is open).

(Screenshot callout: “Shortcut”.)


Where are SAS Data Sets Stored?

• While SAS can refer to files using their Windows path, it is easier to type a short name instead of a long path.

• SAS calls the short names “libraries”.

• EG automatically knows about a couple of places where data can be stored.
– It creates a temporary work folder whenever EG starts.
– It creates a permanent sasuser folder when EG is installed.

• The locations for data are called libraries.
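As a rough illustration of the difference between a Windows path and a library short name, the sketch below refers to the same dataset both ways. The folder, the library name proj, and the dataset scores are hypothetical; the code only runs if such a folder and file exist on your machine.

```sas
/* Hypothetical folder and library name; substitute your own. */
libname proj "C:\Projects\hrp223\data";

/* Option 1: refer to the dataset by its full Windows path. */
proc print data="C:\Projects\hrp223\data\scores.sas7bdat";
run;

/* Option 2: refer to it by library short name, a period, and the dataset name. */
proc print data=proj.scores;
run;
```

The second form is shorter to type, and if the folder moves only the libname line has to change.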


Libraries

• By default the data goes into the sasuser library. This is a very bad idea.

• You will end up with every file in one folder.

• Anybody using SAS can access that folder, so there are significant HIPAA issues.

• Right click on a file and pick Properties to see where it is stored.


Libraries

• You can see the contents of libraries by going to the Server List window and opening the local libraries “file drawer.”

If you previously closed the window use the View menu to select Server List.

Double click the dataset to browse it.


Change the Default File Location

• On every machine you use, you should change the default file location to the work library. Do this once per machine.

(Screenshot callouts: “Click 1st”, then “Click 2x”.)

Permanent Store

• I suggest that you save your data into the temporary work library by default.

• If you have a huge file which you only want to import once, or if you want to keep a permanent copy of a SAS data file, you will want to set up a permanent library.
– This is just a fancy way of specifying what folder SAS should use to save the .sas7bdat data files.


Loading Data The Easy Way

• First fix the problematic registry entries that are described in the instructions on installing SAS.

www.stanford.edu/class/hrp223/2010/SAS92TS2M3.pptx

• If a column in Excel holds a mixture of character and numeric values, programs reading the data (including SAS) can drop the cells that have character data without warning.


Importing the Easy Way

• The most bulletproof way to import data with EG 4.2 is to use the import wizard.

Always check this on.

Double check that it guesses the right Type, especially for dates.
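The slides do this check in the import wizard's screens. As a hedged alternative in code, the sketch below imports a text file with PROC IMPORT and then lists each variable's type and format with PROC CONTENTS. The file name and dataset name are hypothetical, and this is not the code the wizard itself generates.

```sas
/* Hypothetical comma separated file; substitute your own path. */
proc import datafile="C:\Projects\hrp223\data\visits.csv"
            out=work.visits
            dbms=csv
            replace;
    getnames=yes;        /* first row holds the variable names     */
    guessingrows=1000;   /* rows scanned before guessing the types */
run;

/* List variable names, types, and formats so wrong guesses stand out. */
proc contents data=work.visits;
run;
```

In the PROC CONTENTS output, a date that was read correctly shows up as a numeric variable with a date format; a date listed as character was guessed wrong.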

Tell SAS that there is a folder which can hold data by creating a library. This only makes it aware of the folder. It does not automatically put stuff in the folder.

It’s just a folder!

• When the library is created it is just a pointer to a preexisting folder. That folder can contain anything.

• When you want to use the folder you need to explicitly tell EG to store data in the folder.

• First rename the node and draw an arrow to indicate where the library is used. These changes are only aesthetic.
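In code, the same idea can be sketched as follows: defining the library stores nothing, and data only lands in the folder when a step names the library explicitly. The library name proj, the folder, and the dataset are hypothetical, and the folder must already exist.

```sas
/* Hypothetical library name and folder; the folder must already exist. */
libname proj "C:\Projects\hrp223\data";

/* A one-level dataset name goes to the temporary work library. */
data scores;
    input id score;
    datalines;
1 85
2 72
;
run;

/* Only a two-level name that starts with the library writes into the folder. */
data proj.scores;
    set work.scores;
run;
```

After the last step the folder should hold a file named scores.sas7bdat, while work.scores disappears when the session ends.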

1st, rename the node to match the library name.

2nd, add a line to the flowchart connecting the library to the import. It just looks good.

Now it looks good, but the import is still into work.


Find your library here.


Notice it is in the library.

A “design feature” is that you have to Refresh the library to see the freshly added file.


Playing with Data

• Once the data is imported you can add code “nodes” to the flowchart or use the graphical user interface to tweak the data and do analyses.

Complex changes

Quick and easy subset and sorting


It gives you more options as you add in sort variables.

SQL is built behind the scenes.

Note the awful new name.
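The SQL that EG builds for a subset and sort is broadly of the form sketched below. This is a hand-written approximation, not the text EG generates: it uses the sashelp.class sample dataset that ships with SAS, and the output name work.class_sorted is made up (EG assigns its own, less readable, name).

```sas
proc sql;
    create table work.class_sorted as
    select t1.name, t1.sex, t1.age
    from sashelp.class t1          /* t1 is the table alias, as in EG queries */
    where t1.age >= 13             /* the subset */
    order by t1.sex, t1.age desc;  /* the sort   */
quit;
```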


Convert to a 4 digit number with the input function:

input( t1.score , 4. )
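A minimal sketch of that expression inside a complete query, assuming a hypothetical dataset work.raw in which score was read as character:

```sas
/* Hypothetical data: score is deliberately read as a character variable. */
data work.raw;
    input id score $;
    datalines;
1 85
2 72
3 100
;
run;

proc sql;
    create table work.fixed as
    select t1.id,
           input(t1.score, 4.) as score_num  /* character to numeric, reading up to 4 characters */
    from work.raw t1;
quit;
```

The new column score_num is numeric, so it can be used in arithmetic and in statistics tasks.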


Context sensitive menus help you describe the data you are browsing.

Before After


Descriptive Statistics

drag
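For readers who prefer code to dragging variables into a task, descriptive statistics can also be requested with PROC MEANS. This is a hand-written sketch using the sashelp.class sample dataset, not the code the EG task produces.

```sas
/* Count, mean, standard deviation, minimum, and maximum, split by sex. */
proc means data=sashelp.class n mean std min max;
    class sex;           /* grouping variable   */
    var height weight;   /* analysis variables  */
run;
```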
