
While We Are Waiting…

• If you want to work along with the presentation, all the materials are available on the PRISM website
– Go to: http://polisci.osu.edu/prism/luncheons.htm
– Download the following zip file onto your desktop
  • StataIntro_08.zip
  • Extract all of the contents of the zip folder to your desktop
– Double click to open the presentation file: IntroStata08_Vfinal.pdf
– Double click on Stata to open the program
– Note: Included in the zip folder
  • Presentation: IntroToStata08_Vfinal.pdf
  • Datasets: NES04_VstataIntro08.dta, ICPSR_08865
  • Do file: IntroStata_V08.do

1/25/2008 Christenson & Powell: Intro to Stata

PRISM Brownbag: An Introduction to Stata

Dino Christenson & Scott Powell
Ohio State University
January 25th, 2008

Intro to Stata

I. GUI
II. Log file
III. Basic stats
IV. Data manipulation
V. Descriptions of variables
VI. Help files!
VII. Graphing
VIII. Do files
IX. Exporting tables, graphs and data
X. Importing foreign data
XI. Closing

GUI

• First, let’s identify what we’re looking at.
• Stata has several different viewing windows, each with a different function.

GUI

• Review: Lists commands that have recently been entered
• Results: Shows recently obtained results

GUI

• Variables: All the existing variables in your data set
• Command: Where commands are entered

GUI

• File: More than open and save
• Edit: What you expect
• Data: Multiple avenues for data manipulation
• Graphics: More to come later
• Statistics: Statistical modeling options
• Help: More to come later
• Bottom Line: These menus offer graphical alternatives to directly typing commands into Stata

GUI

• Begin a new log file
• Bring up the Help Viewer
• Bring up the Results Window
• Begin a new do file
• Edit/View Data
• STOP! (The Number Crunching)

The Log File

• Log or Perish! (or at the very least you might do some crying)
• Log files keep track of everything you do in Stata, both input and output
• However, a log does not record what appears in additional windows that open up (e.g. graphs, the help window, etc.)

The Log File

• To start a log file, access the “File” menu, choose “Log”, and select “Begin”
• Log files automatically close when you end your session. However, you can also close a log manually, as well as suspend it during a session.
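The same log operations can be typed at the command prompt. A minimal sketch, assuming a hypothetical log file name (intro_session.log) written to the current working directory:

```stata
* Start a plain-text log; replace overwrites an existing file of the same name
* (intro_session.log is a hypothetical file name)
log using "intro_session.log", text replace

* Anything run now is recorded, input and output alike
display 20*1.5 - 17

* Suspend logging during the session, then resume it
log off
log on

* Close the log manually instead of waiting for the session to end
log close
```

Using the text option keeps the log readable in any editor; without it Stata writes its own formatted log type.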

Stata as a Calculator

• Stata can be used to compute both basic and advanced mathematical operations
• Use the display command, or di, followed by the mathematical expression
• di 20*1.5-17
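A few more calculator-style lines, offered as a small sketch of what display accepts; the particular expressions are our own illustrations, not from the slides:

```stata
* The slide's example: 20*1.5 = 30, minus 17 leaves 13
display 20*1.5 - 17

* Built-in math functions can be mixed in: sqrt(16) + 2^3 = 4 + 8 = 12
display sqrt(16) + 2^3

* Natural logarithm and exponential
display ln(10)
display exp(1)
```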

Some Basic Statistics

• Stata can also evaluate several probability functions
• Example: What’s the probability of tossing a coin ten times and getting five heads?
• di Binomial(10,5,.5)
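A word of caution on the coin example. As we read the function reference of that era, the capitalized Binomial(n,k,p) returns the probability of k or more successes, which here is 638/1024 (about 0.623), not the probability of exactly five heads, 252/1024 (about 0.246). Treat that reading as an assumption and confirm it with the help file for your release. The sketch below uses the function names found in more recent releases, which may not all exist in older ones:

```stata
* Exactly five heads in ten fair tosses: C(10,5)/2^10 = 252/1024
display binomialp(10, 5, 0.5)

* Five or fewer heads (lower tail): 638/1024
display binomial(10, 5, 0.5)

* Five or more heads (upper tail): also 638/1024 by symmetry
display binomialtail(10, 5, 0.5)
```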

Some Basic Statistics

• Example: CDF for the normal distribution, z = 1.96
• di normal(1.96)
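The normal CDF pairs naturally with its inverse. A short sketch (the two-tailed p-value line is our own illustration):

```stata
* Area to the left of z = 1.96 under the standard normal curve (about 0.975)
display normal(1.96)

* The inverse: the z value with 97.5% of the area to its left (about 1.96)
display invnormal(0.975)

* Two-tailed p-value for z = 1.96 (about 0.05)
display 2*(1 - normal(1.96))
```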

Some Basic Statistics

• Stata has many more distribution functions that can be implemented
• For a summary of these, use the following command:
• help density functions

In The Beginning… (Opening a Data Set)

• Several options exist for opening data sets
• Using the GUI allows you to browse or access recent data sets
• It is also possible to type in the use command
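Typed out, the use command might look like the sketch below. It assumes the zip contents were extracted into Stata's current working directory; if not, give the full path in the quotes or change directory first with cd.

```stata
* Show the current working directory
pwd

* Load the workshop dataset; clear discards whatever data are in memory
* (assumes the .dta file sits in the working directory)
use "NES04_VstataIntro08.dta", clear

* Confirm what was loaded
describe, short
```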

The Data Editor

• Sort data by selected variable
• Move variable to first or last position
• Hide selected variable
• Preserve changes that you’ve made
• Delete selected variable or observation
• Restore data to the state of the last “Preserve”
• And, of course, you can edit each cell

Manipulating the Data

• Stata can generate new variables and edit existing ones
• Let’s create a new variable called “left” using generate and replace
• gen left = 1 if ideology < 0
• replace left = 0 if ideology >= 0
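One thing to watch with the generate/replace pair: Stata treats missing values as larger than any number, so a respondent with missing ideology satisfies ideology >= 0 and would be coded left = 0. A hedged sketch of two ways around that, assuming the workshop dataset is loaded and left has not been created yet (left2 is a hypothetical name for the one-line alternative):

```stata
* Two-step version with a guard against missing values
generate left = 1 if ideology < 0
replace left = 0 if ideology >= 0 & !missing(ideology)

* One-line alternative: the logical expression evaluates to 1 or 0,
* and the if-qualifier leaves missing ideology as missing
generate left2 = (ideology < 0) if !missing(ideology)

* Check the coding, including any missing values
tabulate ideology left, missing
```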

Manipulating the Data

• Notice that we now have a new variable in our list
• Let’s create a new variable by recoding an existing one
• recode marr (0=1) (1=0), gen(single)
• Other Expressions to know: >=, <=, &, |, ~, ^, -, /, *, +, ~=
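The operators listed above mostly show up in if-qualifiers. A small sketch, assuming left and single were created as on the slides; note that == tests equality while a single = assigns:

```stata
* Recode and verify against the original variable
recode marr (0=1) (1=0), generate(single)
tabulate marr single, missing

* & is "and", | is "or"
count if left == 1 & single == 1
count if left == 1 | single == 1

* ~= and != both mean "not equal"; guard against missing values explicitly
count if ideology != 0 & !missing(ideology)
```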

Manipulating the Data

• Stata also has the ability to generate vectors and matrices
• matrix input mat1 = (1\2\3)
• matrix input mat2 = (1,2,3)
• matrix mat3 = mat1*mat2
• matrix list mat3
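In the matrix example, the backslash separates rows and the comma separates columns, so mat1 is a 3 x 1 column and mat2 a 1 x 3 row. Their product mat1*mat2 is therefore the 3 x 3 outer product. The sketch below adds the reverse order for contrast (mat4 is our own hypothetical name):

```stata
* Column vector (3 x 1) and row vector (1 x 3)
matrix input mat1 = (1\2\3)
matrix input mat2 = (1,2,3)

* Outer product: 3 x 3 matrix with rows (1,2,3), (2,4,6), (3,6,9)
matrix mat3 = mat1*mat2
matrix list mat3

* Inner product: 1 x 1 matrix holding 1*1 + 2*2 + 3*3 = 14
matrix mat4 = mat2*mat1
matrix list mat4
```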

Describing the Data

• Now let’s have a look at what we created
• tab ideology
• tab left
• tab marr
• tab single

Describing the Data

• To produce a list and summary of all variables, use the sum command
• sum
• You can also use this command to summarize individual variables
• sum ideology
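The sum command is short for summarize, and it takes a variable list and options. A brief sketch, assuming the workshop dataset is loaded:

```stata
* Several variables at once
summarize ideology educ

* detail adds percentiles, skewness, and kurtosis
summarize ideology, detail
```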

Describing the Data

• The tab command can also be used to create cross-tabs when implemented with two variables
• tab left marr
• Summary statistics can be separated using the by command, but you have to sort first
• sort left
• by left: sum educ
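The sort-then-by pattern can be collapsed into one step. A hedged sketch of a few equivalents and near-equivalents, assuming left, marr, and educ exist as on the slides:

```stata
* bysort sorts and then runs the command within each group
bysort left: summarize educ

* A compact table of group means, standard deviations, and counts
tabulate left, summarize(educ)

* Cross-tab with row percentages
tabulate left marr, row
```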

Describing the Data

• In Stata, data exists in several formats
• For a summary of data types in your data set, use the describe command
• describe

Describing the Data

• Strings are non-numeric variables
• Floats are numeric data types that store up to 7 digits of accuracy, rounding thereafter
• byte, int, long, and double are other numeric types
• Useful commands for changing data types: format, destring, encode
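To see the type-changing commands in action without touching the workshop data, the sketch below builds a tiny throwaway dataset. All variable names and values are hypothetical. Note that format only changes how a variable is displayed; the stored type changes through commands like destring, encode, or recast.

```stata
* Keep a copy of whatever is in memory, then start from a blank slate
preserve
clear

* Hypothetical toy data: numbers stored as text, plus a text category
input str5 income_str str8 state_name
"100" "Ohio"
"250" "Michigan"
"75"  "Ohio"
end

* destring converts numeric-looking strings into a numeric variable
destring income_str, generate(income_num)

* encode maps each distinct string to a labeled integer code
encode state_name, generate(state_id)

* format changes the display only, not the storage type
format income_num %9.2f

describe
list

* Bring back the data that were in memory before
restore
```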

Help Viewer

• The capabilities of Stata are vast
• What you can do with Stata depends on your knowledge of the commands
• Fortunately Stata comes with user friendly help
• Stata’s greatest selling point
• All commands are easily referenced
• All commands come with helpful descriptions and examples
• All commands have been peer reviewed

Help Viewer

• To open the Help Viewer click on Help > Contents
• The Help Viewer opens and allows you to browse the entire Stata database and online resources
• It acts like an internet browser…

Help Viewer

• Take the now familiar tab command
• In command prompt or in help viewer prompt: help tabulate
• Provides information on:
– Command title
– Command syntax
• Note: blue font is linked; click on it to get more info on the given word

Help Viewer

• Also provides information on:
– Command Description
– Command Options
– Command Examples
– Related commands

Help Viewer

• Add-on packages are also easy to find with the help viewer
– E.g., “Clarify” by G. King
– Search clarify: Help > Search…
– Type: clarify
– Help finds the add-on package site and provides links for its description and download
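The help and search steps also work from the command prompt. A minimal sketch; the last two lines search the internet as well as the local files, so they assume a working connection:

```stata
* Open the help file for a known command
help tabulate

* Search local help, FAQs, and user-written packages on the net by keyword
search clarify, all

* findit is shorthand for the same kind of search
findit clarify
```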

Graphing

• Stata has numerous graphing capabilities
– ANOVA and post-estimation OLS
– Time Series: ARCH, ARIMA, VAR…
– Duration Analysis: exponential, weibull, cox…
– Event Count: negative binomial, poisson, Hurdle…
– Limited Dependent Variables: logit, probit, multinomial logit and probit, ordered logit and probit…
– Selection Models: heckman, censored probit, tobit…
– And, if it is not canned, we can program it, but that is for another brownbag
• Furthermore Stata 10 is supposed to be a drastic improvement in the flexibility of graphing functions
– Competition with R?
• Let’s quickly look at some of the basic graphs you can create

Scatterplots

• Perhaps we want to check if our data hints that people become more favorable to conservative values as they age
• We can graph the variables with respect to one another
– scatter reptherm age
• Graph viewer appears above the results viewer
• Toggle to and fro with graph viewer buttons on toolbar

Scatterplots

• We can also look at the same relationship by a particular sample of our data
• Perhaps there is a difference between those that voted for Bush (1) and Kerry (0)
• Let’s sort by vote
• Try scatter reptherm age, by(vote)
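The scatterplot commands from the last two slides, gathered in one place, with one addition of our own: a fitted line overlaid on each panel. As far as we know the by() graph option splits the sample itself, so no prior sort is required for it (unlike the by prefix). This assumes the workshop dataset is loaded.

```stata
* Basic scatterplot: Republican thermometer against age
scatter reptherm age

* One panel per value of vote (Kerry = 0, Bush = 1)
scatter reptherm age, by(vote)

* Our own addition: overlay a linear fit in each panel
twoway (scatter reptherm age) (lfit reptherm age), by(vote)
```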

Bar Charts & Histograms

• Say we are interested in the distribution of a categorical variable
• Try creating a bar chart for our measure of political ideology
• Type
• hist ideology, discrete width(1)
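By default the histogram's vertical axis is a density, which is rarely what an audience wants for a categorical variable. A small sketch with two alternative scalings (the option choices are our own suggestions):

```stata
* Percent of respondents in each ideology category
histogram ideology, discrete percent

* Raw counts, with the count printed above each bar
histogram ideology, discrete width(1) frequency addlabels
```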

UFPC

• Say you need to paint a really basic picture of party id strength for your coworkers
• Try a pie chart
– graph pie, over(pid)
– Use options for presentation:
– title(UFPC) subtitle(Strength of Party Identification) caption(-3 = Strong Rep to 3 = Strong Dem) plabel(_all percent) cw
• Then quit your job; you’re working with imbeciles

[Figure: pie chart titled “UFPC”, subtitled “Strength of Party Identification”, one slice per pid value from -3 to 3, each labeled with its percentage (slice labels in extraction order: 16.99%, 14.98%, 12.89%, 16.15%, 17.57%, 9.874%, 11.55%); caption: “-3 = Strong Rep to 3 = Strong Dem”]
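Assembled into a single command, the pie chart might look like the sketch below. The /// line continuation works in a do file but not at the interactive prompt, where the command has to go on one line. The quotation marks around the titles are our addition.

```stata
* Pie chart of party identification with titles and percent labels
* cw drops observations with missing values casewise
graph pie, over(pid)                                  ///
    title("UFPC")                                     ///
    subtitle("Strength of Party Identification")      ///
    caption("-3 = Strong Rep to 3 = Strong Dem")      ///
    plabel(_all percent) cw
```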

Graphing with GUI

• Of course, we did not need the exact commands to create the graphs above
• We could have used the GUI toolbar to create any of those graphs
• Just go to Graphics and select the appropriate graph
• A new viewer will appear
• Select from the drop-down menu to fill in the necessary variables and options

Exporting Graphs & Tables

• So why did the last chart, the UFPC, look so nice and the others… not so much?
– 1. Used titles
– 2. Used a key
  • The graph was understandable on its own
– 3. Exported the graph as a picture
• Stata allows you to export its output, both tables and graphs, in various formats
– Depending on your typesetting system you will want to save the output in different manners

Exporting Graphs

• To save and export a graph, right click on the graph (control click to my Mac friends)
– Click Save Graph
– Save in the appropriate format
  • Word: .wmf or .png
  • LaTeX: .eps
• Alternatively, go to the main toolbar and click File > Save Graph
– Follow same procedure
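Graphs can also be exported by command, which is handy inside a do file. A minimal sketch with hypothetical file names; the format is inferred from the file extension, and which formats are available depends on your platform and release:

```stata
* Draw the graph to be exported
scatter reptherm age

* Export as a picture for Word (hypothetical file name)
graph export "reptherm_age.png", replace

* Export as Encapsulated PostScript for LaTeX
graph export "reptherm_age.eps", replace

* Save in Stata's own graph format so it can be reopened and edited later
graph save "reptherm_age.gph", replace
```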

Exporting Graphs

• Shortcut for Word users
• To merely copy a graph, right click on the graph (control click to my Mac friends)
– Click Copy
– Paste it in your word processor
– Note: you do not have a separate saved graph in this case

Exporting Tables

• The Stata table output is not appropriate for a conference paper or article submission
• Why not?
– 1. Too much information
– 2. Vertical lines
– 3. Variable names
– 4. No title or explanation
• Therefore, when you write a paper you will need to transform the output
• You’ve all seen article-worthy tables (e.g. Balla & Wright 2001)

Exporting Tables

• Let’s run a simple OLS regression of some key political and demographic variables on the Republican thermometer measure
– Explanatory variables: educ black south pray pid ideology
– Dependent variable: reptherm
• Stata output
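As a command, the regression described above would read roughly as follows. The dependent variable comes first, then the explanatory variables. We assume the thermometer variable is named reptherm, as in the scatterplot examples.

```stata
* OLS: Republican thermometer regressed on political and demographic variables
regress reptherm educ black south pray pid ideology
```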

Exporting Tables

• To export, highlight the table with the mouse
• Right click on the highlighted table
– For Word: Copy Text
– For Excel: Copy Table
• Edit in your chosen program in accord with journal specifications
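Before copying, it can help to have Stata print a leaner table. One possible sketch uses the built-in estimates commands; ols1 is a hypothetical name for the stored results, and the display formats are our own choices:

```stata
* Fit the model and store the results under a name
regress reptherm educ black south pray pid ideology
estimates store ols1

* Compact table: coefficients and standard errors to three decimals,
* plus the number of observations and R-squared
estimates table ols1, b(%9.3f) se(%9.3f) stats(N r2)
```

The result still needs a title, readable variable labels, and journal-style formatting in your word processor.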

Do Files

• We’ve accomplished quite a bit and we have a log file of our work to prove it
• But is there an easy way to rerun all our work?
• What if we wanted to make some small changes to our analyses and largely repeat this work?
• Use a Do file!
• Click here to open a new or saved .do file

Do Files

• A Stata do file saves text in a text editor format
– It is often easier to create your commands in an editor than at the command prompt
– Also easier to record your commands for future use and manipulation
• Do-file editor toolbar buttons: New .do file, Save your .do file, Print your .do file, Find text in .do file, Copy .do file, Undo last edit, Run .do file and show output

Do Files

• Typical text editing functions can be used in here: replace, copy… etc.
• Asterisk * tells Stata not to run that line
– Therefore annotate your .do file with titles and explanations beginning with an *
• Let’s look at all the commands used in today’s presentation
– Open a do file
– Select Open… in do file toolbar
– Select IntroStata_V08.do
– Click Open
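For a sense of what an annotated do file looks like, here is a minimal sketch. It is not the workshop's IntroStata_V08.do; the file names my_analysis.do and my_analysis.log are hypothetical, and the dataset is assumed to be in the working directory.

```stata
* ------------------------------------------------------------
* my_analysis.do (hypothetical file name)
* Purpose: load the workshop data and describe ideology
* ------------------------------------------------------------

* Close any log left open by an earlier run, then start a fresh one
capture log close
log using "my_analysis.log", text replace

* Load the data
use "NES04_VstataIntro08.dta", clear

* Describe and summarize
describe
summarize ideology

* Finish up
log close
```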

Do Files

• The .do file presents all the commands from today in a simple editor
• From here we can edit the commands
• We can run the entire series of commands at one fell swoop
– Bring cursor to the first line and click on the Run button
• We can also select portions to run by highlighting the appropriate text and clicking the same button
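A do file can also be executed from the command prompt rather than the toolbar. A short sketch, assuming the file sits in the current working directory:

```stata
* Run every command in the file and show the output
do "IntroStata_V08.do"

* Run the same file but suppress the output
run "IntroStata_V08.do"
```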

Do Files

• Note: if you forget to work in the do file, you can capture all your commands from the Review window:
– Right click in the Review window > Copy Review Contents to Clipboard
– Paste into your do file and edit

Importing Foreign Data

• Often times we aren’t lucky enough to have data in Stata’s database format
• Stata’s data files are stored as .dta files
– They are just EZ-form data files
  • Used in various programs
– Not to be confused with .dat files
  • Which are usually ASCII comma delimited and often viewed in text editors
• Not to worry!
• Beyond working with .dta files, Stata allows you to import data in various formats:
– ASCII (.txt, .raw, .csv)
– FDA (SAS export)
– XML (.xml)
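To try a text import without hunting for a file, the sketch below writes a small comma-delimited file from Stata's bundled auto dataset and reads it back. The file name auto_demo.csv is hypothetical. insheet and outsheet are the commands of the Stata 10 era; later releases offer import delimited and export delimited for the same job, so check which your version prefers.

```stata
* Write a small comma-delimited file from the bundled example data
sysuse auto, clear
outsheet make price mpg using "auto_demo.csv", comma replace

* Read the text file back in as a fresh dataset
insheet using "auto_demo.csv", comma clear
describe

* The other formats on the slide have their own commands, shown here
* as comments because they need files we do not have:
*   fdause "mydata.xpt", clear     (SAS XPORT; hypothetical file)
*   xmluse "mydata.xml", clear     (XML; hypothetical file)
```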

Importing Foreign Data

• For example, say we wanted to use data stored at ICPSR
• www.icpsr.umich.edu
• ICPSR has tons of data on various topics
• Hover on “Data” and select “Browse” to view their many datasets
• You can also search for a particular dataset

Importing Foreign Data

• Today I’m interested in American state politics
• I find that ICPSR has 14 relevant datasets
• I simply select to download the dataset I’m interested in: 8655 Survey of City Council Members…
• If you are a returning user, it will request your login and password
• If you are a new user, you will have to register first
– It’s free and easy to register
– No self-respecting methods student will make it through their first year without registering & downloading a dataset here

Importing Foreign Data

• Download will usually allow you to import the data with various set-up files
– These files make importing to your preferred program easier
• In this case we just want the “Stata Setup” files with the data file
• Add these to the “Data Cart” in Step 3
• Then select “Download” in Step 5 (you can review your cart in Step 4)

Importing Foreign Data

• After agreeing to their terms and conditions
– The data files are compressed in a zip file
– You are prompted to open or save the files
• Save the zip file in your preferred folder

Importing Foreign Data

• Now we have the data and setup files in a zip file on our computer
– Extract the contents from your zip file
– View the contents
  • Codebook as .pdf
  • Data as .txt
  • Setup dictionary as .dct
  • Setup do file as .do

Importing Foreign Data

• Let’s return to your Stata GUI
• Type clear to completely reset your data
– Doing so deletes any variables you have stored or you have created
• Click here to open a “do file”
– In the do file, select open
– Browse for the setup do file:
– 08655-0001-Setup.do

Importing Foreign Data

• The setup do file
• This file will define and label your data for the Stata editor by calling
– The dataset
– A corresponding dictionary file

Importing Foreign Data

• Edit the do file to pull from the appropriate folder
– You must tell it where to find the raw data (.txt) and the dictionary file (.dct); we stored them on the desktop
– And you must specify the name of the output file (.dta)
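The edits usually come down to three file locations. The sketch below shows the general shape of such a setup file, not the actual contents of 08655-0001-Setup.do: the macro names, the folder, and the data and dictionary file names are all assumptions for illustration, so match them to what your downloaded file really contains. Forward slashes are used in the paths because a backslash next to a macro reference can confuse Stata.

```stata
* Hypothetical locations: adjust to where you extracted the ICPSR files
local raw_data "C:/Desktop/08655-0001-Data.txt"
local dict     "C:/Desktop/08655-0001-Setup.dct"
local outfile  "C:/council.dta"

* Read the raw text using the column layout given in the dictionary
infile using "`dict'", using("`raw_data'") clear

* Save the result as a Stata dataset
save "`outfile'", replace
```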

Page 56: While We Are Waiting… - Department of Political Science to...• Example: CDF for the normal distribution, z = 1.96 • di normal(1 96)normal(1.96) 1/25/2008 Christenson & Powell:

Importing Foreign Data

• Once we’ve told the do file editor where to find the dictionary and data…

• We run the do file

• In the result viewer, Stata returns in green
– file C:\council.dta saved
– Or, if you got it wrong, it returns an error code in red
– If wrong, make sure you specified the right directory
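If the do file stops with a red error, one way to check the directory is sketched below. The folder and file names are placeholders; substitute the paths you typed into the setup do file.

```stata
* Show the folder Stata is currently working in
pwd

* Check that each input file exists at the path given in the do file.
* confirm file stops with a "not found" error (r(601)) if the path is wrong.
confirm file "C:/Users/yourname/Desktop/raw-data-file.txt"
confirm file "C:/Users/yourname/Desktop/dictionary-file.dct"
```

If both checks pass silently, the paths are fine and the problem lies elsewhere in the do file.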


Imported Foreign Data

• Now you have a Stata formatted dataset (.dta) from an ASCII file (.txt)

• Properly saved data file

• Variables listed and labeled 
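A quick way to confirm the import worked is to load the saved file and list its contents. This is a minimal sketch that assumes the output file was saved as C:/council.dta, as on the previous slide.

```stata
* Load the newly created Stata dataset
use "C:/council.dta", clear

* List variable names, storage types and variable labels
describe

* One-line summary per variable (observations, unique values, range, label)
codebook, compact
```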


Other Importing Options

• SPSS data (.sav) can be easily exported to Stata format (.dta) from SPSS
– In SPSS, just click Save As and select the appropriate Stata version (an export wizard is now available in SPSS as well)
– FYI: You can also export from SPSS to just about anything else (SAS, Excel, ASCII & dBase)

• The PRL lab has Stat/Transfer
– An easy way to move data between packages and into different databases
– Especially good with large and labeled databases


Congratulations

• By now you can move comfortably around Stata

• You can
– Keep a log of your work
– Use Stata as a statistics calculator
– Create variables
– Load a Stata dataset
– Examine your data
– Run some descriptive functions
– Make basic graphs
– Search for help on commands and packages
– Export Stata output into your preferred document
– Create, edit, run and save commands from a do file
– And even import foreign datasets


Remember

1. Begin by opening a log
– Always keep a log

2. To increase memory for large datasets, type set mem 100m

3. Begin all analyses with simple descriptives
– Know your data

4. Utilize gen to generate variables
– The egen command is a helpful extension to gen

5. Usefulness of the Review window
– Don’t need to retype the command (just click from the review)
– Also helpful are the page up/down keys within the command prompt

6. _n is Stata programming code for observation number

7. Use .do files
– Annotate your do files utilizing the *
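The checklist above can be sketched as one short do file. This is an illustration only. It uses Stata's built-in auto dataset instead of the workshop data, and the log file name and new variable names are made up for the example. The set mem step is left out because, as far as we know, recent versions of Stata manage memory automatically.

```stata
* Lines that start with an asterisk are comments (point 7)

* Point 1: open a log before doing anything else
capture log close
log using "session.log", replace text

* Load an example dataset that ships with Stata
sysuse auto, clear

* Point 3: start with simple descriptives
summarize price mpg
tabulate foreign

* Point 4: gen creates a variable; egen adds functions such as group means
gen price_k = price / 1000
egen mean_mpg = mean(mpg), by(foreign)

* Point 6: _n is the observation number
gen obs_id = _n
list make price_k mean_mpg obs_id in 1/5

* Close the log at the end of the session
log close
```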


See You Next Time

• PRISM’s next brownbag
Contemporary Methods of Ideal Point Estimation
Presenter: Josh Clinton of Princeton University
January 30, 2008, 12:00‐1:00pm

• PRISM’s Spring brownbag
Bayesian Inference with WinBUGS
Presenters: Dino Christenson & Scott Powell
Date & Time TBA (Spring 2008)

Updates at http://polisci.osu.edu/prism/luncheons.htm

• PRISM’s next methods lunch
February 5th, 12 noon
