Introduction to Stata - Social Science Research Commons


Introduction to Stata

Joseph D. Wolfe

January 21, 2011

Outline

1 The Basics

2 Data Management

3 Statistics

4 Visualization

5 Automation

6 Miscellaneous


The Basics

A couple of quick notes.

• Stata commands are in this font

• Stata commands that are shown as part of the output start with a period (.). When you type commands, do NOT type "." before your command.

• I provide examples of several of the commands in this presentation in a Stata do-file that can be found on my website.


Advantages of Stata

I won’t go into an elaborate comparison with other programs, but here are several things I really appreciate about Stata:

• The graphical user interface neatly displays variables and results.

• Stata is well documented, and there are numerous texts on using Stata for a variety of statistical tasks.

• StataCorp works hard to ensure backward compatibility.

• Stata has lots of tools to facilitate workflow and replication.


Getting Help

There are three general ways of getting help with Stata:

1 Stata’s help command

2 The Stata manual: PDF documentation can be accessed from the Help pull-down menu or at the bottom of a command’s help file.

3 Statalist


The Graphical User Interface

1 Results Window: Displays results from commands, error messages, etc.

2 Command Window: Allows us to work interactively with Stata.

3 Variable Window: Displays the variables in a dataset.

4 Review Window: Displays the previous commands that have been used.


Speaking Stata

Stata has its own language. We have to write commands in a specific format for Stata to understand what we want. This format is referred to as syntax. The basic language syntax is:

[prefix :] command [varlist] [=exp] [if] [in] [weight] [using filename] [, options]

You can find the syntax for a specific command by typing help followed by the command name in the Command window.
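To see how the pieces of the syntax diagram fit together, here is a minimal sketch using the auto dataset that ships with Stata. The choice of variables and conditions is an assumption made purely for illustration.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* command varlist [if] [, options]
summarize price mpg if foreign == 1, detail

* command varlist [in]
list make price in 1/5

* [prefix :] command varlist
bysort foreign: summarize mpg
```

Each line uses only some of the optional elements. Most commands accept a subset of them, which is why checking a command's help file is worthwhile.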


The Command Window

Now that we have an idea of how Stata works, let’s jump right in and look at how to use the Command window.

For example, the display command (abbreviated di) is great for quick calculations in the Command window.

. di 2 + 2
4

You could probably do all your work using the Command window, but in the next section we’ll talk about why that’s not a good strategy for conducting serious research.
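A few more ways display can be used as a calculator are sketched below. The expressions are arbitrary examples, not taken from the presentation.

```stata
* display evaluates an expression and prints the result
display 2 + 2
display sqrt(16) * 3

* A numeric format can be placed before the expression
display %9.2f 100 / 3

* Text and expressions can be combined on one line
display "Average of 3, 5, and 10: " (3 + 5 + 10) / 3
```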


Data Management

Several of the issues I mention about data management can be found in Scott Long’s Workflow book. You should go there for more detailed information on this topic.


Setting Your Directories

A directory is where Stata looks for and saves information. So, if you want to use a dataset in a specific folder, you have to tell Stata to look in that folder. Let’s check where my directory is currently set by using the pwd command.

. pwd
/Users/jdwolfe/Documents/

If I wanted to work on a project I’m involved with, I would need to change the directory by using the cd command.

. cd "/Users/jdwolfe/Documents/Project"
/Users/jdwolfe/Documents/Project
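In a do-file it can be handy to remember the starting directory and return to it afterwards. The sketch below assumes Stata's temporary directory, c(tmpdir), as a stand-in for a real project folder. The local macro name start is also just illustrative.

```stata
* Remember where we started; c(pwd) holds the current working directory
local start "`c(pwd)'"

* Move to another folder (c(tmpdir) is only a stand-in for a project folder)
cd "`c(tmpdir)'"
pwd

* Return to the original folder
cd "`start'"
pwd
```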


Importing and Exporting Data

Let’s get some data! To start off, let’s just use data that is shipped with Stata.

. sysuse auto
(1978 Automobile Data)

The sysuse command is a quick way to load data included with Stata (sysuse dir will list other available data). However, you’ll need a different command to load data in one of your personal folders. Let’s save this dataset in a new folder and then reload it.


We’ve set our directory, so all we need is the save command.

. save auto2
file auto2.dta saved

To make sure I’ve saved it correctly, I can check the contents of the directory.

. dir
<dir>   8/26/10 16:33  .
<dir>   8/26/10 16:33  ..
 5.8k   8/26/10 16:33  auto2.dta


Now, because the data is in my folder, I need the use command.

. use auto2.dta, clear
(1978 Automobile Data)

For other data formats, see outsheet, insheet, outfile, and infile.
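The save and use round trip, together with a comma-separated export and import, might look like the sketch below. The file name auto2.csv is an assumption. Note also that releases from Stata 13 onward document export delimited and import delimited as the replacements for outsheet and insheet, although the older commands named above still run.

```stata
* Start from the dataset that ships with Stata
sysuse auto, clear

* Save a Stata-format copy in the current directory, then reload it
save auto2, replace
use auto2, clear

* Write a comma-separated text file and read it back
outsheet using auto2.csv, comma replace
insheet using auto2.csv, comma clear
describe, short
```

The replace option overwrites an existing file of the same name, so the sketch can be rerun without errors.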


Do and Log Files

Although the point-and-click option is enticing, it inhibits replication of results. Using do-files allows you to have a copy of your work that you can always come back to and reuse in other projects.

• To open a do-file, click the Stata do-file editor icon in the toolbar (in Stata 11, it’s the 7th icon to the right).

• Notice that the do-file editor has its own toolbar. Use this to save, open, print, and run files.

• You may want to use a text editor with more options. Here are several to check out. For PCs, there are UltraEdit and Notepad++ (free!). I have a Mac, so I use TextWrangler, a stripped-down version of BBEdit. It’s free and scriptable!


I usually keep a do-file template on hand. When I want a new do-file, I open my template and then save it with a new name. Here’s an example (this template can be downloaded from my website):


 1> capture log close
 2> log using NAMEOFDOFILE, replace text
 3>
 4> // program: NAMEOFDOFILE
 5> // task:
 6> // project:
 7> // author:
 8>
 9> version 11
10> clear all
11> macro drop _all
12> set linesize 80
13> set more off
14>
15> // #1
16> // Describe step
17>
18> log close
19> exit


Let’s decode some of this.

1 Lines 1-2: I begin by closing any previous log and creating a new one. We’ll talk about this more in a second.

2 Lines 4-7: I put some basic information about the do-file and what it is doing. You can add notes to a do-file by using "//" or "*", or by surrounding text with "/*" and "*/".

3 Line 9: Version control is incredibly important. The version command tells Stata which release the do-file was written for, so the commands are interpreted the same way when the file is run in a later release.

4 Lines 10-11: I clear all information from any previous sessions that may still be in memory.

5 Lines 12-13: I set the linesize and set more off. Without set more off, Stata will pause while running the do-file.

6 Lines 15-17: I do my work in steps #1 through however many I need.

7 Lines 18-19: Finally, I close the log and exit.
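The three comment styles mentioned for lines 4-7 can be sketched as follows. The commands between the comments are arbitrary examples using the auto dataset that ships with Stata.

```stata
* A line that starts with an asterisk is a comment
// Two slashes also start a comment
sysuse auto, clear   // a comment can follow a command on the same line

/* A block comment
   can span several lines */
summarize mpg
```

The "//" style is meant for do-files. It is not recognized when typed interactively in the Command window.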


Log Files

While do-files hold commands, log files record both the commands used and the output created during a session. Look back at the template to see where I use the commands I describe below.

• I always begin with capture log close. This makes sure that I’m not writing information into a different log file that may have been accidentally left open. The capture prefix keeps Stata from stopping with an error when no log is open.

• log using FILENAME, replace text. This creates a log or replaces an old log of the same name, and text saves it as a plain-text file.

• At the end of do-files, I also include the command log close. Otherwise, the log will continue recording anything that Stata does.
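Put together, the three log commands might be used as in the sketch below. The log name example_session is a made-up file name, and the commands in the middle are placeholders for real work.

```stata
* Close any log that is still open; capture suppresses the error if none is
capture log close

* Start a plain-text log, overwriting an earlier log with the same name
* (example_session is a hypothetical file name)
log using example_session, replace text

sysuse auto, clear
summarize price

* Stop recording
log close
```

With the text option, the file is written as example_session.log in the current directory and can be opened in any text editor.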

Page 49: Introduction to Stata - Social Science Research Commons · Introduction to Stata Joseph D. Wolfe Outline The Basics Data Management Statistics Visualization Automation Miscellaneous

Introduction toStata

Joseph D.Wolfe

Outline

The Basics

DataManagement

Statistics

Visualization

Automation

Miscellaneous

Variables

I spend a large chunk of my time on projects creating andrecoding variables. Here are some commands I’ve foundindispensable in creating and changing variables. See step#3 for an example using these commands.

Page 50: Introduction to Stata - Social Science Research Commons · Introduction to Stata Joseph D. Wolfe Outline The Basics Data Management Statistics Visualization Automation Miscellaneous

Introduction toStata

Joseph D.Wolfe

Outline

The Basics

DataManagement

Statistics

Visualization

Automation

Miscellaneous

Variables

I spend a large chunk of my time on projects creating andrecoding variables. Here are some commands I’ve foundindispensable in creating and changing variables. See step#3 for an example using these commands.

• generate - creates new variables

Page 51: Introduction to Stata - Social Science Research Commons · Introduction to Stata Joseph D. Wolfe Outline The Basics Data Management Statistics Visualization Automation Miscellaneous

Introduction toStata

Joseph D.Wolfe

Outline

The Basics

DataManagement

Statistics

Visualization

Automation

Miscellaneous

Variables

I spend a large chunk of my time on projects creating andrecoding variables. Here are some commands I’ve foundindispensable in creating and changing variables. See step#3 for an example using these commands.

• generate - creates new variables

• recode - recodes an existing variable

• replace - replaces the contents of an existing variable

• egen - like generate, but with lots of extensions
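
As a rough illustration of these four commands, the sketch below uses the auto dataset that ships with Stata. The new variable names (price_k, rep_cat, expensive, mean_price) and the cutoff of 6000 are made up for this example.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* generate: create a new variable (price in thousands of dollars)
generate price_k = price / 1000

* recode: collapse the repair record into three groups,
* saving the result as a new variable so rep78 is left untouched
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), generate(rep_cat)

* generate + replace: build an indicator, then change its contents
generate expensive = 0
replace expensive = 1 if price > 6000
replace expensive = . if missing(price)

* egen: one of its extensions computes group means
egen mean_price = mean(price), by(foreign)
```

Using the generate() option with recode keeps the original variable intact, which makes it easier to check the recoding with a two-way tabulate afterwards.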

There are also several commands that should usually be included when creating or revising a variable.

• label variable - adds or changes the label for a variable

• label define and label values - use these commands to add labels to the categories of a variable

• notes - add notes to data and variables. I usually add a note telling me the name of the do-file used to create a variable. This is very helpful when, weeks or months later, I need to see exactly how I created a variable.
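
A minimal sketch of how these labeling commands might follow the creation of a variable, again using the auto dataset. The variable name highmpg, the label name yesno, and the 25 mpg cutoff are assumptions chosen for illustration; the do-file name is the one used elsewhere in this presentation.

```stata
sysuse auto, clear

* Create an indicator for high-mileage cars (hypothetical variable)
generate highmpg = (mpg > 25) if !missing(mpg)

* label variable: describe the variable itself
label variable highmpg "Mileage above 25 mpg"

* label define + label values: describe each category
label define yesno 0 "No" 1 "Yes"
label values highmpg yesno

* notes: record which do-file created the variable
notes highmpg: created in stintro01.do

* Check the result
tabulate highmpg
notes highmpg
```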

Statistics

Stata can do a wide range of statistical analyses. We don’t have time to cover them all, or to explain what they do in detail. In the next few slides, I list some of the more basic commands. If you want more information on a command or on using a statistical technique in Stata, click on the name of the command or check out Stata’s bookstore.

Descriptive Statistics

Stata has lots of commands for getting basic statistics about your data.

• list - lists values of variables for observations

• codebook - a nice table of basic descriptive statistics (with the compact option) or more detailed information about variables

• summarize - also provides a nice table of basic descriptive statistics, or more detailed information about a variable with the detail option

• tabulate - get a table of frequencies or, if you specify two variables, a two-way table of frequencies
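
The commands above can be tried on the auto dataset roughly as follows. The particular variables are just convenient picks from that dataset.

```stata
sysuse auto, clear

* list: show a few variables for the first five observations
list make price mpg in 1/5

* codebook: compact summary table, then detail for one variable
codebook price mpg, compact
codebook rep78

* summarize: basic statistics, then percentiles, skewness, etc.
summarize price mpg
summarize price, detail

* tabulate: one-way and two-way frequency tables
tabulate rep78
tabulate rep78 foreign
```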

Test and Regression Commands

Here’s a short list of some basic test and regression commands.

• ttest - Mean comparison tests

• regress - Linear regression for continuous outcomes

• logit - Logistic regression for binary outcomes

• mlogit - Multinomial logistic regression for categorical outcomes

• poisson - Poisson regression for count outcomes
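
Most of these commands share the same basic pattern: the command name, then the outcome, then the predictors. Here is a short sketch with the auto dataset; the model specifications are arbitrary and only meant to show the syntax.

```stata
sysuse auto, clear

* ttest: compare mean price between domestic and foreign cars
ttest price, by(foreign)

* regress: linear regression of price on mileage and weight
regress price mpg weight

* logit: binary outcome (foreign = 1) on the same predictors
logit foreign mpg weight
```

mlogit and poisson follow the same outcome-then-predictors layout.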

More Advanced Models

Here are a few of Stata’s more specialized statistical capabilities.

• xt - These commands are intended for cross-sectional panels and longitudinal data.

• svy - These commands make adjustments for complex survey designs.

• st - These commands deal with survival-time data.
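
Each of these families works in two steps: first declare the structure of the data, then run an estimation command. The sketch below assumes an internet connection, because it loads example datasets from Stata's website with webuse. The datasets and model choices are my own picks for illustration, not part of the original presentation.

```stata
* xt: declare the panel structure, then fit a fixed-effects model
webuse nlswork, clear
xtset idcode year
xtreg ln_wage age tenure, fe

* svy: declare the survey design, then prefix a command with svy:
webuse nhanes2, clear
svyset psuid [pweight = finalwgt], strata(stratid)
svy: mean height weight

* st: declare survival time and the failure indicator, then fit a Cox model
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age
```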

Tables

For a quick and dirty table, you can copy results from Stata’s Results window and paste them into a word processor. If you use a fixed-width font, e.g., Courier New, it will look exactly like it did in Stata’s Results window.

For publication-quality tables, try esttab. This is a user-written program, so you’ll have to download it. For instructions, see the Miscellaneous section of the presentation.
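
A minimal sketch of an esttab workflow, assuming the estout package (which provides esttab) is already installed. The model names m1 and m2 and the output filename are hypothetical.

```stata
* One-time install of the estout package, if needed:
* ssc install estout

sysuse auto, clear

* Fit two models and store each set of estimates
regress price mpg
estimates store m1
regress price mpg weight foreign
estimates store m2

* Display both models side by side with standard errors
esttab m1 m2, se

* Write the same table to a file (hypothetical filename)
esttab m1 m2 using price_models.rtf, se replace
```

The .rtf file can be opened directly in a word processor.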

Graphs

Stata’s graphs can be great but can also get very complicated. Although there are pull-down menus that will help create graphs, I strongly recommend using a saved do-file. This way you can reproduce your graphs and reuse the commands. I provide a few simple examples, but as with the other topics we’ve discussed, there are lots of options that can be added to get what you want.

Histograms are a quick way to check variable frequencies. The basic command is straightforward:

. histogram price, freq

For two variables, you want to use a twoway plot. Here’s an example:

. graph twoway scatter price length

For more than two variables, you’ll want a matrix graph.

. graph matrix price mpg length

To save graphs, use the graph export command. For example, we could save the matrix graph we just created:

. graph export grmatrix.png, replace
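
To give a sense of how options build on the basic graph commands, here is a sketch of what a short graphing do-file might look like. The titles and filenames are made up. The /// line continuation only works inside a do-file, not when typing at the command line, and PNG export may not be available in console-only versions of Stata.

```stata
sysuse auto, clear

* Histogram with a title, saved to a file
histogram price, frequency title("Distribution of price")
graph export hist_price.png, replace

* Scatterplot with a fitted line overlaid on the same axes
twoway (scatter price length) (lfit price length), ///
    title("Price by length") ytitle("Price") xtitle("Length (in.)")
graph export scatter_price_length.png, replace
```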

Locals

Locals allow you to designate letters or a word to represent other text, variables, or numbers.

For example, I create a local at the beginning of my do-files that contains the name of the do-file.

. local pgm "stintro01"

Then, I can use that local to add notes, captions to graphs, etc. For example, to add the name of the do-file to a variable note, I would say:

. note price: `pgm'

You’ll notice quotes around the local when I use it in a command. The quote on the left is a grave accent (backtick), which is located at the top left of your keyboard next to the number 1. The quote on the right is a single quote (a.k.a. an apostrophe). There are no spaces between the quotes and the name of the local.
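
Here is a small sketch of locals holding each kind of content: text, a list of variables, and a number. The local names controls and cutoff are hypothetical. Run the block all at once from a do-file, since a local disappears when the do-file (or the selected lines) finishes running.

```stata
sysuse auto, clear

* A local holding text (the do-file name used in this presentation)
local pgm "stintro01"

* A local holding a list of variables (hypothetical name)
local controls "mpg weight length"

* A local holding a number (hypothetical cutoff)
local cutoff = 6000

* Stata substitutes the contents wherever `name' appears
display "This do-file is `pgm'"
regress price `controls'
count if price > `cutoff'
notes price: `pgm'
```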

Loops

Loops allow you to run a command or a series of commands over and over without having to type it all out. This can be a lifesaver!

Pretend that we just created the variables make, price, mpg, rep78, and headroom in the auto dataset. Now, we want to attach a note that includes the name of the do-file.

Well, we could go through every variable like this:

. note make: `pgm'

. note price: `pgm'

. note mpg: `pgm'

. note rep78: `pgm'

. note headroom: `pgm'

The foreach command can do this much faster.

. foreach var in make price mpg rep78 headroom {
      note `var': `pgm'
  }

This was a short example, so maybe the foreach command doesn’t seem worth the trouble. However, if you need to tabulate 20 variables or recode the missing values in 200 variables, the foreach command can save you a lot of time.
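
The tabulating and missing-value tasks mentioned above might look something like this. The code -9 is a hypothetical missing-value code (the auto dataset does not actually use it, so the replace simply makes no changes here).

```stata
sysuse auto, clear
local pgm "stintro01"

* Tabulate several variables with one loop
foreach var in rep78 foreign headroom {
    tabulate `var'
}

* Recode a hypothetical missing-value code (-9) to Stata's missing
foreach var of varlist price mpg rep78 {
    replace `var' = . if `var' == -9
}

* "of varlist" also accepts ranges of adjacent variables
foreach var of varlist price-headroom {
    notes `var': `pgm'
}
```

With foreach ... of varlist, Stata checks that every name is an existing variable before the loop starts, so a typo is caught immediately.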

The forvalues command is similar except that it loops over consecutive values.

. forvalues x = 5/10 {
      sum price if trunk==`x'
  }
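
forvalues can also step through values at a fixed interval rather than one at a time. A brief sketch, with an arbitrary range and step:

```stata
sysuse auto, clear

* Loop over 10, 15, 20, ..., 40 using the start(step)end form
forvalues m = 10(5)40 {
    quietly count if mpg >= `m'
    display "Cars with mpg >= `m': " r(N)
}
```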

User-Written Software

For specialized tasks, you may need user-written software. To find the software, use the findit command. For example, to find the esttab package, type:

findit esttab

Scroll down until you find the right package and click the link to download.

If you are using a computer that will not allow you to change files, you’ll need to type:

sysdir set PLUS "PATHTOFOLDER"

Then try downloading again. The program should be downloaded to the chosen folder.
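
As a sketch of the whole sequence, the commands below check where Stata saves user-written programs, redirect that location, and then install a package. The folder path is a made-up placeholder; substitute a folder you can write to. ssc install is an alternative to clicking through findit results, and it works here because esttab is distributed in the estout package on the SSC archive.

```stata
* Show the folders Stata uses for ado-files, including PLUS
sysdir

* Point PLUS at a writable folder (hypothetical path)
sysdir set PLUS "C:/Users/yourname/ado/plus"

* Install the package, then confirm Stata can find the command
ssc install estout, replace
which esttab
```

The sysdir set change lasts only for the current Stata session, so it needs to be repeated (or placed in a do-file) each time.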

Websites

Here is a list of websites that may be of interest.

• Stata’s Website

• The Stata Journal

• Stata Tutorials, UCLA Statistical Computing

• General Introduction to Stata, Germán Rodríguez, Princeton University

• A SAS User’s Guide to Stata, Dan Blanchette, Duke University