7/25/2019 Sl Lecture01
Lecture 1: Programming with R
Renuka Sane
July 29, 2015
The kinds of questions economists ask
Has the mid-day meal scheme improved school attendance?
Does an increase in police presence lead to a reduction in the crime rate?
Did the ban on commissions to mutual funds lead to a reduction in fund flows?
Do minimum wages cause unemployment?
Do FII inflows increase stock price volatility?
Components of writing such papers
Hypothesis
Econometric model
Data!
Statistical packages
Users supply data
Run pre-defined routines. Example: regress Y X
The problems:
It is non-trivial to get Y and X in the same data-set
Our work is actually 90% data handling & graphs and 10% estimation.
What if the routine you want is not part of the package?
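A minimal sketch of the point above, in R: the estimation is one line, while most of the work is getting Y and X into the same data-set. The data frames, the `district` key and all the numbers are made up for illustration; they are not from the lecture.

```r
# Hypothetical data: the outcome and the regressor arrive in separate tables,
# keyed by a made-up district identifier and stored in different orders.
attendance <- data.frame(district = c("A", "B", "C", "D"),
                         y = c(2.1, 3.9, 6.2, 7.8))
meals <- data.frame(district = c("D", "C", "B", "A", "E"),
                    x = c(4, 3, 2, 1, 5))

# Data handling: bring Y and X into one data set.
# merge() keeps only districts present in both tables (an inner join).
both <- merge(attendance, meals, by = "district")

# Estimation: the one-line part, analogous to `regress Y X`.
fit <- lm(y ~ x, data = both)
coef(fit)
```

District "E" has no attendance record, so it drops out of the merged data; deciding how to treat such unmatched rows is part of the data-handling work.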
Data handling example: MG-NREGA
Data handling example: Trading data on NIFTY
Data handling example: Many files to be put in one
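One common way to put many files into one, sketched in R. The file names, folder and contents below are hypothetical stand-ins created on the fly so that the example runs on its own; real files would need their own checks for matching columns.

```r
# Create three small CSV files in a temporary folder (stand-ins for real data).
dir <- file.path(tempdir(), "many_files")
dir.create(dir, showWarnings = FALSE)
for (yr in 2011:2013) {
  write.csv(data.frame(year = yr, value = c(1, 2)),
            file.path(dir, paste0("data_", yr, ".csv")),
            row.names = FALSE)
}

# List the files, read each one, and stack the pieces into one data frame.
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
pieces <- lapply(files, read.csv)
combined <- do.call(rbind, pieces)
nrow(combined)
```

The same three lines at the end work whether the folder holds three files or three thousand, which is the appeal of doing this in code rather than by hand.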
The elements for a computational toolkit
Price
Freedom
Computer science
Network effects
The three main alternatives
System  Price      Freedom  CS                             Network
SAS     Very high  Zero     Bad                            Small
Stata   High       Zero     Better than SAS but not great  High among economists
R       Free       Free     Great                          Slowly growing
Part I
R
The origins of S
The predecessor of R was S. It was developed at Bell Labs and is a child of the Unix philosophy.
1970s Initial implementation (Fortran, mostly for internal use)
1980s Unix version, wider distribution in academia. New-S.
1990s Statistical modeling language. Licensing (S-PLUS). Addition of formal object-oriented programming.
ACM Software System Award
John Chambers was awarded the Software System Award in 1998:
"For the S system, which has forever altered how people analyze, visualise, and manipulate data."
http://awards.acm.org/software_system/year.cfm
The hall of fame includes:
1983 Unix
1986 TeX
1995 World-Wide Web
2002 Java
From there we have R
R is the free S.
Started as a teaching tool by Robert Gentleman and Ross Ihaka at the University of Auckland, around 1993.
Released as Free software around 1995
Version 1.0 released in 2000.
R is now the dominant statistics software in the world.
R is a GNU project.
R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License in source code form.
A digression on free software
Free as in free speech, and not free beer
The freedom to run the program, for any purpose
The freedom to study how the program works, and adapt it to your needs.
The freedom to redistribute copies so that you can help your neighbour
The freedom to improve the program, and release your improvements to the public, so that the whole community benefits
Free software can be commercial software
Why open-source?
Do not need a license.
Do not need a department.
Reproducible research.
R is a programming language
Designed for interactive use
With a focus on data analysis
Basic data structures are vectors
Large collection of statistical functions
Advanced statistical graphics capabilities
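A short illustration of what it means for vectors to be the basic data structure, as it might be typed in an interactive session. The values are arbitrary.

```r
x <- c(2, 4, 6, 8)   # a numeric vector

x * 10               # arithmetic applies element by element
x[x > 4]             # a logical condition selects elements
mean(x)              # statistical functions take whole vectors
length(5)            # even a single number is a vector of length 1
```

No explicit loop is needed for any of these operations; most R functions are written to work on whole vectors at once.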
R in the real world
".. It is becoming their lingua franca partly because data mining has entered a golden age, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models. Companies as diverse as Google, Pfizer, Merck, Bank of America, the InterContinental Hotels Group and Shell use it."
Source: New York Times, 7 January, 2009
Where do you get R?
http://www.r-project.org/
http://www.rstudio.com/
Installing R
R can be installed on Windows, Mac or Linux.
Homework: Visit the R website and follow the installation directions. You will want to install the base system.
There are several additional user-contributed add-on packages.
To install a package, be connected to the internet and type
> install.packages("plm")
You will be asked to select the mirror site nearest to you. After that everything is automatic.
Load the package before using it.
> library(plm)
Preferred operating system
Ubuntu on Linux: http://www.ubuntu.com/download
Part II
Programming for projects
Reusability of code
Don't design a chainsaw for only an oak tree
There might be a million species of trees
But they are all trees
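The chainsaw idea can be sketched in R as a function that works on any numeric column of any data frame, instead of code tied to one variable of one data-set. The helper name `summarise_column` is hypothetical; `mtcars` and `iris` are example data sets that ship with R.

```r
# A hypothetical helper: summarise any numeric column of any data frame,
# rather than hard-coding one variable of one data set.
summarise_column <- function(data, column) {
  values <- data[[column]]
  stopifnot(is.numeric(values))          # fail early on non-numeric input
  c(mean = mean(values, na.rm = TRUE),
    sd   = sd(values, na.rm = TRUE),
    n    = sum(!is.na(values)))          # count of non-missing values
}

# The same function, reused on two unrelated data sets.
summarise_column(mtcars, "mpg")
summarise_column(iris, "Sepal.Length")
```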
Automation: writing and coding
Frequent changes to the data-set
Frequent changes to the statistics we want reported
Embed coding in the writing of your paper
Tools: LaTeX, R, knitr.
Writing code is like writing a proof
Each step follows the other
You cannot jump!
Many days later you may not remember how you got there after all
Part III
This course
Goals
Learn R as a statistics toolbox with a strong emphasis on programming language aspects
Learn automation using R, LaTeX and knitr
Broad syllabus
Data manipulation in R: objects; tables and cross-classifications; array and matrix operations
Writing functions
Graphics: Basic plots; lattice; ggplot2
Univariate statistics: generating random data; univariate statistics; bootstrap and permutation methods
High performance computing: Introduction to high performance computing
Grading
Examination   Marks
Class tests   20
Midterm exam  40
Final exam    40
Resources
The R Manuals: www.r-project.org
Introductory Statistics with R, by Peter Dalgaard
Modern Applied Statistics with S, by W. N. Venables and B. D. Ripley
Tutorials on the web: http://www.r-bloggers.com/google-developers-r-programming-video-lectures/
Consultation
Tuesdays: 15:30-17:00, Room 204
Email: [email protected]