Simple Thematic Mapping in Stata

3rd German Stata Users Group Meeting

Berlin, 8 April 2005

Maurizio Pisati
University of Milano Bicocca – Italy

maurizio.pisati@unimib.it

Thematic maps

Thematic maps represent the spatial distribution of one or more variables of interest within a given geographical unit

Thematic maps

Examples:
– A sociologist could use a choropleth map (a.k.a. shaded map) to show how the percentage of families below the poverty line varies across the states or the provinces of a given country
– A police officer could be interested in analyzing a dot map showing the locations of drug markets within a given city

Software for thematic mapping

Usually, to produce state-of-the-art thematic maps one has to resort to specialized software (e.g., ArcView, MapInfo)

In some cases, however, it is possible to exploit the graphical engine of a general-purpose statistical package to draw simple but effective thematic maps

Stata’s mapping capabilities

Up until version 7, Stata offered very limited mapping capabilities

On the other hand, the graphical engine introduced in Stata 8 is quite flexible and makes it possible to draw several kinds of maps in a relatively simple manner

The tmap package

The tmap package is a suite of Stata programs designed to draw five kinds of thematic map:
– Choropleth maps
– Proportional symbol maps
– Deviation maps
– Dot maps
– Label maps

The tmap package

Choropleth, proportional symbol, and deviation maps are intended to depict area data

Dot maps are suitable for representing point data

Label maps can be used to show data of both types

The tmap package

The tmap package exploits the possibility – offered by the new Stata graphical engine – to overlay a large number of different graphs, each of which is used to create a distinct element of the desired map

The tmap package

Specifically:
– graph twoway area is used to draw the outlines of the geographical areas of interest and to fill them with the appropriate colors
– graph twoway scatter is used to plot the proper symbols or labels when required
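
As a rough illustration of this overlay idea, the sketch below draws one filled rectangle with twoway area and places a label on it with twoway scatteri. It uses only built-in graph commands and is not tmap's actual code. The coordinates and the label position (x = 14, y = 20) are made up for the example, and the options nodropbase and cmissing(n) are assumed to be available in the Stata release at hand.

```stata
* Sketch only: one polygon drawn as a filled area, plus one label
clear
input byte _ID float(_X _Y)
1  .  .
1 10 10
1 10 30
1 18 30
1 18 10
1 10 10
end

* area fills the closed outline; scatteri overlays the label "A1"
twoway (area _Y _X, nodropbase cmissing(n) fcolor(sand) lcolor(black)) ///
       (scatteri 20 14 "A1", msymbol(none) mlabposition(0)),            ///
       legend(off) xscale(off) yscale(off)
```

Each additional map element (another area, another set of symbols) would simply be one more plot inside the same twoway call.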

Required datasets

To use tmap, one typically needs two Stata datasets:
– The master dataset
– The boundary dataset

Master dataset

The master dataset is intended to store the spatial data to be represented

It is a standard cases-by-variables dataset whose rows represent the geographical areas or locations that are the objects of analysis

Master dataset: example

id   land                     turnout
 1   Baden-Wuerttemberg          62.6
 2   Bayern                      57.1
 3   Bremen                      61.3
 4   Hamburg                     68.7
 5   Hessen                      64.6
 6   Niedersachsen               67.0
 7   Nordrhein-Westfalen         56.7
 8   Rheinland-Pfalz             62.1
 9   Saarland                    55.5
10   Schleswig-Holstein          66.5
11   Brandenburg                 56.4
12   Mecklenburg-Vorpommern      70.6
13   Sachsen                     59.6
14   Sachsen-Anhalt              56.5
15   Thueringen                  53.8
16   Berlin                      68.1
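
A master dataset of this kind could be entered by hand along the following lines. This is only a sketch: it uses the first four rows of the example, the file name master-example.dta is hypothetical, and the isid check reflects the assumption that each area appears in exactly one row.

```stata
* Sketch: enter the first four rows of the example and save them
clear
input byte id str22 land float turnout
1 "Baden-Wuerttemberg" 62.6
2 "Bayern"             57.1
3 "Bremen"             61.3
4 "Hamburg"            68.7
end

isid id                              // assumption: one row per area
list, clean noobs
save "master-example.dta", replace   // hypothetical file name
```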

Boundary dataset

The boundary dataset is intended to store the geographical boundaries of the whole geographical unit of interest R or of its sub-areas Ai (i = 1,…,n)

Boundary dataset

The boundary dataset must always include the following three variables:
– _ID, which contains the numeric identifier of R or of each sub-area Ai
– _X, which contains the x-coordinates of the polygon or polygons that make up R or each sub-area Ai
– _Y, which contains the y-coordinates of the polygon or polygons that make up R or each sub-area Ai

Boundary dataset

If one or more of the sub-areas Ai are “islands”, i.e., are completely surrounded by the territory of another sub-area, then the boundary dataset must include an additional variable:
– _ISLAND, which takes value 1 when the corresponding sub-area is an “island”, and value 0 otherwise

Boundary dataset

The boundary dataset must always be sorted by variable _ID
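
One practical caveat, offered as an aside rather than as part of tmap's documentation: Stata's sort resolves ties in arbitrary order unless told otherwise, so sorting by _ID alone could scramble the order of the nodes within each area. The sketch below keeps the original record order, assuming the records were entered in the correct node sequence. The helper variable seq is hypothetical.

```stata
* Sketch: sort by _ID without disturbing the order of nodes within areas
clear
input byte _ID float(_X _Y)
2  .  .
2 10 30
2 10 50
2 30 50
2 30 30
2 10 30
1  .  .
1 10 10
1 10 30
1 18 30
1 18 10
1 10 10
end

gen long seq = _n        // remember the original record order
sort _ID seq             // ties within _ID are resolved by seq
drop seq
list, sepby(_ID) noobs
```

Typing sort _ID, stable should have the same effect, since the stable option preserves the existing order of tied observations.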

Boundary dataset

Each polygon included in the boundary dataset must be defined by 1+k+1 records, each of which corresponds to a proper pair of (x,y) coordinates

Boundary dataset

The first record denotes the beginning of a new polygon and corresponds to a missing coordinate pair (.,.)

Boundary dataset

The 2nd to (k+1)th records denote the k nodes of the polygon and correspond to the k coordinate pairs (x,y) that define such nodes

These records must be arranged so as to correspond to consecutive nodes

Boundary dataset

The last record denotes the end of the polygon and corresponds to a coordinate pair which is an exact replica of the first node of the polygon

Boundary dataset: example

[Figure: map of a region with sub-areas labeled A1 (label shown twice), A2, and A3]

Boundary dataset: example

+-------------------------+
| _ID   _X   _Y   _ISLAND |
|-------------------------|
|   1    .    .         0 | <- Polygon 1: start
|   1   10   10         0 |
|   1   10   30         0 |
|   1   18   30         0 |
|   1   18   10         0 |
|   1   10   10         0 | <- Polygon 1: end
|   1    .    .         0 | <- Polygon 2: start
|   1   22   10         0 |
|   1   22   30         0 |
|   1   30   30         0 |
|   1   30   10         0 |
|   1   22   10         0 | <- Polygon 2: end
|-------------------------|
|   2    .    .         0 | <- Polygon 3: start
|   2   10   30         0 |
|   2   10   50         0 |
|   2   30   50         0 |
|   2   30   30         0 |
|   2   10   30         0 | <- Polygon 3: end
|-------------------------|
|   3    .    .         1 | <- Polygon 4: start
|   3   22   48         1 |
|   3   28   48         1 |
|   3   28   42         1 |
|   3   22   42         1 |
|   3   22   48         1 | <- Polygon 4: end
+-------------------------+
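
The 1+k+1 record layout described on the preceding slides can be checked mechanically. The sketch below enters the example boundary dataset and verifies that every polygon starts with a missing pair and ends with a replica of its first node. The helper variables seq and poly are hypothetical, and the checks are an illustration rather than anything tmap itself is known to run.

```stata
* Sketch: enter the example boundary dataset and check its structure
clear
input byte(_ID _X _Y _ISLAND)
1  .  . 0
1 10 10 0
1 10 30 0
1 18 30 0
1 18 10 0
1 10 10 0
1  .  . 0
1 22 10 0
1 22 30 0
1 30 30 0
1 30 10 0
1 22 10 0
2  .  . 0
2 10 30 0
2 10 50 0
2 30 50 0
2 30 30 0
2 10 30 0
3  .  . 1
3 22 48 1
3 28 48 1
3 28 42 1
3 22 42 1
3 22 48 1
end

gen long seq  = _n                              // original record order
gen int  poly = sum(missing(_X) & missing(_Y))  // running polygon counter

* First record of each polygon must be the missing pair (.,.)
bysort poly (seq): assert missing(_X) & missing(_Y) if _n == 1
* Records 2 onward must all carry valid coordinates
bysort poly (seq): assert !missing(_X) & !missing(_Y) if _n > 1
* Last record must replicate the first node (record 2 of the polygon)
bysort poly (seq): assert _X == _X[2] & _Y == _Y[2] if _n == _N
```

If any polygon violated one of these rules, the corresponding assert would stop with an error.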

Boundary dataset: mif2dta

mif2dta is a simple Stata program that converts MapInfo Interchange Format boundary files into Stata boundary datasets

mif2dta converts any given pair of files rootname.mif and rootname.mid into a new pair of Stata datasets: rootname-Coordinates.dta (the boundary dataset) and rootname-Database.dta (the master dataset)

Optionally, mif2dta also computes the coordinates of the centroids of the geographical areas of interest

Boundary dataset: shapefiles

To convert a shapefile into a Stata boundary dataset:
– Convert the shapefile of interest into the proper pair of MIF files (e.g., using the freeware DOS program shp2mif.exe)
– Use mif2dta to convert the MIF files into the corresponding Stata master and boundary datasets

tmap choropleth

tmap choropleth represents the spatial distribution of area data by means of choropleth maps, i.e., maps where each of n sub-areas is colored (or shaded) according to a discrete scale based on the value taken on by a quantitative variable of interest in that sub-area

tmap choropleth

The number of classes that make up the discrete scale must be between 2 and 9

The corresponding class breaks can be based on four different criteria:
– Quantiles
– Equal intervals
– Standard deviates
– Custom
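
To make the first two criteria concrete, the sketch below classifies the turnout values from the master dataset example into four classes, once by quantiles and once by equal intervals. It relies on built-in commands only and makes no claim to reproduce tmap's own classification code. The variable names q_class and e_class are hypothetical.

```stata
* Sketch: quantile vs. equal-interval classes for the turnout example
clear
input float turnout
62.6
57.1
61.3
68.7
64.6
67.0
56.7
62.1
55.5
66.5
56.4
70.6
59.6
56.5
53.8
68.1
end

* Quantiles: four classes holding roughly equal numbers of areas
xtile q_class = turnout, nquantiles(4)

* Equal intervals: four classes of equal width between min and max
summarize turnout, meanonly
local lo    = r(min)
local width = (r(max) - r(min)) / 4
gen byte e_class = min(4, 1 + floor((turnout - `lo') / `width'))

tabulate q_class e_class
```

The cross-tabulation shows how differently the two criteria can assign the same areas to classes.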

tmap choropleth: example

tmap choropleth spd2, ///
    id(id) map(Germany-Coordinates.dta) ///
    clmethod(quantile) clnumber(4) ///
    palette(Reds) ///
    title("Pct. votes for SPD at latest election") ///
    subtitle("(Two-party share)") ///
    legpos(5)

tmap propsymbol

tmap propsymbol represents the spatial distribution of area data by means of proportional symbol maps, i.e., maps where the value taken on by a quantitative variable of interest in each of n sub-areas is represented by a symbol whose size is proportional to the value itself

tmap propsymbol: example

tmap propsymbol spd2, ///
    x(x_coord) y(y_coord) ///
    map(Germany-Coordinates.dta) ///
    scolor(red) sshape(O) ssize(1.2) ///
    ocolor(white) fcolor(sand) ///
    title("Pct. votes for SPD at latest election") ///
    subtitle("(Two-party share)") ///
    note("Symbol size proportional to variable value", span)

tmap deviation

tmap deviation represents the spatial distribution of area data by means of deviation maps, a particular kind of proportional symbol maps where:
– symbol size expresses the absolute deviation of the quantitative variable of interest from its mean or median
– symbol fill expresses the sign of the deviation (positive or negative)
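
The two quantities that such a map encodes can be computed directly. The sketch below does so for the first four turnout values of the master dataset example, using the mean as the reference point. The variable names absdev and above are hypothetical, and the code only illustrates the definition; it is not taken from tmap.

```stata
* Sketch: the two ingredients of a deviation map
clear
input byte id float turnout
1 62.6
2 57.1
3 61.3
4 68.7
end

summarize turnout, meanonly
local m = r(mean)
gen double absdev = abs(turnout - `m')   // would drive symbol size
gen byte   above  = turnout > `m'        // 1 = positive deviation
list, clean noobs
```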

tmap deviation: example

tmap deviation spd2, ///
    x(x_coord) y(y_coord) ///
    map(Germany-Coordinates.dta) ///
    scolor(blue) sshape(O) ssize(1.2) ///
    ocolor(white) fcolor(bluishgray) ///
    title("Pct. votes for SPD at latest election") ///
    subtitle("(Two-party share)") ///
    note(`"`"Solid circles denote positive deviations from the mean"'"' ///
         `"`"Hollow circles denote negative deviations from the mean"'"' ///
         `"`"Circle size proportional to absolute value of deviation"'"', ///
         span)

tmap label

tmap label is an auxiliary program that allows the user to superimpose onto a base map the values taken on by a numeric or string variable at different locations

This program can be used, for example, to plot sub-area names or to represent the spatial distribution of a given quantitative variable of interest in numeric form

tmap label: example

tmap label state, ///
    x(x_coord) y(y_coord) ///
    map(Germany-Coordinates.dta) ///
    lcolor(red) llength(30) lsize(0.8) ///
    ocolor(sienna) fcolor(eggshell) ///
    title("Länder")

tmap dot

tmap dot represents the spatial distribution of point data by means of dot maps, i.e., maps where the locations at which some “events” of interest have occurred are indicated by symbols whose color and/or shape can vary according to the type of “event”

tmap dot: example

tmap dot, ///
    x(x) y(y) ///
    map("MilanoOutline-Coordinates.dta") ///
    by(type) ///
    fcolor(stone) ///
    title("Location of police stations") ///
    subtitle(`"`"Milano, 2004"'"') ///
    legtitle("Police force", size(*0.7)) ///
    legbox(lc(black))

tmap choropleth: more examples

tmap choropleth murder if conterminous, ///
    id(id) map(Us48-Coordinates.dta) ///
    palette(Blues) ocolor(white) bcolor(navy) ///
    title(`"`"Murders per 100,000 population & Pct. pop. with high school diploma"'"', ///
          color(white) span) subtitle("United States 1994", color(white)) ///
    legbox(lc(white) fc(navy) margin(medsmall)) ///
    legpos(5) legcol(white) ///
    legtitle("Murder rate", color(white) size(*0.8)) ///
    addplot(deviation hsdip if conterminous, x(x) y(y) sc(red) ssi(0.8)) ///
    note(`"`"Circles represent pct. pop. with high school diploma"'"' ///
         `"`"Solid circles denote positive deviations from the mean"'"' ///
         `"`"Hollow circles denote negative deviations from the mean"'"' ///
         `"`"Circle size proportional to absolute value of deviation"'"', ///
         color(white) span)

tmap choropleth: more examples

tmap choropleth winner if conterminous, ///
    id(id) map(Us48-Coordinates.dta) ///
    clmethod(unique) ///
    palette(Custom) colors(`"`"203 24 29"'"' navy) ///
    title(US Presidential Elections 2004) ///
    subtitle(Pct. votes for Bush) ///
    legpos(5) legsize(1.2) ///
    legtitle("Winner", size(*0.8)) legcount ///
    addplot(label votebushpct if conterminous, ///
            x(x) y(y) lc(gs14) ls(0.9))

tmap and spatial data analysis

tmap can be used to display results produced by other Stata programs, e.g., spatial data analysis programs

What follows is an example of a choropleth map + LISA cluster map (Anselin 1995) created using a combination of tmap and a modified version of spatlsa

[Figure: European Parliament Elections 2004 - Lombardia. Pct. votes for Northern League]

Acknowledgments

The color schemes used in tmap choropleth were designed by Dr. Cynthia A. Brewer, Department of Geography, The Pennsylvania State University, University Park, Pennsylvania, USA. The color schemes are used with Dr. Brewer’s permission and are from the ColorBrewer map design tool available at ColorBrewer.org

Acknowledgments

I wish to thank Nick Cox and Ian S. Evans for helping improve the first release of tmap

The second release owes much to ideas and suggestions by Nick Cox and Vince Wiggins

Any remaining errors and limitations are mine