Statistical Peril in the Transportation Planning Polygon

Kevin Hathaway, Colin Smith, & John Gliebe

May 2013



Aggregated Data – A Planning Reality & A Planning Problem

Aggregation units are required since traffic analysis zones are the convenient grouping scheme for regional and statewide transportation planning.

Zone-level variables are both consumed on their own and used as inputs to travel demand and land use allocation models, with the assumption that the groupings are real and fixed.

The fundamentals of spatial analysis and statistical sampling error are commonly ignored, which can have undesirable consequences.

Modifiable Areal Unit Problem: The Zone Effect

The sizes and shapes of planning zones are modifiable and arbitrary (they rarely represent real geographical properties or segment the population in a meaningful way).

Changing the polygon boundaries can drastically change the zonal statistics (e.g., gerrymandering).
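To make the zone effect concrete, here is a minimal sketch in Python. The nine-block corridor, its 0/1 values, and the two zoning schemes are hypothetical and chosen only for illustration; they are not data from the presentation.

```python
# Nine equal-population blocks along a corridor (hypothetical data).
# 1 = block has the attribute of interest, 0 = it does not.
blocks = [1, 1, 0, 0, 1, 0, 0, 1, 1]

def zone_shares(values, boundaries):
    """Share of 1s in each zone; boundaries are (start, end) index pairs."""
    return [sum(values[a:b]) / (b - a) for a, b in boundaries]

# Two ways of grouping the same blocks into three zones.
scheme_a = [(0, 3), (3, 6), (6, 9)]
scheme_b = [(0, 2), (2, 7), (7, 9)]

shares_a = zone_shares(blocks, scheme_a)
shares_b = zone_shares(blocks, scheme_b)
regional_share = sum(blocks) / len(blocks)

print("regional share:", round(regional_share, 3))
print("scheme A zone shares:", [round(s, 3) for s in shares_a])
print("scheme B zone shares:", [round(s, 3) for s in shares_b])
```

The regional share is identical under both schemes because the blocks never change; only the boundaries do, and the zone-level picture shifts with them.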

The scale of the zones will also change the results.

As the polygons get bigger and the underlying population grows, variability is washed away.

As the polygons get smaller and the underlying population shrinks, we are more likely to observe extreme (and perhaps unreliable) values.

When we mix scales in a planning region, both statistical properties will be present.
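The claim about extreme values in small zones can be checked with an exact binomial calculation. The sketch below assumes every zone shares the same true rate (0.2 here) and asks how often sampling variation alone pushes the observed share to 10% or below, or to 30% or above. The rate, thresholds, and zone sizes are illustrative assumptions.

```python
from math import comb

def prob_extreme(n, p, low, high):
    """Exact binomial probability that the observed share x/n is <= low or >= high."""
    total = 0.0
    for x in range(n + 1):
        share = x / n
        if share <= low or share >= high:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total

# Same true rate everywhere; only the zone population changes.
for n in (10, 50, 500):
    print(n, prob_extreme(n, 0.2, 0.10, 0.30))
```

With identical underlying behavior, the small zones look extreme far more often than the large ones, which is the scale effect in its simplest form.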

Modifiable Areal Unit Problem: The Scale Effect

Normalizing a Layer’s Attribute - ArcMAP

Show a map with

[Figure: New York State Housing Units, Block-level: Units per Person. Labels: Housing Units per Person; Population in Block]

Two Ways to View the Distribution

Standard Errors for Averages and Proportions
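The slide's own formulas did not survive in the transcript. As a reminder, the textbook standard errors for a zone mean and a zone proportion are given below, where n is the number of observations in the zone, sigma is the population standard deviation, and p is the underlying proportion. The worked numbers are illustrative and not taken from the presentation.

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}},
\qquad
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p\,(1-p)}{n}}

% Worked example with p = 0.2:
%   n = 25:    SE = \sqrt{0.16 / 25}   = 0.4 / 5  = 0.08
%   n = 2500:  SE = \sqrt{0.16 / 2500} = 0.4 / 50 = 0.008
```

A hundredfold increase in the base size shrinks the standard error only tenfold, so small zones carry far more noise than their share of the map suggests.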


Start with One Polygon

Simulated polygon with population of orange and grey squares.

Color locations are randomly assigned

20.2% of the zone is orange.

Cut the polygon up and measure the orange within each smaller polygon.
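The experiment described above can be sketched in a few lines of Python. The grid size (60 x 60), the random seed, and the tile sizes are assumptions made for this illustration; only the roughly 20.2% orange share comes from the slide.

```python
import random
from statistics import pstdev

def make_polygon(side, share, seed=0):
    """Build a side x side grid of 0/1 cells with the 1s (orange) placed at random."""
    rng = random.Random(seed)
    n_orange = round(side * side * share)
    cells = [1] * n_orange + [0] * (side * side - n_orange)
    rng.shuffle(cells)
    return [cells[r * side:(r + 1) * side] for r in range(side)]

def tile_shares(grid, tile):
    """Cut the grid into tile x tile squares and return the orange share in each."""
    side = len(grid)
    shares = []
    for r0 in range(0, side, tile):
        for c0 in range(0, side, tile):
            orange = sum(grid[r][c]
                         for r in range(r0, r0 + tile)
                         for c in range(c0, c0 + tile))
            shares.append(orange / (tile * tile))
    return shares

grid = make_polygon(side=60, share=0.202)
for tile in (30, 10, 5, 2):
    shares = tile_shares(grid, tile)
    # cells per tile, number of tiles, min share, max share, spread of shares
    print(tile * tile, len(shares), min(shares), max(shares), pstdev(shares))
```

Because the colors are assigned at random, any differences between tiles are pure sampling variation, yet the spread of tile shares widens as the tiles get smaller.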

Look at size before location

• Always plot your statistic against its own denominator.

• Funnel or cone shapes indicate you may have a scale effect playing a role.
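One way to act on this advice without a plot is to compare each zone's share with funnel limits built around the pooled regional share. The sketch below uses a normal approximation to the binomial, which is rough for very small denominators; the zone names, counts, and the z value of 1.96 are hypothetical choices for illustration.

```python
from math import sqrt

def funnel_limits(p_region, n, z=1.96):
    """Approximate limits for a zone share with denominator n under the regional rate."""
    se = sqrt(p_region * (1 - p_region) / n)
    return max(0.0, p_region - z * se), min(1.0, p_region + z * se)

def flag_zones(zones, z=1.96):
    """zones: list of (name, count, denominator). Returns one row per zone."""
    p_region = sum(c for _, c, _ in zones) / sum(n for _, _, n in zones)
    rows = []
    for name, count, n in zones:
        lower, upper = funnel_limits(p_region, n, z)
        share = count / n
        outside = not (lower <= share <= upper)
        rows.append((name, n, share, lower, upper, outside))
    return rows

# Hypothetical zones: (name, count with attribute, population)
zones = [("A", 4, 10), ("B", 30, 150), ("C", 450, 2000), ("D", 90, 300)]
for row in flag_zones(zones):
    print(row)
```

In this made-up example, zone A has the highest share (40%) but sits inside its wide limits, while zone D (30%) falls outside its narrower ones. Ranking zones by the raw statistic alone would point to the wrong zone.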

More on Scale – Conventional Guidance on TAZ size

According to AASHTO:

“…, it is strongly suggested that TAZs should be delineated with a resident or worker population of 1,200 or greater.”

Land Use Model Inputs

[Figure labels: Employment Density (jobs/acre); Non-residential Developed Acres]

Rates of Seatbelt Use Across a State

[Figure label: Road Segment Daily Volume]

What should you do?

1. Resist the temptation to explain all the spatial and temporal variability.

2. For TAZ delineation, optimization routines and explicit testing of alternative zone structures have been proposed (Ding, 1998; Viegas et al., 2007).

3. Run simulations on your own planning units to explore the severity of the zone and scale effects. The impacts depend on the measures and the specific region under study.

More Tactical Adjustments during Data Exploration

1. Binomial data with small n: estimators in the spirit of the Law of Succession (Laplace, Wilson, or Jeffreys) help stabilize small-sample proportions.

2. For zone-level means, you can center the distribution by using the regional mean as the expected value.
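Both adjustments can be sketched briefly. The three proportion estimators below are the standard point estimates associated with the Laplace, Jeffreys, and Wilson approaches. The `shrink_to_region` helper is one possible reading of the second item, not the authors' stated method: it pulls a zone mean toward the regional mean with a hypothetical weight `k`, which acts like a number of pseudo-observations.

```python
def laplace(x, n):
    """Laplace's rule of succession: add one success and one failure."""
    return (x + 1) / (n + 2)

def jeffreys(x, n):
    """Posterior mean under a Jeffreys Beta(0.5, 0.5) prior."""
    return (x + 0.5) / (n + 1)

def wilson(x, n, z=1.96):
    """Center of the Wilson score interval."""
    return (x + z * z / 2) / (n + z * z)

def shrink_to_region(zone_mean, n, regional_mean, k=30):
    """Weighted average of the zone and regional means (k is an assumed tuning value)."""
    return (n * zone_mean + k * regional_mean) / (n + k)

# A zone where 5 of 5 observations have the attribute: the raw estimate is 100%.
x, n = 5, 5
print("raw:", x / n)
print("Laplace:", laplace(x, n))
print("Jeffreys:", jeffreys(x, n))
print("Wilson:", wilson(x, n))

# A small zone with an unusually high mean is pulled toward the regional mean.
print("shrunk mean:", shrink_to_region(zone_mean=10.0, n=5, regional_mean=4.0))
```

All three proportion estimators keep a 5-of-5 zone below 100%, and their effect fades as n grows, so large zones are left essentially unchanged.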

Mapping Polygon Values

[Figures: 2008 Presidential Election Results maps. Credit: Mark Newman, University of Michigan]

Cartograms in ArcGIS & R

Recap and Final Thoughts

1. Rather than ignoring sampling variation, we should recognize its presence.

2. Rather than only asking whether the observed differences are a function of location or polygon-specific attributes, consider that some or most of the differences could merely be a function of the base size and your zonal delineation.

3. Real variation due to the underlying spatial phenomenon is often blurred by our unit of analysis. Both aggregation and disaggregation create problems; our job is to understand the trade-offs.

4. The least densely populated zones are sometimes the largest. The use of thematic mapping has an unfortunate consequence of overemphasizing large units and minimizing small ones. Consider alternatives that are more honest in their visual representation.

Sources & Further Reading

1. Statistics for Spatial Data. Noel A. Cressie. 1993.

2. Spatial Modeling of Regional Variables. Noel Cressie and Ngai H. Chan. 1986.

3. The Most Dangerous Equation. Howard Wainer. 2007.

4. Diffusion-based method for producing density-equalizing maps. Michael T. Gastner and M. E. J. Newman. 2004.

5. Effects of the modifiable areal unit problem on the delineation of traffic analysis zones. Viegas, et al. 2007.

6. The GIS-Based Human Interactive TAZ Design Algorithm: Examining the Impacts of Data Aggregation on Transportation Planning Analysis. Ding, C. 1998.

7. When 100% Really Isn’t 100%: Improving the Accuracy of Small-Sample Estimates of Completion Rates. James Lewis and Jeff Sauro. 2006.

Kevin.Hathaway@rsginc.com
