Data-Driven Parking - Information School

Post on 10-Jun-2020


Transcript

$490 million

Data-Driven Parking

$1200

1. Statistics from: inrix.com/press-releases/parking-pain-us/

Project Sponsor: City of Seattle

Parking searches cost each driver 58 hours and $1,200 in wasted time, fuel, and emissions every year.

To reduce parking pains, the city needs better information:

Actual parking occupancy within on-street paid zones

Drivers’ parking behaviors and variables that are related to parking

PHOTO CREDIT: FLICKR PHOTO/ORAN VIRIYINCY

in Seattle’s Belltown North neighborhood

yearly cost to each driver

Define Problem & Research Domain

Deliver Results & Next Steps

Extract Data to Cloud Database

Clean Data & Conduct EDA

Develop Models & Engineer Features

Results

Calibrate transaction data against manual survey results and build statistical models to predict parking occupancy across Seattle

Parking searches waste time and money.

2019 Capstone Project by:

Allison Chapman • Shreya Sabharwal • Sahil Aggarwal • Nathan Cunningham

58 hrs/yr

This adds up to $490 million across the entire city of Seattle (see note 1).

[Figure: Transaction data differs from manual survey data throughout the day. X-axis: Time (8:30am, 12pm, 4pm, 7:30pm). Y-axis: Average Occupied Spots (0 to 7).]

Information Assets & Challenges

Manual parking survey to learn actual parking occupancy—only completed once a year

Pay station and pay-by-mobile transactions recorded each minute; these don't account for legal permit use, illegal parking, drivers vacating spots early, etc.

Datasets about weather, events, employment, and nearby businesses

Seattle Open Data

Other Sources
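As a rough illustration of how transaction records can be turned into a "paid occupancy" signal, the sketch below counts how many paid sessions cover each time step. The record layout (a start time and a paid-until time), the function name, and the sample values are assumptions made for illustration; they are not the project's actual schema or data. As noted above, a count like this still misses permit use, illegal parking, and early departures, which is why it needs calibrating against the manual survey.

```python
from datetime import datetime, timedelta


def paid_occupancy(transactions, start, end, step_minutes=1):
    """Count the paid sessions that cover each time step in [start, end).

    transactions: iterable of (session_start, paid_until) datetime pairs
    (a hypothetical layout, not the City of Seattle schema).
    """
    counts = {}
    t = start
    while t < end:
        # A session covers time t if it has begun and has not yet expired.
        counts[t] = sum(1 for begin, expire in transactions if begin <= t < expire)
        t += timedelta(minutes=step_minutes)
    return counts


# Made-up transactions for a single block face.
transactions = [
    (datetime(2019, 5, 1, 9, 0), datetime(2019, 5, 1, 10, 0)),
    (datetime(2019, 5, 1, 9, 30), datetime(2019, 5, 1, 11, 0)),
    (datetime(2019, 5, 1, 10, 15), datetime(2019, 5, 1, 10, 45)),
]

occupancy = paid_occupancy(
    transactions,
    start=datetime(2019, 5, 1, 9, 0),
    end=datetime(2019, 5, 1, 11, 0),
    step_minutes=15,
)

for moment, count in occupancy.items():
    print(moment.strftime("%H:%M"), count)
```

Dividing each count by the number of spaces on the block face would give an occupancy fraction that can be compared with survey counts.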

Random Forest model performed best. Holdout set score: 0.17 RMS Error

Our Approach

Top variables related to parking:

Lagged parking occupancy

Hour of the day

Types of businesses within two blocks (see note 2)

Proximity to downtown Seattle

2. We split businesses by type. Top-performing categories include medical offices, grocery stores, and bars and restaurants.
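Lagged parking occupancy means using earlier observations of occupancy as inputs when predicting the current value. The sketch below shows one simple way to build such features from an hourly series. The function name, the choice of lags (1 and 2 hours), and the sample numbers are illustrative assumptions, not the features the project actually engineered.

```python
def add_lagged_features(series, lags=(1, 2)):
    """Build one feature row per observation that has enough history.

    series: list of (hour, occupancy) tuples in time order.
    lags: how many steps back to look (hypothetical choice).
    """
    rows = []
    max_lag = max(lags)
    # The first max_lag observations are skipped: they lack full history.
    for i in range(max_lag, len(series)):
        hour, occupancy = series[i]
        row = {"hour": hour, "occupancy": occupancy}
        for lag in lags:
            row[f"occupancy_lag_{lag}"] = series[i - lag][1]
        rows.append(row)
    return rows


# Made-up hourly occupancy fractions for one block face.
hourly = [(8, 0.35), (9, 0.50), (10, 0.62), (11, 0.70), (12, 0.81)]
rows = add_lagged_features(hourly)

for row in rows:
    print(row)
```

Each row pairs the target (`occupancy`) with the hour of the day and the occupancy observed one and two steps earlier, which is the shape of input a regression model such as a random forest expects.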

Final Steps

Give the City of Seattle full documentation of our approach to inform their parking policy decisions

Model Performance on Validation Set

Root Mean Squared Error (lower is better)

Model               With Lagged Features    Without Lagged Features
Random Forest       0.1705                  0.2853
XGBoost             0.1715                  0.2989
Linear Regression   0.1740                  0.3167

In other words, parking occupancy predictions are typically off by about 17 percentage points (a root-mean-square figure, so large misses count more heavily).
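To make the metric concrete, here is a minimal sketch of how root mean squared error is computed when occupancy is expressed as a fraction between 0 and 1. The sample values are made up for illustration and are not the project's data; mean absolute error is shown alongside only for comparison.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error, for comparison with RMSE."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up occupancy fractions (1.0 = every space occupied).
actual = [0.50, 0.80, 0.30, 0.90]
predicted = [0.60, 0.60, 0.30, 1.00]

print(f"RMSE: {rmse(actual, predicted):.4f}")
print(f"MAE:  {mae(actual, predicted):.4f}")
```

Because errors are squared before averaging, the single 20-point miss pulls RMSE above the mean absolute error, so an RMSE of 0.17 is best read as a typical error size rather than a plain average.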
