Data-Driven Parking - Information School
Post on 10-Jun-2020
$490 million
Data-Driven Parking
$1200
1. Statistics from: inrix.com/press-releases/parking-pain-us/
Project Sponsor: City of Seattle
Parking searches cost each driver 58 hours and $1200 in wasted time, fuel, and emissions every year.
To reduce parking pains, the city needs better information:
Actual parking occupancy within on-street paid zones
Drivers’ parking behaviors and variables that are related to parking
PHOTO CREDIT: FLICKR PHOTO/ORAN VIRIYINCY
in Seattle’s Belltown North neighborhood
yearly cost to each driver
Define Problem & Research Domain
Deliver Results & Next Steps
Extract Data to Cloud Database
Clean Data & Conduct EDA
Develop Models & Engineer Features
Results
Calibrate transaction data based on manual survey results and build statistical models to predict parking occupancy across Seattle
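One way to picture the calibration step is a per-time-slot scaling factor. The sketch below is only an illustration, not necessarily the team's method: the ratio-based approach, the function names `calibration_factors` and `calibrate`, and all numbers are assumptions made for this example.

```python
def calibration_factors(survey, estimated):
    """Ratio of surveyed occupancy to transaction-based occupancy per time slot.

    Hypothetical approach: slots with no survey data or a zero estimate are skipped.
    """
    factors = {}
    for slot, observed in survey.items():
        est = estimated.get(slot, 0)
        if est > 0:
            factors[slot] = observed / est
    return factors


def calibrate(estimated, factors, default=1.0):
    """Scale each transaction-based estimate by its slot's factor (or `default`)."""
    return {slot: est * factors.get(slot, default) for slot, est in estimated.items()}


# Made-up numbers: average occupied spots per time slot
survey = {"8:30am": 3.0, "12pm": 6.0}                  # manual survey counts
estimated = {"8:30am": 2.0, "12pm": 4.0, "4pm": 5.0}   # derived from transactions

factors = calibration_factors(survey, estimated)
calibrated = calibrate(estimated, factors)
```

Slots that the survey did not cover (here `"4pm"`) keep their original estimate, which is one simple way to handle a survey that is only completed once a year.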
Parking searches waste time and money.
2019 Capstone Project by: Allison Chapman • Shreya Sabharwal • Sahil Aggarwal • Nathan Cunningham
58 hrs/yr
This adds up to $490 million across the entire city of Seattle.¹
Figure: Transaction data differs from manual survey data throughout the day (x-axis: Time, 8:30am to 7:30pm; y-axis: Average Occupied Spots, 0 to 7)
Information Assets & Challenges
Manual parking survey to learn actual parking occupancy—only completed once a year
Pay station and pay-by-mobile transactions recorded each minute; these don't account for legal permit use, illegal parking, drivers vacating spots early, etc.
Datasets about weather, events, employment, and nearby businesses
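To illustrate how transaction records can be turned into an occupancy estimate, the sketch below counts paid sessions that are active at a given moment. The `(start, paid_minutes)` record layout, the function name `paid_occupancy`, and the sample transactions are assumptions for this example, not the actual Seattle data schema.

```python
from datetime import datetime, timedelta


def paid_occupancy(transactions, at):
    """Count paid sessions active at time `at`.

    Each transaction is a hypothetical (start, paid_minutes) tuple.
    A session covers the half-open interval [start, start + paid_minutes).
    """
    return sum(
        1
        for start, paid_minutes in transactions
        if start <= at < start + timedelta(minutes=paid_minutes)
    )


# Made-up transactions for a single block face
day = datetime(2019, 5, 1)
transactions = [
    (day.replace(hour=9, minute=0), 120),   # paid 9:00 to 11:00
    (day.replace(hour=10, minute=30), 60),  # paid 10:30 to 11:30
    (day.replace(hour=11, minute=15), 30),  # paid 11:15 to 11:45
]

occupied_at_1045 = paid_occupancy(transactions, day.replace(hour=10, minute=45))
occupied_at_noon = paid_occupancy(transactions, day.replace(hour=12, minute=0))
```

A count like this reflects paid time only, which is why it can diverge from the manual survey: permit holders, illegal parkers, and drivers who leave early are invisible to it.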
Seattle Open Data
Other Sources
Random Forest model performed best. Holdout set score:
Our Approach
Top variables related to parking:
Lagged parking occupancy
Hour of the day
Types of businesses within two blocks²
Proximity to downtown Seattle
2. We split businesses by type. Top-performing categories include medical offices, grocery stores, and bars and restaurants.
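Lagged parking occupancy, the first of the top variables, means using an earlier occupancy reading as a predictor of the current one. A minimal sketch of building such a feature follows; the one-step lag, the row layout, the function name `add_lagged_feature`, and the sample values are all assumptions for illustration.

```python
def add_lagged_feature(series, lag=1):
    """Pair each observation with the occupancy `lag` steps earlier.

    `series` is a time-ordered list of (hour, occupancy) tuples.
    The first `lag` observations are dropped because they have no earlier value.
    """
    rows = []
    for i in range(lag, len(series)):
        hour, occupancy = series[i]
        _, lagged = series[i - lag]
        rows.append(
            {"hour": hour, "lagged_occupancy": lagged, "occupancy": occupancy}
        )
    return rows


# Made-up hourly occupancy rates (fraction of spots occupied)
hourly = [(8, 0.20), (9, 0.45), (10, 0.60), (11, 0.75)]
rows = add_lagged_feature(hourly, lag=1)
```

Each resulting row carries both the hour of the day and the lagged occupancy, so it could be fed to any regression model as two of its input features.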
0.17 RMS Error
Final Steps
Give the City of Seattle full documentation of our approach to inform its parking policy decisions
Model Performance on Validation Set
Root Mean Squared Error (lower is better)

Model               With Lagged Features   Without Lagged Features
Random Forest       0.1705                 0.2853
XGBoost             0.1715                 0.2989
Linear Regression   0.1740                 0.3167
In other words, parking occupancy predictions are typically off by about 17 percentage points.
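For reference, root mean squared error can be computed as in the short sketch below. The occupancy values are made up for illustration and are not the project's data; the function name `rmse` is likewise just a label for this example.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up occupancy rates (fraction of spots occupied)
actual = [0.50, 0.80, 0.30, 0.60]
predicted = [0.60, 0.60, 0.30, 0.80]

error = rmse(actual, predicted)  # about 0.15, i.e. 15 percentage points
```

Because the errors are squared before averaging, large misses weigh more heavily than small ones, so RMSE is never smaller than the plain average of the absolute errors.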