Rent Surveys Web scraping to provide timely rental data Created by: Graham MacDonald Presented by: Rob Pitingolo NNIP Partnership Meeting, June 2013

Mar 29, 2015

Page 1

Rent Surveys

Web scraping to provide timely rental data

Created by: Graham MacDonald

Presented by: Rob Pitingolo

NNIP Partnership Meeting, June 2013

Page 2

Why do this?

Survey every rental housing unit listed online (n = potentially thousands)

Collect valuable information about neighborhood rents

Precise listing locations allow for indicators at small geographies

Page 3

What is PadMapper?

A “meta-site” that regularly draws rental data from top online listing services (and its own listing service)

Makes the search process easier through simple filters and a Google map interface

Page 4

What is web scraping?

Web scraping uses a program (often written in Python) to extract data from websites and put it in a standard structured format. We scrape data weekly.
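To illustrate the idea, here is a minimal sketch that uses only Python's standard library. The HTML snippet, the `listing` class name, and the `data-price` / `data-bedrooms` attributes are hypothetical; they are not PadMapper's actual markup, and this is not the code used for the survey.

```python
from html.parser import HTMLParser

# Hypothetical markup; real listing pages will look different.
SAMPLE_HTML = """
<ul>
  <li class="listing" data-price="1450" data-bedrooms="1">Apartment near park</li>
  <li class="listing" data-price="2300" data-bedrooms="2">Row house unit</li>
  <li class="ad">Sponsored</li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collect one record per <li class="listing"> element."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Skip anything that is not a listing (for example, ads).
        if tag == "li" and attrs.get("class") == "listing":
            self.records.append({
                "price": int(attrs["data-price"]),
                "bedrooms": int(attrs["data-bedrooms"]),
            })


def scrape(html):
    """Turn raw HTML into a list of structured listing records."""
    parser = ListingParser()
    parser.feed(html)
    return parser.records


records = scrape(SAMPLE_HTML)
```

Each record is a plain dictionary, so the result can be written straight to a CSV file or a database table.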

Page 5

What can the data tell us?

List prices for apartments listed online…

…but not rentals that never make it to the web (or don’t get listed at all).

Page 6

How Inclusive Is PadMapper?

Ward | Renter-Occupied Units | PadMapper Listings | Listings per 100 Renter-Occupied Units | PctUnder18 | PctWhiteNH | PctBlackNH | PctPoorPersons | Pct16OverEmployed | AvgFamilyIncome
2 | 24,539 | 2,263 | 9.2 | 4.8 | 70 | 9.8 | 15 | 67 | 205,343
6 | 19,234 | 1,610 | 8.4 | 14 | 47 | 43 | 18 | 67 | 115,992
3 | 17,931 | 1,486 | 8.3 | 13 | 78 | 5.6 | 7.1 | 67 | 257,241
1 | 22,435 | 1,491 | 6.6 | 12 | 40 | 33 | 16 | 71 | 94,197
5 | 15,447 | 829 | 5.4 | 17 | 15 | 77 | 19 | 54 | 78,559
4 | 11,843 | 634 | 5.4 | 20 | 20 | 59 | 9.9 | 61 | 116,668
8 | 20,071 | 413 | 2.1 | 30 | 3.2 | 94 | 34 | 48 | 44,341
7 | 17,255 | 249 | 1.4 | 24 | 1.5 | 95 | 27 | 47 | 54,809

Over a 12-week period from March 14 to May 31, 2013, there tended to be more listings per renter-occupied unit in higher-income wards with a smaller share of residents under 18.
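The "Listings per 100 Renter-Occupied Units" column can be reproduced from the two count columns. The sketch below uses the ward figures from the table; the function and variable names are illustrative only.

```python
# (ward, renter-occupied units, PadMapper listings), taken from the table above
WARDS = [
    (2, 24539, 2263),
    (6, 19234, 1610),
    (3, 17931, 1486),
    (1, 22435, 1491),
    (5, 15447, 829),
    (4, 11843, 634),
    (8, 20071, 413),
    (7, 17255, 249),
]


def listings_per_100(units, listings):
    """Listings per 100 renter-occupied units, rounded to one decimal."""
    return round(100 * listings / units, 1)


rates = {ward: listings_per_100(units, listings) for ward, units, listings in WARDS}

for ward, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"Ward {ward}: {rate} listings per 100 renter-occupied units")
```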

Page 7

How Inclusive Is PadMapper?

(Weighted average, based on the number of points in each tract.)
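A weighted average of this kind can be sketched as follows, assuming each tract contributes its indicator value in proportion to the number of listing points that fall inside it. The tract values below are hypothetical and exist only to show the arithmetic.

```python
def weighted_average(values, weights):
    """Average of `values`, weighting each by the matching entry in `weights`."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("no listing points to weight by")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical tracts: (indicator value, number of listing points in the tract)
tracts = [(40000, 10), (80000, 30), (120000, 60)]

avg_income = weighted_average(
    [value for value, _ in tracts],
    [points for _, points in tracts],
)
```

Tracts with many listings pull the average toward their own value, so the result describes the neighborhoods the listings come from rather than the city as a whole.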

Page 8

What did we find so far?

General Council Ward-level price trends

Page 9

What did we find so far?

It is difficult to get enough observations for 3-bedroom apartments.

Page 10

Goals for the future

Use larger time periods for smaller geographies. Currently, we still need more data for the D.C. Neighborhood Cluster level, especially for 3-bedroom units.

[Figure: Median Price for a 1-Bedroom Apartment by Neighborhood Cluster, Washington, D.C. Horizontal axis: neighborhood clusters 1 to 39; vertical axis: median price, $0 to $3,000. Based on 12 weekly data collection points, March 14 to May 31, 2013. Legend: more than 9 observations; fewer than 10 observations.]
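The chart distinguishes clusters with at least 10 observations from those with fewer. A minimal sketch of that calculation is shown here, assuming listings have already been assigned to a neighborhood cluster; the sample listings and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import median

MIN_OBSERVATIONS = 10  # threshold taken from the chart legend


def median_rent_by_cluster(listings, min_obs=MIN_OBSERVATIONS):
    """Summarize (cluster_id, monthly_rent) pairs by cluster.

    Returns the median rent, the observation count, and a flag that says
    whether the cluster meets the minimum number of observations.
    """
    rents = defaultdict(list)
    for cluster, rent in listings:
        rents[cluster].append(rent)
    return {
        cluster: {
            "median": median(values),
            "n": len(values),
            "enough_data": len(values) >= min_obs,
        }
        for cluster, values in sorted(rents.items())
    }


# Hypothetical listings: cluster 1 has 12 observations, cluster 2 has 3.
sample = [(1, 1500 + 10 * i) for i in range(12)]
sample += [(2, 2000), (2, 2200), (2, 2400)]

summary = median_rent_by_cluster(sample)
```

Pooling more weeks of data raises `n` for each cluster, which is why longer time periods help at smaller geographies.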

Page 11

Goals for the future

Use larger time periods for smaller geographies. Currently, we still need more data for the D.C. Neighborhood Cluster level, especially for 3-bedroom units.

[Figure: Median Price for a 3-Bedroom Apartment by Neighborhood Cluster, Washington, D.C. Horizontal axis: neighborhood clusters 1 to 39; vertical axis: median price, $0 to $6,000. Based on 12 weekly data collection points, March 14 to May 31, 2013. Legend: more than 9 observations; fewer than 10 observations.]

Page 12

How does it actually work?

Step 1: Download data from web API to offline database

Step 2: Use ArcGIS to geocode lat/long data to local geographies

Step 3: Use statistical software to analyze your rent survey
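Step 1 could be sketched roughly as below, assuming the API returns a JSON array of listing objects. The endpoint URL and the field names (`id`, `lat`, `lng`, `bedrooms`, `price`) are hypothetical placeholders, not a documented PadMapper interface, and SQLite stands in for whichever offline database is used.

```python
import json
import sqlite3
from urllib.request import urlopen

# Hypothetical endpoint; replace with the real data source.
API_URL = "https://example.com/api/listings"


def fetch_listings(url=API_URL):
    """Download a JSON array of listing objects."""
    with urlopen(url) as response:
        return json.load(response)


def save_listings(conn, listings, collected_on):
    """Store one weekly snapshot; re-running the same week overwrites it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings ("
        "id TEXT, collected_on TEXT, lat REAL, lng REAL, "
        "bedrooms INTEGER, price INTEGER, "
        "PRIMARY KEY (id, collected_on))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO listings VALUES (?, ?, ?, ?, ?, ?)",
        [
            (item["id"], collected_on, item["lat"], item["lng"],
             item["bedrooms"], item["price"])
            for item in listings
        ],
    )
    conn.commit()


def collect(conn, collected_on, fetch=fetch_listings):
    """Fetch listings and save them; returns the number of listings saved."""
    listings = fetch()
    save_listings(conn, listings, collected_on)
    return len(listings)
```

A weekly job would call `collect(sqlite3.connect("rents.db"), "2013-03-14")` with the current date. Keeping the latitude and longitude in the table leaves the data ready for the geocoding in Step 2.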

Page 13

I want to set this up. How?

Code available on request (Python + SAS). Contact Graham MacDonald.

You will need to know/have:

• Python or another web-scraping scripting language.

• A statistical software package or a database system:

    • SAS, Stata, etc.

    • MySQL, PostgreSQL

• Server-side scripting language:

    • PHP, Ruby, Python

Page 14

Wait, is this legal?

It appears to be legal.

Sites like Craigslist do not have any exclusive-content language in their Terms of Use.

PadMapper is currently involved in a lawsuit brought by Craigslist, but the judge allowed evidence only from posts made in the three-week period between July 16 and August 8, 2012. During that period, Craigslist required users to grant the site exclusive content rights; it later dropped that language in response to criticism.

We do not use data from that time period.

PadMapper is not involved in any other ongoing litigation.

PROCEED AT YOUR OWN RISK

Page 15

Resources:

PadMapper: www.padmapper.com

NeighborhoodInfo DC: www.neighborhoodinfo.org

Graham MacDonald: [email protected]

Rob Pitingolo: [email protected]