MLS Data Barrett Avery

MLS Data with Barrett

Jan 19, 2017

IDX Broker
Page 1: MLS Data with Barrett

MLS Data
Barrett Avery

Page 2: MLS Data with Barrett

What is Data Acquisition at IDX?

• Acquire data from over 600 MLSs across the US and Canada.
• Employ various methodologies.
• Sanitize and normalize the data.
• Store the data for use by our clients' websites.
• Maintain a constant vigil on all MLS feeds to ensure they are running and updating properly.

Page 3: MLS Data with Barrett

How we acquire MLS data at a high level

Download
• Via RETS, FTP, SFTP, SOAP, or XML feeds.

Validate
• Sanitize the data to ensure integrity and readability.

Map/Store
• Map data to be human readable.
• Store in MySQL, NoSQL, and search indexers.

Make Available
• Make data ready for display and search on client websites.
• Maintain data reliability and availability.
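
The four stages above can be sketched as a chain of small functions. This is only an illustration in Python, not the production code (which the deck says is PHP); the sample rows, the STATUS_NAMES code table, and every function name are made up for the example, and "store" is just an in-memory dict.

```python
import csv
import io

# Hypothetical code table: short MLS status codes -> readable names.
STATUS_NAMES = {"A": "Active"}


def download():
    # Stand-in for a RETS/FTP/SFTP/SOAP/XML fetch: returns raw delimited text.
    return "ListingID,Price,Status\n1001, 250000 ,A\n1002,,A\n"


def validate(raw_text):
    # Sanitize: trim stray whitespace and skip rows that have no price.
    for row in csv.DictReader(io.StringIO(raw_text)):
        cleaned = {key: value.strip() for key, value in row.items()}
        if cleaned["Price"]:
            yield cleaned


def map_fields(rows):
    # Map raw field names and codes to human-readable ones.
    for row in rows:
        yield {
            "listing_id": row["ListingID"],
            "price": int(row["Price"]),
            "status": STATUS_NAMES.get(row["Status"], row["Status"]),
        }


def run_pipeline():
    # "Store" and "make available": here, a dict keyed by listing ID.
    return {row["listing_id"]: row for row in map_fields(validate(download()))}
```

Calling run_pipeline() on the sample keeps listing 1001 and drops 1002, which has no price. Writing each stage as a generator keeps memory use flat, which matters more as feeds grow.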

Page 4: MLS Data with Barrett

Why should you care?

• Data is everything; it is our content. Without data, our websites would be nothing more than pretty templates.
• Data is what our customers are searching for; it is what powers the internet as we know it.
• At the end of the day, the data is our bread and butter.

Page 5: MLS Data with Barrett

Some Stats

• Total listings across 589 MLSs: 3,826,086 in Platinum
  • Approx. 1,884,091 listings across 279 MLSs in Original
• Translates to 90 gigabytes of data in Platinum
  • Approx. 20 GB of data in Original
• Stored across 7 AWS (cloud) RDS database stacks for Platinum
  • 8 physical database servers for Original

Page 6: MLS Data with Barrett

Main Technologies used in Acquisition

• PHP
• MySQL
• AWS (Amazon Web Services)
• Laravel (PHP framework, Platinum only)
• NoSQL (Platinum only)
• Search indexers (Platinum only)
• Node.js (Platinum only)

Page 7: MLS Data with Barrett

How do we get all that data?

• RETS v1.5 through v1.8
• FTP
• SFTP
• SOAP
• XML feeds

Page 8: MLS Data with Barrett

Real Estate Transaction Standard (RETS)

• Custom built on the PHRETS PHP library.
• Uses PHRETS to parse the XML responses and stores them in CSV format for later parsing and eventual storage in the database.
• Compatible with RETS versions 1.5 through 1.8.
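
To make the "XML response to CSV" step concrete, here is a minimal Python sketch. It is not PHRETS and not the production code. It assumes a RETS COMPACT-style search response, where a COLUMNS element and each DATA element hold delimiter-separated values and DELIMITER gives the separator as a hex character code. SAMPLE_RESPONSE and compact_to_csv are invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample in the assumed COMPACT layout (tab-delimited, 09 = tab).
SAMPLE_RESPONSE = (
    '<RETS ReplyCode="0" ReplyText="Success">'
    '<DELIMITER value="09"/>'
    "<COLUMNS>\tListingID\tListPrice\tCity\t</COLUMNS>"
    "<DATA>\t1001\t250000\tEugene\t</DATA>"
    "<DATA>\t1002\t310000\tSpringfield\t</DATA>"
    "</RETS>"
)


def compact_to_csv(xml_text):
    root = ET.fromstring(xml_text)
    delimiter_el = root.find("DELIMITER")
    # Fall back to a tab if the response carries no DELIMITER element.
    delimiter = "\t"
    if delimiter_el is not None:
        delimiter = chr(int(delimiter_el.get("value"), 16))

    def split(element):
        # Rows start and end with the delimiter, so drop the empty edge fields.
        return element.text.split(delimiter)[1:-1]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(split(root.find("COLUMNS")))
    for data in root.findall("DATA"):
        writer.writerow(split(data))
    return out.getvalue()
```

Letting the csv module write the output means values that contain commas or quotes are escaped properly, which hand-joined strings would get wrong.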

Page 9: MLS Data with Barrett

FTP/SFTP

• Using custom FTP methods built in PHP.
• Downloading and parsing file formats such as:
  • Text (TXT)
  • Comma-separated values (CSV)
  • Tab or other control-character delimited files (TXT)
  • Just about anything else that PHP can read
• Parse out raw files to be stored in the database.
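
As a rough illustration of handling comma, tab, and other delimited files with one routine, here is a small Python sketch (the real tooling is PHP). The function name and sample files are hypothetical, and the sketch assumes the delimiter for each board is already known from configuration rather than detected.

```python
import csv
import io


def parse_delimited(text, delimiter):
    """Return one dict per data row, keyed by the header line."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


# Made-up samples of the same layout with three different delimiters.
comma_file = "ListingID,City\n1001,Eugene\n"
pipe_file = "ListingID|City\n1002|Springfield\n"
tab_file = "ListingID\tCity\n1003\tPortland\n"

rows = (
    parse_delimited(comma_file, ",")
    + parse_delimited(pipe_file, "|")
    + parse_delimited(tab_file, "\t")
)
```

After parsing, all three files yield rows of the same shape, so the later validate and store steps do not need to know which format a board delivered.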

Page 10: MLS Data with Barrett

SOAP

• Using a highly customized SOAP script written in PHP.
• Parsing XML data returned by SOAP requests into CSV format for later parsing and storage.
• Currently only one MLS uses this method: NWMLS.

Page 11: MLS Data with Barrett

XML

• Using a custom XML parser written in PHP.
• Parsing out the XML into CSV format for later parsing and storage by the application.
• These are usually one-off boards with their own set of rules and layouts.
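
Because each one-off feed has its own layout, a converter has to cope with records that do not all carry the same fields. The Python sketch below shows one way to flatten such a feed to CSV. It is an illustration only; the feed structure, tag names, and feed_to_csv are assumptions, not the actual parser.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up feed: the two records do not share the same set of fields.
SAMPLE_FEED = (
    "<Listings>"
    "<Listing><ListingID>1001</ListingID><City>Eugene</City></Listing>"
    "<Listing><ListingID>1002</ListingID><Price>310000</Price></Listing>"
    "</Listings>"
)


def feed_to_csv(xml_text, record_tag):
    root = ET.fromstring(xml_text)
    # One dict per record, built from its direct child elements.
    records = [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.iter(record_tag)
    ]
    # Column order follows first appearance across all records.
    columns = []
    for record in records:
        for name in record:
            if name not in columns:
                columns.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()
```

Fields missing from a record come out as empty cells, so every row in the CSV has the same columns regardless of what the board sent.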

Page 12: MLS Data with Barrett

Normalizing Data

• Sanitizing common things such as Booleans, dates, and numbers.
• Associating codes with their respective long names (where applicable).
• Accommodating non-standard formatting of data.
• Adhering to MLS display rules.
• Mapping fields in the data to more human-readable fields.
• Using robust database tools such as MySQL, NoSQL, and search indexers to ensure fast and secure storage of the data.
• Making the data displayable and searchable on client websites.
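
A small Python sketch of what sanitizing Booleans, dates, and numbers and expanding codes might look like. The accepted spellings, date formats, field names, and the PROPERTY_TYPES table are all assumptions for illustration; real boards vary and the production rules are not shown in this deck.

```python
from datetime import datetime

TRUE_VALUES = {"y", "yes", "true", "t", "1"}
FALSE_VALUES = {"n", "no", "false", "f", "0"}
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")
# Hypothetical code table: short codes -> long names.
PROPERTY_TYPES = {"RES": "Residential", "LND": "Land"}


def to_bool(value):
    text = value.strip().lower()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    return None  # unknown spelling: leave it for review


def to_date(value):
    # Try each known format and return an ISO date string on first match.
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def to_number(value):
    # Strip currency symbols and thousands separators before converting.
    text = value.replace("$", "").replace(",", "").strip()
    try:
        return float(text)
    except ValueError:
        return None


def normalize(raw):
    return {
        "has_pool": to_bool(raw["Pool"]),
        "list_date": to_date(raw["ListDate"]),
        "price": to_number(raw["Price"]),
        "property_type": PROPERTY_TYPES.get(raw["Type"], raw["Type"]),
    }
```

Returning None for values that cannot be parsed, instead of guessing, keeps bad input visible so it can be handled by a per-board rule.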

Page 13: MLS Data with Barrett

Images

• RETS
  • Downloading from the server.
  • Downloading Object URLs.
  • Downloading Media Objects from the RETS Resource.
  • Setting URLs for images based on the MLS-provided spec.
• FTP/SFTP
  • Download directly from the server.
  • Download a list of URLs to reference.
  • Download a list of filenames to download.
• SOAP
  • Download directly, based on date updated.
• XML
  • Various means, mostly directly from the server.

Page 14: MLS Data with Barrett

Geocoding (GIS)

• Currently using MapQuest.
• We store over 24 million valid geocodes.
• Updated every day.
• Approximately 44,000 new geocodes per day.

Page 15: MLS Data with Barrett

Putting it all together

• Download listing data.
• Download agent/office data.
• Download media (images, virtual tours, open houses).
• Associate all components by their unique IDs:
  • ListingID, InternalID, AgentID, OfficeID, MediaID (open houses and virtual tours)
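
The association step can be pictured as a join on those IDs. The Python sketch below uses made-up records and assumes that each listing carries an AgentID and OfficeID and that each media item carries the ListingID it belongs to. The real schema is not described in this deck, so treat the field relationships as hypothetical.

```python
from collections import defaultdict

# Made-up sample records for illustration.
listings = [{"ListingID": "1001", "AgentID": "A7", "OfficeID": "O2"}]
agents = {"A7": {"AgentID": "A7", "Name": "Sample Agent"}}
offices = {"O2": {"OfficeID": "O2", "Name": "Sample Office"}}
media = [
    {"MediaID": "M1", "ListingID": "1001", "Kind": "image"},
    {"MediaID": "M2", "ListingID": "1001", "Kind": "virtual_tour"},
]


def assemble(listings, agents, offices, media):
    # Group media by the listing it belongs to, in a single pass.
    media_by_listing = defaultdict(list)
    for item in media:
        media_by_listing[item["ListingID"]].append(item)

    assembled = []
    for listing in listings:
        assembled.append({
            **listing,
            "agent": agents.get(listing["AgentID"]),
            "office": offices.get(listing["OfficeID"]),
            "media": media_by_listing.get(listing["ListingID"], []),
        })
    return assembled


combined = assemble(listings, agents, offices, media)
```

Indexing agents, offices, and media by ID first keeps the join linear in the number of records, instead of rescanning the media list for every listing.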

Page 16: MLS Data with Barrett

Search the data

• Robust
• Configurable
• Fast
• Accurate
• Across multiple devices

Page 17: MLS Data with Barrett

Future plans

• Full object-oriented architecture.
• Code to adhere to PSR standards.
• Full integration with NoSQL, search indexers, and MySQL.
  • This will provide much quicker searches and be more scalable.
• Multi-day updates for all RETS MLSs.
• Sold data for MLSs that support and provide it.
