SceneFindr Stephanie Stark

DE Presentation v2

Apr 12, 2017

Page 1: DE Presentation v2

SceneFindr

Stephanie Stark

Page 2: DE Presentation v2

Motivation
● Interested in hearing live music, but don’t know where to go?

Page 4: DE Presentation v2

Pipeline

Page 5: DE Presentation v2

Data Sources

Page 6: DE Presentation v2

Data Sources

Page 7: DE Presentation v2

Data Sources

Page 8: DE Presentation v2

Data Sources

Page 9: DE Presentation v2

Data Sources

Page 10: DE Presentation v2

Pipeline

Page 11: DE Presentation v2

ETL

Artists

Events

Feature Extraction

K-Means Clustering

Recommendations
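
The ETL slide names its stages without showing how they work. As a rough illustration of the clustering and recommendation steps, the sketch below runs a plain k-means over toy artist feature vectors and recommends artists that fall in the same cluster. The artist names, the two-dimensional features, the evenly spaced initialization, and the `recommend` helper are all assumptions made for illustration; the slides do not show the actual features or implementation.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    # Naive deterministic initialization (an assumption for this sketch):
    # take k evenly spaced points from the input as starting centroids.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Hypothetical feature vectors; the real features are not given in the slides.
artist_features = {
    "artist_a": (0.90, 0.10),
    "artist_b": (0.80, 0.20),
    "artist_c": (0.85, 0.15),
    "artist_d": (0.10, 0.90),
    "artist_e": (0.20, 0.80),
    "artist_f": (0.15, 0.95),
}

names = list(artist_features)
labels, centroids = kmeans(list(artist_features.values()), k=2)
cluster_of = dict(zip(names, labels))


def recommend(artist):
    """Return the other artists that share a cluster with `artist`."""
    return [
        name for name in names
        if name != artist and cluster_of[name] == cluster_of[artist]
    ]


print(recommend("artist_a"))
```

A production version would likely use a smarter initialization such as k-means++ and a distributed implementation, since the Scaling slide mentions hundreds of gigabytes of input.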

Page 12: DE Presentation v2

Database

Page 13: DE Presentation v2

Pipeline

Page 14: DE Presentation v2

Scaling

500 GB Artist Data

9 Hours

500 GB Event Data

Page 15: DE Presentation v2

Lessons Learned (the hard way!)
● Scala
● Parallelized ML algorithms

Page 16: DE Presentation v2

About Me

Education
B.A., Mount Holyoke College
Major: Mathematics
Minor: Computer Science

Interests
Reading
Art History
Hiking

Stephanie Stark

Page 17: DE Presentation v2

Future Work
● Implement TF/IDF compatibility for project
● Use PCA
● Implement cosine similarity for feature clustering
● Cluster within metro area
● Use Redis as a cache for feature vectors
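
Two of the Future Work items, TF/IDF weighting and cosine similarity, can be sketched together in a few lines. The example below is only a sketch under stated assumptions: the per-artist tag lists are invented, the TF-IDF variant (term frequency times the log of inverse document frequency) is one common choice among several, and nothing here reflects how the project would actually implement them.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Map each document name to a TF-IDF vector over a shared vocabulary.

    `docs` maps a name to a non-empty list of tokens (here: tags).
    """
    vocab = sorted({t for tags in docs.values() for t in tags})
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter(t for tags in docs.values() for t in set(tags))
    vectors = {}
    for name, tags in docs.items():
        counts = Counter(tags)
        vectors[name] = [
            (counts[t] / len(tags)) * math.log(n_docs / df[t])
            for t in vocab
        ]
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Convention for zero vectors in this sketch.
    return dot / (norm_u * norm_v)


# Hypothetical tag lists; the slides do not describe the real features.
artist_tags = {
    "artist_a": ["indie", "rock", "live"],
    "artist_b": ["indie", "rock", "live"],
    "artist_c": ["jazz", "live"],
}

vectors = tf_idf_vectors(artist_tags)
sim_ab = cosine_similarity(vectors["artist_a"], vectors["artist_b"])
sim_ac = cosine_similarity(vectors["artist_a"], vectors["artist_c"])
print(sim_ab, sim_ac)
```

The tag "live" appears for every artist, so its IDF weight is zero and it contributes nothing to the similarity; only the rarer tags separate the artists.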