© Nube Technologies Real Time Fuzzy Matching With Spark and ElasticSearch
Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Aug 08, 2015

Data & Analytics

Spark Summit
Transcript
Page 1: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

© Nube Technologies

Real Time Fuzzy Matching With Spark and ElasticSearch

Page 2: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

About Us

The only way to do great work is to love what you do.

- Steve Jobs

Page 3: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

The problem - lake or swamp?

Page 4: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Duplicates

Page 5: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Challenges

● Quadratic problem
● No standard notion of similarity
● Omissions, typos and other issues
● Different languages
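To make the first and third challenges concrete, here is a minimal sketch. It is not taken from the slides: the function names, the sample records, and the choice of Levenshtein edit distance as the similarity measure are all assumptions made for illustration. It shows how quickly the number of pairwise comparisons grows, and how a typo-tolerant score might be computed for two strings.

```python
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered record pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive edit distance normalised to [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest


if __name__ == "__main__":
    # Comparing every record with every other record grows quadratically.
    for n in (1_000, 1_000_000):
        print(n, "records ->", pair_count(n), "pairs")

    # Hypothetical sample records, compared pair by pair.
    records = ["Jon Smith", "John Smith", "Jane Doe"]
    for left, right in combinations(records, 2):
        print(left, "|", right, "|", round(similarity(left, right), 2))
```

Edit distance is only one possible notion of similarity. The "no standard notion of similarity" bullet is exactly the point that a different field (address, phone number, product title) or a different language may need a different measure.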

Page 6: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Use Case - Customer Record Dedup

Page 7: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Use Case - Customer Record Dedup

Page 8: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Use Case - Shopping Site Comparison

Page 9: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Use Case - Shopping Site Comparison

Page 10: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Other Use Cases

● Cross selling
● Financial credit ratings
● Fraud analytics
● Catalog and inventory management
● Household and individual level analytics

Page 11: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Let's start wishing...

● Data variety
● Scalable
● No manual configuration of rules or algorithms
● Multi language
● Real time

Page 12: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Reifier - learn

Page 13: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Reifier - learn

Page 14: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Reifier - learn

Page 15: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Reifier - learn

Page 16: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Real Time

Spark + ElasticSearch
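The transcript does not show how Spark and ElasticSearch are wired together, so the following is only a generic sketch of one common way to make matching fast enough for real-time lookups: group records under a cheap "blocking" key ahead of time, then score an incoming record only against the candidates that share its key. Everything here is an assumption for illustration. The `CandidateIndex` class, the `blocking_key` rule, and the 0.85 threshold are hypothetical, and a plain in-memory dictionary stands in for a search index.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def blocking_key(name: str) -> str:
    """Hypothetical blocking rule: first letter of the last token, lower-cased."""
    tokens = name.lower().split()
    return tokens[-1][0] if tokens else ""


class CandidateIndex:
    """In-memory stand-in for a search index that returns match candidates."""

    def __init__(self):
        self._blocks = defaultdict(list)

    def add(self, record_id, name):
        # Batch side: file each known record under its blocking key.
        self._blocks[blocking_key(name)].append((record_id, name))

    def match(self, name, threshold=0.85):
        # Query side: score only the records in the same block as the query.
        hits = []
        for record_id, candidate in self._blocks.get(blocking_key(name), []):
            score = SequenceMatcher(None, name.lower(), candidate.lower()).ratio()
            if score >= threshold:
                hits.append((record_id, candidate, score))
        return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    index = CandidateIndex()
    index.add(1, "John Smith")
    index.add(2, "Jane Doe")
    index.add(3, "Sam Stone")
    for record_id, candidate, score in index.match("Jon Smith"):
        print(record_id, candidate, round(score, 3))
```

The trade-off in any blocking scheme is recall: a typo that changes the blocking key itself hides the true match, so real systems usually combine several keys or let the index do fuzzy retrieval.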

Page 17: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Spark Benefits

● Distributed
● Scalable
● Fast
● Machine Learning
● Sampling
● No need to orchestrate multiple jobs

Page 18: Real Time Fuzzy Matching with Spark and Elastic Search-(Sonal Goyal, Nube)

Thank You!

www.nubetech.co [email protected]