Page 1
Anton Slutsky, Lead Data Scientist, EPAM Systems
Hadoop + Mahout
Confidential
Page 2
Agenda
Page 3
Machine Learning vs. Statistics
Page 4
Types of Machine Learning
Page 5
Machine Learning Applications
Page 6
Machine Learning and Data
Page 7
Obligatory Big Data Slide
Page 8
Hadoop
Page 9
Apache Mahout
Page 10
Why Hadoop + Mahout?
Page 11
Machine Learning Applications
Page 12
Machine Learning Applications
Page 13
Hadoop + Mahout Algorithm
Page 14
Get data into Hadoop
Page 15
Convert data into Mahout format
Page 16
Mahout format – Sequence File
Page 17
Learn model from data