Recommender Systems with Ruby (adding machine learning, statistics, etc)

Post on 13-Jan-2015


DESCRIPTION

Talk given at the Frevo on Rails Ruby meetup in Recife, Pernambuco, on 14/09/2013

Transcript

Ruby in the world of recommendations

(also machine learning, statistics and visualizations..)

Marcel Caraciolo (@marcelcaraciolo)

Developer, scientist, contributor to the Crab recsys project; has worked with Python for 6 years; interested in mobile, education, machine learning and data!

Recife, Brazil - http://aimotion.blogspot.com

Saturday, September 14, 2013

MAKE BACKUPS!    NEVER:  find . -type f -not -name '*pyc' | xargs rm

Scientific Environment

• Data Acquisition
• Data Analysis
• Experimentation
• Presentation & Visualization

(Re-Design)

Where is Ruby?

Python was released in 1991; Ruby was released in 1995

Python was widely adopted and promoted by most of Google's research and development team

Where is Ruby?

"Python has been an important part of Google since its beginning, and it remains so as our infrastructure grows; we are always looking for more people with skills in this language."

Peter Norvig, Google, Inc.

Where is Ruby?

Python was already well known, even in some older scientific articles

Where is Ruby?

Ruby's popularity exploded in 2004, with a focus on the web.

Python: Django - 2005; NumPy - 2005; BioPython - 2001; SAGE - 2005; Matplotlib - 2000

Where is Ruby?

"Programming comes second to researchers, not first like us." - a Ruby developer's answer

Python:    [(x, x*x) for x in [1,2,3,4] if x != 3]

Ruby:    `[1,2,3,4].map { |x| [x, x*x] if x != 3 }.compact`

Result:    [(1,1), (2,4), (4,16)]
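The comparison can be checked in plain Ruby. A bare `map` with a trailing `if` leaves a `nil` in the position of each skipped value, so matching the Python result needs either a filter before the `map` or a `compact` after it. The sketch below uses only core Enumerable methods; the variable names are illustrative.

```ruby
# Ruby has no list-comprehension syntax; the same result comes from
# chaining Enumerable methods.
numbers = [1, 2, 3, 4]

# map alone keeps one slot per input element, so the skipped value becomes nil.
with_nil = numbers.map { |x| [x, x * x] if x != 3 }

# Option 1: filter first, then transform.
pairs = numbers.reject { |x| x == 3 }.map { |x| [x, x * x] }

# Option 2: transform, then drop the nil entries.
pairs_compact = numbers.map { |x| [x, x * x] if x != 3 }.compact

p with_nil       # [[1, 1], [2, 4], nil, [4, 16]]
p pairs          # [[1, 1], [2, 4], [4, 16]]
p pairs_compact  # [[1, 1], [2, 4], [4, 16]]
```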


Hey, Ruby has options!


Install the released gem:

    gem install nmatrix

Or build it from source:

    git clone https://github.com/SciRuby/nmatrix.git
    cd nmatrix/
    bundle install
    rake compile
    rake repackage
    gem install pkg/nmatrix-*.gem

>> NMatrix.new([2, 3], [0, 1, 2, 3, 4, 5], :int64).pp
[0, 1, 2]
[3, 4, 5]
=> nil

>> m = N[ [2, 3, 4], [7, 8, 9] ]
=> #<NMatrix:0x007f8e121b6cf8 shape:[2,3] dtype:int32 stype:dense>
>> m.pp
[2, 3, 4]
[7, 8, 9]

Depends on ATLAS/CBLAS and is written mostly in C and C++

https://github.com/SciRuby/nmatrix/wiki/Getting-started

Data Visualization

• R
• Gnuplot
• Google Charts API
• JFreeChart
• Scruffy
• Timetric
• Tioga
• RChart

Data Visualization

require 'rsruby'

# R commands, run from Ruby through RSRuby
cmd = %Q(
  pdf(file = "r_directly.pdf")
  boxplot(c(1,2,3,4),c(5,6,7,8))
  dev.off()
)
RSRuby.instance.eval_R(cmd)

# Pipe plotting commands to the gnuplot executable
def gnuplot(commands)
  IO.popen("gnuplot", "w") { |io| io.puts commands }
end

commands = %Q(
  set terminal svg
  set output "curves.svg"
  plot [-10:10] sin(x), atan(x), cos(atan(x))
)

gnuplot(commands)

http://effectif.com/ruby/manor/data-visualisation-with-ruby

https://github.com/glejeune/Ruby-Graphviz/

Other tools

• BioRuby

#!/usr/bin/env ruby
require 'bio'

# create a DNA sequence object from a String
dna = Bio::Sequence::NA.new("atcggtcggctta")

# create an RNA sequence object from a String
rna = Bio::Sequence::NA.new("auugccuacauaggc")

# create a Protein sequence from a String
aa = Bio::Sequence::AA.new("AGFAVENDSA")

# you can check if the sequence contains illegal characters,
# i.e. characters that are not accepted IUB characters for that symbol
# (should prepare a Bio::Sequence::AA#illegal_symbols method also)
puts dna.illegal_bases

# translate and concatenate a DNA sequence to a Protein sequence
newseq = aa + dna.translate
puts newseq      # => "AGFAVENDSAIGRL"

http://bioruby.org/

Other tools

• RubyDoop (uses JRuby)

module WordCount
  class Mapper
    def map(key, value, context)
      value.to_s.split.each do |word|
        word.gsub!(/\W/, '')
        word.downcase!
        unless word.empty?
          context.write(Hadoop::Io::Text.new(word), Hadoop::Io::IntWritable.new(1))
        end
      end
    end
  end
end

module WordCount
  class Reducer
    def reduce(key, values, context)
      sum = 0
      values.each { |value| sum += value.get }
      context.write(key, Hadoop::Io::IntWritable.new(sum))
    end
  end
end

https://github.com/iconara/rubydoop
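To see what the word-count mapper and reducer compute without a Hadoop cluster, the same logic can be sketched in plain Ruby. This is only an illustration: `map_words`, `reduce_counts` and the sample lines are made up here, and the grouping step stands in for the shuffle that Hadoop performs between the two phases.

```ruby
# Map step: emit a [word, 1] pair for every cleaned, non-empty word.
def map_words(line)
  words = line.split.map { |word| word.gsub(/\W/, '').downcase }
  words.reject(&:empty?).map { |word| [word, 1] }
end

# Reduce step: sum the counts emitted for one key.
def reduce_counts(values)
  values.sum
end

lines = ["Ruby is fun", "ruby is fast, Ruby is friendly"]

# Shuffle step (done by Hadoop in a real job): group emitted values by key.
grouped = Hash.new { |hash, key| hash[key] = [] }
lines.each do |line|
  map_words(line).each { |word, count| grouped[word] << count }
end

word_counts = grouped.transform_values { |values| reduce_counts(values) }
p word_counts
```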

Coming back to the world of recommenders

The world is an over-crowded place


Recommendation Systems

Systems designed to recommend something I may like

Recommendation Systems

Graph Representation

Items: Titanic, Taken, Panda

Users: Me, My friend, You, Another guy

And how does it work?

What do recommenders really do?

1. Predict how much you may like a certain product or service.

2. Suggest a list of N items ordered by your level of interest.

3. Suggest a list of N users for a product/service.

4. Explain why those items were recommended to you.

5. Adjust the predictions and recommendations based on your feedback and that of others.

Content-Based Filtering

Items: Gone with the Wind, Die Hard, Armageddon, Toy Story

Users: Marcel

[Diagram: Marcel likes an item, and items similar to it are recommended to him.]
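A minimal sketch of the content-based idea, assuming each item is described by a set of tags. The tags, `ITEM_TAGS`, `tag_similarity` and `content_recommendations` are all invented for illustration and are not part of any library mentioned in the talk.

```ruby
require 'set'

# Hypothetical item descriptions (tags are invented for illustration).
ITEM_TAGS = {
  'Die Hard'           => Set['action', 'thriller'],
  'Armageddon'         => Set['action', 'sci-fi', 'drama'],
  'Gone with the Wind' => Set['drama', 'romance'],
  'Toy Story'          => Set['animation', 'comedy', 'family']
}.freeze

# Plain Jaccard index between two tag sets: intersection size / union size.
def tag_similarity(a, b)
  union = a | b
  return 0.0 if union.empty?
  (a & b).size.to_f / union.size
end

# Rank the items the user has not liked yet by their best similarity
# to anything the user did like; items with no overlap are dropped.
def content_recommendations(liked, catalog = ITEM_TAGS)
  return [] if liked.empty?
  candidates = catalog.keys - liked
  scored = candidates.map do |item|
    best = liked.map { |l| tag_similarity(catalog[item], catalog[l]) }.max
    [item, best]
  end
  scored.select { |_, score| score > 0 }.sort_by { |_, score| -score }
end

recs = content_recommendations(['Die Hard'])
p recs  # [["Armageddon", 0.25]]
```

Only the item descriptions are used here; no other user's behaviour enters the calculation, which is what separates this approach from collaborative filtering.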

Problems with Content Recommenders

1. Restricted Data Analysis

- Items and users are poorly described. It is even worse for audio and images.

2. Specialized Data

- A person who has no experience with sushi never gets a recommendation for the best sushi in town.

3. Portfolio Effect

- Just because I saw one Xuxa movie when I was a child, the system insists on recommending all of her movies ("só para baixinhos!", i.e. "just for the little ones!")

Collaborative Filtering

Items: Gone with the Wind, Thor, Armageddon, Toy Story

Users: Marcel, Rafael, Amanda

[Diagram: users who like the same items are treated as similar; items liked by the users similar to Marcel are recommended to him.]
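A minimal sketch of the collaborative idea, assuming only a table of who liked what. The likes in `LIKES` and the method name `collaborative_recommendations` are invented for illustration; the scoring (count of shared likes) is a deliberately simple stand-in for a real similarity measure.

```ruby
# Hypothetical likes (who liked what is invented for illustration).
LIKES = {
  'Marcel' => ['Gone with the Wind', 'Thor'],
  'Rafael' => ['Gone with the Wind', 'Thor', 'Armageddon'],
  'Amanda' => ['Toy Story']
}.freeze

# Score each item the user has not liked by how many likes the user
# shares with the people who liked it.
def collaborative_recommendations(user, likes = LIKES)
  own = likes.fetch(user)
  scores = Hash.new(0)
  likes.each do |other, items|
    next if other == user
    overlap = (own & items).size
    next if overlap.zero?          # users with nothing in common are ignored
    (items - own).each { |item| scores[item] += overlap }
  end
  scores.sort_by { |_, score| -score }.map(&:first)
end

cf_recs = collaborative_recommendations('Marcel')
p cf_recs  # ["Armageddon"]
```

No item description is needed: the recommendation comes entirely from the overlap between Marcel's likes and Rafael's.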

Problems with Collaborative Filtering

1. Scalability

- Amazon with 5M users, 50K items, 1.4B ratings

2. Sparse Data

- I only rated one book at Amazon!

3. Cold Start

- New users and items with no records

4. Popularity

- Everyone reads Harry Potter!

5. Hacking

- The person who reads 'Harry Potter' also reads 'Kama Sutra'
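The Amazon figures quoted for scalability also show how sparse the rating data is. A quick calculation, using only the three numbers from the slide:

```ruby
users   = 5_000_000
items   = 50_000
ratings = 1_400_000_000

possible = users * items              # every user rating every item
density  = ratings.to_f / possible    # fraction of the user-item matrix that is filled
per_user = ratings.to_f / users       # average number of ratings per user

puts "possible ratings: #{possible}"
puts format('density: %.2f%%', density * 100)
puts format('average ratings per user: %.0f', per_user)
```

Even with 1.4 billion ratings, well under 1% of the user-item matrix is filled (about 280 ratings per user against 50,000 items), which is why similarity between two arbitrary users often rests on very few shared items.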

How does it show up?

• Highlights
• More about this artist...
• Listen to similar songs
• Someone similar to you also liked this...
• Since you listened to this, you may like this one...
• These items go together...
• The most popular in your group...
• New Releases

Recommendable

Quickly add a recommender engine for Likes and Dislikes to your Ruby app

http://davidcel.is/recommendable/

Recommendable

Add to your Gemfile:

    gem 'recommendable'

Recommendable

Create a configuration initializer:

require 'redis'

Recommendable.configure do |config|
  # Recommendable's connection to Redis
  config.redis = Redis.new(:host => 'localhost', :port => 6379, :db => 0)

  # A prefix for all keys Recommendable uses
  config.redis_namespace = :recommendable

  # Whether or not to automatically enqueue users to have their recommendations
  # refreshed after they like/dislike an item
  config.auto_enqueue = true

  # The name of the queue that background jobs will be placed in
  config.queue_name = :recommendable

  # The number of nearest neighbors (k-NN) to check when updating
  # recommendations for a user. Set to `nil` if you want to check all
  # other users as opposed to a subset of the nearest ones.
  config.nearest_neighbors = nil
end

Recommendable

In your ONE model that will be receiving the recommendations:

class User
  recommends :movies, :books, :minerals, :other_things

  # ...
end

Recommendable

You can chain your queries:

>> current_user.liked_movies.limit(10)
>> current_user.bookmarked_books.where(:author => "Cormac McCarthy")
>> current_user.disliked_movies.joins(:cast_members).where('cast_members.name = ?', 'Kim Kardashian')
>> current_user.hidden_minerals.order('density DESC')
>> current_user.recommended_movies.where('year < 2010')
>> book.liked_by.order('age DESC').limit(20)
>> movie.disliked_by.where('age > 18')

Recommendable

You can also like your recommendable objects:

>> user.like(movie)
=> true
>> user.likes?(movie)
=> true
>> user.rated?(movie)
=> true # also true if user.dislikes?(movie)
>> user.liked_movies
=> [#<Movie id: 23, name: "2001: A Space Odyssey">]
>> user.liked_movie_ids
=> ["23"]
>> user.like(book)
=> true
>> user.likes
=> [#<Movie id: 23, name: "2001: A Space Odyssey">, #<Book id: 42, title: "100 Years of Solitude">]
>> user.likes_count
=> 2
>> user.liked_movies_count
=> 1
>> user.likes_in_common_with(friend)
=> [#<Movie id: 23, name: "2001: A Space Odyssey">, #<Book id: 42, title: "100 Years of Solitude">]
>> user.liked_movies_in_common_with(friend)
=> [#<Movie id: 23, name: "2001: A Space Odyssey">]
>> movie.liked_by_count
=> 2
>> movie.liked_by
=> [#<User username: 'davidbowman'>, #<User username: 'frankpoole'>]

Recommendable

Obviously, you can also DISLIKE your recommendable objects:

>> user.dislike(movie)
>> user.dislikes?(movie)
>> user.disliked_movies
>> user.disliked_movie_ids
>> user.dislikes
>> user.dislikes_count
>> user.disliked_movies_count
>> user.dislikes_in_common_with(friend)
>> user.disliked_movies_in_common_with(friend)
>> movie.disliked_by_count
>> movie.disliked_by

Recommendable

Recommendations

>> friend.like(Movie.where(:name => "2001: A Space Odyssey").first)
>> friend.like(Book.where(:title => "A Clockwork Orange").first)
>> friend.like(Book.where(:title => "Brave New World").first)
>> friend.like(Book.where(:title => "One Flew Over the Cuckoo's Nest").first)
>> user.like(Book.where(:title => "A Clockwork Orange").first)
>> user.recommended_books # Defaults to 10 recommendations
=> [#<Book title: "Brave New World">, #<Book title: "One Flew Over the Cuckoo's Nest">]
>> user.similar_raters # Defaults to 10 similar users
=> [#<User username: "frankpoole">, #<User username: "davidbowman">, ...]
>> user.recommended_movies(10, 30) # 10 Recommendations, offset by 30 (i.e. page 4)
=> [#<Movie name: "A Clockwork Orange">, #<Movie name: "Chinatown">, ...]
>> user.similar_raters(25, 50) # 25 similar users, offset by 50 (i.e. page 3)
=> [#<User username: "frankpoole">, #<User username: "davidbowman">, ...]

Recommendable

Jaccard Similarity

Marcel likes A, B, C and dislikes D
Amanda likes A, B and dislikes C
Guilherme likes C, D and dislikes A
Flavio likes B, C, E and dislikes D

J(u, v) = (shared likes + shared dislikes - items u likes and v dislikes - items u dislikes and v likes) / items rated by either user

J(Marcel, Amanda) = ([A,B].size + [].size - [C].size - [].size) / [A,B,C,D].size

J(Marcel, Amanda) = (2 + 0 - 1 - 0) / 4 = 1/4 = 0.25

J(Marcel, Guilherme) = ([C].size + [].size - [A].size - [D].size) / [A,B,C,D].size

J(Marcel, Guilherme) = (1 + 0 - 1 - 1) / 4 = -1/4 = -0.25

J(Marcel, Flavio) = ([B,C].size + [D].size - [].size - [].size) / [A,B,C,D,E].size

J(Marcel, Flavio) = (2 + 1 - 0 - 0) / 5 = 3/5 = 0.6

MostSimilar(Marcel) = [ (Flavio, 0.6), (Amanda, 0.25), (Guilherme, -0.25) ]
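The like/dislike similarity can be reproduced in a few lines of plain Ruby. This is a sketch of the formula as written on the slides, not the gem's actual source; `RATINGS`, `similarity` and `most_similar` are illustrative names. Counting the shared dislike D as an agreement, the Marcel/Flavio score comes out as (2 + 1) / 5 = 0.6.

```ruby
# Ratings from the slides: each person has a list of likes and dislikes.
RATINGS = {
  'Marcel'    => { likes: %w[A B C], dislikes: %w[D] },
  'Amanda'    => { likes: %w[A B],   dislikes: %w[C] },
  'Guilherme' => { likes: %w[C D],   dislikes: %w[A] },
  'Flavio'    => { likes: %w[B C E], dislikes: %w[D] }
}.freeze

# (agreements - disagreements) / number of items rated by either user.
def similarity(a, b)
  agreements    = (a[:likes] & b[:likes]).size + (a[:dislikes] & b[:dislikes]).size
  disagreements = (a[:likes] & b[:dislikes]).size + (a[:dislikes] & b[:likes]).size
  rated         = (a[:likes] | a[:dislikes] | b[:likes] | b[:dislikes]).size
  return 0.0 if rated.zero?
  (agreements - disagreements).to_f / rated
end

# Every other user paired with their similarity, most similar first.
def most_similar(name, ratings = RATINGS)
  others = ratings.keys - [name]
  scored = others.map { |other| [other, similarity(ratings[name], ratings[other])] }
  scored.sort_by { |_, score| -score }
end

ranking = most_similar('Marcel')
ranking.each { |other, score| puts "#{other}: #{score}" }
# Flavio: 0.6
# Amanda: 0.25
# Guilherme: -0.25
```

The score ranges from -1 (the two users disagree on everything) to 1 (they agree on everything), unlike the classic Jaccard index, which only covers 0 to 1.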

Recommendable

Recommendations

The best of your recommendable models:

>> Movie.top
=> #<Movie name: "2001: A Space Odyssey">
>> Movie.top(3)
=> [#<Movie name: "2001: A Space Odyssey">, #<Movie name: "A Clockwork Orange">, #<Movie name: "The Shining">]

Wilson score confidence interval - the Reddit algorithm
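The Wilson score mentioned here is usually applied as the lower bound of a confidence interval on the fraction of positive ratings. The sketch below implements that standard formula with a fixed z of 1.96 (roughly 95% confidence); it is not taken from the gem's source, and the method name and the vote counts are invented for illustration.

```ruby
# Lower bound of the Wilson score interval for a proportion of positive
# ratings. z = 1.96 corresponds to roughly 95% confidence.
def wilson_lower_bound(likes, total, z = 1.96)
  return 0.0 if total.zero?
  phat = likes.to_f / total
  numerator = phat + z * z / (2 * total) -
              z * Math.sqrt((phat * (1 - phat) + z * z / (4 * total)) / total)
  numerator / (1 + z * z / total)
end

# Hypothetical like/rating counts, invented for illustration.
votes = {
  '1 like out of 1'      => [1, 1],
  '90 likes out of 100'  => [90, 100],
  '50 likes out of 100'  => [50, 100]
}

ranked = votes.sort_by { |_, (likes, total)| -wilson_lower_bound(likes, total) }
ranked.each do |name, (likes, total)|
  puts format('%-22s %.3f', name, wilson_lower_bound(likes, total))
end
```

The point of the lower bound is visible in the ordering: an item with a single like has a 100% approval rate but very little evidence behind it, so it ranks below both items with 100 ratings.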

Recommendable

Callbacks

class User < ActiveRecord::Base
  has_one :feed

  recommends :movies
  after_like :update_feed

  def update_feed(obj)
    feed.update "liked #{obj.name}"
  end
end

Uses apotonick/hooks to implement callbacks for liking, disliking, etc.

Recommendable

Manual recommendations

Recommendable::Helpers::Calculations.update_similarities_for(user.id)
Recommendable::Helpers::Calculations.update_recommendations_for(user.id)

Redis makes the magic!

Recommendable

Recommendations over a Queueing System

Put the workers to do the job! (Sidekiq, Resque, DelayedJob)

module Recommendable
  module Workers
    class Resque
      include ::Resque::Plugins::UniqueJob if defined?(::Resque::Plugins::UniqueJob)
      @queue = :recommendable

      def self.perform(user_id)
        Recommendable::Helpers::Calculations.update_similarities_for(user_id)
        Recommendable::Helpers::Calculations.update_recommendations_for(user_id)
      end
    end
  end
end
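The queueing pattern itself can be sketched with Ruby's built-in `Queue` and a thread, without Redis or any job library. Everything here is illustrative: `refresh_recommendations` is a stand-in for the two calculation calls, and the `pending` set mimics what a unique-job plugin does by refusing to enqueue a user who is already waiting.

```ruby
require 'set'

# Stand-in for the two Recommendable calculation calls; this
# hypothetical method only records which user was refreshed.
REFRESHED = []
def refresh_recommendations(user_id)
  REFRESHED << user_id
end

jobs    = Queue.new
pending = Set.new

# Enqueue a user only if no refresh for that user is already waiting.
enqueue = lambda do |user_id|
  jobs << user_id if pending.add?(user_id)
end

# User 1 likes two items in a row, so is submitted twice.
[1, 2, 1, 3].each { |id| enqueue.call(id) }
jobs << :stop   # sentinel telling the worker to finish

# The worker drains the queue in the background, away from the web request.
worker = Thread.new do
  while (user_id = jobs.pop) != :stop
    refresh_recommendations(user_id)
  end
end
worker.join

p REFRESHED  # [1, 2, 3]
```

The jobs are enqueued before the worker starts only to keep the example deterministic; a real worker process runs continuously and picks up jobs as they arrive.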

Recommended Books

Satnam Alag, Collective Intelligence in Action, Manning Publications, 2009

Toby Segaran, Programming Collective Intelligence, O'Reilly, 2007

Sau Sheong Chang, Exploring Everyday Things with R and Ruby, O'Reilly, 2012

Recommended Course

https://www.coursera.org/course/recsys


Ruby developers, It does exist

Web
