Ruby On Rails, Design, Simplicity, Web 2.0, Ajax, Mac and Tons of Pizza.

Jan 15

Programming Collective Intelligence

Posted by Sandro Paganotti in Ruby on Rails

Yesterday I started reading Programming Collective Intelligence and... wow! This book is simply amazing.

At the moment I’m on page 30, and in the chapter I’ve just finished I learned how to create a simple yet powerful recommendation system that uses both Euclidean distance and the Pearson correlation score to determine the affinity between people, based on how they have ranked objects.

For example, you can measure the affinity between people based on how they rank movies or, with a little effort, the affinity between two movies based on how they have been ranked.
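The "little effort" for movie-to-movie affinity mostly amounts to flipping the preference set so that items become the keys and people become the values. Here is a minimal sketch of that idea; `transform_prefs`, `MOVIE_PREFS` and the ranks are hypothetical names and made-up data for illustration, not code taken from the book or from recommendation.rb.

```ruby
# Hypothetical helper (not part of recommendation.rb):
# turns { person => { item => rank } } into { item => { person => rank } }.
def transform_prefs(prefs)
  prefs.each_with_object({}) do |(person, ranks), by_item|
    ranks.each do |item, rank|
      # Create the inner hash for this item on first use, then store the rank.
      (by_item[item] ||= {})[person] = rank
    end
  end
end

# Made-up sample data, for illustration only.
MOVIE_PREFS = {
  'personA' => { movie_1: 4.0, movie_2: 2.5 },
  'personB' => { movie_1: 3.0, movie_2: 5.0 }
}.freeze

# Now keyed by movie, so the same similarity functions can compare two movies.
BY_MOVIE = transform_prefs(MOVIE_PREFS)
```

With the flipped hash, a person-to-person similarity function can be handed two movie keys instead of two person names, assuming it only relies on the nested-hash shape.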

While reading the book, though, I ran into a little trouble…

All the samples are written in Python!

And, as you may have guessed, I know almost nothing about this programming language. So, with the help of a good manual, I’ve started translating the code from Python to Ruby. The result after the first 30 pages is ready for you to download at the end of this post.

Recommendation.rb

This module contains the following functions:

  • sim_distance(prefs, person1, person2)
    It calculates an index between 0 (no affinity) and 1 (complete agreement) between person1 and person2, based on a set of preferences called prefs (explained at the end of this list);
  • sim_pearson(prefs, person1, person2)
    It does the same job as sim_distance, but this time using the Pearson correlation score, which ranges from -1 to 1;
  • top_matches(prefs, person, n=3, similarity='sim_pearson')
    It returns a list of the n people with the highest affinity to person. You can also choose which similarity algorithm to use (the default is sim_pearson).

The prefs preference set must have the following structure:



      prefs_hash = {
          'personA' => { :item_1 => numeric_rank, :item_2 => numeric_rank },
          'personB' => { :item_1 => numeric_rank, :item_2 => numeric_rank },
          # etc...
      }


And here is a sample function call:



     sim_distance(prefs_hash, 'personA', 'personB')
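To make the three functions more concrete, here is a rough sketch of how they could be written in Ruby. This is not the content of recommendation.rb: the module name `RecommendationSketch`, the `shared_items` helper, the `[score, person]` return shape of `top_matches` and the sample ranks are all assumptions made for illustration. The distance score uses the square-root form of the formula.

```ruby
# Hypothetical sketch; not the code shipped in recommendation.rb.
module RecommendationSketch
  module_function

  # Items ranked by both people (assumed helper).
  def shared_items(prefs, person1, person2)
    prefs[person1].keys & prefs[person2].keys
  end

  # Euclidean-distance score: 1.0 for identical ranks, closer to 0 as they diverge.
  def sim_distance(prefs, person1, person2)
    shared = shared_items(prefs, person1, person2)
    return 0.0 if shared.empty?

    sum_of_squares = shared.sum { |item| (prefs[person1][item] - prefs[person2][item])**2 }
    1.0 / (1.0 + Math.sqrt(sum_of_squares))
  end

  # Pearson correlation over the shared items, between -1.0 and 1.0.
  def sim_pearson(prefs, person1, person2)
    shared = shared_items(prefs, person1, person2)
    n = shared.size
    return 0.0 if n.zero?

    xs = shared.map { |item| prefs[person1][item].to_f }
    ys = shared.map { |item| prefs[person2][item].to_f }

    sum_x = xs.sum
    sum_y = ys.sum
    sum_xy = xs.zip(ys).sum { |x, y| x * y }

    numerator = sum_xy - (sum_x * sum_y / n)
    spread = (xs.sum { |x| x**2 } - sum_x**2 / n) * (ys.sum { |y| y**2 } - sum_y**2 / n)
    # No variation in one of the two sets of ranks: treat as "no correlation".
    return 0.0 if spread <= 0

    numerator / Math.sqrt(spread)
  end

  # The n best matches for person, as [score, other_person] pairs, best first.
  # similarity is the name of one of the functions above.
  def top_matches(prefs, person, n = 3, similarity = 'sim_pearson')
    scores = prefs.keys.reject { |other| other == person }.map do |other|
      [send(similarity, prefs, person, other), other]
    end
    scores.sort_by { |score, _| -score }.first(n)
  end
end

# Made-up sample data, for illustration only.
SAMPLE_PREFS = {
  'personA' => { item_1: 1.0, item_2: 2.0, item_3: 3.0 },
  'personB' => { item_1: 2.0, item_2: 4.0, item_3: 6.0 },
  'personC' => { item_1: 3.0, item_2: 2.0, item_3: 1.0 }
}.freeze

PEARSON_MATCHES  = RecommendationSketch.top_matches(SAMPLE_PREFS, 'personA', 2)
DISTANCE_MATCHES = RecommendationSketch.top_matches(SAMPLE_PREFS, 'personA', 2, 'sim_distance')
```

In this sample the two algorithms disagree: personB ranks everything exactly twice as high as personA, so Pearson sees a perfect correlation and puts personB first, while the distance score prefers personC, whose ranks are numerically closer. That difference is a reasonable thing to keep in mind when picking the similarity argument.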


Here is the file – recommendation.rb.zip

As I keep reading, I’m going to keep translating the samples from this book, so stay tuned.

Comments

  • A.B.Leal

    Posted on January 15

    If you stick with it, you'll probably end up preferring Python. You have been warned ;-) Tip: also look up a "Python idioms" web page. Word-for-word translation is ugly for most languages. Python and Ruby are pretty compatible, though.
  • Jamie Pitts

    Posted on January 16

    I was planning to start reading this book in the near future, so thanks!
  • ms

    Posted on January 16

    Whoa, awesome! I had the same problem because I only know Ruby. Can't wait to see the rest of the examples!
  • ms

    Posted on January 16

    What manual are you using?
  • Riccardo Cambiassi

    Posted on January 16

    Oh, sweet! We started doing the same thing here in the office. Well, actually the good Tom (infovore.org) started doing it :) Pretty awesome book, isn't it?
  • rb

    Posted on January 18

    There is an error in the code; check the errata of the book {11}. The last line of the code sample, return 1/(1+sum_of_squares), should be return 1/(1+*sqr*t(sum_of_squares)). The formula should return the square root of the sum of squares.
  • alex

    Posted on January 18

    Ahh, great minds think alike! I too have started a similar porting effort (details at http://blog.livollmers.net/index.php/2007/12/01/ruby-port-of-programming-collective-intelligence/) to learn the ins and outs of that book. So far it's a fairly slavish line-for-line port of the example Python code, which makes it a tad on the unreadable side. I plan on going back and massaging the code into more idiomatic Ruby. Anyway, there is a splinter group of the Seattle Ruby Brigade that is getting together to study this book and, clearly, we'll be using Ruby to do so. If you're interested in collaborating or merging these efforts, let me know!
  • Sandro

    Posted on January 24

    Hi all and thank you for your comments!
    @rb: I've fixed the recommendation.rb file according to your suggestion.
    @alex: I'm going to contact you directly to share our passion for this book.
  • jason z

    Posted on February 09

    There's an erratum in the erratum: return 1/(1+*sqr*t(sum_of_squares)) should be return 1/(1+sqrt(sum_of_squares))
  • Bosco

    Posted on February 10

    I'm really interested in seeing these algorithms translated into Ruby. Thanks a lot, and please track your progress and ideas on the blog!
  • Bosco

    Posted on February 20

    Hi, the link to your .rb file is missing! Can you mail it to me, please? Thnx a lot!
  • Sandro Paganotti

    Posted on February 21

    I've restored the .rb file, thank you for your message.
  • Bosco

    Posted on February 23

    Hi, I have implemented my own getRecommendations method, because it's not in your .rb file. My Ruby syntax is probably crap, because I only started learning RoR a few days ago, but well, it works:

      def getRecommendations(prefs, person)
        totals = {}
        simSums = {}
        prefs.each do |other|
          # don't compare me to myself
          if other[0] == person
            next
          end
          sim = sim_pearson(prefs, person, other[0])
          # ignore scores of zero or lower
          if sim <= 0
            next
          end
          prefs[other[0]].each do |item|
            event = item[0]
            # only score movies I haven't seen yet
            if (not prefs[person].include?(event) or prefs[person][event] == 0)
              # Similarity * Score
              if totals[event] == nil
                totals[event] = 0
              end
              totals[event] = totals[event] + prefs[other[0]][event] * sim
              # Sum of similarities
              if simSums[event] == nil
                simSums[event] = 0
              end
              simSums[event] += sim
            end
          end
        end
        # Create the normalized list of [item, score] pairs
        rankings = []
        totals.each do |total|
          rankings << [total[0], total[1] / simSums[total[0]]]
        end
        # Return the sorted list
        # rankings.sort{|x,y| x[1]<=>y[1]}
        # rankings.reverse()
        return rankings
      end
  • Bosco

    Posted on February 23

    BTW, I have opened a Google Group to discuss the book's chapters. Have a look at it! http://groups.google.es/group/programming-collective-intelligence
