Frank DENIS random thoughts.

Redis as a recommendation engine

“You might be interested in red toilet paper, because you bought blue and green toilet paper, and people who also bought blue and green toilet paper tend to also buy red toilet paper”.

Recommendation engines are now everywhere, from e-commerce to social networks.

Graphs databases like Neo4j are probably the best way to tackle the problem. But as an alternative, let’s try building a recommendation engine on Redis.

Who bought what or who is following who could be stored in Redis sorted sets.

Let’s take an example: a user buys A, and also B. We increment the score of the B item in the sorted set associated to key A. And we possibly also do it the other way round.

This is a trivial way to keep a list of what items have been bought, what are the related items and how many of them there are.

On a social network, the score associated to a relationship may reflect the level of trust.

Let’s suppose we have the following relationships:

B -> C
D -> C
E -> F
A -> B
A -> D
A -> E

C is referenced by B and D. B and D are referenced by A. But C isn’t referenced by A yet. This is presumably something to suggest.

F might also be a good candidate as a suggestion for A. However, since there’s only a single path from A to F (through E), it’s probably less relevant than C.

In order to limit the number of non-relevant results, we may want to add a threshold: nodes that have less possible paths than a cutoff value shouldn’t be suggested.

How to do that with Redis, and only Redis?

Here’s one possible way:

  • Use ZUNIONSTORE in order to build a temporary aggregate of all related items (scores are summed up), and remove the reference item from the set (cylic links).

  • Remove everything from the aggregate that is already directly related to the reference item. One way to achieve this on a sorted set is to use ZUNIONSTORE AGGREGATE MIN with a negative or null weight for the set we want to delete, followed by ZREMRANGEBYSCORE in order to actually remove everything from the reference set and everything below the cutoff score.

  • Sort and optionally trim the new set with ZREVRANGE et voila!

Maybe code speaks louder than words, so here we go for some code:

require "redis"

class Recommendation
  attr_reader :db
  
  def initialize
    @db = Redis.new  
  end
  
  def bought(id)
    Item.new(self, id)
  end
  
  def suggestions_for(id, options = { })
    Item.new(self, id).suggestions(options)
  end
  
  class Item
    def initialize(recommendation, id)
      @db, @id = recommendation.db, id
    end
    
    def self.key(id) "{rec}:items:#{id}" end
    
    def key() Item.key(@id) end
    
    def tmp_key(n = 1) "{rec}:_tmp:#{@id}:#{n}" end
    
    def and_also_bought(linked_id, weight = 1)
      @db.zincrby(key, weight, linked_id)
    end
    
    def suggestions(options = { })
      cutoff = options[:cutoff] || 1
      limit = options[:limit] || 50
      tmp, tmp2 = tmp_key, tmp_key(2)
      related_keys = @db.sort(key).collect { |item| Item.key(item) }
      return [ ] unless related_keys

      res = @db.multi do |m|
        m.zunionstore(tmp, related_keys, :aggregate => :sum)
        m.zrem(tmp, @id)
        m.zunionstore(tmp2, [ tmp, key ], :weights => [ 1, -1 ],
                                          :aggregate => :min)
        m.del(tmp)
        m.zremrangebyscore(tmp2, '-inf', cutoff)
        m.zrevrange(tmp2, 0, limit - 1)
        m.del(tmp2)
      end
      res[5]
    end
  end    
end

# Example

r = Recommendation.new

r.bought('b').and_also_bought('c')
r.bought('d').and_also_bought('c')
r.bought('e').and_also_bought('f')
r.bought('a').and_also_bought('b')
r.bought('a').and_also_bought('d')
r.bought('a').and_also_bought('e')

puts r.suggestions_for('a')