zerosum dirt(nap)

evolution through a series of accidents

zerosum dirt(nap)

count_on counter_cache

December 24, 2006 by nap · Comments

Counter cache is my new friend. It’s a very important little feature available on ActiveRecord that makes counting associations efficient by maintaining a cache on the model.

Although it’s documented in AWDWR (p359 in my shiny new copy of the second edition), I guess it slipped my mind until now, so I thought I’d blog about it just in case anyone else was looking for a solution…

So why does it rock so hard, you ask? Let’s say that I’m building YADRC (Yet Another Digg/Reddit Clone). I need to count the votes that users make to determine the popularity of an article, display that number, and use it to rank the order of the stories. So we set up our models: a story model, a vote model, and a user model, and we create the appropriate associations. Our vote.rb is going to look something like this:

class Vote < ActiveRecord::Base
  belongs_to :user
  belongs_to :story
end

Predictably, Story has_many votes and User has_many votes. Now we can do something like this to find the number of votes on a story object:

Story.find(4).votes.size

This will work nicely right out of the box. However, once we have a significant amount of stories logged in our system and a reasonable number of votes on each story, performance goes straight to poop. The reason is that each time we’re generating a score for a story (and remember, we’ve got N stories), we’re running a query like this against our database:

SELECT count(*) AS count_all FROM votes WHERE (votes.story_id = 4) 

If we have any real amount of data in our system, this is going to get really ugly. Counter caching is one way to help counteract this problem. Let’s rewrite our model to use it:

class Vote < ActiveRecord::Base
  belongs_to :user
  belongs_to :story, :counter_cache => true
end

We also have to add a column, called votes_count, to the stories table in our database. Make sure to specify a default value of 0 in your DDL. Then we generate a migration, run it, and now we’re ready to try again. The difference should be pretty dramatic. If we tail -f our development log, we’ll notice that those count(*) queries aren’t getting run anymore. So what’s happening?

ActiveRecord is using our counter cache column (called votes_count in this case) on the stories table to store the number of belonging objects on the associate class. This value is incremented when an object of this class is created, and decremented when it’s destroyed. The result is that we have a local “cached count” on the Story instance so we don’t have to constantly query the votes table directly. Good deal, eh?

There are two additional things worth noting about counter caching. As Dave Thomas points out in AWDWR, the counter won’t get updated if entries are added by setting the link to the parent directly in the child like this:

vote = Vote.new
vote.story = story
vote.save

If you’re doing it this way, you’ll have to force the parent class to refresh the collection. The right approach is to add a Vote through the Story object, which makes the parent aware of the increment (or decrement) and update the counter cache accordingly:

story.vote.create

The other point I wanted to make was that if you call the count method on object instead of using size, it will always run the actual count query on the underlying database, instead of using our cached shortcut. Thanks for the tip on that one, technoweenie.

blog comments powered by Disqus