Posted by flyerhzm on November 22, 2010
Memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs. In Rails, you can easily use memoize, which is provided by ActiveSupport::Memoizable.
Here I will give you an example.
Problem
Imagine you have a billing system in which one user has many accounts and each account has its own budget. The user object has a total_budget method, which calculates the sum of all the available accounts' budgets. The following is the model definition.
class User < ActiveRecord::Base
  has_many :available_accounts, :class_name => 'Account', :conditions => "budget > 0"

  # Sums the budgets of all available accounts.
  def total_budget
    self.available_accounts.inject(0) { |sum, a| sum + a.budget }
  end
end
total_budget will be used multiple times in models, views and controllers, such as
<% if current_user.total_budget > 0 %>
<%= current_user.total_budget %>
<% end %>
Every time you call total_budget, a db query is sent to retrieve all available accounts for the user, and then the sum of their budgets is calculated. How can we avoid the duplicated db query and the duplicated calculation?
Caching with instance variable
An easy solution is to cache the result in an instance variable to avoid the duplicated execution.
class User < ActiveRecord::Base
  has_many :available_accounts, :class_name => 'Account', :conditions => "budget > 0"

  # Computes the sum once and reuses it on later calls.
  def total_budget
    @total_budget ||= self.available_accounts.inject(0) { |sum, a| sum + a.budget }
  end
end
That means the first time you call total_budget, one db query is sent, the sum of the budgets is calculated, and the result is assigned to the instance variable @total_budget. The second time you call total_budget, no db query is sent and no calculation is executed; @total_budget is returned directly.
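The effect is easy to check outside Rails. The following is a minimal plain-Ruby sketch, not the model above: Account is a hypothetical Struct, and load_accounts is a made-up stand-in for the association query that simply counts how often it runs.

```ruby
# Plain-Ruby sketch; Account and load_accounts are hypothetical stand-ins
# for the ActiveRecord model and the available_accounts query.
Account = Struct.new(:budget)

class User
  attr_reader :query_count

  def initialize(accounts)
    @accounts = accounts
    @query_count = 0
  end

  def total_budget
    # ||= evaluates the right-hand side only while @total_budget is nil or false.
    @total_budget ||= load_accounts.inject(0) { |sum, a| sum + a.budget }
  end

  private

  # Pretends to hit the database and records each "query".
  def load_accounts
    @query_count += 1
    @accounts
  end
end

user = User.new([Account.new(100), Account.new(250)])
first = user.total_budget
second = user.total_budget # served from @total_budget
puts "total=#{first}, queries=#{user.query_count}" # total=350, queries=1
```

Both calls return the same sum, but the counter shows that the accounts were loaded only once.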
If the returned value can be falsy, like nil or false, you must use the following solution instead, because ||= would recompute the value on every call.
def has_comment?
  return @has_comment if defined?(@has_comment)
  @has_comment = self.comments.size > 0
end
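To see the difference between the two approaches, here is a small plain-Ruby sketch. The Post class, the has_comment_naive? method, and the count_comments helper are hypothetical names used only for illustration; count_comments stands in for the comments query and counts its own invocations.

```ruby
# Plain-Ruby sketch; Post and count_comments are hypothetical stand-ins.
class Post
  attr_reader :lookups

  def initialize(comment_count)
    @comment_count = comment_count
    @lookups = 0
  end

  # ||= cannot cache false: the right-hand side runs again on every call.
  def has_comment_naive?
    @has_comment_naive ||= count_comments > 0
  end

  # defined? checks whether the variable was ever assigned, so false is cached too.
  def has_comment?
    return @has_comment if defined?(@has_comment)
    @has_comment = count_comments > 0
  end

  private

  # Pretends to query the comments and records each lookup.
  def count_comments
    @lookups += 1
    @comment_count
  end
end

post = Post.new(0)
3.times { post.has_comment_naive? }
naive_lookups = post.lookups

post = Post.new(0)
3.times { post.has_comment? }
guarded_lookups = post.lookups

puts "naive=#{naive_lookups}, guarded=#{guarded_lookups}" # naive=3, guarded=1
```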
Memoizable
The problem with this kind of memoization is that you have to litter your method implementation with caching logic. Memoization is best applied in a transparent way.
Since Rails 2.2, there is a transparent way to implement memoization: the memoize method provided by ActiveSupport::Memoizable.
class User < ActiveRecord::Base
  extend ActiveSupport::Memoizable

  has_many :available_accounts, :class_name => 'Account', :conditions => "budget > 0"

  def total_budget
    self.available_accounts.inject(0) { |sum, a| sum + a.budget }
  end
  # Declare the method as memoized; its body stays free of caching logic.
  memoize :total_budget
end
The memoize method caches the method result automatically. You don't need to change the method implementation anymore; you just declare which methods should be memoized.
The other big issue with caching in an instance variable is that it's inconvenient to cache different results for different inputs. Let's define a new method, total_spent.
class User < ActiveRecord::Base
  extend ActiveSupport::Memoizable

  has_many :available_accounts, :class_name => 'Account', :conditions => "budget > 0"

  def total_budget
    self.available_accounts.inject(0) { |sum, a| sum + a.budget }
  end

  # The result depends on the date range passed in.
  def total_spent(start_date, end_date)
    self.available_accounts.where('created_at >= ? and created_at <= ?', start_date, end_date).inject(0) { |sum, a| sum + a.spent }
  end
  memoize :total_budget, :total_spent
end
It's really inconvenient to cache the total_spent result in an instance variable, as the result differs for different start_date and end_date values. But memoize handles it as easily as methods without arguments: it caches a separate result for each distinct set of inputs.
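For comparison, here is roughly what a hand-written version would look like. This is a sketch in plain Ruby under a few assumptions: SpendingAccount is a hypothetical Struct standing in for the Account model, Array#select replaces the where query, and the calculations counter exists only to make the caching visible. The cache has to become a hash keyed by the argument list, which is the bookkeeping memoize takes over.

```ruby
require 'date'

# Plain-Ruby sketch; SpendingAccount is a hypothetical stand-in for Account.
SpendingAccount = Struct.new(:spent, :created_at)

class User
  attr_reader :calculations

  def initialize(accounts)
    @accounts = accounts
    @calculations = 0
  end

  def total_spent(start_date, end_date)
    @total_spent ||= {}
    key = [start_date, end_date]
    # key? (rather than ||=) also caches nil or false results.
    return @total_spent[key] if @total_spent.key?(key)

    @calculations += 1
    in_range = @accounts.select { |a| a.created_at >= start_date && a.created_at <= end_date }
    @total_spent[key] = in_range.inject(0) { |sum, a| sum + a.spent }
  end
end

user = User.new([
  SpendingAccount.new(10, Date.new(2010, 5, 10)),
  SpendingAccount.new(25, Date.new(2010, 6, 15))
])
may  = user.total_spent(Date.new(2010, 5, 1), Date.new(2010, 5, 30))
may2 = user.total_spent(Date.new(2010, 5, 1), Date.new(2010, 5, 30)) # served from the hash
june = user.total_spent(Date.new(2010, 6, 1), Date.new(2010, 6, 30))
puts "may=#{may}, june=#{june}, calculations=#{user.calculations}" # may=10, june=25, calculations=2
```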
Deprecation
This does not mean that memoization itself is deprecated. It is the ActiveSupport::Memoizable module that was deprecated in Rails 3.2 (see the commit). josevalim prefers to "use Ruby instead", which is the same solution I described in Caching with instance variable. However, ActiveSupport::Memoizable provides more features than the plain @var ||= solution, such as:
- correctly memoizes non-true values (nil, false, etc.)
- varies memoization by method parameters
- separates cached return values from instance variables
So if you still want to enjoy these extra bonuses, try the memoist gem, which is a direct extraction of ActiveSupport::Memoizable.

Comments
There are mainly two cases for memoized methods: methods without arguments, like the total_budget method in the User model, and methods with arguments.
In both cases, memoize aliases your method with an _unmemoized_ prefix, so total_budget is aliased to _unmemoized_total_budget, and then your method is overridden.
This is the process for the first case:
If the memoized method takes no arguments, an instance variable is created; it is @_memoized_total_budget in our example.
The first time total_budget is called, _unmemoized_total_budget is called and its result is stored in @_memoized_total_budget.
The next time you call total_budget, you get the value of @_memoized_total_budget without calling _unmemoized_total_budget.
Here is the process for the second case:
If the memoized method takes arguments, an instance variable is created as well, but it is a hash whose keys are the inputs and whose values are the outputs. It is @_memoized_total_spent in our example.
Suppose that the first time you call total_spent("2010-5-1", "2010-5-30"), _unmemoized_total_spent("2010-5-1", "2010-5-30") returns 10.
The next time you call total_spent("2010-5-1", "2010-5-30"), @_memoized_total_spent[ ["2010-5-1", "2010-5-30"] ] is returned without calling _unmemoized_total_spent.
But if you call total_spent("2010-6-1", "2010-6-30"), memoize finds that @_memoized_total_spent has no key ["2010-6-1", "2010-6-30"], so _unmemoized_total_spent("2010-6-1", "2010-6-30") is called and its result is stored in @_memoized_total_spent[ ["2010-6-1", "2010-6-30"] ].
Finally, @_memoized_total_spent holds two entries: ["2010-5-1", "2010-5-30"] => 10, and ["2010-6-1", "2010-6-30"] mapped to the result of the second calculation.
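Both cases can be condensed into a small sketch. This is a simplified illustration, not the actual ActiveSupport::Memoizable source: the module name SimpleMemoizable and the Report class are hypothetical, and, unlike the description above, the sketch uses a hash keyed by the argument list even for methods without arguments (the key is then an empty array).

```ruby
# Simplified sketch of the idea; SimpleMemoizable is a hypothetical name,
# not the real ActiveSupport::Memoizable implementation.
module SimpleMemoizable
  def memoize(*method_names)
    method_names.each do |name|
      original = :"_unmemoized_#{name}"
      ivar     = :"@_memoized_#{name}" # assumes the name has no trailing ? or !

      # Keep the original implementation reachable under the prefixed name.
      alias_method original, name

      # Override the method with a caching wrapper.
      define_method(name) do |*args|
        cache = instance_variable_get(ivar) || instance_variable_set(ivar, {})
        # key? also caches nil and false results.
        cache[args] = send(original, *args) unless cache.key?(args)
        cache[args]
      end
    end
  end
end

class Report
  extend SimpleMemoizable
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Stand-in for an expensive calculation; counts how often it really runs.
  def total_spent(start_date, end_date)
    @calls += 1
    "#{start_date}..#{end_date}"
  end
  memoize :total_spent
end

report = Report.new
report.total_spent("2010-5-1", "2010-5-30")
report.total_spent("2010-5-1", "2010-5-30") # cache hit
report.total_spent("2010-6-1", "2010-6-30") # new key, computed again
puts report.calls # 2
puts report.instance_variable_get(:@_memoized_total_spent).keys.inspect
```

The last line prints the two argument lists that ended up as keys of the cache hash.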
Here I just use total_budget and total_spent as examples; memoization is one of the optimization options.
@user_total_budget = current_user.total_budget
The calculation may still be something you don't want to repeat though, so memoizing it is a potentially good practice :)
It also allows Ruby code to be much more like functional programming: a method without side effects (a 'pure function') always returns the same result for the same inputs, so it only ever needs to run once; the return value can then be stored and reused every time the method is called.
@Pierre
Imagine the case where the total_budget is being used in other models, in the controller and in the view. With a controller instance variable, you can end up jumping through a lot of hoops to get the stored value to all the correct places. With memoized values, you don't need to worry as the method is storing a cached value for the request.
https://gist.github.com/822173
Something like that doesn't work, what should we use in that case?
memoize :expensive_calculation
GitHub commit: Deprecate memoizable.