Posted by
flyerhzm
on
October 19, 2010
In a system like forum, the title and body is displayed on show page, but only title is on index page. You should use select in query to speed up the query and save memory.
Before
In the rails-bestpractices.com, rails best practices are saved as Post model. The database schema of the Post model is
create_table "posts", :force => true do |t|
t.string "title"
t.text "body"
t.datetime "created_at"
t.datetime "updated_at"
t.integer "user_id"
t.text "formatted_html"
t.text "description"
...
end
The posts index page displays 10 posts. At the beginning, I used
class PostsController < ApplicationController
def index
@posts = Post.paginate(:page => params[:page])
end
end
to fetch posts from database. The query sql is as follows.
SELECT `posts`.* FROM `posts` LIMIT 0, 10
It was slow when the posts had large body.
Refactor
Then I began to improve the performance. In the posts index page, the body, formatted_html and updated_at are no use, but the body and formatted_html occupy lots of memory, for example, the average size of body is 2k and formatted_html is also 2k, the posts index page display 10 posts per page, so the system wastes 40k memory each posts index request, and it also wastes time to fetch them from database. So I decided to remove theses fields from query.
In rails3 there is a method "select" to fetch specific fields, in rails2 you can also use select option in find method. Here I only give you the rails3 way.
class Post < ActiveRecrod::Base
INDEX_COLUMNS = column_names - ['body', 'formatted_html', 'updated_at']
end
class PostsController < ApplicationController
def index
@posts = Post.select(Post::INDEX_COLUMNS).paginate(:page => params[:page])
end
end
Here rails tells database fetch all the fields except body, formatted_html and updated_at, now the query sql is as follows
SELECT id, title, created_at, user_id, description FROM `posts` LIMIT 0, 10
As you see, there are no body, formatted_html and updated_at in query sql, it is faster than before and use less memory.
Update: You should not select specific fields if you use memory object caching system, such as memcache.

Comments
There are 50 rows in posts table
SELECT SQL_NO_CACHE * FROM posts;takes 0.02 secSELECT SQL_NO_CACHE id, title, created_at, user_id, description FROM posts;takes 0.00 secIt is not accurate, but it makes sense.
Using Ruby's Benchmark, I ran Partner.all a couple of times via console. Time are:
0.330000 0.030000 0.360000 ( 0.940293)
0.290000 0.000000 0.290000 ( 0.336642)
0.280000 0.010000 0.290000 ( 0.334005)
Without leaving console I then ran Partner.all(:select => 'id, title') :
0.130000 0.000000 0.130000 ( 0.135501)
0.020000 0.000000 0.020000 ( 0.026773)
0.020000 0.000000 0.020000 ( 0.025976)
I then left console and used runner to startup and measure each query once. All:
0.310000 0.010000 0.320000 ( 0.520687)
With select:
0.040000 0.000000 0.040000 ( 0.047784)
I'm sure Postgresql is caching data as well, but that's actually a good thing as the numbers above should indicate the difference in amount of time it takes Rails to instantiate the 1000-ish objects.
If I run Partner.connection.select_rows('select id,title from partners') and skip AR object creation entirely the times are:
0.020000 0.000000 0.020000 ( 0.041021)
0.030000 0.000000 0.030000 ( 0.036794)
0.020000 0.010000 0.030000 ( 0.038171)
The above is for 1000 objects though. It's pretty rare to fetch more than a handful at a time. If I rerun the above, but limit it to 10 rows I then get the following for All:
0.310000 0.010000 0.320000 ( 0.395079)
0.310000 0.010000 0.320000 ( 0.384910)
And for selecting just the id/title:
0.020000 0.000000 0.020000 ( 0.026231)
0.020000 0.000000 0.020000 ( 0.026358)
So, still a decent savings.
You can use it to automatically select specified fields, double check the logs to see if slim_scrooge supports your sql queries, if not, you should add the select methods by yourself.
I think bullet like detection and notification, but warn specific columns that is not included in the view is very interesting idea.
Will first fetch all posts, then paginate the array, it won't use the limit clause in the query.
The
allcall should be removed here.Thanks for your work.
Post.allfetches all of the posts, which is not expected. I have removed all call from the post. Thanks!