For those of us that write database-backed Rails apps,
ActiveRecord is a
familiar beast. We know its various query building methods well:
limit, and even
group. The most important SQL clause is
conspicuously missing from our repertoire: the
Chances are you can't think of many, if any, times that you've used
from method. This is to be expected. The
from clause in
SQL specifies the target table of a query. In the context of ActiveRecord, the
target table is explicitly expressed by the model from which you query.
Consider the times you use one-liners such as
Developer.count. With each of these, the
table as the target of the query is expressed by the use of the
So, where does the
from method come into the picture?
method as such.
Specifies table from which the records will be fetched.
Specifying the table of a query against an ActiveRecord model is going to feel
a bit redundant.
> Developer.from('developers').to_sql => "SELECT \"developers\".* FROM developers"
As you've probably noticed we get this same behavior with nothing more than the
> Developer.all.to_sql => "SELECT \"developers\".* FROM \"developers\""
So, again, where does the
from method come into the picture?
from clauses aren't always going to look as simple as
We sometimes need the "table" from which we are fetching to be a collection of
computed values. The key insight here is that the result of a query is itself a
table. That means that we can query against the result of another query. This
is ostensibly how subqueries work.
from method comes into play when we are building up a complex query
in SQL and need to transition back to the land of ActiveRecord and Rails.
select developers.id, username, posts, likes, round(likes::numeric / posts, 2) as avg_likes, round(log(2, (1022 * ((score - min(score) over ()) / ((max(score) over ()) - (min(score) over ()))) + 2)::numeric), 1) as hotness from developers join ( select developer_id as id, sum(score) as score from hot_posts group by developer_id ) developer_scores using(id) join ( select developer_id as id, count(*) as posts, sum(likes) as likes from posts group by developer_id ) stats using(id)
This query is used to compute both the average number of likes and the
hotness score for each developer's posts. Because of the complexity of the
query, we are better off writing it in SQL rather than using ActiveRecord's
DSL. This query does a complex computation that results in a bunch of data. It
would be nice if we could further refine the results as needed with
ActiveRecord's DSL. And herein lies the challenge. How do we transition from
this SQL query to something compatible with ActiveRecord?
from method — the topic of this post — is, of course, the answer. We
can treat the above query as a "table" that we can query against (i.e. a
subquery). This allows us to then use
group, etc. to
further refine the result set.
We can create a class (
app/models/developer_ranker.rb) to see how this can work.
class DeveloperRanker RANKED_DEV_SQL = <<-SQL (select developers.id, username, posts, likes, round(likes::numeric / posts, 2) as avg_likes, round(log(2, (1022 * ((score - min(score) over ()) / ((max(score) over ()) - (min(score) over ()))) + 2)::numeric), 1) as hotness from developers join ( select developer_id as id, sum(score) as score from hot_posts group by developer_id ) developer_scores using(id) join ( select developer_id as id, count(*) as posts, sum(likes) as likes from posts group by developer_id ) stats using(id) ) as developers SQL def self.top_developers(count=1) Developer.from(RANKED_DEV_SQL) .order(hotness: :desc) .limit(count) end def self.bottom_developers(count=1) Developer.from(RANKED_DEV_SQL) .order(hotness: :asc) .limit(count) end end
We treat the complex query as a subquery by wrapping it in
(...) as assigning it as a string to a constant. This constant is then
available to any methods that want to build on this query without duplicating
all the core query logic.
As you can see in
DeveloperRanker.top_developers, we hydrate an
ActiveRecord::Relation with the results of the query. We can then chain those
familiar ActiveRecord query methods on the end. In this case, we order the
results with the developers most on fire at the top and then limiting by the
count. Note that because we are querying against the
model, it is essential that the subquery's name matches the name that
ActiveRecord uses for that model's table.
DeveloperRanker.bottom_developers to perform a similar query with
very little duplication.
An additional benefit of this approach is that the
Developer objects that are
being hydrated are decorated with the additional values returned from the
query. Not only do we know the
username of each developer, we can
also access the
hotness attributes. These hydrated
Developer objects can be passed to the view layer like any other decorated
object would be.
This technique is great for generating reports and stats that require lots of
computation or complex joins, but are ultimately tied to particular domain
concepts (e.g. a developer).
Ultimately, this frees you to write some SQL without completely cutting
yourself off from the world of
Cover photo by Samuel Zeller on Unsplash.com