[Ruby on Rails Guides] Active Record Query Interface — Ruby on Rails Guides

Active Record Query Interface

2.1 Retrieving a Single Object

Active Record provides several different ways of retrieving a single object.

2.1.1 find

Using the find method, you can retrieve the object corresponding to the specified primary key that matches any supplied options. For example:

1
2
3
# Find the customer with primary key (id) 10.
irb> customer = Customer.find(10)
=> #<Customer id: 10, first_name: "Ryan">

The SQL equivalent of the above is:

1
SELECT * FROM customers WHERE (customers.id = 10) LIMIT 1

The find method will raise an ActiveRecord::RecordNotFound exception if no matching record is found.


You can also use this method to query for multiple objects. Call the find method and pass in an array of primary keys. The return will be an array containing all of the matching records for the supplied primary keys. For example:

1
2
3
# Find the customers with primary keys 1 and 10.
irb> customers = Customer.find([1, 10]) # OR Customer.find(1, 10)
=> [#<Customer id: 1, first_name: "Lifo">, #<Customer id: 10, first_name: "Ryan">]

The SQL equivalent of the above is:

1
SELECT * FROM customers WHERE (customers.id IN (1,10))

The find method will raise an ActiveRecord::RecordNotFound exception unless a matching record is found for all of the supplied primary keys.


2.1.2 take

The take method retrieves a record without any implicit ordering. For example:

1
2
irb> customer = Customer.take
=> #<Customer id: 1, first_name: "Lifo">

The SQL equivalent of the above is:

1
SELECT * FROM customers LIMIT 1

The take method returns nil if no record is found and no exception will be raised.

You can pass in a numerical argument to the take method to return up to that number of results. For example

1
2
irb> customers = Customer.take(2)
=> [#<Customer id: 1, first_name: "Lifo">, #<Customer id: 220, first_name: "Sara">]

The SQL equivalent of the above is:

1
SELECT * FROM customers LIMIT 2

The take! method behaves exactly like take, except that it will raise ActiveRecord::RecordNotFound if no matching record is found.

2.1.3 first

The first method finds the first record ordered by primary key (default). For example:

1
2
irb> customer = Customer.first
=> #<Customer id: 1, first_name: "Lifo">

The SQL equivalent of the above is:

1
SELECT * FROM customers ORDER BY customers.id ASC LIMIT 1

The first method returns nil if no matching record is found and no exception will be raised.

If your default scope contains an order method, first will return the first record according to this ordering.

You can pass in a numerical argument to the first method to return up to that number of results. For example

1
2
irb> customers = Customer.first(3)
=> [#<Customer id: 1, first_name: "Lifo">, #<Customer id: 2, first_name: "Fifo">, #<Customer id: 3, first_name: "Filo">]

The SQL equivalent of the above is:

1
SELECT * FROM customers ORDER BY customers.id ASC LIMIT 3

On a collection that is ordered using order, first will return the first record ordered by the specified attribute for order.

1
2
irb> customer = Customer.order(:first_name).first
=> #<Customer id: 2, first_name: "Fifo">

The SQL equivalent of the above is:

1
SELECT * FROM customers ORDER BY customers.first_name ASC LIMIT 1

The first! method behaves exactly like first, except that it will raise ActiveRecord::RecordNotFound if no matching record is found.

2.1.4 last

The last method finds the last record ordered by primary key (default). For example:

1
2
irb> customer = Customer.last
=> #<Customer id: 221, first_name: "Russel">

The SQL equivalent of the above is:

1
SELECT * FROM customers ORDER BY customers.id DESC LIMIT 1

The last method returns nil if no matching record is found and no exception will be raised.

If your default scope contains an order method, last will return the last record according to this ordering.

You can pass in a numerical argument to the last method to return up to that number of results. For example

1
2
irb> customers = Customer.last(3)
=> [#<Customer id: 219, first_name: "James">, #<Customer id: 220, first_name: "Sara">, #<Customer id: 221, first_name: "Russel">]

The SQL equivalent of the above is:

1
SELECT * FROM customers ORDER BY customers.id DESC LIMIT 3

On a collection that is ordered using order, last will return the last record ordered by the specified attribute for order.

1
2
irb> customer = Customer.order(:first_name).last
=> #<Customer id: 220, first_name: "Sara">

The SQL equivalent of the above is:

1
SELECT * FROM customers ORDER BY customers.first_name DESC LIMIT 1

The last! method behaves exactly like last, except that it will raise ActiveRecord::RecordNotFound if no matching record is found.

2.1.5 find_by

The ·find_by· method finds the first record matching some conditions. For example:

1
2
3
4
5
irb> Customer.find_by first_name: 'Lifo'
=> #<Customer id: 1, first_name: "Lifo">

irb> Customer.find_by first_name: 'Jon'
=> nil

It is equivalent to writing with where method and take method:

1
Customer.where(first_name: 'Lifo').take

The SQL equivalent of the above is:

1
SELECT * FROM customers WHERE (customers.first_name = 'Lifo') LIMIT 1

The find_by! method behaves exactly like find_by, except that it will raise ActiveRecord::RecordNotFound if no matching record is found. For example:

1
2
irb> Customer.find_by! first_name: 'does not exist'
ActiveRecord::RecordNotFound

This is equivalent to writing:

1
Customer.where(first_name: 'does not exist').take!

2.2 Retrieving Multiple Objects in Batches

We often need to iterate over a large set of records, as when we send a newsletter to a large set of customers, or when we export data.

This may appear straightforward:

1
2
3
4
# This may consume too much memory if the table is big.
Customer.all.each do |customer|
NewsMailer.weekly(customer).deliver_now
end

But this approach becomes increasingly impractical as the table size increases, since Customer.all.each instructs Active Record to fetch the entire table in a single pass, build a model object per row, and then keep the entire array of model objects in memory. Indeed, if we have a large number of records, the entire collection may exceed the amount of memory available.

Rails provides two methods that address this problem by dividing records into memory-friendly batches for processing. The first method, find_each, retrieves a batch of records and then yields each record to the block individually as a model. The second method, find_in_batches, retrieves a batch of records and then yields the entire batch to the block as an array of models.

The find_each and find_in_batches methods are intended for use in the batch processing of a large number of records that wouldn’t fit in memory all at once. If you just need to loop over a thousand records the regular find##3 methods are the preferred option.

2.2.1 find_each

The find_each method retrieves records in batches and then yields each one to the block. In the following example, find_each retrieves customers in batches of 1000 and yields them to the block one by one:

1
2
3
Customer.find_each do |customer|
NewsMailer.weekly(customer).deliver_now
end

This process is repeated, fetching more batches as needed, until all of the records have been processed.

find_each works on model classes, as seen above, and also on relations:

1
2
3
Customer.where(weekly_subscriber: true).find_each do |customer|
NewsMailer.weekly(customer).deliver_now
end

as long as they have no ordering, since the method needs to force an order internally to iterate.

If an order is present in the receiver the behaviour depends on the flag config.active_record.error_on_ignored_order. If true, ArgumentError is raised, otherwise the order is ignored and a warning issued, which is the default. This can be overridden with the option :error_on_ignore, explained below.

2.2.1.1 Options for find_each

:batch_size

The :batch_size option allows you to specify the number of records to be retrieved in each batch, before being passed individually to the block. For example, to retrieve records in batches of 5000:

1
2
3
Customer.find_each(batch_size: 5000) do |customer|
NewsMailer.weekly(customer).deliver_now
end
:start

By default, records are fetched in ascending order of the primary key. The :start option allows you to configure the first ID of the sequence whenever the lowest ID is not the one you need. This would be useful, for example, if you wanted to resume an interrupted batch process, provided you saved the last processed ID as a checkpoint.

For example, to send newsletters only to customers with the primary key starting from 2000:

1
2
3
Customer.find_each(start: 2000) do |customer|
NewsMailer.weekly(customer).deliver_now
end
:finish

Similar to the :start option, :finish allows you to configure the last ID of the sequence whenever the highest ID is not the one you need. This would be useful, for example, if you wanted to run a batch process using a subset of records based on :start and :finish.

For example, to send newsletters only to customers with the primary key starting from 2000 up to 10000:

1
2
3
Customer.find_each(start: 2000, finish: 10000) do |customer|
NewsMailer.weekly(customer).deliver_now
end

Another example would be if you wanted multiple workers handling the same processing queue. You could have each worker handle 10000 records by setting the appropriate :start and :finish options on each worker.

:error_on_ignore

Overrides the application config to specify if an error should be raised when an order is present in the relation.

2.2.2 find_in_batches

The find_in_batches method is similar to find_each, since both retrieve batches of records. The difference is that find_in_batches yields batches to the block as an array of models, instead of individually. The following example will yield to the supplied block an array of up to 1000 customers at a time, with the final block containing any remaining customers:

1
2
3
4
# Give add_customers an array of 1000 customers at a time.
Customer.find_in_batches do |customers|
export.add_customers(customers)
end

find_in_batches works on model classes, as seen above, and also on relations:

1
2
3
4
# Give add_customers an array of 1000 recently active customers at a time.
Customer.recently_active.find_in_batches do |customers|
export.add_customers(customers)
end

as long as they have no ordering, since the method needs to force an order internally to iterate.

2.2.2.1 Options for find_in_batches

The find_in_batches method accepts the same options as find_each:

:batch_size

Just like for find_each, batch_size establishes how many records will be retrieved in each group. For example, retrieving batches of 2500 records can be specified as:

1
2
3
Customer.find_in_batches(batch_size: 2500) do |customers|
export.add_customers(customers)
end
:start

The start option allows specifying the beginning ID from where records will be selected. As mentioned before, by default records are fetched in ascending order of the primary key. For example, to retrieve customers starting on ID: 5000 in batches of 2500 records, the following code can be used:

1
2
3
Customer.find_in_batches(batch_size: 2500, start: 5000) do |customers|
export.add_customers(customers)
end
:finish

The finish option allows specifying the ending ID of the records to be retrieved. The code below shows the case of retrieving customers in batches, up to the customer with ID: 7000:

1
2
3
Customer.find_in_batches(finish: 7000) do |customers|
export.add_customers(customers)
end
:error_on_ignore

The error_on_ignore option overrides the application config to specify if an error should be raised when a specific order is present in the relation.

3 Conditions

The where method allows you to specify conditions to limit the records returned, representing the WHERE-part of the SQL statement. Conditions can either be specified as a string, array, or hash.

3.1 Pure String Conditions

If you’d like to add conditions to your find, you could just specify them in there, just like Book.where("title = 'Introduction to Algorithms'"). This will find all books where the title field value is ‘Introduction to Algorithms’.

Building your own conditions as pure strings can leave you vulnerable to SQL injection exploits. For example, Book.where("title LIKE '%#{params[:title]}%'") is not safe. See the next section for the preferred way to handle conditions using an array.

3.2 Array Conditions

Now what if that title could vary, say as an argument from somewhere? The find would then take the form:

1
2
Book.where("title = ?", params[:title])

Active Record will take the first argument as the conditions string and any additional arguments will replace the question marks (?) in it.

If you want to specify multiple conditions:

1
Book.where("title = ? AND out_of_print = ?", params[:title], false)

In this example, the first question mark will be replaced with the value in params[:title] and the second will be replaced with the SQL representation of false, which depends on the adapter.

This code is highly preferable:

1
Book.where("title = ?", params[:title]

to this code:

1
2
# Bad
Book.where("title = #{params[:title]}")

because of argument safety. Putting the variable directly into the conditions string will pass the variable to the database as-is. This means that it will be an unescaped variable directly from a user who may have malicious intent. If you do this, you put your entire database at risk because once a user finds out they can exploit your database they can do just about anything to it. Never ever put your arguments directly inside the conditions string.

For more information on the dangers of SQL injection, see the Ruby on Rails Security Guide.

3.2.1 Placeholder Conditions

Similar to the (?) replacement style of params, you can also specify keys in your conditions string along with a corresponding keys/values hash:

1
2
Book.where("created_at >= :start_date AND created_at <= :end_date",
{start_date: params[:start_date], end_date: params[:end_date]})

This makes for clearer readability if you have a large number of variable conditions.

3.3 Hash Conditions

Active Record also allows you to pass in hash conditions which can increase the readability of your conditions syntax. With hash conditions, you pass in a hash with keys of the fields you want qualified and the values of how you want to qualify them:

Only equality, range, and subset checking are possible with Hash conditions.

3.3.1 Equality Conditions

1
Book.where(out_of_print: true)

This will generate SQL like this:

1
SELECT * FROM books WHERE (books.out_of_print = 1)

The field name can also be a string:

1
Book.where('out_of_print' => true)

In the case of a belongs_to relationship, an association key can be used to specify the model if an Active Record object is used as the value. This method works with polymorphic relationships as well.

1
2
3
author = Author.first
Book.where(author: author)
Author.joins(:books).where(books: { author: author })

3.3.2 Range Conditions

1
Book.where(created_at: (Time.now.midnight - 1.day)..Time.now.midnight)

This will find all books created yesterday by using a BETWEEN SQL statement:

1
SELECT * FROM books WHERE (books.created_at BETWEEN '2008-12-21 00:00:00' AND '2008-12-22 00:00:00')

This demonstrates a shorter syntax for the examples in Array Conditions

3.3.3 Subset Conditions

If you want to find records using the IN expression you can pass an array to the conditions hash:

1
Customer.where(orders_count: [1,3,5])

This code will generate SQL like this:

1
SELECT * FROM customers WHERE (customers.orders_count IN (1,3,5))

3.4 NOT Conditions

NOT SQL queries can be built by where.not:

1
Customer.where.not(orders_count: [1,3,5])

In other words, this query can be generated by calling where with no argument, then immediately chain with not passing where conditions. This will generate SQL like this:

1
SELECT * FROM customers WHERE (customers.orders_count NOT IN (1,3,5))

3.5 OR Conditions

OR conditions between two relations can be built by calling or on the first relation, and passing the second one as an argument.

1
Customer.where(last_name: 'Smith').or(Customer.where(orders_count: [1,3,5]))
1
SELECT * FROM customers WHERE (customers.last_name = 'Smith' OR customers.orders_count IN (1,3,5))

References

[1] Active Record Query Interface — Ruby on Rails Guides - https://guides.rubyonrails.org/active_record_querying.html

[2] Ruby on Rails Guides - https://guides.rubyonrails.org/

[3] Ruby on Rails | A web-application framework that includes everything needed to create database-backed web applications according to the Model-View-Controller (MVC) pattern. - https://rubyonrails.org/