One of the most relevant new features shipped with Rails 7 is Active Record Encryption.
Active Record Encryption helps to preserve your application’s data confidentiality by adding reversible application-level encryption in a way that you can encrypt, decrypt and do searches on that data.
Advantages of Active Record Encryption
Active Record Encryption makes encryption transparent by encrypting attributes before persisting them and decrypting them upon retrieval. Your application deals with the data as if it were not encrypted, while it is stored encrypted in the database.
It also automatically filters encrypted fields out of the application logs (unless you configure it not to do so). This makes it a great fit when dealing with PII, and helps your application comply with regulations such as CCPA and GDPR.
Deterministic vs non-deterministic encryption
Active Record Encryption supports both modes. Before you set up and use Active Record Encryption, it is important that you understand their differences.
Non-deterministic mode
When using non-deterministic encryption, encrypting the same content with the same key twice results in different ciphertexts.
The advantage of this approach is that it improves security by making cryptanalysis of the ciphertexts harder for an attacker.
It also makes it impossible to query the database by the encrypted value, which is an advantage from the security point of view, but a disadvantage from the usability side if you need to query that data.
For security reasons, non-deterministic encryption is Active Record Encryption’s default. It is always recommended unless you need to query the data.
Deterministic mode
When using deterministic encryption, encrypting the same content with the same key twice results in the same ciphertext. As mentioned above, the Rails guides recommend using this approach only when you need to query the data.
Encryption algorithm
Both deterministic and non-deterministic approaches use the same AES-GCM encryption algorithm. The difference is that the non-deterministic mode uses a random initialization vector, whereas the deterministic mode initialization vector is generated as an HMAC-SHA-256 digest of the key and contents to encrypt.
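To make the difference concrete, here is a minimal sketch using Ruby's standard OpenSSL library. It is not Active Record Encryption's internal code: the `encrypt` helper, the `KEY` constant and the choice to truncate the HMAC digest to the cipher's IV length are assumptions made for illustration only.

```ruby
require "openssl"

# Illustrative sketch only, not the Rails implementation.
KEY = OpenSSL::Random.random_bytes(32) # hypothetical 256-bit key

def encrypt(plaintext, key:, deterministic: false)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key

  iv =
    if deterministic
      # IV derived from the key and the content: same input, same IV.
      digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), key, plaintext)
      digest[0, cipher.iv_len]
    else
      # Fresh random IV on every call.
      OpenSSL::Random.random_bytes(cipher.iv_len)
    end
  cipher.iv = iv

  ciphertext = cipher.update(plaintext) + cipher.final
  { payload: ciphertext, iv: iv, auth_tag: cipher.auth_tag }
end

random_a = encrypt("Napoleon", key: KEY)
random_b = encrypt("Napoleon", key: KEY)
fixed_a  = encrypt("Napoleon", key: KEY, deterministic: true)
fixed_b  = encrypt("Napoleon", key: KEY, deterministic: true)

puts random_a[:payload] == random_b[:payload] # false: different random IVs
puts fixed_a[:payload] == fixed_b[:payload]   # true: same derived IV
```

Because the deterministic variant always produces the same ciphertext for the same input, an equality lookup on the stored value becomes possible, which is exactly the trade-off described above.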
Setup Active Record Encryption
Enough talk, let’s get our “hands dirty” and see how you can use Active Record Encryption in your Rails application.
The first thing you need to do is set up the encryption keys. To do so, run the following rails command in a terminal to generate them:
    $ bin/rails db:encryption:init
    Add this entry to the credentials of the target environment:

    active_record_encryption:
      primary_key: cRWw6ctehulHtKJWBh71SoKzzgylS55q
      deterministic_key: 92S3IdInaHjPMsRqbIlZ8clMepIuXhQa
      key_derivation_salt: MVHaWgnwvWkqsxgtMzZXRlyxwFcuNSMr
Now copy those keys and add them to your Rails credentials (either the global config/credentials.yml.enc file or a specific environment’s credentials file), for example by running bin/rails credentials:edit.
Let’s explain each key in detail:
primary_key
The primary key is used for non-deterministic encryption.
To support key rotation, primary_key can also be a list of keys. In that case, Active Record uses the last key to encrypt new values, and when decrypting it tries the keys from the list until it finds one that works.
    active_record_encryption:
      primary_key:
        - cRWw6ctehulHtKJWBh71SoKzzgylS55q # Previous keys must be kept to decrypt values that were encrypted with them
        - MNEFSGqZF6hzATMXH66LSUt7BJ1h35Zy # Current primary_key used to encrypt new values
      key_derivation_salt: MVHaWgnwvWkqsxgtMzZXRlyxwFcuNSMr
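The "try each key until one works" behaviour can be sketched with plain OpenSSL. This is only an illustration under my own assumptions: `gcm_encrypt`, `gcm_decrypt` and `decrypt_with_any` are hypothetical helpers, not Rails APIs, and the order in which Rails actually tries the keys may differ.

```ruby
require "openssl"

# Hypothetical helpers for illustration; not Rails internals.
def gcm_encrypt(plaintext, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
  cipher.key = key
  iv = cipher.random_iv
  { payload: cipher.update(plaintext) + cipher.final, iv: iv, auth_tag: cipher.auth_tag }
end

def gcm_decrypt(message, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm").decrypt
  cipher.key = key
  cipher.iv = message[:iv]
  cipher.auth_tag = message[:auth_tag]
  cipher.update(message[:payload]) + cipher.final
end

# Try every known key. With a wrong key the GCM authentication check
# fails and OpenSSL raises, so we move on to the next key.
def decrypt_with_any(message, keys)
  keys.each do |key|
    return gcm_decrypt(message, key)
  rescue OpenSSL::Cipher::CipherError
    next
  end
  raise ArgumentError, "no key could decrypt the message"
end

old_key = OpenSSL::Random.random_bytes(32)
new_key = OpenSSL::Random.random_bytes(32)
keys = [old_key, new_key] # the last key is the current one

legacy = gcm_encrypt("Napoleon", old_key)  # written before the rotation
fresh  = gcm_encrypt("Illya", keys.last)   # written after the rotation

puts decrypt_with_any(legacy, keys) # Napoleon
puts decrypt_with_any(fresh, keys)  # Illya
```

Each failed attempt costs a full decryption try, which is why a long key list slows down reads of old values.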
Since trying all the keys to decrypt previous values hinders performance, you can enable config.active_record.encryption.store_key_references so that Active Record Encryption stores a reference to the encryption key within the encrypted payload.
If you want to use this feature, remember to provision some extra storage space on each field for the key reference.
    config.active_record.encryption.store_key_references = true
Speaking of configuration, the ideal place to put Active Record Encryption configuration options is config/application.rb, since you typically want the same encryption configuration across all environments. If you instead want to set specific options on a per-environment basis, you can do so within each environment’s config file under config/environments/.
deterministic_key
As you can guess, this key is used for deterministic encryption. You can omit it if you do not plan to use deterministic encryption at all.
At the time of this writing Active Record Encryption does not support deterministic key-rotation, so no lists here.
key_derivation_salt
Active Record Encryption uses this value as the salt when it derives the actual encryption keys from the keys configured above.
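As a rough idea of what deriving a key with a salt means, here is a small sketch using PBKDF2 from Ruby's OpenSSL library. Treat every detail as an assumption: the `derive_key` helper is hypothetical, and the iteration count, digest and key length are illustrative values rather than the ones Rails uses.

```ruby
require "openssl"

# Hypothetical helper: stretch a configured secret into a fixed-length key.
# Iterations, digest and length are illustrative assumptions.
def derive_key(secret, salt, length: 32)
  OpenSSL::PKCS5.pbkdf2_hmac(secret, salt, 65_536, length, OpenSSL::Digest.new("SHA256"))
end

secret = "cRWw6ctehulHtKJWBh71SoKzzgylS55q" # primary_key from the example above
salt   = "MVHaWgnwvWkqsxgtMzZXRlyxwFcuNSMr" # key_derivation_salt from the example above

key_a = derive_key(secret, salt)
key_b = derive_key(secret, salt)
key_c = derive_key(secret, "a different salt")

puts key_a.bytesize # 32
puts key_a == key_b # true: same secret and salt give the same key
puts key_a == key_c # false: changing the salt changes the derived key
```

The practical consequence is that the salt must stay stable: if it changes, the derived keys change and previously encrypted values can no longer be decrypted.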
Declare encrypted attributes
Ok, now that you have a basic configuration, let’s see Active Record Encryption in action.
As an example, I will create a SecretAgent model. This SecretAgent will have two string fields: code_name, which can be known by everyone, and real_name, which must be kept secret so that the agent’s real identity does not get compromised. It will also have a description rich text field, which will contain information about the agent. Again, we want to keep this information secret so that the agent’s details, strengths and weaknesses do not fall into the wrong hands. To protect real_name and description we will use Active Record Encryption.
To create the SecretAgent model I will run the following model generator command:
    $ bin/rails g model SecretAgent code_name real_name description:rich_text
          invoke  active_record
          create    db/migrate/20220722094445_create_secret_agents.rb
          create    app/models/secret_agent.rb
          invoke    test_unit
          create      test/models/secret_agent_test.rb
          create      test/fixtures/secret_agents.yml
Then modify the migration file to add a limit to real_name, like this:
    class CreateSecretAgents < ActiveRecord::Migration[7.0]
      def change
        create_table :secret_agents do |t|
          t.string :code_name
          t.string :real_name, limit: 510

          t.timestamps
        end
      end
    end
It is worth mentioning that the encrypted payload takes up some extra space, so the Rails guides recommend increasing the size of string columns from 255 bytes to 510 bytes. The above migration sets the limit in characters and not in bytes (this is how add_column works for string columns), and since in this example we will store short names it might not even be necessary to increase the length of the real_name field at all. However, the point is that you need to take the overhead of the encrypted payload into account when planning your database schema for encrypted string columns.
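To get a feeling for that overhead, the sketch below estimates the stored size of a value. It assumes the JSON layout that shows up in the logs of this post (a Base64 payload p plus iv and at headers), a 12-byte IV and a 16-byte auth tag, and it ignores compression and any extra headers your configuration may add. The `estimated_stored_size` helper is hypothetical.

```ruby
require "json"
require "securerandom"

# Strict Base64 (no line breaks) without extra dependencies.
def base64(bytes)
  [bytes].pack("m0")
end

# Rough estimate of the bytes stored for a plaintext of the given size.
# Assumes: ciphertext length equals plaintext length (AES-GCM),
# 12-byte IV, 16-byte auth tag, no compression, no extra headers.
def estimated_stored_size(plaintext_bytes)
  envelope = {
    "p" => base64(SecureRandom.random_bytes(plaintext_bytes)),
    "h" => {
      "iv" => base64(SecureRandom.random_bytes(12)),
      "at" => base64(SecureRandom.random_bytes(16))
    }
  }
  JSON.generate(envelope).bytesize
end

puts estimated_stored_size(8) # 82 bytes for an 8-byte value such as "Napoleon"
```

Under these assumptions the envelope adds a fixed cost on top of the roughly one-third growth caused by Base64, so short values grow proportionally much more than long ones.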
The description field does not appear in this migration file since it will be handled by Action Text. Active Record Encryption’s documentation states that the additional overhead added to text fields is negligible, so following its advice I left it as it is.
As a side note, Active Record Encryption serializes values using the underlying type before encrypting them. The only prerequisite is that they must be serializable as strings.
After running the migration with bin/rails db:migrate, let’s define the following model in app/models/secret_agent.rb:
    class SecretAgent < ApplicationRecord
      encrypts :real_name
      has_rich_text :description, encrypted: true
    end
As you can see, using Active Record Encryption is pretty straightforward. For string fields you just need to add encrypts to your model, followed by the name of the field, and for rich text fields adding the encrypted: true option will do the job.
Let’s now create a secret agent and see encryption in action.
    irb(main):001:0> SecretAgent.create!(code_name: 'Solo', real_name: 'Napoleon', description: 'Number One in Section Two at U.N.C.L.E. Lorem ipsum...')
       (1.1ms)  SELECT sqlite_version(*)
      TRANSACTION (0.0ms)  begin transaction
      SecretAgent Create (0.9ms)  INSERT INTO "secret_agents" ("code_name", "real_name", "created_at", "updated_at") VALUES (?, ?, ?, ?)  [["code_name", "Solo"], ["real_name", "{\"p\":\"Ht4BbvSaLCU=\",\"h\":{\"iv\":\"299ws9hqfjF+eXUf\",\"at\":\"9EiGfa7j2aBPXRkgur0hcg==\"}}"], ["created_at", "2022-07-22 09:54:43.013796"], ["updated_at", "2022-07-22 09:54:43.013796"]]
      ActionText::EncryptedRichText Create (0.7ms)  INSERT INTO "action_text_rich_texts" ("name", "body", "record_type", "record_id", "created_at", "updated_at") VALUES (?, ?, ?, ?, ?, ?)  [["name", "description"], ["body", "{\"p\":\"UsUXo7vfEvbW2V1KQCtADDB7JLhkL1uL8k0VUZwB5a76pvrdLoBbEM0Jbs2RW0ivcFd9mg6+\",\"h\":{\"iv\":\"MJJD7J+2ImMrgvoN\",\"at\":\"odKONEGydkN/btDLq4lfRQ==\"}}"], ["record_type", "SecretAgent"], ["record_id", 1], ["created_at", "2022-07-22 09:54:43.041322"], ["updated_at", "2022-07-22 09:54:43.041322"]]
      ActiveStorage::Attachment Load (0.4ms)  SELECT "active_storage_attachments".* FROM "active_storage_attachments" WHERE "active_storage_attachments"."record_id" = ? AND "active_storage_attachments"."record_type" = ? AND "active_storage_attachments"."name" = ?  [["record_id", 1], ["record_type", "ActionText::RichText"], ["name", "embeds"]]
      SecretAgent Update (0.1ms)  UPDATE "secret_agents" SET "updated_at" = ? WHERE "secret_agents"."id" = ?  [["updated_at", "2022-07-22 09:54:43.043649"], ["id", 1]]
      TRANSACTION (0.9ms)  commit transaction
    =>
    #<SecretAgent:0x0000000106474058
     id: 1,
     code_name: "Solo",
     real_name: "Napoleon",
     created_at: Fri, 22 Jul 2022 09:54:43.013796000 UTC +00:00,
     updated_at: Fri, 22 Jul 2022 09:54:43.043649000 UTC +00:00>
As you can see, the encryption process is completely transparent to you. If you check the log output, Active Record persisted the real_name and description fields encrypted to the database. But if you later retrieve one of these values, it appears in its original form, as if no encryption ever happened!
    irb(main):002:0> SecretAgent.first.real_name
      SecretAgent Load (0.2ms)  SELECT "secret_agents".* FROM "secret_agents" ORDER BY "secret_agents"."id" ASC LIMIT ?  [["LIMIT", 1]]
    => "Napoleon"
Let’s now look at the persisted value for real_name from the transaction log in more detail. Notice that the persisted value is a hash serialized as a JSON string that contains some interesting keys:
    "{\"p\":\"Ht4BbvSaLCU=\",\"h\":{\"iv\":\"299ws9hqfjF+eXUf\",\"at\":\"9EiGfa7j2aBPXRkgur0hcg==\"}}"
In this hash, p represents the encrypted value of real_name, while h is another hash containing some headers related to the encryption process. The first one, iv, stands for initialization vector, while at represents the auth_tag that will be used during the decryption process to check that the encrypted string has not been modified. There could be other headers within this hash depending on the configuration options you have set for Active Record Encryption. If you are curious about what they mean you can check them here.
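If you want to poke at such a payload yourself, the following sketch parses the value logged above using only Ruby's standard library. It assumes the values are Base64-encoded, which is what their shape suggests, and it only inspects sizes, since decrypting would require the key.

```ruby
require "json"

# The value persisted for real_name, copied from the log above.
stored = '{"p":"Ht4BbvSaLCU=","h":{"iv":"299ws9hqfjF+eXUf","at":"9EiGfa7j2aBPXRkgur0hcg=="}}'

envelope   = JSON.parse(stored)
ciphertext = envelope["p"].unpack1("m")            # Base64-decode the payload
iv         = envelope.dig("h", "iv").unpack1("m")  # initialization vector
auth_tag   = envelope.dig("h", "at").unpack1("m")  # authentication tag

puts ciphertext.bytesize # 8, the same length as "Napoleon"
puts iv.bytesize         # 12
puts auth_tag.bytesize   # 16
```

The ciphertext has the same length as the plain text, so the length of a short value can still be guessed from what is stored.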
How to query data
So far we have seen non-deterministic encryption, but what if you wanted to query some encrypted field? You would like the encryption process to always generate the same payload so that you can query it. For this case you must use deterministic encryption.
Using deterministic encryption in your model is as easy as adding the deterministic: true option to the field’s encrypts declaration. Let’s see it in action.
Imagine you had a User model with unique email addresses and you wanted to have the user emails encrypted, yet still be able to query by email. First of all, the migration would look like this:
    class CreateUsers < ActiveRecord::Migration[7.0]
      def change
        create_table :users do |t|
          t.string :email

          t.timestamps
        end

        add_index :users, :email, unique: true
      end
    end
And then in the model you would indicate that email is an encrypted field, but this time you would also have to add the deterministic: true option so that you can later query the data.
    class User < ApplicationRecord
      encrypts :email, deterministic: true
    end
Let’s now add some data:
       (1.7ms)  SELECT sqlite_version(*)
      TRANSACTION (0.0ms)  begin transaction
      User Create (0.8ms)  INSERT INTO "users" ("email", "created_at", "updated_at") VALUES (?, ?, ?)  [["email", "{\"p\":\"O5y1kMItWX/DU3kbYD5gE0SPTg==\",\"h\":{\"iv\":\"+DROpwf9z9DRWOqh\",\"at\":\"Wm7Lz3WOIyJvr8qsDvjxqA==\"}}"], ["created_at", "2022-07-22 10:27:40.629757"], ["updated_at", "2022-07-22 10:27:40.629757"]]
      TRANSACTION (1.0ms)  commit transaction
    => #<User:0x0000000111cde2b0 id: 1, email: "[email protected]", created_at: Fri, 22 Jul 2022 10:27:40.629757000 UTC +00:00, updated_at: Fri, 22 Jul 2022 10:27:40.629757000 UTC +00:00>
      TRANSACTION (0.1ms)  begin transaction
      User Create (0.8ms)  INSERT INTO "users" ("email", "created_at", "updated_at") VALUES (?, ?, ?)  [["email", "{\"p\":\"dfv2T3gqbs9lU6i8dowc5WQ=\",\"h\":{\"iv\":\"dkC/6ugCgwRYZ23E\",\"at\":\"euwT7DwZP41oJFwdZ90Jnw==\"}}"], ["created_at", "2022-07-22 10:27:57.734062"], ["updated_at", "2022-07-22 10:27:57.734062"]]
      TRANSACTION (1.5ms)  commit transaction
    => #<User:0x0000000111defb18 id: 2, email: "[email protected]", created_at: Fri, 22 Jul 2022 10:27:57.734062000 UTC +00:00, updated_at: Fri, 22 Jul 2022 10:27:57.734062000 UTC +00:00>
Now you should be able to query by email as you usually do. Note that the encryption process remains transparent as before:
      User Load (0.2ms)  SELECT "users".* FROM "users" WHERE "users"."email" = ? LIMIT ?  [["email", "{\"p\":\"O5y1kMItWX/DU3kbYD5gE0SPTg==\",\"h\":{\"iv\":\"+DROpwf9z9DRWOqh\",\"at\":\"Wm7Lz3WOIyJvr8qsDvjxqA==\"}}"], ["LIMIT", 1]]
    => #<User:0x000000011211e370 id: 1, email: "[email protected]", created_at: Fri, 22 Jul 2022 10:27:40.629757000 UTC +00:00, updated_at: Fri, 22 Jul 2022 10:27:40.629757000 UTC +00:00>
    irb(main):004:0> user.email
As you can see, Active Record now encrypts the search value first and then returns the record that matches it.
Migrating from unencrypted to encrypted data
Nobody likes dealing with legacy data.
However, there will be times when you decide to start encrypting a field that already has some unencrypted records. While your goal should be to have everything encrypted in the long run, in the meantime you can set config.active_record.encryption.support_unencrypted_data to true so that Active Record Encryption starts encrypting new values but does not raise an error when it reads legacy ones. Once you have finished encrypting the legacy records, you should remove this configuration option.
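As a sketch of how such a migration could look, the fragment below combines the temporary setting with a one-off backfill. It is a Rails configuration and console fragment, so it will not run on its own, and it assumes the SecretAgent model from this post; adapt the model name and batching to your own application.

```ruby
# config/application.rb (inside your application class).
# Temporary: remove it once all legacy records have been encrypted.
config.active_record.encryption.support_unencrypted_data = true

# One-off backfill, for example from a Rails console or a rake task.
# Assumes the SecretAgent model declared earlier in this post.
SecretAgent.find_each do |agent|
  # Encrypts the record's encryptable attributes and saves the changes.
  agent.encrypt
end
```

On a large table you would probably want to run the backfill in a background job so that it does not compete with regular traffic.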
Differences between Active Record Encryption and has_secure_password
You might have already used Rails’ has_secure_password in your application to store passwords securely. Let’s be clear from the start: it is a completely different feature that solves a completely different problem.
The main difference from Active Record Encryption is that has_secure_password is focused only on securing passwords and not other types of data. Because of this, what has_secure_password does is not reversible: once a password has been hashed with the BCrypt algorithm, you cannot get its original plain text version back. This is great, since for security reasons you do not want anyone to see passwords in their original form.
If you want to learn more about has_secure_password please take a look at my previous post.
Advanced settings
There are many other options you can set for Active Record Encryption. Since covering them all would make this post too long, please check the Rails guides to learn about them. If you have any questions or you want me to talk about any of them in detail, please leave me a comment below.
And that is basically it!
If you enjoyed this post, do not forget to subscribe so that you do not miss any future updates. If you want me to write about something in particular, please let me know in the comment section below.
See you in my next post! Adiós!