CS-FLE in MongoAtlas on Azure

Surbhi Nijhara
8 min read · Dec 5, 2020


This blog aims to make users of client-side field level encryption (CS-FLE) in MongoDB aware of its challenges and limitations.

Encryption can be done both at rest and at field level. Here we will first look briefly at the supported methods of field level encryption in MongoDB, and then outline the current limitations we encountered for our business case in two areas: the automatic method of encryption, and key creation and storage in MongoDB Atlas on Azure (managed MongoDB hosted on the Azure cloud).

To begin with, let us quickly brush up on the basics of encryption keys.

Keys

Typically, two types of keys are involved in encrypting data stored in any database, and the strategy is popularly known as ‘envelope encryption’.

  1. Customer Master Key — CMK
  2. Data Encryption Key — DEK

The Customer Master Key (CMK) is a key that is created to protect the data encryption key. A CMK can be an asymmetric or symmetric key. It is stored in a KMS (Key Management Service) or a key vault, e.g. a cloud-based KMS like AWS KMS or Azure Key Vault, or HashiCorp Vault.

In the context of MongoDB, a ‘local’ KMS is supported in addition to AWS KMS, where the key can be stored in a local filesystem.

The Data Encryption Key (DEK), or simply encryption key, is mostly a symmetric key. The client application encrypts and decrypts data using a DEK. In the MongoDB context, the DEK is stored in the MongoDB key vault, which is essentially a collection in MongoDB.

So in MongoDB, the DEK is secured using the Customer Master Key (CMK). The CMK, in turn, is secured using a KMS, as mentioned previously.

Note: Currently the MongoDB driver natively supports only AWS KMS (preferred) or a local KMS. For any other key manager, the approach is to fetch the key through remote service calls to that KMS provider and then hand it to the driver as a local key.

How the CMK and DEK can be created and stored in Azure Key Vault and a MongoDB collection respectively is explained later in the section Key Creation Steps.

Let us next see what it takes to implement this.

Client Side Field Encryption

Like most databases, MongoDB supports Client Side Field Encryption, which provides encryption at field level granularity.

There are two methods supported for CS-FLE: Manual and Automatic.

MongoDB Community and Enterprise editions support manual field encryption, while automatic field encryption is only supported in Enterprise editions.
Enterprise editions include the cloud-managed MongoDB Enterprise edition, i.e. MongoDB Atlas.

Manual Field Encryption

In the manual method, the client application has to use the driver encryption library explicitly to encrypt and decrypt, which is why it is also known as the explicit method.

The sample code for implementing manual CS-FLE can be found here.

CS-FLE (Manual)

Automatic Field Encryption

CS-FLE (Automatic — uses a daemon mongocryptd)

In the automatic method, the approach is slightly different: instead of the application code, a daemon inspects the read/write operations to construct the encryption/decryption logic. The application code only has to provide the encryption rules as configuration for the daemon to process.

MongoDB Enterprise or MongoDB Atlas supports Automatic Client-Side Field Level Encryption with the help of the daemon mongocryptd.
The daemon should be installed on the host according to its type, viz. a VM (virtual machine) or a dockerized environment, and should become part of the IaC (Infrastructure as Code) configuration.

A schema is required for Automatic Field Level Encryption, and it is expressed in an extended version of JSON Schema. The schema contains the following information:

● The algorithm

● The data encryption key

● BSON Type of the field (required for Deterministic Encrypted Fields).

This schema can be configured at two levels:

a) Client Application level: This is enforced at the MongoDB driver level and automatically encrypts or decrypts the data. Example is here and can be seen from line 80 onwards.

b) Database level: If the client fails to encrypt the data, the driver will retrieve the schema configured on the server side and use it to perform automatic encryption.
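To make the schema format more concrete, here is a minimal Python sketch that assembles such a schema as a plain dictionary. It is not taken from the linked example: the field names `ssn` and `medicalNotes`, the helper names, and the all-zero key UUID are hypothetical placeholders. A real `keyId` would be the UUID of a data encryption key stored in your key vault collection.

```python
import base64
import json
import uuid

# Algorithm identifiers used by MongoDB client-side field level encryption.
DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
RANDOM = "AEAD_AES_256_CBC_HMAC_SHA_512-Random"


def key_id_ejson(key_uuid: uuid.UUID) -> dict:
    """Render a data key UUID as an Extended JSON binary value (subtype 04)."""
    return {
        "$binary": {
            "base64": base64.b64encode(key_uuid.bytes).decode("ascii"),
            "subType": "04",
        }
    }


def encrypted_field(key_uuid: uuid.UUID, bson_type: str, algorithm: str) -> dict:
    """Build the 'encrypt' rule for one field: key, BSON type, and algorithm."""
    return {
        "encrypt": {
            "keyId": [key_id_ejson(key_uuid)],
            "bsonType": bson_type,
            "algorithm": algorithm,
        }
    }


def build_schema(key_uuid: uuid.UUID) -> dict:
    """Assemble a schema for a hypothetical collection with two encrypted fields."""
    return {
        "bsonType": "object",
        "properties": {
            # Deterministic: the same plaintext always yields the same ciphertext.
            "ssn": encrypted_field(key_uuid, "string", DETERMINISTIC),
            # Randomized: ciphertext differs on every write.
            "medicalNotes": encrypted_field(key_uuid, "string", RANDOM),
        },
    }


if __name__ == "__main__":
    placeholder_key = uuid.UUID("00000000-0000-0000-0000-000000000000")
    print(json.dumps(build_schema(placeholder_key), indent=2))
```

Each field carries the three pieces of information listed above. How the resulting document is handed to the driver or the server depends on the driver in use, so that part is left out of the sketch.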

Another view — CS-FLE (Automatic — uses a daemon mongocryptd)

Diagram source is here.

Automatic FLE Challenges

While automatic field level encryption seems the first and obvious choice given that it is less verbose, check the following limitations against your business requirements early on:
a) Encrypting individual elements of an array is not possible at this time. The entire array field needs to be encrypted.
b) Even if encrypting the entire array field is acceptable from a business perspective, the ‘Deterministic’ encryption type is not allowed for it. Only ‘Randomized’ is supported.
The reference can be seen at Automatic Encryption Rules.
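The array restriction can be caught before deployment with a small check over the schema. The following is only a sketch under assumptions: the function name and the sample fields `phones` and `tags` are hypothetical, only top-level properties are inspected, and `keyId` is omitted from the rules for brevity.

```python
DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
RANDOM = "AEAD_AES_256_CBC_HMAC_SHA_512-Random"


def array_rule_violations(schema: dict) -> list:
    """Return the names of top-level array fields set to deterministic encryption."""
    violations = []
    for name, spec in schema.get("properties", {}).items():
        rule = spec.get("encrypt")
        if rule is None:
            continue  # field is not encrypted, nothing to check
        if rule.get("bsonType") == "array" and rule.get("algorithm") == DETERMINISTIC:
            violations.append(name)
    return violations


# Hypothetical schema: one valid array rule and one invalid one.
schema = {
    "bsonType": "object",
    "properties": {
        "phones": {"encrypt": {"bsonType": "array", "algorithm": RANDOM}},
        "tags": {"encrypt": {"bsonType": "array", "algorithm": DETERMINISTIC}},
    },
}

print(array_rule_violations(schema))  # prints ['tags']
```

A check like this could run in a build pipeline so that an unsupported combination surfaces before it reaches the database.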

Overall CS-FLE Challenges

Overall limitations while using Azure Key Vault (AKV) as the key store for client side field level encryption are as follows.

● Currently MongoDB Atlas natively supports only AWS KMS and local keystore for client-side field level encryption. Ref: Master Key and Data Encryption Key Management

● To use Azure Key Vault (AKV) to store the master encryption key, the key needs to be fetched and proxied through the local key store configuration so that the MongoDB drivers can use it. Ref: CS-FLE FAQ

● The local key store requires a key of exactly 96 bytes. The MongoDB documentation demonstrates samples that use the Java SecureRandom API to generate the key. The default algorithm depends on the platform and JVM configuration (SHA1PRNG is one possibility), and the other supported algorithms are listed in the Oracle Java API documentation. Other tools, such as ‘openssl’ or /dev/urandom, can also generate the key for local key stores using cryptographic randomness.

If an AES-128/192/256 key is generated and stored as a secret in AKV, the client application has to pad it to 96 bytes after retrieval, before passing it to the local key store.
This can be avoided altogether: the MongoDB team recommends generating a secure random key of 96 bytes directly.

● Additionally, the generated key needs to be stored as a ‘Secret’ in AKV because the ‘Key’ object in AKV only supports asymmetric keys (in RSA or EC format). Storing the master key as a secret means that key rotation and other key management policies are not available.

● While different fields can be configured with different keys, the same field cannot use different keys for different values, i.e. only one CMK, not multiple CMKs, can be used to encrypt the values of a given field. The reason is that the key is associated with the schema itself: a DEK is created using a CMK, a KeyId is generated for it, and that KeyId is configured against the field in the schema.

Key Creation Steps

Though the MongoDB team has documented the steps here, the end-to-end steps are stated below, assuming a CMK (Customer Master Key) exists in Azure Key Vault and a DEK (Data Encryption Key) has to be created. Azure CLI and the Mongo shell can be used for this.

From a security access perspective, it is recommended that the keys be created beforehand by a privileged team, and that the application only reads the required key.

Create a CMK to store in Azure KMS

Using MongoDB shell as below:

Command:

TEST_LOCAL_KEY=$(echo "$(head -c 96 /dev/urandom | base64 | tr -d '\n')")
mongo --nodb --shell --eval "var TEST_LOCAL_KEY='$TEST_LOCAL_KEY'"

OR

openssl rand -base64 96 | tr -d '\n' > mongodb-keyfile

Note: If you want to use a programming library like Java, see CS-FLE in Java where SecureRandom Java API is used.
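For readers who prefer a scripting language, the same 96-byte key can be produced with Python's standard library. This is a minimal sketch rather than an official MongoDB sample. The function and constant names are hypothetical, and the Base64 output is meant to match what the shell command above produces.

```python
import base64
import secrets

# The local key store expects a master key of exactly 96 bytes.
LOCAL_MASTER_KEY_BYTES = 96


def generate_local_master_key() -> str:
    """Return a Base64 string encoding 96 cryptographically random bytes."""
    raw = secrets.token_bytes(LOCAL_MASTER_KEY_BYTES)
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Print once and store the value securely; do not log it in real use.
    print(generate_local_master_key())
```

The `secrets` module draws from the operating system's cryptographic random source, the same kind of source the /dev/urandom command relies on.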

Store the generated secure key in Azure KMS as a Secret object using Azure CLI or other methods.

Create a DEK to store in MongoDB Vault

1. Log in to Azure

az login --service-principal --username <username> --password <password> --tenant <tenant-id>

2. Get the Secret

az keyvault secret show --name "<column-masterkey-name>" --vault-name "<key vault name>"

This command outputs the secret, including its value.
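Before passing the retrieved secret on, it can help to confirm that it really decodes to 96 bytes. The sketch below assumes the secret was stored as the Base64 text of the key, and that the JSON printed by the command above carries it in its `value` property. The function name is hypothetical.

```python
import base64
import json


def master_key_from_az_output(az_json: str) -> bytes:
    """Extract and validate the master key from `az keyvault secret show` JSON."""
    secret = json.loads(az_json)
    # Assumption: the secret value is the Base64 text of the raw key.
    key = base64.b64decode(secret["value"], validate=True)
    if len(key) != 96:
        raise ValueError(f"expected a 96-byte master key, got {len(key)} bytes")
    return key


if __name__ == "__main__":
    import sys

    # Example: az keyvault secret show ... | python check_key.py
    master_key = master_key_from_az_output(sys.stdin.read())
    print(f"master key OK ({len(master_key)} bytes)")
```

Failing early here gives a clearer error than letting a wrongly sized key reach the driver's local key store configuration.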

3. Go to Shell

export TEST_LOCAL_KEY=<value from above>
mongo --nodb --shell --eval "var TEST_LOCAL_KEY='$TEST_LOCAL_KEY'"

var ClientSideFieldLevelEncryptionOptions = {
  "keyVaultNamespace" : "encryption.__dataKeys",
  "kmsProviders" : {
    "local" : {
      "key" : BinData(0, TEST_LOCAL_KEY)
    }
  }
}

// Change to the required connection URI
encryptedClient = Mongo(
  "mongodb+srv://<user>:<pw>@host",
  ClientSideFieldLevelEncryptionOptions
)

keyVault = encryptedClient.getKeyVault()
keyVault.createKey("local", ["data-encryption-key"])

With the above, the encryption.__dataKeys collection should get created, as seen in the sample below.

Reference Schema:

Limitation:
If MongoDB Atlas is used, you won’t be able to run the --nodb --eval options in a Mongo shell connection for Atlas, since there is no access to a local terminal or bash shell. Executing the eval command is unsupported in Atlas; refer to the documentation. Instead, we need to use a compatible MongoDB driver in the client-side application; refer to the documentation.

The created data encryption key can now be used in your application.

Recap

We have seen above that, to encrypt a data field in MongoDB Atlas on Azure using Azure KMS as the third-party key vault, we did the following:

  1. Created a CMK for the MongoDB locally managed keyfile.

The CMK is a secure cryptographic key. Instead of being stored in a file on a local path, the generated key is stored in Azure KMS as a secret.

  2. Created a DEK by fetching the CMK secret and using the CMK to encrypt it. Had remote KMS support existed for Azure, the library would have passed the key to the KMS for encryption/decryption; in that model the CMK never leaves the KMS, which makes it more secure.

The DEK is a symmetric key which needs to be stored in MongoDB. Preferably, store the DEK in a separate database and not in the same collection as your business data.

  3. Finally, the field gets encrypted using the AES-256-CBC algorithm combined with a MAC. More details can be found at Encryption Algorithm.

Good Reads

  1. Supported Remote KMS (like AWS) to store CMK
  2. https://medium.com/swlh/beef-up-your-mongodb-security-with-client-side-field-level-encryption-ffad06abe4ba
  3. https://medium.com/@visweshwar/client-side-field-level-encryption-csfle-on-mongodb-community-ed-using-java-spring-a6489a5c9146
  4. https://medium.com/@visweshwar/mongodb-client-side-field-level-encryption-using-java-spring-part-2-community-edition-manual-2e9775e8c169
  5. CS-FLE Limitations
