Deploying datastores for IoT & Big Data: mongoDB on K8s. Part 2
This blog post series is intended to give an overview of how datastores capable of supporting high volumes of data from IoT devices and Big Data services can be deployed on Kubernetes. In the first part of this series, we used the StatefulSet primitive to set up and deploy a mongoDB Replica Set (cluster). This part, Part 2, demonstrates how other Kubernetes primitives, such as Secret, can be applied to secure our initial, dummy deployment. Part 3 of this series will explain how to shard and further secure our mongoDB cluster.
It is assumed that you already have an up and running K8s environment, such as minikube. All the examples have been developed using minikube on macOS Catalina with VirtualBox.
For Part 2 we will be using the sec-datastores K8s namespace. The same headless Service and ConfigMap that we used in Part 1 need to be created under this namespace.
The following requirements are in scope for this part of the series:
The communication between replicas must be authenticated.
DB clients must be authenticated. A user/pass scheme is acceptable.
DB clients must connect to the DB through a secure and trusted channel (TLS).
The following requirements are out of scope, but might be addressed in future parts of this series:
The communication between replicas must take place through an authenticated channel based on TLS.
The communication between the DB and DB clients must be through mutual-TLS.
🏰 Securing mongoDB Replication
Generating a key for the Replica Set
The first step to ensure authentication between replicas is to define a secret replica key that replicas present to each other when submitting replication deltas. A new random key (of 1024 characters) can be generated and encoded in base64 as follows:
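One way to do this with openssl (file names are illustrative): 768 random bytes encode to exactly 1024 base64 characters, the maximum length mongoDB accepts for a keyfile.

```shell
# 768 random bytes -> 1024 base64 characters (mongoDB keyfile maximum)
openssl rand -base64 768 | tr -d '\n' > replica.key

# Encode the key itself in base64 so it can be embedded in a K8s Secret
base64 < replica.key | tr -d '\n' > replica.key.b64
```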
Generating a password for the mongoDB root user
We can generate a cryptographically secure password of 16 chars for the root user as follows:
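For example, with openssl: 12 random bytes encode to exactly 16 base64 characters.

```shell
# 12 random bytes -> a 16-character base64 password
openssl rand -base64 12
```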
Creating K8s Secret for mongoDB
The password of the root user (jmcf) and the replica key will be stored in a K8s Secret.
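A Secret along these lines can hold both values (the Secret name mongo-secret and the property names are illustrative; the base64 payloads are placeholders for the values generated above):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mongo-secret
  namespace: sec-datastores
type: Opaque
data:
  # base64-encoded replica key (content of replica.key.b64)
  replica.key: <base64-encoded-replica-key>
  # base64-encoded root user password
  password: <base64-encoded-password>
```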
📌 The Secret data properties will be made available to the mongoDB containers through a mounted Volume.
Securing StatefulSet of Part 1
In part 1 we defined an initial version of our StatefulSet that can be extended as follows:
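A sketch of the relevant additions, assuming the Secret is named mongo-secret and the Replica Set rs0 (the env vars of the official mongo image create the root user and enable authentication):

```yaml
# Only the parts added to the Part 1 StatefulSet are shown
spec:
  template:
    spec:
      containers:
        - name: mongo-db
          image: mongo:4.2.6
          command:
            - mongod
            - "--replSet=rs0"
            - "--bind_ip_all"
            - "--auth"
            - "--keyFile=/etc/mongodb-keys/replica.key"
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              value: jmcf
            - name: MONGO_INITDB_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mongo-secret
                  key: password
          volumeMounts:
            - name: mongo-secret-volume
              mountPath: /etc/mongodb-keys
              readOnly: true
      volumes:
        - name: mongo-secret-volume
          secret:
            secretName: mongo-secret
```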
Fixing the key file permissions problem
However, if you apply the K8s manifest above, you will find that, unfortunately, the Pods are not running. We can debug what is happening by running:
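For instance (assuming the first Pod is named mongo-db-0), mongod refuses to start when the keyfile permissions are too open:

```shell
kubectl -n sec-datastores get pods
kubectl -n sec-datastores describe pod mongo-db-0
kubectl -n sec-datastores logs mongo-db-0
```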
Although you might initially think that the permissions problem can be fixed by using the defaultMode field of the Secret Volume declaration, it cannot (as of K8s 1.18 and mongoDB 4.2.6). The solution is to run a Pod initialization container that simply copies the required files, with the proper permissions, to a new Volume, which is the one actually consumed by the mongoDB container.
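A sketch of that approach, using the container and Volume names described in the notes below (the uid 999 corresponds to the mongodb user of the official image; mount paths and the Secret name mongo-secret are assumptions):

```yaml
spec:
  template:
    spec:
      initContainers:
        - name: set-file-permissions
          image: busybox
          command:
            - sh
            - -c
            - >
              cp /etc/mongodb-keys-ro/replica.key /etc/mongodb-keys/ &&
              chown 999:999 /etc/mongodb-keys/replica.key &&
              chmod 0400 /etc/mongodb-keys/replica.key
          volumeMounts:
            - name: mongo-secret-volume
              mountPath: /etc/mongodb-keys-ro
            - name: secret-volume
              mountPath: /etc/mongodb-keys
      containers:
        - name: mongo-db
          # ... as before, but now mounting secret-volume
          volumeMounts:
            - name: secret-volume
              mountPath: /etc/mongodb-keys
              readOnly: true
      volumes:
        - name: mongo-secret-volume
          secret:
            secretName: mongo-secret
        - name: secret-volume
          emptyDir: {}
```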
📌 The Volumes are shared by all containers belonging to the Pod: the init container (based on busybox), named set-file-permissions, and the mongoDB container, named mongo-db.
📌 The lifetime of the final volume containing secrets, secret-volume, will be the Pod’s lifetime.
📌 Once the initialization command completes, the init container dies. In case of failure, the logs of the init container can be obtained using the --container (-c) option of kubectl logs.
Configuring the mongoDB Replica Set
The next step is to connect to our cluster through the mongoDB shell and configure the Replica Set. Now we need to make use of the root user (jmcf) and the password previously configured as env vars.
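For example (the Pod name mongo-db-0 is an assumption based on the StatefulSet conventions of this series):

```shell
# Open an authenticated mongo shell against the first replica
kubectl -n sec-datastores exec -it mongo-db-0 -- \
  mongo -u jmcf -p "$MONGODB_ROOT_PASSWORD" --authenticationDatabase admin
```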
📌 So far we have not configured TLS so our DB connection will be through an insecure channel.
After checking that we can make an authenticated connection, we can execute the Replica Set configuration script and verify that replication is working properly using the replica key provided. We are now ensuring that the members of our Replica Set only accept data from parties that know their shared secret (the replica key).
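A minimal version of that configuration script, run inline via --eval (the Pod names, the headless Service name mongodb-service, and the Replica Set name rs0 are assumptions based on Part 1 conventions):

```shell
kubectl -n sec-datastores exec -it mongo-db-0 -- \
  mongo -u jmcf -p "$MONGODB_ROOT_PASSWORD" --authenticationDatabase admin --eval '
    rs.initiate({
      _id: "rs0",
      members: [
        { _id: 0, host: "mongo-db-0.mongodb-service.sec-datastores.svc.cluster.local:27017" },
        { _id: 1, host: "mongo-db-1.mongodb-service.sec-datastores.svc.cluster.local:27017" },
        { _id: 2, host: "mongo-db-2.mongodb-service.sec-datastores.svc.cluster.local:27017" }
      ]
    });
    rs.status();
  '
```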
🔒 Setting up the TLS layer
In order to meet our initial requirements, a TLS layer has to be set up. In this part, only DB clients must connect to the DB through TLS. In future parts we will show how the replicas of the mongoDB cluster could also authenticate each other using mutual TLS.
Create a CSR to be signed by the K8s CA
The first step is to generate a new RSA private key (2048 bits) as follows:
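For instance, with openssl (the file name is illustrative):

```shell
# Generate a 2048-bit RSA private key
openssl genrsa -out mongodb.key 2048
```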
Once we have our private key, we need to generate a new Certificate. An interesting approach is to have the Certificate signed by the Kubernetes CA itself, as that CA is known by all Pods through their default Service Account.
First of all we need to generate a new Certificate Signing Request (CSR) for the private key generated above:
The openssl.conf file is needed because it is convenient to generate the CSR using an X509 extension named SAN (Subject Alternative Name), which allows one certificate to be associated with more than one DNS name.
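A sketch of both the config file and the CSR generation (the DNS names assume the Pod and headless-Service naming conventions used in this series; the CN is illustrative):

```shell
# openssl.conf: request the SAN extension so one certificate covers all replicas
cat > openssl.conf <<'EOF'
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[v3_req]
subjectAltName = @alt_names
[alt_names]
DNS.1 = mongo-db-0.mongodb-service.sec-datastores.svc.cluster.local
DNS.2 = mongo-db-1.mongodb-service.sec-datastores.svc.cluster.local
DNS.3 = mongo-db-2.mongodb-service.sec-datastores.svc.cluster.local
EOF

# Generate the CSR with the private key created above
openssl req -new -key mongodb.key -out mongodb.csr \
  -subj "/CN=mongodb-service.sec-datastores.svc.cluster.local" \
  -config openssl.conf
```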
We can inspect the content of our CSR as follows:
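Assuming the CSR file is named mongodb.csr:

```shell
# Dump the CSR in human-readable form (subject, public key, SAN extension)
openssl req -in mongodb.csr -noout -text
```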
We could sign our certificate directly with the minikube CA (you can find the CA's certificate at $HOME/.minikube/ca.crt). However, a cleaner option is to generate the final signed certificate through a standard K8s manifest for Certificate Signing Requests:
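A sketch of such a manifest for the K8s 1.18 CSR API (the CSR object name is illustrative, and the request payload is a placeholder for the base64-encoded mongodb.csr):

```yaml
apiVersion: certificates.k8s.io/v1beta1
kind: CertificateSigningRequest
metadata:
  name: mongodb-csr
spec:
  # Single-line base64 of the CSR, e.g. the output of: base64 < mongodb.csr | tr -d '\n'
  request: <base64-encoded-content-of-mongodb.csr>
  usages:
    - digital signature
    - key encipherment
    - server auth
```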
We can approve the CSR as follows:
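Assuming the CSR object is named mongodb-csr:

```shell
kubectl certificate approve mongodb-csr
```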
After the certificate has been approved we can download it as follows:
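The signed certificate is published in the CSR object's status, from where it can be extracted and decoded (file names are illustrative):

```shell
kubectl get csr mongodb-csr -o jsonpath='{.status.certificate}' \
  | base64 --decode > mongodb.crt
```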
We can inspect the contents of our brand new certificate as follows:
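Assuming the certificate was saved as mongodb.crt:

```shell
# Dump the certificate: issuer, validity period, subject and SAN entries
openssl x509 -in mongodb.crt -noout -text
```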
Now we have all we need to set up TLS for our mongoDB cluster!
Extending StatefulSet to support TLS
First of all, we need to extend our Secret to include the private key and the certificate. mongoDB requires both to be concatenated in the same file:
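With the file names used above (mongodb.pem is the keycert file):

```shell
# Key first, then certificate, in a single PEM file
cat mongodb.key mongodb.crt > mongodb.pem
```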
We add the keycert file content as a base64-encoded Secret property:
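One way to do it, assuming the Secret is named mongo-secret and the new property keycert:

```shell
# Single-line base64 of the concatenated key + certificate
base64 < mongodb.pem | tr -d '\n' > mongodb.pem.b64

# Add the keycert property to the existing Secret via a JSON patch
kubectl -n sec-datastores patch secret mongo-secret --type=json \
  -p "[{\"op\":\"add\",\"path\":\"/data/keycert\",\"value\":\"$(cat mongodb.pem.b64)\"}]"
```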
We can double-check that our keycert has been properly stored as a K8s secret:
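For example, by reading the property back and decoding it (Secret and property names as assumed above):

```shell
kubectl -n sec-datastores get secret mongo-secret -o jsonpath='{.data.keycert}' \
  | base64 --decode | head -n 3
```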
📌 We are assuming that Secrets are stored base64-encoded in the Secret store, which should not be the case in a production environment!
And now we need to extend our StatefulSet definition to provide the different TLS parameters:
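A sketch of the relevant command-line additions (mongoDB 4.2 flag names; the keycert path assumes the init container copies it, with proper permissions, into secret-volume, and the CA file is the one every Pod gets through its Service Account):

```yaml
containers:
  - name: mongo-db
    command:
      - mongod
      - "--replSet=rs0"
      - "--bind_ip_all"
      - "--auth"
      - "--keyFile=/etc/mongodb-keys/replica.key"
      - "--tlsMode=requireTLS"
      - "--tlsCertificateKeyFile=/etc/mongodb-keys/keycert"
      - "--tlsCAFile=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
      - "--tlsAllowConnectionsWithoutCertificates"
```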
📌 The permissions of the keycert file have to be properly set by the init container.
📌 It is very easy and convenient to point to the CA file as it is always available as part of the Service Account of our Pods.
📌 --tlsAllowConnectionsWithoutCertificates allows clients to connect to the DB using just a user/pass.
Connecting to the Cluster through TLS
We can connect to the cluster through TLS as follows:
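For example, from inside the cluster (Pod and Service names as assumed throughout this part):

```shell
kubectl -n sec-datastores exec -it mongo-db-0 -- \
  mongo --tls \
    --tlsCAFile /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
    --host mongo-db-0.mongodb-service.sec-datastores.svc.cluster.local \
    -u jmcf -p "$MONGODB_ROOT_PASSWORD" --authenticationDatabase admin
```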
📌 We are using the FQDN of the mongoDB Pod so that there is a host match at the TLS layer.
Kubernetes provides powerful primitives to deploy a secured, clustered mongoDB datastore service, so that we can give production-grade support to IoT and Big Data applications that demand high scalability.