Deploying datastores for IoT & Big Data: mongoDB on K8s. Part 1

🎬 Introduction

This blog post series is intended to give an overview of how datastores capable of supporting high volumes of data from IoT devices and Big Data services can be deployed on Kubernetes. To start with, the StatefulSet primitive will be used to set up and deploy a mongoDB Replica Set (cluster). Part 2 demonstrates how other Kubernetes primitives such as Secret can be applied to secure our initial, dummy deployment. Part 3 of this series explains how to shard and further secure our mongoDB cluster.

Prerequisites

It is assumed that you already have an up and running K8s environment, such as minikube. All the examples have been developed using minikube on macOS Catalina with VirtualBox.
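
If you do not have a cluster at hand, a local one can be started with minikube. The driver and resource flags below are just an example and depend on your local setup:

minikube start --driver=virtualbox --memory=4096 --cpus=2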

First of all, a new, clean namespace named datastores has to be created for this part of the series.

kubectl create namespace datastores

📖 StatefulSet Primitive

A StatefulSet is a K8s Controller that manages the deployment and scaling of a set of Pods based on an identical container spec. However, unlike the Pods managed by a Deployment Controller, these Pods are not interchangeable: each one has a persistent identity that it maintains across any rescheduling.

A StatefulSet has to be associated with a (headless) Service, which exposes its Pods, and with a PersistentVolumeClaim (PVC) template, which provides persistent storage to each Pod.

🖥️ Basic Deployment of a mongoDB Replica Set

First, we need to create a K8s headless Service intended to expose the Pods of our StatefulSet, as follows:

apiVersion: v1
kind: Service
metadata: 
  name: mongo-db-replica
  namespace: datastores
spec: 
  selector: 
    app: mongoDB-replica
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
  clusterIP: None
   

📌  Our Service will be bound to Pods labelled as app: mongoDB-replica.

It is also convenient to set up a ConfigMap to capture the configuration options needed.

apiVersion: v1
kind: ConfigMap
metadata:
  name: mongo-config 
  namespace: datastores
data:
  REPLICA_SET_NAME: replica-blog-1

📌  The name given to our replica set is: replica-blog-1.

Afterwards, we can declare our StatefulSet as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo-db-statefulset
  labels:
    mongoDB-replica: "true"
    mongoDB-secured: "false"
    mongoDB-sharding: "false"
  annotations: 
    author: JMCF
  namespace: datastores
spec: 
  selector: 
    matchLabels:
      app: mongoDB-replica
  serviceName: mongo-db-replica
  replicas: 3
  template:
    metadata: 
      labels: 
        app: mongoDB-replica
    spec: 
      terminationGracePeriodSeconds: 10
      containers: 
        - name: mongo-db
          image: mongo:4.2.6
          ports: 
            - containerPort: 27017
              protocol: TCP
          volumeMounts: 
            - mountPath: /data/db
              name: mongo-volume-for-replica                  
          args: 
            - --replSet
            - $(REPLICA_SET_NAME)
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          envFrom: 
            - configMapRef:
                name: mongo-config
  volumeClaimTemplates: 
    - metadata: 
        name: mongo-volume-for-replica
      spec: 
        accessModes: 
          - ReadWriteOnce
        resources: 
          requests: 
            storage: 100Mi

📌  We need to bind (through the serviceName) the StatefulSet with the Service that was created initially: mongo-db-replica.

📌  Our StatefulSet is composed of 3 replicas, incarnated by 3 distinct Pods.

📌  We run Pods labelled as app: mongoDB-replica in mongoDB’s replica set mode (--replSet).

📌  We mount a volume mongo-volume-for-replica that will be made available through a PVC.

📌  With volumeClaimTemplates we define the template of the PVCs that will be automatically created for each Pod.
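
Assuming the Service, ConfigMap and StatefulSet manifests shown above have been saved to local files (the file names below are just placeholders), they can be applied as follows:

kubectl apply -f mongo-service.yaml
kubectl apply -f mongo-configmap.yaml
kubectl apply -f mongo-statefulset.yaml

Since the namespace datastores is already declared in each manifest's metadata, no extra --namespace flag is needed when applying them.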

After applying the manifests shown above, the status of our K8s cluster will be similar to:

kubectl get statefulset --namespace=datastores
NAME                   READY
mongo-db-statefulset   3/3
kubectl get pods --namespace=datastores --show-labels
NAME                     READY   STATUS   LABELS  
mongo-db-statefulset-0   1/1     Running  app=mongoDB-replica
mongo-db-statefulset-1   1/1     Running  app=mongoDB-replica
mongo-db-statefulset-2   1/1     Running  app=mongoDB-replica
kubectl describe service --namespace=datastores 
Name:              mongo-db-replica
Namespace:         datastores
Labels:            <none>
Annotations:       <none>
Selector:          app=mongoDB-replica
Type:              ClusterIP
IP:                None
Port:              <unset>  27017/TCP
TargetPort:        27017/TCP
Endpoints:         172.17.0.14:27017,172.17.0.15:27017,172.17.0.16:27017

We can ping our Pods by name (as they are already bound to the Service named mongo-db-replica) as follows:

kubectl run tm-pod --namespace=datastores -it --image=busybox --restart=Never --rm=true \
-- ping mongo-db-statefulset-0.mongo-db-replica
64 bytes from 172.17.0.14: seq=1 ttl=64 time=0.126 ms
64 bytes from 172.17.0.14: seq=2 ttl=64 time=0.069 ms

📌  Pods belonging to a StatefulSet are distinguishable and keep their own identity. That is why we can address them by <pod_id>.<service_name>.

📌  The identifier of a Pod belonging to a StatefulSet is formed by concatenating the name of the StatefulSet (mongo-db-statefulset), a dash (-) and an ordinal number (0, 1, 2, etc.).
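
Besides these per-Pod DNS records, the headless Service name itself resolves to the IP addresses of all the Pods behind it. A quick, optional way to observe this (the exact output may vary with the busybox version) is:

kubectl run tm-pod --namespace=datastores -it --image=busybox --restart=Never --rm=true \
-- nslookup mongo-db-replica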

We can observe that 3 different PVCs have been created to satisfy the storage demands of the 3 Pods that compose our mongoDB cluster:

kubectl get pvc --namespace=datastores
NAME                                              STATUS   VOLUME                                    
mongo-volume-for-replica-mongo-db-statefulset-0   Bound    pvc-179b1538-0cf6-4440-812e-64dc6de8b1a3
mongo-volume-for-replica-mongo-db-statefulset-1   Bound    pvc-c5c4aeb9-ed79-46a7-a7de-effc244090b6
mongo-volume-for-replica-mongo-db-statefulset-2   Bound    pvc-a7d8196b-ae94-43fd-8b31-2550b36b2997

📌  The name of each Pod’s PVC is formed by concatenating the name given to the volume claim template (mongo-volume-for-replica), a dash (-) and the name of the Pod that writes to the volume.

Configuring the mongoDB Replica Set

The next step is to use our datastore, for instance through the mongoDB shell client:

kubectl run tm-mongo-pod --namespace=datastores -it --image=mongo:4.2.6 --restart=Never --rm=true -- mongo mongo-db-statefulset-0.mongo-db-replica

At this stage we realize that there is a missing step: the configuration of the mongoDB replica set itself, so that the leader (the Primary Pod of the mongoDB cluster) election can proceed. Executing the following piece of JavaScript code in the mongoDB shell will make it happen:

var config = {
  "_id": "replica-blog-1",
  "members": [
    {
      "_id": 0,
      "host": "mongo-db-statefulset-0.mongo-db-replica.datastores.svc.cluster.local:27017"
    },
    {
      "_id": 1,
      "host": "mongo-db-statefulset-1.mongo-db-replica.datastores.svc.cluster.local:27017"
    },
    {
      "_id": 2,
      "host": "mongo-db-statefulset-2.mongo-db-replica.datastores.svc.cluster.local:27017"
    }
  ]
};

rs.initiate(config);
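
Once the configuration has been applied, we can check which member has been elected Primary. A minimal check, run from the same mongoDB shell session (the election may take a few seconds to complete):

rs.status().members.forEach(function (m) {
  // print each member's address and its current role (PRIMARY / SECONDARY)
  print(m.name + " -> " + m.stateStr);
});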

Afterwards, it can be observed that one of our Pods becomes the Primary while the rest remain Secondary. In my deployment, Pod 1 of the StatefulSet (mongo-db-statefulset-1) was elected as leader. Thus, we can connect to that Pod through the mongoDB shell and create a new DB, a collection and a document as follows:

use testdb;
db.testCollection.insertOne({ "type": "Building", "name": "Eiffel Tower"});
db.testCollection.find({});

If we want to check that the data is also available to be read on the Secondary replicas, we can do the following (Pod 0 and Pod 2 are my Secondary replicas):

kubectl run tm-mongo-pod --namespace=datastores -it --image=mongo:4.2.6 --restart=Never --rm=true -- mongo \
mongo-db-statefulset-0.mongo-db-replica --eval="rs.slaveOk();" --shell

In this case, a statement is executed at shell start-up (through the --eval parameter) that allows us to query data from a Secondary replica. Afterwards, we can execute the following piece of JavaScript code to verify that the data just inserted was properly propagated to our replica(s):

use testdb;
db.testCollection.find({});

🧱 Replica Set Management

Stopping the Replica Set cluster

We can stop our mongoDB datastore cluster by scaling it to 0, as follows:

kubectl scale statefulsets/mongo-db-statefulset --replicas=0 --namespace=datastores

Now we can check the status in our namespace datastores:

kubectl get pods --namespace=datastores
No resources found in datastores namespace.
kubectl get pvc --namespace=datastores
NAME                                              STATUS   VOLUME                                    
mongo-volume-for-replica-mongo-db-statefulset-0   Bound    pvc-179b1538-0cf6-4440-812e-64dc6de8b1a3
mongo-volume-for-replica-mongo-db-statefulset-1   Bound    pvc-c5c4aeb9-ed79-46a7-a7de-effc244090b6
mongo-volume-for-replica-mongo-db-statefulset-2   Bound    pvc-a7d8196b-ae94-43fd-8b31-2550b36b2997

The PVCs are still there, so our data has not been lost.

Restarting the Replica Set cluster

We can restart our mongoDB replica set by scaling it back up to 3 replicas.

kubectl scale statefulsets/mongo-db-statefulset --replicas=3 --namespace=datastores
kubectl get pods --namespace=datastores
NAME                     READY   STATUS    RESTARTS   AGE
mongo-db-statefulset-0   1/1     Running   0          8s
mongo-db-statefulset-1   1/1     Running   0          6s
mongo-db-statefulset-2   1/1     Running   0          4s

Our Pods have come back to life. In my deployment, the new leader election after scaling back up resulted in Pod 2 being the Primary and Pods 0 and 1 being Secondary.
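
We can also double-check that the data inserted earlier survived the restart. A minimal sketch, assuming Pod 2 is the current Primary as in my deployment (adjust the Pod name to whichever member is Primary in yours):

kubectl run tm-mongo-pod --namespace=datastores -it --image=mongo:4.2.6 --restart=Never --rm=true -- mongo \
mongo-db-statefulset-2.mongo-db-replica --eval="db.getSiblingDB('testdb').testCollection.find().forEach(printjson);"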

Killing one Pod and forcing a new leader election

We can manually delete a Pod, for instance the Primary, and check that a new leader election happens and that the StatefulSet controller automatically recreates the deleted Pod.
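
A minimal sketch of this test, assuming mongo-db-statefulset-2 is the current Primary (as in my deployment):

kubectl delete pod mongo-db-statefulset-2 --namespace=datastores
kubectl get pods --namespace=datastores --watch

While the deleted Pod is being recreated by the StatefulSet controller under the same name, running rs.status() against one of the surviving members should show that a new Primary has been elected; once the recreated Pod re-joins the replica set, it does so as a Secondary.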

🖊️ Conclusions

Kubernetes provides powerful primitives to deploy a clustered mongoDB datastore service. Furthermore, as the next parts of this series will show, we can deploy a secured and sharded mongoDB cluster to give production-grade support to IoT and Big Data applications that demand higher scalability.

🗒️ Feedback