In this post we will show how to deploy a containerized MongoDB replica-set using StatefulSets on Kubernetes with NetApp Trident and underlying SolidFire or ONTAP storage.
It is important to understand the concept of StatefulSets before we go further. If you do not yet know about StatefulSets, you can go through this StatefulSet tutorial.
MongoDB is one of the most popular open-source, scale-out NoSQL databases. It powers many modern big data analytics applications that require low latency for reads and writes, high availability, and advanced data management.
Running databases in containers may seem confusing to many people, because containers are supposed to be stateless and a database is pointless without state. Nevertheless, databases can greatly benefit from a containerized deployment:
- With new features like StatefulSets in Kubernetes, you can scale your database with ease and also configure replica-sets and HA automatically.
- Using containers can enable you to easily upgrade your database with minimal or no downtime.
- You can define your database deployment configuration, including persistent storage, as YAML or JSON in Kubernetes, which makes deployments easy and repeatable.
- You can avoid unnecessary fail-overs and manual handling of database nodes.
Furthermore, deploying databases at scale can be greatly simplified and standardized by using containers. As an example, Google runs almost everything, including databases, in containers, as mentioned in this Kubernetes blog.
Deploying MongoDB on Kubernetes
Why do we need StatefulSets?
A database pod always has PersistentVolumes attached to it. If we scale a replication controller, all the pods will try to mount the same PersistentVolume, and we don’t want that. We want each member of the MongoDB replica-set to have its own PersistentVolume. Hence, if we use replication controllers, we have to define one replication controller for each database node (replica-set member) and deploy a new replication controller every time we want to scale.
With StatefulSets we can overcome this problem and scale our stateful database pods easily. When we scale a StatefulSet, a new PersistentVolumeClaim is created for each new database pod.
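The per-pod claims described above follow from the naming rules Kubernetes applies to StatefulSets. As a rough illustration, the following Python sketch computes the pod, PersistentVolumeClaim and DNS names for a given replica count. The names `mongo` and `mongo-data` and the `default` namespace are hypothetical, and the DNS suffix assumes the common `cluster.local` cluster domain.

```python
def statefulset_identities(name, service, claim_template, replicas,
                           namespace="default", cluster_domain="cluster.local"):
    """List the stable identities Kubernetes assigns to StatefulSet pods.

    Pods are named <statefulset>-<ordinal>, each claim created from a
    volumeClaimTemplate is named <template>-<pod>, and the headless service
    gives every pod its own DNS record.
    """
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        identities.append({
            "pod": pod,
            "pvc": f"{claim_template}-{pod}",
            "dns": f"{pod}.{service}.{namespace}.svc.{cluster_domain}",
        })
    return identities


# Hypothetical names, for illustration only.
three = statefulset_identities("mongo", "mongo", "mongo-data", replicas=3)
four = statefulset_identities("mongo", "mongo", "mongo-data", replicas=4)

for identity in three:
    print(identity["pod"], identity["pvc"], identity["dns"])

# Scaling up adds exactly one new claim; the existing ones are untouched.
new_claims = sorted({i["pvc"] for i in four} - {i["pvc"] for i in three})
print("new claims after scaling 3 -> 4:", new_claims)
```

Because every claim name includes the pod ordinal, a pod that is rescheduled is bound to the same claim again and keeps its data.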
Dynamic storage provisioning for MongoDB
NetApp recently released Trident, a dynamic storage provisioner for Kubernetes. With the help of Trident, we can easily provision storage for stateful applications and databases. Trident itself runs as a Kubernetes pod and dynamically provisions storage when an application requests it through a PVC (PersistentVolumeClaim).
Here is how the magic happens:
- The administrator deploys and configures Trident and defines StorageClasses and quotas.
- The developer creates a PVC or StatefulSet with the respective StorageClass.
- Trident dynamically provisions storage for the developer.
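The two objects involved in these steps can be sketched as plain Python dictionaries serialized to JSON, which `kubectl` accepts alongside YAML. This is a minimal sketch under assumptions, not the configuration from the GitHub repository: the object names and the `10Gi` size are hypothetical, `csi.trident.netapp.io` is the provisioner name used by CSI-based Trident releases (older releases used `netapp.io/trident`), and `solidfire-san` is assumed as the backend type.

```python
import json


def storage_class(name, backend_type, provisioner="csi.trident.netapp.io"):
    """Administrator side: a StorageClass that delegates provisioning to Trident."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,  # assumption: depends on the Trident release
        "parameters": {"backendType": backend_type},
    }


def volume_claim(name, storage_class_name, size):
    """Developer side: a PVC that requests storage from a given StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and size, for illustration only.
sc = storage_class("trident-solidfire", "solidfire-san")
pvc = volume_claim("mongo-data", sc["metadata"]["name"], "10Gi")

print(json.dumps(sc, indent=2))
print(json.dumps(pvc, indent=2))
```

Each JSON document can be saved to a file and submitted with `kubectl apply -f <file>`. Once the claim exists, the provisioner creates a matching volume on the backend and binds it to the claim.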
You can run a StatefulSet-based MongoDB deployment powered by NetApp Trident on NetApp SolidFire as well as ONTAP. We have documented and tested a simple MongoDB 3 replica-set configuration, and the instructions and code are now available on GitHub.
To deploy a MongoDB StatefulSet on Kubernetes, you need to install Trident with the SolidFire backend configuration and then define a StorageClass for your SolidFire backend. That’s all.
Once you have done that, you can deploy your MongoDB StatefulSet and headless service as shown in the samples below. Find detailed steps here.
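As a rough idea of what a headless service and StatefulSet for MongoDB can look like, here is a sketch written as Python dictionaries and serialized to JSON. It is not the manifest from the GitHub repository: the `mongo` names, the `app: mongo` label, the `trident-solidfire` StorageClass, the `rs0` replica-set name, the untagged `mongo` image and the `10Gi` size are all assumptions made for illustration.

```python
import json

LABELS = {"app": "mongo"}  # hypothetical label shared by the service and the pods


def headless_service(name, port=27017):
    """A Service with clusterIP 'None' gives each pod its own DNS record."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": LABELS},
        "spec": {
            "clusterIP": "None",
            "selector": LABELS,
            "ports": [{"name": "mongodb", "port": port}],
        },
    }


def mongo_statefulset(name, service, storage_class_name, replicas=3,
                      size="10Gi", image="mongo", replset="rs0"):
    """A StatefulSet whose volumeClaimTemplates yield one PVC per pod."""
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": service,
            "replicas": replicas,
            "selector": {"matchLabels": LABELS},
            "template": {
                "metadata": {"labels": LABELS},
                "spec": {
                    "containers": [{
                        "name": "mongod",
                        "image": image,
                        # --bind_ip_all exists in MongoDB 3.6 and later.
                        "command": ["mongod", "--replSet", replset, "--bind_ip_all"],
                        "ports": [{"containerPort": 27017}],
                        "volumeMounts": [{"name": "mongo-data", "mountPath": "/data/db"}],
                    }],
                },
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": "mongo-data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "storageClassName": storage_class_name,
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


svc = headless_service("mongo")
sts = mongo_statefulset("mongo", svc["metadata"]["name"], "trident-solidfire")

print(json.dumps(svc, indent=2))
print(json.dumps(sts, indent=2))
```

The sketch only starts `mongod` with a replica-set name in each pod. The replica-set itself still has to be initiated (for example with `rs.initiate()` from a mongo shell) using the pods' DNS names, which this example does not automate.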
NetApp storage offers the following key benefits for demanding applications built on MongoDB:
- Predictable high performance
- Consistent low latency and excellent response time
- Inline efficiency, achieved by inline deduplication and compression
- Instant and efficient database clones for QA and dev-test environments
- Backup and recovery using Snapshot technology
- Disaster recovery solutions using enterprise replication technology
- Scalability, reliability and non-disruptive operations
Containerization of databases like MongoDB can enable organizations to manage and standardize database deployment at scale. NetApp with Trident, its dynamic storage provisioner for Kubernetes, can help you abstract enterprise storage provisioning and offer on-demand provisioning of storage through PVCs.
Features like StatefulSets in Kubernetes are still new and will develop over time to effectively support enterprise deployments. In the meantime, some open questions remain:
- How do I provide a different storage class for each member of my replica-set?
- Is replication the right use-case for StatefulSets or should we only use it for horizontal scaling (e.g. sharding in MongoDB)?
Finally, database backups and disaster recovery are important concerns in production environments and require proper planning and design. We will discuss them in our upcoming posts.