Running nDVP with LinuxKit-style Hosts

Ever tried to run the NetApp Docker Volume Plugin (nDVP) with Docker Swarm? It’s fairly easy. But what if you want to do this on AWS and use ONTAP Cloud as storage? Let’s have a look.

Docker on AWS

First, what are the options to run Swarm on AWS?

An obvious way would be to deploy some Linux EC2 instances and create a Swarm cluster the traditional way. But Docker’s recommendation looks different:

With Docker for AWS, Docker provides AWS CloudFormation templates to easily provision a Swarm cluster. Enter some parameters and it automatically deploys a cluster on EC2. In my tests it deployed Docker CE for AWS 17.09.0-ce.

From here on, installing nDVP should be fairly easy. On a traditional Linux, that is. But this is no traditional Linux.

Enter LinuxKit

Instead of deploying a typical Linux like Ubuntu, Red Hat, or Amazon Linux, the template deploys a LinuxKit-based OS, which is Docker’s intended way forward. If you are not aware of it, I recommend doing some research on Moby and its sub-projects like LinuxKit, containerd, etc. In short, LinuxKit is a hardened, very stripped-down, immutable Linux which runs everything (even system processes) as containers.

If you log in to such a system, you don’t end up on the Docker host itself, but in a management container:

shell-aws is the container you end up in if you SSH into a Swarm manager.

And no, you cannot access the host’s filesystem from it. This is the point where the nDVP installation instructions break, since you are not able to create the configuration file. I will show you how to get it running anyway, but before this, let’s get our persistent cloud storage configured. See here for why you would want to use ONTAP.

Installing ONTAP Cloud

First, provision an ONTAP Cloud (OTC) system in the same VPC as your Swarm. If you are unfamiliar with how to do this, start with this video or follow the quick installation instructions. For a more detailed look at how to use OnCommand Cloud Manager (OCCM) to install ONTAP Cloud, see here.

After the installation is done, use OCCM to find out the cluster management IP and the NFS data IP of your storage.

Where to find LIF IPs in OnCommand Cloud Manager.

Configuring ONTAP Cloud

First, we create an export policy for our Docker hosts. Since OTC only has private IPs, we use one of the Swarm hosts as a jump host:
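The original listing did not survive on this page. As a rough sketch, the policy and a single rule could be created over SSH from a Swarm host as shown here. The IP address, SVM name, policy name, and CIDR are placeholders I made up for illustration, and the `sys` access rules are one possible choice, not necessarily what was used originally.

```shell
#!/bin/sh
# All values below are placeholders (assumptions), not taken from the article.
MGMT_LIF="${MGMT_LIF:-172.31.0.10}"            # cluster management IP from OCCM
SVM="${SVM:-svm_docker}"                       # name of the data SVM
EXPORT_POLICY="${EXPORT_POLICY:-docker}"       # policy name used later by nDVP
CLIENT_MATCH="${CLIENT_MATCH:-172.31.0.0/16}"  # CIDR covering all Swarm hosts

# Run one ONTAP CLI command on the cluster over SSH.
ontap() {
    ssh "admin@${MGMT_LIF}" "$*"
}

# Create the export policy, then add one rule that admits the Swarm hosts.
create_export_policy() {
    ontap vserver export-policy create \
        -vserver "$SVM" -policyname "$EXPORT_POLICY" &&
    ontap vserver export-policy rule create \
        -vserver "$SVM" -policyname "$EXPORT_POLICY" \
        -clientmatch "$CLIENT_MATCH" -protocol nfs \
        -rorule sys -rwrule sys -superuser sys
}
```

Calling `create_export_policy` from the SSH session on a Swarm manager runs both commands against the cluster management LIF.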

You might want to make the clientmatch narrower, or create individual rules per Swarm host. But keep in mind that the Swarm is configured with an AWS autoscaling group, which might spin up additional hosts. If your clientmatch doesn’t account for them, those hosts will run into permission denied errors.

Create a service user for nDVP as described in the documentation. I will use the system’s admin user for now.

Installing nDVP

Now comes the tricky part. We need to create the nDVP config file in the host’s /etc/netappdvp/, which the management container cannot access. But we can solve it the Docker way and create a container which mounts /etc from the host:
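The original listing is missing here, so what follows is a sketch of the idea rather than the exact commands used. It renders a config and pipes it into a throwaway container that bind-mounts the host’s /etc. All values are placeholders. The file name config.json, the `alpine` image, the placement of exportPolicy under `defaults`, and the exact mount option string are my assumptions based on the nDVP documentation as I remember it, so check them against the docs for your version. Depending on the nDVP release, further keys such as an aggregate may also be needed.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your own environment.
MGMT_LIF="${MGMT_LIF:-172.31.0.10}"       # cluster management IP from OCCM
DATA_LIF="${DATA_LIF:-172.31.0.11}"       # NFS data IP from OCCM
SVM="${SVM:-svm_docker}"
NDVP_USER="${NDVP_USER:-admin}"
NDVP_PASS="${NDVP_PASS:-changeme}"
EXPORT_POLICY="${EXPORT_POLICY:-docker}"  # the export policy created earlier

# Print the nDVP backend configuration as JSON.
render_ndvp_config() {
    cat <<EOF
{
    "version": 1,
    "storageDriverName": "ontap-nas",
    "managementLIF": "${MGMT_LIF}",
    "dataLIF": "${DATA_LIF}",
    "svm": "${SVM}",
    "username": "${NDVP_USER}",
    "password": "${NDVP_PASS}",
    "nfsMountOptions": "-o nfsvers=3,nolock",
    "defaults": {
        "exportPolicy": "${EXPORT_POLICY}"
    }
}
EOF
}

# Write the config to /etc/netappdvp/config.json on the host through a
# short-lived container that mounts the host's /etc at /hostetc.
write_ndvp_config() {
    render_ndvp_config | docker run --rm -i -v /etc:/hostetc alpine \
        sh -c 'mkdir -p /hostetc/netappdvp && cat > /hostetc/netappdvp/config.json'
}
```

Run `render_ndvp_config` first to inspect the JSON, then `write_ndvp_config` to place it on the host.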

You need to adjust managementLIF, dataLIF, svm, username, and password according to your environment. The only special lines in this config file are exportPolicy, where I specified the export policy we created above, and nfsMountOptions.

nfsMountOptions is required here. The LinuxKit host doesn’t start rpc.lockd or rpc.statd, which are required for locking with NFSv3. Without the “nolock” option, mounting/attaching nDVP volumes will fail, due to missing rpc.statd. I found no way to start rpc.statd on this deployment, since the host OS is so well guarded. With “nolock” we simply disable NFS locking; the host now manages locks locally without propagating them.

Locking in NFSv3 is a story of its own. Since you normally don’t attach a volume to multiple hosts, you should be fine. If you do, beware that file locks are not visible across hosts, so processes on different hosts might write to the same file concurrently.

Now we can install the plugin on the docker host:
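The install command itself was lost from this page. A minimal sketch, assuming the plugin is pulled from Docker Hub as `netapp/ndvp-plugin` and given the alias `netapp`; the tag `17.07` is a placeholder of mine, so pick the release that matches your setup.

```shell
#!/bin/sh
# Plugin tag is an assumption; choose the nDVP release you want to run.
NDVP_TAG="${NDVP_TAG:-17.07}"

# Install the plugin under the alias "netapp" without interactive prompts.
# It reads its backend configuration from /etc/netappdvp on the host.
install_ndvp() {
    docker plugin install "netapp/ndvp-plugin:${NDVP_TAG}" \
        --alias netapp --grant-all-permissions
}
```

After `install_ndvp` has run, `docker plugin ls` should list the plugin as enabled.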

Using nDVP

Finally, we can create and use Docker volumes, stored securely on an ONTAP NFS backend:
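As an illustration (the original listing is not preserved), creating a volume and using it from a container might look like this. The volume name, the 1g size, and the `alpine` image are arbitrary choices of mine; the driver name `netapp` assumes the plugin was installed under that alias.

```shell
#!/bin/sh
# Volume name is a placeholder (assumption).
VOLUME="${VOLUME:-demo_vol}"

# Create a 1 GiB volume through the plugin alias "netapp".
create_volume() {
    docker volume create -d netapp -o size=1g "$VOLUME"
}

# Write a file from one container, then read it back from a second one
# to show that the data lives on the volume and not in the container.
write_and_read() {
    docker run --rm -v "${VOLUME}:/data" alpine \
        sh -c 'echo "hello from nDVP" > /data/hello.txt' &&
    docker run --rm -v "${VOLUME}:/data" alpine cat /data/hello.txt
}
```

Call `create_volume` once, then `write_and_read` to verify that the data survives the first container.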

And finally: clone a volume using FlexClone. It is instant, space efficient (only changed data consumes storage) and performance neutral.
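A clone can be requested at volume creation time. The sketch below assumes the plugin's `from` option, which names the source volume; both volume names are placeholders of mine.

```shell
#!/bin/sh
# Both names are placeholders (assumptions).
SOURCE_VOLUME="${SOURCE_VOLUME:-demo_vol}"
CLONE_VOLUME="${CLONE_VOLUME:-demo_vol_clone}"

# Create a new volume as a FlexClone of an existing nDVP volume.
clone_volume() {
    docker volume create -d netapp -o from="$SOURCE_VOLUME" "$CLONE_VOLUME"
}
```

After `clone_volume`, the clone can be mounted like any other volume, for example with `-v demo_vol_clone:/data`.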

Next steps

Installing nDVP on a Docker Swarm – deployed via Docker’s CloudFormation template in AWS – requires some tweaks, but works well.

The next steps are to install nDVP on all hosts of the Swarm. This can be done manually, but remember, it is an autoscaling group and hosts can appear or disappear. New hosts will be missing nDVP and will not be able to access the volumes until nDVP is installed. Not a desirable thing in a Swarm cluster.

It makes more sense to include the installation of nDVP in the automated deployment process.

But that is another story and shall be told another time …

If you have any questions, please reach out using the comments below or the Slack team.

Oliver Krause
Cloud Solution Architect at NetApp Deutschland GmbH
Nerd doing computer stuff for over 20 years. Ex-NAS expert, Ex-ONTAP expert, Ex-MetroCluster expert, ex-many other stuff. Now doing Clouds. The water and the server ones. My shell is my castle.
