Highly Secure Kubernetes Persistent Volumes

Ensuring that only authorized applications and users can access Kubernetes volumes provisioned by Trident is a paramount concern. It’s one of the first in-depth conversations we tend to have with anyone planning a deployment.

The good news is that Kubernetes and Trident work together to provide highly secure persistence, provided that you follow these guidelines:

  1. Wall off access to volumes in Kubernetes by creating namespaces that define your trust boundaries
  2. Prevent pods from accessing volume mounts on worker nodes by creating an appropriate Kubernetes pod security policy
  3. Restrict volume access to the appropriate worker nodes by specifying, through Trident, a security policy suited to each backend

We get asked about security regularly. Below are some of the most common questions; if you have others, please reach out to us using the comments on this post, Slack, or any of our other communication channels.

Can access to a persistent volume be restricted to one pod/container?

Persistent Volumes (PVs) managed by Trident are created when a Persistent Volume Claim (PVC) is submitted by the application, which triggers Trident to create the volume on the storage system. PVs are global objects, whereas PVCs belong to a single namespace. Only an administrator, or Trident (via the permissions granted to its service account), is able to manage PVs.
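
To make this concrete, here is a minimal sketch of a PVC that would trigger Trident to provision a volume. The namespace, claim name, size, and storage class name are illustrative assumptions, not values from this post:

```yaml
# Hypothetical PVC: submitting this in the "project-a" namespace causes
# Trident to provision a backing volume and bind a newly created PV to it.
# Only pods in "project-a" can use the claim.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: project-a
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ontap-gold   # assumed Trident-backed StorageClass
```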

Why is this important? Namespaces are logical boundaries for most resources in Kubernetes. They are a security domain, with the assumption that everything in a namespace can access everything else within it. However, a user or application is prevented from using resources in a different namespace. For example, a pod in one namespace cannot use a PVC in another, even if the user has access to both.

Additionally, a PV that is already bound to one PVC cannot be bound to another, regardless of namespace. This means that even if a user attempts to craft a PVC which claims an existing PV from a different namespace, it will fail. When using Trident, deleting the PVC also deletes the PV (and the backing volume) by default. This behavior can be changed so that PVs are retained, but a PV that has been bound to a PVC once and then unbound can never be bound again.
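
If you want PVs to be retained rather than deleted along with the PVC, one way to do this on recent Kubernetes versions is to set the reclaim policy on the StorageClass. This is a sketch only; the class name, provisioner string, and backend parameter below are assumptions for illustration:

```yaml
# Hypothetical StorageClass: reclaimPolicy Retain keeps the PV (and the
# backing volume) when the bound PVC is deleted. The PV is left in the
# "Released" state and, as noted above, will not be bound to another PVC.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ontap-gold-retain
provisioner: netapp.io/trident   # assumed (legacy, non-CSI) Trident provisioner name
reclaimPolicy: Retain            # the default is Delete
parameters:
  backendType: ontap-nas         # assumed backend parameter
```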

So, to answer the question: no, an individual PV/PVC cannot be limited to a single pod; however, PVCs are limited to a single namespace in the same way that other resources are.

Can a pod see other volumes mounted to a host, and/or see what storage is presented from the array?

If a user in a pod were to execute the “showmount -e” command, or the iSCSI equivalent, against the storage system providing volumes to the Kubernetes cluster, they would be able to see the list of exports. However, as stated above, they cannot gain access to another volume from inside a pod.

To mitigate this, the storage system’s volume access controls, whether igroups, volume access groups, or export policies, should be restricted to only the nodes in the Kubernetes cluster. This prevents hosts outside the Kubernetes cluster from mounting the volumes and bypassing the security controls in place. Additionally, disable “showmount” functionality for the SVM.

Can pods on the same node, but from a different namespace, gain access to a mounted volume?

No, with one exception: privileged containers. A process in a pod/container on a Kubernetes node cannot see resources on the system other than what it has been assigned; this is core Linux namespace functionality used by all containers. A user or application inside a standard container poses no threat to other volumes by issuing fdisk or mount commands.

If privileged containers are an issue, how do we protect against them?

The configuration of the storage system and the Kubernetes cluster should be hardened to further ensure that volumes are neither visible nor accessible except to the pod to which they are assigned.

  • Do not allow privileged pods. To be clear, privileged pods are the only way a container can “escape” to the host and access storage it has not been assigned, so it is very important to prevent unauthorized applications from using them. Standard containers have no ability to mount volumes (NFS or iSCSI), regardless of whether the user inside the container is root or not.

    Pod security policies should be used to prevent user applications from creating privileged containers. See the second example policy here for a method of preventing users from creating privileged pods; a sketch of such a policy also follows this list.

  • Limit application users to specific namespaces which belong to them, with no cluster-level permissions (see the RoleBinding sketch after this list). Remember, only an administrator, or Trident, can manage PVs; users cannot create, manage, destroy, or assign PVs to themselves or anyone else. In fact, only cluster administrators can view PV details; users cannot see the details even for their own PVs.

  • Prevent pods from directly accessing the storage network by using network policies. This stops attempts by pods to gather information about the storage, e.g. via showmount, before they even have a chance to succeed. Network policies applied on a per-namespace basis, or host firewall rules blocking the pod network from reaching the storage network, are a simple and effective way to deny access to entire network segments (see the NetworkPolicy sketch after this list).
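
To make the first recommendation concrete, here is a minimal sketch of a PodSecurityPolicy (for clusters that still support PSPs) that blocks privileged pods. The policy name and the allowed volume types are assumptions:

```yaml
# Hypothetical PodSecurityPolicy: privileged pods, host namespaces, and
# hostPath volumes are all disallowed, so application pods cannot escape to
# the node and reach volumes mounted for other pods.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  hostNetwork: false
  hostPID: false
  hostIPC: false
  volumes:                 # hostPath is deliberately absent
    - configMap
    - secret
    - emptyDir
    - persistentVolumeClaim
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
```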
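
For the second recommendation, one way to scope an application team to its own namespace with no cluster-level permissions is a RoleBinding to the built-in “edit” ClusterRole. The group and namespace names here are assumptions:

```yaml
# Hypothetical RBAC: the "app-team" group receives the built-in "edit" role,
# but only within the "project-a" namespace. The binding grants access to
# PVCs in that namespace, but no access to cluster-scoped PV objects.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-team-edit
  namespace: project-a
subjects:
  - kind: Group
    name: app-team                       # assumed group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                             # built-in role; namespace-scoped when used in a RoleBinding
  apiGroup: rbac.authorization.k8s.io
```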
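
And for the third recommendation, a sketch of a NetworkPolicy that allows general egress but blocks the pod network from reaching an assumed storage subnet (192.168.100.0/24 is a placeholder; enforcement requires a CNI plugin that supports NetworkPolicy):

```yaml
# Hypothetical NetworkPolicy: applies to every pod in the namespace and
# denies egress to the storage subnet while permitting all other egress.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-storage-network
  namespace: project-a
spec:
  podSelector: {}            # all pods in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 192.168.100.0/24   # assumed storage network
```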

Will creating a volume with a specific UID and GID help protect the data?

No, it really won’t provide additional protection for Kubernetes-based applications.

The assumption here is that the volume has the userID (uid) and groupID (gid) specified and the Unix permissions are set to something like “700” (Note: Trident does not support setting uid and gid but does allow Unix permissions to be customized). Additionally, the pod is using a security context which specifies matching uid and gid values.

Logically, this means that because the uid/gid of the process and the volume all match, access is granted. If the uid/gid don’t match, then even if the volume is mounted the pod would not be able to access the data. Kubernetes even enables the administrator to limit a namespace to specific uid and gid values to prevent the user in a namespace from attempting to use another namespace’s user information.
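
For reference, a sketch of the security context described above; the uid/gid value of 1000, the image, and all names are arbitrary examples:

```yaml
# Hypothetical pod: the container processes run as uid/gid 1000, which would
# need to match the uid/gid on the NFS volume for "700" permissions to allow
# access to the data.
apiVersion: v1
kind: Pod
metadata:
  name: app
  namespace: project-a
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
  containers:
    - name: app
      image: registry.example.com/app:latest   # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```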

So, why doesn’t that protect the data? NFSv3 assumes that the client has handled authentication and authorization; the uid/gid values can be arbitrarily specified, and no validation is done by the NFSv3 server. This means that any pod (in the same namespace) could use the uid/gid associated with the volume and gain access.

Kerberos could solve some of these issues because the NFS server participates in the authorization process, ensuring that only a validly authenticated and authorized user is accessing the data. However, Kerberos is not supported by Kubernetes except for user authentication when using a proxy.

Security happens at all layers

A user who has, through whatever means, elevated their access on a Kubernetes node to full root has the ability to mount, manipulate, and/or destroy resources in many ways. It is vitally important to secure your cluster, both Kubernetes itself and the underlying host OS on which it runs, in the same manner you would a hypervisor management console or other critical systems. A good place to start is always the STIG (no, not that Stig) for your operating system.

Conclusion

Using namespaces to provide isolation between security domains, whether that be an application, a team, or something else, is “good enough” for many use cases. This is especially true when the host OS, Kubernetes, and the storage system have been configured to limit access to the storage devices and additional metadata about them. If you want ultimate protection between applications deployed to Kubernetes, having multiple clusters with dedicated resources provides the most robust separation.

We know you will have more questions about things which concern you. We haven’t covered every possible scenario, and probably never will, so please use the comments below or reach out to us on our Slack team, GitHub issues, or open a support case. We’re happy to help!

Andrew Sullivan
Technical Marketing Engineer at NetApp
Andrew has worked in the information technology industry for over 10 years, with a rich history of database development, DevOps experience, and virtualization. He is currently focused on storage and virtualization automation, and driving simplicity into everyday workflows.
