An Introduction to Cheesecake in OpenStack

By Chad Morgenstern and Jenny Yang

Now that you’ve considered and reviewed your disaster recovery needs, you may have decided that DR with Cinder is worth exploring.

What is Cheesecake? Cheesecake in the OpenStack community refers to Cinder replication for DR use cases. In the event that disaster strikes your primary storage site, Cheesecake provides a backend-level failover solution (assuming that you have a secondary site configured for replication). Currently, users can also perform group-based replication by creating a volume type and volumes of that particular type. This allows the user to fail over volumes of that type as a single group, granting finer control over the process.

Cheesecake was implemented as Cinder replication v2.1 in the Mitaka release of OpenStack and is meant to be admin-facing. The next evolution of Cinder replication, known as Tiramisu, is currently in development. Tiramisu will offer finer-grained replication than Cheesecake and is meant to be more tenant-facing. It will also feature a group construct to make group-based replication native. Read more about it here.

Returning to the topic of Cheesecake, the overview of the workflow for implementation is as follows:

  1. The admin enables replication for the secondary backend through the cinder.conf file. The admin can specify whether all volumes or only certain volume types should be replicated.
  2. The tenant creates volumes that are replicated.
  3. A disaster occurs at the primary site.
  4. The admin issues the Cinder failover command to point the Cinder driver to the secondary backend. This is recorded in the Cinder database, and the original backend is marked as failed over.
    NOTE: For volumes that were attached before the disaster hit, the tenant must detach and reattach them manually, but this process can be automated (a sample automation script will be included in a future post).
  5. If the original backend is still usable, the admin can fail back if desired.
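
The admin-side pieces of steps 1 and 4 can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the `replication_device = backend_id:...` option format in cinder.conf and the `cinder failover-host` client command with its `--backend_id` flag, as found in Mitaka-era Cinder. The host and backend names are hypothetical placeholders, and any extra driver options depend on your storage backend.

```python
import shlex


def replication_device_line(backend_id, **driver_options):
    """Render a replication_device entry for a backend stanza in cinder.conf.

    Assumption: the option takes comma-separated key:value pairs and
    starts with backend_id. Extra pairs are driver specific.
    """
    parts = [f"backend_id:{backend_id}"]
    parts += [f"{key}:{value}" for key, value in driver_options.items()]
    return "replication_device = " + ",".join(parts)


def failover_command(host, backend_id):
    """Build the admin failover command (step 4) as an argv list.

    Assumption: `host` is the Cinder volume service host, typically
    written as hostname@backend_name.
    """
    return ["cinder", "failover-host", "--backend_id", backend_id, host]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(replication_device_line("secondary_backend"))
    print(shlex.join(failover_command("cinder-host@primary_backend",
                                      "secondary_backend")))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` later without shell quoting problems.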

Something crucial to note: Cheesecake lacks a built-in mechanism for failback. That said, if you are comfortable issuing a few SQL update commands to the Cinder database, Cheesecake gains bi-directional failover (i.e., failover and failback). Caveat emptor: modifying an OpenStack database has inherent dangers. Before attempting this process, take a database backup of, at a minimum, the Cinder database.
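
As a rough illustration of the backup step, the sketch below builds a timestamped `mysqldump` command for the Cinder database. It assumes a MySQL or MariaDB backing store, a database named `cinder`, and credentials supplied outside the script (for example, through a client option file). The function name and output file naming are hypothetical choices, not part of any OpenStack tooling.

```python
from datetime import datetime, timezone


def cinder_backup_command(database="cinder", when=None):
    """Return (argv, outfile) for a timestamped dump of the database.

    Assumptions: MySQL/MariaDB backend, database named "cinder",
    credentials handled outside this script.
    """
    when = when or datetime.now(timezone.utc)
    outfile = f"{database}-{when:%Y%m%dT%H%M%SZ}.sql"
    argv = [
        "mysqldump",
        "--single-transaction",       # consistent snapshot for InnoDB tables
        f"--result-file={outfile}",   # write the dump to a file
        database,
    ]
    return argv, outfile


if __name__ == "__main__":
    argv, outfile = cinder_backup_command()
    print("Would run:", " ".join(argv))
    print("Backup file:", outfile)
```

The script only prints the command; run it through `subprocess.run(argv, check=True)` once you have confirmed the database name and credentials for your deployment.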

The next post in the series will cover the entirety of the failover and failback workflow for an iSCSI backend, step by step and in more detail. Although the post will seem lengthy, the process is straightforward. In effect, it will be a comprehensive tutorial.

For the moment, failover and failback with Cheesecake have been explored only with secondary devices; implementing the process with primary devices is still being studied.

Stay tuned to learn more, and if you have any questions, let us know in the comments below or on Slack!

Chad Morgenstern
Chad is a 10-year veteran at NetApp, having held positions in escalations, on reference architecture teams, and most recently on the workload performance team, where he contributed to significant understandings of AFF performance for things such as VDI environments, worked on the SPC-1 benchmark, and more. In addition, Chad spent time building automated tools and frameworks for performance testing, a few of them based on open source technologies such as cloud and containers.

Chad is happily married and is the father of four daughters, and soon to be owner of two ferrets.
