Corda 5: Cloud-native deployment
June 08, 2022
By David Currie, Principal Software Engineer
Corda 5 was designed from the outset to be cloud-native. Specifically, the worker architecture allows for independent scaling and resilience of the components that make up the Corda platform, all loosely coupled via a message bus. The use of container images as the delivery mechanism for those components and Kubernetes as the runtime orchestrator should be fairly uncontroversial. This article discusses the rationale for the decision to use Helm to package the Kubernetes configuration.
Why Helm?
Although the Kubernetes command line supports the creation of resources imperatively, it is more typical to specify the required resources declaratively and let the Kubernetes reconciliation loop ensure that the contents of the cluster match the requested resources. Those resource configurations are usually declared using everyone’s favorite white-space sensitive markup language: YAML. So why not simply provide a link to a YAML file to install Corda?
Environment-specific configuration is the answer to that question. If nothing else, you need to tell Corda where to find its prerequisites, for example, PostgreSQL and Kafka. You may also want to modify the number of flow workers to better handle your anticipated workload, or perhaps you’re a Corda developer and need to enable debugging. These are all examples of configuration that needs to be in place before Corda starts and, as a consequence, cannot be part of Corda’s internal configuration model.
Now, there is no shortage of tools that enable you to take some Kubernetes YAML and tweak it, with Kustomize being one of the leading contenders. These are great if you are given someone else’s configuration and need to bend it to fit your needs. What they don’t typically provide is a means for the original provider of the configuration to indicate the ways in which it makes sense for consumers to modify the configuration they are providing. This is where Helm fits in.
At its heart, Helm provides a templating engine. The configuration provider defines a set of Kubernetes resources as templates, along with a file that defines the points of variability and, if applicable, default values. These are packaged up together in what Helm refers to as a chart. (In keeping with all good Kubernetes projects, Helm attempts to keep up the nautical theme!) The consumer of the chart can then specify a set of overrides for the points of variability at the point at which it is installed.
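The split between templates and points of variability can be explored from the Helm CLI without touching a cluster. The following is a minimal sketch, assuming a local clone of the corda-runtime-os repository with the chart in charts/corda. The function names and the release name are illustrative only, and the commands are wrapped in functions so that nothing runs until they are called.

```shell
# Illustrative helpers; the function names are hypothetical.
CHART_DIR="./charts/corda"   # chart directory inside a clone of corda-runtime-os

# Print the chart's default values, i.e. its declared points of variability.
show_defaults() {
  helm show values "$CHART_DIR"
}

# Render the templates locally with the overrides in the file given as $1.
# Nothing is sent to the cluster.
render_chart() {
  helm template mycorda "$CHART_DIR" --values "$1"
}
```

Calling render_chart with an overrides file prints the generated Kubernetes YAML, which is a convenient way to check the effect of an override before installing.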
Installing Corda with Helm
Full details of deploying open source Corda can be found in the GitHub wiki but, for the sake of illustration, the simplest installation is a single Helm CLI command that installs the chart as a release named mycorda and passes it a file of overrides.
Whilst Helm is capable of installing versioned charts from a centralized repository, here we are pointing it at the directory charts/corda in the corda-runtime-os GitHub repository. The file values.yaml contains our environment-specific configuration. Again, more details can be found in the wiki, but a simple values.yaml is enough to get started.
In this example, we just have the minimum required information for Corda to connect to its prerequisites in the environment.
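As a rough sketch of the installation described above: the release name, chart directory and file name come from this article, but the YAML keys, host names and credentials are hypothetical placeholders and should be checked against the chart's own default values. The install command is wrapped in a function so that nothing touches a cluster until it is called.

```shell
# Sketch only. The YAML keys, host names and credentials below are
# hypothetical placeholders, not the chart's confirmed schema.
cat > values.yaml <<'EOF'
kafka:
  bootstrapServers: "kafka.example.com:9092"
db:
  cluster:
    host: "postgres.example.com"
    port: 5432
    user: "corda"
    password: "change-me"
EOF

# Install the chart from a local clone of corda-runtime-os as release "mycorda",
# applying the overrides in values.yaml.
install_corda() {
  helm install mycorda ./charts/corda --values values.yaml
}
```

Run install_corda from the root of the cloned repository, with the Kubernetes context and namespace already pointing at the target cluster.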
A word on secrets
Depending on who you ask, you may be told that secret management in Kubernetes is a solved problem. Unfortunately, it’s a problem that has been solved in a multitude of different ways. For a platform like Corda, that’s an issue as it means there are multiple integrations that we potentially need to be able to interoperate with.
Certificates for identity and authorization are a core part of the platform and these are handled within the platform by the crypto worker with the expectation that these will be stored within an HSM. But what about the credentials needed to bootstrap the platform, such as the PostgreSQL credentials in the example above?
For now, we’re supporting the simplest two options:
- Pass the credentials as overrides to the Helm chart install.
- Pass the name of a secret containing the credentials as an override to the Helm chart install.
The first option may seem insecure but a tool like the Helm Secrets plugin allows those credentials to be kept encrypted, right up to the point that they are provided to the Helm install. For some, that will be insufficient, which is why we also support the second option. Here, the Corda operator has the freedom to decide how the Kubernetes secret is populated. A tool like Bitnami’s Sealed Secrets could be used to ensure that the credentials are only decrypted within the Kubernetes cluster.
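The two options might be sketched as follows. The `--set` flag and `kubectl create secret generic` are standard, but the override keys (db.cluster.user, db.cluster.password, db.cluster.existingSecret) and the secret name are assumptions made for illustration, so check the chart for the names it actually accepts.

```shell
# Option 1: pass the credentials directly as overrides.
# The override keys are hypothetical.
install_with_inline_credentials() {
  helm install mycorda ./charts/corda --values values.yaml \
    --set db.cluster.user="$DB_USER" \
    --set db.cluster.password="$DB_PASSWORD"
}

# Option 2: populate a Kubernetes secret yourself...
create_db_secret() {
  kubectl create secret generic corda-db-credentials \
    --from-literal=username="$DB_USER" \
    --from-literal=password="$DB_PASSWORD"
}

# ...then pass only the secret's name to the chart (hypothetical key).
install_with_existing_secret() {
  helm install mycorda ./charts/corda --values values.yaml \
    --set db.cluster.existingSecret=corda-db-credentials
}
```

With the second option, the credentials never pass through Helm, and create_db_secret could be replaced by whatever mechanism your organization uses to populate secrets.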
We suspect that even this second option may be insufficient for some Corda users, so expect to see further changes in this area. We’d love to have your feedback on what you’d like to see.
Managing releases
Helm has much more to offer than just a templating engine though. We’ve already mentioned plugins that provide extensibility to the CLI. There’s a list of some of the more popular plugins in the Helm documentation. Helm also supports the concept of a release.
In the example installation above, we created a release called mycorda. By default, Helm stores the current configuration for each release in a secret in the namespace the release was deployed into. This allows Helm to keep a history of the revisions for a release. It also means that you can use Helm to roll back a deployment to a previous revision. This assumes that the original deployment did not make any backwards-incompatible changes. You can certainly expect Corda 5 to have a well-defined policy for upgrades/downgrades before it reaches GA.
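A minimal sketch of working with release revisions, using the standard `helm history` and `helm rollback` commands and the mycorda release name from this article. The function names are illustrative, and the commands are wrapped in functions so that nothing runs until they are called.

```shell
# List the stored revisions of the release.
show_history() {
  helm history mycorda
}

# Roll back to the revision number given as $1, as reported by `helm history`.
rollback_to() {
  helm rollback mycorda "$1"
}
```

For example, rollback_to 1 would return the release to its first revision, subject to the compatibility caveat above.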
Another capability that Helm provides is that of hooks. Hooks allow Kubernetes resources to be created at various points during the release lifecycle, for example during installation or upgrade. Currently, install hooks are used by the Corda Helm chart to create Kubernetes jobs that perform some initial bootstrapping of PostgreSQL and Kafka. These jobs create and populate the initial database schema and create the required Kafka topics. We recognise that not all Corda administrators will want to provide credentials with sufficient permissions to perform these actions. As a consequence, these optional steps can be turned off via chart overrides. The necessary configuration can then be performed separately by an administrator.
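Turning the bootstrap jobs off might look something like the sketch below. The override keys bootstrap.db.enabled and bootstrap.kafka.enabled are assumptions made for illustration, so consult the chart's default values for the names it really uses.

```shell
# Install without the bootstrap jobs, leaving schema and topic creation
# to an administrator. The two override keys are hypothetical.
install_without_bootstrap() {
  helm install mycorda ./charts/corda --values values.yaml \
    --set bootstrap.db.enabled=false \
    --set bootstrap.kafka.enabled=false
}
```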
Advanced Helm
Earlier versions of Helm used a server-side component called Tiller to actually perform the deployments. This introduced a potential security escalation if Tiller had greater privileges than the user that submitted the deployment. Thankfully, Tiller is now gone and Helm just takes the form of a CLI, using the local user’s permissions to drive the Kubernetes API. This means that the only signs of Helm in the Kubernetes cluster are the aforementioned secrets containing release history, and some additional labels added to the deployed resources.
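Those traces can be inspected with kubectl. This sketch assumes the labels that Helm 3 applies at the time of writing: owner=helm and name=<release> on its release secrets, and app.kubernetes.io/managed-by=Helm on the resources it creates. The exact labels may differ between Helm versions, and the function names are illustrative.

```shell
# Release history: one secret per revision, labelled with owner and release name.
list_release_secrets() {
  kubectl get secrets --selector owner=helm,name=mycorda
}

# Resources created from a chart carry a label naming Helm as their manager.
list_managed_resources() {
  kubectl get all --selector app.kubernetes.io/managed-by=Helm
}
```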
What if the Kubernetes resources that the Helm chart generates are missing something specific needed for your environment? Firstly, I’d ask that you reach out to us. If it’s something that you need, there’s a good chance someone else needs that flexibility too and we should provide the ability to specify what you need via the chart overrides. If all else fails though, Helm has the concept of post-rendering that can be used to modify the generated resources before they are applied to the cluster. You could even use Kustomize to perform that modification!
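A post-renderer is simply an executable that receives the rendered manifests on standard input and writes the modified manifests to standard output. Here is a minimal sketch of using Kustomize in that role, assuming the kustomize binary is on the PATH and that a kustomization.yaml in the current directory lists all.yaml as a resource alongside your patches. The script and function names are illustrative.

```shell
# Write a post-renderer script. Helm pipes the rendered manifests to its
# stdin and applies whatever it prints to stdout.
cat > kustomize-post-renderer.sh <<'EOF'
#!/bin/sh
# Save Helm's output where kustomization.yaml expects to find it,
# emit the patched result, then clean up.
cat > all.yaml
kustomize build . && rm all.yaml
EOF
chmod +x kustomize-post-renderer.sh

# Install the chart, passing the rendered manifests through the script first.
install_with_post_renderer() {
  helm install mycorda ./charts/corda --values values.yaml \
    --post-renderer ./kustomize-post-renderer.sh
}
```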
Why no operator?
The operator pattern has proved very popular as a means for deploying applications to Kubernetes. Typically this involves extending the Kubernetes API with a custom resource, the fields of which represent the configuration necessary for the deployment of the application. The application provider then creates a custom controller that knows how to convert instances of the custom resource into instances of the application. The advantage of this approach is that the controller can not only handle the creation of the required Kubernetes resources but also perform other actions, such as managing state.
Whilst we won’t rule out creating an operator for Corda, right now our focus is on ensuring that the platform is sufficiently cloud-native that the workers themselves can handle these situations, without requiring orchestration from other parts of the platform.
The broader ecosystem
The advantage of using a well-established tool such as Helm is that it brings with it a broader ecosystem. For example, whether you use Ansible or Terraform for managing your wider infrastructure deployment, there are integrations for Helm. Or perhaps you’d prefer to take a GitOps approach to your Corda deployments using the Helm controller for Flux?
Helm is also an excellent way of consuming other applications. In the corda-dev-helm GitHub repository, for example, we provide a Helm chart that can be used to deploy the Corda prerequisites to your Kubernetes cluster for development purposes. This simply packages existing Helm charts for PostgreSQL and Kafka, configured in a suitable way for Corda. The Kafka chart, in turn, packages a ZooKeeper chart. It’s Helm charts all the way down!
Developer adoption
An interesting side-note relates to the adoption of the Helm chart by the Corda developers in R3. Historically, developers had been using Docker Compose to spin up Corda workers along with Kafka and PostgreSQL. The Helm charts for Corda and its prerequisites were first introduced as part of the end-to-end tests run during the Continuous Integration pipeline. The next step from there was to enable developers to run those tests locally in Kubernetes on Docker Desktop or minikube so that they could debug any issues that arose during the running of the pipeline.
There are obvious benefits to developers using the same artifacts that we intend customers to use in production, not least avoiding the overhead of keeping the Compose files and Helm chart in sync. Before long, it became apparent that we should get all developers to use the Helm charts when running a Corda cluster and the Compose files were sunset. This required some additional learning but the team quickly adapted their ways of working.
In summary
I hope this post has given you some insight into the decision process for using Helm for the deployment of Corda 5 and the capabilities that it enables. We look forward to hearing your feedback!