Kubernetes Operator journey EP5: what upgrade means and how to upgrade the operator
There are a thousand versions of operator upgrade in a thousand people’s heads, just as there are a thousand Hamlets in a thousand people’s eyes. To keep everyone on the same page, we will try to define the scope of operator upgrade. Currently, the Knative community has both the serving-operator and the eventing-operator online. The serving-operator has made its first release, and the eventing-operator is on its way to its first release. The upgrade feature will come in handy once we ship multiple releases. We take the knative serving-operator project as the example in what follows.
One-on-one version mapping for operator:
To start with an easy scenario, which is what the operators currently support, each instance of the operator can install one and only one version of the services it operates. We keep this one-on-one version mapping between the operator and its services as the prerequisite for upgrade support. For example, when we install serving-operator version 0.9.0, knative serving version 0.9.0 will be installed. If we upgrade the serving-operator to 0.9.1, knative serving will be upgraded automatically to 0.9.1, or to whichever version serving-operator 0.9.1 points to. The same applies to the downgrade feature. I am sure there are other scenarios for upgrade, and even for downgrade, but so far we focus on this easy one.
The one-on-one version mapping here has a broad meaning: one version of the operator, the CRD, knative serving, and all the other resources are bundled together. Each component only works with the others at the same version. For example, CRD 0.9.0 only works with operator 0.9.0, knative serving 0.9.0 and all the other resources at 0.9.0. If any of them goes to 0.9.1, everything needs to go to 0.9.1 in order to work. Please note that we DO NOT consider cross-version support, such as operator 0.9.1 recognizing CRD 0.9.0. If we want to deal with CRD 0.9.0, we go to operator 0.9.0.
This one-on-one version mapping simplifies the implementation of operator upgrade, because the operator only needs to recognize the CRD, knative serving and other resources of a single version. If anything moves up or down to another version, everything, including the operator, CRD, knative serving and other resources, moves to that version as well in order to work.
What does it mean to do an upgrade?
- upgrade of the operator,
- upgrade of the CRD,
- upgrade of the target operated module, like knative serving for the knative serving-operator,
- and upgrade of all the other resources (I am still trying to list them all),
all at the same time.
For the knative serving-operator, all the other resources include Role, ClusterRole, RoleBinding, ClusterRoleBinding, ServiceAccount, and ConfigMap.
When we speak of operator upgrade, we should consider at least the four aspects above in order to make it complete. This concept can be generalized to any other operator when upgrade is being discussed.
How to implement the upgrade?
There are multiple ways to initiate an upgrade or even a downgrade. For example, etcd-operator, the icon of Kubernetes operators, exposes the version and repository in the custom resource, so that an upgrade or downgrade can be kicked off by applying a new CR, and the reconcile loop does the work. However, to simplify the upgrade process, the Knative community decided to start the operator upgrade by applying the new manifest, that is, the released YAML file of the operator. In other words, the process of upgrading is equivalent to the process of installing the operator. If no operator is installed, applying the manifest installs the operator at that version. If an operator is already installed, applying the manifest upgrades it to that version.
To make it happen, there are several points we need to take into account:
- Ship the manifest (YAML file) of the target operated module together with the operator image. This has already been done in the knative serving-operator, since the manifest of knative serving is bundled in the image and available under the path that KO_DATA_PATH points to.
- Consolidate the CRD and all the other resources with the operator’s deployment to release one manifest (YAML file). Some operators treat the CRD installation as a prerequisite, which breaks the upgrade process into multiple steps. Unless there is a special requirement to do that, we can ship the CRD and all the other resources together with the deployment in one single manifest.
As long as these two criteria are met, we should be able to do the upgrade by merely applying the new manifest. Please comment down below if you have other great ideas for doing upgrades.
Follow Vincent, (and) you won’t derail!