Kubernetes-native declarative infrastructure for OpenStack.

For documentation, see the Cluster API Provider OpenStack book.

What is the Cluster API Provider OpenStack

The Cluster API brings declarative, Kubernetes-style APIs to cluster creation, configuration and management.

The API itself is shared across multiple cloud providers, allowing for true OpenStack hybrid deployments of Kubernetes. It builds on the lessons learned from previous cluster managers such as kops and kubicorn.

Launching a Kubernetes cluster on OpenStack

Check out the Cluster API Quick Start to create your first Kubernetes cluster on OpenStack using Cluster API. If you wish to use the external cloud provider, check out the External Cloud Provider as well.

Features

Native Kubernetes manifests and API
Choice of Linux distribution (as long as a current cloud-init is available)
Support for single and multi-node control plane clusters
Deploy clusters with and without LBaaS available (only clusters with LBaaS can be upgraded)
Support for security groups
cloud-init based node bootstrapping

Compatibility with Cluster API and Kubernetes Versions

This provider’s versions are compatible with the following versions of Cluster API:

	v1beta1 (v1.x)
OpenStack Provider v1alpha5 (v0.6)	✓
OpenStack Provider v1alpha6 (v0.7)	✓
OpenStack Provider v1alpha7 (v0.9)	✓
OpenStack Provider v1beta1	✓

This provider’s versions are able to install and manage the following versions of Kubernetes:

	v1.25	v1.26	v1.27	v1.28
OpenStack Provider v1alpha5 (v0.6)	✓	+	+	+
OpenStack Provider v1alpha6 (v0.7)	✓	✓	✓	+
OpenStack Provider v1alpha7 (v0.9)	+	✓	✓	★
OpenStack Provider v1beta1	+	✓	✓	★

This provider’s versions are able to install Kubernetes to the following versions of OpenStack:

	Queens	Rocky	Stein	Train	Ussuri	Victoria	Wallaby	Xena	Yoga	Bobcat
OpenStack Provider v1alpha5 (v0.6)	+	+	+	+	+	✓	✓	✓	✓	★
OpenStack Provider v1alpha6 (v0.7)	+	+	+	+	+	✓	✓	✓	✓	★
OpenStack Provider v1alpha7 (v0.9)		+	+	+	+	✓	✓	✓	✓	★
OpenStack Provider v1beta1		+	+	+	+	✓	✓	✓	✓	★

Test status:

★ currently testing
✓ previously tested
+ should work, but we weren’t able to test it

Older versions may also work but we have not verified.

Each version of Cluster API for OpenStack will attempt to support two Kubernetes versions.

NOTE: As the versioning for this project is tied to the versioning of Cluster API, future modifications to this policy may be made to align more closely with other providers in the Cluster API ecosystem.

NOTE: The minimum Nova microversion used by CAPI is now 2.60, due to server tags support as well as permitting multiattach volume types; see the code for additional information.

NOTE: We require Keystone v3 for authentication.

Development versions

Cluster API Provider OpenStack images and manifests are published after every PR merge and once every day:

With a Google Cloud account you can get a quick overview here
The manifests are available under:
- master/infrastructure-components.yaml: latest build from the main branch, overwritten after every merge
- e.g. nightly_master_20210407/infrastructure-components.yaml: build of the main branch from 7th April
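
The naming scheme above can be sketched as a small helper. This is only an illustration: the function name `nightly_manifest_path` is hypothetical, and the YYYYMMDD date format (and the use of UTC for today's date) is inferred from the single example above rather than confirmed by the project.

```shell
# Build the relative path of a nightly manifest for a given date.
# Assumption: the date component is formatted as YYYYMMDD.
nightly_manifest_path() {
  echo "nightly_master_$1/infrastructure-components.yaml"
}

# Path for the build mentioned above (7th April 2021).
nightly_manifest_path "20210407"

# Path for today's build, assuming the date is taken in UTC.
nightly_manifest_path "$(date -u +%Y%m%d)"
```

The path is relative to the location where the manifests are published.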

These artifacts are published via Prow and Google Cloud Build. The corresponding job definitions can be found here.

Operating system images

Note: Cluster API Provider OpenStack relies on a few prerequisites that must already be installed in the operating system images used, e.g. a container runtime, kubelet, and kubeadm. Reference images can be found in kubernetes-sigs/image-builder. If it isn’t possible to pre-install those prerequisites in the image, you can always deploy and execute some custom scripts through the KubeadmConfig.

Documentation

Please see our book for in-depth documentation.

Getting involved and contributing

Are you interested in contributing to cluster-api-provider-openstack? We, the maintainers and community, would love your suggestions, contributions, and help! Also, the maintainers can be contacted at any time to learn more about how to get involved:

via the cluster-api-openstack channel on Kubernetes Slack
via the SIG-Cluster-Lifecycle Mailing List.
during our Office Hours
- bi-weekly on Wednesdays @ 14:00 UTC on Zoom (link in meeting notes)
- Previous meetings: [ notes | recordings ]

In the interest of getting more new people involved we try to tag issues with good first issue. These are typically issues that have smaller scope but are good ways to start to get acquainted with the codebase.

We also encourage ALL active community participants to act as if they are maintainers, even if they don’t have “official” write permissions. This is a community effort; we are here to serve the Kubernetes community. If you have an active interest and you want to get involved, you have real power! Don’t assume that the only people who can get things done around here are the “maintainers”.

We also would love to add more “official” maintainers, so show us what you can do!

This repository uses the Kubernetes bots. See a full list of the commands here. Please also refer to the Contribution Guide and the Development Guide for this project.

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

GitHub issues

Bugs

If you think you have found a bug, please follow the instructions below.

Please spend a small amount of time giving due diligence to the issue tracker. Your issue might be a duplicate.
Get the logs from the cluster controllers and paste them into your issue.
Open a new issue.
Remember that users might be searching for your issue in the future, so please give it a meaningful title to help others.
Feel free to reach out to the Cluster API community on the Kubernetes Slack.

Tracking new features

We also use the issue tracker to track features. If you have an idea for a feature, or think you can help Cluster API Provider OpenStack become even more awesome, follow the steps below.

Open a new issue.
Remember that users might be searching for your issue in the future, so please give it a meaningful title to help others.
Clearly define the use case, using concrete examples.
Some of our larger features will require some design. If you would like to include a technical design for your feature, please include it in the issue.
After the new feature is well understood, and the design agreed upon, we can start coding the feature. We would love for you to code it. So please open up a WIP (work in progress) pull request, and happy coding.

Getting Started

Quick Start

In this tutorial we’ll cover the basics of how to use Cluster API to create one or more Kubernetes clusters.

Installation

There are two major quickstart paths: using clusterctl or using the Cluster API Operator.

This article describes a path that uses the clusterctl CLI tool to handle the lifecycle of a Cluster API management cluster.

The clusterctl command line interface is specifically designed for providing a simple “day 1 experience” and a quick start with Cluster API. It automates fetching the YAML files defining provider components and installing them.

Additionally, it encodes a set of best practices for managing providers that help the user avoid misconfigurations and manage day 2 operations such as upgrades.

The Cluster API Operator is a Kubernetes Operator built on top of clusterctl and designed to empower cluster administrators to handle the lifecycle of Cluster API providers within a management cluster using a declarative approach. It aims to improve user experience in deploying and managing Cluster API, making it easier to handle day-to-day tasks and automate workflows with GitOps. Visit the CAPI Operator quickstart if you want to experiment with this tool.

Common Prerequisites

Install and set up kubectl in your local environment
Install kind and Docker
Install Helm

Install and/or configure a Kubernetes cluster

Cluster API requires an existing Kubernetes cluster accessible via kubectl. During the installation process the Kubernetes cluster will be transformed into a management cluster by installing the Cluster API provider components, so it is recommended to keep it separated from any application workload.

It is a common practice to create a temporary, local bootstrap cluster which is then used to provision a target management cluster on the selected infrastructure provider.

Choose one of the options below:

Existing Management Cluster

For production use-cases a “real” Kubernetes cluster should be used with appropriate backup and disaster recovery policies and procedures in place. The Kubernetes cluster must be at least v1.20.0.
```
export KUBECONFIG=<...>
```
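
The minimum version requirement above can be checked with a small comparison helper. The sketch below is an illustration only: `version_ge` is a hypothetical name, it relies on `sort -V` (available in GNU coreutils and recent macOS), and the version passed in is hard-coded where in practice it would come from your cluster.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Runs in a subshell so the helper variables do not leak.
version_ge() (
  # Strip a leading "v" so "v1.20.0" and "1.20.0" compare the same way.
  have="${1#v}"
  want="${2#v}"
  # With version sort, the smaller of the two versions comes first.
  lowest="$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)"
  [ "$lowest" = "$want" ]
)

# Example with a hard-coded server version.
if version_ge "v1.28.3" "v1.20.0"; then
  echo "management cluster version is new enough"
else
  echo "management cluster must be at least v1.20.0" >&2
fi
```

The same helper can be reused for other minimum versions mentioned in this guide, such as the minimum supported kind version.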

Kind

Warning

kind is not designed for production use.

Minimum kind supported version: v0.22.0

Help with common issues can be found in the Troubleshooting Guide.

Note for macOS users: you may need to increase the memory available for containers (recommend 6 GB for CAPD).

Note for Linux users: you may need to increase ulimit and inotify when using Docker (CAPD).

kind can be used for creating a local Kubernetes cluster for development environments or for the creation of a temporary bootstrap cluster used to provision a target management cluster on the selected infrastructure provider.

The installation procedure depends on the version of kind; if you are planning to use the Docker infrastructure provider, please follow the additional instructions in the dedicated tab:
Default | Docker | KubeVirt
Create the kind cluster:
```
kind create cluster
```
Test to ensure the local kind cluster is ready:
```
kubectl cluster-info
```
Run the following command to create a kind config file for allowing the Docker provider to access Docker on the host:
```
cat > kind-cluster-with-extramounts.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  ipFamily: dual
nodes:
- role: control-plane
  extraMounts:
    - hostPath: /var/run/docker.sock
      containerPath: /var/run/docker.sock
EOF
```
Then, following the instructions for your kind version, create the management cluster from the file above by running kind create cluster --config kind-cluster-with-extramounts.yaml.
Create the Kind Cluster

KubeVirt is a cloud native virtualization solution. The virtual machines we’re going to create and use for the workload cluster’s nodes, are actually running within pods in the management cluster. In order to communicate with the workload cluster’s API server, we’ll need to expose it. We are using Kind which is a limited environment. The easiest way to expose the workload cluster’s API server (a pod within a node running in a VM that is itself running within a pod in the management cluster, that is running inside a Docker container), is to use a LoadBalancer service.

To allow using a LoadBalancer service, we can’t use kind’s default CNI (kindnet); we’ll need to install another CNI, like Calico. To do that, we first need to create the kind cluster with two modifications:
1. Disable the default CNI
2. Add the Docker credentials to the cluster, to avoid the Docker Hub pull rate limit of the calico images; read more about it in the docker documentation, and in the kind documentation.
Create a configuration file for kind. Please notice the Docker config file path, and adjust it to your local setting:
```
cat <<EOF > kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
# the default CNI will not be installed
  disableDefaultCNI: true
nodes:
- role: control-plane
  extraMounts:
   - containerPath: /var/lib/kubelet/config.json
     hostPath: <YOUR DOCKER CONFIG FILE PATH>
EOF
```
Now, create the kind cluster with the configuration file:
```
kind create cluster --config=kind-config.yaml
```
Test to ensure the local kind cluster is ready:
```
kubectl cluster-info
```
Install the Calico CNI

Now we’ll need to install a CNI. In this example, we’re using calico, but other CNIs should work as well. Please see calico installation guide for more details (use the “Manifest” tab). Below is an example of how to install calico version v3.24.4.

Use the Calico manifest to create the required resources; e.g.:
```
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.4/manifests/calico.yaml
```

Install clusterctl

The clusterctl CLI tool handles the lifecycle of a Cluster API management cluster.

Linux | macOS | homebrew | Windows

Install clusterctl binary with curl on Linux

If you are unsure, you can determine your computer’s architecture by running uname -a

Download for AMD64:

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/clusterctl-linux-amd64 -o clusterctl

Download for ARM64:

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/clusterctl-linux-arm64 -o clusterctl

Download for PPC64LE:

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/clusterctl-linux-ppc64le -o clusterctl

Install clusterctl:

sudo install -o root -g root -m 0755 clusterctl /usr/local/bin/clusterctl

Test to ensure the version you installed is up-to-date:

clusterctl version
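
The architecture choice above can be sketched as a helper that picks the right download link. This is a minimal sketch: the function names `clusterctl_arch` and `clusterctl_url` are hypothetical, the mapping from `uname -m` output to asset suffix is an assumption covering only the three architectures listed above, and the URL simply follows the pattern of the links shown.

```shell
# Map `uname -m` output to the architecture suffix used in the asset names.
clusterctl_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    ppc64le)       echo "ppc64le" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the download URL from release version, OS and architecture suffix.
clusterctl_url() {
  echo "https://github.com/kubernetes-sigs/cluster-api/releases/download/$1/clusterctl-$2-$3"
}

# Print the URL for this machine; pass the result to `curl -L ... -o clusterctl`.
if detected_arch="$(clusterctl_arch "$(uname -m)")"; then
  clusterctl_url "v1.6.2" "linux" "$detected_arch"
fi
```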

Install clusterctl binary with curl on macOS

If you are unsure, you can determine your computer’s architecture by running uname -a

Download for AMD64:

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/clusterctl-darwin-amd64 -o clusterctl

Download for M1 CPU (“Apple Silicon”) / ARM64:

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/clusterctl-darwin-arm64 -o clusterctl

Make the clusterctl binary executable.

chmod +x ./clusterctl

Move the binary into a directory in your PATH.

sudo mv ./clusterctl /usr/local/bin/clusterctl

Test to ensure the version you installed is up-to-date:

clusterctl version

Install clusterctl with homebrew on macOS and Linux

Install the latest release using homebrew:

brew install clusterctl

Test to ensure the version you installed is up-to-date:

clusterctl version

Install clusterctl binary with curl on Windows using PowerShell

Go to the working directory where you want clusterctl downloaded.

Download the latest release; on Windows, type:

curl.exe -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/clusterctl-windows-amd64.exe -o clusterctl.exe

Append or prepend the path of that directory to the PATH environment variable.

Test to ensure the version you installed is up-to-date:

clusterctl.exe version

Initialize the management cluster

Now that we’ve got clusterctl installed and all the prerequisites in place, let’s transform the Kubernetes cluster into a management cluster by using clusterctl init.

The command accepts as input a list of providers to install; when executed for the first time, clusterctl init automatically adds to the list the cluster-api core provider, and if unspecified, it also adds the kubeadm bootstrap and kubeadm control-plane providers.
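
To make the defaults described above visible, the sketch below prints an init command with every provider spelled out, which should have the same effect as passing only the infrastructure provider on a first run. The helper name `init_command` is hypothetical; it only prints the command so you can review it before running it against your cluster.

```shell
# Print (rather than run) a clusterctl init command that names the core,
# bootstrap and control-plane providers explicitly.
init_command() {
  echo "clusterctl init --core cluster-api --bootstrap kubeadm --control-plane kubeadm --infrastructure $1"
}

# Show the explicit form for the OpenStack infrastructure provider.
init_command openstack
```

Spelling out the providers is mainly useful when you want to choose a different bootstrap or control-plane provider than the kubeadm defaults.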

Enabling Feature Gates

Feature gates can be enabled by exporting environment variables before executing clusterctl init. For example, the ClusterTopology feature, which is required to enable support for managed topologies and ClusterClass, can be enabled via:

export CLUSTER_TOPOLOGY=true

Additional documentation about experimental features can be found in Experimental Features.
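
Because feature gates are read from the environment, it can help to check which ones are exported before running clusterctl init. The following is a small sketch; `show_feature_gates` is a hypothetical helper, and the list of variables is limited to the ones mentioned in this guide.

```shell
# Print the exported value of each feature-gate variable used in this guide.
show_feature_gates() {
  for name in CLUSTER_TOPOLOGY EXP_MACHINE_POOL EXP_CLUSTER_RESOURCE_SET; do
    # printenv only sees exported variables, which matches what a child
    # process such as clusterctl would receive.
    value="$(printenv "$name" || echo not-set)"
    echo "${name}=${value}"
  done
}

export CLUSTER_TOPOLOGY=true
show_feature_gates
```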

Initialization for common providers

Depending on the infrastructure provider you are planning to use, some additional prerequisites should be satisfied before getting started with Cluster API. See below for the expected settings for common providers.

AWS | Azure | CloudStack | DigitalOcean | Docker | Equinix Metal | GCP | Hetzner | IBM Cloud | K0smotron | KubeKey | KubeVirt | Metal3 | Nutanix | OCI | OpenStack | Outscale | Proxmox | VCD | vcluster | Virtink | vSphere

Download the latest binary of clusterawsadm from the AWS provider releases. The clusterawsadm command line utility assists with identity and access management (IAM) for Cluster API Provider AWS.

Linux | macOS | homebrew | Windows

Download the latest release; on Linux, type:

curl -L https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases/download/v0.0.0/clusterawsadm-linux-amd64 -o clusterawsadm

Make it executable

chmod +x clusterawsadm

Move the binary to a directory present in your PATH

sudo mv clusterawsadm /usr/local/bin

Check version to confirm installation

clusterawsadm version

Example Usage

export AWS_REGION=us-east-1 # This is used to help encode your environment variables
export AWS_ACCESS_KEY_ID=<your-access-key>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_SESSION_TOKEN=<session-token> # If you are using Multi-Factor Auth.

# The clusterawsadm utility takes the credentials that you set as environment
# variables and uses them to create a CloudFormation stack in your AWS account
# with the correct IAM resources.
clusterawsadm bootstrap iam create-cloudformation-stack

# Create the base64 encoded credentials using clusterawsadm.
# This command uses your environment variables and encodes
# them in a value to be stored in a Kubernetes Secret.
export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)

# Finally, initialize the management cluster
clusterctl init --infrastructure aws

Download the latest release; on macOS, type:

curl -L https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases/download/v0.0.0/clusterawsadm-darwin-amd64 -o clusterawsadm

Or if your Mac has an M1 CPU (“Apple Silicon”):

curl -L https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases/download/v0.0.0/clusterawsadm-darwin-arm64 -o clusterawsadm

Make it executable

chmod +x clusterawsadm

Move the binary to a directory present in your PATH

sudo mv clusterawsadm /usr/local/bin

Check version to confirm installation

clusterawsadm version

Example Usage

export AWS_REGION=us-east-1 # This is used to help encode your environment variables
export AWS_ACCESS_KEY_ID=<your-access-key>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_SESSION_TOKEN=<session-token> # If you are using Multi-Factor Auth.

# The clusterawsadm utility takes the credentials that you set as environment
# variables and uses them to create a CloudFormation stack in your AWS account
# with the correct IAM resources.
clusterawsadm bootstrap iam create-cloudformation-stack

# Create the base64 encoded credentials using clusterawsadm.
# This command uses your environment variables and encodes
# them in a value to be stored in a Kubernetes Secret.
export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)

# Finally, initialize the management cluster
clusterctl init --infrastructure aws

Install the latest release using homebrew:

brew install clusterawsadm

Check version to confirm installation

clusterawsadm version

Example Usage

export AWS_REGION=us-east-1 # This is used to help encode your environment variables
export AWS_ACCESS_KEY_ID=<your-access-key>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_SESSION_TOKEN=<session-token> # If you are using Multi-Factor Auth.

# The clusterawsadm utility takes the credentials that you set as environment
# variables and uses them to create a CloudFormation stack in your AWS account
# with the correct IAM resources.
clusterawsadm bootstrap iam create-cloudformation-stack

# Create the base64 encoded credentials using clusterawsadm.
# This command uses your environment variables and encodes
# them in a value to be stored in a Kubernetes Secret.
export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)

# Finally, initialize the management cluster
clusterctl init --infrastructure aws

Download the latest release; on Windows, type:

curl.exe -L https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases/download/v0.0.0/clusterawsadm-windows-amd64.exe -o clusterawsadm.exe

Append or prepend the path of that directory to the PATH environment variable.

Check version to confirm installation

clusterawsadm.exe version

Example Usage in PowerShell

$Env:AWS_REGION="us-east-1" # This is used to help encode your environment variables
$Env:AWS_ACCESS_KEY_ID="<your-access-key>"
$Env:AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
$Env:AWS_SESSION_TOKEN="<session-token>" # If you are using Multi-Factor Auth.

# The clusterawsadm utility takes the credentials that you set as environment
# variables and uses them to create a CloudFormation stack in your AWS account
# with the correct IAM resources.
clusterawsadm bootstrap iam create-cloudformation-stack

# Create the base64 encoded credentials using clusterawsadm.
# This command uses your environment variables and encodes
# them in a value to be stored in a Kubernetes Secret.
$Env:AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)

# Finally, initialize the management cluster
clusterctl init --infrastructure aws

See the AWS provider prerequisites document for more details.

For more information about authorization, AAD, or requirements for Azure, visit the Azure provider prerequisites document.

export AZURE_SUBSCRIPTION_ID="<SubscriptionId>"

# Create an Azure Service Principal and paste the output here
export AZURE_TENANT_ID="<Tenant>"
export AZURE_CLIENT_ID="<AppId>"
export AZURE_CLIENT_SECRET="<Password>"

# Base64 encode the variables
export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "$AZURE_SUBSCRIPTION_ID" | base64 | tr -d '\n')"
export AZURE_TENANT_ID_B64="$(echo -n "$AZURE_TENANT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_ID_B64="$(echo -n "$AZURE_CLIENT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_SECRET_B64="$(echo -n "$AZURE_CLIENT_SECRET" | base64 | tr -d '\n')"

# Settings needed for AzureClusterIdentity used by the AzureCluster
export AZURE_CLUSTER_IDENTITY_SECRET_NAME="cluster-identity-secret"
export CLUSTER_IDENTITY_NAME="cluster-identity"
export AZURE_CLUSTER_IDENTITY_SECRET_NAMESPACE="default"

# Create a secret to include the password of the Service Principal identity created in Azure
# This secret will be referenced by the AzureClusterIdentity used by the AzureCluster
kubectl create secret generic "${AZURE_CLUSTER_IDENTITY_SECRET_NAME}" --from-literal=clientSecret="${AZURE_CLIENT_SECRET}" --namespace "${AZURE_CLUSTER_IDENTITY_SECRET_NAMESPACE}"

# Finally, initialize the management cluster
clusterctl init --infrastructure azure
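
The repeated encoding step above can be wrapped in a helper. This is a minimal sketch: `b64` is a hypothetical function name, and it uses printf instead of echo -n because the behaviour of echo -n differs between shells.

```shell
# Base64-encode a value on a single line, without a trailing newline
# in either the input or the output.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with a placeholder value; real values would come from the
# variables exported above, e.g. "$AZURE_TENANT_ID".
EXAMPLE_B64="$(b64 "example-value")"
export EXAMPLE_B64
echo "$EXAMPLE_B64"
```

The tr step matters for long values, since some base64 implementations wrap their output across several lines.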

Create a file named cloud-config in the repo’s root directory, substituting in your own environment’s values

[Global]
api-url = <cloudstackApiUrl>
api-key = <cloudstackApiKey>
secret-key = <cloudstackSecretKey>

Create the base64 encoded credentials from your credentials file. This command encodes the contents of the file in a value to be stored in a Kubernetes Secret.

export CLOUDSTACK_B64ENCODED_SECRET=`cat cloud-config | base64 | tr -d '\n'`

Finally, initialize the management cluster

clusterctl init --infrastructure cloudstack

export DIGITALOCEAN_ACCESS_TOKEN=<your-access-token>
export DO_B64ENCODED_CREDENTIALS="$(echo -n "${DIGITALOCEAN_ACCESS_TOKEN}" | base64 | tr -d '\n')"

# Initialize the management cluster
clusterctl init --infrastructure digitalocean

The Docker provider requires the ClusterTopology and MachinePool features to deploy ClusterClass-based clusters. We are only supporting ClusterClass-based cluster-templates in this quickstart as ClusterClass makes it possible to adapt configuration based on Kubernetes version. This is required to install Kubernetes clusters < v1.24 and for the upgrade from v1.23 to v1.24 as we have to use different cgroupDrivers depending on Kubernetes version.

# Enable the experimental Cluster topology feature.
export CLUSTER_TOPOLOGY=true

# Enable the experimental Machine Pool feature
export EXP_MACHINE_POOL=true

# Initialize the management cluster
clusterctl init --infrastructure docker

In order to initialize the Equinix Metal Provider (formerly Packet) you have to expose the environment variable PACKET_API_KEY. This variable is used to authorize the infrastructure provider manager against the Equinix Metal API. You can retrieve your token directly from the Equinix Metal Console.

export PACKET_API_KEY="34ts3g4s5g45gd45dhdh"

clusterctl init --infrastructure packet

# Create the base64 encoded credentials by catting your credentials json.
# This command uses your environment variables and encodes
# them in a value to be stored in a Kubernetes Secret.
export GCP_B64ENCODED_CREDENTIALS=$( cat /path/to/gcp-credentials.json | base64 | tr -d '\n' )

# Finally, initialize the management cluster
clusterctl init --infrastructure gcp

Please visit the Hetzner project.

In order to initialize the IBM Cloud Provider you have to expose the environment variable IBMCLOUD_API_KEY. This variable is used to authorize the infrastructure provider manager against the IBM Cloud API. To create one from the UI, refer here.

export IBMCLOUD_API_KEY=<your_api_key>

# Finally, initialize the management cluster
clusterctl init --infrastructure ibmcloud

# Initialize the management cluster
clusterctl init --infrastructure k0sproject-k0smotron

# Initialize the management cluster
clusterctl init --infrastructure kubekey

Please visit the KubeVirt project for more information.

As described above, we want to use a LoadBalancer service to expose the workload cluster’s API server. In the example below, we will use MetalLB to implement load balancing for our kind cluster. Other solutions should work as well.

Install MetalLB for load balancing

Install MetalLB, as described here; for example:

METALLB_VER=$(curl "https://api.github.com/repos/metallb/metallb/releases/latest" | jq -r ".tag_name")
kubectl apply -f "https://raw.githubusercontent.com/metallb/metallb/${METALLB_VER}/config/manifests/metallb-native.yaml"
kubectl wait pods -n metallb-system -l app=metallb,component=controller --for=condition=Ready --timeout=10m
kubectl wait pods -n metallb-system -l app=metallb,component=speaker --for=condition=Ready --timeout=2m

Now, we’ll create the IPAddressPool and the L2Advertisement custom resources. The script below creates the CRs with addresses that match the kind cluster’s network:

GW_IP=$(docker network inspect -f '{{range .IPAM.Config}}{{.Gateway}}{{end}}' kind)
NET_IP=$(echo "${GW_IP}" | sed -E 's|^([0-9]+\.[0-9]+)\..*$|\1|g')
cat <<EOF | sed -E "s|172\.19|${NET_IP}|g" | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: capi-ip-pool
  namespace: metallb-system
spec:
  addresses:
  - 172.19.255.200-172.19.255.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: empty
  namespace: metallb-system
EOF
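
The address substitution performed by the script above can be looked at in isolation. The sketch below is an illustration under two assumptions: the gateway is an IPv4 address, and the kind network is a /16, so the first two octets identify it. The function name `pool_range` is hypothetical.

```shell
# Derive the MetalLB address range from the gateway IP of the kind network.
# Runs in a subshell so the helper variable does not leak.
pool_range() (
  # Keep the first two octets, using the same expression as the script above.
  prefix="$(printf '%s\n' "$1" | sed -E 's|^([0-9]+\.[0-9]+)\..*$|\1|')"
  echo "${prefix}.255.200-${prefix}.255.250"
)

# For a gateway of 172.18.0.1 this prints 172.18.255.200-172.18.255.250
pool_range "172.18.0.1"
```

Printing the range first is a cheap way to confirm that it falls inside the kind network before applying the custom resources.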

Install KubeVirt on the kind cluster

# get KubeVirt version
KV_VER=$(curl "https://api.github.com/repos/kubevirt/kubevirt/releases/latest" | jq -r ".tag_name")
# deploy required CRDs
kubectl apply -f "https://github.com/kubevirt/kubevirt/releases/download/${KV_VER}/kubevirt-operator.yaml"
# deploy the KubeVirt custom resource
kubectl apply -f "https://github.com/kubevirt/kubevirt/releases/download/${KV_VER}/kubevirt-cr.yaml"
kubectl wait -n kubevirt kv kubevirt --for=condition=Available --timeout=10m

Initialize the management cluster with the KubeVirt Provider

clusterctl init --infrastructure kubevirt

Please visit the Metal3 project.

Please follow the Cluster API Provider for Nutanix Getting Started Guide

Please follow the Cluster API Provider for Oracle Cloud Infrastructure (OCI) Getting Started Guide

# Initialize the management cluster
clusterctl init --infrastructure openstack

export OSC_SECRET_KEY=<your-secret-key>
export OSC_ACCESS_KEY=<your-access-key>
export OSC_REGION=<your-region>
# Create namespace
kubectl create namespace cluster-api-provider-outscale-system
# Create secret
kubectl create secret generic cluster-api-provider-outscale --from-literal=access_key="${OSC_ACCESS_KEY}" --from-literal=secret_key="${OSC_SECRET_KEY}" --from-literal=region="${OSC_REGION}" -n cluster-api-provider-outscale-system
# Initialize the management cluster
clusterctl init --infrastructure outscale

First, we need to add the IPAM provider to your clusterctl config file ($XDG_CONFIG_HOME/cluster-api/clusterctl.yaml):

providers:
  - name: in-cluster
    url: https://github.com/kubernetes-sigs/cluster-api-ipam-provider-in-cluster/releases/latest/ipam-components.yaml
    type: IPAMProvider

# The host for the Proxmox cluster
export PROXMOX_URL="https://pve.example:8006"
# The Proxmox token ID to access the remote Proxmox endpoint
export PROXMOX_TOKEN='root@pam!capi'
# The secret associated with the token ID
# You may want to set this in `$XDG_CONFIG_HOME/cluster-api/clusterctl.yaml` so your password is not in
# bash history
export PROXMOX_SECRET="1234-1234-1234-1234"


# Finally, initialize the management cluster
clusterctl init --infrastructure proxmox --ipam in-cluster

For more information about the CAPI provider for Proxmox, see the Proxmox project.
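
The clusterctl config file location referenced above depends on XDG_CONFIG_HOME. The sketch below resolves it with a fallback to $HOME/.config, which is the default from the XDG Base Directory specification; this fallback is an assumption, and the location clusterctl uses when the variable is unset may differ by platform. The function name `clusterctl_config_path` is hypothetical.

```shell
# Resolve the path of the clusterctl config file.
# Assumption: fall back to $HOME/.config when XDG_CONFIG_HOME is unset or empty.
clusterctl_config_path() {
  echo "${XDG_CONFIG_HOME:-$HOME/.config}/cluster-api/clusterctl.yaml"
}

config_file="$(clusterctl_config_path)"
if [ -f "$config_file" ]; then
  echo "found clusterctl config at $config_file"
else
  echo "no clusterctl config at $config_file yet"
fi
```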

Please follow the Cluster API Provider for Cloud Director Getting Started Guide

EXP_CLUSTER_RESOURCE_SET: "true"

# Initialize the management cluster
clusterctl init --infrastructure vcd

clusterctl init --infrastructure vcluster

Please follow the Cluster API Provider for vcluster Quick Start Guide

# Initialize the management cluster
clusterctl init --infrastructure virtink

# The username used to access the remote vSphere endpoint
export VSPHERE_USERNAME="vi-admin@vsphere.local"
# The password used to access the remote vSphere endpoint
# You may want to set this in `$XDG_CONFIG_HOME/cluster-api/clusterctl.yaml` so your password is not in
# bash history
export VSPHERE_PASSWORD="admin!23"

# Finally, initialize the management cluster
clusterctl init --infrastructure vsphere

For more information about prerequisites, credentials management, or permissions for vSphere, see the vSphere project.

The output of clusterctl init is similar to this:

Fetching providers
Installing cert-manager Version="v1.11.0"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v1.0.0" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.0.0" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.0.0" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-docker" Version="v1.0.0" TargetNamespace="capd-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

  clusterctl generate cluster [name] --kubernetes-version [version] | kubectl apply -f -

Create your first workload cluster

Once the management cluster is ready, you can create your first workload cluster.

Preparing the workload cluster configuration

The clusterctl generate cluster command returns a YAML template for creating a workload cluster.

Required configuration for common providers

Depending on the infrastructure provider you are planning to use, some additional prerequisites should be satisfied before configuring a cluster with Cluster API. Instructions are provided for common providers below.

Otherwise, you can look at the clusterctl generate cluster command documentation for details about how to discover the list of variables required by a cluster template.
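
Since a template only renders correctly when its variables are set, a quick pre-flight check can save a failed run. The following is a sketch only: `require_vars` is a hypothetical helper, and the variable names and values in the example are taken from the AWS settings in this guide purely for illustration.

```shell
# Fail and list the names of any variables that are unset or empty.
# Runs in a subshell so the helper variables do not leak.
require_vars() (
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable named by $name (trusted input only).
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="${missing} ${name}"
  done
  if [ -n "$missing" ]; then
    echo "missing variables:${missing}" >&2
    exit 1
  fi
)

# Example: set two variables, then check them before generating a cluster.
export AWS_REGION=us-east-1
export AWS_SSH_KEY_NAME=default
if require_vars AWS_REGION AWS_SSH_KEY_NAME; then
  echo "all required variables are set"
fi
```

The list of names to pass can be taken from the output of clusterctl generate cluster with the --list-variables flag for your provider.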

AWS | Azure | CloudStack | DigitalOcean | Docker | Equinix Metal | GCP | IBM Cloud | K0smotron | KubeKey | KubeVirt | Metal3 | Nutanix | OpenStack | Outscale | Proxmox | VCD | vcluster | Virtink | vSphere

export AWS_REGION=us-east-1
export AWS_SSH_KEY_NAME=default
# Select instance types
export AWS_CONTROL_PLANE_MACHINE_TYPE=t3.large
export AWS_NODE_MACHINE_TYPE=t3.large

See the AWS provider prerequisites document for more details.

# Name of the Azure datacenter location. Change this value to your desired location.
export AZURE_LOCATION="centralus"

# Select VM types.
export AZURE_CONTROL_PLANE_MACHINE_TYPE="Standard_D2s_v3"
export AZURE_NODE_MACHINE_TYPE="Standard_D2s_v3"

# [Optional] Select resource group. The default value is ${CLUSTER_NAME}.
export AZURE_RESOURCE_GROUP="<ResourceGroupName>"

A Cluster API compatible image must be available in your CloudStack installation. For instructions on how to build a compatible image, see image-builder (CloudStack).

Prebuilt images can be found here

To see all required CloudStack environment variables execute:

clusterctl generate cluster --infrastructure cloudstack --list-variables capi-quickstart

The following CloudStack environment variables are required.

# Set this to the name of the zone in which to deploy the cluster
export CLOUDSTACK_ZONE_NAME=<zone name>
# The name of the network on which the VMs will reside
export CLOUDSTACK_NETWORK_NAME=<network name>
# The endpoint of the workload cluster
export CLUSTER_ENDPOINT_IP=<cluster endpoint address>
export CLUSTER_ENDPOINT_PORT=<cluster endpoint port>
# The service offering of the control plane nodes
export CLOUDSTACK_CONTROL_PLANE_MACHINE_OFFERING=<control plane service offering name>
# The service offering of the worker nodes
export CLOUDSTACK_WORKER_MACHINE_OFFERING=<worker node service offering name>
# The capi compatible template to use
export CLOUDSTACK_TEMPLATE_NAME=<template name>
# The ssh key to use to log into the nodes
export CLOUDSTACK_SSH_KEY_NAME=<ssh key name>

A full configuration reference can be found in configuration.md.

A ClusterAPI compatible image must be available in your DigitalOcean account. For instructions on how to build a compatible image see image-builder.

export DO_REGION=nyc1
export DO_SSH_KEY_FINGERPRINT=<your-ssh-key-fingerprint>
export DO_CONTROL_PLANE_MACHINE_TYPE=s-2vcpu-2gb
export DO_CONTROL_PLANE_MACHINE_IMAGE=<your-capi-image-id>
export DO_NODE_MACHINE_TYPE=s-2vcpu-2gb
export DO_NODE_MACHINE_IMAGE=<your-capi-image-id>

The Docker provider does not require additional configurations for cluster templates.

However, if you require special network settings you can set the following environment variables:

# The list of service CIDR, default ["10.128.0.0/12"]
export SERVICE_CIDR=["10.96.0.0/12"]

# The list of pod CIDR, default ["192.168.0.0/16"]
export POD_CIDR=["192.168.0.0/16"]

# The service domain, default "cluster.local"
export SERVICE_DOMAIN="k8s.test"

It is also possible, but not recommended, to disable the Pod Security Standard, which is enabled by default:

export POD_SECURITY_STANDARD_ENABLED="false"

There are several required variables you need to set to create a cluster. There are also a few optional tunables if you’d like to change the OS or CIDRs used.

# Required (made up examples shown)
# The project in which your cluster will be placed.
# You have to get one from the Equinix Metal Console if you don't have one already.
export PROJECT_ID="2b59569f-10d1-49a6-a000-c2fb95a959a1"
# This can help to take advantage of automated, interconnected bare metal across our global metros.
export METRO="da"
# What plan to use for your control plane nodes
export CONTROLPLANE_NODE_TYPE="m3.small.x86"
# What plan to use for your worker nodes
export WORKER_NODE_TYPE="m3.small.x86"
# The ssh key you would like to have access to the nodes
export SSH_KEY="ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDvMgVEubPLztrvVKgNPnRe9sZSjAqaYj9nmCkgr4PdK username@computer"
export CLUSTER_NAME="my-cluster"

# Optional (defaults shown)
export NODE_OS="ubuntu_18_04"
export POD_CIDR="192.168.0.0/16"
export SERVICE_CIDR="172.26.0.0/16"
# Only relevant if using the kube-vip flavor
export KUBE_VIP_VERSION="v0.5.0"

# Name of the GCP datacenter location. Change this value to your desired location
export GCP_REGION="<GCP_REGION>"
export GCP_PROJECT="<GCP_PROJECT>"
# Make sure to use same Kubernetes version here as building the GCE image
export KUBERNETES_VERSION=1.23.3
# This is the image you built. See https://github.com/kubernetes-sigs/image-builder
export IMAGE_ID=projects/$GCP_PROJECT/global/images/<built image>
export GCP_CONTROL_PLANE_MACHINE_TYPE=n1-standard-2
export GCP_NODE_MACHINE_TYPE=n1-standard-2
export GCP_NETWORK_NAME=<GCP_NETWORK_NAME or default>
export CLUSTER_NAME="<CLUSTER_NAME>"

See the GCP provider for more information.

# Required environment variables for VPC
# VPC region
export IBMVPC_REGION=us-south
# VPC zone within the region
export IBMVPC_ZONE=us-south-1
# ID of the resource group in which the VPC will be created
export IBMVPC_RESOURCEGROUP=<your-resource-group-id>
# Name of the VPC
export IBMVPC_NAME=ibm-vpc-0
export IBMVPC_IMAGE_ID=<your-image-id>
# Profile for the virtual server instances
export IBMVPC_PROFILE=bx2-4x16
export IBMVPC_SSHKEY_ID=<your-sshkey-id>

# Required environment variables for PowerVS
export IBMPOWERVS_SSHKEY_NAME=<your-ssh-key>
# Internal and external IP of the network
export IBMPOWERVS_VIP=<internal-ip>
export IBMPOWERVS_VIP_EXTERNAL=<external-ip>
export IBMPOWERVS_VIP_CIDR=29
export IBMPOWERVS_IMAGE_NAME=<your-capi-image-name>
# ID of the PowerVS service instance
export IBMPOWERVS_SERVICE_INSTANCE_ID=<service-instance-id>
export IBMPOWERVS_NETWORK_NAME=<your-capi-network-name>

Please visit the IBM Cloud provider for more information.

Please visit the K0smotron provider for more information.

# Required environment variables
# The KKZONE is used to specify where to download the binaries. (e.g. "", "cn")
export KKZONE=""
# The SSH user name of the Linux user on all instances. (e.g. root, ubuntu)
export USER_NAME=<your-linux-user>
# The SSH password of the Linux user on all instances.
export PASSWORD=<your-linux-user-password>
# The SSH IP addresses of all instances. (e.g. "[{address: 192.168.100.3}, {address: 192.168.100.4}]")
export INSTANCES=<your-linux-ip-address>
# The cluster control plane VIP. (e.g. "192.168.100.100")
export CONTROL_PLANE_ENDPOINT_IP=<your-control-plane-virtual-ip>

Please visit the KubeKey provider for more information.

export CAPK_GUEST_K8S_VERSION="v1.23.10"
export CRI_PATH="/var/run/containerd/containerd.sock"
export NODE_VM_IMAGE_TEMPLATE="quay.io/capk/ubuntu-2004-container-disk:${CAPK_GUEST_K8S_VERSION}"

Please visit the KubeVirt project for more information.

Note: If you are running a CAPM3 release prior to v0.5.0, make sure to export the following environment variables. You do not need to export them if you use CAPM3 release v0.5.0 or higher.

# The URL of the kernel to deploy.
export DEPLOY_KERNEL_URL="http://172.22.0.1:6180/images/ironic-python-agent.kernel"
# The URL of the ramdisk to deploy.
export DEPLOY_RAMDISK_URL="http://172.22.0.1:6180/images/ironic-python-agent.initramfs"
# The URL of the Ironic endpoint.
export IRONIC_URL="http://172.22.0.1:6385/v1/"
# The URL of the Ironic inspector endpoint.
export IRONIC_INSPECTOR_URL="http://172.22.0.1:5050/v1/"
# Do not use a dedicated CA certificate for Ironic API. Any value provided in this variable disables additional CA certificate validation.
# To provide a CA certificate, leave this variable unset. If unset, then IRONIC_CA_CERT_B64 must be set.
export IRONIC_NO_CA_CERT=true
# Disables basic authentication for Ironic API. Any value provided in this variable disables authentication.
# To enable authentication, leave this variable unset. If unset, then IRONIC_USERNAME and IRONIC_PASSWORD must be set.
export IRONIC_NO_BASIC_AUTH=true
# Disables basic authentication for Ironic inspector API. Any value provided in this variable disables authentication.
# To enable authentication, leave this variable unset. If unset, then IRONIC_INSPECTOR_USERNAME and IRONIC_INSPECTOR_PASSWORD must be set.
export IRONIC_INSPECTOR_NO_BASIC_AUTH=true

Please visit the Metal3 getting started guide for more details.

A ClusterAPI compatible image must be available in your Nutanix image library. For instructions on how to build a compatible image see image-builder.

To see all required Nutanix environment variables execute:

clusterctl generate cluster --infrastructure nutanix --list-variables capi-quickstart

A ClusterAPI compatible image must be available in your OpenStack. For instructions on how to build a compatible image see image-builder. Depending on your OpenStack and underlying hypervisor the following options might be of interest:

To see all required OpenStack environment variables execute:

clusterctl generate cluster --infrastructure openstack --list-variables capi-quickstart

The following script can be used to export some of them:

wget https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-openstack/master/templates/env.rc -O /tmp/env.rc
source /tmp/env.rc <path/to/clouds.yaml> <cloud>

Apart from the script, the following OpenStack environment variables are required.

# The list of nameservers for OpenStack Subnet being created.
# Set this value when you need to create a new network/subnet and access through DNS is required.
export OPENSTACK_DNS_NAMESERVERS=<dns nameserver>
# FailureDomain is the failure domain the machine will be created in.
export OPENSTACK_FAILURE_DOMAIN=<availability zone name>
# The flavor reference for your control plane server instances.
export OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR=<flavor>
# The flavor reference for your worker node server instances.
export OPENSTACK_NODE_MACHINE_FLAVOR=<flavor>
# The name of the image to use for your server instance. If RootVolume is specified, this is ignored and the root volume is used directly.
export OPENSTACK_IMAGE_NAME=<image name>
# The SSH key pair name
export OPENSTACK_SSH_KEY_NAME=<ssh key pair name>
# The external network
export OPENSTACK_EXTERNAL_NETWORK_ID=<external network ID>

A full configuration reference can be found in configuration.md.
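Because clusterctl generate cluster reads these values from the environment, it can help to confirm they are all exported before generating the template. The following is a minimal sketch, not part of the provider: the helper name check_required_vars and the REQUIRED_OPENSTACK_VARS list are illustrative assumptions, and the list simply mirrors the variables shown above. The authoritative list is whatever --list-variables prints for your template.

```shell
# Assumed list: copied from the variables shown above. Adjust it to match the
# output of `clusterctl generate cluster --infrastructure openstack --list-variables`.
REQUIRED_OPENSTACK_VARS="OPENSTACK_DNS_NAMESERVERS OPENSTACK_FAILURE_DOMAIN OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR OPENSTACK_NODE_MACHINE_FLAVOR OPENSTACK_IMAGE_NAME OPENSTACK_SSH_KEY_NAME OPENSTACK_EXTERNAL_NETWORK_ID"

# check_required_vars NAME...: report every variable that is not exported or is
# empty on stderr, and return 1 if at least one is missing.
check_required_vars() {
  crv_status=0
  for crv_name in "$@"; do
    if [ -z "$(printenv "$crv_name")" ]; then
      echo "missing required variable: $crv_name" >&2
      crv_status=1
    fi
  done
  return "$crv_status"
}
```

A typical use would be to run check_required_vars $REQUIRED_OPENSTACK_VARS first and only call clusterctl generate cluster when it returns successfully.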

A ClusterAPI compatible image must be available in your Outscale account. For instructions on how to build a compatible image see image-builder.

# The outscale root disk iops
export OSC_IOPS="<IOPS>"
# The outscale root disk size
export OSC_VOLUME_SIZE="<VOLUME_SIZE>"
# The outscale root disk volumeType
export OSC_VOLUME_TYPE="<VOLUME_TYPE>"
# The outscale key pair
export OSC_KEYPAIR_NAME="<KEYPAIR_NAME>"
# The outscale subregion name
export OSC_SUBREGION_NAME="<SUBREGION_NAME>"
# The outscale vm type
export OSC_VM_TYPE="<VM_TYPE>"
# The outscale image name
export OSC_IMAGE_NAME="<IMAGE_NAME>"

A ClusterAPI compatible image must be available in your Proxmox cluster. For instructions on how to build a compatible VM template see image-builder.

# The node that hosts the VM template to be used to provision VMs
export PROXMOX_SOURCENODE="pve"
# The template VM ID used for cloning VMs
export TEMPLATE_VMID=100
# The ssh authorized keys used to ssh to the machines.
export VM_SSH_KEYS="ssh-ed25519 ..., ssh-ed25519 ..."
# The IP address used for the control plane endpoint
export CONTROL_PLANE_ENDPOINT_IP=10.10.10.4
# The IP ranges for Cluster nodes
export NODE_IP_RANGES="[10.10.10.5-10.10.10.50, 10.10.10.55-10.10.10.70]"
# The gateway for the machines network-config.
export GATEWAY="10.10.10.1"
# Subnet Mask in CIDR notation for your node IP ranges
export IP_PREFIX=24
# The Proxmox network device for VMs
export BRIDGE="vmbr1"
# The dns nameservers for the machines network-config.
export DNS_SERVERS="[8.8.8.8,8.8.4.4]"
# The Proxmox nodes used for VM deployments
export ALLOWED_NODES="[pve1,pve2,pve3]"

For more information about prerequisites and advanced setups for Proxmox, see the Proxmox getting started guide.

A ClusterAPI compatible image must be available in your VCD catalog. For instructions on how to build and upload a compatible image, see CAPVCD.

To see all required VCD environment variables execute:

clusterctl generate cluster --infrastructure vcd --list-variables capi-quickstart

export CLUSTER_NAME=kind
export CLUSTER_NAMESPACE=vcluster
export KUBERNETES_VERSION=1.23.4
export HELM_VALUES="service:\n  type: NodePort"

Please see the vcluster installation instructions for more details.

To see all required Virtink environment variables execute:

clusterctl generate cluster --infrastructure virtink --list-variables capi-quickstart

See the Virtink provider document for more details.

You must use an official CAPV machine image for your vSphere VM templates. See uploading CAPV machine images for instructions on how to do this.

# The vCenter server IP or FQDN
export VSPHERE_SERVER="10.0.0.1"
# The vSphere datacenter to deploy the management cluster on
export VSPHERE_DATACENTER="SDDC-Datacenter"
# The vSphere datastore to deploy the management cluster on
export VSPHERE_DATASTORE="vsanDatastore"
# The VM network to deploy the management cluster on
export VSPHERE_NETWORK="VM Network"
# The vSphere resource pool for your VMs
export VSPHERE_RESOURCE_POOL="*/Resources"
# The VM folder for your VMs. Set to "" to use the root vSphere folder
export VSPHERE_FOLDER="vm"
# The VM template to use for your VMs
export VSPHERE_TEMPLATE="ubuntu-1804-kube-v1.17.3"
# The public ssh authorized key on all machines
export VSPHERE_SSH_AUTHORIZED_KEY="ssh-rsa AAAAB3N..."
# The certificate thumbprint for the vCenter server
export VSPHERE_TLS_THUMBPRINT="97:48:03:8D:78:A9..."
# The storage policy to be used (optional). Set to "" if not required
export VSPHERE_STORAGE_POLICY="policy-one"
# The IP address used for the control plane endpoint
export CONTROL_PLANE_ENDPOINT_IP="1.2.3.4"

For more information about prerequisites, credentials management, or permissions for vSphere, see the vSphere getting started guide.

Generating the cluster configuration

For the purpose of this tutorial, we’ll name our cluster capi-quickstart.

Docker vcluster KubeVirt Other providers...

clusterctl generate cluster capi-quickstart --flavor development \
  --kubernetes-version v1.29.2 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 \
  > capi-quickstart.yaml

export CLUSTER_NAME=kind
export CLUSTER_NAMESPACE=vcluster
export KUBERNETES_VERSION=1.28.0
export HELM_VALUES="service:\n  type: NodePort"

kubectl create namespace ${CLUSTER_NAMESPACE}
clusterctl generate cluster ${CLUSTER_NAME} \
    --infrastructure vcluster \
    --kubernetes-version ${KUBERNETES_VERSION} \
    --target-namespace ${CLUSTER_NAMESPACE} | kubectl apply -f -

As described above, this tutorial uses a LoadBalancer service to expose the API server of the workload cluster, so we want the load balancer (lb) template rather than the default one. We’ll use clusterctl’s --flavor flag for that:

clusterctl generate cluster capi-quickstart \
  --infrastructure="kubevirt" \
  --flavor lb \
  --kubernetes-version ${CAPK_GUEST_K8S_VERSION} \
  --control-plane-machine-count=1 \
  --worker-machine-count=1 \
  > capi-quickstart.yaml

clusterctl generate cluster capi-quickstart \
  --kubernetes-version v1.29.2 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 \
  > capi-quickstart.yaml

This creates a YAML file named capi-quickstart.yaml with a predefined list of Cluster API objects: Cluster, Machines, Machine Deployments, etc.

If needed, the file can then be modified using your editor of choice.

See clusterctl generate cluster for more details.

Apply the workload cluster

When ready, run the following command to apply the cluster manifest.

kubectl apply -f capi-quickstart.yaml

The output is similar to this:

cluster.cluster.x-k8s.io/capi-quickstart created
dockercluster.infrastructure.cluster.x-k8s.io/capi-quickstart created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/capi-quickstart-control-plane created
dockermachinetemplate.infrastructure.cluster.x-k8s.io/capi-quickstart-control-plane created
machinedeployment.cluster.x-k8s.io/capi-quickstart-md-0 created
dockermachinetemplate.infrastructure.cluster.x-k8s.io/capi-quickstart-md-0 created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/capi-quickstart-md-0 created

Accessing the workload cluster

The cluster will now start provisioning. You can check status with:

kubectl get cluster

You can also get an “at a glance” view of the cluster and its resources by running:

clusterctl describe cluster capi-quickstart

and see an output similar to this:

NAME              PHASE         AGE   VERSION
capi-quickstart   Provisioned   8s    v1.29.2

To verify the first control plane is up:

kubectl get kubeadmcontrolplane

You should see output similar to this:

NAME                    CLUSTER           INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE    VERSION
capi-quickstart-g2trk   capi-quickstart   true                                 3                  3         3             4m7s   v1.29.2

After the first control plane node is up and running, we can retrieve the workload cluster kubeconfig.

Default, Docker

clusterctl get kubeconfig capi-quickstart > capi-quickstart.kubeconfig

For Docker Desktop on macOS, Linux or Windows use kind to retrieve the kubeconfig. Docker Engine for Linux works with the default clusterctl approach.

kind get kubeconfig --name capi-quickstart > capi-quickstart.kubeconfig

Install a Cloud Provider

The Kubernetes in-tree cloud provider implementations are being removed in favor of external cloud providers (also referred to as “out-of-tree”). This requires deploying a new component called the cloud-controller-manager which is responsible for running all the cloud specific controllers that were previously run in the kube-controller-manager. To learn more, see this blog post.

Azure, OpenStack

Install the official cloud-provider-azure Helm chart on the workload cluster:

helm install --kubeconfig=./capi-quickstart.kubeconfig --repo https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo cloud-provider-azure --generate-name --set infra.clusterName=capi-quickstart --set cloudControllerManager.clusterCIDR="192.168.0.0/16"

For more information, see the CAPZ book.

Before deploying the OpenStack external cloud provider, configure the cloud.conf file for integration with your OpenStack environment:

cat > cloud.conf <<EOF
[Global]
auth-url=<your_auth_url>
application-credential-id=<your_credential_id>
application-credential-secret=<your_credential_secret>
region=<your_region>
domain-name=<your_domain_name>
EOF

For more detailed information on configuring the cloud.conf file, see the OpenStack Cloud Controller Manager documentation.

Next, create a Kubernetes secret using this configuration to securely store your cloud environment details. You can create this secret for example with:

kubectl -n kube-system create secret generic cloud-config --from-file=cloud.conf

Now, you are ready to deploy the external cloud provider!

kubectl apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/cloud-controller-manager-roles.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/cloud-controller-manager-role-bindings.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/openstack-cloud-controller-manager-ds.yaml

Alternatively, refer to the helm chart.

Deploy a CNI solution

Calico is used here as an example.

Azure, vcluster, KubeVirt, Other providers...

Install the official Calico Helm chart on the workload cluster:

helm repo add projectcalico https://docs.tigera.io/calico/charts --kubeconfig=./capi-quickstart.kubeconfig && \
helm install calico projectcalico/tigera-operator --kubeconfig=./capi-quickstart.kubeconfig -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-azure/main/templates/addons/calico/values.yaml --namespace tigera-operator --create-namespace

After a short while, our nodes should be running and in the Ready state. Let’s check the status using kubectl get nodes:

kubectl --kubeconfig=./capi-quickstart.kubeconfig get nodes

Calico is not required for vcluster.

Before deploying the Calico CNI, make sure the VMs are running:

kubectl get vm

If our new VMs are running, we should see a response similar to this:

NAME                                  AGE    STATUS    READY
capi-quickstart-control-plane-7s945   167m   Running   True
capi-quickstart-md-0-zht5j            164m   Running   True

We can also read the virtual machine instances:

kubectl get vmi

The output will be similar to:

NAME                                  AGE    PHASE     IP             NODENAME             READY
capi-quickstart-control-plane-7s945   167m   Running   10.244.82.16   kind-control-plane   True
capi-quickstart-md-0-zht5j            164m   Running   10.244.82.17   kind-control-plane   True

Since our workload cluster is running within the kind cluster, we need to prevent conflicts between the kind (management) cluster’s CNI and the workload cluster’s CNI. The following modifications to the default Calico settings are enough for the two CNIs to work in what is effectively the same environment.

Change the CIDR to a non-conflicting range
Change the value of the CLUSTER_TYPE environment variable to k8s
Change the value of the CALICO_IPV4POOL_IPIP environment variable to Never
Change the value of the CALICO_IPV4POOL_VXLAN environment variable to Always
Add the FELIX_VXLANPORT environment variable with the value of a non-conflicting port, e.g. "6789".

The following script downloads the Calico manifest and modifies the required fields. The CIDR and the port values are examples.

curl https://raw.githubusercontent.com/projectcalico/calico/v3.24.4/manifests/calico.yaml -o calico-workload.yaml

sed -i -E 's|^( +)# (- name: CALICO_IPV4POOL_CIDR)$|\1\2|g;'\
's|^( +)# (  value: )"192.168.0.0/16"|\1\2"10.243.0.0/16"|g;'\
'/- name: CLUSTER_TYPE/{ n; s/( +value: ").+/\1k8s"/g };'\
'/- name: CALICO_IPV4POOL_IPIP/{ n; s/value: "Always"/value: "Never"/ };'\
'/- name: CALICO_IPV4POOL_VXLAN/{ n; s/value: "Never"/value: "Always"/};'\
'/# Set Felix endpoint to host default action to ACCEPT./a\            - name: FELIX_VXLANPORT\n              value: "6789"' \
calico-workload.yaml
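Before applying the manifest, it may be worth confirming that the sed edits above took effect, since a pattern that silently fails to match leaves the file unchanged. The following is a minimal sketch under stated assumptions: the helper name check_calico_overrides is hypothetical, and the patterns match the example CIDR and the FELIX_VXLANPORT entry used above, not every modified field.

```shell
# check_calico_overrides FILE: return 0 only if the manifest contains the
# uncommented CIDR entry, the example CIDR value, and the added VXLAN port entry.
# The expected values are the example values from the sed script above.
check_calico_overrides() {
  cco_file=$1
  cco_status=0
  for cco_pattern in \
    '^ +- name: CALICO_IPV4POOL_CIDR$' \
    'value: "10\.243\.0\.0/16"' \
    'name: FELIX_VXLANPORT'
  do
    if ! grep -Eq -- "$cco_pattern" "$cco_file"; then
      echo "missing in $cco_file: $cco_pattern" >&2
      cco_status=1
    fi
  done
  return "$cco_status"
}
```

For example, check_calico_overrides calico-workload.yaml prints the patterns that were not found and returns a non-zero status if any edit did not apply.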

Now, deploy the Calico CNI on the workload cluster:

kubectl --kubeconfig=./capi-quickstart.kubeconfig create -f calico-workload.yaml

After a short while, our nodes should be running and in the Ready state. Let’s check the status using kubectl get nodes:

kubectl --kubeconfig=./capi-quickstart.kubeconfig get nodes

Troubleshooting

If the nodes don’t become ready after a long period, list the pods in the kube-system namespace:

kubectl --kubeconfig=./capi-quickstart.kubeconfig get pod -n kube-system

If the Calico pods are in image pull error state (ErrImagePull), it’s probably because of the Docker Hub pull rate limit. We can try to fix that by adding a secret with our Docker Hub credentials, and use it; see here for details.

First, create the secret. Note the Docker config file path, and adjust it to your local setup.

kubectl --kubeconfig=./capi-quickstart.kubeconfig create secret generic docker-creds \
    --from-file=.dockerconfigjson=<YOUR DOCKER CONFIG FILE PATH> \
    --type=kubernetes.io/dockerconfigjson \
    -n kube-system

Now, if the calico-node pods have a status of ErrImagePull, patch their DaemonSet to make them use the new secret to pull images:

kubectl --kubeconfig=./capi-quickstart.kubeconfig patch daemonset \
    -n kube-system calico-node \
    -p '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"docker-creds"}]}}}}'

After a short while, the calico-node pods will have a status of Running. If the calico-kube-controllers pod is also in ErrImagePull status, patch its deployment to fix the problem:

kubectl --kubeconfig=./capi-quickstart.kubeconfig patch deployment \
    -n kube-system calico-kube-controllers \
    -p '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"docker-creds"}]}}}}'

List the pods again:

kubectl --kubeconfig=./capi-quickstart.kubeconfig get pod -n kube-system

Eventually, all the pods in the kube-system namespace will run, and the result should be similar to this:

NAME                                                          READY   STATUS    RESTARTS   AGE
calico-kube-controllers-c969cf844-dgld6                       1/1     Running   0          50s
calico-node-7zz7c                                             1/1     Running   0          54s
calico-node-jmjd6                                             1/1     Running   0          54s
coredns-64897985d-dspjm                                       1/1     Running   0          3m49s
coredns-64897985d-pgtgz                                       1/1     Running   0          3m49s
etcd-capi-quickstart-control-plane-kjjbb                      1/1     Running   0          3m57s
kube-apiserver-capi-quickstart-control-plane-kjjbb            1/1     Running   0          3m57s
kube-controller-manager-capi-quickstart-control-plane-kjjbb   1/1     Running   0          3m57s
kube-proxy-b9g5m                                              1/1     Running   0          3m12s
kube-proxy-p6xx8                                              1/1     Running   0          3m49s
kube-scheduler-capi-quickstart-control-plane-kjjbb            1/1     Running   0          3m57s

kubectl --kubeconfig=./capi-quickstart.kubeconfig \
  apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml

After a short while, our nodes should be running and in the Ready state. Let’s check the status using kubectl get nodes:

kubectl --kubeconfig=./capi-quickstart.kubeconfig get nodes

NAME                                          STATUS   ROLES           AGE    VERSION
capi-quickstart-vs89t-gmbld                   Ready    control-plane   5m33s  v1.29.2
capi-quickstart-vs89t-kf9l5                   Ready    control-plane   6m20s  v1.29.2
capi-quickstart-vs89t-t8cfn                   Ready    control-plane   7m10s  v1.29.2
capi-quickstart-md-0-55x6t-5649968bd7-8tq9v   Ready    <none>          6m5s   v1.29.2
capi-quickstart-md-0-55x6t-5649968bd7-glnjd   Ready    <none>          6m9s   v1.29.2
capi-quickstart-md-0-55x6t-5649968bd7-sfzp6   Ready    <none>          6m9s   v1.29.2
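Rather than re-running kubectl get nodes by hand, the check can be polled. The following is a minimal sketch of a generic retry helper; the function name retry and the attempt and delay values are illustrative assumptions, not something the quick start prescribes.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY_SECONDS between tries. Returns 0 on the first success and
# 1 once ATTEMPTS tries have all failed.
retry() {
  rt_attempts=$1
  rt_delay=$2
  shift 2
  rt_try=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$rt_try" -ge "$rt_attempts" ]; then
      return 1
    fi
    rt_try=$((rt_try + 1))
    sleep "$rt_delay"
  done
}

# Example (not executed here): poll until every node reports Ready.
# retry 30 20 kubectl --kubeconfig=./capi-quickstart.kubeconfig \
#   wait --for=condition=Ready nodes --all --timeout=10s
```

Bounding the number of attempts keeps a stuck cluster from blocking a script forever; if the helper returns 1, the Troubleshooting steps above are a reasonable next stop.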

Clean Up

Delete the workload cluster:

kubectl delete cluster capi-quickstart

Delete the management cluster:

kind delete cluster

Next steps

Create a second workload cluster. Simply follow the steps outlined above, but remember to provide a different name for your second workload cluster.
Deploy applications to your workload cluster. Use the CNI deployment steps for pointers.
See the clusterctl documentation for more detail about clusterctl supported actions.

Required configuration

The cluster configuration file can be generated with the clusterctl generate cluster command. This command takes the template file and replaces the values surrounded by ${} with the values of the corresponding environment variables, so you have to set all required environment variables in advance. The following sections explain in more detail what should be configured.

Note: You can use the template file by manually replacing values.

Note: By default the command creates a highly available control plane with 3 control plane nodes. If you wish to create a single control plane node without a load balancer, use the without-lb flavor. For example:

# Using 'without-lb' flavor
clusterctl generate cluster capi-quickstart \
  --flavor without-lb \
  --kubernetes-version v1.24.2 \
  --control-plane-machine-count=1 \
  --worker-machine-count=1 \
  > capi-quickstart.yaml
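To make the ${} substitution described above concrete, the following minimal sketch lists the variable names a local template file references. It is an illustration only: the helper name list_template_vars is hypothetical, and clusterctl's own --list-variables option remains the supported way to discover required variables.

```shell
# list_template_vars FILE: print the unique variable names that appear as
# ${NAME} in FILE, one per line. Forms with a default such as ${NAME:=value}
# are matched too, because only the leading name is captured.
list_template_vars() {
  grep -oE '\$\{[A-Za-z_][A-Za-z0-9_]*' "$1" | sed 's/^..//' | sort -u
}
```

Running it against a downloaded template shows which environment variables need values before the command can render the file, or which placeholders to replace when editing the template manually.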

OpenStack version

We currently require at least OpenStack Pike.

Operating system image

cloud-init based images

We currently depend on an up-to-date version of cloud-init; otherwise the choice of operating system is yours. The kubeadm bootstrap provider we’re using also depends on some pre-installed software such as a container runtime, kubelet, kubeadm, etc. For examples of how to build such an image, take a look at image-builder (openstack).

The image can be referenced by exposing it as an environment variable OPENSTACK_IMAGE_NAME.

Ignition based images

Some operating systems, such as Fedora CoreOS or Flatcar, use Ignition instead of cloud-init to provision the instance. For these you need to enable the experimental Ignition feature: export EXP_KUBEADM_BOOTSTRAP_FORMAT_IGNITION=true

Flatcar comes in two flavor variants:

flatcar

This variant relies on a Flatcar image built using the image-builder project: the Kubernetes version is bound to the Flatcar version and a rebuild of the image is required for each Kubernetes or Flatcar upgrade.

To build and use a Flatcar image:
- Build the image with the image-builder: make OEM_ID=openstack build-qemu-flatcar
- Upload the image
- Export the name of the uploaded image: export OPENSTACK_FLATCAR_IMAGE_NAME=flatcar-stable-3374.2.5-kube-v1.25.6
- When generating the cluster configuration, use the following Cluster API flavor: --flavor flatcar (NOTE: Don’t forget to refer to the external-cloud-provider section)

flatcar-sysext

This variant relies on a plain Flatcar image and uses the systemd-sysext feature to install and update Kubernetes components: the Kubernetes version is not bound to the Flatcar version (i.e., Flatcar can be upgraded independently of Kubernetes and vice versa).

The template comes with a systemd-sysupdate configuration file that downloads each new patch version of Kubernetes (i.e., if you start with Kubernetes 1.x.y, systemd-sysupdate will automatically pull 1.x.y+1 but not 1.x+1.y). Note that this behavior is disabled by default. To enable the Kubernetes auto-update you can:
- Update the template to enable the systemd-sysupdate.timer
- Or run the following command on the nodes: sudo systemctl enable --now systemd-sysupdate.timer

When the Kubernetes release reaches end-of-life it will no longer receive updates. To switch to a new minor version (for example, from 1.x to 1.x+1), run sudo rm /etc/sysupdate.kubernetes.d/kubernetes-*.conf and download the new update config into the folder with cd /etc/sysupdate.kubernetes.d && sudo wget https://github.com/flatcar/sysext-bakery/releases/download/latest/kubernetes-${KUBERNETES_VERSION%.*}.conf.
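The ${KUBERNETES_VERSION%.*} expression in the wget URL is standard shell parameter expansion that strips the patch component from the version. The following short sketch shows the resulting file name; the value v1.29.2 is only an assumed example, so substitute the version you are switching to.

```shell
# Assumed example value; set this to the Kubernetes version you are moving to.
KUBERNETES_VERSION="v1.29.2"

# ${VAR%.*} removes the shortest suffix matching ".*", here the ".2" patch part.
release_series="${KUBERNETES_VERSION%.*}"
config_name="kubernetes-${release_series}.conf"

echo "$config_name"   # prints kubernetes-v1.29.conf
```

If KUBERNETES_VERSION is unset or already lacks a patch component, the expansion yields a different name, so it is worth echoing the result before downloading.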

To coordinate node reboots, we recommend using Kured. Note that running kubeadm upgrade apply on the first controller and kubeadm upgrade node on all other nodes is not automated (yet); see the docs.

To use a Flatcar image:
- Upload an image to OpenStack from the Flatcar release servers (e.g., for Stable, you might use this image: https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img)
- Export the name of the uploaded image: export FLATCAR_IMAGE_NAME=flatcar_production_openstack_image
- When generating the cluster configuration, use the following Cluster API flavor: --flavor flatcar-sysext (NOTE: Don’t forget to refer to the external-cloud-provider section)

SSH key pair

An SSH key pair is required. You can create one using:

openstack keypair create [--public-key <file> | --private-key <file>] <name>

The key pair name must be exposed as an environment variable OPENSTACK_SSH_KEY_NAME.

In order to access cluster nodes via SSH, you must either access nodes through the bastion host or configure custom security groups with rules allowing ingress for port 22.

OpenStack credential

Generate credentials

The env.rc script sets the environment variables related to credentials. It is highly recommended to avoid using admin credentials.

source env.rc <path/to/clouds.yaml> <cloud>

The following variables are set.

Variable	Meaning
OPENSTACK_CLOUD	The cloud name which is used as second argument
OPENSTACK_CLOUD_YAML_B64	The secret used by Cluster API Provider OpenStack accessing OpenStack
OPENSTACK_CLOUD_PROVIDER_CONF_B64	The content of cloud.conf which is used by OpenStack cloud provider
OPENSTACK_CLOUD_CACERT_B64	The content of your custom CA file which can be specified in your clouds.yaml by `ca-file`, mandatory when openstack endpoint is `https`

Note: Only the external cloud provider supports Application Credentials.

Note: You need to set the clusterctl.cluster.x-k8s.io/move label on the secret created from OPENSTACK_CLOUD_YAML_B64 in order to successfully move objects from the bootstrap cluster to the target cluster. See bug 626 for further information.
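The variables ending in _B64 hold base64-encoded content, so it can be useful to decode one and confirm that env.rc picked up the intended cloud entry. The following is a minimal sketch: the helper name decode_b64_var is hypothetical, and it assumes a base64 implementation that accepts the -d option (GNU coreutils does; other platforms may differ).

```shell
# decode_b64_var NAME: print the decoded content of the exported,
# base64-encoded environment variable NAME. Fails if NAME is not exported.
# Note: the decoded output may contain credentials; avoid logging it.
decode_b64_var() {
  printenv "$1" | base64 -d
}
```

For example, decode_b64_var OPENSTACK_CLOUD_YAML_B64 should print the clouds.yaml content that Cluster API Provider OpenStack will use.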

CA certificates

When using an https OpenStack endpoint, providing CA certificates is required unless verification is explicitly disabled. You can provide your CA certificates either per cluster or globally using a specific CAPO flag.

Per cluster

To use a per-cluster CA certificate, set the OPENSTACK_CLOUD_CACERT_B64 environment variable. The generator will set the cacert key in the cluster’s cloud-config secret to the variable’s content.
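
If the variable is populated by hand rather than by env.rc, the encoding step might look like the following. This is a minimal sketch, assuming the variable is expected to hold the base64-encoded CA file on a single line; the encode_cacert helper and the file path are illustrative and not part of CAPO.

```shell
# encode_cacert FILE: print FILE as single-line base64 (hypothetical helper).
encode_cacert() {
  # Strip newlines so the result is one line regardless of how the local
  # base64 implementation wraps its output.
  base64 < "$1" | tr -d '\n'
}

# Example usage (the path is an assumption):
# export OPENSTACK_CLOUD_CACERT_B64="$(encode_cacert /path/to/ca.pem)"
```

Piping through tr avoids depending on implementation-specific wrapping options of base64.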

Global configuration

To use the same CA certificate for all clusters, you can use the --ca-certs flag. When reconciling a cluster, if no cacert is set in the cluster’s cloud-config secret, CAPO will use the certificates provided with this flag.

For instance, to use the CA certificates from CAPO’s Docker image:

kubectl patch deployment capo-controller-manager -n capo-system \
  --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--ca-certs=/etc/ssl/certs/ca-certificates.crt"}]'

Availability zone

The availability zone names must be exposed as an environment variable OPENSTACK_FAILURE_DOMAIN.

By default, if no availability zone is given, every availability zone defined in OpenStack is a candidate to provision from. If administrator credentials are used, the internal availability zone (which exists only inside Nova) is also returned and can cause problems; see PR 1165 for further information. We therefore highly recommend setting the availability zone explicitly.

DNS server

The DNS servers must be exposed as an environment variable OPENSTACK_DNS_NAMESERVERS.

Machine flavor

The flavors for control plane and worker node machines must be exposed as environment variables OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR and OPENSTACK_NODE_MACHINE_FLAVOR respectively.

The recommended minimum vCPU count is 2 for the control plane flavor and 1 for the worker node flavor.
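
Before generating a cluster configuration, it can help to confirm that the variables named in this section are actually exported. The following is a rough sketch of such a check; the missing_vars helper is hypothetical, and the list only covers variables mentioned above, so other variables required by your template are not checked.

```shell
# Variables this section describes as required or set by env.rc.
REQUIRED_VARS="OPENSTACK_SSH_KEY_NAME OPENSTACK_CLOUD OPENSTACK_CLOUD_YAML_B64 OPENSTACK_FAILURE_DOMAIN OPENSTACK_DNS_NAMESERVERS OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR OPENSTACK_NODE_MACHINE_FLAVOR"

# missing_vars: print the name of every listed variable that is unset or empty
# in the environment; return non-zero if any are missing (hypothetical helper).
missing_vars() {
  status=0
  for name in $REQUIRED_VARS; do
    # printenv only sees exported variables, which is what child processes get.
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_vars || echo "export the variables listed above first"` prints the missing names followed by the reminder.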

CNI security group rules

Depending on the CNI that will be deployed on the cluster, you may need to add specific security group rules to the control plane and worker nodes. For example, if you are using Calico with BGP, you will need to add the following security group rules to the control plane and worker nodes:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
spec:
  ...
  managedSecurityGroups:
    allNodesSecurityGroupRules:
    - remoteManagedGroups:
      - controlplane
      - worker
      direction: ingress
      etherType: IPv4
      name: BGP (Calico)
      portRangeMin: 179
      portRangeMax: 179
      protocol: tcp
      description: "Allow BGP between control plane and workers"
    - remoteManagedGroups:
      - controlplane
      - worker
      direction: ingress
      etherType: IPv4
      name: IP-in-IP (Calico)
      protocol: 4
      description: "Allow IP-in-IP between control plane and workers"
    allowAllInClusterTraffic: false

Optional Configuration

Log level

When running CAPO with --v=6 the gophercloud client logs its requests to the OpenStack API. This can be helpful during debugging.

External network

If there is only a single external network it will be detected automatically. If there is more than one external network you can specify which one the cluster should use by setting the environment variable OPENSTACK_EXTERNAL_NETWORK_ID.

The public network ID can be obtained with the following command:

openstack network list --external

Note: If your OpenStack cluster does not already have a public network, you should contact your cloud service provider. We will not review how to troubleshoot this here.

Use existing router

You can use a pre-existing router instead of creating a new one. When deleting a cluster a pre-existing router will not be deleted.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
spec:
  ...
  router:
    id: <Router id>

API server floating IP

Unless explicitly disabled, a floating IP is automatically created and associated with the load balancer or controller node. If required, you can specify the floating IP explicitly via spec.apiServerFloatingIP of OpenStackCluster.

You have to be able to create a floating IP in your OpenStack in advance. You can create one using:

openstack floating ip create <public network>

Note: Only a user with the admin role can create a floating IP with a specific IP address.

Note: When a floating IP is associated with a cluster that has more than one controller node, the floating IP is associated with the first controller node, and the other controller nodes have no floating IP assigned. If the controller node holding the floating IP goes down, CAPO will NOT automatically assign the floating IP address to another controller node. We therefore recommend using only one controller node when a floating IP is needed, or using a load balancer instead; see issue #1265 for further information.

Disabling the API server floating IP

It is possible to provision a cluster without a floating IP for the API server by setting OpenStackCluster.spec.disableAPIServerFloatingIP: true (the default is false). This will prevent a floating IP from being allocated.

WARNING

If the API server does not have a floating IP, workload clusters will only deploy successfully when the management cluster and workload cluster control plane nodes are on the same network. This can be a project-specific network, if the management cluster lives in the same project as the workload cluster, or a network that is shared across multiple projects.

In particular, this means that the cluster cannot use OpenStackCluster.spec.managedSubnets to provision a new network for the cluster. Instead, use OpenStackCluster.spec.network to explicitly specify the same network as the management cluster is on.

When the API server floating IP is disabled, it is not possible to provision a cluster without a load balancer without additional configuration (an advanced use-case that is not documented here). This is because the API server must still have a virtual IP that is not associated with a particular control plane node in order to allow the nodes to change underneath, e.g. during an upgrade. When the API server has a floating IP, this role is fulfilled by the floating IP even if there is no load balancer. When the API server does not have a floating IP, the load balancer virtual IP on the cluster network is used.

Restrict Access to the API server

NOTE

This requires “amphora” as the load balancer provider, in version >= v2.12.

It is possible to restrict access to the Kubernetes API server at the network level. If required, you can specify the allowed CIDRs via spec.APIServerLoadBalancer.AllowedCIDRs of OpenStackCluster.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
spec:
  apiServerLoadBalancer:
    allowedCidrs:
    - 192.168.10.0/24
    - 10.10.0.0/16

All known IPs of the target cluster are discovered dynamically (e.g. you don’t have to take care of the target cluster’s own router IP, internal CIDRs or any bastion host IP). Note: Please ensure that at least the outgoing IP of your management cluster is added to the list of allowed CIDRs. Otherwise CAPO can’t reconcile the target cluster correctly.
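
Because a malformed entry in this list can be hard to spot, a quick syntactic pre-check before applying the manifest may be useful. The sketch below is an assumption-laden illustration: is_ipv4_cidr is a hypothetical helper, it only handles IPv4 in a.b.c.d/prefix form, and it does not verify that the host bits are zero or that OpenStack will accept the value.

```shell
# is_ipv4_cidr STRING: succeed if STRING looks like a.b.c.d/prefix.
# The body runs in a subshell so the IFS change does not leak to the caller.
is_ipv4_cidr() (
  cidr=$1
  addr=${cidr%/*}
  prefix=${cidr#*/}

  # Require a "/" with a dotted numeric address before it and digits after it.
  case "$cidr" in */*) ;; *) return 1 ;; esac
  case "$addr" in ''|*[!0-9.]*|*.) return 1 ;; esac
  case "$prefix" in ''|*[!0-9]*) return 1 ;; esac
  [ "${#prefix}" -le 2 ] && [ "$prefix" -le 32 ] || return 1

  # Split the address on dots and check for exactly four octets in 0..255.
  IFS=.
  set -- $addr
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    case "$octet" in ''|????*) return 1 ;; esac
    [ "$octet" -le 255 ] || return 1
  done
)
```

For example, `is_ipv4_cidr 10.10.0.0/16 && echo ok` prints ok, while a value with a missing octet or a prefix above 32 is rejected.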

All applied CIDRs (user defined + dynamically discovered) are written back into status.network.apiServerLoadBalancer.allowedCIDRs:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
status:
  network:
    apiServerLoadBalancer:
      allowedCIDRs:
        - 10.6.0.0/24 # openStackCluster.Status.Network.Subnet.CIDR
        - 10.6.0.90/32 # bastion Host internal IP
        - 10.10.0.0/16 # user defined
        - 192.168.10.0/24 # user defined
        - 172.16.111.100/32 # bastion host floating IP
        - 172.16.111.85/32 # router IP
      internalIP: 10.6.0.144
      ip: 172.16.111.159
      name: k8s-clusterapi-cluster-<cluster-namespace>-<cluster-name>

If you have locked yourself or the CAPO management cluster out, you can clear the allowed_cidrs field on OpenStack via:

openstack loadbalancer listener unset --allowed-cidrs <listener ID>

Network Filters

If you have a complex query that you want to use to look up a network, you can do this by using a network filter. More details about the filter can be found in NetworkParam.

When using filters to look up a network, please note that it is possible to get multiple networks as a result. This should not be a problem; however, please test your filters with openstack network list to be certain that they return the networks you want. Please refer to the following usage example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
        - network:
            name: <network-name>

Multiple Networks

You can specify multiple networks (or subnets) to connect your server to. To do this, simply add another entry to the ports array. The following example connects the server to 3 different networks:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
        - network:
            name: myNetwork
            tags: myTag
        - network:
            id: your_network_id
        - fixedIPs:
            - subnet:
                id: your_subnet_id

Subnet Filters

Rather than just using a network, you have the option of specifying a particular subnet to connect your server to. The following example shows how to specify a subnet of a network to use for your server:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
        - network:
            name: <network-name>
          fixedIPs:
            - subnet:
                name: <subnet-name>

Ports

A server can also be connected to networks by describing what ports to create. Describing a server’s connection with ports allows for finer and more advanced configuration. For example, you can specify per-port security groups, fixed IPs, VNIC type or profile.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
        - network:
            id: <your-network-id>
          fixedIPs:
            - subnet:
                id: <your-subnet-id>
              ipAddress: <your-fixed-ip>
            - subnet:
                name: <your-subnet-name>
                tags:
                  - tag1
                  - tag2
          nameSuffix: <your-port-name>
          description: <your-custom-port-description>
          vnicType: normal
          securityGroups:
            - <your-security-group-id>
          profile:
            capabilities:
              - <capability>

Any such ports are created in addition to ports used for connections to networks or subnets.

Port network and IP addresses

Together, network and fixedIPs define the network a port will be created on, and the addresses which will be assigned to the port on that network.

network is a filter which uniquely describes the Neutron network the port will be created on. Machine creation will fail if the result is empty or not unique. If a network id is specified in the filter then no separate OpenStack query is required. This has the advantages of being both faster and unambiguous in all circumstances, so it is the preferred way to specify a network where possible.

The available fields are described in the CRD.

If network is not specified at all, it may be possible to infer the network from any uniquely defined subnets in fixedIPs. As this may result in additional OpenStack queries and the potential for ambiguity is greater, this is not recommended.

fixedIPs describes a list of addresses from the target network which will be allocated to the port. A fixedIP is either a specific ipAddress, a subnet from which an IP address will be allocated, or both. If only ipAddress is specified, it must be valid in at least one of the subnets defined in the current network. If both are defined, ipAddress must be valid in the specified subnet.

subnet is a filter which uniquely describes the Neutron subnet an address will be allocated from. Its operation is analogous to network, described above.

fixedIPs, including all fields available in the subnet filter, are described in the CRD.

If no fixedIPs are specified, the port will get an address from every subnet in the network.

Examples

A single explicit network with a single explicit subnet.

ports:
- tags:
  - control-plane
  network:
    id: 0686143b-f0a7-481a-86f5-cc1f8ccde692
  fixedIPs:
  - subnet:
      id: a5e50a9c-58f9-4b6f-b8ee-2e7b4e4414ee

No network or fixed IPs: the port will be created on the cluster default network, and will get a single address from the cluster default subnet.

ports:
- tags:
  - control-plane

Network and subnet are specified by filter. They will be looked up. Note that this is not as efficient or reliable as specifying the network by id.

ports:
- tags:
  - storage
  network:
    name: storage-network
  fixedIPs:
  - subnet:
      name: storage-subnet

No network, but a fixed IP with a subnet. The network will be inferred from the network of the subnet. Note that this is not as efficient or reliable as specifying the network explicitly.

ports:
- tags:
  - control-plane
  fixedIPs:
  - subnet:
      id: a5e50a9c-58f9-4b6f-b8ee-2e7b4e4414ee

Port Security

Port security can be enabled or disabled on a specific port. When not set, the port takes the value of the corresponding field at the network level.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
      - network:
          id: <your-network-id>
        ...
        disablePortSecurity: true
        ...

Security groups

Security groups are used to determine which ports of the cluster nodes are accessible from where.

If spec.managedSecurityGroups of OpenStackCluster is set to a non-nil value (e.g. {}), two security groups named k8s-cluster-${NAMESPACE}-${CLUSTER_NAME}-secgroup-controlplane and k8s-cluster-${NAMESPACE}-${CLUSTER_NAME}-secgroup-worker will be created and added to the control plane and worker nodes respectively.
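
To look these groups up in OpenStack, the names can be derived from the namespace and cluster name. The snippet below is only a sketch of the naming pattern quoted above; managed_secgroup_names is a hypothetical helper, not a CAPO tool.

```shell
# managed_secgroup_names NAMESPACE CLUSTER_NAME
# Print the control plane and worker group names, one per line, following
# the k8s-cluster-<namespace>-<cluster>-secgroup-<role> pattern.
managed_secgroup_names() {
  for role in controlplane worker; do
    echo "k8s-cluster-$1-$2-secgroup-$role"
  done
}
```

Each printed name can then be passed to `openstack security group show <name>` to inspect the rules that were created.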

Example of spec.managedSecurityGroups in OpenStackCluster spec when we want to enable the managed security groups:

managedSecurityGroups: {}

By default, these groups have rules that allow the following traffic:

Control plane nodes
- API server traffic from anywhere
- Etcd traffic from other control plane nodes
- Kubelet traffic from other cluster nodes
Worker nodes
- Node port traffic from anywhere
- Kubelet traffic from other cluster nodes

When the flag OpenStackCluster.spec.managedSecurityGroups.allowAllInClusterTraffic is set to true, the rules for the managed security groups permit all traffic between cluster nodes on all ports and protocols (API server and node port traffic is still permitted from anywhere, as with the default rules).

We can add security group rules that authorize traffic from all nodes via allNodesSecurityGroupRules. It takes a list of security group rules that should be applied to selected nodes. The following rule fields are mutually exclusive: remoteManagedGroups, remoteGroupID and remoteIPPrefix.

Valid values for remoteManagedGroups are controlplane, worker and bastion.

To apply a security group rule that will allow BGP between the control plane and workers, you can follow this example:

managedSecurityGroups:
  allNodesSecurityGroupRules:
  - remoteManagedGroups:
    - controlplane
    - worker
    direction: ingress
    etherType: IPv4
    name: BGP (Calico)
    portRangeMin: 179
    portRangeMax: 179
    protocol: tcp
    description: "Allow BGP between control plane and workers"

If this is not flexible enough, pre-existing security groups can be added to the spec of an OpenStackMachineTemplate, e.g.:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-control-plane
spec:
  template:
    spec:
      securityGroups:
      - name: allow-ssh

Tagging

You have the ability to tag all resources created by the cluster in the OpenStackCluster spec. Here is an example of how to configure tagging:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-name>
spec:
  tags:
  - cluster-tag

To tag resources specific to a machine, add a value to the tags field in the OpenStackMachineTemplate spec like this:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      tags:
      - machine-tag

Metadata

You also have the option to add metadata to instances. Here is a usage example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      serverMetadata:
        name: bob
        nickname: bobbert

Boot From Volume

To boot from volume, set spec.rootVolume.diskSize in the OpenStackMachineTemplate to a value greater than 0. For example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ...
      rootVolume:
        diskSize: <image size>
        volumeType: <a cinder volume type (*optional)>
        availabilityZone: <the cinder availability zone for the root volume (*optional)>
      ...

If volumeType is not specified, cinder will use the default volume type.

If availabilityZone is not specified, the volume will be created in the cinder availability zone specified in the MachineSpec’s failureDomain. This same value is also used as the nova availability zone when creating the server. Note that this will fail if cinder and nova do not have matching availability zones. In this case, cinder availabilityZone must be specified explicitly on rootVolume.

Timeout settings

The default timeout for instance creation is 5 minutes. If creating servers in your OpenStack takes a long time, you can increase the timeout. You can set a new value, in minutes, via the environment variable CLUSTER_API_OPENSTACK_INSTANCE_CREATE_TIMEOUT in your Cluster API Provider OpenStack controller deployment.

Custom pod network CIDR

If 192.168.0.0/16 is already in use within your network, you must select a different pod network CIDR. You have to replace the CIDR 192.168.0.0/16 with your own in the generated file.
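
One way to make this replacement is with sed on the generated file. This is a minimal sketch: replace_pod_cidr is a hypothetical helper, and 10.244.0.0/16 in the usage note is only an illustrative value. It rewrites every literal occurrence of 192.168.0.0/16, so review the result if that range appears in the file for other purposes.

```shell
# replace_pod_cidr NEW_CIDR: read a generated manifest on stdin and write it
# to stdout with the default pod CIDR replaced by NEW_CIDR.
replace_pod_cidr() {
  # "|" is used as the sed delimiter because the CIDR itself contains "/".
  sed "s|192\.168\.0\.0/16|$1|g"
}
```

For example, `replace_pod_cidr 10.244.0.0/16 < cluster.yaml > cluster-custom.yaml` writes the updated manifest to a new file and leaves the original untouched.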

Accessing nodes through the bastion host via SSH

Enabling the bastion host

To configure the Cluster API Provider for OpenStack to create an SSH bastion host, add these lines to the OpenStackCluster spec after clusterctl generate cluster has been successfully executed:


spec:
  ...
  bastion:
    enabled: true
    instance:
      flavor: <Flavor name>
      image:  <Image name>
      sshKeyName: <Key pair name>

All parameters are mutable during the runtime of the bastion host. The bastion host will be re-created if it’s enabled and the instance spec has been changed. This is done by a simple checksum validation of the instance spec which is stored in the OpenStackCluster annotation infrastructure.cluster.x-k8s.io/bastion-hash.

A floating IP is created and associated to the bastion host automatically, but you can add the IP address explicitly:


spec:
  ...
  bastion:
    ...
    floatingIP: <Floating IP address>

If managedSecurityGroups is set to a non-nil value (e.g. {}), a security group rule opening 22/tcp is added to the security groups for the bastion, controller, and worker nodes respectively. Otherwise, you have to add securityGroups to the bastion in the OpenStackCluster spec and to the OpenStackMachineTemplate spec template respectively.

Making changes to the bastion host

Changes can be made to the bastion instance, for example changing the flavor. First, disable the bastion host by setting enabled: false in the OpenStackCluster.Spec.Bastion field. The bastion will be deleted; you can check its status by running kubectl get openstackcluster and looking at the Bastion field in status. Once it’s gone, you can re-enable the bastion host by setting enabled: true and then make changes to the bastion instance spec by modifying the OpenStackCluster.Spec.Bastion.Instance field. The bastion host will be re-created with the new instance spec.

Disabling the bastion

To disable the bastion host, set enabled: false in the OpenStackCluster.Spec.Bastion field. The bastion host will be deleted; you can check its status by running kubectl get openstackcluster and looking at the Bastion field in status. Once it’s gone, you can remove the OpenStackCluster.Spec.Bastion field from the OpenStackCluster spec.

Obtain floating IP address of the bastion node

Once the workload cluster is up and running after being configured for an SSH bastion host, you can use the kubectl get openstackcluster command to look up the floating IP address of the bastion host (make sure the kubectl context is set to the management cluster). The output will look something like this:

$ kubectl get openstackcluster
NAME    CLUSTER   READY   NETWORK                                SUBNET                                 BASTION
nonha   nonha     true    2e2a2fad-28c0-4159-8898-c0a2241a86a7   53cb77ab-86a6-4f2c-8d87-24f8411f15de   10.0.0.213
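
When scripting against this output, the address can be picked out of the BASTION column. The following is a sketch that parses the table format shown above; bastion_ip is a hypothetical helper, and the approach assumes the column layout stays as in this example.

```shell
# bastion_ip CLUSTER: read `kubectl get openstackcluster` output on stdin and
# print the BASTION column of the row whose NAME equals CLUSTER.
bastion_ip() {
  awk -v name="$1" '
    # Locate the BASTION column from the header row.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "BASTION") col = i; next }
    # Print nothing if the row has no value in that column.
    $1 == name && col && NF >= col { print $col }
  '
}
```

For example, `kubectl get openstackcluster | bastion_ip nonha` prints the address for the cluster above. With OpenSSH, that address can then be used as a jump host, e.g. `ssh -J <user>@<bastion-ip> <user>@<node-ip>`, where the login user depends on the image in use.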

Topics

External Cloud Provider

To deploy a cluster using the external cloud provider, create a cluster configuration with the external cloud provider template or refer to the Helm chart.

Steps of using external cloud provider template

After the control plane is up and running, retrieve the workload cluster kubeconfig:

clusterctl get kubeconfig ${CLUSTER_NAME} --namespace default > ./${CLUSTER_NAME}.kubeconfig

Deploy a CNI solution (Calico is used here)

Note: choose the desired version by replacing v3.23 in the URL below

kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://docs.projectcalico.org/archive/v3.23/manifests/calico.yaml

Create a secret containing the cloud configuration

templates/create_cloud_conf.sh <path/to/clouds.yaml> <cloud> > /tmp/cloud.conf

kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig create secret -n kube-system generic cloud-config --from-file=/tmp/cloud.conf

rm /tmp/cloud.conf

Create RBAC resources and the openstack-cloud-controller-manager daemonset

kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/cloud-controller-manager-roles.yaml
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/cloud-controller-manager-role-bindings.yaml
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/openstack-cloud-controller-manager-ds.yaml

Wait for all the pods in the kube-system namespace to be up and running

$ kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pod -n kube-system
NAME                                                    READY   STATUS    RESTARTS   AGE
calico-kube-controllers-5569bdd565-ncrff                1/1     Running   0          20m
calico-node-g5qqq                                       1/1     Running   0          20m
calico-node-hdgxs                                       1/1     Running   0          20m
coredns-864fccfb95-8qgp2                                1/1     Running   0          109m
coredns-864fccfb95-b4zsf                                1/1     Running   0          109m
etcd-mycluster-control-plane-cp2zw                      1/1     Running   0          108m
kube-apiserver-mycluster-control-plane-cp2zw            1/1     Running   0          110m
kube-controller-manager-mycluster-control-plane-cp2zw   1/1     Running   0          109m
kube-proxy-mxkdp                                        1/1     Running   0          107m
kube-proxy-rxltx                                        1/1     Running   0          109m
kube-scheduler-mycluster-control-plane-cp2zw            1/1     Running   0          109m
openstack-cloud-controller-manager-rbxkz                1/1     Running   8          18m
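
To check this in a script rather than by eye, the listing can be filtered for pods that are not yet ready. This is a rough sketch that parses the table format shown above; pods_not_ready is a hypothetical helper, and it assumes a single-namespace listing (no NAMESPACE column). Pods in a terminal state such as Completed would also be reported.

```shell
# pods_not_ready: read `kubectl get pod -n <namespace>` output on stdin and
# print the name of every pod whose STATUS is not Running or whose READY
# count is incomplete (for example 0/1).
pods_not_ready() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}
```

An empty result from `kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pod -n kube-system | pods_not_ready` means every listed pod is running with all containers ready.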

This documentation describes how to move Cluster API related objects from the bootstrap cluster to the target cluster. Check clusterctl move for further information.

Pre-condition

Bootstrap cluster

# kubectl get pods --all-namespaces
NAMESPACE                           NAME                                                             READY   STATUS    RESTARTS   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager-68cfd4c5b8-mjq75       2/2     Running   0          27m
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager-848575ccb7-m672j   2/2     Running   0          27m
capi-system                         capi-controller-manager-564d97d59c-2t7sl                         2/2     Running   0          27m
capi-webhook-system                 capi-controller-manager-9c8b5d6d4-49czx                          2/2     Running   0          28m
capi-webhook-system                 capi-kubeadm-bootstrap-controller-manager-7dff4b8c7-8w9sq        2/2     Running   1          27m
capi-webhook-system                 capi-kubeadm-control-plane-controller-manager-74c99998d-bftbn    2/2     Running   0          27m
capi-webhook-system                 capo-controller-manager-7d7bfc856b-5ttw6                         2/2     Running   0          24m
capo-system                         capo-controller-manager-5fb48fcb4c-ttkpv                         2/2     Running   0          25m
cert-manager                        cert-manager-544d659678-l9pjb                                    1/1     Running   0          29m
cert-manager                        cert-manager-cainjector-64c9f978d7-bjxkg                         1/1     Running   0          29m
cert-manager                        cert-manager-webhook-5855bb8c8c-8hb9w                            1/1     Running   0          29m
kube-system                         coredns-66bff467f8-ggn54                                         1/1     Running   0          40m
kube-system                         coredns-66bff467f8-t4bqr                                         1/1     Running   0          40m
kube-system                         etcd-kind-control-plane                                          1/1     Running   1          40m
kube-system                         kindnet-ng2gf                                                    1/1     Running   0          40m
kube-system                         kube-apiserver-kind-control-plane                                1/1     Running   1          40m
kube-system                         kube-controller-manager-kind-control-plane                       1/1     Running   1          40m
kube-system                         kube-proxy-h6rmz                                                 1/1     Running   0          40m
kube-system                         kube-scheduler-kind-control-plane                                1/1     Running   1          40m
local-path-storage                  local-path-provisioner-bd4bb6b75-ft7wh                           1/1     Running   0          40m

Target cluster (Below is an example of external cloud provider)

# kubectl get pods --kubeconfig capi-openstack-3.kubeconfig --all-namespaces
NAMESPACE     NAME                                                         READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-59b699859f-djqd7                     1/1     Running   0          6m2s
kube-system   calico-node-szp44                                            1/1     Running   0          6m2s
kube-system   calico-node-xhgzr                                            1/1     Running   0          6m2s
kube-system   coredns-6955765f44-wk2vq                                     1/1     Running   0          21m
kube-system   coredns-6955765f44-zhps9                                     1/1     Running   0          21m
kube-system   etcd-capi-openstack-control-plane-82xck                      1/1     Running   0          22m
kube-system   kube-apiserver-capi-openstack-control-plane-82xck            1/1     Running   0          22m
kube-system   kube-controller-manager-capi-openstack-control-plane-82xck   1/1     Running   2          22m
kube-system   kube-proxy-4f9k8                                             1/1     Running   0          21m
kube-system   kube-proxy-gjd55                                             1/1     Running   0          21m
kube-system   kube-scheduler-capi-openstack-control-plane-82xck            1/1     Running   2          22m
kube-system   openstack-cloud-controller-manager-z9jtc                     1/1     Running   1          4m9s

Install OpenStack Cluster API provider into target cluster

You need to install the OpenStack Cluster API providers into the target cluster first.

# clusterctl --kubeconfig capi-openstack-3.kubeconfig  init --infrastructure openstack
Fetching providers
Installing cert-manager
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.8" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.8" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.8" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-openstack" Version="v0.3.1" TargetNamespace="capo-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

  clusterctl generate cluster [name] --kubernetes-version [version] | kubectl apply -f -

Move objects from `bootstrap` cluster into `target` cluster.

CRD objects such as OpenStackCluster, OpenStackMachine, etc. need to be moved.

# clusterctl move --to-kubeconfig capi-openstack-3.kubeconfig -v 10
No default config file available
Performing move...
Discovering Cluster API objects
Cluster Count=1
KubeadmConfig Count=2
KubeadmConfigTemplate Count=1
KubeadmControlPlane Count=1
MachineDeployment Count=1
Machine Count=2
MachineSet Count=1
OpenStackCluster Count=1
OpenStackMachine Count=2
OpenStackMachineTemplate Count=2
Secret Count=9
Total objects Count=23
Moving Cluster API objects Clusters=1
Pausing the source cluster
Set Cluster.Spec.Paused Paused=true Cluster="capi-openstack-3" Namespace="default"
Creating target namespaces, if missing
Creating objects in the target cluster
Creating Cluster="capi-openstack-3" Namespace="default"
Creating OpenStackMachineTemplate="capi-openstack-control-plane" Namespace="default"
Creating OpenStackMachineTemplate="capi-openstack-md-0" Namespace="default"
Creating OpenStackCluster="capi-openstack-3" Namespace="default"
Creating KubeadmConfigTemplate="capi-openstack-md-0" Namespace="default"
Creating MachineDeployment="capi-openstack-md-0" Namespace="default"
Creating KubeadmControlPlane="capi-openstack-control-plane" Namespace="default"
Creating Secret="capi-openstack-3-etcd" Namespace="default"
Creating Machine="capi-openstack-control-plane-n2kdq" Namespace="default"
Creating Secret="capi-openstack-3-sa" Namespace="default"
Creating Secret="capi-openstack-3-kubeconfig" Namespace="default"
Creating Secret="capi-openstack-3-proxy" Namespace="default"
Creating MachineSet="capi-openstack-md-0-dfdf94979" Namespace="default"
Creating Secret="capi-openstack-3-ca" Namespace="default"
Creating Machine="capi-openstack-md-0-dfdf94979-656zq" Namespace="default"
Creating KubeadmConfig="capi-openstack-control-plane-xzj7x" Namespace="default"
Creating OpenStackMachine="capi-openstack-control-plane-82xck" Namespace="default"
Creating Secret="capi-openstack-control-plane-xzj7x" Namespace="default"
Creating OpenStackMachine="capi-openstack-md-0-bkwhh" Namespace="default"
Creating KubeadmConfig="capi-openstack-md-0-t44gj" Namespace="default"
Creating Secret="capi-openstack-md-0-t44gj" Namespace="default"
Deleting objects from the source cluster
Deleting Secret="capi-openstack-md-0-t44gj" Namespace="default"
Deleting Secret="capi-openstack-control-plane-xzj7x" Namespace="default"
Deleting OpenStackMachine="capi-openstack-md-0-bkwhh" Namespace="default"
Deleting KubeadmConfig="capi-openstack-md-0-t44gj" Namespace="default"
Deleting Machine="capi-openstack-md-0-dfdf94979-656zq" Namespace="default"
Deleting KubeadmConfig="capi-openstack-control-plane-xzj7x" Namespace="default"
Deleting OpenStackMachine="capi-openstack-control-plane-82xck" Namespace="default"
Deleting Secret="capi-openstack-3-etcd" Namespace="default"
Deleting Machine="capi-openstack-control-plane-n2kdq" Namespace="default"
Deleting Secret="capi-openstack-3-sa" Namespace="default"
Deleting Secret="capi-openstack-3-kubeconfig" Namespace="default"
Deleting Secret="capi-openstack-3-proxy" Namespace="default"
Deleting MachineSet="capi-openstack-md-0-dfdf94979" Namespace="default"
Deleting Secret="capi-openstack-3-ca" Namespace="default"
Deleting OpenStackMachineTemplate="capi-openstack-control-plane" Namespace="default"
Deleting OpenStackMachineTemplate="capi-openstack-md-0" Namespace="default"
Deleting OpenStackCluster="capi-openstack-3" Namespace="default"
Deleting KubeadmConfigTemplate="capi-openstack-md-0" Namespace="default"
Deleting MachineDeployment="capi-openstack-md-0" Namespace="default"
Deleting KubeadmControlPlane="capi-openstack-control-plane" Namespace="default"
Deleting Cluster="capi-openstack-3" Namespace="default"
Resuming the target cluster
Set Cluster.Spec.Paused Paused=false Cluster="capi-openstack-3" Namespace="default"

Check cluster status

# kubectl get openstackcluster --kubeconfig capi-openstack-3.kubeconfig --all-namespaces
NAMESPACE   NAME               CLUSTER            READY   NETWORK                                SUBNET
default     capi-openstack-3   capi-openstack-3   true    4a6f2d57-bb3d-44f4-a28a-4c94a92e41d0   1a1a1d9d-5258-42cb-8756-fa4c648af72b

# kubectl get openstackmachines --kubeconfig capi-openstack-3.kubeconfig --all-namespaces
NAMESPACE   NAME                                 CLUSTER            STATE    READY   INSTANCEID                                         MACHINE
default     capi-openstack-control-plane-82xck   capi-openstack-3   ACTIVE   true    openstack:///f29007c5-f672-4214-a508-b7cf4a17b3ed   capi-openstack-control-plane-n2kdq
default     capi-openstack-md-0-bkwhh            capi-openstack-3   ACTIVE   true    openstack:///6e23324d-315a-4d75-85a9-350fd1705ab6   capi-openstack-md-0-dfdf94979-656zq

Troubleshooting

This guide explains how to debug a failed cluster creation. It is based on Minikube, but the steps should be similar for other management clusters.

Get logs of Cluster API controller containers

kubectl --kubeconfig minikube.kubeconfig -n capo-system logs -l control-plane=capo-controller-manager -c manager

Similarly, the logs of the other controllers in the namespaces capi-system and cabpk-system can be retrieved.

Master failed to start with error: node xxxx not found

Sometimes the master machine is created but fails to start up. Taking Ubuntu as an example, open /var/log/messages and check whether you see something like this:

Jul 10 00:07:58 openstack-master-5wgrw kubelet: E0710 00:07:58.444950 4340 kubelet.go:2248] node "openstack-master-5wgrw" not found
Jul 10 00:07:58 openstack-master-5wgrw kubelet: I0710 00:07:58.526091 4340 kubelet_node_status.go:72] Attempting to register node openstack-master-5wgrw
Jul 10 00:07:58 openstack-master-5wgrw kubelet: E0710 00:07:58.527398 4340 kubelet_node_status.go:94] Unable to register node "openstack-master-5wgrw" with API server: nodes "openstack-master-5wgrw" is forbidden: node "openstack-master-5wgrw.novalocal" is not allowed to modify node "openstack-master-5wgrw"

This might be caused by this issue; try the method proposed there.

providerClient authentication err

If you are using HTTPS, you must specify the CA certificate in your clouds.yaml file. If you still encounter an issue like:

kubectl --kubeconfig minikube.kubeconfig logs -n capo-system -l control-plane=capo-controller-manager
...
E0814 04:32:52.688514       1 machine_controller.go:204] Failed to check if machine "openstack-master-hxk9r" exists: providerClient authentication err: Post https://xxxxxxxxxxxxxxx:5000/v3/auth/tokens: x509: certificate signed by unknown authority
...

you can also add verify: false to the clouds.yaml file to work around the problem.

clouds:
  openstack:
    auth:
        ....
    region_name: "RegionOne"
    interface: "public"
    identity_api_version: 3
    cacert: /etc/certs/cacert
    verify: false

Fails in creating floating IP during cluster creation.

If you encounter the error "rule:create_floatingip and rule:create_floatingip:floating_ip_address is disallowed by policy" when creating a floating IP, check with your OpenStack administrator: you need to be authorized to perform those actions. See issue 572 for more detailed information.

Refer to rule:create_floatingip and rule:create_floatingip:floating_ip_address for further policy information.

An alternative is to create the floating IP before creating the cluster and then use it.
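
As a rough sketch of that alternative, a pre-created floating IP could be referenced from the OpenStackCluster spec. This assumes the apiServerFloatingIP field is available in the API version you use; the address below is a placeholder from a documentation range, not a real allocation.

```yaml
# Fragment of an OpenStackCluster spec (illustrative only)
spec:
  # Assumption: this floating IP was created beforehand by someone
  # with sufficient permissions, and is reused for the API server.
  apiServerFloatingIP: "203.0.113.10"
```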

CRD Changes

Conversions

CAPO is able to automatically convert your old resources into new API versions.

v1alpha4 compared to v1alpha5

Migration

All users are encouraged to migrate their usage of the CAPO CRDs from older versions to v1alpha5. This includes yaml files and source code. As CAPO implements automatic conversions between the CRD versions, this migration can happen after installing the new CAPO release.

API Changes

This only documents backwards incompatible changes. Fields that were added to v1alpha5 are not listed here.

`OpenStackCluster`

Managed API LoadBalancer

The fields related to the managed API LoadBalancer were moved into a separate object:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: OpenStackCluster
spec:
  managedAPIServerLoadBalancer: true
  apiServerLoadBalancerAdditionalPorts: [443]

becomes:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackCluster
spec:
  apiServerLoadBalancer:
    enabled: true
    additionalPorts: [443]

`OpenStackMachine`

Major Changes to Ports and Networks

When using Ports it is now possible to specify network and subnet by filter instead of just ID. As a consequence, the relevant ID fields are now moved into the new filter specifications:

ports:
  - networkId: d-e-a-d
    fixedIPs:
      - subnetId: b-e-e-f

becomes:

ports:
  - network:
      id: d-e-a-d
    fixedIPs:
      - subnet:
          id: b-e-e-f

Networks are now deprecated. With one exception, all functionality of Networks is now available for Ports. Consequently, Networks will be removed from the API in a future release.

The ability of a Network to add multiple ports with a single directive will not be added to Ports. When moving to Ports, all ports must be added explicitly. Specifically, when evaluating the network or subnet filter of a Network, if there are multiple matches we will add all of these to the server. By contrast we raise an error if the network or subnet filter of a Port does not return exactly one result.

tenantId was previously a synonym for projectId in both network and subnet filters. This has now been removed. Use projectId instead.
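
A minimal sketch of a v1alpha5 port whose network filter selects by project. The network name and the project ID placeholder are hypothetical values chosen for illustration.

```yaml
ports:
  - network:
      name: my-network            # hypothetical network name
      projectId: "<project-id>"   # was tenantId in v1alpha4
```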

The following fields are removed from network and subnet filters without replacement:

status
adminStateUp
shared
marker
limit
sortKey
sortDir
subnetPoolId

Rename of `status.error{Reason,Message}` to `status.failure{Reason,Message}`

The actual fields were previously already renamed, but we still used the error prefix in JSON. This was done to align with CAPI, where these fields were renamed in v1alpha3.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: OpenStackMachine
status:
  errorReason: UpdateError
  errorMessage: Something went wrong

becomes:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackMachine
status:
  failureReason: UpdateError
  failureMessage: Something went wrong

Changes to `rootVolume`

The following fields were removed without replacement:

rootVolume.deviceType
rootVolume.sourceType

Additionally, rootVolume.sourceUUID has been replaced by using ImageUUID or Image from the OpenStackMachine as appropriate.
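
The rootVolume change might look roughly as follows. This is a sketch under assumptions: the diskSize field and the example values for sourceType and deviceType are illustrative and not taken from this document, and the image UUID is a placeholder.

```yaml
# v1alpha4 (illustrative values)
rootVolume:
  diskSize: 50
  sourceType: image
  sourceUUID: "<image-uuid>"
  deviceType: disk
---
# v1alpha5: the image is taken from the OpenStackMachine spec instead
imageUUID: "<image-uuid>"
rootVolume:
  diskSize: 50
```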

v1alpha5 compared to v1alpha6

Migration

All users are encouraged to migrate their usage of the CAPO CRDs from older versions to v1alpha6. This includes yaml files and source code. As CAPO implements automatic conversions between the CRD versions, this migration can happen after installing the new CAPO release.

API Changes

This only documents backwards incompatible changes. Fields that were added to v1alpha6 are not listed here.

`OpenStackCluster`

`OpenStackMachine`

v1alpha6 compared to v1alpha7

⚠️ v1alpha7 has not been released yet.

Migration

API Changes

This only documents backwards incompatible changes. Fields that were added to v1alpha7 are not listed here.

`OpenStackMachine`

⚠️ Removal of networks

This is a major breaking change between v1alpha6 and v1alpha7 which in certain circumstances may require manual action before upgrading to v0.8.

v1alpha6 allowed network attachments to an OpenStackMachine to be specified as either Networks or Ports. In v1alpha7, Networks are removed. Network attachments may only be specified as ports.

In most cases Networks will be automatically converted to equivalent Ports on upgrade. However, this is not supported in the case where a Network specifies a network or subnet filter which returns more than one OpenStack resource. In this case it is important to rewrite any affected OpenStackMachineTemplates and wait for any consequent rollout to complete prior to updating to version 0.8.

Your configuration is affected if it contains any Network or Subnet filter which returns multiple resources. In a v1alpha6 Network definition this resulted in the creation of multiple ports: one for each returned result. In a Port definition, filters may only return a single resource and throw an error if multiple resources are returned.

For example, take this OpenStackMachineTemplate specification:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha6
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
spec:
  template:
    spec:
      ..
      networks:
      - filter:
          tags: tag-matches-multiple-networks
        subnets:
        - filter:
            tags: tag-matches-multiple-subnets

In this configuration both the network and subnet filters match multiple resources. In v0.8 this will be automatically converted to:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
spec:
  template:
    spec:
      ..
      ports:
      - network:
          tags: tag-matches-multiple-networks
        fixedIPs:
        - subnet:
            tags: tag-matches-multiple-subnets

However, this will cause an error when reconciled by the machine controller, because in a port:

a network filter may only return a single network
a subnet filter may only return a single subnet

Instead it needs to be rewritten prior to upgrading to version 0.8. It can be rewritten as either ports or networks, as long as each results in the creation of only a single port. For example, rewriting without converting to ports might give:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha6
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
spec:
  template:
    spec:
      ..
      networks:
      - filter:
          name: network-a
        subnets:
        - filter:
            name: subnet-a
      - filter:
          name: network-b
        subnets:
        - filter:
            name: subnet-b

This will be safely converted to use ports when upgrading to version 0.8.

To reiterate: it is not sufficient to leave your templates at v1alpha6 in this case, as it will still result in failure to reconcile in the machine controller. This change must be made prior to updating to version 0.8.

Removal of subnet

The OpenStackMachine spec previously contained a subnet field which could be used to set the accessIPv4 field on Nova servers. This feature was not widely used, was difficult to use, and could not be extended to support IPv6. It is removed without replacement.

Change to securityGroups

securityGroups has been simplified by the removal of a separate filter parameter. It was previously:

securityGroups:
  uuid: ...
  name: ...
  filter:
    description: ...
    tags: ...
    ...

It becomes:

securityGroups:
  id: ...
  name: ...
  description: ...
  tags: ...
  ...

Note that in this change, the uuid field has become id. So:

securityGroups:
- uuid: 4ea83db6-2760-41a9-b25a-e625a1161ed0

becomes:

securityGroups:
- id: 4ea83db6-2760-41a9-b25a-e625a1161ed0

The limit, marker, sortKey and sortDir fields have been removed without replacement. They did not serve any purpose in this context.

The tenantId parameter has been removed. Use projectId instead.

Changes to ports

Change to securityGroupFilters

The same change is made to securityGroupFilters in ports as is made to securityGroups in the machine spec.
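
Applied to a port, the change could look like the sketch below. It mirrors the securityGroups example above; the tag value is hypothetical.

```yaml
# v1alpha6
ports:
- securityGroupFilters:
  - uuid: 4ea83db6-2760-41a9-b25a-e625a1161ed0
  - filter:
      tags: my-tag   # hypothetical tag
---
# v1alpha7: uuid becomes id, and filter fields move up one level
ports:
- securityGroupFilters:
  - id: 4ea83db6-2760-41a9-b25a-e625a1161ed0
  - tags: my-tag
```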

Removal of securityGroups

The Port field of the OpenStackMachine spec previously contained both securityGroups and securityGroupFilters. As securityGroups can easily be replaced with securityGroupFilters, which can express the same and more, securityGroups has now been removed. CAPO can automatically convert securityGroups to securityGroupFilters when upgrading.

Here is an example of how to use securityGroupFilters to replace securityGroups:

# securityGroups are available in v1alpha6
securityGroups:
- 60ed83f1-8886-41c6-a1c7-fcfbdf3f04c2
- 0ddd14d9-5c33-4734-b7d0-ac4fdf35c2d9
- 4a131d3e-9939-4a6b-adea-788a2e89fcd8
# securityGroupFilters are available in both v1alpha6 and v1alpha7
securityGroupFilters:
- id: 60ed83f1-8886-41c6-a1c7-fcfbdf3f04c2
- id: 0ddd14d9-5c33-4734-b7d0-ac4fdf35c2d9
- id: 4a131d3e-9939-4a6b-adea-788a2e89fcd8

Removal of tenantId and projectId

These are removed without replacement. They required admin permission to set, which CAPO does not have by default, and served no purpose.

Change to profile

We previously allowed the use of the Neutron binding:profile via ports.profile, but the Neutron API description does not declare this API as stable.

Instead, we now explicitly support two use cases:

OVS Hardware Offload
Trusted Virtual Functions (VF)

Note that the conversion is lossy: we only support the two use cases above, so any configuration relying on other behaviour will be lost.

Here is an example of how to use ports.profile for enabling OVS Hardware Offload:

profile:
  OVSHWOffload: true

Here is an example of how to use ports.profile for enabling “trusted-mode” on the VF:

profile:
  TrustedVF: true

Creation of additionalBlockDevices

A machine can now have additional block devices attached.

Here is an example of how to use additionalBlockDevices to attach an additional Cinder volume to the server instance:

additionalBlockDevices:
- name: database
  sizeGiB: 50
  storage:
    type: Volume

Here is an example of how to use additionalBlockDevices to attach an additional Cinder volume to the server instance with an availability zone and a Cinder volume type:

additionalBlockDevices:
- name: database
  sizeGiB: 50
  storage:
    type: Volume
    volume:
      type: my-volume-type
      availabilityZone: az0

Here is an example of how to attach an ephemeral disk to the instance:

additionalBlockDevices:
- name: disk1
  sizeGiB: 1
  storage:
    type: Local

Adding more than one ephemeral disk to an instance is possible, but do so at your own risk: it has been known to cause issues in some environments.

`OpenStackCluster`

Change to externalRouterIPs.subnet

The uuid field is renamed to id, and all fields from filter are moved directly into the subnet.

externalRouterIPs:
- subnet:
    uuid: f23bf9c1-8c66-4383-b474-ada1d1960149
- subnet:
    filter:
      name: my-subnet

becomes:

externalRouterIPs:
- subnet:
    id: f23bf9c1-8c66-4383-b474-ada1d1960149
- subnet:
    name: my-subnet

status.router and status.apiServerLoadBalancer moved out of status.network

status:
  network:
    id: 756f59c0-2a9b-495e-9bb1-951762523d2d
    name: my-cluster-network
    ...
    router:
      id: dd0b23a7-e785-4061-93c5-464843e8cc39
      name: my-cluster-router
      ...
    apiServerLoadBalancer:
      id: 660d628e-cbcb-4c10-9910-e2e6493643c7
      name: my-api-server-loadbalancer
      ...

becomes:

status:
  network:
    id: 756f59c0-2a9b-495e-9bb1-951762523d2d
    name: my-cluster-network
    ...
  router:
    id: dd0b23a7-e785-4061-93c5-464843e8cc39
    name: my-cluster-router
    ...
  apiServerLoadBalancer:
    id: 660d628e-cbcb-4c10-9910-e2e6493643c7
    name: my-api-server-loadbalancer
    ...

status.network.subnet becomes status.network.subnets

status:
  network:
    id: 756f59c0-2a9b-495e-9bb1-951762523d2d
    name: my-cluster-network
    subnet:
      id: 0e0c3d69-040a-4b51-a3f5-0f5d010b36f4
      name: my-cluster-subnet
      cidr: 192.168.100.0/24

becomes

status:
  network:
    id: 756f59c0-2a9b-495e-9bb1-951762523d2d
    name: my-cluster-network
    subnets:
    - id: 0e0c3d69-040a-4b51-a3f5-0f5d010b36f4
      name: my-cluster-subnet
      cidr: 192.168.100.0/24

Nothing will currently create more than a single subnet, but there may be multiple subnets in the future. Similarly, code should no longer assume that the CIDR is an IPv4 CIDR, although nothing will currently create anything other than an IPv4 CIDR.

v1alpha7 compared to v1beta1

⚠️ v1beta1 has not been released yet.

Migration

All users are encouraged to migrate their usage of the CAPO CRDs from older versions to v1beta1. This includes yaml files and source code. As CAPO implements automatic conversions between the CRD versions, this migration can happen after installing the new CAPO release.

API Changes

This only documents backwards incompatible changes. Fields that were added to v1beta1 are not listed here.

`OpenStackMachine`

Removal of machine identityRef.kind

The identityRef.Kind field has been removed. It was used to specify the kind of the identity provider to use but was actually ignored.
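
A minimal sketch of the change, assuming a hypothetical secret name. Only the removal of kind is illustrated here.

```yaml
# v1alpha7
identityRef:
  kind: Secret
  name: my-cluster-cloud-config   # hypothetical secret name
---
# v1beta1: kind is no longer part of identityRef
identityRef:
  name: my-cluster-cloud-config
```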

Change to serverGroupID

The field serverGroupID has been renamed to serverGroup and is now a ServerGroupFilter object rather than a string ID.

The ServerGroupFilter object allows selection of a server group by name or by ID.

serverGroupID: "e60f19e7-cb37-49f9-a2ee-0a1281f6e03e"

becomes

serverGroup:
  id: "e60f19e7-cb37-49f9-a2ee-0a1281f6e03e"

To select a server group by name instead of ID:

serverGroup:
  name: "workers"

If a server group is provided and found, it will be added to OpenStackMachine.Status.ReferencedResources.ServerGroupID. If the server group can’t be found or the filter matches multiple server groups, an error will be returned. If an empty object or null is provided, the Machine will not be added to any server group and OpenStackMachine.Status.ReferencedResources.ServerGroupID will be empty.

`OpenStackCluster`

Removal of cluster identityRef.kind

The identityRef.Kind field has been removed. It was used to specify the kind of the identity provider to use but was actually ignored.

Change to externalNetworkID

The field externalNetworkID has been renamed to externalNetwork and is now a NetworkFilter object rather than a string ID. The NetworkFilter object allows selection of a network by name, by ID or by tags.

externalNetworkID: "e60f19e7-cb37-49f9-a2ee-0a1281f6e03e"

becomes

externalNetwork:
  id: "e60f19e7-cb37-49f9-a2ee-0a1281f6e03e"

For example, to select the external network by name instead of ID:

externalNetwork:
  name: "public"

If a network is provided, it will be added to OpenStackCluster.Status.ExternalNetwork. If the network can’t be found, an error will be returned. If no network is provided, CAPO will try to find a network marked “External” and add it to OpenStackCluster.Status.ExternalNetwork. If it can’t find a network marked “External”, OpenStackCluster.Status.ExternalNetwork will be set to nil. If more than one network is found, an error will be returned.

It is now possible for a user to specify that no external network should be used by setting DisableExternalNetwork to true:

disableExternalNetwork: true

Change to image

The field image is now an ImageFilter object rather than a string name. The ImageFilter object allows selection of an image by name, by ID or by tags.

image: "test-image"

becomes

image:
  name: "test-image"

The image ID will be added to OpenStackMachine.Status.ReferencedResources.ImageID. If the image can’t be found or the filter matches multiple images, an error will be returned.

Removal of imageUUID

The field imageUUID has been removed in favor of the image field.

imageUUID: "72a6a1e6-3e0a-4a8b-9b4c-2d6f9e3e5f0a"

becomes

image:
  id: "72a6a1e6-3e0a-4a8b-9b4c-2d6f9e3e5f0a"

Change to floatingIP

The OpenStackMachineSpec.FloatingIP field has moved to OpenStackClusterSpec.Bastion.FloatingIP. For example, if you had the following OpenStackMachineTemplate:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha6
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
spec:
  template:
    spec:
      ..
      floatingIP: "1.2.3.4"

This will be safely converted to use Bastion.FloatingIP when upgrading to version 0.8.

To use the new Bastion.FloatingIP field, here is an example:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
metadata:
  name: ${CLUSTER_NAME}
spec:
  ..
  bastion:
    floatingIP: "1.2.3.4"

Change to subnet

In v1beta1, Subnet of OpenStackCluster is changed to Subnets to allow specification of two existing subnets for the dual-stack scenario.

  subnet:
    id: a532beb0-c73a-4b5d-af66-3ad05b73d063

In v1beta1, this will be automatically converted to:

  subnets:
    - id: a532beb0-c73a-4b5d-af66-3ad05b73d063

Subnets allows the specification of at most two SubnetFilters, one IPv4 and the other IPv6. Both subnets must be on the same network. Any filtered subnets will be added to OpenStackCluster.Status.Network.Subnets.

When subnets are not specified on OpenStackCluster and only the network is, the network is used to identify the subnets to use. If more than two subnets exist in the network, the user must specify which ones to use by defining the OpenStackCluster.Spec.Subnets field.
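
For the dual-stack case, a sketch of the subnets field could look like this. The two IDs are placeholders; both subnets are assumed to already exist on the same network.

```yaml
  subnets:
    - id: "<ipv4-subnet-id>"   # placeholder for an existing IPv4 subnet
    - id: "<ipv6-subnet-id>"   # placeholder for an existing IPv6 subnet
```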

Change to nodeCidr and dnsNameservers

In v1beta1, OpenStackCluster.Spec.ManagedSubnets array field is introduced. The NodeCIDR and DNSNameservers fields of OpenStackCluster.Spec are moved into that structure (renaming NodeCIDR to CIDR). For example:

  nodeCidr: "10.0.0.0/24"
  dnsNameservers:
  - "10.0.0.123"

In v1beta1, this will be automatically converted to:

  managedSubnets:
  - cidr: "10.0.0.0/24"
    dnsNameservers:
    - "10.0.0.123"

Please note that currently managedSubnets can only hold one element.

Addition of allocationPools

In v1beta1, an AllocationPools property is introduced to OpenStackCluster.Spec.ManagedSubnets. When specified, the OpenStack subnet created by CAPO will have the given values set as its allocation_pools property. This allows users to make sure OpenStack will not automatically allocate some IP ranges in the subnet. If the subnet is precreated and configured, CAPO will ignore the AllocationPools property.
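
A minimal sketch of a managed subnet with an allocation pool, assuming each pool is given as a start and end address. The addresses are illustrative; with this range, OpenStack would only allocate addresses between 10.0.0.100 and 10.0.0.200 automatically.

```yaml
  managedSubnets:
  - cidr: "10.0.0.0/24"
    allocationPools:
    - start: "10.0.0.100"   # lowest address of the pool (example value)
      end: "10.0.0.200"     # highest address of the pool (example value)
```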

Change to managedSecurityGroups

The field managedSecurityGroups is now a pointer to a ManagedSecurityGroups object rather than a boolean.

Also, we can now add security group rules that authorize traffic from all nodes via allNodesSecurityGroupRules. It takes a list of security group rules that should be applied to selected nodes. The following rule fields are mutually exclusive: remoteManagedGroups, remoteGroupID and remoteIPPrefix. Valid values for remoteManagedGroups are controlplane, worker and bastion.

Also, OpenStackCluster.Spec.AllowAllInClusterTraffic moved under ManagedSecurityGroups.

managedSecurityGroups: true

becomes

managedSecurityGroups: {}

and

allowAllInClusterTraffic: true
managedSecurityGroups: true

becomes

managedSecurityGroups:
  allowAllInClusterTraffic: true

To apply a security group rule that will allow BGP between the control plane and workers, you can follow this example:

managedSecurityGroups:
  allNodesSecurityGroupRules:
  - remoteManagedGroups:
    - controlplane
    - worker
    direction: ingress
    etherType: IPv4
    name: BGP (Calico)
    portRangeMin: 179
    portRangeMax: 179
    protocol: tcp
    description: "Allow BGP between control plane and workers"

Calico CNI

Historically, CAPO created the security group rules necessary for Calico CNI to work. This is no longer the case. The user now needs to request creation of these security group rules by using the managedSecurityGroups.allNodesSecurityGroupRules feature.

Note that when upgrading from a previous version, the Calico CNI security group rules will be added automatically to allow backwards compatibility if allowAllInClusterTraffic is set to false.

Change to network

In v1beta1, when the OpenStackCluster.Spec.Network is not defined, the Subnets are now used to identify the Network.
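
As an illustration, a cluster spec that omits network and relies on subnets alone might look like the sketch below. It reuses the subnet ID from the earlier example; all other fields are left out.

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
spec:
  # network is intentionally omitted; it is identified from the subnets
  subnets:
    - id: a532beb0-c73a-4b5d-af66-3ad05b73d063
```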

Development Guide

This document explains how to develop Cluster API Provider OpenStack.

Using your own capi-openstack controller image for testing cluster creation or deletion

To test cluster creation or deletion with your own code, you need to build your own capi-openstack controller image. The image is stored in a Docker registry, so you need to create an account on a Docker registry in advance.

Building and uploading your own capi-openstack controller image

Variable	Meaning	Mandatory	Example
REGISTRY	The registry name	Yes	docker.io/<username>
IMAGE_NAME	The image name (default: capi-openstack-controller)	No	capi-openstack-controller
TAG	The image version (default: dev)	No	latest

Execute the command to build and upload the image to the Docker registry.

make docker-build docker-push

Using your own capi-openstack controller image

After generating infrastructure-components.yaml, replace the image us.gcr.io/k8s-artifacts-prod/capi-openstack/capi-openstack-controller:v0.3.4 with your own image.
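
The relevant part of the generated file could then look roughly like this sketch. It assumes the image reference is composed as REGISTRY/IMAGE_NAME:TAG from the variables above with their defaults, and that the controller container is the one named manager, as in the log command in the troubleshooting guide; check your generated file for the actual layout.

```yaml
# Fragment of the controller Deployment in infrastructure-components.yaml
spec:
  template:
    spec:
      containers:
      - name: manager
        # Assumption: REGISTRY=docker.io/<username>, default IMAGE_NAME and TAG
        image: "docker.io/<username>/capi-openstack-controller:dev"
```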

Testing Cluster Creation using the ‘dev-test’ ClusterClass with Tilt

This guide demonstrates how to create a Kubernetes cluster using a ClusterClass, specifically designed for a development environment. It includes configuring secrets, applying the ClusterClass, and creating a cluster with Tilt.

The dev-test ClusterClass is designed for development. This means that it is using the latest (potentially unstable) API version. The defaults are also aligned with the devstack setup (documented below) to make it as easy as possible to use in a development flow. However, this also means that it may not be well suited for general usage.

Developing with Tilt

We have support for using Tilt for rapid iterative development. Please visit the Cluster API documentation on Tilt for information on how to set up your development environment.

The Tiltfile in the Cluster API repository can be used as is with CAPO, but we need to add some configuration. For using Tilt with ClusterClass, update your tilt-settings.yaml file (located in the root of the CAPI repository) as described:

template_dirs:
  openstack:
  # Make Tilt aware of the CAPO templates
  - ../cluster-api-provider-openstack/templates

kustomize_substitutions:
  CLUSTER_TOPOLOGY: "true"
  # [Optional] SSH Keypair Name for Instances in OpenStack (Default: "")
  OPENSTACK_SSH_KEY_NAME: "<openstack_keypair_name>"
  # [Optional] Control Plane Machine Flavor (Default: m1.medium)
  OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR: "<openstack_control_plane_machine_flavor>"
  # [Optional] Node Machine Flavor (Default: m1.small)
  OPENSTACK_NODE_MACHINE_FLAVOR: "<openstack_node_machine_flavor>"
  # [Optional] OpenStack Cloud Environment (Default: capo-e2e)
  OPENSTACK_CLOUD: "<openstack_cloud>"

# [Optional] Automatically apply a kustomization, e.g. for adding the clouds.yaml secret
additional_kustomizations:
  secret_kustomization: /path/to/kustomize/secret/configuration

Apply ClusterClass and create Cluster

When you are happy with the configuration, start the environment as explained in the CAPI documentation. Open the Tilt dashboard in your browser. After a while, you should be able to find resources called CAPO.clusterclasses and CAPO.templates. These should correspond to what exists in the templates folder, and you should see widgets for applying and deleting them.

Note: When you apply a cluster template, there will be a KUBERNETES_VERSION variable. This variable is used to pick the image used! Ensure that an image named ubuntu-2204-kube-{{ KUBERNETES_VERSION }} is available in your environment, corresponding to that Kubernetes version.

Note: All clusters created from the dev-test ClusterClass will require a secret named dev-test-cloud-config with the clouds.yaml to be used by CAPO for interacting with OpenStack. You can create it manually or see below how to make Tilt automate it.

Automatically applying kustomizations with Tilt

This explains how to automatically create the secret containing clouds.yaml. The same procedure can be used for any other things you want to create in the cluster.

Ensure the specified path (/path/to/kustomize/secret/configuration) contains both the clouds.yaml file and a kustomization.yaml file. The kustomization.yaml should define the necessary resources, such as a Kubernetes secret, using the clouds.yaml file.

For example, if you want to automatically create a secret named dev-test-cloud-config with the content of your clouds.yaml every time you do tilt up, you could do the following.

Create a folder to hold the kustomization. We will use /tmp/capo-dev as example here.

Add the clouds.yaml file that you want to use to the folder. It could look something like this:

clouds:
  capo-e2e:
    auth:
      username: demo
      password: secretadmin
      # If using application credentials you would have something like this instead:
      # auth_type: v3applicationcredential
      # application_credential_id: abc123
      # application_credential_secret: 456def
      user_domain_id: default
      auth_url: https://example.com/identity
      domain_id: default
      project_name: demo
    verify: false
    region_name: RegionOne

Create a kustomization file named kustomization.yaml in the same folder:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Do not add random hash to the end of the secret name
generatorOptions:
  disableNameSuffixHash: true
secretGenerator:
- files:
  - clouds.yaml
  name: dev-test-cloud-config
  type: Opaque

If you now add /tmp/capo-dev to the additional_kustomizations, Tilt will automatically apply the secret.

To check that the kustomization produces the desired output, run kustomize build /tmp/capo-dev.
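
The steps above can be collected into a small script. This is only a sketch: the function name write_secret_kustomization is hypothetical, and it writes just the kustomization.yaml shown above; you still need to copy your own clouds.yaml into the same folder.

```shell
#!/bin/sh
# Hypothetical helper: create a folder and write the kustomization.yaml
# from this section into it. It does not create clouds.yaml.
write_secret_kustomization() {
    dir=$1
    mkdir -p "$dir" || return 1
    # Quoted heredoc delimiter: the content is written verbatim.
    cat > "$dir/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Do not add random hash to the end of the secret name
generatorOptions:
  disableNameSuffixHash: true
secretGenerator:
- files:
  - clouds.yaml
  name: dev-test-cloud-config
  type: Opaque
EOF
}

# Example usage (commented out: the last step needs kustomize installed):
#   write_secret_kustomization /tmp/capo-dev
#   cp /path/to/clouds.yaml /tmp/capo-dev/
#   kustomize build /tmp/capo-dev
```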

Using the ‘dev-test’ ClusterClass without Tilt

If you want to use the ClusterClass without Tilt, you will need to follow these steps instead of the above.

Creating a Kind Cluster

Create a Kind cluster and deploy CAPO.

Note: As the dev-test ClusterClass is made for development, it may be using a newer API version than what is in the latest release. You will need to use local artifacts for this to work in most cases!

kind create cluster
export CLUSTER_TOPOLOGY=true
clusterctl init --infrastructure openstack

Secret Configuration

CAPO needs a clouds.yaml file in order to manage the OpenStack resources needed for the Cluster. This should be supplied as a secret named dev-test-cloud-config. You can create this secret for example with:

kubectl create secret generic dev-test-cloud-config --from-file=clouds.yaml

Apply the ClusterClass and create Clusters

You can use clusterctl to render the ClusterClass:

clusterctl generate yaml --from templates/clusterclass-dev-test.yaml

Create a cluster using the development template, which makes use of the ClusterClass:

clusterctl generate cluster my-cluster --kubernetes-version=v1.29.0 --from templates/cluster-template-development.yaml > my-cluster.yaml
kubectl apply -f my-cluster.yaml

Running E2E tests locally

You can run the E2E tests locally with:

make test-e2e OPENSTACK_CLOUD_YAML_FILE=/path/to/clouds.yaml OPENSTACK_CLOUD=mycloud

where mycloud is an entry in clouds.yaml.

The E2E tests:

Build a CAPO image from the local working directory
Create a kind cluster locally
Deploy downloaded CAPI and locally-built CAPO to kind
Create an e2e namespace per-test on the kind cluster
Deploy cluster templates to the test namespace
Create test clusters on the target OpenStack

Support for clouds using SSL

If your cloud requires a cacert, you must also pass this to make via OPENSTACK_CLOUD_CACERT_B64, e.g.:

make test-e2e OPENSTACK_CLOUD_YAML_FILE=/path/to/clouds.yaml OPENSTACK_CLOUD=my_cloud \
              OPENSTACK_CLOUD_CACERT_B64=$(base64 -w0 /path/to/mycloud-ca.crt)

CAPO deployed in the local kind cluster will automatically pick up a cacert defined in your clouds.yaml so you will see servers created in OpenStack without specifying OPENSTACK_CLOUD_CACERT_B64. However, the cacert won’t be deployed to those servers, so kubelet will fail to start.
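
The -w0 flag used above is a GNU coreutils option and may not be available in every base64 implementation. As a hedged alternative, the sketch below produces the same single-line encoding by stripping newlines with tr. The function name encode_cacert is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: base64-encode a file onto a single line without
# relying on the GNU-specific -w0 option.
encode_cacert() {
    base64 < "$1" | tr -d '\n'
}

# Example usage (commented out: it runs the full E2E suite):
#   make test-e2e OPENSTACK_CLOUD_YAML_FILE=/path/to/clouds.yaml OPENSTACK_CLOUD=my_cloud \
#                 OPENSTACK_CLOUD_CACERT_B64="$(encode_cacert /path/to/mycloud-ca.crt)"
```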

Support for clouds with multiple external networks

If your cloud contains only a single external network, CAPO will automatically select that network for use by a deployed cluster. However, if there are multiple external networks, CAPO will log an error and fail to create any machines. In this case you must explicitly pass the ID of an external network to use with OPENSTACK_EXTERNAL_NETWORK_ID, e.g.:

make test-e2e OPENSTACK_CLOUD_YAML_FILE=/path/to/clouds.yaml OPENSTACK_CLOUD=my_cloud \
              OPENSTACK_EXTERNAL_NETWORK_ID=27635f93-583d-454e-9c6d-3d305e7f8a22

OPENSTACK_EXTERNAL_NETWORK_ID must be specified as a UUID. Specifying by name is not supported.

You can list available external networks with:

$ openstack network list --external
+--------------------------------------+----------+--------------------------------------+
| ID                                   | Name     | Subnets                              |
+--------------------------------------+----------+--------------------------------------+
| 27635f93-583d-454e-9c6d-3d305e7f8a22 | external | be64cd07-f8b7-4705-8446-26b19eab3914 |
| cf2e83dc-545d-490f-9f9c-4e90927546f2 | hostonly | ec95befe-72f4-4af6-a263-2aec081f47d3 |
+--------------------------------------+----------+--------------------------------------+
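
Because only a UUID is accepted, it can help to validate the value before starting a long test run. The following is a minimal sketch; the function name is_uuid is hypothetical, and the check only covers the textual format, not whether the network exists.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the argument looks like a UUID
# (8-4-4-4-12 hexadecimal digits). It does not query OpenStack.
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Example usage (commented out: it needs the OpenStack CLI and credentials):
#   openstack network list --external -f value -c ID
#   is_uuid "$OPENSTACK_EXTERNAL_NETWORK_ID" || echo "not a UUID" >&2
```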

E2E test environment

The test suite is executed in an existing OpenStack environment. You can create and manage this environment yourself or use the hacking CI scripts to provision an environment with DevStack similar to the one used for continuous integration.

Requirements

The file test/e2e/data/e2e_conf.yaml and the test templates under test/e2e/data/infrastructure-openstack reference several OpenStack resources which must exist before running the test:

System requirements
- Multiple nodes
- controller: 16 CPUs / 64 GB RAM
- worker: 8 CPUs / 32 GB RAM
Availability zones (for multi-AZ tests)
- testaz1: used by all test cases
- testaz2: used by multi-az test case
Services (Additional services to be enabled)
- Octavia
- Network trunking (neutron-trunk)
- see Configuration for more details.
Glance images
- cirros-0.6.1-x86_64-disk
  - Download from https://docs.openstack.org/image-guide/obtain-images.html
- ubuntu-2004-kube-v1.23.10
  - Download from https://storage.googleapis.com/artifacts.k8s-staging-capi-openstack.appspot.com/test/ubuntu/2022-12-05/ubuntu-2004-kube-v1.23.10.qcow2
  - Or generate using the images/capi directory from https://github.com/kubernetes-sigs/image-builder
    - Boot volume size must be less than 15GB
Flavors
- m1.medium: used by control plane
- m1.small: used by workers
- m1.tiny: used by bastion

clouds.yaml

capo-e2e: for general user authorization
capo-e2e-admin: for administrator user authorization

For example:

clouds:
  capo-e2e:
    auth:
      auth_url: http://Node-Address/identity
      project_name: demo
      project_domain_name: Default
      user_domain_name: Default
      username: demo
      password: secret
    region_name: RegionOne

  capo-e2e-admin:
    auth:
      auth_url: http://Node-Address/identity
      project_name: demo
      project_domain_name: Default
      user_domain_name: Default
      username: admin
      password: secret
    region_name: RegionOne
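
Since the tests expect both entries, a quick check of the file can catch a missing one early. This is a rough sketch under stated assumptions: has_cloud_entry is a hypothetical helper, and it matches lines textually rather than parsing YAML, so it assumes the entries are indented keys on their own lines as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file contains an indented key with
# the given cloud name. Text match only; this is not a YAML parser, and
# the name is used inside a regular expression as-is.
has_cloud_entry() {
    file=$1
    name=$2
    grep -Eq "^[[:space:]]+${name}:[[:space:]]*\$" "$file"
}

# Example usage:
#   for cloud in capo-e2e capo-e2e-admin; do
#       has_cloud_entry clouds.yaml "$cloud" || echo "missing entry: $cloud" >&2
#   done
```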

Create E2E test environment

You can easily create a test environment similar to the one used during continuous integration on OpenStack, AWS or GCE with the hacking CI scripts.

The entry point for the creation of the DevStack environment is the create_devstack.sh script, which executes cloud-specific scripts to create the infrastructure.

You can switch between cloud providers by setting the RESOURCE_TYPE environment variable to aws-project, gce-project or openstack.

OpenStack

Configure the following environment variables for OpenStack:

export RESOURCE_TYPE="openstack"
export OS_CLOUD=<your cloud>
export OPENSTACK_FLAVOR_controller=<flavor with >= 16 cores, 64GB RAM and 50GB storage>
export OPENSTACK_FLAVOR_worker=<flavor with >= 8 cores, 32GB RAM and 50GB storage>
export OPENSTACK_PUBLIC_NETWORK=<name of the external network>
export OPENSTACK_SSH_KEY_NAME=<your ssh key-pair name>
export SSH_PUBLIC_KEY_FILE=/home/user/.ssh/id_ed25519.pub
export SSH_PRIVATE_KEY_FILE=/home/user/.ssh/id_ed25519

and create the environment by running:

./hack/ci/create_devstack.sh
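
Before running the script, it may be worth checking that all of the variables listed above are set. The sketch below is one way to do that; missing_vars is a hypothetical helper and is not part of the CAPO hack scripts.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the names of the given variables
# that are unset or empty, and fail if there are any.
missing_vars() {
    missing=""
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Strip the leading space before printing.
    printf '%s\n' "${missing# }"
    [ -z "$missing" ]
}

# Example usage:
#   missing_vars RESOURCE_TYPE OS_CLOUD OPENSTACK_FLAVOR_controller \
#       OPENSTACK_FLAVOR_worker OPENSTACK_PUBLIC_NETWORK OPENSTACK_SSH_KEY_NAME \
#       SSH_PUBLIC_KEY_FILE SSH_PRIVATE_KEY_FILE || exit 1
```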

DevStack

Here are a few notes on setting up a DevStack environment and debugging resources (tested on m3.small from Equinix Metal: https://deploy.equinix.com/product/servers/m3-small/).

Server side

As a root user, install and configure DevStack:

# useradd -s /bin/bash -d /opt/stack -m stack
# chmod +x /opt/stack
# echo "stack ALL=(ALL) NOPASSWD: ALL" | tee /etc/sudoers.d/stack
# sudo -u stack -i
$ git clone https://opendev.org/openstack/devstack
$ cd devstack
$ cat > local.conf <<EOF
[[local|localrc]]
ADMIN_PASSWORD=!!! CHANGE ME !!!
DATABASE_PASSWORD=\$ADMIN_PASSWORD
RABBIT_PASSWORD=\$ADMIN_PASSWORD
SERVICE_PASSWORD=\$ADMIN_PASSWORD

GIT_BASE=https://opendev.org
# Enable Logging
LOGFILE=\$DEST/logs/stack.sh.log
VERBOSE=True
LOG_COLOR=True
enable_service rabbit
enable_plugin neutron \$GIT_BASE/openstack/neutron
# Octavia supports using QoS policies on the VIP port:
enable_service q-qos
enable_service placement-api placement-client
# Octavia services
enable_plugin octavia \$GIT_BASE/openstack/octavia master
enable_plugin octavia-dashboard \$GIT_BASE/openstack/octavia-dashboard
enable_plugin ovn-octavia-provider \$GIT_BASE/openstack/ovn-octavia-provider
enable_plugin octavia-tempest-plugin \$GIT_BASE/openstack/octavia-tempest-plugin
enable_service octavia o-api o-cw o-hm o-hk o-da
# Cinder
enable_service c-api c-vol c-sch
EOF
$ ./stack.sh

If you want to enable web-download (i.e., importing images from a URL):

# /etc/glance/glance-api.conf
show_multiple_locations = True

# ./horizon/openstack_dashboard/defaults.py
IMAGE_ALLOW_LOCATIONS = True

# /etc/glance/glance-image-import.conf
[image_import_opts]
image_import_plugins = ['image_decompression']

$ sudo systemctl restart devstack@g-api.service apache2

With this dev setup, it might be useful to enable DHCP for the public subnet: Admin > Network > Networks > public > Subnets > public-subnet > Edit Subnet > Subnet Details > ☑ Enable DHCP + Add DNS

CAPO side

To work with this setup, update the test/e2e/data/e2e_conf.yaml file as follows. (NOTE: Alternatively, you can resize the m1.small flavor in DevStack to avoid changing the flavor in this file.)

diff --git a/test/e2e/data/e2e_conf.yaml b/test/e2e/data/e2e_conf.yaml
index 0d66e1f2..a3b2bd78 100644
--- a/test/e2e/data/e2e_conf.yaml
+++ b/test/e2e/data/e2e_conf.yaml
@@ -136,7 +136,7 @@ variables:
   CNI: "../../data/cni/calico.yaml"
   CCM: "../../data/ccm/cloud-controller-manager.yaml"
   EXP_CLUSTER_RESOURCE_SET: "true"
-  OPENSTACK_BASTION_IMAGE_NAME: "cirros-0.6.1-x86_64-disk"
+  OPENSTACK_BASTION_IMAGE_NAME: "cirros-0.5.2-x86_64-disk"
   OPENSTACK_BASTION_MACHINE_FLAVOR: "m1.tiny"
   OPENSTACK_CLOUD: "capo-e2e"
   OPENSTACK_CLOUD_ADMIN: "capo-e2e-admin"
@@ -144,10 +144,10 @@ variables:
   OPENSTACK_CLOUD_YAML_FILE: '../../../../clouds.yaml'
   OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR: "m1.medium"
   OPENSTACK_DNS_NAMESERVERS: "8.8.8.8"
-  OPENSTACK_FAILURE_DOMAIN: "testaz1"
-  OPENSTACK_FAILURE_DOMAIN_ALT: "testaz2"
+  OPENSTACK_FAILURE_DOMAIN: "nova"
+  OPENSTACK_FAILURE_DOMAIN_ALT: "nova"
   OPENSTACK_IMAGE_NAME: "focal-server-cloudimg-amd64"
-  OPENSTACK_NODE_MACHINE_FLAVOR: "m1.small"
+  OPENSTACK_NODE_MACHINE_FLAVOR: "m1.medium"

Before running a test:

Start sshuttle (https://github.com/sshuttle/sshuttle) to set up the network between the host and the DevStack instance correctly.

sshuttle -r stack@<devstack-server-ip> 172.24.4.0/24 -l 0.0.0.0

Import the tested image into DevStack under the name defined in e2e_conf.yaml (OPENSTACK_FLATCAR_IMAGE_NAME or OPENSTACK_IMAGE_NAME).

To run a specific test, set the E2E_GINKGO_FOCUS variable. If you want to SSH into an instance to debug it, you can proxy jump via the bastion and use the SSH key generated by Nova, for example with Flatcar:

ssh -J cirros@172.24.4.229 -i ./_artifacts/ssh/cluster-api-provider-openstack-sigs-k8s-io core@10.6.0.145

Running E2E tests using rootless podman

You can use unprivileged podman to:

Build the CAPO image
Deploy the kind cluster

To do this you need to configure the host appropriately and pass PODMAN=1 to make, e.g.:

make test-e2e OPENSTACK_CLOUD_YAML_FILE=/path/to/clouds.yaml OPENSTACK_CLOUD=my_cloud \
              PODMAN=1

Host configuration

Firstly, you must be using kernel >=5.11. If you are using Fedora, this means Fedora >= 34.
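
The kernel requirement can be checked with a short script. This is a sketch only: kernel_at_least is a hypothetical helper that compares the major and minor numbers of a release string such as the one printed by uname -r.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string (for example
# the output of `uname -r`) is at least MAJOR.MINOR.
# Usage: kernel_at_least MAJOR MINOR RELEASE
kernel_at_least() {
    want_major=$1
    want_minor=$2
    release=$3
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Example usage:
#   kernel_at_least 5 11 "$(uname -r)" || echo "kernel too old for rootless kind" >&2
```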

You must configure systemd and iptables as described in https://kind.sigs.k8s.io/docs/user/rootless/. There is no need to configure cgroups v2 on Fedora, as it uses this by default.

You must install the podman-docker package to emulate the docker cli tool. However, this is not sufficient on its own as described below.

Running podman system service to emulate docker daemon

While kind itself supports podman, the cluster-api test framework does not. This framework is used by the CAPO tests to push test images into the kind cluster. Unfortunately the cluster-api test framework explicitly connects to a running docker daemon, so cli emulation is not sufficient for compatibility. This issue is tracked in https://github.com/kubernetes-sigs/cluster-api/issues/5146, and the following workaround can be ignored when this is resolved.

podman includes a ‘system service’ which emulates docker. For the tests to work, this service must be running and listening on a unix socket at /var/run/docker.sock. You can achieve this with:

$ podman system service -t 0 &
$ sudo rm /var/run/docker.sock
$ sudo ln -s /run/user/$(id -u)/podman/podman.sock /var/run/docker.sock
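
The commands above remove /var/run/docker.sock unconditionally. A more cautious variation, sketched below, only replaces the path when it is missing or already a symlink, so that the socket of a real Docker daemon is not deleted by accident. The function name link_docker_socket is hypothetical, and this check is a design choice of the sketch rather than something the tests require.

```shell
#!/bin/sh
# Hypothetical helper: point DST at SRC with a symlink, but refuse to
# replace DST if it exists and is not itself a symlink.
# Usage: link_docker_socket SRC DST
link_docker_socket() {
    src=$1
    dst=$2
    if [ -e "$dst" ] && [ ! -L "$dst" ]; then
        echo "refusing to replace $dst: not a symlink" >&2
        return 1
    fi
    # -f replaces an existing link; -n avoids following it if it points to a directory.
    ln -sfn "$src" "$dst"
}

# Example usage (commented out: it needs root privileges):
#   sudo sh -c '. ./link_docker_socket.sh; link_docker_socket /run/user/1000/podman/podman.sock /var/run/docker.sock'
```

The file name link_docker_socket.sh and the user ID 1000 in the usage comment are placeholders.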

API concepts

This section's goal is to gather various insights into the API design that can serve as a reference to explain the choices made, without the need to analyze discussions in individual PRs.

`referencedResources`

Starting from v1beta1 both OpenStackMachineStatus and BastionsStatus feature a field named referencedResources which aims to include fields that list individual IDs of the resources associated with the machine or bastion. These IDs are calculated on machine or bastion creation and are not intended to be changed during the object lifecycle.

Having all the IDs of related resources saved in the statuses allows CAPO to make easy decisions about deleting the related resources when deleting the VM corresponding to the machine or bastion.

Hacking CI for the E2E tests

Prow

CAPO tests are executed by Prow. They are defined in the Kubernetes test-infra repository. The E2E tests run as a presubmit. They run in a docker container in Prow infrastructure which contains a checkout of the CAPO tree under test. The entry point for tests is scripts/ci-e2e.sh, which is defined in the job in Prow.

DevStack

The E2E tests require an OpenStack cloud to run against, which we provision during the test with DevStack. The project has access to capacity on GCP, so we provision DevStack on 2 GCP instances.

The entry point for the creation of the test DevStack is hack/ci/create_devstack.sh, which is executed by scripts/ci-e2e.sh. We create 2 instances: controller and worker. Each will provision itself via cloud-init using config defined in hack/ci/cloud-init.

DevStack OS

In GCE, DevStack is installed on a community-maintained Ubuntu 22.04 LTS cloud image. The cloud-init config is also intended to work on CentOS 8, and this is known to work as of 2021-01-12. However, note that this is not regularly tested. See the comment in hack/ci/gce-project.sh for how to deploy on CentOS.

It is convenient for the project to have a viable second OS option as it gives us an option to work around issues which only affect one or the other. This is most likely when enabling new DevStack features, but may also include infrastructure issues. Consequently, when making changes to cloud-init, try not to use features specific to Ubuntu or CentOS. DevStack already supports both operating systems, so we just need to be careful in our peripheral configuration, for example by using cloud-init’s packages module rather than manually invoking apt-get or yum. Fortunately, package names tend to be consistent across the two distributions.

Configuration

We configure a 2 node DevStack. controller is running:

All control plane services
Nova: all services, including compute
Glance: all services
Octavia: all services
Neutron: all services with ML2/OVN
Cinder: all services, including volume with default LVM/iSCSI backend

worker is running:

Nova: compute only
Neutron: OVN agents only
Cinder: volume only with default LVM/iSCSI backend

controller is using the n2-standard-16 machine type with 16 vCPUs and 64 GB RAM. worker is using the n2-standard-8 machine type with 8 vCPUs and 32 GB RAM. Each job has a quota limit of 24 vCPUs.

Build order

We build controller first, and then worker. We let worker build asynchronously because tests which don’t require a second AZ can run without it while it builds. A systemd job defined in the cloud-init of controller polls for worker coming up and automatically configures it.

Networking

Both instances share a common network which uses the CIDR defined in PRIVATE_NETWORK_CIDR in hack/ci/create_devstack.sh. Each instance has a single IP on this network:

controller: 10.0.3.15
worker: 10.0.3.16

In addition, DevStack will create a floating IP network using CIDR defined in FLOATING_RANGE in hack/ci/create_devstack.sh. As the neutron L3 agent is only running on the controller, all of this traffic is handled on the controller, even if the source is an instance running on the worker. The controller creates iptables rules to NAT this traffic.

The effect of this is that instances created on either controller or worker can get a floating IP from the public network. Traffic using this floating IP will be routed via controller and externally via NAT.

We are configuring OVN to provide default DNS servers if a subnet is created without specifying DNS servers. This can be overridden in OPENSTACK_DNS_NAMESERVERS.

Availability zones

We are running nova compute and cinder volume on each of controller and worker. Each nova compute and cinder volume are configured to be in their own availability zone. The names of the availability zones are defined in OPENSTACK_FAILURE_DOMAIN and OPENSTACK_FAILURE_DOMAIN_ALT in test/e2e/data/e2e_conf.yaml, with the services running on controller being in OPENSTACK_FAILURE_DOMAIN and the services running on worker being in OPENSTACK_FAILURE_DOMAIN_ALT.

This configuration is intended only to allow the testing of functionality related to availability zones, and does not imply any robustness to failure.

Nova is configured (via [DEFAULT]/default_schedule_zone) to place all workloads on the controller unless they have an explicit availability zone. The intention is that controller should have the capacity to run all tests which are agnostic to availability zones. This means that the explicitly multi-az tests do not risk failure due to capacity issues.

However, this is not sufficient because by default CAPI explicitly schedules the control plane across all discovered availability zones. Consequently we explicitly confine all clusters to OPENSTACK_FAILURE_DOMAIN (controller) in the test cluster definitions in test/e2e/data/infrastructure-openstack.

Connecting to DevStack

The E2E tests running in Prow create a kind cluster. This also runs in Prow, using Docker in Docker. The E2E tests configure this cluster with clusterctl, which is where CAPO executes.

create_devstack.sh writes a clouds.yaml to the working directory, which is passed to CAPO via the cluster definitions in test/e2e/data/infrastructure-openstack. This clouds.yaml references the public, routable IP of controller. However, DevStack creates all the service endpoints using controller’s private IP, which is not publicly routable. In addition, the tests need to be able to SSH to the floating IP of the Bastion. This floating IP is also allocated from a range which is not publicly routable.

To allow this access we run sshuttle from create_devstack.sh. This creates an SSH tunnel and routes traffic for PRIVATE_NETWORK_CIDR and FLOATING_RANGE over it.

Note that the semantics of a sshuttle tunnel are problematic. While they happen to work currently for DinD, Podman runs the kind cluster in a separate network namespace. This means that kind running in podman cannot route over sshuttle running outside the kind cluster. This may also break in future versions of Docker.

Kubernetes Cluster API Provider OpenStack