[Infrastructure as Code (IaC) Pulumi] Use the Pulumi Kubernetes (K8S) Helm Chart to deploy Rancher Labs Longhorn

Rancher Labs Longhorn

Longhorn is a lightweight, reliable and easy-to-use Cloud native distributed block storage system for Kubernetes.

Longhorn is free, open source software. Originally developed by Rancher Labs, it is now being developed as a sandbox project of the Cloud Native Computing Foundation.

With Longhorn, you can:

  • Use Longhorn volumes as persistent storage for the distributed stateful applications in your Kubernetes cluster

  • Partition your block storage into Longhorn volumes so that you can use Kubernetes volumes with or without a cloud provider

  • Replicate block storage across multiple nodes and data centers to increase availability

  • Store backup data in external storage such as NFS or AWS S3

  • Create cross-cluster disaster recovery volumes so that data from a primary Kubernetes cluster can be quickly recovered from backup in a second Kubernetes cluster

  • Schedule recurring snapshots of a volume, and schedule recurring backups to NFS or S3-compatible secondary storage

  • Restore volumes from backup

  • Upgrade Longhorn without disrupting persistent volumes

  • Manipulate Longhorn resources with kubectl

This article is about how to use Helm to deploy Longhorn on Kubernetes (K8S).

Prerequisites

  • A running Kubernetes (K8S) cluster, with open-iscsi installed on each node (required by Longhorn, see [4]).

  • The Pulumi CLI, Node.js and kubectl available locally, with access to the cluster.

Usage

Pulumi New

Create the workspace directory.

$ mkdir -p col-example-pulumi-typescript-longhorn

$ cd col-example-pulumi-typescript-longhorn

Log in to Pulumi using the local filesystem backend.

$ pulumi login file://.
Logged in to cloudolife as cloudolife (file://.)
or visit https://pulumi.com/docs/reference/install/ for manual instructions and release notes.

Run pulumi new to create a project from the kubernetes-typescript template.

$ pulumi new kubernetes-typescript

The above command creates the following files in the current directory.

$ tree . -L 1
.
├── node_modules/
├── package.json
├── package-lock.json
├── Pulumi.dev.yaml
├── Pulumi.yaml
└── main.ts

Install the js-yaml package to load and parse YAML files.

$ npm i js-yaml

Pulumi Configuration

Configure Kubernetes

By default, Pulumi will look for a kubeconfig file in the following locations, just like kubectl:

  • The environment variable: $KUBECONFIG,

  • Or in current user’s default kubeconfig directory: ~/.kube/config

If the kubeconfig file is not in either of these locations, Pulumi will not find it, and it will fail to authenticate against the cluster. Set one of these locations to a valid kubeconfig file, if you have not done so already.
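If the kubeconfig lives somewhere else, or you want to pin the cluster explicitly, you can also pass it to an explicit Kubernetes provider and attach that provider to the resources in main.ts. A minimal sketch, assuming the kubeconfig path below is adjusted to your environment:

// Optional: explicit Kubernetes provider instead of the ambient kubeconfig.
// The path below is an example; point it at your own kubeconfig file.
import * as fs from "fs";
import * as k8s from "@pulumi/kubernetes";

const provider = new k8s.Provider("k8s-provider", {
    kubeconfig: fs.readFileSync("/path/to/kubeconfig", "utf8"),
});

// Resources created later can opt in to this provider, e.g.:
//   new k8s.core.v1.Namespace("longhorn", { ... }, { provider });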

Configure Values.yaml

Edit values.yaml and replace content within {{ }}.

# longhorn/values.yaml at master · longhorn/longhorn
# https://github.com/longhorn/longhorn/blob/master/chart/values.yaml

# persistence:
#   defaultClass: true

ingress:
  ## Set to true to enable ingress record generation
  enabled: true

  host: {{ .Values.host }}

  ## Set this to true in order to enable TLS on the ingress record
  ## A side effect of this will be that the backend service will be connected at port 443
  tls: true

  ## If TLS is set to true, you must declare what secret will store the key/certificate for TLS
  tlsSecret: {{ .Values.ingress.tlsSecret }}

  ## Ingress annotations done as key:value pairs
  ## If you're using kube-lego, you will want to add:
  ## kubernetes.io/tls-acme: true
  ##
  ## For a full list of possible ingress annotations, please see
  ## ref: https://github.com/kubernetes/ingress-nginx/blob/master/docs/annotations.md
  ##
  ## If tls is set to true, annotation ingress.kubernetes.io/secure-backends: "true" will automatically be set
  annotations:
    kubernetes.io/ingress.class: nginx

    # Basic Authentication - NGINX Ingress Controller
    # https://kubernetes.github.io/ingress-nginx/examples/auth/basic/
    # type of authentication
    nginx.ingress.kubernetes.io/auth-type: basic
    # name of the secret that contains the user/password definitions
    nginx.ingress.kubernetes.io/auth-secret: longhorn-auth
    # message to display with an appropriate context why the authentication is required
    nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required - foo'
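Instead of hand-editing the {{ }} placeholders, the host and TLS secret name could also come from Pulumi stack configuration and be merged into the parsed values in main.ts. A minimal sketch, assuming illustrative config keys host and tlsSecret (not part of the original setup):

// Illustrative alternative: fill the values.yaml placeholders from Pulumi config.
//   pulumi config set host longhorn.example.com
//   pulumi config set tlsSecret longhorn-tls
import * as pulumi from "@pulumi/pulumi";

const yaml = require("js-yaml");
const fs = require("fs");

const config = new pulumi.Config();

// js-yaml v3 API, matching main.ts below; v4 renamed safeLoad() to load().
const values: any = yaml.safeLoad(fs.readFileSync("./values.yaml", "utf8"));
values.ingress.host = config.require("host");
values.ingress.tlsSecret = config.require("tlsSecret");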

Generate a basic auth username and password with htpasswd, then base64-encode the result.

$ htpasswd -c ./ing-auth admin

$ cat ./ing-auth | base64

Note the base64 output; it is used as the Secret data in the next step.

manifests

Edit manifests/Secret.yaml and replace content within {{ }}.


---
# Secrets | Kubernetes
# https://kubernetes.io/docs/concepts/configuration/secret/
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-auth
  namespace: longhorn-system
type: Opaque
data:
  # htpasswd -c ./ing-auth admin
  # cat ./ing-auth | base64
  auth: {{ .Values.Secret.auth }}
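Alternatively, the same Secret could be created directly from main.ts instead of a YAML manifest. A minimal sketch; the auth value is a placeholder for the base64-encoded htpasswd output generated above:

// Illustrative alternative to manifests/Secret.yaml.
import * as k8s from "@pulumi/kubernetes";

const longhornAuth = new k8s.core.v1.Secret("longhorn-auth", {
    metadata: {
        name: "longhorn-auth",
        namespace: "longhorn-system",
    },
    type: "Opaque",
    data: {
        // base64-encoded htpasswd output (placeholder).
        auth: "<base64-encoded htpasswd output>",
    },
});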

main.ts

// main.ts

import * as pulumi from "@pulumi/pulumi";

import * as k8s from "@pulumi/kubernetes";

const yaml = require('js-yaml');
const fs = require('fs');

const nameLonghorn = "longhorn"

// kubernetes.core/v1.Namespace | Pulumi
// https://www.pulumi.com/docs/reference/pkg/kubernetes/core/v1/namespace/
const namespaceLonghorn = new k8s.core.v1.Namespace(nameLonghorn, {
    metadata: {
        // Longhorn must be installed into the `longhorn-system` namespace.
        name: "longhorn-system",
    },
})

// Apply the extra manifests (e.g. manifests/Secret.yaml) under ./manifests.
const configGroup = new k8s.yaml.ConfigGroup(nameLonghorn, {
    files: [`./manifests/*.yaml`],
});

// Load the Helm values from values.yaml (js-yaml v3 API; v4 renamed safeLoad() to load()).
const values = yaml.safeLoad(fs.readFileSync("./values.yaml", 'utf8'))

const chartNameLonghorn = "longhorn"

// Deploy the Longhorn Helm chart from the official chart repository.
const chartLonghorn = new k8s.helm.v3.Chart(chartNameLonghorn, {
    chart: chartNameLonghorn,
    version: "1.2.0",
    fetchOpts: {
        repo: "https://charts.longhorn.io",
    },
    namespace: namespaceLonghorn.metadata.name,
    values: values,
})
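Optionally, the namespace name can be exported as a stack output so that pulumi stack output shows where Longhorn was installed (a small illustrative addition, not part of the original main.ts):

// Optional stack output (illustrative).
export const longhornNamespace = namespaceLonghorn.metadata.name;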

Pulumi Up

Run pulumi up to create the namespace and pods.

$ pulumi up

Check the Longhorn pods:

$ kubectl get pods -n longhorn-system
NAME READY STATUS RESTARTS AGE
csi-attacher-75588bff58-hfvtg 1/1 Running 2 (12h ago) 17h
csi-attacher-75588bff58-hnqmn 1/1 Running 2 (12h ago) 17h
csi-attacher-75588bff58-x9vgh 1/1 Running 2 (12h ago) 17h
csi-provisioner-669c8cc698-27k6l 1/1 Running 1 (12h ago) 15h
csi-provisioner-669c8cc698-7dnzp 1/1 Running 1 (12h ago) 15h
csi-provisioner-669c8cc698-qlqwp 1/1 Running 1 (12h ago) 15h
csi-resizer-5c88bfd4cf-7sm2k 1/1 Running 2 (12h ago) 17h
csi-resizer-5c88bfd4cf-nmtwg 1/1 Running 2 (12h ago) 17h
csi-resizer-5c88bfd4cf-tlmlq 1/1 Running 2 (12h ago) 17h
csi-snapshotter-69f8bc8dcf-97skf 1/1 Running 2 (12h ago) 17h
csi-snapshotter-69f8bc8dcf-f2ldw 1/1 Running 2 (12h ago) 17h
csi-snapshotter-69f8bc8dcf-m4t4m 1/1 Running 2 (12h ago) 17h
engine-image-ei-0f7c4304-2d699 1/1 Running 2 (12h ago) 17h
engine-image-ei-0f7c4304-gpgg6 1/1 Running 2 (12h ago) 17h
engine-image-ei-0f7c4304-pgrhn 1/1 Running 2 (12h ago) 17h
instance-manager-e-60ecfa74 1/1 Running 0 12h
instance-manager-e-65e45472 1/1 Running 0 12h
instance-manager-e-80ecbd32 1/1 Running 0 12h
instance-manager-r-191cb864 1/1 Running 0 12h
instance-manager-r-89c3d4af 1/1 Running 0 12h
instance-manager-r-929dd0e9 1/1 Running 0 12h
longhorn-csi-plugin-qrq89 2/2 Running 5 (12h ago) 17h
longhorn-csi-plugin-vkr9p 2/2 Running 8 (12h ago) 17h
longhorn-csi-plugin-xdhm6 2/2 Running 8 (12h ago) 17h
longhorn-driver-deployer-5f47f8c9c-4x5bd 1/1 Running 3 (12h ago) 17h
longhorn-manager-2m5cw 1/1 Running 3 (12h ago) 17h
longhorn-manager-fq5hg 1/1 Running 3 (12h ago) 17h
longhorn-manager-gtkkh 1/1 Running 3 (12h ago) 17h
longhorn-ui-7545689d69-gtb8t 1/1 Running 2 (12h ago) 17h

Then, you can visit the Longhorn frontend UI at https://{{ .Values.host }} (the ingress host configured in values.yaml).

Check the Longhorn StorageClass:

$ kubectl get sc
NAME       PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
longhorn   driver.longhorn.io   Delete          Immediate           true                   9d

Test the Longhorn Storage Class

Create or edit a pvc.yaml manifest file.

# pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-longhorn
  labels:
    app: test
spec:
  storageClassName: longhorn
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Apply the manifest and check the PVC status:

$ kubectl apply -f pvc.yaml

$ kubectl get pvc
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-longhorn   Bound    pvc-ce500448-7ce8-42a4-b9aa-f96bd005fd5b   1Gi        RWO            longhorn       19
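To go one step further and verify the volume is actually writable, a throwaway Pod can mount the claim and write a file to it. A minimal Pulumi sketch; the pod name, image and mount path are illustrative (the same could be done with a plain kubectl manifest):

import * as k8s from "@pulumi/kubernetes";

// Illustrative test Pod that mounts the "test-longhorn" PVC created above.
const testPod = new k8s.core.v1.Pod("test-longhorn-pod", {
    metadata: { name: "test-longhorn-pod" },
    spec: {
        containers: [{
            name: "busybox",
            image: "busybox:1.34",
            command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"],
            volumeMounts: [{ name: "data", mountPath: "/data" }],
        }],
        volumes: [{
            name: "data",
            persistentVolumeClaim: { claimName: "test-longhorn" },
        }],
    },
});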

Pulumi Destroy

Destroy all resources created by Pulumi.

$ pulumi destroy

FAQs

Permission denied causes “Back-off restarting failed container” on Pods attached to Longhorn volumes since Kubernetes (K8S) v1.22+ and Longhorn v1.2.0+

After installing/upgrading to Longhorn v1.2.0, you will probably encounter an fsGroup ineffective issue when creating a new filesystem volume. This is due to the default fs setting being removed from the new upstream CSI external-provisioner. A workaround has been provided (ref: [BUG] Longhorn 1.2.0 - wrong volume permissions inside container / broken fsGroup · Issue #2964 · longhorn/longhorn - https://github.com/longhorn/longhorn/issues/2964#issuecomment-910969543).

Manually add the flag --default-fstype=ext4 to the csi-provisioner deployment in the longhorn-system namespace. It should look like this:

$ kubectl -n longhorn-system edit deployment csi-provisioner
      containers:
      - args:
        - --v=2
        - --csi-address=$(ADDRESS)
        - --timeout=1m50s
        - --leader-election
        - --leader-election-namespace=$(POD_NAMESPACE)
+       - --default-fstype=ext4

You should then recreate the resources (Deployment, StatefulSet, etc.) whose Pods are attached to Longhorn volumes.

See Important Notes | Longhorn | Documentation - https://longhorn.io/docs/1.2.0/deploy/important-notes/ to learn more.

no matches for kind “Ingress” in version “networking.k8s.io/v1beta1” since Kubernetes (K8S) v1.22

Verify that any required CRDs have been created: no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"

The extensions/v1beta1 and networking.k8s.io/v1beta1 API versions of Ingress are no longer served as of v1.22.

  • Migrate manifests and API clients to use the networking.k8s.io/v1 API version, available since v1.19.

  • All existing persisted objects are accessible via the new API.

    Notable changes:

      - spec.backend is renamed to spec.defaultBackend
      - The backend serviceName field is renamed to service.name
      - Numeric backend servicePort fields are renamed to service.port.number
      - String backend servicePort fields are renamed to service.port.name
      - pathType is now required for each specified path. Options are Prefix, Exact, and ImplementationSpecific. To match the undefined v1beta1 behavior, use ImplementationSpecific.
    

See Deprecated API Migration Guide | Kubernetes - https://kubernetes.io/docs/reference/using-api/deprecation-guide/ to learn more.

First, create a new Ingress with the networking.k8s.io/v1 API version.

# Ingress.longhorn-frontend.yaml

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "nginx"

    # Basic Authentication - NGINX Ingress Controller
    # https://kubernetes.github.io/ingress-nginx/examples/auth/basic/
    # type of authentication
    nginx.ingress.kubernetes.io/auth-type: basic
    # name of the secret that contains the user/password definitions
    nginx.ingress.kubernetes.io/auth-secret: longhorn-auth
    # message to display with an appropriate context why the authentication is required
    nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required - foo'

  name: longhorn-frontend
  namespace: longhorn-system
spec:
  rules:
  - host: {{ .Values.host }}
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: longhorn-frontend
            port:
              number: 80
  tls:
  - secretName: {{ .Values.tls.secretName }}
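If you prefer to keep this Ingress under Pulumi management as well, a roughly equivalent TypeScript sketch follows; the host and TLS secret name are placeholders to replace with your own values:

import * as k8s from "@pulumi/kubernetes";

// Illustrative Pulumi equivalent of Ingress.longhorn-frontend.yaml.
const longhornFrontendIngress = new k8s.networking.v1.Ingress("longhorn-frontend", {
    metadata: {
        name: "longhorn-frontend",
        namespace: "longhorn-system",
        annotations: {
            "kubernetes.io/ingress.class": "nginx",
            "nginx.ingress.kubernetes.io/auth-type": "basic",
            "nginx.ingress.kubernetes.io/auth-secret": "longhorn-auth",
            "nginx.ingress.kubernetes.io/auth-realm": "Authentication Required - foo",
        },
    },
    spec: {
        rules: [{
            host: "<your host>",
            http: {
                paths: [{
                    path: "/",
                    pathType: "Prefix",
                    backend: {
                        service: {
                            name: "longhorn-frontend",
                            port: { number: 80 },
                        },
                    },
                }],
            },
        }],
        tls: [{ secretName: "<your tls secret>" }],
    },
});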

See Ingress | Kubernetes - https://kubernetes.io/docs/concepts/services-networking/ingress/ to learn more.

Then, run kubectl apply command.

$ kubectl apply -f Ingress.longhorn-frontend.yaml

$ kubectl get ingress -n longhorn-system
NAME                CLASS    HOSTS                                        ADDRESS        PORTS     AGE
longhorn-frontend   <none>   longhorn-frontend-.examples.cloudolife.com   10.233.43.59   80, 443   17h

Now, you can visit the Longhorn UI at https://{{ .Values.host }}.

no matches for kind “Ingress” in version “extensions/v1beta1” since Kubernetes (K8S) v1.22

Verify that any required CRDs have been created: no matches for kind "Ingress" in version "extensions/v1beta1"

The fix is the same as the above no matches for kind "Ingress" in version "networking.k8s.io/v1beta1" since Kubernetes (K8S) v1.22.

instance-manager requests too much CPU

$ kubectl describe node node1
longhorn-system instance-manager-e-9cc9f3a3 456m (12%) 0 (0%) 0 (0%) 0 (0%) 12d
longhorn-system instance-manager-r-18bb467c 456m (12%) 0 (0%) 0 (0%) 0 (0%) 12d

By default, each instance-manager pod requests 12% of a node's allocatable CPU (456m ≈ 12% of roughly 3.8 allocatable cores in the output above). When a node has only a few CPU cores (<= 4), this easily leads to insufficient CPU and scheduling problems.

Change the guaranteedEngineManagerCPU and guaranteedReplicaManagerCPU values to lower the CPU requests.

# values.yaml

defaultSettings:
  guaranteedEngineManagerCPU: 5  # 5 means 5% of the total CPU on a node will be allocated to each engine manager pod on this node
  guaranteedReplicaManagerCPU: 5 # 5 means 5% of the total CPU on a node will be allocated to each replica manager pod on this node

Or change engineManagerCPURequest and replicaManagerCPURequest on the nodes.longhorn.io resources:

$ kubectl get nodes.longhorn.io -n longhorn-system
NAMESPACE         NAME    READY   ALLOWSCHEDULING   SCHEDULABLE   AGE
longhorn-system   node1   True    true              True          33d
longhorn-system   node2   True    true              True          33d
longhorn-system   node3   True    true              True          33d
# node1.nodes.longhorn.io

spec:
  engineManagerCPURequest: 100m  # 100, 0.1 or 100m
  replicaManagerCPURequest: 100m # 100, 0.1 or 100m

Here is the explanation of guaranteedEngineManagerCPU and guaranteedReplicaManagerCPU from the Longhorn chart:

- variable: defaultSettings.guaranteedEngineManagerCPU
label: Guaranteed Engine Manager CPU
description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each engine manager Pod. For example, 10 means 10% of the total CPU on a node will be allocated to each engine manager pod on this node. This will help maintain engine stability during high node workload.
In order to prevent unexpected volume engine crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
Guaranteed Engine Manager CPU = The estimated max Longhorn volume engine count on a node * 0.1 / The total allocatable CPUs on the node * 100.
The result of above calculation doesn't mean that's the maximum CPU resources the Longhorn workloads require. To fully exploit the Longhorn volume I/O performance, you can allocate/guarantee more CPU resources via this setting.
If it's hard to estimate the usage now, you can leave it with the default value, which is 12%. Then you can tune it when there is no running workload using Longhorn volumes.
WARNING:
- Value 0 means unsetting CPU requests for engine manager pods.
- Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40. And the sum with setting 'Guaranteed Engine Manager CPU' should not be greater than 40.
- One more set of instance manager pods may need to be deployed when the Longhorn system is upgraded. If current available CPUs of the nodes are not enough for the new instance manager pods, you need to detach the volumes using the oldest instance manager pods so that Longhorn can clean up the old pods automatically and release the CPU resources. And the new pods with the latest instance manager image will be launched then.
- This global setting will be ignored for a node if the field \"EngineManagerCPURequest\" on the node is set.
- After this setting is changed, all engine manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
group: "Longhorn Default Settings"
type: int
min: 0
max: 40
default: 12
- variable: defaultSettings.guaranteedReplicaManagerCPU
label: Guaranteed Replica Manager CPU
description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each replica manager Pod. 10 means 10% of the total CPU on a node will be allocated to each replica manager pod on this node. This will help maintain replica stability during high node workload.
In order to prevent unexpected volume replica crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
Guaranteed Replica Manager CPU = The estimated max Longhorn volume replica count on a node * 0.1 / The total allocatable CPUs on the node * 100.
The result of above calculation doesn't mean that's the maximum CPU resources the Longhorn workloads require. To fully exploit the Longhorn volume I/O performance, you can allocate/guarantee more CPU resources via this setting.
If it's hard to estimate the usage now, you can leave it with the default value, which is 12%. Then you can tune it when there is no running workload using Longhorn volumes.
WARNING:
- Value 0 means unsetting CPU requests for replica manager pods.
- Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40. And the sum with setting 'Guaranteed Replica Manager CPU' should not be greater than 40.
- One more set of instance manager pods may need to be deployed when the Longhorn system is upgraded. If current available CPUs of the nodes are not enough for the new instance manager pods, you need to detach the volumes using the oldest instance manager pods so that Longhorn can clean up the old pods automatically and release the CPU resources. And the new pods with the latest instance manager image will be launched then.
- This global setting will be ignored for a node if the field \"ReplicaManagerCPURequest\" on the node is set.
- After this setting is changed, all replica manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
group: "Longhorn Default Settings"
type: int
min: 0
max: 40
default: 12
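As a worked example of the formula above (the node size and engine count here are assumptions for illustration): on a node with 4 allocatable CPUs that is expected to host at most 8 Longhorn volume engines,

Guaranteed Engine Manager CPU = 8 * 0.1 / 4 * 100 = 20

i.e. a setting of 20 reserves 20% of that node's CPU (800m on a 4-core node) for the engine manager pod. The same calculation applies to the replica manager setting, and the sum of the two settings must not exceed 40.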

References

[1] Install Longhorn with Helm - https://longhorn.io/docs/1.0.2/deploy/install/install-with-helm/

[2] Accessing the Longhorn UI - https://longhorn.io/docs/1.0.2/deploy/accessing-the-ui/

[3] Longhorn - Cloud native distributed block storage for Kubernetes - https://longhorn.io/

[4] Open-iSCSI by open-iscsi - http://www.open-iscsi.com/

[5] Helm - https://helm.sh/

[6] Kubernetes - https://kubernetes.io/

[7] Important Notes | Longhorn | Documentation - https://longhorn.io/docs/1.2.0/deploy/important-notes/

[8] Deprecated API Migration Guide | Kubernetes - https://kubernetes.io/docs/reference/using-api/deprecation-guide/

[9] Ingress | Kubernetes - https://kubernetes.io/docs/concepts/services-networking/ingress/

[10] Basic Authentication - NGINX Ingress Controller - https://kubernetes.github.io/ingress-nginx/examples/auth/basic/