[Infrastructure as Code (IaC) Pulumi] Use the Pulumi Kubernetes (K8S) Helm Chart to deploy Rancher Labs Longhorn

Rancher Labs Longhorn

Longhorn is a lightweight, reliable and easy-to-use Cloud native distributed block storage system for Kubernetes.

Longhorn is free, open source software. Originally developed by Rancher Labs, it is now being developed as a sandbox project of the Cloud Native Computing Foundation.

With Longhorn, you can:

  • Use Longhorn volumes as persistent storage for the distributed stateful applications in your Kubernetes cluster

  • Partition your block storage into Longhorn volumes so that you can use Kubernetes volumes with or without a cloud provider

  • Replicate block storage across multiple nodes and data centers to increase availability

  • Store backup data in external storage such as NFS or AWS S3

  • Create cross-cluster disaster recovery volumes so that data from a primary Kubernetes cluster can be quickly recovered from backup in a second Kubernetes cluster

  • Schedule recurring snapshots of a volume, and schedule recurring backups to NFS or S3-compatible secondary storage

  • Restore volumes from backup

  • Upgrade Longhorn without disrupting persistent volumes

  • Manipulate Longhorn resources with kubectl

This article is about how to use Helm to deploy Longhorn on Kubernetes (K8S).

Prerequisites

  • A running Kubernetes (K8S) cluster, with open-iscsi installed on each node (required by Longhorn, see [4]).

  • The Pulumi CLI, Node.js and kubectl available locally, with access to the cluster.

Usage

Pulumi New

Create the workspace directory.

$ mkdir -p col-example-pulumi-typescript-longhorn

$ cd col-example-pulumi-typescript-longhorn

Log in to Pulumi using the local filesystem backend.

$ pulumi login file://.
Logged in to cloudolife as cloudolife (file://.)
or visit https://pulumi.com/docs/reference/install/ for manual instructions and release notes.

Run pulumi new to create a project from the kubernetes-typescript template.

$ pulumi new kubernetes-typescript

The above command creates the following files in the current directory.

$ tree . -L 1
.
├── node_modules/
├── package.json
├── package-lock.json
├── Pulumi.dev.yaml
├── Pulumi.yaml
└── main.ts

Install the js-yaml package to load and parse YAML files.

$ npm i js-yaml

Pulumi Configuration

Configure Kubernetes

By default, Pulumi will look for a kubeconfig file in the following locations, just like kubectl:

  • The environment variable: $KUBECONFIG,

  • Or in current user’s default kubeconfig directory: ~/.kube/config

If the kubeconfig file is not in either of these locations, Pulumi will not find it, and it will fail to authenticate against the cluster. Set one of these locations to a valid kubeconfig file, if you have not done so already.
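If the kubeconfig lives somewhere else, or you want to pin the cluster explicitly, you can also pass it to an explicit Kubernetes provider and attach that provider to the resources in main.ts. A minimal sketch, assuming the kubeconfig path below is adjusted to your environment:

// Optional: explicit Kubernetes provider instead of the ambient kubeconfig.
// The path below is an example; point it at your own kubeconfig file.
import * as fs from "fs";
import * as k8s from "@pulumi/kubernetes";

const provider = new k8s.Provider("k8s-provider", {
    kubeconfig: fs.readFileSync("/path/to/kubeconfig", "utf8"),
});

// Resources created later can opt in to this provider, e.g.:
//   new k8s.core.v1.Namespace("longhorn", { ... }, { provider });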

Configure Values.yaml

Edit values.yaml and replace content within {{ }}.

# longhorn/values.yaml at master · longhorn/longhorn
# https://github.com/longhorn/longhorn/blob/master/chart/values.yaml

# persistence:
#   defaultClass: true

ingress:
  ## Set to true to enable ingress record generation
  enabled: true

  host: {{ .Values.host }}

  ## Set this to true in order to enable TLS on the ingress record
  ## A side effect of this will be that the backend service will be connected at port 443
  tls: true

  ## If TLS is set to true, you must declare what secret will store the key/certificate for TLS
  tlsSecret: {{ .Values.ingress.tlsSecret }}

  ## Ingress annotations done as key:value pairs
  ## If you're using kube-lego, you will want to add:
  ## kubernetes.io/tls-acme: true
  ##
  ## For a full list of possible ingress annotations, please see
  ## ref: https://github.com/kubernetes/ingress-nginx/blob/master/docs/annotations.md
  ##
  ## If tls is set to true, annotation ingress.kubernetes.io/secure-backends: "true" will automatically be set
  annotations:
    kubernetes.io/ingress.class: nginx

    # Basic Authentication - NGINX Ingress Controller
    # https://kubernetes.github.io/ingress-nginx/examples/auth/basic/
    # type of authentication
    nginx.ingress.kubernetes.io/auth-type: basic
    # name of the secret that contains the user/password definitions
    nginx.ingress.kubernetes.io/auth-secret: longhorn-auth
    # message to display with an appropriate context why the authentication is required
    nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required - foo'
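Instead of hand-editing the {{ }} placeholders, the host and TLS secret name could also come from Pulumi stack configuration and be merged into the parsed values in main.ts. A minimal sketch, assuming illustrative config keys host and tlsSecret (not part of the original setup):

// Illustrative alternative: fill the values.yaml placeholders from Pulumi config.
//   pulumi config set host longhorn.example.com
//   pulumi config set tlsSecret longhorn-tls
import * as pulumi from "@pulumi/pulumi";

const yaml = require("js-yaml");
const fs = require("fs");

const config = new pulumi.Config();

// js-yaml v3 API, matching main.ts below; v4 renamed safeLoad() to load().
const values: any = yaml.safeLoad(fs.readFileSync("./values.yaml", "utf8"));
values.ingress.host = config.require("host");
values.ingress.tlsSecret = config.require("tlsSecret");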

Generate a basic auth username and password with htpasswd, then base64-encode the result.

$ htpasswd -c ./ing-auth admin

$ cat ./ing-auth | base64

Note the base64 output; it is used as the Secret data in the next step.

manifests

Edit manifests/Secret.yaml and replace content within {{ }}.


---
# Secrets | Kubernetes
# https://kubernetes.io/docs/concepts/configuration/secret/
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-auth
  namespace: longhorn-system
type: Opaque
data:
  # htpasswd -c ./ing-auth admin
  # cat ./ing-auth | base64
  auth: {{ .Values.Secret.auth }}
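Alternatively, the same Secret could be created directly from main.ts instead of a YAML manifest. A minimal sketch; the auth value is a placeholder for the base64-encoded htpasswd output generated above:

// Illustrative alternative to manifests/Secret.yaml.
import * as k8s from "@pulumi/kubernetes";

const longhornAuth = new k8s.core.v1.Secret("longhorn-auth", {
    metadata: {
        name: "longhorn-auth",
        namespace: "longhorn-system",
    },
    type: "Opaque",
    data: {
        // base64-encoded htpasswd output (placeholder).
        auth: "<base64-encoded htpasswd output>",
    },
});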

main.ts

// main.ts

import * as pulumi from "@pulumi/pulumi";

import * as k8s from "@pulumi/kubernetes";

const yaml = require('js-yaml');
const fs = require('fs');

const nameLonghorn = "longhorn"

// kubernetes.core/v1.Namespace | Pulumi
// https://www.pulumi.com/docs/reference/pkg/kubernetes/core/v1/namespace/
const namespaceLonghorn = new k8s.core.v1.Namespace(nameLonghorn, {
    metadata: {
        // Longhorn must be installed into the `longhorn-system` namespace.
        name: "longhorn-system",
    },
})

// Apply the extra manifests (e.g. manifests/Secret.yaml) under ./manifests.
const configGroup = new k8s.yaml.ConfigGroup(nameLonghorn, {
    files: [`./manifests/*.yaml`],
});

// Load the Helm values from values.yaml (js-yaml v3 API; v4 renamed safeLoad() to load()).
const values = yaml.safeLoad(fs.readFileSync("./values.yaml", 'utf8'))

const chartNameLonghorn = "longhorn"

// Deploy the Longhorn Helm chart from the official chart repository.
const chartLonghorn = new k8s.helm.v3.Chart(chartNameLonghorn, {
    chart: chartNameLonghorn,
    version: "1.2.0",
    fetchOpts: {
        repo: "https://charts.longhorn.io",
    },
    namespace: namespaceLonghorn.metadata.name,
    values: values,
})
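Optionally, the namespace name can be exported as a stack output so that pulumi stack output shows where Longhorn was installed (a small illustrative addition, not part of the original main.ts):

// Optional stack output (illustrative).
export const longhornNamespace = namespaceLonghorn.metadata.name;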

Pulumi Up

Run pulumi up to create the namespace and pods.

$ pulumi up

Check the Longhorn pods:

$ kubectl get pods -n longhorn-system
NAME READY STATUS RESTARTS AGE
csi-attacher-75588bff58-hfvtg 1/1 Running 2 (12h ago) 17h
csi-attacher-75588bff58-hnqmn 1/1 Running 2 (12h ago) 17h
csi-attacher-75588bff58-x9vgh 1/1 Running 2 (12h ago) 17h
csi-provisioner-669c8cc698-27k6l 1/1 Running 1 (12h ago) 15h
csi-provisioner-669c8cc698-7dnzp 1/1 Running 1 (12h ago) 15h
csi-provisioner-669c8cc698-qlqwp 1/1 Running 1 (12h ago) 15h
csi-resizer-5c88bfd4cf-7sm2k 1/1 Running 2 (12h ago) 17h
csi-resizer-5c88bfd4cf-nmtwg 1/1 Running 2 (12h ago) 17h
csi-resizer-5c88bfd4cf-tlmlq 1/1 Running 2 (12h ago) 17h
csi-snapshotter-69f8bc8dcf-97skf 1/1 Running 2 (12h ago) 17h
csi-snapshotter-69f8bc8dcf-f2ldw 1/1 Running 2 (12h ago) 17h
csi-snapshotter-69f8bc8dcf-m4t4m 1/1 Running 2 (12h ago) 17h
engine-image-ei-0f7c4304-2d699 1/1 Running 2 (12h ago) 17h
engine-image-ei-0f7c4304-gpgg6 1/1 Running 2 (12h ago) 17h
engine-image-ei-0f7c4304-pgrhn 1/1 Running 2 (12h ago) 17h
instance-manager-e-60ecfa74 1/1 Running 0 12h
instance-manager-e-65e45472 1/1 Running 0 12h
instance-manager-e-80ecbd32 1/1 Running 0 12h
instance-manager-r-191cb864 1/1 Running 0 12h
instance-manager-r-89c3d4af 1/1 Running 0 12h
instance-manager-r-929dd0e9 1/1 Running 0 12h
longhorn-csi-plugin-qrq89 2/2 Running 5 (12h ago) 17h
longhorn-csi-plugin-vkr9p 2/2 Running 8 (12h ago) 17h
longhorn-csi-plugin-xdhm6 2/2 Running 8 (12h ago) 17h
longhorn-driver-deployer-5f47f8c9c-4x5bd 1/1 Running 3 (12h ago) 17h
longhorn-manager-2m5cw 1/1 Running 3 (12h ago) 17h
longhorn-manager-fq5hg 1/1 Running 3 (12h ago) 17h
longhorn-manager-gtkkh 1/1 Running 3 (12h ago) 17h
longhorn-ui-7545689d69-gtb8t 1/1 Running 2 (12h ago) 17h

Then, you can visit the Longhorn frontend UI at https://{{ .Values.host }} (the ingress host configured in values.yaml).

Check the Longhorn StorageClass:

$ kubectl get sc
NAME       PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
longhorn   driver.longhorn.io   Delete          Immediate           true                   9d

Test the Longhorn Storage Class

Create or edit a pvc.yaml manifest file.

# pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-longhorn
  labels:
    app: test
spec:
  storageClassName: longhorn
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Apply the manifest and check the PVC status:

$ kubectl apply -f pvc.yaml

$ kubectl get pvc
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-longhorn   Bound    pvc-ce500448-7ce8-42a4-b9aa-f96bd005fd5b   1Gi        RWO            longhorn       19
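To go one step further and verify the volume is actually writable, a throwaway Pod can mount the claim and write a file to it. A minimal Pulumi sketch; the pod name, image and mount path are illustrative (the same could be done with a plain kubectl manifest):

import * as k8s from "@pulumi/kubernetes";

// Illustrative test Pod that mounts the "test-longhorn" PVC created above.
const testPod = new k8s.core.v1.Pod("test-longhorn-pod", {
    metadata: { name: "test-longhorn-pod" },
    spec: {
        containers: [{
            name: "busybox",
            image: "busybox:1.34",
            command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"],
            volumeMounts: [{ name: "data", mountPath: "/data" }],
        }],
        volumes: [{
            name: "data",
            persistentVolumeClaim: { claimName: "test-longhorn" },
        }],
    },
});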

Pulumi Destroy

Destroy all resources created by Pulumi.

$ pulumi destroy

FAQs

Permission denied causes “Back-off restarting failed container” on Pods attached to Longhorn volumes since Kubernetes (K8S) v1.22+ and Longhorn v1.2.0+

After installing/upgrading to Longhorn v1.2.0, you will probably encounter an fsGroup ineffective issue when creating a new filesystem volume. This is due to the default fs setting being removed from the new upstream CSI external-provisioner. A workaround has been provided (ref: [BUG] Longhorn 1.2.0 - wrong volume permissions inside container / broken fsGroup · Issue #2964 · longhorn/longhorn - https://github.com/longhorn/longhorn/issues/2964#issuecomment-910969543).

Manually add the flag --default-fstype=ext4 to the csi-provisioner deployment in the longhorn-system namespace. It should look like this:

$ kubectl -n longhorn-system edit deployment csi-provisioner
      containers:
      - args:
        - --v=2
        - --csi-address=$(ADDRESS)
        - --timeout=1m50s
        - --leader-election
        - --leader-election-namespace=$(POD_NAMESPACE)
+       - --default-fstype=ext4

You should then recreate the resources (Deployment, StatefulSet, etc.) whose Pods are attached to Longhorn volumes.

See Important Notes | Longhorn | Documentation - https://longhorn.io/docs/1.2.0/deploy/important-notes/ to learn more.

no matches for kind “Ingress” in version “networking.k8s.io/v1beta1” since Kubernetes (K8S) v1.22

Verify that any required CRDs have been created: no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"

The extensions/v1beta1 and networking.k8s.io/v1beta1 API versions of Ingress are no longer served as of v1.22.

  • Migrate manifests and API clients to use the networking.k8s.io/v1 API version, available since v1.19.

  • All existing persisted objects are accessible via the new API.

    Notable changes:

      - spec.backend is renamed to spec.defaultBackend
      - The backend serviceName field is renamed to service.name
      - Numeric backend servicePort fields are renamed to service.port.number
      - String backend servicePort fields are renamed to service.port.name
      - pathType is now required for each specified path. Options are Prefix, Exact, and ImplementationSpecific. To match the undefined v1beta1 behavior, use ImplementationSpecific.
    

See Deprecated API Migration Guide | Kubernetes - https://kubernetes.io/docs/reference/using-api/deprecation-guide/ to learn more.

First, create a new Ingress with the networking.k8s.io/v1 API version.

# Ingress.longhorn-frontend.yaml

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "nginx"

    # Basic Authentication - NGINX Ingress Controller
    # https://kubernetes.github.io/ingress-nginx/examples/auth/basic/
    # type of authentication
    nginx.ingress.kubernetes.io/auth-type: basic
    # name of the secret that contains the user/password definitions
    nginx.ingress.kubernetes.io/auth-secret: longhorn-auth
    # message to display with an appropriate context why the authentication is required
    nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required - foo'

  name: longhorn-frontend
  namespace: longhorn-system
spec:
  rules:
  - host: {{ .Values.host }}
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: longhorn-frontend
            port:
              number: 80
  tls:
  - secretName: {{ .Values.tls.secretName }}
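If you prefer to keep this Ingress under Pulumi management as well, a roughly equivalent TypeScript sketch follows; the host and TLS secret name are placeholders to replace with your own values:

import * as k8s from "@pulumi/kubernetes";

// Illustrative Pulumi equivalent of Ingress.longhorn-frontend.yaml.
const longhornFrontendIngress = new k8s.networking.v1.Ingress("longhorn-frontend", {
    metadata: {
        name: "longhorn-frontend",
        namespace: "longhorn-system",
        annotations: {
            "kubernetes.io/ingress.class": "nginx",
            "nginx.ingress.kubernetes.io/auth-type": "basic",
            "nginx.ingress.kubernetes.io/auth-secret": "longhorn-auth",
            "nginx.ingress.kubernetes.io/auth-realm": "Authentication Required - foo",
        },
    },
    spec: {
        rules: [{
            host: "<your host>",
            http: {
                paths: [{
                    path: "/",
                    pathType: "Prefix",
                    backend: {
                        service: {
                            name: "longhorn-frontend",
                            port: { number: 80 },
                        },
                    },
                }],
            },
        }],
        tls: [{ secretName: "<your tls secret>" }],
    },
});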

See Ingress | Kubernetes - https://kubernetes.io/docs/concepts/services-networking/ingress/ to learn more.

Then, run kubectl apply command.

$ kubectl apply -f Ingress.longhorn-frontend.yaml

$ kubectl get ingress -n longhorn-system
NAME                CLASS    HOSTS                                        ADDRESS        PORTS     AGE
longhorn-frontend   <none>   longhorn-frontend-.examples.cloudolife.com   10.233.43.59   80, 443   17h

Now, you can visit the Longhorn UI at https://{{ .Values.host }}.

no matches for kind “Ingress” in version “extensions/v1beta1” since Kubernetes (K8S) v1.22

Verify that any required CRDs have been created: no matches for kind "Ingress" in version "extensions/v1beta1"

The fix is the same as the above no matches for kind "Ingress" in version "networking.k8s.io/v1beta1" since Kubernetes (K8S) v1.22.

instance-manager requests too much CPU

$ kubectl describe node node1
longhorn-system instance-manager-e-9cc9f3a3 456m (12%) 0 (0%) 0 (0%) 0 (0%) 12d
longhorn-system instance-manager-r-18bb467c 456m (12%) 0 (0%) 0 (0%) 0 (0%) 12d

By default, each instance-manager pod requests 12% of a node's allocatable CPU (456m ≈ 12% of roughly 3.8 allocatable cores in the output above). When a node has only a few CPU cores (<= 4), this easily leads to insufficient CPU and scheduling problems.

Change the guaranteedEngineManagerCPU and guaranteedReplicaManagerCPU values to lower the CPU requests.

# values.yaml

defaultSettings:
  guaranteedEngineManagerCPU: 5  # 5 means 5% of the total CPU on a node will be allocated to each engine manager pod on this node
  guaranteedReplicaManagerCPU: 5 # 5 means 5% of the total CPU on a node will be allocated to each replica manager pod on this node

Or change engineManagerCPURequest and replicaManagerCPURequest on the nodes.longhorn.io resources:

$ kubectl get nodes.longhorn.io -n longhorn-system
NAMESPACE         NAME    READY   ALLOWSCHEDULING   SCHEDULABLE   AGE
longhorn-system   node1   True    true              True          33d
longhorn-system   node2   True    true              True          33d
longhorn-system   node3   True    true              True          33d
# node1.nodes.longhorn.io

spec:
  engineManagerCPURequest: 100m  # 100, 0.1 or 100m
  replicaManagerCPURequest: 100m # 100, 0.1 or 100m

Here is the explanation of guaranteedEngineManagerCPU and guaranteedReplicaManagerCPU from the Longhorn chart:

- variable: defaultSettings.guaranteedEngineManagerCPU
label: Guaranteed Engine Manager CPU
description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each engine manager Pod. For example, 10 means 10% of the total CPU on a node will be allocated to each engine manager pod on this node. This will help maintain engine stability during high node workload.
In order to prevent unexpected volume engine crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
Guaranteed Engine Manager CPU = The estimated max Longhorn volume engine count on a node * 0.1 / The total allocatable CPUs on the node * 100.
The result of above calculation doesn't mean that's the maximum CPU resources the Longhorn workloads require. To fully exploit the Longhorn volume I/O performance, you can allocate/guarantee more CPU resources via this setting.
If it's hard to estimate the usage now, you can leave it with the default value, which is 12%. Then you can tune it when there is no running workload using Longhorn volumes.
WARNING:
- Value 0 means unsetting CPU requests for engine manager pods.
- Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40. And the sum with setting 'Guaranteed Engine Manager CPU' should not be greater than 40.
- One more set of instance manager pods may need to be deployed when the Longhorn system is upgraded. If current available CPUs of the nodes are not enough for the new instance manager pods, you need to detach the volumes using the oldest instance manager pods so that Longhorn can clean up the old pods automatically and release the CPU resources. And the new pods with the latest instance manager image will be launched then.
- This global setting will be ignored for a node if the field \"EngineManagerCPURequest\" on the node is set.
- After this setting is changed, all engine manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
group: "Longhorn Default Settings"
type: int
min: 0
max: 40
default: 12
- variable: defaultSettings.guaranteedReplicaManagerCPU
label: Guaranteed Replica Manager CPU
description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each replica manager Pod. 10 means 10% of the total CPU on a node will be allocated to each replica manager pod on this node. This will help maintain replica stability during high node workload.
In order to prevent unexpected volume replica crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
Guaranteed Replica Manager CPU = The estimated max Longhorn volume replica count on a node * 0.1 / The total allocatable CPUs on the node * 100.
The result of above calculation doesn't mean that's the maximum CPU resources the Longhorn workloads require. To fully exploit the Longhorn volume I/O performance, you can allocate/guarantee more CPU resources via this setting.
If it's hard to estimate the usage now, you can leave it with the default value, which is 12%. Then you can tune it when there is no running workload using Longhorn volumes.
WARNING:
- Value 0 means unsetting CPU requests for replica manager pods.
- Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40. And the sum with setting 'Guaranteed Replica Manager CPU' should not be greater than 40.
- One more set of instance manager pods may need to be deployed when the Longhorn system is upgraded. If current available CPUs of the nodes are not enough for the new instance manager pods, you need to detach the volumes using the oldest instance manager pods so that Longhorn can clean up the old pods automatically and release the CPU resources. And the new pods with the latest instance manager image will be launched then.
- This global setting will be ignored for a node if the field \"ReplicaManagerCPURequest\" on the node is set.
- After this setting is changed, all replica manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
group: "Longhorn Default Settings"
type: int
min: 0
max: 40
default: 12
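As a worked example of the formula above (the node size and engine count here are assumptions for illustration): on a node with 4 allocatable CPUs that is expected to host at most 8 Longhorn volume engines,

Guaranteed Engine Manager CPU = 8 * 0.1 / 4 * 100 = 20

i.e. a setting of 20 reserves 20% of that node's CPU (800m on a 4-core node) for the engine manager pod. The same calculation applies to the replica manager setting, and the sum of the two settings must not exceed 40.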

References

[1] Install Longhorn with Helm - https://longhorn.io/docs/1.0.2/deploy/install/install-with-helm/

[2] Accessing the Longhorn UI - https://longhorn.io/docs/1.0.2/deploy/accessing-the-ui/

[3] Longhorn - Cloud native distributed block storage for Kubernetes - https://longhorn.io/

[4] Open-iSCSI by open-iscsi - http://www.open-iscsi.com/

[5] Helm - https://helm.sh/

[6] Kubernetes - https://kubernetes.io/

[7] Important Notes | Longhorn | Documentation - https://longhorn.io/docs/1.2.0/deploy/important-notes/

[8] Deprecated API Migration Guide | Kubernetes - https://kubernetes.io/docs/reference/using-api/deprecation-guide/

[9] Ingress | Kubernetes - https://kubernetes.io/docs/concepts/services-networking/ingress/

[10] Basic Authentication - NGINX Ingress Controller - https://kubernetes.github.io/ingress-nginx/examples/auth/basic/