Can't launch Zeebe cluster with k8s

Hi there.

I’m trying to deploy Zeebe (0.20.0) to my k8s cluster with Helm. The brokers appear to start, but I’ve run into a couple of problems.

First, the main node (id=0) logs an endless stream of warnings like this:

12:37:43.812 [] [raft-server-raft-atomix-partition-1] WARN  io.atomix.protocols.raft.roles.FollowerRole - RaftServer{raft-atomix-partition-1}{role=FOLLOWER} - io.atomix.cluster.messaging.MessagingException$NoRemoteHandler: No remote message handler registered for this message

Does anyone know what I have to handle here? :slight_smile:

Next, the readiness endpoint (/ready) returns HTTP status 503, so the pods remain unavailable. It’s worth noting that zbctl shows the cluster as assembled:

root@zeebe-0:/usr/local/zeebe# ./bin/zbctl status
Cluster size: 3
Partitions count: 1
Replication factor: 1
Brokers:
  Broker 0 - zeebe-0.zeebe.dev.svc.cluster.local:26501
  Broker 1 - zeebe-1.zeebe.dev.svc.cluster.local:26501
  Broker 2 - zeebe-2.zeebe.dev.svc.cluster.local:26501

I spent a couple of days struggling with the problem and didn’t succeed. I need your help, guys!

Here is my deployment configuration:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: "{{.Values.global.zeebe.serviceName}}"
  namespace: "{{.Values.env}}"
spec:
  selector:
    matchLabels:
      app: "{{.Values.global.zeebe.serviceName}}"
  serviceName: "{{.Values.global.zeebe.serviceName}}"
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: "{{.Values.global.zeebe.serviceName}}"
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: "{{.Values.global.zeebe.serviceName}}"
        image: {{.Values.dockerImage}}
        env:
        - name: ZEEBE_LOG_LEVEL
          value: debug
        - name: ZEEBE_PARTITIONS_COUNT
          value: "1"
        - name: ZEEBE_CLUSTER_SIZE
          value: "3"
        - name: ZEEBE_REPLICATION_FACTOR
          value: "1"
        - name: JAVA_TOOL_OPTIONS
          value: |
            -XX:+UnlockExperimentalVMOptions
            -XX:+UseCGroupMemoryLimitForHeap
            -Xms1024m
            -Xmx1024m
        ports:
        - containerPort: 9600
          name: http
        - containerPort: 26500
          name: gateway
        - containerPort: 26501
          name: command
        - containerPort: 26502
          name: internal
        readinessProbe:
          httpGet:
            path: /ready
            port: http
          initialDelaySeconds: 20
          periodSeconds: 5
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 1000m
            memory: 2Gi
        volumeMounts:
        - name: config
          mountPath: /usr/local/zeebe/conf/zeebe.cfg.toml
          subPath: zeebe.cfg.toml
        - name: config
          mountPath: /usr/local/bin/startup.sh
          subPath: startup.sh
        - name: data
          mountPath: /usr/local/zeebe/data
      volumes:
      - name: config
        configMap:
          name: zeebe-config
          defaultMode: 0744
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi

startup.sh:

    #!/bin/bash -xeu

    configFile=/usr/local/zeebe/conf/zeebe.cfg.toml
    export ZEEBE_HOST=$(hostname -f)
    export ZEEBE_NODE_ID="${HOSTNAME##*-}"
    
    # We need to specify all brokers as contact points for partition healing to work correctly
    # https://github.com/zeebe-io/zeebe/issues/2684
    ZEEBE_CONTACT_POINTS=${HOSTNAME::-1}0.$(hostname -d):26502
    for (( i=1; i<$ZEEBE_CLUSTER_SIZE; i++ ))
    do
        ZEEBE_CONTACT_POINTS="${ZEEBE_CONTACT_POINTS},${HOSTNAME::-1}$i.$(hostname -d):26502"
    done
    export ZEEBE_CONTACT_POINTS="${ZEEBE_CONTACT_POINTS}"
    
    exec /usr/local/zeebe/bin/broker
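
A side note on the loop above: `${HOSTNAME::-1}` strips exactly one character, so it only works while the pod ordinal is a single digit. Below is a minimal sketch of the same contact-point construction as a standalone function. The function name `build_contact_points` and its arguments are hypothetical and used only for illustration. It strips the ordinal with `${name%-*}` so that multi-digit ordinals are handled too.

```shell
# Build a comma-separated list of contact points for a StatefulSet.
# Hypothetical helper: arguments are pod hostname, DNS domain,
# cluster size and (optionally) the internal port.
build_contact_points() {
  local pod_name domain size port base points i
  pod_name="$1"; domain="$2"; size="$3"; port="${4:-26502}"
  # Remove the trailing "-<ordinal>", e.g. "zeebe-12" becomes "zeebe"
  base="${pod_name%-*}"
  points=""
  i=0
  while [ "$i" -lt "$size" ]; do
    # Prefix a comma only when the list is already non-empty
    points="${points}${points:+,}${base}-${i}.${domain}:${port}"
    i=$((i + 1))
  done
  printf '%s\n' "$points"
}

build_contact_points "zeebe-0" "zeebe.dev.svc.cluster.local" 3
```

In a startup script this could be called as `build_contact_points "$HOSTNAME" "$(hostname -d)" "$ZEEBE_CLUSTER_SIZE"`, assuming those values are available in the container as they are in the script above.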

zeebe.cfg.toml:

    [threads]
    cpuThreadCount = 1
    
    [metrics]
    host = "0.0.0.0"

    [[exporters]]
    id = "hazelcast"
    className = "org.project.HazelcastExporter"

      [exporters.args]
      host = "hazelcast.dev.svc.cluster.local"
      enabledValueTypes = "JOB,WORKFLOW_INSTANCE,DEPLOYMENT,INCIDENT,TIMER,VARIABLE,MESSAGE,MESSAGE_SUBSCRIPTION,MESSAGE_START_EVENT_SUBSCRIPTION"

Hi there,
Can you try our Helm charts at helm.zeebe.io?
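
For anyone following along, installing from that repository might look roughly like the sketch below. It uses Helm 3 syntax. The `https://` scheme, the repo alias `zeebe`, the chart name `zeebe/zeebe-cluster` and the release name `my-zeebe` are all assumptions, so check the repository index for the actual chart names before running it.

```shell
# Hypothetical wrapper around the usual Helm workflow; nothing runs
# until the function is called.
install_zeebe_chart() {
  local release
  release="${1:-my-zeebe}"   # release name is an illustrative default
  # Register the chart repository mentioned above (URL scheme assumed)
  helm repo add zeebe https://helm.zeebe.io
  # Refresh the local chart index
  helm repo update
  # Install the chart; the chart name is an assumption
  helm install "$release" zeebe/zeebe-cluster
}
```

Calling `install_zeebe_chart` with no arguments would install the release under the default name, and `helm search repo zeebe` lists the charts the repository actually provides.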

Let me know how that goes.

Wow! It’s a miracle!

I couldn’t work out what was wrong with my config, but your example works like a charm :slight_smile:

Thank you so much!

Happy to hear that, @IcyEagle. If you run into any issues with the charts, please let me know. Can you please mark the question as solved?

Cheers