Cluster multiple hosts without k8s/swarm

Hi,

I am trying to start a cluster on multiple hosts without swarm or k8s.
So I am using the default zeebe.cfg.toml from the zeebe-docker-compose/cluster example.

To start I have created three simple docker run commands to start the broker on each host:

HOST1 (10.0.2.4):

docker run -d --rm --name broker1 \
  -p 26500:26500 -p 26502:26502 -p 9600:9600 \
  -e ZEEBE_HOST=10.0.2.4 \
  -e ZEEBE_EMBED_GATEWAY=false \
  -e ZEEBE_GATEWAY_CLUSTER_HOST=10.0.2.4 \
  -e ZEEBE_LOG_LEVEL=debug \
  -e ZEEBE_NODE_ID=0 \
  -e ZEEBE_PARTITIONS_COUNT=2 \
  -e ZEEBE_REPLICATION_FACTOR=3 \
  -e ZEEBE_CLUSTER_SIZE=3 \
  -e ZEEBE_CONTACT_POINTS=10.0.2.4:26502,10.0.2.5:26502,10.0.2.6:26502 \
  -v /var/zeebe-docker-compose/cluster0/zeebe.cfg.toml:/usr/local/zeebe/conf/zeebe.cfg.toml \
  camunda/zeebe:0.22.1

HOST2 (10.0.2.6):

docker run -d --rm --name broker2 \
  -p 26500:26500 -p 26502:26502 -p 9600:9600 \
  -e ZEEBE_HOST=10.0.2.6 \
  -e ZEEBE_EMBED_GATEWAY=false \
  -e ZEEBE_GATEWAY_CLUSTER_HOST=10.0.2.6 \
  -e ZEEBE_LOG_LEVEL=debug \
  -e ZEEBE_NODE_ID=1 \
  -e ZEEBE_PARTITIONS_COUNT=2 \
  -e ZEEBE_REPLICATION_FACTOR=3 \
  -e ZEEBE_CLUSTER_SIZE=3 \
  -e ZEEBE_CONTACT_POINTS=10.0.2.4:26502,10.0.2.5:26502,10.0.2.6:26502 \
  -v /var/zeebe-docker-compose/cluster1/zeebe.cfg.toml:/usr/local/zeebe/conf/zeebe.cfg.toml \
  camunda/zeebe:0.22.1

HOST3 (10.0.2.5):

docker run -d --rm --name broker3 \
  -p 26500:26500 -p 26502:26502 -p 9600:9600 \
  -e ZEEBE_HOST=10.0.2.5 \
  -e ZEEBE_EMBED_GATEWAY=false \
  -e ZEEBE_GATEWAY_CLUSTER_HOST=10.0.2.5 \
  -e ZEEBE_LOG_LEVEL=debug \
  -e ZEEBE_NODE_ID=2 \
  -e ZEEBE_PARTITIONS_COUNT=2 \
  -e ZEEBE_REPLICATION_FACTOR=3 \
  -e ZEEBE_CLUSTER_SIZE=3 \
  -e ZEEBE_CONTACT_POINTS=10.0.2.4:26502,10.0.2.5:26502,10.0.2.6:26502 \
  -v /var/zeebe-docker-compose/cluster2/zeebe.cfg.toml:/usr/local/zeebe/conf/zeebe.cfg.toml \
  camunda/zeebe:0.22.1
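
Since the three commands differ only in the host IP and the node ID, one way to keep them consistent is to generate them from a single host list. The following is a minimal sketch, not an official Zeebe tool: the function names contact_points and broker_cmd are made up for illustration, it only prints the command (a dry run) instead of starting a container, and it covers only a subset of the options shown above. The environment variables and the image tag are the ones from the commands above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: join host IPs into a contact-point list for one port.
contact_points() {
  local port="$1"; shift
  local out="" ip
  for ip in "$@"; do
    out="${out:+$out,}${ip}:${port}"
  done
  printf '%s\n' "$out"
}

# Hypothetical helper: print (not run) the docker command for one broker.
# Arguments: node id, this host's IP, then the IPs of all hosts in the cluster.
broker_cmd() {
  local node_id="$1" host_ip="$2"; shift 2
  printf 'docker run -d --rm --name broker%s ' "$((node_id + 1))"
  printf -- '-p 26500:26500 -p 26502:26502 -p 9600:9600 '
  printf -- '-e ZEEBE_HOST=%s -e ZEEBE_NODE_ID=%s ' "$host_ip" "$node_id"
  printf -- '-e ZEEBE_CLUSTER_SIZE=%s ' "$#"
  printf -- '-e ZEEBE_CONTACT_POINTS=%s ' "$(contact_points 26502 "$@")"
  printf 'camunda/zeebe:0.22.1\n'
}

# Dry run for the first host.
broker_cmd 0 10.0.2.4 10.0.2.4 10.0.2.5 10.0.2.6
```

Calling broker_cmd with node ID 1 and 10.0.2.6, or node ID 2 and 10.0.2.5, prints the matching commands for the other two hosts.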

Logs from the brokers:

2020-05-18 15:26:56.512 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-0 [6/10]: cluster services started in 18814 ms
2020-05-18 15:26:56.513 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-0 [7/10]: topology manager
2020-05-18 15:26:56.525 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-0 [7/10]: topology manager started in 12 ms
2020-05-18 15:26:56.527 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-0 [8/10]: metric's server
2020-05-18 15:26:56.757 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-0 [8/10]: metric's server failed with unexpected exception.
java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method) ~[?:?]
        at sun.nio.ch.Net.bind(Unknown Source) ~[?:?]
        at sun.nio.ch.Net.bind(Unknown Source) ~[?:?]

2020-05-21 12:51:00.710 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-1 [6/10]: cluster services started in 7868 ms
2020-05-21 12:51:00.711 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-1 [7/10]: topology manager
2020-05-21 12:51:00.729 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-1 [7/10]: topology manager started in 17 ms
2020-05-21 12:51:00.731 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-1 [8/10]: metric's server
2020-05-21 12:51:00.766 [Broker-1-TopologyManager] [Broker-1-zb-actors-0] DEBUG io.zeebe.broker.clustering - Received metadata change for 0, partitions {} terms {}
2020-05-21 12:51:00.837 [Broker-1-TopologyManager] [Broker-1-zb-actors-0] DEBUG io.zeebe.broker.clustering - Received metadata change for 2, partitions {} terms {}
2020-05-21 12:51:00.892 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-1 [8/10]: metric's server failed with unexpected exception.
java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method) ~[?:?]
        at sun.nio.ch.Net.bind(Unknown Source) ~[?:?]
        at sun.nio.ch.Net.bind(Unknown Source) ~[?:?]

2020-05-21 21:41:08.222 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-2 [2/10]: membership and replication protocol started in 17575 ms
2020-05-21 21:41:08.223 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-2 [3/10]: command api transport
2020-05-21 21:41:09.419 [] [main] DEBUG io.zeebe.broker.system - Bound command API to 10.0.2.5:26501
2020-05-21 21:41:09.500 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-2 [3/10]: command api transport started in 1276 ms
2020-05-21 21:41:09.502 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-2 [4/10]: command api handler
2020-05-21 21:41:09.683 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-2 [4/10]: command api handler started in 178 ms
2020-05-21 21:41:09.684 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-2 [5/10]: subscription api
2020-05-21 21:41:09.846 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-2 [5/10]: subscription api started in 157 ms
2020-05-21 21:41:09.847 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-2 [6/10]: cluster services
2020-05-21 21:41:19.681 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-2 [6/10]: cluster services started in 9833 ms
2020-05-21 21:41:19.684 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-2 [7/10]: topology manager
2020-05-21 21:41:19.700 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-2 [7/10]: topology manager started in 16 ms
2020-05-21 21:41:19.704 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-2 [8/10]: metric's server
2020-05-21 21:41:19.743 [Broker-2-TopologyManager] [Broker-2-zb-actors-1] DEBUG io.zeebe.broker.clustering - Received metadata change for 0, partitions {} terms {}
2020-05-21 21:41:19.949 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-2 [8/10]: metric's server failed with unexpected exception.
java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method) ~[?:?]
        at sun.nio.ch.Net.bind(Unknown Source) ~[?:?]
        at sun.nio.ch.Net.bind(Unknown Source) ~[?:?]
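
A note on the exception above: "Cannot assign requested address" is generally what a process gets when it tries to bind to an IP address that is not assigned to any interface in its own network namespace. Inside a container on a default bridge network, the host's 10.0.2.x address would normally not be one of those addresses. Whether that is the cause here cannot be confirmed from the logs alone, so treat the following as a sketch of the check, with a made-up helper name (is_bindable) and illustrative container addresses.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if the wanted bind address is the wildcard
# or appears in the list of local addresses passed as remaining arguments.
is_bindable() {
  local want="$1"; shift
  local addr
  [ "$want" = "0.0.0.0" ] && return 0   # the wildcard address can always be bound
  for addr in "$@"; do
    [ "$addr" = "$want" ] && return 0
  done
  return 1
}

# Addresses typical of a container on a default bridge network
# (illustrative values, not taken from the brokers above).
if is_bindable 10.0.2.4 127.0.0.1 172.17.0.2; then
  echo "10.0.2.4 is a local address"
else
  echo "10.0.2.4 is not a local address; binding to it would fail"
fi
```

In a real container the address list would have to come from the container itself, for example from the output of ip -4 -o addr show, assuming the image ships that tool.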

I tried disabling the metrics server in zeebe.cfg.toml:

# Controls if the prometheus metrics should be exported over HTTP
# This setting can also be overridden using the environment variable ZEEBE_METRICS_HTTP_SERVER.
enableHttpServer = false

# Host to export metrics on, defaults to network.host
host = "127.0.0.1"

This did not work.

What am I missing here?

Maarten

You might have a look at the zeebe-docker-compose configs. Check previous tags for the toml version. The 0.23.1 version (current) uses the new yaml config.

@maartend the main reason people use Kubernetes or Swarm is that when you have multiple Docker daemons on different hosts, you are in charge of the network layer that lets these services communicate with each other. As far as I know, the virtual subnets (10.0.x) created by the Docker daemons on different hosts cannot communicate with each other.

@jwulf That was exactly what I tried to do, so I started working with the cluster docker-compose file.

Now I tested it once more, and again the cluster is not coming up across multiple hosts. I can’t use swarm/k8s (overlay networks).

So this morning I took the latest cluster compose and changed it per node:

HOST1

networks:
  zeebe_network:
    driver: bridge

services:
  node0:
    container_name: zeebe_broker_1
    image: camunda/zeebe:0.23.1
    environment:
      - ZEEBE_LOG_LEVEL=debug
      - ZEEBE_NODE_ID=0
      - ZEEBE_PARTITIONS_COUNT=2
      - ZEEBE_REPLICATION_FACTOR=3
      - ZEEBE_CLUSTER_SIZE=3
      - ZEEBE_CONTACT_POINTS=10.0.2.4:26502,10.0.2.5:26502,10.0.2.6:26502
    ports:
      - "26500:26500"
      - "26502:26502"
    volumes:
      - ./zeebe.cfg.toml:/usr/local/zeebe/conf/zeebe.cfg.toml
    networks:
      - zeebe_network

HOST2:

networks:
  zeebe_network:
    driver: bridge

services:
  node1:
    container_name: zeebe_broker_2
    image: camunda/zeebe:0.23.1
    environment:
      - ZEEBE_LOG_LEVEL=debug
      - ZEEBE_NODE_ID=1
      - ZEEBE_PARTITIONS_COUNT=2
      - ZEEBE_REPLICATION_FACTOR=3
      - ZEEBE_CLUSTER_SIZE=3
      - ZEEBE_CONTACT_POINTS=10.0.2.4:26502,10.0.2.5:26502,10.0.2.6:26502
    ports:
      - "26500:26500"
      - "26502:26502"
    volumes:
      - ./zeebe.cfg.toml:/usr/local/zeebe/conf/zeebe.cfg.toml
    networks:
      - zeebe_network

HOST3:

networks:
  zeebe_network:
    driver: bridge

services:
  node2:
    container_name: zeebe_broker_3
    image: camunda/zeebe:0.23.1
    environment:
      - ZEEBE_LOG_LEVEL=debug
      - ZEEBE_NODE_ID=2
      - ZEEBE_PARTITIONS_COUNT=2
      - ZEEBE_REPLICATION_FACTOR=3
      - ZEEBE_CLUSTER_SIZE=3
      - ZEEBE_CONTACT_POINTS=10.0.2.4:26502,10.0.2.5:26502,10.0.2.6:26502
    ports:
      - "26500:26500"
      - "26502:26502"
    volumes:
      - ./zeebe.cfg.toml:/usr/local/zeebe/conf/zeebe.cfg.toml
    networks:
      - zeebe_network

The only change I have made to the zeebe.cfg.toml:

[network.internalApi]
# Overrides the host used for internal broker-to-broker communication
host = "<<IP of the host>>"

But it looks like that is not enough for broker-to-broker communication, or at least raft can’t deal with it ;)

Because there are errors in the network area again:

HOST1:

zeebe_broker_1 | 2020-05-22 09:13:07.307 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-0 [6/11]: embedded gateway started in 630 ms
zeebe_broker_1 | 2020-05-22 09:13:07.309 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-0 [7/11]: cluster services
zeebe_broker_1 | 2020-05-22 09:13:10.468 [] [raft-server-0-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.18.0.2:26502
zeebe_broker_1 | 2020-05-22 09:13:10.478 [] [raft-server-0-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.18.0.2:26502
zeebe_broker_1 | 2020-05-22 09:13:15.772 [] [raft-server-0-system-partition-1] WARN  io.atomix.raft.roles.FollowerRole - RaftServer{system-partition-1}{role=FOLLOWER} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.18.0.2:26502

HOST2:

zeebe_broker_2 | 2020-05-22 09:13:01.155 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-1 [6/11]: embedded gateway started in 952 ms
zeebe_broker_2 | 2020-05-22 09:13:01.157 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-1 [7/11]: cluster services
zeebe_broker_2 | 2020-05-22 09:13:07.315 [] [raft-server-1-system-partition-1] WARN  io.atomix.raft.roles.FollowerRole - RaftServer{system-partition-1}{role=FOLLOWER} - java.net.ConnectException
zeebe_broker_2 | 2020-05-22 09:13:07.383 [] [raft-server-1-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - java.net.ConnectException
zeebe_broker_2 | 2020-05-22 09:13:11.448 [] [raft-server-1-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.20.0.2:26502

HOST3:

zeebe_broker_3 | 2020-05-22 09:12:59.643 [] [main] DEBUG io.zeebe.broker.system - Bootstrap Broker-2 [6/11]: embedded gateway started in 952 ms
zeebe_broker_3 | 2020-05-22 09:12:59.645 [] [main] INFO  io.zeebe.broker.system - Bootstrap Broker-2 [7/11]: cluster services
zeebe_broker_3 | 2020-05-22 09:13:05.627 [] [raft-server-2-system-partition-1] WARN  io.atomix.raft.roles.FollowerRole - RaftServer{system-partition-1}{role=FOLLOWER} - java.net.ConnectException
zeebe_broker_3 | 2020-05-22 09:13:05.713 [] [raft-server-2-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - java.net.ConnectException
zeebe_broker_3 | 2020-05-22 09:13:09.455 [] [raft-server-2-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.20.0.2:26502

So there are more settings required to ensure true broker-to-broker communication other than in the [network.internalApi] section, but which ones is not clear to me.

Maarten

@salaboy, isn’t that why we do port mapping: to expose/bind to an external interface/IP of a virtual machine?

I have not seen issues with other cluster solutions. For example, MySQL (Galera Cluster) works perfectly fine from within a container and across multiple hosts.
I can understand that some applications have issues with NAT (external IP vs. internal Docker IP), but I don’t know if that is the case here.

Maarten

@maartend

There is definitely a network problem there… your brokers are complaining that they cannot reach 172.20.0.2 …

Also
Docker networks with driver: bridge are local to the host only -> https://docs.docker.com/network/bridge/

So I don’t think that there is a problem with Zeebe specifically here…
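
If swarm and k8s are ruled out, one option that avoids the bridge network altogether is Docker's host networking (--network host), where the container shares the host's network stack and published ports (-p) are not used. Below is a minimal dry-run sketch that only prints such a command. It is untested against Zeebe: the helper name host_network_cmd is made up, the environment variables and image tag are simply the ones from the first post, and host networking is assumed to be acceptable on these (Linux) hosts.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print (not run) a docker command that uses the host's
# network stack instead of a bridge network.
# Arguments: node id, IP address of this host.
host_network_cmd() {
  local node_id="$1" host_ip="$2"
  printf 'docker run -d --network host -e ZEEBE_NODE_ID=%s -e ZEEBE_HOST=%s camunda/zeebe:0.22.1\n' \
    "$node_id" "$host_ip"
}

# Dry run for the first host.
host_network_cmd 0 10.0.2.4
```

The remaining cluster settings (cluster size, contact points and so on) would still need to be passed as in the earlier commands.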

@salaboy I agree. In my initial post I got further: raft was coming up, but there was a conflict/issue with the metrics server.

Since @jwulf pointed me to the docker-compose again, I thought I had missed something, so I started again.

Do you have any insight into where to look for this error?

Maarten

@salaboy, I thought that the setting in the zeebe.cfg.toml was for this:

[network.internalApi]
# Overrides the host used for internal broker-to-broker communication
host = "<<IP of the host>>"

But it is not using the IP specified there, at least raft is not using it.

Maarten

@maartend can you try setting that parameter to 0.0.0.0 to bind to the local address?

@salaboy no difference, same error.

Network is open:

localhost:~$ nc -vz 10.0.2.4 26502
10.0.2.4 (10.0.2.4:26502) open
localhost:~$ nc -vz 10.0.2.5 26502
10.0.2.5 (10.0.2.5:26502) open
localhost:~$ nc -vz 10.0.2.6 26502
10.0.2.6 (10.0.2.6:26502) open

But if I check the logs:

zeebe_broker_3 | 2020-05-22 10:16:27.640 [] [raft-server-2-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.20.0.2:26502
zeebe_broker_3 | 2020-05-22 10:16:31.653 [] [raft-server-2-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.20.0.2:26502
zeebe_broker_3 | 2020-05-22 10:16:36.677 [] [raft-server-2-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.20.0.2:26502

So raft is not using the specified IP. I know I can change that behavior by changing other settings, as shown in the initial post; then the partition part comes up… but metrics fail ;)
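
To make that comparison less manual, the peer addresses in the timeout warnings can be extracted and checked against the configured contact points. This is a minimal sketch with made-up helper names (timed_out_peers, in_contact_points); it assumes the log format shown above and uses one of the log lines above as sample input.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read log lines on stdin and print the distinct peer IPs
# that appear in "connection timed out: /IP:PORT" warnings.
timed_out_peers() {
  sed -n 's|.*connection timed out: /\([0-9.]*\):[0-9]*.*|\1|p' | sort -u
}

# Hypothetical helper: succeed if the IP is one of the comma-separated
# host:port contact points.
in_contact_points() {
  local ip="$1" points="$2"
  case ",$points," in
    *",$ip:"*) return 0 ;;
    *) return 1 ;;
  esac
}

sample='zeebe_broker_3 | 2020-05-22 10:16:27.640 [] [raft-server-2-system-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{system-partition-1}{role=CANDIDATE} - io.netty.channel.ConnectTimeoutException: connection timed out: /172.20.0.2:26502'
points="10.0.2.4:26502,10.0.2.5:26502,10.0.2.6:26502"

printf '%s\n' "$sample" | timed_out_peers | while read -r ip; do
  if in_contact_points "$ip" "$points"; then
    echo "$ip is a configured contact point"
  else
    echo "$ip is not a configured contact point (possibly a container-internal address)"
  fi
done
```

With real brokers, the sample line could be replaced by the output of docker logs for the container in question.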

Maarten

I am also facing the same issue when trying to create a cluster across three VMs with partition size 3 and cluster size 3.

@mousumi which version are you using?

@mousumi,

My latest attempt is with docker-compose, swarm/stack deploy, resulting in the following:

I used the following docker-compose file:

version: '3'

volumes:
    broker_1: {}
    broker_2: {}
    broker_3: {}

services:

    broker-1:
        image: camunda/zeebe:0.24.1
        ports:
            - 26500:26500
        environment:
            - ZEEBE_LOG_LEVEL=${ZEEBE_LOG_LEVEL:-debug}
            - ZEEBE_BROKER_CLUSTER_NODEID=0
            - ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT=3
            - ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
            - ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=3
        volumes:
            - broker_1:/usr/local/zeebe/data
        deploy:
          placement:
            constraints: [node.hostname == node0]

    broker-2:
        image: camunda/zeebe:0.24.1
        ports:
            - 26510:26500
        environment:
            - ZEEBE_LOG_LEVEL=${ZEEBE_LOG_LEVEL:-debug}
            - ZEEBE_BROKER_CLUSTER_NODEID=1
            - ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT=3
            - ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
            - ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=3
            - ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=broker-1:26502
        volumes:
            - broker_2:/usr/local/zeebe/data
        deploy:
          placement:
            constraints: [node.hostname == node1]

    broker-3:
        image: camunda/zeebe:0.24.1
        ports:
            - 26520:26500
        environment:
            - ZEEBE_LOG_LEVEL=${ZEEBE_LOG_LEVEL:-debug}
            - ZEEBE_BROKER_CLUSTER_NODEID=2
            - ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT=3
            - ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
            - ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=3
            - ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=broker-1:26502
        volumes:
            - broker_3:/usr/local/zeebe/data
        deploy:
          placement:
            constraints: [node.hostname == node2]

Still not completely what I want. I want to have it running without swarm/stack deploy.

Maarten

I am using 0.20.2

@mousumi can you please upgrade to newer versions?

For official reasons, we cannot upgrade to a newer version.

The key issue is the IP and port configuration.
In docker-compose.yml:

services:
  zeebe:
    image: camunda/zeebe:0.23.5
    environment:
      - ZEEBE_LOG_LEVEL=debug
      - ZEEBE_BROKER_NETWORK_HOST=0.0.0.0
      - ZEEBE_BROKER_GATEWAY_CLUSTER_HOST=0.0.0.0
      - ZEEBE_BROKER_NETWORK_COMMANDAPI_HOST=13.13.13.34        # IP address of this host
      - ZEEBE_BROKER_NETWORK_INTERNALAPI_HOST=13.13.13.34
    ports:
      - "26500:26500"
      - "26501:26501"
      - "26502:26502"
      - "9600:9600"