Relation between # of partitions and memory required

Hi,

I have a 3-broker cluster setup. When I increased the number of partitions from 3 to 25, I also had to increase the memory resources just for the brokers to start up. From the discussions here, I understand that disk requirements are considerably affected by the number of partitions. But I did not quite understand how the memory needed at startup is proportional to the number of partitions.

These are the changes I had to do:
Broker version: 0.25.0
Workers: 1 http worker
Exporters : Kafka and Hazelcast

resources:
  requests:
    memory: 3Gi -> increased from 1Gi
  limits:
    memory: 6Gi -> increased from 2Gi

Changed default heap size to:
-Xms4g -Xmx6g

I am running on a Linux machine with 8 cores.

Could you please help me understand this?

Thanks

Each partition needs to be rebuilt on restart, which uses memory.

Hi @jwulf

Thanks for the response…
What is the initial memory required by each partition? Knowing this would help us estimate the initial memory requirements when we need to scale the partitions. To increase the partitions, I had cleaned up the data volumes, and there was no data in the pipeline or on disk.

Thanks

That is going to depend on the number and size of workflows, and how long before the shutdown the last snapshot was taken.

Test. Test. Test. It is the only way to know.

Start various numbers of processes, with varying complexity (ideally close to what your production load will look like). Watch the log to see when the snapshot is taking place, and stop the brokers right before one is going to happen.

Restart, and measure the memory usage.
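As a rough sketch of how that measurement could be taken on a Linux host: the kernel reports a process's peak resident set size as VmHWM in /proc/<pid>/status. The helper names below are hypothetical, and you would need to look up the broker's PID yourself. Inside a container, the cgroup memory counters may be a better source than this.

```python
import re
from pathlib import Path


def parse_peak_rss_kib(status_text: str) -> int:
    """Extract VmHWM (peak resident set size, in KiB) from the
    contents of a Linux /proc/<pid>/status file."""
    match = re.search(r"^VmHWM:\s+(\d+)\s+kB$", status_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("VmHWM not found in status text")
    return int(match.group(1))


def peak_rss_kib(pid: int) -> int:
    """Read the peak RSS of a running process (Linux only).
    The PID is an assumption: pass the broker's JVM process ID."""
    return parse_peak_rss_kib(Path(f"/proc/{pid}/status").read_text())
```

Calling peak_rss_kib with the broker's PID once startup has finished gives the high-water mark for that restart, which is the number to record per test run.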

The answer to all questions about memory and performance is “it depends”.

You have to simulate your workload and measure. There is no other way - other than just massively over-provisioning to save time on actual testing.
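Once you have a few measurements at different partition counts, one way to turn them into an estimate is a straight-line fit. This is only a sketch: it assumes startup memory grows roughly linearly with the number of partitions for your workload, which you would have to confirm from your own data. The function names and the headroom factor are made up for illustration.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = base + slope * x.
    points: list of (partition_count, peak_memory_mib) pairs
    taken from your own test runs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("measurements must use different partition counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    base = mean_y - slope * mean_x
    return base, slope


def estimate_memory_mib(points, partitions, headroom=1.5):
    """Extrapolate to a target partition count and multiply by a
    safety factor. The default headroom of 1.5 is an arbitrary
    assumption, not a recommendation."""
    base, slope = fit_line(points)
    return (base + slope * partitions) * headroom
```

Here base approximates the fixed cost of the broker and slope the extra memory per partition under the load you tested. An estimate like this only holds for workloads similar to the ones you measured.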

“Give every broker node 64GB of RAM” is a “correct” answer for 90% of use cases, without having to know anything about the load characteristics.

But you are probably looking for the amount of memory that is the minimum that is safe for 99.99999% of cases of your specific load.
