Performance metrics - which ones do you want?

Hey folks, we’re looking at automated performance testing for Zeebe releases.

Which metrics are of interest that you’d like to see included in the test and published for each release?

Hi, jwulf
The metrics I want to know are the number of instances completed per second and the average time to complete an instance, as well as how turning the back pressure mechanism on and off affects those metrics.
Many factors may be involved: the complexity of the process, the performance of the machine, the size of the broker cluster, the number of partitions, etc.
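
A rough sketch of how these two numbers could be computed from per-instance timestamps. This is Python, and `InstanceRecord` and `summarize` are made-up names for illustration, not part of Zeebe. It assumes the test harness records a creation and a completion time for every instance.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class InstanceRecord:
    # Hypothetical record type: timestamps in seconds from any common clock.
    created_at: float
    completed_at: float


def summarize(records):
    """Return (instances completed per second, average completion time in seconds)."""
    if not records:
        raise ValueError("no records to summarize")
    durations = [r.completed_at - r.created_at for r in records]
    # Observation window: first creation to last completion.
    window = max(r.completed_at for r in records) - min(r.created_at for r in records)
    throughput = len(records) / window if window > 0 else float("inf")
    return throughput, mean(durations)
```

Running the same computation once with back pressure enabled and once with it disabled would give the comparison asked for above.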


Hi, Jwulf
Metrics:

  1. Time between creating an instance via the API and the first service task being sent to a worker. Example workflow: start->service task->end.
    My current measurement (approx.) under the default configuration settings is 0.2 sec. It appears to be multiplied by the number of partitions and the replication factor: 2n2p2r -> ~0.4 sec; 3n3p3r -> ~0.6 sec. This is with no load. That does not look acceptable.

  2. Time to complete createWorkflowInstanceWithResult via the API for the workflow start->service task->end, where the worker finishes the service task immediately.
    My current measurement (approx.) under the default configuration is double the time from #1, e.g. 3n3p3r -> 1.2 sec. I guess that is because there are two transitions in the workflow: from start to the service task, and from the service task to end. This is with no load.
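
For what it's worth, measurements like #2 could be taken with a small timing wrapper around the blocking call. A minimal sketch in Python: `time_call` is a made-up helper, and `fake_create_instance_with_result` is only a stand-in that sleeps. A real test would replace it with whatever client call blocks until the workflow result arrives.

```python
import time


def time_call(fn, repeats=10):
    """Call fn() `repeats` times and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def fake_create_instance_with_result():
    # Hypothetical stand-in for the real blocking client call.
    time.sleep(0.01)


if __name__ == "__main__":
    for d in time_call(fake_create_instance_with_result, repeats=5):
        print(f"{d:.4f} sec")
```

Keeping the individual durations rather than only their mean makes it possible to look at outliers afterwards.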

The metrics should be collected on more complex configurations than just 1 node with 1 partition. For example: 3 nodes, 3 partitions, replication factor 3; or 5 nodes, 5 partitions, replication factor 3.
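
If the results are published per configuration, a shorthand like 3n3p3r (nodes, partitions, replication factor) could be parsed into a structured form so runs can be grouped. A small Python sketch: `ClusterConfig` and `parse_label` are illustrative names only, and the label format is just the one I use in this post.

```python
import re
from typing import NamedTuple


class ClusterConfig(NamedTuple):
    nodes: int
    partitions: int
    replication_factor: int


def parse_label(label):
    """Parse a label such as '3n3p3r' into a ClusterConfig."""
    match = re.fullmatch(r"(\d+)n(\d+)p(\d+)r", label)
    if match is None:
        raise ValueError(f"unrecognized configuration label: {label!r}")
    return ClusterConfig(*(int(g) for g in match.groups()))


# The two example configurations from the paragraph above.
CONFIGS = [parse_label("3n3p3r"), parse_label("5n5p3r")]
```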

Measurements should be taken under different conditions. For example:

  1. On fast and “slow” drives.
  2. Under no load, e.g. a workflow is instantiated every 3 seconds.
  3. Under heavy load, e.g. 1000-10000 workflows are instantiated every second.
  4. Metrics should be checked for maximums. E.g. the average time for #2 is 1.2 sec, but maximums might be up to ~80 sec. It seems there is some kind of hang.
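
To illustrate the last point, here is a rough Python sketch of a report that shows the average next to percentiles and the maximum. The function names are made up, and the nearest-rank percentile is just one simple choice among several definitions.

```python
import math
from statistics import mean


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, for p in (0, 100]."""
    rank = math.ceil(p * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def latency_report(durations):
    """Summarize durations (seconds) as average, median, 99th percentile and maximum."""
    if not durations:
        raise ValueError("no durations to summarize")
    values = sorted(durations)
    return {
        "avg": mean(values),
        "p50": percentile(values, 50),
        "p99": percentile(values, 99),
        "max": values[-1],
    }
```

With 99 samples of 1.0 sec and a single 80 sec hang, the average is only 1.79 sec while the maximum is 80 sec, so publishing the average alone would hide the hang.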