Updating workflow variable object with individual results from a multi-instance

I am referring to this doc, but am a little unclear on interpretation:
https://docs.zeebe.io/bpmn-workflows/multi-instance/multi-instance.html

Let’s say I have a workflow-level variable object:

{
  "Book": {
    "Title": "My Book",
    "Pages": [{
        "PageNumber": 1,
        "IsRead": false
      }, {
        "PageNumber": 2,
        "IsRead": false
      }
    ]
  }
}

My parallel multi-instance would have a single task that will (after some processing) mark each page as read (will set IsRead = true). So I would set the multi-instance Input Collection to:

=Book.Pages

and the Input Element to:

page

so I can access a locally scoped page variable. Since I am running a parallel multi-instance, I’d like to avoid a race condition on the variable update and have the IsRead value of each completed page updated in the original workflow-level variable object (i.e. I don’t want to create a new variable with the result). When a client completes processing a page, it can update the locally scoped page variable. So what should I set the Output Collection and the Output Element to, so that the client worker code, upon completing a service task, updates the “page” local variable’s IsRead property to true and the result is propagated to the original Book object? Thanks.

Hi @mostwired. I’m still relatively new to the team, so I took a shot at answering your question as a chance to learn something about using Zeebe.

I know that the output element expression is evaluated at the end of each iteration (i.e. “page processing” in your description) and placed in the output collection. For example, if your output element expression is

=page

and your output collection is

pages

then after the multi-instance element has completed there will be a variable called pages with a collection as value. This collection contains each of the page values as they were at the end of each “page processing”. There is no problem with concurrency because each element is placed in the collection at a specific index that is pre-determined before the multi-instance body is executed. So concurrent modification of the same array index will not occur.
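
To make the index-based collection concrete, here is a minimal Python sketch of the semantics described above. It is only a model, not Zeebe code: run_multi_instance, body and output_element are hypothetical names, and it says nothing about how a particular Zeebe version behaves.

```python
from concurrent.futures import ThreadPoolExecutor


def run_multi_instance(input_collection, body, output_element):
    """Model of a parallel multi-instance with an output collection."""
    # One slot per input item, reserved before any iteration starts.
    output_collection = [None] * len(input_collection)

    def iteration(index, item):
        # Each iteration gets its own local scope holding the input element.
        scope = {"page": dict(item)}
        body(scope)
        # The output element expression is evaluated at the end of the
        # iteration and written to this iteration's own slot only.
        output_collection[index] = output_element(scope)

    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(iteration, i, item)
            for i, item in enumerate(input_collection)
        ]
        for future in futures:
            future.result()  # re-raise any error from an iteration

    return output_collection


def mark_read(scope):
    scope["page"]["IsRead"] = True


book_pages = [
    {"PageNumber": 1, "IsRead": False},
    {"PageNumber": 2, "IsRead": False},
]
pages = run_multi_instance(book_pages, mark_read, lambda scope: scope["page"])
```

Because every iteration writes to a different index, the order in which the iterations finish does not affect the resulting collection.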

Having said that, I tried out your example, and I ran into an issue myself. Let me describe what I tried:

I created a sub process that reads pages.

The sub process is a parallel multi-instance element that iterates over the pages in the book. Each iteration has its own page variable (i.e. input element). Each page is collected into an output collection called pages. (Note that it is not possible to directly overwrite the pages of the book here, as the output collection is only a variable name and does not support complex structures like Book.pages. Using Book.pages would result in a variable with the name Book.pages.)

After execution of the sub process, the read pages are used to overwrite the pages of the book.

The read page task only needs to transform the page variable such that the page has been read. We can do that using an output mapping of the task.

Now when I create an instance of this workflow (running zeebe 0.23.2 on macOS) with a variable:

Book={"Title":"My Book","pages":[{"PageNumber":1,"IsRead":false},{"PageNumber":2,"IsRead":false}]}

and I have a simple worker that just prints to console

zbctl --insecure create worker read --handler=cat

then I see that Book.pages has become null, and that pages is initialized as null and later becomes:

[{"isRead":true},{"isRead":true}]

So I seem to be missing the PageNumber, and somehow the pages result is not written correctly to Book.pages. Since it’s a Sunday, I’m calling it quits here. I’ll ask around in the team to see what I did wrong and come back to you tomorrow.

I’ve discussed the above scenario with @philipp.ossler (thanks :)) and got it working. I had made a couple of mistakes, and I’d like to address them all to share what I learned. I hope this also answers your original question, @mostwired.

I said:

The sub process is a parallel multi-instance element that iterates over the pages in the book. Each iteration has its own page variable (i.e. input element). Each page is collected into an output collection called pages.

Currently it is not possible to use the same variable as both input element and output element (e.g. page as input element and = page as output element expression). This is because the output element is initialised to nil if it is a variable name or the name of a property of a variable. I believe this is a bug, as we do not have this scenario documented, nor does it feel ‘right’ from the user’s perspective. I will create an issue for this on GitHub.

The workaround is simply to use a different variable name as output element, for example = paged as the output element expression. We now need to give this variable a value during each iteration of the multi-instance body. The ‘read’ task seems a good fit for this.

The read page task only needs to transform the page variable such that the page has been read. We can do that using an output mapping of the task.

So here we now also need to write a value for our paged variable. I simply used the following output mapping for the read service task:

  • source: ={PageNumber:page.PageNumber,isRead:true}
  • target: paged
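
As a rough illustration of what this mapping does per iteration, here is a Python stand-in for the FEEL context expression. The function name and the loop are hypothetical; the real evaluation happens inside the engine.

```python
def read_task_output_mapping(scope):
    """Stand-in for: source ={PageNumber:page.PageNumber,isRead:true}, target paged."""
    page = scope["page"]
    # Build a new object instead of modifying page, so the output element
    # (paged) is a different variable from the input element (page).
    scope["paged"] = {"PageNumber": page["PageNumber"], "isRead": True}
    return scope


input_pages = [
    {"PageNumber": 1, "IsRead": False},
    {"PageNumber": 2, "IsRead": False},
]

pages = []
for page in input_pages:
    scope = {"page": page}           # local scope of one iteration
    read_task_output_mapping(scope)  # applied when the read task completes
    pages.append(scope["paged"])     # output element "= paged" is collected
```

Note that the mapping as written produces the key isRead, which differs in case from the IsRead key in the question's original data.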

After execution of the sub process, the read pages are used to overwrite the pages of the book.

Here, I made the mistake of thinking that the output mappings of the subprocess would be performed after all iterations of the multi-instance body had completed. It is actually the other way around: the output mappings are performed for every iteration of the multi-instance body (in this case the subprocess body) instead of on the container element (in this case the subprocess element). This separation of the container element and the body of the multi-instance is missing from the BPMN specification, which leaves us unable to define output mappings on both levels.

The workaround is to wrap the multi-instance sub process in another sub process. We can add the output mapping to the outer sub process:

  • source: =pages
  • target: Book.pages
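
The effect of this outer mapping can be sketched in Python as follows. This is a simplified model under one loud assumption: that a nested target such as Book.pages replaces only that property and leaves the rest of Book (e.g. Title) untouched. apply_output_mapping is a hypothetical helper, not a Zeebe API.

```python
import copy


def apply_output_mapping(variables, source_value, target_path):
    """Write source_value to a dotted target path, returning new variables."""
    result = copy.deepcopy(variables)
    keys = target_path.split(".")
    node = result
    # Walk (or create) the intermediate objects, e.g. "Book".
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    # Assumption: only the final property is replaced.
    node[keys[-1]] = source_value
    return result


workflow_variables = {
    "Book": {
        "Title": "My Book",
        "pages": [
            {"PageNumber": 1, "IsRead": False},
            {"PageNumber": 2, "IsRead": False},
        ],
    }
}

# Output collection produced by the inner multi-instance sub process.
pages = [
    {"PageNumber": 1, "isRead": True},
    {"PageNumber": 2, "isRead": True},
]

updated = apply_output_mapping(workflow_variables, pages, "Book.pages")
```

Because the mapping sits on the outer sub process, it is applied once, after the inner multi-instance has finished and pages is complete.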

Please take a look at my example: “Example Zeebe bpmn to show output mappings in container of multi-instance subprocess” (GitHub)

I hope this helps :)

Thanks Nico, it is a great explanation, and wrapping the multi-instance into a sub-process did the trick.

Thanks. Glad I could help, and I learned a lot from it ;)