Cancel a WorkflowInstance in Operate

Hi,

I'm having trouble canceling and deleting workflow instances in Operate.

For example, I have an “Echo” workflow instance with ID 225. When I try to cancel it, Operate times out and shows “Canceling Instance 225 failed”.

In the Zeebe Debug Log:

12:36:53.890 [io.zeebe.gateway.impl.broker.BrokerRequestManager] [gateway-zb-actors-0] ERROR io.zeebe.gateway - Error handling gRPC request
io.grpc.StatusRuntimeException: NOT_FOUND: Command rejected with code ‘CANCEL’: Expected to cancel a workflow instance with key ‘225’, but no such workflow was found
at io.grpc.Status.asRuntimeException(Status.java:523) ~[grpc-core-1.19.0.jar:1.19.0]
at io.zeebe.gateway.EndpointManager.convertThrowable(EndpointManager.java:257) ~[zeebe-gateway-0.17.0.jar:0.17.0]
at io.zeebe.gateway.EndpointManager.lambda$sendRequest$2(EndpointManager.java:235) ~[zeebe-gateway-0.17.0.jar:0.17.0]
at io.zeebe.gateway.impl.broker.BrokerRequestManager.lambda$sendRequest$1(BrokerRequestManager.java:90) ~[zeebe-gateway-0.17.0.jar:0.17.0]
at io.zeebe.gateway.impl.broker.BrokerRequestManager.lambda$sendRequest$3(BrokerRequestManager.java:109) ~[zeebe-gateway-0.17.0.jar:0.17.0]
at io.zeebe.gateway.impl.broker.BrokerRequestManager.lambda$sendRequestInternal$6(BrokerRequestManager.java:191) ~[zeebe-gateway-0.17.0.jar:0.17.0]
at io.zeebe.util.sched.future.FutureContinuationRunnable.run(FutureContinuationRunnable.java:35) [zeebe-util-0.17.0.jar:0.17.0]
at io.zeebe.util.sched.ActorJob.invoke(ActorJob.java:90) [zeebe-util-0.17.0.jar:0.17.0]
at io.zeebe.util.sched.ActorJob.execute(ActorJob.java:53) [zeebe-util-0.17.0.jar:0.17.0]
at io.zeebe.util.sched.ActorTask.execute(ActorTask.java:189) [zeebe-util-0.17.0.jar:0.17.0]
at io.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:154) [zeebe-util-0.17.0.jar:0.17.0]
at io.zeebe.util.sched.ActorThread.doWork(ActorThread.java:135) [zeebe-util-0.17.0.jar:0.17.0]
at io.zeebe.util.sched.ActorThread.run(ActorThread.java:112) [zeebe-util-0.17.0.jar:0.17.0]
Caused by: io.zeebe.gateway.cmd.BrokerRejectionException: Command (CANCEL) rejected (NOT_FOUND): Expected to cancel a workflow instance with key ‘225’, but no such workflow was found
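
As an aside for anyone triaging similar logs: the command, rejection type, and instance key can be pulled out of a rejection line like the one above. This is only a minimal sketch in Python. It assumes the message format shown in this log, and `parse_rejection` is a made-up helper name, not part of Zeebe or Operate.

```python
import re

# Matches the BrokerRejectionException message format seen in the log above.
REJECTION = re.compile(
    r"Command \((?P<intent>\w+)\) rejected \((?P<rejection_type>\w+)\): (?P<reason>.*)"
)
# The key may be wrapped in straight or typographic quotes,
# depending on how the log text was copied.
KEY = re.compile(r"key [‘'](?P<key>\d+)[’']")


def parse_rejection(line):
    """Return intent, rejection type, reason and key, or None if the line does not match."""
    match = REJECTION.search(line)
    if match is None:
        return None
    info = match.groupdict()
    key_match = KEY.search(info["reason"])
    info["key"] = int(key_match.group("key")) if key_match else None
    return info


line = (
    "Caused by: io.zeebe.gateway.cmd.BrokerRejectionException: "
    "Command (CANCEL) rejected (NOT_FOUND): Expected to cancel a workflow "
    "instance with key ‘225’, but no such workflow was found"
)
info = parse_rejection(line)
print(info["intent"], info["rejection_type"], info["key"])
```

Here the rejection type is NOT_FOUND, which is what suggests the broker itself has no record of key 225, independent of what Operate displays.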

I think the data in Elasticsearch and Zeebe is out of sync, but I think we should still be able to “force delete” a workflow instance from Elasticsearch.

Hi @gizmo84, thanks for the report, and we’ll look into this. Just to clarify–you can see workflow instance ID 225 in Operate, but it seems that Zeebe is in some way out of sync?

A possibility that I’ll throw out there: could it be that you submitted the cancellation request multiple times, but there was a lag in the UI, and this NOT FOUND error was the result of one of the additional cancellation requests? Or did the workflow instance not ever cancel? Let me know if that makes sense.

Best,
Mike

Hi @wints,

Just to clarify–you can see workflow instance ID 225 in Operate, but it seems that Zeebe is in some way out of sync?

Yes, that's exactly what happened, but I don't know why. See the screenshot:

I’m unable to delete this from Operate.

Hi @gizmo84, thanks for the screenshot. That helps. To recap:

  • Operate is still showing the instance as “running” (a green circle to the left of the workflow name as in your screenshot)
  • But you can’t cancel the instance in Operate even though Operate says it’s running
  • And when you try to cancel the instance in Operate, you see the error in the Zeebe logs that you included in your first post

Did I get all of that right? If so, then I’ll take this to the Zeebe and Operate teams because it sounds like something unexpected is happening.

And to give some quick background on expected behavior:

  • Workflow instances currently cannot be deleted, only canceled, and after cancellation, they’ll be visible in Operate if “Canceled” is selected in the Filters menu (screenshot)
  • A canceled workflow instance will eventually be cleaned up from Zeebe state but will still be available in Operate
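
To make the expected behavior and the suspected mismatch a bit more concrete, here is a toy model in Python. It is not how Zeebe or Operate are implemented: `cancel`, `broker_state`, `operate_view`, and the extra instance key 100 are all made up for illustration. The only things taken from this thread are the NOT_FOUND rejection for key 225 and the rule that a canceled instance leaves Zeebe state but stays visible in Operate.

```python
# Toy model only: every name here is hypothetical, not a Zeebe or Operate API.
class NotFound(Exception):
    """Stands in for the NOT_FOUND rejection seen in the gateway log."""


def cancel(broker_state, operate_view, key):
    """Cancel an instance following the expected behavior described above."""
    if key not in broker_state:
        # The broker has no such key, so the command is rejected and
        # whatever Operate shows for this key is left untouched.
        raise NotFound(f"no workflow instance with key {key}")
    broker_state.discard(key)       # cleaned up from Zeebe state
    operate_view[key] = "CANCELED"  # still visible in Operate under "Canceled"


broker_state = {100}                           # keys the broker still knows about
operate_view = {100: "ACTIVE", 225: "ACTIVE"}  # what Operate shows from Elasticsearch

cancel(broker_state, operate_view, 100)  # in sync: cancellation succeeds

try:
    cancel(broker_state, operate_view, 225)  # out of sync: broker rejects
except NotFound as err:
    print(err)

print(operate_view)
```

In this sketch, instance 225 stays “ACTIVE” in the Operate view after the rejected cancel, which matches the symptom reported here.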

Based on this post, are these the missing steps needed to reproduce this error?

  • Run this in docker, using the zeebe-io/zeebe-docker-compose profile for operate
  • Stop the containers with Ctrl-C.
  • Recreate them.
  • Now, attempt to stop a running workflow in Operate.

Something like this?

I will run some new tests from a clean setup and report back if I still see this kind of error…
