Persistent error upon completing job

Hello,

I am currently experiencing a problem using Zeebe: I always get the following error when one of my workers tries to connect to a broker. The Zeebe broker appears to be “stuck” on it:

Error occurred at: calc-audience-file-chunks-job, error: Grpc.Core.RpcException: Status(StatusCode="Cancelled", Detail="Received RST_STREAM with error code 8", DebugException="Grpc.Core.Internal.CoreErrorDetailException: {"created":"@1603965084.555000000","description":"Error received from peer ipv6:[::1]:26500","file":"T:\src\github\grpc\workspace_csharp_ext_windows_x64\src\core\lib\surface\call.cc","file_line":1062,"grpc_message":"Received RST_STREAM with error code 8","grpc_status":1}")
at Zeebe.Client.Impl.Commands.CompleteJobCommand.Send(Nullable`1 timeout)

I am running the broker on my machine, which I started using one of the provided docker-compose files on GitHub. I started out using v0.24.1, and after the error started occurring, I changed the image to 0.25.0, but the error persists.

Prior to this error, I was launching dozens of workers that connected to the Zeebe broker at the same time. This worked initially, until the error started occurring.

Please let me know if there’s any other info I can provide you with (log files, or anything), since my broker appears to be currently stuck in this state.

Thank you

Here’s the client code I’m using, with the .NET client:

using System;
using System.Threading;
using Newtonsoft.Json;
using Zeebe.Client;

var corrId = "123";

var client = ZeebeClient.Builder()
    .UseGatewayAddress("localhost:26500")
    .UsePlainText()
    .Build();

// Publish the message that is correlated to the workflow.
var resp = await client.NewPublishMessageCommand()
    .MessageName("msg")
    .CorrelationKey(corrId)
    .Variables(JsonConvert.SerializeObject(new
    {
        fileUri = @"…",
        accountId = "…"
    }))
    .Send();

using (var signal = new EventWaitHandle(false, EventResetMode.AutoReset))
{
    client.NewWorker()
        .JobType("job")
        // The handler parameter is named jobClient to avoid shadowing the outer client variable.
        .Handler(async (jobClient, job) =>
        {
            await jobClient.NewCompleteJobCommand(job.Key)
                .Variables(JsonConvert.SerializeObject(new { ... }))
                .Send();
        })
        .MaxJobsActive(10)
        .Name(Environment.MachineName)
        .PollInterval(TimeSpan.FromSeconds(1))
        .Timeout(TimeSpan.FromSeconds(30))
        .Open();

    // Block here so the worker keeps polling for jobs.
    signal.WaitOne();
}

Also, here’s an error message that comes up in the broker Docker logs whenever I make a request to complete a job:

2020-10-29 11:52:07.731 [] [grpc-default-worker-ELG-3-1] WARN io.grpc.netty.NettyServerHandler - Stream Error
io.netty.handler.codec.http2.Http2Exception$StreamException: Received DATA frame for an unknown stream 15
at io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:147) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.shouldIgnoreHeadersOrDataFrame(DefaultHttp2ConnectionDecoder.java:596) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onDataRead(DefaultHttp2ConnectionDecoder.java:239) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onDataRead(Http2InboundFrameLogger.java:48) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readDataFrame(DefaultHttp2FrameReader.java:422) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:251) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:160) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:41) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:174) ~[netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:378) [netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:438) [netty-codec-http2-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) [netty-codec-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440) [netty-codec-4.1.50.Final.jar:4.1.50.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) [netty-codec-4.1.50.Final.jar:4.1.50.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792) [netty-transport-native-epoll-4.1.50.Final-linux-x86_64.jar:4.1.50.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) [netty-transport-native-epoll-4.1.50.Final-linux-x86_64.jar:4.1.50.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-transport-native-epoll-4.1.50.Final-linux-x86_64.jar:4.1.50.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.50.Final.jar:4.1.50.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.50.Final.jar:4.1.50.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.50.Final.jar:4.1.50.Final]
at java.lang.Thread.run(Unknown Source) [?:?]

I have found the cause of the error. There was a bug in my worker code that produced a 57 MB JSON object to be passed as a variable upon job completion. Once I fixed the bug and made the JSON much smaller, the jobs completed normally.


I wonder if the error message could be made better to make this easier to diagnose. If there is a maximum JSON size that can be passed without causing this, then the client library could throw.
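
To illustrate the idea of failing early on the client side, here is a minimal sketch of a size check applied to the serialized variables before they are sent. The names `PayloadGuard` and `EnsureWithinLimit` are hypothetical and not part of the Zeebe .NET client, and the 4 MB value is only an assumed placeholder; the real limit depends on how the broker and gateway are configured.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the Zeebe client library.
public static class PayloadGuard
{
    // Assumption: 4 MB is an illustrative limit only.
    // Check your broker/gateway configuration for the actual maximum message size.
    public const int AssumedMaxBytes = 4 * 1024 * 1024;

    // Returns the JSON unchanged if its UTF-8 size is within maxBytes,
    // otherwise throws with a message that names the actual size.
    public static string EnsureWithinLimit(string json, int maxBytes = AssumedMaxBytes)
    {
        if (json == null)
        {
            throw new ArgumentNullException(nameof(json));
        }

        int size = Encoding.UTF8.GetByteCount(json);
        if (size > maxBytes)
        {
            throw new ArgumentException(
                $"Variables payload is {size} bytes, which exceeds the assumed limit of {maxBytes} bytes.",
                nameof(json));
        }

        return json;
    }
}
```

In a worker, this could wrap the serialized variables, for example `.Variables(PayloadGuard.EnsureWithinLimit(JsonConvert.SerializeObject(...)))`, so an oversized payload like the 57 MB one above would surface as a descriptive exception in the handler rather than as a cancelled gRPC call.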

Can you open a GitHub issue about it?

Sure, I’ll open a GitHub issue with this information.


There is already one: https://github.com/zeebe-io/zeebe/issues/4928
