Description
Because Repartitioner::scheduleHashShuffledFetches re-optimizes the number of tasks, a single task may sometimes have to fetch the results of thousands of tasks.
In my case, one task fetched from 9240 tasks
and failed with the following exception:
2014-03-07 13:29:24,240 ERROR rpc.AsyncRpcClient (AsyncRpcClient.java:exceptionCaught(220)) - skt-rf-01/50.1.103.1:28093,class org.apache.tajo.ipc.QueryMasterProtocol,Adjusted frame length exceeds 2097152: 7729393 - discarded
org.jboss.netty.handler.codec.frame.TooLongFrameException: Adjusted frame length exceeds 2097152: 7729393 - discarded
    at org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:417)
    at org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.failIfNecessary(LengthFieldBasedFrameDecoder.java:405)
    at org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:320)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2014-03-07 13:29:24,242 ERROR rpc.AsyncRpcClient (AsyncRpcClient.java:exceptionCaught(235)) - RPC Exception:org.jboss.netty.handler.codec.frame.TooLongFrameException: Adjusted frame length exceeds 2097152: 7729393 - discarded
The cause seems to be that the frameDecoder configured in getPipeline() of ProtoPipelineFactory.java has a maximum frame length of 2MB (1048576*2 = 2097152 bytes):
public ChannelPipeline getPipeline() throws Exception {
  ChannelPipeline p = Channels.pipeline();
  p.addLast("frameDecoder", new LengthFieldBasedFrameDecoder(1048576*2, 0, 4, 0, 4));
  p.addLast("protobufDecoder", new ProtobufDecoder(defaultInstance));
  p.addLast("frameEncoder", new LengthFieldPrepender(4));
  p.addLast("protobufEncoder", new ProtobufEncoder());
  p.addLast("handler", handler);
  return p;
}
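To illustrate why the frame is rejected, here is a minimal sketch in plain Java (no Netty dependency) of the length check that a decoder configured as (1048576*2, 0, 4, 0, 4) applies: the adjusted frame length is the payload length read from the 4-byte header plus the header itself, and it must not exceed the maximum. The class and method names are hypothetical and only mimic the check; this is not Netty's or Tajo's actual code.

```java
import java.nio.ByteBuffer;

public class FrameLimitSketch {
    // Values taken from the getPipeline() call above.
    static final int MAX_FRAME_LENGTH = 1048576 * 2; // 2097152 bytes
    static final int LENGTH_FIELD_LENGTH = 4;        // 4-byte length header at offset 0

    // Builds the 4-byte big-endian header that a 4-byte length prepender would write.
    static ByteBuffer header(int payloadLength) {
        ByteBuffer buf = ByteBuffer.allocate(LENGTH_FIELD_LENGTH);
        buf.putInt(payloadLength);
        buf.flip();
        return buf;
    }

    // Adjusted frame length = payload length (read as unsigned) + length field size.
    static long adjustedFrameLength(ByteBuffer header) {
        long payloadLength = header.getInt(0) & 0xFFFFFFFFL;
        return payloadLength + LENGTH_FIELD_LENGTH;
    }

    // True if a frame with this payload size would pass the size check.
    static boolean accepted(int payloadLength) {
        return adjustedFrameLength(header(payloadLength)) <= MAX_FRAME_LENGTH;
    }

    public static void main(String[] args) {
        // The failing message: adjusted length 7729393, i.e. a 7729389-byte payload.
        System.out.println(accepted(7729393 - 4)); // prints "false"
        System.out.println(accepted(1024));        // prints "true"
    }
}
```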
I think a temporary solution for the issue is simply to increase this limit.
However, I'm not sure what the proper size is.
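As a rough way to reason about the size, the numbers in the log can be turned into a per-fetch estimate. The sketch below assumes the message size grows linearly with the number of fetches and ignores any fixed overhead, which may not hold exactly; the class and method names are hypothetical.

```java
public class FrameSizeEstimate {
    // Average bytes one fetch contributes to the frame (linear-growth assumption).
    static double bytesPerFetch(long frameBytes, int fetchCount) {
        return (double) frameBytes / fetchCount;
    }

    // Largest number of fetches whose frame would still fit under the limit.
    static long maxFetchesWithin(long limitBytes, double bytesPerFetch) {
        return (long) Math.floor(limitBytes / bytesPerFetch);
    }

    // Estimated frame size in bytes for a given number of fetches, rounded up.
    static long requiredBytes(int fetchCount, double bytesPerFetch) {
        return (long) Math.ceil(fetchCount * bytesPerFetch);
    }

    public static void main(String[] args) {
        // 7729393 bytes for 9240 fetches, both taken from the report above.
        double perFetch = bytesPerFetch(7729393L, 9240);
        System.out.printf("bytes per fetch: %.1f%n", perFetch);
        System.out.println("fetches that fit in 2MB: "
                + maxFetchesWithin(2097152L, perFetch)); // 2507
        System.out.println("estimated bytes for 20000 fetches: "
                + requiredBytes(20000, perFetch));
    }
}
```

Under that assumption, each fetch costs roughly 837 bytes, so the current 2MB limit is exceeded at around 2500 fetches per task. Any fixed limit can be exceeded by a large enough fan-in, so capping the number of fetches per request may be a more robust fix than raising the size alone.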