Resolution: Not A Problem
Subsequent stack analysis using jstack showed that the code is stuck in one of the UDF classes.
Hi, I've run into a strange problem. Maybe it's related to the data, but I'm not sure. I'm working with derived data in Avro format, so any "bad data" should have been caught at earlier stages.
My Pig script ran every hour for 2 days (invoked by an Oozie coordinator).
Now it gets stuck. There is always one reducer that shows progress = 67.55%.
I see in the TT log that it does the merge and the sort, then starts the reduce.
I do use a custom UDF in my Pig script.
I've added counters to try to debug the situation. My UDF works with bags.
The counters say that the UDF worked fine, because "Reduce input groups" = "invocation times of UDF".
I can even see the output counters:
The counters say that all the data passed through my UDF and that some output has even been written.
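One caveat about this kind of counter check: if the invocation counter is incremented when the UDF is entered, it can match "Reduce input groups" even though the last call never returned. A minimal sketch of counting entries and exits separately is below. It is plain Java, not the Pig counter API, and the names InvocationProbe, started, finished and inFlight are hypothetical, introduced only for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Wraps a function and counts how many calls started and how many returned normally. */
public class InvocationProbe<I, O> implements Function<I, O> {
    private final Function<I, O> delegate;
    final AtomicLong started = new AtomicLong();
    final AtomicLong finished = new AtomicLong();

    public InvocationProbe(Function<I, O> delegate) {
        this.delegate = delegate;
    }

    @Override
    public O apply(I input) {
        started.incrementAndGet();          // counted on entry
        O output = delegate.apply(input);   // may hang or throw
        finished.incrementAndGet();         // counted only after a normal return
        return output;
    }

    /** Calls that were entered but have not returned normally. */
    public long inFlight() {
        return started.get() - finished.get();
    }

    public static void main(String[] args) {
        InvocationProbe<Integer, Integer> probe = new InvocationProbe<>(x -> x * 2);
        probe.apply(1);
        probe.apply(2);
        System.out.println("started=" + probe.started.get()
                + " finished=" + probe.finished.get()
                + " inFlight=" + probe.inFlight());
    }
}
```

If the two numbers were exposed as separate job counters, a reducer whose "started" count is one higher than its "finished" count would point at a call that is still running, which would fit a hang inside the UDF.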
But one reducer (always just 1 of the 54 reducers in total) hangs for 1 hour and is then killed by the JT because of the timeout. The other 53 reducers finish in 7 minutes.
How can I debug MultiStore?
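The resolution note mentions jstack, which prints the stack of every thread in a running JVM (`jstack <pid>`). What one looks for in such dumps is a frame that stays on the stack across several samples. The sketch below shows the same idea inside a single JVM using Thread.getStackTrace(). The class StuckFrameFinder, its methods, and the FakeUdf stand-in are hypothetical and only simulate a hanging UDF; this is not how the original job was analyzed.

```java
import java.util.Optional;

public class StuckFrameFinder {

    /** Hypothetical stand-in for a UDF call that never returns. */
    static class FakeUdf {
        static volatile boolean stop = false;

        static void spin() {
            while (!stop) {
                // busy loop that simulates the hang
            }
        }
    }

    /** Returns the topmost frame of the thread whose class name starts with classPrefix. */
    static Optional<StackTraceElement> topFrameIn(Thread t, String classPrefix) {
        for (StackTraceElement frame : t.getStackTrace()) {
            if (frame.getClassName().startsWith(classPrefix)) {
                return Optional.of(frame);
            }
        }
        return Optional.empty();
    }

    /**
     * Samples the thread's stack several times. Returns true only if every
     * sample has the same class and method from classPrefix as its topmost match.
     */
    static boolean looksStuck(Thread t, String classPrefix, int samples, long intervalMs)
            throws InterruptedException {
        String seen = null;
        for (int i = 0; i < samples; i++) {
            Optional<StackTraceElement> frame = topFrameIn(t, classPrefix);
            if (!frame.isPresent()) {
                return false; // the thread is not inside the suspected code
            }
            String key = frame.get().getClassName() + "." + frame.get().getMethodName();
            if (seen != null && !seen.equals(key)) {
                return false; // the thread moved on between samples
            }
            seen = key;
            Thread.sleep(intervalMs);
        }
        return seen != null;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(FakeUdf::spin, "fake-reducer");
        worker.setDaemon(true);
        worker.start();
        Thread.sleep(100); // give the worker time to enter spin()
        System.out.println("stuck in FakeUdf: "
                + looksStuck(worker, FakeUdf.class.getName(), 3, 50));
        FakeUdf.stop = true;
    }
}
```

With real jobs the equivalent is to run jstack against the task JVM a few times, some seconds apart, and compare which UDF method appears in the reducer thread each time.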