Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.4.1
Fix Version/s: None
Component/s: None
Description
In a multi-stage query, when one stage returns no data (producing a set of output files that are all 0 bytes), the next stage creates a job with 0 mappers. That job sits in the Hadoop task tracker forever and the query hangs at 0%. The cause is that CombineHiveInputFormat looks for blocks to populate splits, finds none (since every input file is 0 bytes), and returns an empty array from getSplits.
There may be a good way to skip that job altogether, but as a quick hack to get it working, when there are no splits I create a single empty split using the first input path so that the job doesn't hang.
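The workaround can be sketched roughly as follows. This is a minimal illustration of the fallback logic only, not the actual patch: `SimpleSplit`, `EmptySplitFallback`, and `withFallback` are hypothetical names, and `SimpleSplit` is a plain stand-in rather than a real Hadoop or Hive split class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an input split; NOT a Hadoop or Hive class.
final class SimpleSplit {
    final String path;
    final long start;
    final long length;

    SimpleSplit(String path, long start, long length) {
        this.path = path;
        this.start = start;
        this.length = length;
    }
}

// Hypothetical helper showing the "never return zero splits" fallback.
final class EmptySplitFallback {
    // Returns the computed splits unchanged when there is at least one.
    // When there are none, returns a single zero-length split on the
    // first input path, so the job gets one mapper instead of zero.
    static List<SimpleSplit> withFallback(List<SimpleSplit> splits,
                                          List<String> inputPaths) {
        if (!splits.isEmpty() || inputPaths.isEmpty()) {
            return splits;
        }
        List<SimpleSplit> fallback = new ArrayList<>();
        fallback.add(new SimpleSplit(inputPaths.get(0), 0L, 0L));
        return fallback;
    }
}
```

With this shape, the one mapper reads nothing and finishes immediately, which costs a task launch but avoids the hang. Skipping the job entirely would be cleaner, but it would need changes in the code that submits the stage rather than in the input format.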