  1. Hive
  2. HIVE-1007

CombineHiveInputFormat fails with empty input


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.4.1
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

    Description

      In a multi-stage query, when one stage returns no data (resulting in a bunch of output files with size 0), the next stage creates a job with 0 mappers, which just sits in the Hadoop task tracker forever and hangs the query at 0%. The issue is that CombineHiveInputFormat looks for blocks to populate splits, finds none (since the input is all 0 bytes), and then returns an empty array from getSplits.

      There may be a good way to just skip that job altogether, but as a quick hack to get it working, when there are no splits, I just create a single empty one using the first path so that the job doesn't hang.
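
      A minimal sketch of the fallback described above, for illustration only. It is not the contents of hive.1007.1.patch and does not use the real Hadoop or Hive classes: `EmptySplitFallback`, `Split`, `withFallback`, and the file paths are hypothetical stand-ins, assuming a split can be modeled as a path plus a byte range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the split logic; these are NOT the Hadoop or Hive classes.
class EmptySplitFallback {

    /** Hypothetical minimal split: a byte range within one input file. */
    static final class Split {
        final String path;
        final long start;
        final long length;

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Returns the computed splits unchanged when there are any. Otherwise returns a
     * single zero-length split on the first input path, so the job gets one mapper
     * that reads nothing instead of zero mappers.
     */
    static List<Split> withFallback(List<Split> computed, List<String> inputPaths) {
        if (!computed.isEmpty() || inputPaths.isEmpty()) {
            return computed; // nothing to fix, or no path to anchor a placeholder on
        }
        List<Split> result = new ArrayList<>();
        result.add(new Split(inputPaths.get(0), 0L, 0L));
        return result;
    }

    public static void main(String[] args) {
        // All input files are 0 bytes, so no splits were computed (paths are made up).
        List<Split> computed = new ArrayList<>();
        List<String> paths = Arrays.asList("/tmp/stage1/000000_0", "/tmp/stage1/000001_0");
        List<Split> splits = withFallback(computed, paths);
        System.out.println(splits.size() + " split(s), first on " + splits.get(0).path);
    }
}
```

      The placeholder split has length 0, so the single mapper reads no records and the stage finishes immediately. Skipping the job entirely would avoid even that mapper, but would need changes outside getSplits.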

      Attachments

        1. hive.1007.1.patch
          1 kB
          Dave Lerman


          People

            Assignee: Dave Lerman (dlerman)
            Reporter: Dave Lerman (dlerman)
            Votes: 0
            Watchers: 1
