Apache Tez / TEZ-529

Hive communicates state from RecordReader to Processor via JobConf


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 0.2.0
    • None
    • None
    • None

    Description

      Hive currently switches between operator pipelines and partition descriptors based on the map.input.file property in the job conf.

      In the CombineFileInputFormat case, Hive relies on the fact that CombineFileRecordReader sets this field every time a new file is processed. The processor then reads the field to set up the correct processing pipeline.
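
The mechanism described above can be sketched roughly as follows. This is a simplified illustration, not Hive's actual code: a plain `Map` stands in for the job conf, the file paths and pipeline names are made up, and the hypothetical `pipelineFor` helper matches on the exact file path, which is an assumption made only to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

public class InputFileSwitchSketch {
    // Property name taken from the issue description.
    public static final String INPUT_FILE_KEY = "map.input.file";

    // Hypothetical processor-side lookup: picks a pipeline based on the
    // file currently recorded in the (stand-in) conf.
    public static String pipelineFor(Map<String, String> conf,
                                     Map<String, String> pipelinesByFile) {
        return pipelinesByFile.get(conf.get(INPUT_FILE_KEY));
    }

    public static void main(String[] args) {
        // Stand-in for a single conf shared by reader and processor.
        Map<String, String> conf = new HashMap<>();

        // Made-up mapping from input file to processing pipeline.
        Map<String, String> pipelinesByFile = new HashMap<>();
        pipelinesByFile.put("/warehouse/t1/part-0", "pipeline-t1");
        pipelinesByFile.put("/warehouse/t2/part-0", "pipeline-t2");

        String[] files = {"/warehouse/t1/part-0", "/warehouse/t2/part-0"};
        for (String file : files) {
            // Reader side: record the file that is about to be processed.
            conf.put(INPUT_FILE_KEY, file);
            // Processor side: read the field and switch pipelines.
            System.out.println(file + " -> " + pipelineFor(conf, pipelinesByFile));
        }
    }
}
```

The sketch only works because both sides see the same conf object, which is the assumption the rest of this issue is about.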

      After the Tez refactor, RecordReader and TezProcessor use different job conf instances. As a result, Hive fails because map.input.file is never set or updated in the processor's conf.
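
A minimal sketch of this failure mode, under stated assumptions: `java.util.Properties` stands in for the job conf, the hypothetical `copyOf` helper stands in for however the two instances end up separated, and the file path is made up. It is not the Tez or Hive code path, only an illustration of why an update on one instance is invisible on the other.

```java
import java.util.Properties;

public class SeparateConfSketch {
    // Property name taken from the issue description.
    public static final String INPUT_FILE_KEY = "map.input.file";

    // Hypothetical helper: produces an independent copy of a conf,
    // so later writes to the original do not reach the copy.
    public static Properties copyOf(Properties original) {
        Properties copy = new Properties();
        copy.putAll(original);
        return copy;
    }

    public static void main(String[] args) {
        // The reader and the processor each hold their own instance.
        Properties readerConf = new Properties();
        Properties processorConf = copyOf(readerConf);

        // Reader side: record the current file on its own instance.
        readerConf.setProperty(INPUT_FILE_KEY, "/warehouse/t1/part-0");

        // The processor's instance never receives the update.
        System.out.println("reader sees:    " + readerConf.getProperty(INPUT_FILE_KEY));
        System.out.println("processor sees: " + processorConf.getProperty(INPUT_FILE_KEY)); // prints null
    }
}
```

With a lookup keyed on this field, a missing value on the processor side means no pipeline can be selected, which matches the failure described here.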

            People

              Assignee: Siddharth Seth (sseth)
              Reporter: Gunther Hagleitner (hagleitn)
              Votes: 0
              Watchers: 3
