  1. Apache Tez
  2. TEZ-1351

MROutput needs a flush method to ensure data is materialized for FileOutputCommitter

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Incompatible change, Reviewed

    Description

      In MROutput.commit, we need to check isCommitRequired before invoking commitTask.

      Currently we do this check inside Pig:

                      if (fileOutput.isCommitRequired()) {
                          fileOutput.commit();
                      }
      

      However, for some loaders the output file is generated only after fileOutput.close, which is called as part of fileOutput.commit, so the isCommitRequired check happens too early. A workaround is to invoke fileOutput.close before isCommitRequired:

                      fileOutput.close();
                      if (fileOutput.isCommitRequired()) {
                          fileOutput.commit();
                      }
      

      However, we have been told that there is a plan to make MROutput.close private, which would rule out this workaround.
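
      The ordering problem can be illustrated with a minimal, self-contained sketch. It does not use the Tez or Pig APIs: LazyOutput is a hypothetical stand-in for an output whose file only appears once buffered records are flushed, and its flush method is an assumption modelled on the flush method this issue's title asks for, not the actual MROutput implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushBeforeCommitSketch {

    /** Hypothetical stand-in: records are buffered and reach the "file" only on flush. */
    static class LazyOutput {
        private final List<String> buffered = new ArrayList<>();
        private final List<String> materialized = new ArrayList<>();
        private boolean committed = false;

        void write(String record) {
            buffered.add(record);
        }

        /** Moves buffered records to the "file"; calling it again is harmless. */
        void flush() {
            materialized.addAll(buffered);
            buffered.clear();
        }

        /** A commit is considered necessary only if output has been materialized. */
        boolean isCommitRequired() {
            return !materialized.isEmpty();
        }

        /** Commit flushes first, mirroring close being part of commit. */
        void commit() {
            flush();
            committed = true;
        }

        boolean isCommitted() {
            return committed;
        }
    }

    /** The pattern from the description: the check runs before anything is materialized. */
    static boolean checkThenCommit(LazyOutput out) {
        if (out.isCommitRequired()) {
            out.commit();
        }
        return out.isCommitted();
    }

    /** Flushing first makes the check see the output, without exposing close. */
    static boolean flushThenCommit(LazyOutput out) {
        out.flush();
        if (out.isCommitRequired()) {
            out.commit();
        }
        return out.isCommitted();
    }

    public static void main(String[] args) {
        LazyOutput early = new LazyOutput();
        early.write("record");
        System.out.println("check only:  committed=" + checkThenCommit(early));

        LazyOutput flushed = new LazyOutput();
        flushed.write("record");
        System.out.println("flush first: committed=" + flushThenCommit(flushed));
    }
}
```

      In this sketch the first call skips the commit even though a record was written, while the second commits it. A public flush would give callers such as Pig the same effect as the close workaround while leaving close free to become private.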

      Attachments

        1. TEZ-1351.1.patch
          6 kB
          Bikas Saha
        2. TEZ-1351.2.patch
          8 kB
          Bikas Saha

            People

              Assignee: Bikas Saha (bikassaha)
              Reporter: Daniel Dai (daijy)
              Votes: 0
              Watchers: 3
