Hadoop Map/Reduce / MAPREDUCE-7158

Inefficient Flush Logic in JobHistory EventWriter


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.1.2, 3.3.0, 3.2.1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      When the HDFS flush is implemented to send a request to the server that actually commits the pending writes on the storage service side, benchmark runs show MR jobs taking much longer. Investigation shows that the current implementation for writing events does not look right:
      EventWriter#write()
      The flush in this method is redundant and should be removed. It defeats the purpose of having a separate flush function.
      Encoder.flush calls flush on the underlying output stream.
      With the fix applied, the MR jobs complete normally. Please find the patch attached.
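
      The snippet originally quoted in the description is not reproduced on this page. As a rough illustration of the problem only, the sketch below counts how often the underlying stream is flushed when a writer flushes on every write, compared with when it leaves flushing to a separate method. All names here (FlushCountDemo, CountingStream, SketchEventWriter, countFlushes) are hypothetical and are not taken from the Hadoop source or the attached patch; the counting stream stands in for an HDFS stream whose flush is assumed to be an expensive server request.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo; none of these names come from Hadoop or the patch.
final class FlushCountDemo {

    /** Stand-in for a stream whose flush() is an expensive server round trip. */
    static final class CountingStream extends OutputStream {
        int flushCalls = 0;
        int bytesWritten = 0;

        @Override
        public void write(int b) {
            bytesWritten++;
        }

        @Override
        public void flush() {
            flushCalls++; // each call models one commit request to the storage service
        }
    }

    /** Simplified writer with a write path and a separate flush method. */
    static final class SketchEventWriter {
        private final OutputStream out;
        private final boolean flushOnEveryWrite;

        SketchEventWriter(OutputStream out, boolean flushOnEveryWrite) {
            this.out = out;
            this.flushOnEveryWrite = flushOnEveryWrite;
        }

        void write(String event) throws IOException {
            out.write(event.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
            if (flushOnEveryWrite) {
                out.flush(); // models the redundant flush the report describes
            }
        }

        void flush() throws IOException {
            out.flush();
        }
    }

    /** Writes the given number of events, flushes once at the end, returns total flush calls. */
    static int countFlushes(int events, boolean flushOnEveryWrite) {
        CountingStream stream = new CountingStream();
        SketchEventWriter writer = new SketchEventWriter(stream, flushOnEveryWrite);
        try {
            for (int i = 0; i < events; i++) {
                writer.write("event-" + i);
            }
            writer.flush(); // the caller decides when data must be committed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stream.flushCalls;
    }

    public static void main(String[] args) {
        System.out.println("flush on every write: " + countFlushes(1000, true));
        System.out.println("flush left to caller: " + countFlushes(1000, false));
    }
}
```

      In this sketch, 1000 events cost 1001 flush calls when write() flushes and a single call when it does not. If each flush is a server request, as in the scenario described above, that difference would be consistent with the slowdown seen in the benchmark runs.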

      Attachments

        1. MAPREDUCE-7158-001.patch
          1 kB
          Zichen Sun

          People

            Assignee: Zichen Sun (zichensun)
            Reporter: Zichen Sun (zichensun)
            Votes: 0
            Watchers: 3
