Apache Tez / TEZ-2256

Avoid use of BufferTooSmallException to signal end of buffer in UnorderedPartitionedKVWriter


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.6.0, 0.7.0
    • Fix Version/s: 0.6.1
    • Component/s: None
    • Labels:
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      UnorderedPartitionedKVWriter delegates serialization to the application, passing it a private ByteArrayOutputStream. When the buffer is exhausted, that stream signals it by throwing a private BufferTooSmallException, which the application can see but cannot handle. As Chris Wensel pointed out, when the application is itself a complex framework, it has no way to distinguish this exception from a real failure, so it is compelled to log the full stack trace even for expected events such as "buffer complete".

      Suggested approach: set a "complete" flag in ByteArrayOutputStream that disables any further output, and replace BufferTooSmallException (BTSE) handling by checking that flag.
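
      A minimal sketch of the flag-based idea, for illustration only. The names FlaggedBufferOutputStream, RecordWriterSketch, isFull, rewindTo and tryWrite are hypothetical and are not taken from the Tez code base; the sketch assumes a fixed-capacity buffer and a caller that rolls back a partially written record once the flag is set.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical fixed-capacity stream: records overflow in a flag instead of throwing. */
final class FlaggedBufferOutputStream extends OutputStream {
    private final byte[] buf;
    private int count;
    private boolean full; // the "complete" flag from the suggested approach

    FlaggedBufferOutputStream(int capacity) {
        this.buf = new byte[capacity];
    }

    @Override
    public void write(int b) {
        if (full) {
            return; // further output is disabled once the flag is set
        }
        if (count == buf.length) {
            full = true;
            return;
        }
        buf[count++] = (byte) b;
    }

    @Override
    public void write(byte[] src, int off, int len) {
        if (full) {
            return;
        }
        if (len > buf.length - count) {
            full = true; // not enough room: set the flag, write nothing
            return;
        }
        System.arraycopy(src, off, buf, count, len);
        count += len;
    }

    boolean isFull() {
        return full;
    }

    int size() {
        return count;
    }

    /** Discards everything written after the given position and clears the flag. */
    void rewindTo(int mark) {
        count = mark;
        full = false;
    }
}

final class RecordWriterSketch {
    /** Returns true if the record fit; false if the buffer filled up and the record was rolled back. */
    static boolean tryWrite(FlaggedBufferOutputStream out, byte[] record) throws IOException {
        int mark = out.size();
        out.write(record); // stands in for the application's serializer
        if (out.isFull()) {
            out.rewindTo(mark); // drop the partial record; the caller can spill and retry
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        FlaggedBufferOutputStream out = new FlaggedBufferOutputStream(8);
        byte[] rec = "abcde".getBytes(StandardCharsets.US_ASCII);
        System.out.println(tryWrite(out, rec)); // true: 5 of 8 bytes used
        System.out.println(tryWrite(out, rec)); // false: only 3 bytes were left
        System.out.println(out.size());         // 5: the second record was rolled back
    }
}
```

      With this shape, the serializer never sees an exception for the ordinary "buffer complete" case, so any exception that does reach a framework layered on top can be treated as a real failure.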

      Siddharth Seth suggested checking out SortedOutput as well, as the mechanisms there should be similar.

      I'll give this a go this week.

            People

            • Assignee: Cyrille Chépélov (cchepelov)
            • Reporter: Cyrille Chépélov (cchepelov)

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                • Original Estimate: 6h
                • Remaining Estimate: 6h
                • Time Spent: Not Specified
