Apache Tez / TEZ-2575

Handle KeyValue pairs size which do not fit in a single block in PipelinedSorter


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.8.0-alpha, 0.7.1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      In the present implementation, the available buffer is divided into blocks (specified in the constructor for the pipelined sorter), and a linked list of these block byte buffers is maintained.
      A span is created out of the buffers.
      The present logic does not handle the scenario where a single key-value pair does not fit into any of the blocks.
      For example, if 1 MB of total memory is divided into 4 blocks (256 KB each)
      and a single KV pair is larger than the block size (ignoring metadata size),
      then it fails with buffer exceptions.
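
      The failure mode described above can be reproduced in isolation with plain java.nio buffers. The sketch below is not the PipelinedSorter code: the class BlockBufferSketch and its methods allocateBlocks and tryWrite are hypothetical names chosen for illustration, and metadata overhead is ignored. It only assumes that each block is a fixed-capacity ByteBuffer, as in the 1 MB / 4 blocks example.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch only; not the Tez implementation.
public class BlockBufferSketch {

    // Hypothetical helper: split totalBytes into numBlocks equally sized blocks.
    public static List<ByteBuffer> allocateBlocks(int totalBytes, int numBlocks) {
        List<ByteBuffer> blocks = new LinkedList<>();
        int blockSize = totalBytes / numBlocks;
        for (int i = 0; i < numBlocks; i++) {
            blocks.add(ByteBuffer.allocate(blockSize));
        }
        return blocks;
    }

    // Hypothetical helper: try to write one key-value pair into a block.
    // Returns false (and rolls back any partial write) if the pair overflows.
    public static boolean tryWrite(ByteBuffer block, byte[] key, byte[] value) {
        int start = block.position();
        try {
            block.put(key);
            block.put(value);
            return true;
        } catch (BufferOverflowException e) {
            block.position(start); // discard the key if only the value overflowed
            return false;
        }
    }

    public static void main(String[] args) {
        // 1 MB of memory divided into 4 blocks of 256 KB each.
        List<ByteBuffer> blocks = allocateBlocks(1 << 20, 4);

        byte[] key = new byte[16];
        byte[] smallValue = new byte[1024];       // fits easily in one block
        byte[] largeValue = new byte[300 * 1024]; // larger than any single block

        System.out.println("small pair written: "
                + tryWrite(blocks.get(0), key, smallValue));

        // The large pair cannot be placed in any block, empty or not.
        boolean fitsAnywhere = false;
        for (ByteBuffer block : blocks) {
            fitsAnywhere |= tryWrite(block, key, largeValue);
        }
        System.out.println("large pair written: " + fitsAnywhere);
    }
}
```

      In this sketch the overflow is caught and reported. A sorter that does not anticipate the case would instead surface the BufferOverflowException, which matches the "buffer exceptions" reported in this issue.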

      Attachments

        1. TEZ-2575.1.patch
          13 kB
          Saikat
        2. TEZ-2575.2.patch
          12 kB
          Saikat
        3. TEZ-2575.3.patch
          15 kB
          Saikat
        4. TEZ-2575.4.patch
          15 kB
          Rajesh Balamohan
        5. TEZ-2575.5.patch
          15 kB
          Saikat
        6. TEZ-2575.branch-0.7.patch
          14 kB
          Rajesh Balamohan
        7. TEZ-2575.patch
          13 kB
          Saikat

          People

            Assignee: Saikat (saikatr)
            Reporter: Saikat (saikatr)