Apache Tez / TEZ-3849

Combiner+PipelinedSorter silently drops records

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.9.1

    Description

      This bug was introduced in https://github.com/apache/tez/commit/a47e8fcbea5eeab5a7cf812271d329524cc02dba?diff=split

      When combiner != null, the change in this commit passes kvIter along after next() has already been called on it. As a result, the first record in the partition is silently dropped.
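
      The failure mode can be illustrated with a minimal, self-contained sketch. This is not the Tez code: Cursor, consume, and run are hypothetical names, and the sketch assumes a cursor-style iterator whose next() advances to a record that is then read separately, with a consumer that calls next() itself before every read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AdvancedCursorDemo {
    // Hypothetical cursor-style iterator (not a Tez class):
    // next() advances to the following record, current() reads it.
    public static final class Cursor {
        private final List<String> records;
        private int pos = -1;

        public Cursor(List<String> records) {
            this.records = records;
        }

        public boolean next() {
            if (pos + 1 < records.size()) {
                pos++;
                return true;
            }
            return false;
        }

        public String current() {
            return records.get(pos);
        }
    }

    // Hypothetical consumer standing in for a combiner:
    // it calls next() before each read, so it expects an unadvanced cursor.
    public static List<String> consume(Cursor cursor) {
        List<String> seen = new ArrayList<>();
        while (cursor.next()) {
            seen.add(cursor.current());
        }
        return seen;
    }

    // Hands a cursor to the consumer, optionally advancing it first
    // to mimic the pattern described in this issue.
    public static List<String> run(List<String> partition, boolean preAdvance) {
        Cursor cursor = new Cursor(partition);
        if (preAdvance) {
            cursor.next(); // the first record is now "current" and will be skipped
        }
        return consume(cursor);
    }

    public static void main(String[] args) {
        List<String> partition = Arrays.asList("a", "b", "c");
        System.out.println(run(partition, true));  // prints [b, c]
        System.out.println(run(partition, false)); // prints [a, b, c]
    }
}
```

      In the pre-advanced case no exception is raised and no warning is logged; the output is simply one record short, which is why the loss is silent.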

      I will submit a PR and attach a patch. jeagles, I'm not sure whether this is how you want to fix it, but it does fix my tests.

      Attachments

        1. TEZ-3849.1.patch
          7 kB
          Jacob Tolar
        2. TEZ-3849.2.patch
          6 kB
          Jacob Tolar
        3. TEZ-3849.3.patch
          13 kB
          Jacob Tolar
        4. TEZ-3849.4.patch
          13 kB
          Jacob Tolar
        5. TEZ-3849.5.patch
          12 kB
          Jacob Tolar
        6. TEZ-3849.6.patch
          12 kB
          Jacob Tolar

            People

              Assignee: Jacob Tolar (jtolar)
              Reporter: Jacob Tolar (jtolar)