Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1
    • Fix Version/s: 1.1
    • Component/s: None
    • Labels: None

      Description

      Dennis reported:

      In the SegmentMerger.java file about line 150 we have this:

      final SequenceFile.Reader reader =
          new SequenceFile.Reader(FileSystem.get(job), fSplit.getPath(), job);

      Then about line 166 in the record reader we have this:

      boolean res = reader.next(key, w);

      If I am reading that right, that would mean that the map task would loop
      over all records for a given file and not just a given split.

      Right, this should instead use SequenceFileRecordReader that already has the logic to handle splits. Patch coming shortly - thanks for spotting this! This could be the reason for "out of disk space" errors that many users reported.
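
      To illustrate the difference described above, here is a minimal, self-contained sketch. It does not use the Hadoop API: the class SplitReadSketch, the Rec type, and the offsets are all hypothetical stand-ins, and the bounds check is a simplification of what a split-aware reader such as SequenceFileRecordReader does (the real reader also aligns to sync markers). It only shows why reading the whole file from every split multiplies the records each job processes.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitReadSketch {
    // Hypothetical stand-in for a record stored at a byte offset in a file.
    static final class Rec {
        final long offset;
        final String key;
        Rec(long offset, String key) { this.offset = offset; this.key = key; }
    }

    // Models the reported behavior: the reader is opened on the file path
    // and next() is called until end of file, ignoring the split bounds.
    static List<String> readWholeFile(List<Rec> file) {
        List<String> out = new ArrayList<>();
        for (Rec r : file) {
            out.add(r.key);
        }
        return out;
    }

    // Models a split-aware reader: only records whose start offset falls
    // in [start, start + length) are returned for this split.
    static List<String> readSplit(List<Rec> file, long start, long length) {
        long end = start + length;
        List<String> out = new ArrayList<>();
        for (Rec r : file) {
            if (r.offset >= start && r.offset < end) {
                out.add(r.key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> file = List.of(
            new Rec(0, "a"), new Rec(100, "b"),
            new Rec(200, "c"), new Rec(300, "d"));
        long[][] splits = { {0, 200}, {200, 200} }; // {start, length}

        int naive = 0;
        int bounded = 0;
        for (long[] s : splits) {
            naive += readWholeFile(file).size();
            bounded += readSplit(file, s[0], s[1]).size();
        }
        // Two splits over four records: the naive reader emits every
        // record once per split, the bounded reader emits each once.
        System.out.println("naive=" + naive + " bounded=" + bounded);
    }
}
```

      In this model, a file divided into N splits is processed N times by the naive reader, which is consistent with the duplicated output and disk usage suspected in the comment above.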

        Attachments

        1. merger.patch
          6 kB
          Andrzej Bialecki

            People

            • Assignee:
              ab Andrzej Bialecki
              Reporter:
              musepwizard Dennis Kubes