  1. Hadoop Map/Reduce
  2. MAPREDUCE-960

Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      KeyValueLineRecordReader copies the line into the key and value by first creating separate intermediate arrays:

            int keyLen = pos;
            byte[] keyBytes = new byte[keyLen];
            System.arraycopy(line, 0, keyBytes, 0, keyLen);
            int valLen = lineLen - keyLen - 1;
            byte[] valBytes = new byte[valLen];
            System.arraycopy(line, pos + 1, valBytes, 0, valLen);
            key.set(keyBytes);
            value.set(valBytes);
      

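      Since Text.set() itself copies the bytes it is given, and Text provides a set(byte[], int, int) overload taking an offset and length, the intermediate arrays can be skipped by setting the key and value directly from the line buffer.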
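      A minimal sketch of the proposed change, not the contents of the attached patch. To keep it runnable without Hadoop on the classpath, MiniText is a hypothetical stand-in for org.apache.hadoop.io.Text that only models the copying set(byte[], off, len) call. The class name, findSeparator, setKeyValue, and the handling of a line with no separator are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

public class KeyValueSplitSketch {

  /** Hypothetical stand-in for Text: owns a private buffer and copies into it. */
  static final class MiniText {
    private byte[] bytes = new byte[0];
    private int length;

    /** Copies len bytes of src, starting at off, into the internal buffer. */
    void set(byte[] src, int off, int len) {
      if (bytes.length < len) {
        bytes = new byte[len];
      }
      System.arraycopy(src, off, bytes, 0, len);
      length = len;
    }

    @Override
    public String toString() {
      return new String(bytes, 0, length, StandardCharsets.UTF_8);
    }
  }

  /** Returns the index of the first separator byte in line[0, lineLen), or -1. */
  static int findSeparator(byte[] line, int lineLen, byte sep) {
    for (int i = 0; i < lineLen; i++) {
      if (line[i] == sep) {
        return i;
      }
    }
    return -1;
  }

  /** Sets key and value straight from the line buffer: one copy each, no temporary arrays. */
  static void setKeyValue(MiniText key, MiniText value, byte[] line, int lineLen, int pos) {
    if (pos == -1) {
      // Assumption: with no separator, the whole line is the key and the value is empty.
      key.set(line, 0, lineLen);
      value.set(line, 0, 0);
    } else {
      key.set(line, 0, pos);
      value.set(line, pos + 1, lineLen - pos - 1);
    }
  }

  public static void main(String[] args) {
    byte[] line = "user42\tlogged in".getBytes(StandardCharsets.UTF_8);
    MiniText key = new MiniText();
    MiniText value = new MiniText();
    int pos = findSeparator(line, line.length, (byte) '\t');
    setKeyValue(key, value, line, line.length, pos);
    System.out.println(key + " => " + value); // prints "user42 => logged in"
  }
}
```

      Compared with the snippet in the description, each record now costs two copies instead of four, and no per-record keyBytes/valBytes arrays are allocated.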

        Attachments

        1. M960-0.patch
          1 kB
          Chris Douglas

            People

            • Assignee:
              chris.douglas Chris Douglas
              Reporter:
              chris.douglas Chris Douglas