  1. Hadoop Map/Reduce
  2. MAPREDUCE-960

Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      KeyValueLineRecordReader copies the key and value out of the line by first creating a separate byte array for each:

            int keyLen = pos;
            byte[] keyBytes = new byte[keyLen];
            System.arraycopy(line, 0, keyBytes, 0, keyLen);
            int valLen = lineLen - keyLen - 1;
            byte[] valBytes = new byte[valLen];
            System.arraycopy(line, pos + 1, valBytes, 0, valLen);
            key.set(keyBytes);
            value.set(valBytes);
      

      Since Text.set(byte[]) copies its argument again, and Text also has a set overload taking byte[], off, len, the intermediate copy can be avoided by setting the key and value directly from the line buffer.
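
      A minimal sketch of the proposed change, not the contents of the attached patch. To stay runnable without Hadoop on the classpath it uses a hypothetical stand-in class, Buf, which is assumed to behave like Text in the one respect that matters here: set(byte[], off, len) copies the given range into a buffer it owns. The names KeyValueSplitSketch, Buf, and setKeyValue are illustrative only.

```java
import java.nio.charset.StandardCharsets;

public class KeyValueSplitSketch {

    /** Hypothetical stand-in for Text: owns its buffer and copies on set. */
    static final class Buf {
        private byte[] bytes = new byte[0];
        private int length;

        /** Copies len bytes of src, starting at off, into the internal buffer. */
        void set(byte[] src, int off, int len) {
            if (bytes.length < len) {
                bytes = new byte[len];
            }
            System.arraycopy(src, off, bytes, 0, len);
            length = len;
        }

        @Override
        public String toString() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    /**
     * Splits line[0, lineLen) at the separator index pos. Each half is copied
     * exactly once, by set itself, with no intermediate keyBytes/valBytes arrays.
     */
    static void setKeyValue(Buf key, Buf value, byte[] line, int lineLen, int pos) {
        int keyLen = pos;
        int valLen = lineLen - keyLen - 1; // skip the separator byte at pos
        key.set(line, 0, keyLen);
        value.set(line, pos + 1, valLen);
    }

    public static void main(String[] args) {
        byte[] line = "k1\tv1".getBytes(StandardCharsets.UTF_8);
        Buf key = new Buf();
        Buf value = new Buf();
        setKeyValue(key, value, line, line.length, 2); // separator (tab) is at index 2
        System.out.println(key + " -> " + value);
    }
}
```

      With the real Text class, the same idea would amount to replacing the two arraycopy calls and the single-argument set calls with key.set(line, 0, keyLen) and value.set(line, pos + 1, valLen).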

      Attachments

        1. M960-0.patch
          1 kB
          Christopher Douglas

          People

            Assignee: Christopher Douglas (cdouglas)
            Reporter: Christopher Douglas (cdouglas)
