Hadoop Common / HADOOP-499

Avoid the use of Strings to improve the performance of hadoop streaming

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: None
    • Labels:
      None

      Description

      In Hadoop streaming, a record is represented as a String for I/O and is encoded as UTF8 for map/reduce. Each record is therefore converted back and forth between String and UTF8 multiple times, which wastes CPU time.
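
      As a rough illustration of the cost described above, the sketch below contrasts a record that is decoded to a String and encoded again with one that is handled as raw bytes. It is not code from Hadoop: the class and method names are hypothetical, and it uses the JDK's standard UTF-8 charset rather than Hadoop's UTF8 class.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not taken from the Hadoop streaming source.
final class RoundTripSketch {

    // String path: every record is decoded into a String and then
    // encoded back into bytes, i.e. two conversions and two allocations.
    static byte[] viaString(byte[] record) {
        String decoded = new String(record, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Byte path: the record is copied as-is, with no charset work at all.
    static byte[] viaBytes(byte[] record) {
        return record.clone();
    }
}
```

      For valid UTF-8 input both paths yield the same bytes, so the decode and encode steps in the String path are pure overhead when the record is only being passed along.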

        Issue Links

          Activity

          Hairong Kuang created issue -
          Hairong Kuang made changes -
          Field Original Value New Value
          Link This issue incorporates HADOOP-413 [ HADOOP-413 ]
          Hairong Kuang added a comment -

          This patch includes the following fixes:
          1. Replaces the use of UTF8 with Text in hadoop-streaming, and therefore fixes HADOOP-413.
          2. Removes the use of Strings by adding simple manipulation of byte arrays.
          3. Fixes the stream close order when map/reduce finishes, which avoids truncated records.
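
          The patch itself is not reproduced on this page. As a minimal sketch of what "simple manipulation of byte arrays" in item 2 could look like, the example below splits a line into key and value at the first tab byte without creating a String. The class and method names are hypothetical, and the tab separator is an assumption based on the usual streaming key/value convention, not something stated in this issue.

```java
import java.util.Arrays;

// Hypothetical helper; not taken from text_streaming.patch.
final class ByteLineSplitter {

    // Returns the index of the first tab byte in line[0..length), or -1.
    // Scanning raw bytes is safe for UTF-8 input because the tab byte
    // (0x09) never occurs inside a multi-byte UTF-8 sequence.
    static int findTab(byte[] line, int length) {
        for (int i = 0; i < length; i++) {
            if (line[i] == (byte) '\t') {
                return i;
            }
        }
        return -1;
    }

    // Key: bytes before the first tab, or the whole line if there is no tab.
    static byte[] key(byte[] line, int length) {
        int tab = findTab(line, length);
        return Arrays.copyOfRange(line, 0, tab < 0 ? length : tab);
    }

    // Value: bytes after the first tab, or an empty array if there is no tab.
    static byte[] value(byte[] line, int length) {
        int tab = findTab(line, length);
        return tab < 0 ? new byte[0] : Arrays.copyOfRange(line, tab + 1, length);
    }
}
```

          The resulting byte ranges could then be handed to a byte-oriented container such as Text directly, so no intermediate String is needed for the split.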

          Hairong Kuang made changes -
          Attachment text_streaming.patch [ 12340081 ]
          Doug Cutting added a comment -

          I just committed this. Thanks, Hairong!

          Doug Cutting made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Doug Cutting made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s contrib/streaming [ 12310972 ]

            People

            • Assignee: Hairong Kuang
            • Reporter: Hairong Kuang
            • Votes: 0
            • Watchers: 0