Hadoop Common
  1. Hadoop Common
  2. HADOOP-3040

Streaming should assume an empty key if the first character on a line is the seperator (stream.map.output.field.separator, by default, tab)

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      If the first character on a line is the separator, empty key is assumed, and the whole line is the value (due to a bug this was not the case).

      Description

      Streaming should assume an empty key if the first character on a line is the seperator (stream.map.output.field.separator, by default, tab). And the value as the whole line excluding the seperator.

      1. patch-3040.txt
        5 kB
        Amareshwari Sriramadasu
      2. patch-3040.txt
        0.6 kB
        Amareshwari Sriramadasu

        Activity

        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-trunk #440 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/440/ )
        Hide
        Devaraj Das added a comment -

        I just committed this. Thanks, Amareshwari!

        Show
        Devaraj Das added a comment - I just committed this. Thanks, Amareshwari!
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12378475/patch-3040.txt
        against trunk revision 619744.

        @author +1. The patch does not contain any @author tags.

        tests included +1. The patch appears to include 3 new or modified tests.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new javac compiler warnings.

        release audit +1. The applied patch does not generate any new release audit warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12378475/patch-3040.txt against trunk revision 619744. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 3 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/console This message is automatically generated.
        Hide
        Amareshwari Sriramadasu added a comment -

        Added a testcase with input having 3 lines:
        1. First line has key\tvalue
        2. Second line starts with a tab. i.e empty key and whole line is value.
        3. Third line doesnt have any tab. Then whole line is the key.

        Show
        Amareshwari Sriramadasu added a comment - Added a testcase with input having 3 lines: 1. First line has key\tvalue 2. Second line starts with a tab. i.e empty key and whole line is value. 3. Third line doesnt have any tab. Then whole line is the key.
        Hide
        Amareshwari Sriramadasu added a comment -

        Canceling patch for adding testcase.

        Show
        Amareshwari Sriramadasu added a comment - Canceling patch for adding testcase.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12378199/patch-3040.txt
        against trunk revision 619744.

        @author +1. The patch does not contain any @author tags.

        tests included -1. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new javac compiler warnings.

        release audit +1. The applied patch does not generate any new release audit warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12378199/patch-3040.txt against trunk revision 619744. @author +1. The patch does not contain any @author tags. tests included -1. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/console This message is automatically generated.
        Hide
        Amareshwari Sriramadasu added a comment -

        Just for information, InputFormat handling of keys and values is fine, problem was in MROutputThread of PipeMapred i.e at the collect in PipeMapred

        Show
        Amareshwari Sriramadasu added a comment - Just for information, InputFormat handling of keys and values is fine, problem was in MROutputThread of PipeMapred i.e at the collect in PipeMapred
        Hide
        Amareshwari Sriramadasu added a comment -

        Patch fixes the bug. The bug is due to UTF8ByteArrayUtils.findNthByte, which starts to look at the line read, from position 1 instead of 0.

        Show
        Amareshwari Sriramadasu added a comment - Patch fixes the bug. The bug is due to UTF8ByteArrayUtils.findNthByte, which starts to look at the line read, from position 1 instead of 0.

          People

          • Assignee:
            Amareshwari Sriramadasu
            Reporter:
            Amareshwari Sriramadasu
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development