Hadoop Common / HADOOP-8654

TextInputFormat delimiter bug: input text portion ends with, and delimiter starts with, the same character or character sequence

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.204.0, 1.0.3, 0.21.0, 2.0.0-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: util
    • Labels:
    • Environment: Linux

    • Target Version/s:
    • Tags:
      TextInputFormat record delimiter

      Description

      The bug occurs when the input text contains a character sequence whose first character matches the first character of the delimiter, and whose remaining characters match the entire delimiter from the delimiter's starting position.

      For example, with delimiter = "record"
      and text = " record 1:- name = Gelesh e mail = gelesh.hadoop@gmail.com Location Bangalore record 2: name = sdf .. location =Bangalorrecord 3: name .... "

      Here the string "=Bangalorrecord 3: " satisfies two conditions:
      1) It contains the delimiter "record".
      2) The character (or character sequence) immediately before the delimiter, here 'r', matches the first character (or character sequence) of the delimiter. That is, "=Bangalor" ends with, and the delimiter starts with, the same character 'r'.

      In this case the program does not detect the delimiter, so the map receives an incorrect value text that still contains the delimiter.
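
      The scenario above can be reproduced with a minimal sketch. The code below is not the Hadoop record reader and not the attached patch; the class DelimiterSplitSketch and both method names are hypothetical, written only to illustrate the failure mode. splitNaive drops a partial match on a mismatch and moves on, so the mismatching character is never tested as the start of a new match. splitWithRewind shows one possible way to avoid that, by rewinding to the character after the start of the failed partial match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from Hadoop or from the patch.
public class DelimiterSplitSketch {

    // Buggy variant: on a mismatch the partial match is discarded and the
    // scan continues with the NEXT character, so the mismatching character
    // is never considered as a possible first character of the delimiter.
    static List<String> splitNaive(String text, String delimiter) {
        List<String> records = new ArrayList<>();
        int recordStart = 0;
        int matched = 0; // number of delimiter characters matched so far
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    records.add(text.substring(recordStart, i + 1 - matched));
                    recordStart = i + 1;
                    matched = 0;
                }
            } else {
                matched = 0;
            }
        }
        records.add(text.substring(recordStart));
        return records;
    }

    // One possible correction: on a mismatch after a partial match, rewind
    // so that scanning resumes one character after the partial match began.
    static List<String> splitWithRewind(String text, String delimiter) {
        List<String> records = new ArrayList<>();
        int recordStart = 0;
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    records.add(text.substring(recordStart, i + 1 - matched));
                    recordStart = i + 1;
                    matched = 0;
                }
            } else if (matched != 0) {
                i -= matched; // the loop's i++ then lands on (match start + 1)
                matched = 0;
            }
        }
        records.add(text.substring(recordStart));
        return records;
    }

    public static void main(String[] args) {
        String text = "location =Bangalorrecord 3: name";
        System.out.println(splitNaive(text, "record"));
        System.out.println(splitWithRewind(text, "record"));
    }
}
```

      With the input "location =Bangalorrecord 3: name", the naive variant returns the whole text as a single record because the 'r' ending "Bangalor" consumes the start of the match, while the rewinding variant splits it into "location =Bangalor" and " 3: name".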

      1. MAPREDUCE-4512.txt
        0.7 kB
        Gelesh
      2. HADOOP-8654.patch
        3 kB
        Jason Lowe


          People

          • Assignee: Unassigned
          • Reporter: Gelesh
          • Votes: 0
          • Watchers: 8


              Time Tracking

              Estimated: 1m
              Remaining: 1m
              Logged: Not Specified
