  1. Hadoop Common
  2. HADOOP-8654

TextInputFormat delimiter bug: input text portion ends with, and delimiter starts with, the same character or character sequence


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.204.0, 1.0.3, 0.21.0, 2.0.0-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: util
    • Linux

    • TextInputFormat record delimiter

    Description

      TextInputFormat fails to detect a custom record delimiter when the input text contains a character sequence whose first character matches the first character of the delimiter, and whose remaining characters match the entire delimiter from its starting position.

      e.g. delimiter = "record"
      and text = " record 1:- name = Gelesh e mail = gelesh.hadoop@gmail.com Location Bangalore record 2: name = sdf .. location =Bangalorrecord 3: name .... "

      Here the string "=Bangalorrecord 3: " satisfies two conditions:
      1) It contains the delimiter "record".
      2) The character (or character sequence) immediately before the delimiter, here 'r', matches the first character (or character sequence) of the delimiter. That is, "=Bangalor" ends with the same character, 'r', that the delimiter starts with.

      In this case the program does not detect the delimiter, so the map function receives a value that still contains the delimiter.
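
      The scenario above can be reproduced with a small standalone sketch. It is not the Hadoop code: the class DelimiterScan and the methods findNaive and findWithBacktrack are hypothetical names used only for illustration. It assumes the miss comes from a scanner that, after a partial match fails, resets its match counter without re-examining the characters it has already consumed. The second method shows one possible remedy, backtracking to the character after the failed match start; the attached patch may take a different approach.

```java
class DelimiterScan {

    // Naive scan (assumed failure mode): on a mismatch the match counter is
    // reset, but the current character is never re-tested as a possible
    // start of the delimiter.
    static int findNaive(String text, String delimiter) {
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    return i - delimiter.length() + 1; // start index of the delimiter
                }
            } else {
                matched = 0;
            }
        }
        return -1; // delimiter not found
    }

    // Scan with backtracking: on a mismatch, step back by the number of
    // characters matched so far, so that the loop increment resumes at the
    // character right after the failed match start.
    static int findWithBacktrack(String text, String delimiter) {
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    return i - delimiter.length() + 1;
                }
            } else {
                i -= matched;
                matched = 0;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "location =Bangalorrecord 3: name";
        String delimiter = "record";
        System.out.println(findNaive(text, delimiter));         // prints -1
        System.out.println(findWithBacktrack(text, delimiter)); // prints 18
    }
}
```

      In the naive scan, the 'r' that ends "Bangalor" is consumed as the first delimiter character, the following 'r' fails against 'e', and that second 'r' is then skipped instead of being tried as a new start, so "record" is never matched.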

      Attachments

        1. HADOOP-8654.patch
          3 kB
          Jason Darrell Lowe
        2. MAPREDUCE-4512.txt
          0.7 kB
          Gelesh


          People

            Assignee: Unassigned
            Reporter: Gelesh
            Votes: 0
            Watchers: 8


              Time Tracking

                Original Estimate: 1m
                Remaining Estimate: 1m
                Time Spent: Not Specified