  1. Flink
  2. FLINK-13589

DelimitedInputFormat index error on multi-byte delimiters with whole file input splits

Details

    Description

      DelimitedInputFormat can drop bytes when reading input splits that have a length of -1 (meaning the whole file is read). This looks like a simple bug in the handling of a delimiter that falls on a buffer boundary, where the logic is inconsistent between split types.

      Attached is a possible patch with fix and test.
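
To illustrate the class of problem described above, here is a minimal standalone sketch. It is not Flink's implementation and not the attached patch. It assumes a two-byte delimiter and a deliberately tiny read buffer so that the delimiter straddles two buffer refills. The class name `DelimiterBoundarySketch` and the methods `readRecords` and `endsWith` are hypothetical names chosen for this example. The point is that the partially seen delimiter has to survive a buffer refill, otherwise bytes are lost or mis-assigned at the boundary.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Flink.
public class DelimiterBoundarySketch {

    // Splits the stream into records on a (possibly multi-byte) delimiter while
    // reading through a small fixed-size buffer. Every byte is appended to the
    // pending record before the delimiter check, so a delimiter that starts in
    // one buffer and ends in the next is still detected.
    static List<String> readRecords(InputStream in, byte[] delim, int bufferSize) throws IOException {
        List<String> records = new ArrayList<>();
        byte[] buffer = new byte[bufferSize];
        byte[] record = new byte[64];
        int len = 0; // bytes of the pending record; kept across buffer refills
        int n;
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                if (len == record.length) {
                    record = Arrays.copyOf(record, len * 2);
                }
                record[len++] = buffer[i];
                if (endsWith(record, len, delim)) {
                    // Emit the record without its trailing delimiter.
                    records.add(new String(record, 0, len - delim.length, StandardCharsets.UTF_8));
                    len = 0;
                }
            }
        }
        if (len > 0) {
            // Trailing record that is not terminated by a delimiter.
            records.add(new String(record, 0, len, StandardCharsets.UTF_8));
        }
        return records;
    }

    // True if the first len bytes of data end with the delimiter.
    static boolean endsWith(byte[] data, int len, byte[] delim) {
        if (len < delim.length) {
            return false;
        }
        for (int j = 0; j < delim.length; j++) {
            if (data[len - delim.length + j] != delim[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "ab||cd||ef".getBytes(StandardCharsets.UTF_8);
        byte[] delim = "||".getBytes(StandardCharsets.UTF_8);
        // A buffer of 3 bytes makes the first refill end in the middle of "||".
        System.out.println(readRecords(new ByteArrayInputStream(data), delim, 3));
    }
}
```

With a buffer size of 3, the first read ends halfway through the first delimiter, which is the boundary case the description refers to. A regression test for the real fix would presumably exercise the same situation with a split length of -1.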

       

      Attachments

        1. delimiter-bug.patch
          2 kB
          Adric Eckstein

          People

            arvid Arvid Heise
            aeckstein Adric Eckstein
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h