FLINK-13589

DelimitedInputFormat index error on multi-byte delimiters with whole file input splits

    Details

      Description

      The DelimitedInputFormat can drop bytes when it reads input splits that have a length of -1 (meaning the whole file is read) and the delimiter is longer than one byte. This looks like a simple bug in handling the delimiter at buffer boundaries, where the logic is inconsistent between split types.

      Attached is a possible patch with fix and test.
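
      To illustrate the class of problem, here is a minimal, self-contained sketch of delimiter matching that survives buffer refills. It is not the attached patch and not Flink's implementation; the class name DelimiterBoundarySketch, the method readRecords, and the tiny buffer size are all hypothetical and chosen only to force the delimiter to straddle a buffer boundary. The sketch assumes a non-empty delimiter and a buffer size of at least 1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; names and structure are not taken from Flink.
class DelimiterBoundarySketch {

    // Splits the stream on a (possibly multi-byte) delimiter while reading
    // through a small fixed-size buffer. The bytes of the record in progress
    // are kept across refills, so a delimiter that starts in one buffer and
    // ends in the next is still recognized and no bytes are dropped.
    static List<String> readRecords(InputStream in, byte[] delimiter, int bufferSize)
            throws IOException {
        List<String> records = new ArrayList<>();
        byte[] buffer = new byte[bufferSize];
        byte[] record = new byte[16];
        int len = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            for (int i = 0; i < read; i++) {
                if (len == record.length) {
                    record = Arrays.copyOf(record, len * 2); // grow as needed
                }
                record[len++] = buffer[i];
                // Check the tail of the accumulated record, not the buffer,
                // so the match does not depend on where the refill happened.
                if (endsWith(record, len, delimiter)) {
                    records.add(new String(record, 0, len - delimiter.length,
                            StandardCharsets.UTF_8));
                    len = 0;
                }
            }
        }
        if (len > 0) {
            // Trailing record without a closing delimiter.
            records.add(new String(record, 0, len, StandardCharsets.UTF_8));
        }
        return records;
    }

    // True if the first len bytes of data end with suffix.
    static boolean endsWith(byte[] data, int len, byte[] suffix) {
        if (len < suffix.length) {
            return false;
        }
        int offset = len - suffix.length;
        for (int j = 0; j < suffix.length; j++) {
            if (data[offset + j] != suffix[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "a||bb||ccc".getBytes(StandardCharsets.UTF_8);
        byte[] delimiter = "||".getBytes(StandardCharsets.UTF_8);
        // A buffer of 2 bytes puts every "||" across two refills.
        System.out.println(readRecords(new ByteArrayInputStream(input), delimiter, 2));
    }
}
```

      With a 2-byte buffer, each "||" in the sample input is split across two reads, which is the situation the description refers to. The result should be the same for any buffer size, which is the property a regression test for this issue would want to check.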

       

        Attachments

        1. delimiter-bug.patch
          2 kB
          Adric Eckstein

            People

            • Assignee: Arvid Heise
            • Reporter: Adric Eckstein
            • Votes: 0
            • Watchers: 3

                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h