Camel / CAMEL-19894

camel-kafka: enabling "breakOnFirstError" causes records to be skipped on exception

Details

    • Novice
    • Regression

    Description

      Reproducing:

      • Configure the camel-kafka consumer with "breakOnFirstError" = "true"
      • Set up a topic with exactly 2 partitions
      • Produce a series of records to both partitions
      • Ensure offsets are committed (I did that with manual commit; autocommit MAY have a second bug as well, see the P.S. below)
      • Create a route that consumes this topic. Ensure the first poll gets records from both partitions. Ensure the second-to-consume partition has more records to fetch in the next poll.
      • Trigger an error when processing exactly the first record of the second-to-consume partition

      Expected behavior:

      • Application should consume all records from the first partition, and none from the second. 

      Actual behavior:

      • Application consumes all records from the first partition. Some records from the second partition are skipped (how many depends on the number of records consumed from the first partition in a single poll).

       

      This bug was introduced in https://issues.apache.org/jira/browse/CAMEL-18350, which fixed a major issue with breakOnFirstError but left some edge cases.

      The root cause is that the lastResult variable is not reset between polls (or between iterations of the partition loop), so it may hold a stale value left over from the previous iteration. If the exception happens on the first record of a partition, lastResult never gets a chance to be initialized correctly for that partition. The forced sync commit is then made to the right (new) partition, but with an invalid, stale offset.
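
      The mechanism can be illustrated with a small standalone model. This is only a sketch under assumptions, not the actual camel-kafka code: the class StaleOffsetSketch, the method resumePosition, the UNSET sentinel and the sample offsets are all hypothetical, and the single lastOffset variable merely stands in for lastResult. It assumes the forced commit uses "last successful offset + 1" as the position to resume from.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StaleOffsetSketch {

    // Hypothetical sentinel meaning "no record has been processed yet".
    static final long UNSET = -1L;

    /**
     * Simplified model of one poll: records are processed partition by partition,
     * and processing stops at the first failing record.
     * Returns the position the failing partition would resume from after the
     * forced commit, or UNSET if no record failed.
     */
    public static long resumePosition(Map<Integer, List<Long>> poll,
                                      int failPartition, long failOffset,
                                      boolean resetPerPartition) {
        long lastOffset = UNSET; // stands in for lastResult
        for (Map.Entry<Integer, List<Long>> entry : poll.entrySet()) {
            if (resetPerPartition) {
                lastOffset = UNSET; // forget the offset of the previous partition
            }
            for (long offset : entry.getValue()) {
                if (entry.getKey() == failPartition && offset == failOffset) {
                    // Break on first error: commit relative to the last success.
                    // Without a valid last success, resume at the failed record.
                    return lastOffset == UNSET ? offset : lastOffset + 1;
                }
                lastOffset = offset;
            }
        }
        return UNSET;
    }

    public static void main(String[] args) {
        Map<Integer, List<Long>> poll = new LinkedHashMap<>();
        poll.put(0, List.of(0L, 1L, 2L, 3L, 4L)); // first partition, all succeed
        poll.put(1, List.of(0L, 1L, 2L));         // second partition, offset 0 fails

        System.out.println("stale: " + resumePosition(poll, 1, 0L, false)); // stale: 5
        System.out.println("reset: " + resumePosition(poll, 1, 0L, true));  // reset: 0
    }
}
```

      In this model, the stale value carried over from the first partition moves the second partition's position to 5, so its records at offsets 0 to 4 are never processed. The size of the gap follows the number of records consumed from the first partition, which matches the observed behavior. Resetting the value per partition makes the consumer resume at the failed record.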

      I've adapted the test project from CAMEL-18350 (many thanks to klease78) to demonstrate the issue and published it on GitHub. See the failing test in the project: https://github.com/Krivda/camel-bug-reproduction

      P.S. There might also be a second bug related to this issue, which may occur with enableAutoCommit=true: when the bug occurs, the physical commit might not be made for already processed partitions, which may result in double processing. I haven't investigated this further.

      P.P.S. Please note that the GitHub project contains a very detailed description of the behavior, pointing to the specific failing lines of code, which should be very helpful in the investigation.

       

            People

              orpiske Otavio Rodolfo Piske
              akrivda akrivda