Camel / CAMEL-5282

Strange race condition in SEDA when shutting down in Camel 2.9.1

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: 2.9.1
    • Fix Version/s: 2.9.3, 2.10.0
    • Component/s: camel-core
    • Labels:
      None
    • Environment:

      Windows XP, jdk1.6.0_29

    • Estimated Complexity:
      Unknown

      Description

      We have file endpoints with an idempotent file repository that feed into a SEDA channel.
      Behind the SEDA endpoint is a solution that can only deal with one file at a time.

      If the system shuts down for some reason, then on restart we shouldn't consume files that were already consumed.

      If an exception occurs, we shut down our application.
      This behaviour is verified by a unit test, which began to fail when we introduced the timeout=0 configuration.

      Here is the pseudo code:

      onException(RuntimeException.class).process(new ShutDown());

      from("file:/tempfolder/files/?idempotent=true&noop=true&idempotentRepository=#repo&delay=1000")
      .inOut("seda:process?timeout=0");

      from("seda:process").delay(1000).throwException(new RuntimeException("Testing with exception"));

      I use inOut so that the file thread waits for the SEDA side to finish. The intent is that if an exception happens on the SEDA side, the DefaultErrorHandler strategy propagates it back to the file endpoint, and the file that was sent won't be marked as consumed.

      With 2.9.1 the files are marked as consumed (meaning recorded in the idempotent repository) about 90% of the time. I say about 90% because sometimes the test passes without any issue. On 2.9.0 I wasn't able to reproduce the error.

      NOTE: When you remove the timeout, the behaviour is fine.
      So on 2.9.0 and 2.9.1 with a positive timeout there is no issue.
      In 2.9.1 with timeout < 1 there is an issue.
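
The intent of the inOut hand-off described above can be pictured with a JDK-only analogy. This is a sketch, not Camel code: the single-thread executor stands in for the SEDA consumer, the `consumed` set stands in for the idempotent repository, and every class and method name here is hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// JDK-only analogy of the file -> inOut(seda) hand-off; none of these names are Camel APIs.
class InOutHandOffSketch {

    // Hypothetical stand-in for the idempotent repository.
    static final Set<String> consumed = ConcurrentHashMap.newKeySet();

    // Sends one "file" to the worker and blocks until it replies, like an InOut exchange.
    // Returns true only when the worker finished without an exception.
    static boolean sendAndWait(ExecutorService worker, String fileName, boolean failInWorker) {
        Future<?> reply = worker.submit(() -> {
            if (failInWorker) {
                throw new RuntimeException("Testing with exception");
            }
        });
        try {
            reply.get();            // wait for the reply without a timeout
            consumed.add(fileName); // record the file only after success
            return true;
        } catch (ExecutionException e) {
            // The worker's failure reaches the caller, so the file stays unrecorded.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            System.out.println(sendAndWait(worker, "ok.txt", false));
            System.out.println(sendAndWait(worker, "bad.txt", true));
            System.out.println(consumed);
        } finally {
            worker.shutdown();
        }
    }
}
```

In this analogy only `ok.txt` ends up in the set, which is the behaviour the unit test in this report expects from the real route.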

        Activity

        David Cifer added a comment -

        The unit tests that reproduce the error.

        David Cifer added a comment -

        Maybe the error is related to CAMEL-4882.

        David Cifer added a comment -

        I don't know whether this is the same issue as CAMEL-5033.

        David Cifer added a comment -

        This test shows that with the default 30-second timeout the behaviour is stable.

        Claus Ibsen added a comment -

        Can you re-attach the files and mark [x] to grant the license to Apache? Otherwise we cannot use them for regression testing in the source.

        Claus Ibsen added a comment -

        See this FAQ on how to properly shut down a route from a route. What you are doing is wrong.
        http://camel.apache.org/how-can-i-stop-a-route-from-a-route.html

        David Cifer added a comment - edited

        Got the idea from the book. I used the latch solution on a project, but there I was allowed to modify the application lifecycle logic...

        I try to implement server applications with Camel, but most of the time the requirement is a short-running feed application... Maybe a reusable, properly implemented shutdown component would be handy.

        Claus Ibsen added a comment -

        This is working as designed.
        I have added unit tests, which run fine.

        And btw the source code for the Camel in Action book has been updated to follow the practice outlined in that FAQ.

        David Cifer added a comment -

        Sorry. Maybe I'm looking in the wrong files... but I didn't see in the commit list a test that tries seda:name?timeout=0.

        Claus Ibsen added a comment -

        A timeout of 0 does not make sense.

        David Cifer added a comment -

        I thought timeout=0 means that it never times out.

        Claus Ibsen added a comment -

        Yeah, timeout=0 or timeout=-1 is the same as not specifying a timeout, which there is already a test for. Click the Subversion tab on the JIRA tracker to see the commits.

        David Cifer added a comment -

        Oh. I thought not specifying a timeout means it defaults to 30000. I think this is the reason we tried to introduce timeout=0: large file consumption timed out on the producer side.

        Claus Ibsen added a comment -

        Ah yeah, sorry. I just added a timeout=-1 example. All of them work fine.
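
The timeout rule discussed in the comments above can be summarised in a small JDK-only sketch. It assumes, as the reporter expected, that a value of 0 or less disables the timeout, and that an unspecified timeout falls back to 30000 ms. The class, method, and constant names are hypothetical and this is not Camel's own implementation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustration only: mirrors the timeout rule from the discussion, not Camel's source code.
class ReplyTimeoutSketch {

    // Default mentioned in the thread for an unspecified timeout (milliseconds).
    static final long DEFAULT_TIMEOUT_MILLIS = 30000L;

    // Waits for the reply latch. A timeout of 0 or less disables the timeout (assumption).
    // Returns true if the reply arrived, false if the wait timed out or was interrupted.
    static boolean awaitReply(CountDownLatch replyDone, long timeoutMillis) {
        try {
            if (timeoutMillis > 0) {
                return replyDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
            }
            replyDone.await(); // no timeout: block until the reply arrives
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        CountDownLatch slowReply = new CountDownLatch(1);
        // Simulated consumer that needs about 200 ms to finish.
        Thread consumer = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            slowReply.countDown();
        });
        consumer.start();
        System.out.println(awaitReply(slowReply, 50)); // short timeout, gives up first
        System.out.println(awaitReply(slowReply, 0));  // no timeout, waits for the reply
    }
}
```

With a slow consumer, the short positive timeout gives up while the disabled timeout keeps waiting, which matches the reporter's reason for trying timeout=0 on large files.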

        David Cifer added a comment -

        Hmm. Let me try it on my home machine. In the office, 7 to 9 out of 10 runs failed.

        David Cifer added a comment -

        I was able to reproduce this issue with the tests I sent. This environment is Linux with a multi-core processor. I will now try to implement the thread solution you showed in the FAQ.

        David Cifer added a comment -

        Ok. When I move to the thread solution, the application shuts down fine and the exceptions arrive as well. So it works for me. Thanks for the help.

        Claus Ibsen added a comment -

        Yeah, you really need to use a separate thread for stopping/shutting down. This is also how you do it, e.g., when you stop the JVM.
        If you do it in the same thread that is processing the message, then it's much harder and causes blocking issues and whatnot. That is why the FAQ and the Camel in Action source code have been updated.
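
The blocking problem described above can be reproduced with plain JDK classes. The following is an analogy under stated assumptions, not the FAQ's code: an `ExecutorService` stands in for the running route, `stopGracefully` stands in for a graceful stop that waits for in-flight work, and all names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// JDK-only analogy: stopping a worker from its own task versus from a separate thread.
class SeparateStopThreadSketch {

    // Graceful stop: refuse new work, then wait for in-flight tasks to finish.
    static boolean stopGracefully(ExecutorService pool, long waitMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stop requested inside the task being processed: the wait includes the task
    // itself, so it cannot succeed while that task is still running.
    static boolean stopFromProcessingThread(long waitMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        boolean[] stopped = new boolean[1];
        CountDownLatch taskDone = new CountDownLatch(1);
        pool.execute(() -> {
            stopped[0] = stopGracefully(pool, waitMillis);
            taskDone.countDown();
        });
        await(taskDone);
        return stopped[0];
    }

    // Stop handed off to a separate thread: the task returns, then the stop completes.
    static boolean stopFromSeparateThread(long waitMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        boolean[] stopped = new boolean[1];
        CountDownLatch stopDone = new CountDownLatch(1);
        pool.execute(() -> {
            Thread stopper = new Thread(() -> {
                stopped[0] = stopGracefully(pool, waitMillis);
                stopDone.countDown();
            });
            stopper.start(); // the task itself returns right away
        });
        await(stopDone);
        return stopped[0];
    }

    public static void main(String[] args) {
        System.out.println(stopFromProcessingThread(200));
        System.out.println(stopFromSeparateThread(2000));
    }
}
```

In the first variant the graceful stop can only time out, because it waits for the very task that is calling it. In the second, the processing task has already returned by the time the stopper thread finishes waiting.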


          People

          • Assignee:
            Claus Ibsen
            Reporter:
            David Cifer
          • Votes:
            0
            Watchers:
            2