Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: core
    • Labels:

      Description

      The migration tool halts. A thread dump is attached for further investigation.

      1. migrationtoolfix.patch
        3 kB
        Sriram Subramanian
      2. migrationtoolfix-2.patch
        1 kB
        Sriram Subramanian
      3. migration-tool-halts
        59 kB
        Neha Narkhede

        Activity

        Neha Narkhede created issue -
        Neha Narkhede made changes -
        Field Original Value New Value
        Attachment migration-tool-halts [ 12565986 ]
        Neha Narkhede made changes -
        Labels p1
        Neha Narkhede made changes -
        Assignee Sriram Subramanian [ sriramsub ]
        Sriram Subramanian added a comment -

        When one of the migration tool threads died, we were silently swallowing the exception. In addition, all the fetcher threads kept trying to push data to the queue of the dead thread, which caused the tool to stall. We now catch exceptions, log them, and continue. For any error that is a java.lang.Error, we log a fatal error.
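The behaviour described in this comment can be sketched roughly as follows. This is a minimal illustration, not the contents of the attached patch: `MigrationWorkerSketch`, `Sender`, and `drain` are hypothetical names, `java.util.logging` stands in for the log4j helpers the tool uses, and rethrowing the `Error` after logging is an assumption the comment does not state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.logging.Level;
import java.util.logging.Logger;

class MigrationWorkerSketch {
    private static final Logger LOG = Logger.getLogger("MigrationWorkerSketch");

    /** Hypothetical stand-in for producing one message to the target cluster. */
    interface Sender {
        void send(String message) throws Exception;
    }

    /** Drains the queue and returns how many messages failed to send. */
    static int drain(Queue<String> queue, Sender sender) {
        int failures = 0;
        String message;
        while ((message = queue.poll()) != null) {
            try {
                sender.send(message);
            } catch (Exception e) {
                // Recoverable: log with the cause and keep consuming so the queue keeps draining.
                failures++;
                LOG.log(Level.SEVERE, "Failed to migrate message, continuing", e);
            } catch (Error e) {
                // java.util.logging has no FATAL level; SEVERE stands in for it here.
                LOG.log(Level.SEVERE, "Fatal error in migration thread", e);
                throw e; // assumption: rethrow rather than swallow
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>(List.of("a", "bad", "c"));
        List<String> sent = new ArrayList<>();
        int failures = drain(queue, m -> {
            if (m.equals("bad")) throw new IllegalStateException("simulated send failure");
            sent.add(m);
        });
        System.out.println("sent=" + sent + " failures=" + failures);
    }
}
```

The point of the sketch is that the consuming loop survives an ordinary exception, so producers pushing into its queue are not left blocked behind a dead thread.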

        Sriram Subramanian made changes -
        Attachment migrationtoolfix.patch [ 12566247 ]
        Neha Narkhede added a comment -

        Thanks for the patch. There are some compile-time errors. A few comments:

        1. Each log4j statement using the helper APIs in Logging.scala should be of the form ("error message", cause). Let's fix that.
        2. Inside the while loop, catching Throwable also catches all Exceptions, so we only need to catch Throwable and exit. This is because, in the migration tool case, only a Kafka bug will throw an exception. In other cases, we might know the exact type of an exception that is data dependent and expected, so it makes sense to catch those separately.
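The ("error message", cause) form in point 1 matters because passing the cause as its own argument keeps the stack trace attached to the log record. A small sketch of the idea, using `java.util.logging` as a stand-in for the Logging.scala helpers (whose exact signatures are not shown in this ticket); `LoggingFormSketch`, `error`, and `capture` are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class LoggingFormSketch {
    static final Logger LOG = Logger.getLogger("LoggingFormSketch");

    /** Hypothetical helper mirroring the ("error message", cause) form. */
    static void error(String message, Throwable cause) {
        LOG.log(Level.SEVERE, message, cause);
    }

    /** Runs the body and returns the log records it produced on LOG. */
    static List<LogRecord> capture(Runnable body) {
        List<LogRecord> records = new ArrayList<>();
        Handler handler = new Handler() {
            @Override public void publish(LogRecord record) { records.add(record); }
            @Override public void flush() {}
            @Override public void close() {}
        };
        LOG.addHandler(handler);
        try {
            body.run();
        } finally {
            LOG.removeHandler(handler);
        }
        return records;
    }

    public static void main(String[] args) {
        RuntimeException cause = new RuntimeException("queue closed");
        List<LogRecord> records = capture(() -> error("Migration thread failed", cause));
        LogRecord first = records.get(0);
        // The cause is carried as a separate field rather than flattened into the message.
        System.out.println(first.getMessage() + " / thrown=" + first.getThrown());
    }
}
```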

        Sriram Subramanian added a comment -

        The patch seems to have conflicts with the changes I have. Will fix that.

        2. Exception and Throwable are deliberately caught separately. Throwable includes critical errors, which should never be swallowed. For Exceptions, which are caused mainly by Kafka errors, we log an error and continue consuming. If you are suggesting that we exit on any exception, that is no different from what is there already.

        Sriram Subramanian added a comment -
        • For now, changed the catch statement to log a fatal error instead of writing to stdout. Let us look at the logs if the issue reproduces and find out what the exceptions are.
        Sriram Subramanian made changes -
        Attachment migrationtoolfix-2.patch [ 12566337 ]
        Neha Narkhede added a comment -

        Checked in v2. We'll see how it goes.

        Neha Narkhede made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Neha Narkhede made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Sriram Subramanian
            Reporter:
            Neha Narkhede
          • Votes:
            0
            Watchers:
            2

            Dates

            • Created:
              Updated:
              Resolved:
