Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.2
    • Fix Version/s: 0.7.2
    • Component/s: None
    • Labels:

      Description

      If you remove all log4j.properties from classpath (and prvent any log4j intialization previous to the one which involves kafka appender), and put KAFKA appender on rootLogger, then, even if you exclude the KAFKA appender from logger.kafka and logger.org, you get a Log4j deadlock when zookeeper send thread tries to issue a LOG.info, and for that it locks the root category (as its correct category, say org, is not yet set up in log4j, who is still in activateOptions on kafka appender).

      please read comments on the parent and ask any questions to nmarasoi@adobe.com

        Activity

        nicu marasoiu created issue -
        nicu marasoiu made changes -
        Field Original Value New Value
        Description please read list comment on the parent If you remove all log4j.properties from classpath (and prvent any log4j intialization previous to the one which involves kafka appender), and put KAFKA appender on rootLogger, then, even if you exclude the KAFKA appender from logger.kafka and logger.org, you get a Log4j deadlock when zookeeper send thread tries to issue a LOG.info, and for that it locks the root category (as its correct category, say org, is not yet set up in log4j, who is still in activateOptions on kafka appender).

        please read comments on the parent and ask any questions to nmarasoi@adobe.com
        Hide
        David Arthur added a comment -

        I have seen this as well. Particularly this is an issue with slf4j and the LoggerFactory deadlocking. When the appender is initializing, it creates a Producer which in turn connects to ZooKeeper. The ZooKeeper client attempts to open a logger with slf4j LoggerFactory which causes the deadlock.

        One possible solution would be to defer the Producer initialization until after log4j was all setup.

        Show
        David Arthur added a comment - I have seen this as well. Particularly this is an issue with slf4j and the LoggerFactory deadlocking. When the appender is initializing, it creates a Producer which in turn connects to ZooKeeper. The ZooKeeper client attempts to open a logger with slf4j LoggerFactory which causes the deadlock. One possible solution would be to defer the Producer initialization until after log4j was all setup.
        Hide
        David Arthur added a comment -

        Attaching a fix for this.

        Show
        David Arthur added a comment - Attaching a fix for this.
        David Arthur made changes -
        David Arthur made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 0.7.2 [ 12322475 ]
        Labels patch
        Fix Version/s 0.7.2 [ 12322475 ]
        Hide
        Jun Rao added a comment -

        David, thanks for the patch. Could you find out if AppenderSkeleton#append is synchronized? Once you know the answer, I can commit the patch to 0.7. Do we need to patch 0.8 too?

        Show
        Jun Rao added a comment - David, thanks for the patch. Could you find out if AppenderSkeleton#append is synchronized? Once you know the answer, I can commit the patch to 0.7. Do we need to patch 0.8 too?
        Hide
        David Arthur added a comment -

        It does appear to be synchronized, so that lock can be removed

        https://github.com/apache/log4j/blob/trunk/src/main/java/org/apache/log4j/AppenderSkeleton.java#L228

        I patched against 0.7 since that's what we have in production (and where
        the fix was needed). However, looking at 0.8, it doesn't appear that
        there is an option to connect via ZK. I see that it was removed via
        KAFKA-369. What was the reasoning for this? In our deployment (as with
        many others, I'm sure), ZooKeeper is the only globally known thing so it
        is very useful to be able to connect to Kafka by way of ZooKeeper.

        Show
        David Arthur added a comment - It does appear to be synchronized, so that lock can be removed https://github.com/apache/log4j/blob/trunk/src/main/java/org/apache/log4j/AppenderSkeleton.java#L228 I patched against 0.7 since that's what we have in production (and where the fix was needed). However, looking at 0.8, it doesn't appear that there is an option to connect via ZK. I see that it was removed via KAFKA-369 . What was the reasoning for this? In our deployment (as with many others, I'm sure), ZooKeeper is the only globally known thing so it is very useful to be able to connect to Kafka by way of ZooKeeper.
        Hide
        David Arthur added a comment -

        v2 of the patch

        Show
        David Arthur added a comment - v2 of the patch
        David Arthur made changes -
        Hide
        Jun Rao added a comment -

        Thanks for patch v2. Committed to 0.7 branch.

        In 0.8, the producer issus metadata requests to the brokers to figure out the leader of each partition. So, the producer doesn't need to know ZK. Instead, it needs to know a subset of brokers. One can also use a VIP in front of the brokers.

        Show
        Jun Rao added a comment - Thanks for patch v2. Committed to 0.7 branch. In 0.8, the producer issus metadata requests to the brokers to figure out the leader of each partition. So, the producer doesn't need to know ZK. Instead, it needs to know a subset of brokers. One can also use a VIP in front of the brokers.
        Jun Rao made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Assignee David Arthur [ mumrah ]
        Resolution Fixed [ 1 ]
        Joe Stein made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            David Arthur
            Reporter:
            nicu marasoiu
          • Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development