ZooKeeper / ZOOKEEPER-4296

NullPointerException when ClientCnxnSocketNetty is closed without being opened

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.5.9, 3.5.3, 3.6.3, 3.6.2
    • Fix Version/s: None
    • Component/s: None

      Description

      I believe this bug was originally reported as ZOOKEEPER-2966 but that was closed as not reproducible in February 2019. I left a comment with these details on that issue in December. I can create a PR with a fix at some point this week.

       

      In ZooKeeper 3.6.2, in the context of the SolrJ client, we hit the NPE reported in ZOOKEEPER-2966 when a DNS error causes an exception after the SolrZkClient tries to connect to ZooKeeper and the SolrZkClient then immediately calls close on the ClientCnxn: https://github.com/apache/solr/blob/releases/lucene-solr%2F8.7.0/solr/solrj/src/java/org/apache/solr/common/cloud/SolrZkClient.java#L158-L204

      java.lang.NullPointerException: null
              at org.apache.zookeeper.ClientCnxnSocketNetty.onClosing(ClientCnxnSocketNetty.java:247) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.zookeeper.ClientCnxn$SendThread.close(ClientCnxn.java:1445) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.zookeeper.ClientCnxn.disconnect(ClientCnxn.java:1488) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1517) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:1614) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.solr.common.cloud.SolrZooKeeper.close(SolrZooKeeper.java:97) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
              at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:198) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
              at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:127) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
              at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:122) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
              at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:109) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
      

      This happens if the ClientCnxnSocketNetty's onClosing() is called before connect(...) (or if connect isn't called at all) because the firstConnect CountDownLatch is only initialized in connect(...).
      https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L129
      https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L247
      A null check in onClosing() will fix the NPE, but I don't know whether a larger change is required, e.g. some synchronization around connect(...) and onClosing().

      The code in 3.5.3 looks very similar, so the bug appears to have been present since the initial commit of ClientCnxnSocketNetty.
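
      To illustrate the null check suggested above, here is a minimal sketch. It is not the ZooKeeper source: SocketSketch is a hypothetical stand-in for ClientCnxnSocketNetty, and it assumes that onClosing() counts down the firstConnect latch, which is only assigned in connect(). The method names onClosingUnguarded, onClosingGuarded and pendingCount are invented for the example.

```java
import java.util.concurrent.CountDownLatch;

public class LatchNullCheckSketch {

    // Hypothetical, simplified stand-in for ClientCnxnSocketNetty (not the real class).
    static class SocketSketch {
        // Only assigned in connect(), so it stays null until connect() has run.
        private volatile CountDownLatch firstConnect;

        void connect() {
            firstConnect = new CountDownLatch(1);
        }

        // Unguarded variant: throws NullPointerException if connect() never ran.
        void onClosingUnguarded() {
            firstConnect.countDown();
        }

        // Guarded variant: read the field once into a local, then null-check the local,
        // so a concurrent connect() cannot change the reference between check and use.
        void onClosingGuarded() {
            CountDownLatch latch = firstConnect;
            if (latch != null) {
                latch.countDown();
            }
        }

        // Remaining latch count, or -1 if connect() has not been called yet.
        long pendingCount() {
            CountDownLatch latch = firstConnect;
            return latch == null ? -1 : latch.getCount();
        }
    }

    public static void main(String[] args) {
        SocketSketch socket = new SocketSketch();

        // Closing before connect(): the guarded variant is a harmless no-op.
        socket.onClosingGuarded();

        // The unguarded variant fails in the same way as the reported stack trace.
        try {
            socket.onClosingUnguarded();
        } catch (NullPointerException e) {
            System.out.println("unguarded close before connect threw NullPointerException");
        }

        // After connect(), the guarded variant releases the latch as usual.
        socket.connect();
        socket.onClosingGuarded();
        System.out.println("pending count after connect and close: " + socket.pendingCount());
    }
}
```

      The sketch only covers the null dereference. It does not answer the open question of whether connect(...) and onClosing() need further synchronization; for instance, an onClosing() that runs just before connect(...) assigns the latch would still leave that latch uncounted.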

              People

              • Assignee: Unassigned
              • Reporter: Colvin Cowie (cjcowie)

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 20m