  1. ZooKeeper
  2. ZOOKEEPER-4296

NullPointerException when ClientCnxnSocketNetty is closed without being opened

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.5.9, 3.5.3, 3.6.3, 3.6.2
    • Fix Version/s: 3.9.0
    • None

    Description

      I believe this bug was originally reported as ZOOKEEPER-2966 but that was closed as not reproducible in February 2019. I left a comment with these details on that issue in December. I can create a PR with a fix at some point this week.

       

      In ZooKeeper 3.6.2, in the context of the SolrJ client, we hit the NPE reported in ZOOKEEPER-2966. A DNS error causes an exception when the SolrZkClient tries to connect to ZooKeeper, and the SolrZkClient then immediately calls close on the ClientCnxn: https://github.com/apache/solr/blob/releases/lucene-solr%2F8.7.0/solr/solrj/src/java/org/apache/solr/common/cloud/SolrZkClient.java#L158-L204.

      java.lang.NullPointerException: null
              at org.apache.zookeeper.ClientCnxnSocketNetty.onClosing(ClientCnxnSocketNetty.java:247) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.zookeeper.ClientCnxn$SendThread.close(ClientCnxn.java:1445) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.zookeeper.ClientCnxn.disconnect(ClientCnxn.java:1488) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1517) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:1614) ~[zookeeper-3.6.2.jar:3.6.2]
              at org.apache.solr.common.cloud.SolrZooKeeper.close(SolrZooKeeper.java:97) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
              at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:198) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
              at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:127) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
              at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:122) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
              at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:109) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
      

      This happens if the ClientCnxnSocketNetty's onClosing() is called before connect(...) (or if connect isn't called at all) because the firstConnect CountDownLatch is only initialized in connect(...).
      https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L129
      https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L247
      A null check in onClosing() will fix it, but I don't know if there's any greater change required, e.g. some synchronization around connect and onClosing.
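
      The failure mode and the proposed null check can be illustrated with a minimal, self-contained sketch. This is not the ZooKeeper source: SketchSocket, onClosingUnguarded, onClosingGuarded and throwsNpe are hypothetical names, and the sketch assumes that onClosing() counts the latch down unconditionally. Only the firstConnect field and the fact that it is assigned in connect(...) come from the report above.

```java
import java.util.concurrent.CountDownLatch;

public class LatchNullCheckSketch {

    /** Hypothetical stand-in for the socket class; not the real ClientCnxnSocketNetty. */
    static class SketchSocket {
        // Assigned only in connect(), so it stays null until connect() has run.
        private volatile CountDownLatch firstConnect;

        void connect() {
            firstConnect = new CountDownLatch(1);
        }

        // Assumed shape of the failing code: dereferences the latch unconditionally.
        void onClosingUnguarded() {
            firstConnect.countDown();
        }

        // Null-checked variant: safe to call even if connect() never ran.
        void onClosingGuarded() {
            CountDownLatch latch = firstConnect; // read the field once
            if (latch != null) {
                latch.countDown();
            }
        }
    }

    /** Runs the action and reports whether it threw a NullPointerException. */
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SketchSocket socket = new SketchSocket(); // connect() is never called
        System.out.println("unguarded throws NPE: " + throwsNpe(socket::onClosingUnguarded));
        System.out.println("guarded throws NPE: " + throwsNpe(socket::onClosingGuarded));
    }
}
```

      Copying the field into a local before the check avoids reading it twice, but the sketch does not address the open question of whether connect and onClosing need further synchronization.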

      The code in 3.5.3 looks very similar, so the bug appears to have been present since the initial commit of ClientCnxnSocketNetty.

            People

              eolivelli Enrico Olivelli
              colvinco Colvin Cowie
              Votes: 2
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h