IGNITE-12013

NullPointerException is thrown by ExchangeLatchManager during cache creation


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7
    • Fix Version/s: 2.9, 2.8.1
    • Component/s: None
    • Labels: None

    Description

      A NullPointerException may be thrown during a cluster topology change:

      [14:15:49,820][SEVERE][exchange-worker-#63][GridDhtPartitionsExchangeFuture] Failed to reinitialize local partitions (rebalancing will be stopped): GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=468, minorTopVer=1], discoEvt=DiscoveryCustomEvent [customMsg=DynamicCacheChangeBatch [id=728f11e1c61-11d31f36-508d-47e0-9a9c-d4f5a270948d, reqs=[DynamicCacheChangeRequest [cacheName=SQL_PUBLIC_UPRIYA_112093_TB, hasCfg=true, nodeId=10a0b1a4-09bb-4aa6-81e0-537a6431283b, clientStartOnly=false, stop=false, destroy=false, disabledAfterStartfalse]], exchangeActions=ExchangeActions [startCaches=[SQL_PUBLIC_UPRIYA_112093_TB], stopCaches=null, startGrps=[SQL_PUBLIC_UPRIYA_112093_TB], stopGrps=[], resetParts=null, stateChangeRequest=null], startCaches=false], affTopVer=AffinityTopologyVersion [topVer=468, minorTopVer=1], super=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=10a0b1a4-09bb-4aa6-81e0-537a6431283b, addrs=[0:0:0:0:0:0:0:1%lo, 10.244.1.100, 127.0.0.1], sockAddrs=[/10.244.1.100:0, /0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0], discPort=0, order=39, intOrder=27, lastExchangeTime=1563872413854, loc=false, ver=2.7.0#20181130-sha1:256ae401, isClient=true], topVer=468, nodeId8=6a076901, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1563891349722]], nodeId=10a0b1a4, evt=DISCOVERY_CUSTOM_EVT]
      java.lang.NullPointerException
      at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.canSkipJoiningNodes(ExchangeLatchManager.java:327)
      at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1401)
      at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:806)
      at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2667)
      at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2539)
      at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
      at java.lang.Thread.run(Thread.java:745)
      

      The original topic on the user-list: http://apache-ignite-users.70518.x6.nabble.com/Ignite-2-7-0-server-node-null-pointer-exception-td28899.html

      RESOLUTION
      The issue seems to be caused by a value of IGNITE_DISCOVERY_HISTORY_SIZE that is too small (smaller than the number of nodes joining or leaving the cluster simultaneously). I could not reproduce the issue with the default values of TcpDiscoverySpi#topHistSize and IGNITE_DISCOVERY_HISTORY_SIZE, so I assume that the user changed this property.
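
The condition described above can be sketched as a small standalone check. This is only an illustration: DiscoveryHistoryCheck, its methods, and the fallback argument are hypothetical names that are not part of Ignite, and the sketch assumes the setting is supplied as a JVM system property.

```java
/** Hypothetical helper, not part of Ignite: compares the configured history size with the expected churn. */
final class DiscoveryHistoryCheck {
    /** Property name taken from the issue text. */
    static final String PROP = "IGNITE_DISCOVERY_HISTORY_SIZE";

    /**
     * Reads the property as a JVM system property.
     * Returns the caller-supplied fallback when it is unset or not a number.
     */
    static int configuredHistorySize(int fallback) {
        String raw = System.getProperty(PROP);

        if (raw == null)
            return fallback;

        try {
            return Integer.parseInt(raw.trim());
        }
        catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** The condition suspected in this issue: fewer history entries than simultaneous joins/leaves. */
    static boolean isLikelyTooSmall(int historySize, int simultaneousTopologyChanges) {
        return historySize < simultaneousTopologyChanges;
    }

    public static void main(String[] args) {
        // Both numbers are assumptions for illustration only.
        int expectedChanges = 50;
        int historySize = configuredHistorySize(expectedChanges);

        if (isLikelyTooSmall(historySize, expectedChanges))
            System.out.println(PROP + "=" + historySize + " may be too small for " + expectedChanges
                + " simultaneous topology changes");
        else
            System.out.println(PROP + " looks sufficient for " + expectedChanges + " simultaneous topology changes");
    }
}
```

The property would typically be passed at node startup, for example as -DIGNITE_DISCOVERY_HISTORY_SIZE=<value>; the right value depends on how many nodes can join or leave at once in the given deployment.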

      So, the NullPointerException was replaced with an IgniteException whose message provides a hint on how to resolve the issue. It might also be a good idea to change the implementation of ExchangeLatchManager so that it uses a DiscoCache instance instead of AffinityTopologyVersion. This approach has pros and cons, so it requires additional investigation.
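
The general shape of this fix can be illustrated with a self-contained sketch. It is not the actual Ignite patch: BoundedTopologyHistory and its methods are hypothetical, IllegalStateException stands in for IgniteException, and the history is modeled as a size-limited map from topology version to node count. Once an old version has been evicted, the lookup fails with an explanatory message instead of an implicit NullPointerException.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of a size-limited topology history; not Ignite code. */
final class BoundedTopologyHistory {
    private final int maxSize;

    /** Insertion-ordered map that drops its oldest entry once maxSize is exceeded. */
    private final LinkedHashMap<Long, Integer> nodeCntByTopVer;

    BoundedTopologyHistory(int maxSize) {
        this.maxSize = maxSize;

        this.nodeCntByTopVer = new LinkedHashMap<Long, Integer>() {
            @Override protected boolean removeEldestEntry(Map.Entry<Long, Integer> eldest) {
                return size() > BoundedTopologyHistory.this.maxSize;
            }
        };
    }

    /** Records the node count observed at the given topology version. */
    void record(long topVer, int nodeCnt) {
        nodeCntByTopVer.put(topVer, nodeCnt);
    }

    /**
     * Returns the node count for the given version.
     * Throws a descriptive exception (a stand-in for IgniteException) if the version was evicted.
     */
    int nodeCount(long topVer) {
        Integer cnt = nodeCntByTopVer.get(topVer);

        if (cnt == null)
            throw new IllegalStateException("Topology version " + topVer
                + " is not in the discovery history (limit: " + maxSize
                + "). Consider increasing IGNITE_DISCOVERY_HISTORY_SIZE.");

        return cnt;
    }

    public static void main(String[] args) {
        BoundedTopologyHistory hist = new BoundedTopologyHistory(2);

        // Three topology changes with a history limit of two: version 1 is evicted.
        hist.record(1L, 3);
        hist.record(2L, 4);
        hist.record(3L, 5);

        System.out.println(hist.nodeCount(3L));

        try {
            hist.nodeCount(1L);
        }
        catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Unboxing the missing value directly would reproduce the original symptom; checking for null first lets the error name the setting that needs to change.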

      Attachments

        1. ignitenullpointer.log
          169 kB
          Vyacheslav Koptilin

            People

              Assignee: Vyacheslav Koptilin (slava.koptilin)
              Reporter: Vyacheslav Koptilin (slava.koptilin)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m