CASSANDRA-15199

Cassandra occasionally throws NPE on 3.11.x


Details

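    • Type: Bug
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: Messaging/Client
    • Labels: None
    • Bug Category: Availability - Unavailable
    • Severity: Normal
    • Complexity: Normal
    • Discovered By: User Report
    • Platform: All
    • Impacts: None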

    Description

      Hey folks, I decided to raise an official Jira (never done one of these before) about an issue we have found when running Kong API Gateway with Cassandra as our db.

      We run a C* cluster across 2 nearby DCs, 6 C* nodes total, 3 in each DC, with data replicated across all nodes. We have found that C* sometimes throws NPEs in response to calls made by the lua-cassandra driver that Kong uses (https://github.com/thibaultcha/lua-cassandra). Specifically, it seems to occur when paging across multiple C* nodes. When we consistently page against a single C* node, we can't reproduce the NPEs.
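
      To make the two access patterns concrete, here is a minimal sketch that only simulates which node would receive each page request. It does not talk to Cassandra, it does not use lua-cassandra, and it does not reproduce the NPE. The function name `page_through`, the node names, and the integer "paging state" are all hypothetical stand-ins (a real paging state is an opaque blob returned by the server).

```python
from itertools import cycle


def page_through(nodes, total_rows, page_size, pin_to_first=False):
    """Simulate a paged read and record which node coordinates each page.

    Returns a list of (coordinator, paging_state) tuples, one per request.
    paging_state is None for the first page; afterwards it is a row offset
    standing in for the opaque state a real server would hand back.
    """
    picker = cycle(nodes)        # round-robin over the nodes (assumed policy)
    requests = []
    paging_state = None
    while True:
        # Either stick to one node, or let round-robin pick the next one.
        coordinator = nodes[0] if pin_to_first else next(picker)
        requests.append((coordinator, paging_state))
        start = paging_state or 0
        end = min(start + page_size, total_rows)
        if end >= total_rows:    # last page reached, no further state
            break
        paging_state = end       # state from this page feeds the next request
    return requests


nodes = ["node1", "node2", "node3"]  # hypothetical node names
spread = page_through(nodes, total_rows=10, page_size=4)
pinned = page_through(nodes, total_rows=10, page_size=4, pin_to_first=True)
```

      In the `spread` case, a paging state produced by one node is presented to a different node on the next request, which is the pattern where we see the failures. In the `pinned` case, every page goes to the same node, which is the pattern where we cannot reproduce them.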

      The exact error C* throws, with its stack trace, can be seen here:

      https://github.com/Kong/kong/issues/4194#issuecomment-497572751

      We saw it again after upgrading from 3.11.2 to 3.11.4 in the hope that it had already been resolved:

      https://github.com/Kong/kong/issues/4194#issuecomment-497590235

      Same error, same line numbers, so the code in this portion must be the same in both versions.

       

      A sample of our C* config: https://github.com/Kong/kong/issues/4194#issuecomment-497595766

      We also discussed this in the ASF Slack:

      https://the-asf.slack.com/archives/CJZLTM05A/p1559321422028200

      Some of the more important technical comments I saw people post are collected here:

      https://github.com/Kong/kong/issues/4194#issuecomment-497858824

       

      I am not a C* DBA, so I don't have an exact repro for you beyond describing what the client application was attempting to do when we saw the failures (paging across multiple C* nodes within a DC). Does anyone from Apache C* see the issue, or could someone drop me a C* JAR with extra debug print statements that I could run in dev to feed you more info? Or, if you see the problem and can one-shot a fix so that Cassandra no longer throws an NPE and instead responds to the client with an error message describing what was wrong, that would be great.

       

      Thanks!

      People

        Assignee: Unassigned
        Reporter: Jeremy Justus (jeremyjustus0916)