Cassandra / CASSANDRA-18739

UDF functions fail to load on rolling restart

Details

    Description

      UDFs fail to reload properly after a rolling restart.

      Symptom:

      A NullPointerException (NPE) is thrown when the UDF is used after the restart.

      Steps to recreate:

      1. Create a cluster using the attached udf_error.cql file.
      2. Populate the cluster with the attached udf_error_data.cql file.
      3. Execute SELECT city_measurements(city, measurement, 16.5) AS m FROM current
      4. Expected: min and max values for the cities.
      5. Perform a rolling restart of one server.
      6. When the server is back up, execute SELECT city_measurements(city, measurement, 16.5) AS m FROM current
      7. Observed: an error result with an NPE message.

      Analysis:

      During system restart, SchemaKeyspace.fetchNonSystemKeyspaces() is called. When a keyspace with a UDF is loaded, the SchemaKeyspace method createUDFFromRow() is called, which in turn calls UDFunction.create(). That eventually reaches the UDFunction constructor, where Schema.instance.getKeyspaceMetadata() is called with the keyspace of the UDF name as its argument. However, that keyspace is still being constructed and is not yet in the instance, so the method returns null for the KeyspaceMetadata. That null KeyspaceMetadata is then used in the udfContext.

      Later, when the UDF is called, any call that needs a method on the keyspaceMetadata, such as udfContext.newUDTValue(), whose implementation uses keyspaceMetadata.types, throws a NullPointerException.
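
The load-order problem described above can be reproduced in isolation. The following is a minimal sketch only: SketchSchema, SketchKeyspace, SketchFunction and LoadOrderSketch are hypothetical stand-ins, not the real Cassandra classes, and the real code paths are considerably more involved. It assumes only what the analysis states: the function looks up its keyspace during construction, before the keyspace has been registered, and the resulting null is not dereferenced until the function is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Schema: a registry of loaded keyspaces.
final class SketchSchema {
    private final Map<String, SketchKeyspace> keyspaces = new HashMap<>();

    SketchKeyspace getKeyspaceMetadata(String name) { return keyspaces.get(name); }

    void register(SketchKeyspace ks) { keyspaces.put(ks.name, ks); }
}

// Hypothetical stand-in for KeyspaceMetadata.
final class SketchKeyspace {
    final String name;
    final Map<String, SketchFunction> functions;

    SketchKeyspace(String name, Map<String, SketchFunction> functions) {
        this.name = name;
        this.functions = functions;
    }
}

// Hypothetical stand-in for UDFunction.
final class SketchFunction {
    final String name;
    private final SketchKeyspace keyspace; // captured once, at construction

    SketchFunction(SketchSchema schema, String keyspaceName, String name) {
        this.name = name;
        // Returns null if the keyspace has not been registered yet.
        this.keyspace = schema.getKeyspaceMetadata(keyspaceName);
    }

    boolean hasKeyspace() { return keyspace != null; }

    // Dereferences the captured keyspace; throws NPE if it was null.
    String qualifiedName() { return keyspace.name + "." + name; }
}

final class LoadOrderSketch {
    // Mimics the startup order: functions are built first, and the keyspace
    // that owns them is registered only afterwards.
    static SketchFunction loadKeyspace(SketchSchema schema, String ksName, String fnName) {
        Map<String, SketchFunction> functions = new HashMap<>();
        SketchFunction fn = new SketchFunction(schema, ksName, fnName);
        functions.put(fnName, fn);
        schema.register(new SketchKeyspace(ksName, functions));
        return fn;
    }

    public static void main(String[] args) {
        SketchFunction fn = loadKeyspace(new SketchSchema(), "temperatures", "city_measurements");
        System.out.println("keyspace captured at construction: " + fn.hasKeyspace());
        try {
            fn.qualifiedName();
        } catch (NullPointerException e) {
            System.out.println("NPE on first use");
        }
    }
}
```

In this sketch the keyspace is registered correctly by the time the function is used, but the function still holds the null it captured earlier, which matches the delayed failure reported above.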

      I have verified this affects versions 4.0, 4.1, and trunk. I have not verified 3.x, but I suspect it is the same there.

      I modified the UDFunction constructor to assert that the metadata is not null and received the following stack trace:

      ERROR [main] 2023-08-09 11:44:46,408 CassandraDaemon.java:911 - Exception encountered during startup
      java.lang.AssertionError: No metadata for temperatures.city_measurements_sfunc
          at org.apache.cassandra.cql3.functions.UDFunction.<init>(UDFunction.java:240)
          at org.apache.cassandra.cql3.functions.JavaBasedUDFunction.<init>(JavaBasedUDFunction.java:195)
          at org.apache.cassandra.cql3.functions.UDFunction.create(UDFunction.java:276)
          at org.apache.cassandra.schema.SchemaKeyspace.createUDFFromRow(SchemaKeyspace.java:1182)
          at org.apache.cassandra.schema.SchemaKeyspace.fetchUDFs(SchemaKeyspace.java:1131)
          at org.apache.cassandra.schema.SchemaKeyspace.fetchFunctions(SchemaKeyspace.java:1119)
          at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:859)
          at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:848)
          at org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:836)
          at org.apache.cassandra.schema.Schema.loadFromDisk(Schema.java:132)
          at org.apache.cassandra.schema.Schema.loadFromDisk(Schema.java:121)
          at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:287)
          at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:765)
          at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:889)
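
A fail-fast check of the kind used to obtain the trace above might look like the following. This is a hedged sketch, not the actual change: FailFastFunction and its constructor parameters are hypothetical, and an explicit throw is used instead of the assert keyword so the sketch does not depend on the JVM running with assertions enabled.

```java
// Hypothetical stand-in for the UDFunction constructor check.
final class FailFastFunction {
    private final Object keyspaceMetadata;

    FailFastFunction(String keyspace, String name, Object keyspaceMetadata) {
        // Fail at construction (startup) rather than at first use of the function.
        if (keyspaceMetadata == null)
            throw new AssertionError("No metadata for " + keyspace + "." + name);
        this.keyspaceMetadata = keyspaceMetadata;
    }

    Object keyspaceMetadata() { return keyspaceMetadata; }
}
```

Such a check does not fix the problem; it only moves the failure from query time to startup, which makes the ordering issue visible.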

       

      Possible solution:

      Version 4.x

      Create a KeyspaceMetadata.Builder class that accepts the types, tables and views but uses a builder for the functions.

      Add a KeyspaceMetadata constructor that accepts the KeyspaceMetadata.Builder, so that the keyspaceMetadata value of each function builder can be set correctly during construction of the KeyspaceMetadata.

      Modify SchemaKeyspace.fetchKeyspace(String) so that it uses the KeyspaceMetadata.Builder.
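
The builder approach proposed for 4.x could be sketched roughly as follows. Everything here is an assumption made for illustration: SketchKsMeta and SketchFn are hypothetical simplifications of KeyspaceMetadata and UDFunction, types are reduced to plain strings, and the type name in the usage example is invented. The point is only the ordering: the keyspace constructor hands itself to each function builder, so no function is ever created with a null keyspace.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for UDFunction: built only once its keyspace exists.
final class SketchFn {
    final String name;
    private final SketchKsMeta keyspace;

    private SketchFn(String name, SketchKsMeta keyspace) {
        this.name = name;
        this.keyspace = Objects.requireNonNull(keyspace, "keyspace metadata");
    }

    SketchKsMeta keyspace() { return keyspace; }

    // Stands in for a call such as newUDTValue() that needs keyspace.types.
    String firstTypeName() { return keyspace.types.get(0); }

    static final class Builder {
        private final String name;

        Builder(String name) { this.name = name; }

        SketchFn build(SketchKsMeta keyspace) { return new SketchFn(name, keyspace); }
    }
}

// Hypothetical stand-in for KeyspaceMetadata.
final class SketchKsMeta {
    final String name;
    final List<String> types;
    final List<SketchFn> functions;

    private SketchKsMeta(Builder b) {
        // Assign name and types first: function builders receive `this`
        // and may read those fields.
        this.name = b.name;
        this.types = Collections.unmodifiableList(new ArrayList<>(b.types));
        List<SketchFn> built = new ArrayList<>();
        for (SketchFn.Builder fb : b.functionBuilders)
            built.add(fb.build(this));
        this.functions = Collections.unmodifiableList(built);
    }

    static final class Builder {
        private final String name;
        private final List<String> types = new ArrayList<>();
        private final List<SketchFn.Builder> functionBuilders = new ArrayList<>();

        Builder(String name) { this.name = name; }

        Builder addType(String type) { types.add(type); return this; }

        Builder addFunction(SketchFn.Builder fb) { functionBuilders.add(fb); return this; }

        SketchKsMeta build() { return new SketchKsMeta(this); }
    }

    public static void main(String[] args) {
        SketchKsMeta ks = new Builder("temperatures")
                .addType("measurement_type") // hypothetical type name
                .addFunction(new SketchFn.Builder("city_measurements_sfunc"))
                .build();
        System.out.println(ks.functions.get(0).firstTypeName());
    }
}
```

One trade-off of this design is that `this` escapes the constructor before the object is fully initialized, so the order of field assignments in the constructor matters.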

       

      Version 5.x

      Similar to 4.x, except that the KeyspaceMetadata.Builder will also need builders for Views and Tables, because the functions necessary to construct those objects will not be available until the KeyspaceMetadata.Builder constructs it.

      Attachments

        1. udf_error.cql
          1 kB
          Claude Warren
        2. udf_error_data.cql
          9 kB
          Claude Warren

          People

            claude Claude Warren
            claude Claude Warren
            Claude Warren, Stefan Miklosovic
            Andres de la Peña, Stefan Miklosovic
            Votes: 0
            Watchers: 4

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 2h 40m