Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-9967

Determine if a Materialized View is finished building, without having to query each node

    Details

      Description

      Since MVs are eventually consistent with its base table, It would nice if we could easily know the state of the MV after its creation, so we could wait until the MV is built before doing some operations.

      // cc Matthias Broecheler T Jake Luciani Carl Yeksigian Ryan McGuire

        Issue Links

          Activity

          Hide
          szhou Simon Zhou added a comment -

          Carl Yeksigian Is it possible to backport this patch to 3.0? Thanks.

          Show
          szhou Simon Zhou added a comment - Carl Yeksigian Is it possible to backport this patch to 3.0? Thanks.
          Hide
          carlyeks Carl Yeksigian added a comment -

          Rebased and committed as 21448c5.

          Thanks for the review, Joel Knighton!

          Show
          carlyeks Carl Yeksigian added a comment - Rebased and committed as 21448c5 . Thanks for the review, Joel Knighton !
          Hide
          jkni Joel Knighton added a comment -

          CI looks clean relative to upstream. Changes look good.

          +1 - this needs a squash and rebase; feel free to set to "Ready to Commit" after that.

          Show
          jkni Joel Knighton added a comment - CI looks clean relative to upstream. Changes look good. +1 - this needs a squash and rebase; feel free to set to "Ready to Commit" after that.
          Hide
          carlyeks Carl Yeksigian added a comment -

          I just pushed an update to address Joel's comments and CI is rerunning now.

          Show
          carlyeks Carl Yeksigian added a comment - I just pushed an update to address Joel's comments and CI is rerunning now.
          Hide
          carlyeks Carl Yeksigian added a comment -

          I haven't gotten back to this yet, but I will in the next couple of days.

          Show
          carlyeks Carl Yeksigian added a comment - I haven't gotten back to this yet, but I will in the next couple of days.
          Hide
          jbellis Jonathan Ellis added a comment -

          Carl, how is this looking?

          Show
          jbellis Jonathan Ellis added a comment - Carl, how is this looking?
          Hide
          jjordan Jeremiah Jordan added a comment - - edited

          AFAIK the goal of executeInternal is to only execute on the current node and not be a distributed query.

          Show
          jjordan Jeremiah Jordan added a comment - - edited AFAIK the goal of executeInternal is to only execute on the current node and not be a distributed query.
          Hide
          jkni Joel Knighton added a comment -

          Sorry for the delay.

          utests and dtests look clean compared to upstream. The added dtests pass locally for me. A dtest that covers calling the command after dropping a view would be nice.

          In SystemKeyspace

          • Now might be a good time to fix the comment above finishViewBuildStatus that we discussed ("the build succeeded, but the view build failed")
          • Changing viewBuiltReplicated to setViewBuiltReplicated better matches the naming convention used elsewhere in the class
          • The forceBlockingFlush at end of viewBuiltReplicated is redundant with the call at the end of setViewBuilt

          In QueryProcessor

          • As we discussed, there's other improvements to be made in QueryProcessor for clarity. It seems to me that the executeInternal you added might not be appropriately named - it calls statement.execute rather than statement.executeInternal. (It's not entirely clear whether the internal in QueryProcessor.executeInternal is to reflect that it uses internal QueryState or executeInternal on PreparedStatement, but it seems to me to indicate the latter.)

          In StorageService

          • In getViewBuildStatuses, retrieving the hostIdToEndpoint map at a different time than calling getEndpointToHostIdMapForReading() opens up a race with changes in TokenMetadata that could result in the endpoint being null. It seems to me cleaner to retrieve the endpointToHostIdMapForReading and iterate over its entries rather than maintaining two separate maps.
          Show
          jkni Joel Knighton added a comment - Sorry for the delay. utests and dtests look clean compared to upstream. The added dtests pass locally for me. A dtest that covers calling the command after dropping a view would be nice. In SystemKeyspace Now might be a good time to fix the comment above finishViewBuildStatus that we discussed ("the build succeeded, but the view build failed") Changing viewBuiltReplicated to setViewBuiltReplicated better matches the naming convention used elsewhere in the class The forceBlockingFlush at end of viewBuiltReplicated is redundant with the call at the end of setViewBuilt In QueryProcessor As we discussed, there's other improvements to be made in QueryProcessor for clarity. It seems to me that the executeInternal you added might not be appropriately named - it calls statement.execute rather than statement.executeInternal . (It's not entirely clear whether the internal in QueryProcessor.executeInternal is to reflect that it uses internal QueryState or executeInternal on PreparedStatement, but it seems to me to indicate the latter.) In StorageService In getViewBuildStatuses , retrieving the hostIdToEndpoint map at a different time than calling getEndpointToHostIdMapForReading() opens up a race with changes in TokenMetadata that could result in the endpoint being null. It seems to me cleaner to retrieve the endpointToHostIdMapForReading and iterate over its entries rather than maintaining two separate maps.
          Hide
          carlyeks Carl Yeksigian added a comment -

          I've pushed a version which uses the system_distributed keyspace to do this.

          trunk
          branch
          utest
          dtest

          Also added some dtests.

          Show
          carlyeks Carl Yeksigian added a comment - I've pushed a version which uses the system_distributed keyspace to do this. trunk branch utest dtest Also added some dtests .
          Hide
          carlyeks Carl Yeksigian added a comment -

          A few ideas if someone wants to pick this up:

          • We should use the system_distributed keyspace for this, and I think the primary key should be: (table_id, host_id)
          • We should retry updating the table if we don't succeed, and make sure that on startup we have captured all of the builds that we have locally in the distributed table
          • We need to make sure that we handle node membership properly
          • We should make sure that we set the exit code if the view isn't built yet
          Show
          carlyeks Carl Yeksigian added a comment - A few ideas if someone wants to pick this up: We should use the system_distributed keyspace for this, and I think the primary key should be: (table_id, host_id) We should retry updating the table if we don't succeed, and make sure that on startup we have captured all of the builds that we have locally in the distributed table We need to make sure that we handle node membership properly We should make sure that we set the exit code if the view isn't built yet
          Hide
          jbellis Jonathan Ellis added a comment -

          I don't think we can scope creep the original request ("without having to query each node") into 3.0.

          I don't think we want every node in the cluster querying every other node for MV progress. And we can't just assume that originating coordinator is the master, because it might go down before the MV is finished. Which means we need to elect a mv master to do this. (Or better yet a push model instead of pull, which also requires master election.)

          Pushing to 3.x unless someone has a better idea very very quickly.

          Show
          jbellis Jonathan Ellis added a comment - I don't think we can scope creep the original request ("without having to query each node") into 3.0. I don't think we want every node in the cluster querying every other node for MV progress. And we can't just assume that originating coordinator is the master, because it might go down before the MV is finished. Which means we need to elect a mv master to do this. (Or better yet a push model instead of pull, which also requires master election.) Pushing to 3.x unless someone has a better idea very very quickly.
          Hide
          carlyeks Carl Yeksigian added a comment -

          This is stored in the local system.built_materialized_views table.

          SELECT * FROM system.built_materialized_view WHERE keyspace_name='<ks>' AND view_name='<view>'
          
          Show
          carlyeks Carl Yeksigian added a comment - This is stored in the local system.built_materialized_views table. SELECT * FROM system.built_materialized_view WHERE keyspace_name='<ks>' AND view_name='<view>'
          Hide
          jbellis Jonathan Ellis added a comment -

          For 3.0 you can query each node's local state. (Carl Yeksigian, can you explain how?)

          For 3.x I agree it would be useful to simplify this.

          Show
          jbellis Jonathan Ellis added a comment - For 3.0 you can query each node's local state. ( Carl Yeksigian , can you explain how?) For 3.x I agree it would be useful to simplify this.

            People

            • Assignee:
              carlyeks Carl Yeksigian
              Reporter:
              aboudreault Alan Boudreault
              Reviewer:
              Joel Knighton
              Tester:
              Alan Boudreault
            • Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development