For some time now the index subsystem has been a pain point, in large part because the APIs and principal classes have grown organically over the years. It would be worth conducting a wholesale review of the area to see if we can come up with something more coherent.
A few starting points:
- There's a lot in AbstractPerColumnSecondaryIndex & its subclasses which could be pulled up into SecondaryIndexSearcher (note that to an extent, this is done in
- SecondaryIndexManager is overly complex and several of its functions should be simplified/re-examined. The handling of which columns are indexed and index selection on both the read and write paths are somewhat dense and unintuitive.
- The SecondaryIndex class hierarchy is rather convoluted and could use some serious rework.
There are a number of outstanding tickets which we should be able to roll into this higher level one as subtasks (but I'll defer doing that until getting into the details of the redesign):
Whilst they're not hard dependencies, I propose that this be done on top of both CASSANDRA-8099 and CASSANDRA-6717. The former is largely because the storage engine changes may facilitate a friendlier index API, but also because of the changes to SecondaryIndexSearcher mentioned above. As for CASSANDRA-6717, the schema table changes there will help with CASSANDRA-7771.