Thanks James Taylor for the reveiws!
I think we can get away with just one additional column in SYSTEM.CATALOG. I don't think we need INDEX_NEED_PARTIALLY_REBUILD. Just use a INDEX_DISABLE_TIMESTAMP value of 0 or null to know that a rebuild is not necessary.
You're right. I was thinking a use case to temporally disable rebuild of a disabled index. If we overload the column INDEX_DISABLE_TIMESTAMP, we have to rebuild the whole index because its value will be 0 at that point. What do you think? I can remove INDEX_NEED_PARTIALLY_REBUILD if such use case isn't needed.
Did you run into any issues opening a Phoenix JDBC connection from the server-side in MetaDataRegionObserver? It would add a new dependency on the antlr jar being available on the server-side. Plus, is everything available from a coprocessor that we need (i.e. can it act just like an HBase client)?
That's a good point. I can add the dependency on antlar for phone-core jar. Each RS can act as an HBase client.
Is the change from calling recoveryWriter.writeAndKillYourselfOnFailure(indexUpdates) to unconditionally calling recoveryWriter.write(indexUpdates) intentional?
Yes, that's by intention. The reason we abort server during normal write path is that we write updates into WAL firstly, then send them to index region server and commit changes on current data region server. Since changes are already in WAL and we have to roll-forward, we can only abort RS to avoid inconsistency read between index & data region.
While in recovery, no new WAL and data region isn't online(no inconsistency issue because only index region is online) so no need to abort data region server again because data region is already offline anyway.
During the whole recovered edits replay, Index region at most will see one new change nor at all. This is all right because in normal situation index region can have one change ahead of data region. Therefore when we failed to update index during recovery, we can just let exception bubble up to fail the data region open and the region will be re-assigned somewhere and retry WAL edits replay later.
Can you please use static constants for config parameter names (define them in QueryServices with the others) and static constants for default values (define them in QueryServicesOptions)?
Would you mind filing a subtask to update the secondary index docs?
Sure, let me do that. Thanks.