  1. Apache Cassandra
  2. CASSANDRA-18714

Expand CQLSSTableWriter to write SSTable-attached secondary indexes

Details

    Description

      CQLSSTableWriter currently has no way to write secondary indexes inline as it writes the core SSTable components. With SAI, this has become a tractable problem, and we should be able to enhance both CQLSSTableWriter and SSTableImporter to handle cases where we want to write SSTables somewhere in bulk (and in parallel) and then import them without waiting for index building on import. This would require the following changes:

      1.) CQLSSTableWriter must accept 2i definitions on top of its current table schema definition. Once added to the schema, any ColumnFamilyStore instances opened will have those 2i defined in their index managers.

      2.) All AbstractSSTableSimpleWriter instances must register index groups, allowing the proper SSTableFlushObservers to be attached to SSTableWriter. Once this is done, SAI (and any other SSTable-attached index) components will be built incrementally along with the SSTable data file and finalized when the newly written SSTable is finalized.

      3.) Provide an example (in a unit test?) of how a third-party tool might, assuming access to the right C* JAR, validate/checksum SAI components outside C* proper.

      4.) SSTableImporter should have two new options:
      a.) an option that fails the import if any SSTable-attached 2i must be built (i.e. it has not already been built and brought along with the other new SSTable components)
      b.) an option that allows us to bypass full checksum validation on imported/already-built SSTable-attached indexes (assuming they have just been written by CQLSSTableWriter)
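
      The interaction of the two options proposed in 4.) can be sketched as a small decision function. This is only a model of the intended semantics, not SSTableImporter code: the class name, the enum values, and the parameter names below are all hypothetical and do not correspond to any existing C* API or nodetool flag.

```java
// Hypothetical model of the two proposed import options; none of these
// names exist in Cassandra. It only captures the decision described in 4.).
final class ImportOptionsSketch {

    enum Action {
        FAIL_IMPORT,                // 4a: index missing and the caller forbids building it on import
        BUILD_INDEX_ON_IMPORT,      // index missing, built during import as a fallback
        IMPORT_WITH_FULL_CHECKSUM,  // index present, fully validated before use
        IMPORT_SKIP_FULL_CHECKSUM   // 4b: index present, full checksum validation bypassed
    }

    /**
     * @param indexComponentsPresent true if the SSTable arrived with its attached index components
     * @param failIfIndexBuildNeeded proposed option 4a (assumed name)
     * @param skipFullIndexChecksum  proposed option 4b (assumed name)
     */
    static Action decide(boolean indexComponentsPresent,
                         boolean failIfIndexBuildNeeded,
                         boolean skipFullIndexChecksum) {
        if (!indexComponentsPresent) {
            // Nothing to validate; the only question is whether a build is allowed.
            return failIfIndexBuildNeeded ? Action.FAIL_IMPORT : Action.BUILD_INDEX_ON_IMPORT;
        }
        // Components were brought along, so only the validation depth is in question.
        return skipFullIndexChecksum ? Action.IMPORT_SKIP_FULL_CHECKSUM
                                     : Action.IMPORT_WITH_FULL_CHECKSUM;
    }

    public static void main(String[] args) {
        // SSTable written without index components, strict import requested.
        System.out.println(decide(false, true, false));
        // SSTable freshly written with index components, checksum bypass requested.
        System.out.println(decide(true, true, true));
    }
}
```

      In this model, option 4b has no effect when the index components are missing, since there is nothing to validate; whether the real implementation should treat that combination as an error is left open here.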

            People

              smiklosovic Stefan Miklosovic
              maedhroz Caleb Rackliffe
              Stefan Miklosovic
              Caleb Rackliffe, Doug Rohrer
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 14h