Details

      Description

      We have recently released new secondary index engine (https://github.com/xedin/sasi) build using SecondaryIndex API, there are still couple of things to work out regarding 3.x since it's currently targeted on 2.0 released. I want to make this an umbrella issue to all of the things related to integration of SASI, which are also tracked in sasi_issues, into mainline Cassandra 3.x release.

        Issue Links

          Activity

          Hide
          dbrosius Dave Brosius added a comment -

          Dave Brosius Can you expand further why someMapEntry.equals(someAbstractTrie) will always be false ?

          darn it, i misread. the equals isn't on AbstractTrie, but on a subclass. so yes you are right. Sorry for the noise.

          Show
          dbrosius Dave Brosius added a comment - Dave Brosius Can you expand further why someMapEntry.equals(someAbstractTrie) will always be false ? darn it, i misread. the equals isn't on AbstractTrie, but on a subclass. so yes you are right. Sorry for the noise.
          Hide
          doanduyhai DOAN DuyHai added a comment -

          Dave Brosius Can you expand further why someMapEntry.equals(someAbstractTrie) will always be false ?

          According to the contract of Map.Entry::equals, as long as the key and value are equal, the equality holds.

          I've tried an unit test and it works:

          public class AbstractTrieTest
          {
              
              @Test
              public void should_test_equality() throws Exception {
                  Map<String, Long> map = new HashMap<>();
                  map.put("10", 10L);
          
                  final Map.Entry<String, Long> mapEntry = map.entrySet().iterator().next();
          
                  final AbstractTrie.BasicEntry<String, Long> trieEntry = new AbstractPatriciaTrie.TrieEntry<>("10", 10L, 0);
          
                  Assert.assertTrue("mapEntry.equals(trieEntry)", mapEntry.equals(trieEntry));
              }
          }
          
          % ant testsome -Dtest.name=org.apache.cassandra.index.sasi.utils.AbstractTrieTest -Dtest.methods=should_test_equality                   
          ...
          testsome:
              [junit] WARNING: multiple versions of ant detected in path for junit
              [junit]          jar:file:/usr/local/Cellar/ant/1.9.4/libexec/lib/ant.jar!/org/apache/tools/ant/Project.class
              [junit]      and jar:file:/Users/archinnovinfo/perso/cassandra/build/lib/jars/ant-1.9.4.jar!/org/apache/tools/ant/Project.class
              [junit] Testsuite: org.apache.cassandra.index.sasi.utils.AbstractTrieTest
              [junit] Testsuite: org.apache.cassandra.index.sasi.utils.AbstractTrieTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.058 sec
          
          Show
          doanduyhai DOAN DuyHai added a comment - Dave Brosius Can you expand further why someMapEntry.equals(someAbstractTrie) will always be false ? According to the contract of Map.Entry::equals , as long as the key and value are equal, the equality holds. I've tried an unit test and it works: public class AbstractTrieTest { @Test public void should_test_equality() throws Exception { Map< String , Long > map = new HashMap<>(); map.put( "10" , 10L); final Map.Entry< String , Long > mapEntry = map.entrySet().iterator().next(); final AbstractTrie.BasicEntry< String , Long > trieEntry = new AbstractPatriciaTrie.TrieEntry<>( "10" , 10L, 0); Assert.assertTrue( "mapEntry.equals(trieEntry)" , mapEntry.equals(trieEntry)); } } % ant testsome -Dtest.name=org.apache.cassandra.index.sasi.utils.AbstractTrieTest -Dtest.methods=should_test_equality ... testsome: [junit] WARNING: multiple versions of ant detected in path for junit [junit] jar:file:/usr/local/Cellar/ant/1.9.4/libexec/lib/ant.jar!/org/apache/tools/ant/Project.class [junit] and jar:file:/Users/archinnovinfo/perso/cassandra/build/lib/jars/ant-1.9.4.jar!/org/apache/tools/ant/Project.class [junit] Testsuite: org.apache.cassandra.index.sasi.utils.AbstractTrieTest [junit] Testsuite: org.apache.cassandra.index.sasi.utils.AbstractTrieTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.058 sec
          Hide
          dbrosius Dave Brosius added a comment -

          AbstractTrie.equals isn't symmetric..

          someAbstractTrie.equals(someMapEntry) could be true, but

          someMapEntry.equals(someAbstractTrie) will always be false.

          that is unstable.

          Show
          dbrosius Dave Brosius added a comment - AbstractTrie.equals isn't symmetric.. someAbstractTrie.equals(someMapEntry) could be true, but someMapEntry.equals(someAbstractTrie) will always be false. that is unstable.
          Hide
          giaosudau Chanh Le added a comment -

          Pavel Yaskevich Thank man. You got my day.

          Show
          giaosudau Chanh Le added a comment - Pavel Yaskevich Thank man. You got my day.
          Hide
          xedin Pavel Yaskevich added a comment - - edited

          Hi Chanh Le, the name of the index class is 'org.apache.cassandra.index.sasi.SASIIndex' you most likely reading documentation specific for 2.0, here is the updated doc https://github.com/apache/cassandra/blob/trunk/doc/SASI.md, it resides in doc/ folder of Apache Cassandra distribution.

          Edit: also NonTokenizingAnalyzer is located in 'org.apache.cassandra.index.sasi.analyzer' as well.

          Show
          xedin Pavel Yaskevich added a comment - - edited Hi Chanh Le , the name of the index class is 'org.apache.cassandra.index.sasi.SASIIndex' you most likely reading documentation specific for 2.0, here is the updated doc https://github.com/apache/cassandra/blob/trunk/doc/SASI.md , it resides in doc/ folder of Apache Cassandra distribution. Edit: also NonTokenizingAnalyzer is located in 'org.apache.cassandra.index.sasi.analyzer' as well.
          Hide
          giaosudau Chanh Le added a comment -

          Hi I am using cassandra 3.5 and I have problem when create index with that.
          CREATE CUSTOM INDEX ON bar (fname) USING
          'org.apache.cassandra.db.index.SSTableAttachedSecondaryIndex'
          WITH OPTIONS = {
          'analyzer_class':
          'org.apache.cassandra.db.index.sasi.analyzer.NonTokenizingAnalyzer',
          'case_sensitive': 'false'
          };

          it throws: unable to find custom indexer class 'org.apache.cassandra.db.index.SSTableAttachedSecondaryIndex

          Show
          giaosudau Chanh Le added a comment - Hi I am using cassandra 3.5 and I have problem when create index with that. CREATE CUSTOM INDEX ON bar (fname) USING 'org.apache.cassandra.db.index.SSTableAttachedSecondaryIndex' WITH OPTIONS = { 'analyzer_class': 'org.apache.cassandra.db.index.sasi.analyzer.NonTokenizingAnalyzer', 'case_sensitive': 'false' }; it throws: unable to find custom indexer class 'org.apache.cassandra.db.index.SSTableAttachedSecondaryIndex
          Hide
          xedin Pavel Yaskevich added a comment -

          Yes, it does account for the range specified and even not going to touch unnecessary SSTables which are not in the range.

          Show
          xedin Pavel Yaskevich added a comment - Yes, it does account for the range specified and even not going to touch unnecessary SSTables which are not in the range.
          Hide
          doanduyhai DOAN DuyHai added a comment -

          Last but not least. We have an use-case for secondary index that is pretty interesting. With the Spark/Cassandra connector, since each Spark partition is distributed according to Cassandra token ranges, each CQL query is node local.

          Consequently a

          SELECT * FROM table WHERe col=xxx`
          

          is transformed by the connector into

          SELECT * FROM table WHERE col=xxx AND token(pk)>=aaa AND token(pk) <= bbb
          

          The normal CQL query engine is token-aware and will restrict the secondary index query on the given range. Does SASI query planner also take into account token range restrictions to avoid hitting un-necessary nodes ?

          Show
          doanduyhai DOAN DuyHai added a comment - Last but not least. We have an use-case for secondary index that is pretty interesting. With the Spark/Cassandra connector, since each Spark partition is distributed according to Cassandra token ranges, each CQL query is node local . Consequently a SELECT * FROM table WHERe col=xxx` is transformed by the connector into SELECT * FROM table WHERE col=xxx AND token(pk)>=aaa AND token(pk) <= bbb The normal CQL query engine is token-aware and will restrict the secondary index query on the given range. Does SASI query planner also take into account token range restrictions to avoid hitting un-necessary nodes ?
          Hide
          xedin Pavel Yaskevich added a comment -

          Pushed as squashed commit 72790dc. I'm going to resolve this issue and promote CASSANDRA-10765 from sub-task.

          Show
          xedin Pavel Yaskevich added a comment - Pushed as squashed commit 72790dc . I'm going to resolve this issue and promote CASSANDRA-10765 from sub-task.
          Hide
          xedin Pavel Yaskevich added a comment -

          So is this stuff actually ready to release? I mean, consistent with the new philosophy that "trunk is always releasable"? IOW, if it does get committed, it will be in 3.4 no matter what? I only ask because it just seemed that there was stuff in flux fairly recently (a couple days ago), suggested it wasn't quite baked enough to be considered "releasable".

          Yes, the stuff is ready to release since fairly recently added changes are ported from 2.0 and clustering support is just couple of lines of additional filtering added, no internal data structure changes, this is also opt-in feature which is irrelevant for core functionality until enabled. This is also the reason why we don't want do any of the CQL front-end related changes right away but rather more gradual migration.

          Show
          xedin Pavel Yaskevich added a comment - So is this stuff actually ready to release? I mean, consistent with the new philosophy that "trunk is always releasable"? IOW, if it does get committed, it will be in 3.4 no matter what? I only ask because it just seemed that there was stuff in flux fairly recently (a couple days ago), suggested it wasn't quite baked enough to be considered "releasable". Yes, the stuff is ready to release since fairly recently added changes are ported from 2.0 and clustering support is just couple of lines of additional filtering added, no internal data structure changes, this is also opt-in feature which is irrelevant for core functionality until enabled. This is also the reason why we don't want do any of the CQL front-end related changes right away but rather more gradual migration.
          Hide
          jrwest Jordan West added a comment - - edited

          Is there also a way to query a SASI-indexed column by exact value? I mean, it seems as if by enabling prefix or contains, that it will always query by prefix or contains. For example, if I want to query for full first name, like where their full first name really is "J" and not get "John" and "James" as well, while at other times I am indeed looking for names starting with a prefix of "Jo" for "John", "Joseph", etc.

          The example is correct, but this is not a limitation of SASI, its a limitation in CQL, and we decided not to further extend the grammar, since we have already had to scale back our grammar changes to later phases (removing OR, grouping, and != support for now). Ideally, `=` would mean exact match and CQL would support a `LIKE` operator similar to SQL, and depending on if the index was created with `PREFIX` or `CONTAINS` we would allow/disallow forms such as `%Jo%` or `_j%`.

          Will SPARSE mode in fact give me an exact match? (Sounds like it.) In which case, would I be better off with a SPARSE index for first_name_full, or would a traditional Cassandra non-custom index work fine (or even better.)

          It does, but so are all queries on numerical data, which thinking about it, may make the `PREFIX` option confusing for numeric types. SPARSE is intended to improve query performance on numerical data where there are a large number of terms (e.g. timestamps), but small number of keys per term (e.g. some timeseries data). `SPARSE` should not be used on every numerical column, and for most non-numerical data is not an ideal setting either. For example, in a large data set of first names the number of names will be small compared to the number of keys, and given the distribution of first names using SPARSE will increase the size of the index and at best have zero effect on query performance, but may hurt it.

          Show
          jrwest Jordan West added a comment - - edited Is there also a way to query a SASI-indexed column by exact value? I mean, it seems as if by enabling prefix or contains, that it will always query by prefix or contains. For example, if I want to query for full first name, like where their full first name really is "J" and not get "John" and "James" as well, while at other times I am indeed looking for names starting with a prefix of "Jo" for "John", "Joseph", etc. The example is correct, but this is not a limitation of SASI, its a limitation in CQL, and we decided not to further extend the grammar, since we have already had to scale back our grammar changes to later phases (removing OR, grouping, and != support for now). Ideally, `=` would mean exact match and CQL would support a `LIKE` operator similar to SQL, and depending on if the index was created with `PREFIX` or `CONTAINS` we would allow/disallow forms such as `%Jo%` or `_j%`. Will SPARSE mode in fact give me an exact match? (Sounds like it.) In which case, would I be better off with a SPARSE index for first_name_full, or would a traditional Cassandra non-custom index work fine (or even better.) It does, but so are all queries on numerical data, which thinking about it, may make the `PREFIX` option confusing for numeric types. SPARSE is intended to improve query performance on numerical data where there are a large number of terms (e.g. timestamps), but small number of keys per term (e.g. some timeseries data). `SPARSE` should not be used on every numerical column, and for most non-numerical data is not an ideal setting either. For example, in a large data set of first names the number of names will be small compared to the number of keys, and given the distribution of first names using SPARSE will increase the size of the index and at best have zero effect on query performance, but may hurt it.
          Hide
          rustyrazorblade Jon Haddad added a comment - - edited

          If sparse means what Jack is implying, perhaps a better name for it would be EXACT.

          Using SPARSE will usually result in people asking "what does that mean", and the answer will be "exact match" so I propose we just use that as it'll cut down on the number of questions people have.

          Show
          rustyrazorblade Jon Haddad added a comment - - edited If sparse means what Jack is implying, perhaps a better name for it would be EXACT. Using SPARSE will usually result in people asking "what does that mean", and the answer will be "exact match" so I propose we just use that as it'll cut down on the number of questions people have.
          Hide
          jkrupan Jack Krupansky added a comment - - edited

          Is there also a way to query a SASI-indexed column by exact value? I mean, it seems as if by enabling prefix or contains, that it will always query by prefix or contains. For example, if I want to query for full first name, like where their full first name really is "J" and not get "John" and "James" as well, while at other times I am indeed looking for names starting with a prefix of "Jo" for "John", "Joseph", etc.

          Or, can I indeed have two indexes on a single column, one a traditional exact match, and one a prefix match. Hmmm... in which case, which gets used if I just specify a column name?

          CREATE INDEX first_name_full ON mytable (first_name)...
          CREATE CUSTOM INDEX first_name_prefix ON mytable (first_name)...

          (I may be confused here - can you specify an index name in place of a column name in a relation in a SELECT/WHERE clause (SELECT... WHERE... first_name_exact = 'Joe')? I don't see any doc/spec that indicates that you can. I'm not sure why I thought that you could. But I don't see any code that detects and fails on this case at CREATE INDEX time. The code checks for "everything but name" rather than detecting two non-keys/values indexes on the same column.)

          It would be good to have an example that illustrates this. In fact, I would argue that first and last names are perfect examples of where you really do need to query on both exact match and partial match. In fact, I'm not sure I can think of any examples of non-tokenized text fields where you don't want to reserve the ability to find an exact match even if you do need partial matches for some queries.

          Will SPARSE mode in fact give me an exact match? (Sounds like it.) In which case, would I be better off with a SPARSE index for first_name_full, or would a traditional Cassandra non-custom index work fine (or even better.)

          Are there any use cases of traditional Cassandra indexes which shouldn't almost automatically be converted to SPARSE. After all, the current recommended best practice is to avoid secondary indexes where the column cardinality is either very high or very low, which seems to be a match for SPARSE, although the precise meaning of SPARSE is still a bit fuzzy for me.

          Maybe, for the first_name use case I mentioned the user would be better off with a first_name Materialized View using first_name in the PK instead of the SPARSE SASI index. In fact, by placing first_name in the partition key of the MV I could assure that all base table rows with the same first name would be on the same node.

          If all of that is true, we will need to give users some decent guidance on when to use SPARSE SASI vs. MV (vs. classic secondary... or even DSE Search.)

          Show
          jkrupan Jack Krupansky added a comment - - edited Is there also a way to query a SASI-indexed column by exact value? I mean, it seems as if by enabling prefix or contains, that it will always query by prefix or contains. For example, if I want to query for full first name, like where their full first name really is "J" and not get "John" and "James" as well, while at other times I am indeed looking for names starting with a prefix of "Jo" for "John", "Joseph", etc. Or, can I indeed have two indexes on a single column, one a traditional exact match, and one a prefix match. Hmmm... in which case, which gets used if I just specify a column name? CREATE INDEX first_name_full ON mytable (first_name)... CREATE CUSTOM INDEX first_name_prefix ON mytable (first_name)... (I may be confused here - can you specify an index name in place of a column name in a relation in a SELECT/WHERE clause (SELECT... WHERE... first_name_exact = 'Joe')? I don't see any doc/spec that indicates that you can. I'm not sure why I thought that you could. But I don't see any code that detects and fails on this case at CREATE INDEX time. The code checks for "everything but name" rather than detecting two non-keys/values indexes on the same column.) It would be good to have an example that illustrates this. In fact, I would argue that first and last names are perfect examples of where you really do need to query on both exact match and partial match. In fact, I'm not sure I can think of any examples of non-tokenized text fields where you don't want to reserve the ability to find an exact match even if you do need partial matches for some queries. Will SPARSE mode in fact give me an exact match? (Sounds like it.) In which case, would I be better off with a SPARSE index for first_name_full, or would a traditional Cassandra non-custom index work fine (or even better.) Are there any use cases of traditional Cassandra indexes which shouldn't almost automatically be converted to SPARSE. After all, the current recommended best practice is to avoid secondary indexes where the column cardinality is either very high or very low, which seems to be a match for SPARSE, although the precise meaning of SPARSE is still a bit fuzzy for me. Maybe, for the first_name use case I mentioned the user would be better off with a first_name Materialized View using first_name in the PK instead of the SPARSE SASI index. In fact, by placing first_name in the partition key of the MV I could assure that all base table rows with the same first name would be on the same node. If all of that is true, we will need to give users some decent guidance on when to use SPARSE SASI vs. MV (vs. classic secondary... or even DSE Search.)
          Hide
          jkrupan Jack Krupansky added a comment -

          So is this stuff actually ready to release? I mean, consistent with the new philosophy that "trunk is always releasable"? IOW, if it does get committed, it will be in 3.4 no matter what? I only ask because it just seemed that there was stuff in flux fairly recently (a couple days ago), suggested it wasn't quite baked enough to be considered "releasable".

          Show
          jkrupan Jack Krupansky added a comment - So is this stuff actually ready to release? I mean, consistent with the new philosophy that "trunk is always releasable"? IOW, if it does get committed, it will be in 3.4 no matter what? I only ask because it just seemed that there was stuff in flux fairly recently (a couple days ago), suggested it wasn't quite baked enough to be considered "releasable".
          Hide
          xedin Pavel Yaskevich added a comment -

          Sam Tunnicliffe Awesome, will try to do everything tomorrow, thanks!

          Show
          xedin Pavel Yaskevich added a comment - Sam Tunnicliffe Awesome, will try to do everything tomorrow, thanks!
          Hide
          beobal Sam Tunnicliffe added a comment -
          Show
          beobal Sam Tunnicliffe added a comment - Pavel Yaskevich SGTM!
          Hide
          xedin Pavel Yaskevich added a comment -

          No problem! Just to clarify a bit more - ORDER BY I mean is "real SQL" ORDER BY (the one which sorts keys) and not currently built-in one which depends on CLUSTERING ORDER, that one would still work the same way with indexes as it works right now.

          Show
          xedin Pavel Yaskevich added a comment - No problem! Just to clarify a bit more - ORDER BY I mean is "real SQL" ORDER BY (the one which sorts keys) and not currently built-in one which depends on CLUSTERING ORDER, that one would still work the same way with indexes as it works right now.
          Hide
          doanduyhai DOAN DuyHai added a comment -

          Thanks for the clarifications

          Show
          doanduyhai DOAN DuyHai added a comment - Thanks for the clarifications
          Hide
          xedin Pavel Yaskevich added a comment -

          DOAN DuyHai

          Regarding CONTAINS mode - it's more expensive to build since it has to extract the suffixes (which is exactly what search people are doing) which makes it as expensive to query as PREFIX columns. Regarding sorting - this is currently not a priority since it would require extensive changes to Cassandra interfaces to support that, MAX_ROWS currently is the same restriction per result page as in CQL3.

          Show
          xedin Pavel Yaskevich added a comment - DOAN DuyHai Regarding CONTAINS mode - it's more expensive to build since it has to extract the suffixes (which is exactly what search people are doing) which makes it as expensive to query as PREFIX columns. Regarding sorting - this is currently not a priority since it would require extensive changes to Cassandra interfaces to support that, MAX_ROWS currently is the same restriction per result page as in CQL3.
          Hide
          doanduyhai DOAN DuyHai added a comment -

          Hello Pavel Yaskevich, it's me again.

          I've had some discussion with search people and they told me that wildcard searches (name like "*xxxxx*") are very expensive. Classical data structure like suffix trees are adapted for suffix searching (name like "xxx*"). For prefix search (name like "*xxx") they're creating a reversed index. Does it mean that the CONTAINS mode (formerly named SUFFIX) is more expensive than the NORMAL search mode ? If yes, how much expensive is it (x2 ? order of magnitude ?)

          Second question, more related to the impl, since you query the nodes following the token range and do not hit all nodes like normal secondary index, does it imply that sorting (ORDER BY) is no longer relevant since you do not retrieve all possible results ? (I've seen in QueryPlan.MAX_ROWS that there is a hard-coded limit of 10 000 results)

          Sorry to annoy you with my questions but they are important so that we, evangelists, can give the right use-cases for users and especially deter them from mis-using SASI when it's not appropriate or when the search cost is prohibitive.

          Show
          doanduyhai DOAN DuyHai added a comment - Hello Pavel Yaskevich , it's me again. I've had some discussion with search people and they told me that wildcard searches (name like "*xxxxx*") are very expensive. Classical data structure like suffix trees are adapted for suffix searching (name like "xxx*"). For prefix search (name like "*xxx") they're creating a reversed index. Does it mean that the CONTAINS mode (formerly named SUFFIX) is more expensive than the NORMAL search mode ? If yes, how much expensive is it (x2 ? order of magnitude ?) Second question, more related to the impl, since you query the nodes following the token range and do not hit all nodes like normal secondary index, does it imply that sorting (ORDER BY) is no longer relevant since you do not retrieve all possible results ? (I've seen in QueryPlan.MAX_ROWS that there is a hard-coded limit of 10 000 results) Sorry to annoy you with my questions but they are important so that we, evangelists, can give the right use-cases for users and especially deter them from mis-using SASI when it's not appropriate or when the search cost is prohibitive.
          Hide
          xedin Pavel Yaskevich added a comment - - edited

          Sam Tunnicliffe How about `unfilteredCluster`? Since we are on the same page about this, here is what I'm thinking - we are going to update README.md we have in xedin/sasi and I'm going to put it into doc/SASI.md, squash all 17 commits into one and push to trunk, sounds good?

          Show
          xedin Pavel Yaskevich added a comment - - edited Sam Tunnicliffe How about `unfilteredCluster`? Since we are on the same page about this, here is what I'm thinking - we are going to update README.md we have in xedin/sasi and I'm going to put it into doc/SASI.md, squash all 17 commits into one and push to trunk, sounds good?
          Hide
          beobal Sam Tunnicliffe added a comment -

          So let's prioritize phase #2 and #3 over that for now, WDYT?

          Sounds good to me. I've only one tiny comment about the latest commits, which is that it's slightly odd API-wise that SSTableFlushObserver::nextRow takes an Unfiltered rather than a Row. Maybe renaming it to nextUnfiltered or even just flushed would make sense.

          Show
          beobal Sam Tunnicliffe added a comment - So let's prioritize phase #2 and #3 over that for now, WDYT? Sounds good to me. I've only one tiny comment about the latest commits, which is that it's slightly odd API-wise that SSTableFlushObserver::nextRow takes an Unfiltered rather than a Row . Maybe renaming it to nextUnfiltered or even just flushed would make sense.
          Hide
          xedin Pavel Yaskevich added a comment -

          Sam Tunnicliffe Thinking more about it, we should probably proceed with implementation of the clustering that I've finished last night to conclude phase #1, since it's already an improvement over internal indexes anyway and it would take some time to have TokenTree keys with flexible size implemented which is going to speed up intersections but not necessarily satisfies-by at the end of the query, so benefit of it is uncertain at this point, because we have to scan through all row clusters anyway. So let's prioritize phase #2 and #3 over that for now, WDYT?

          P.S. I have been running testall and dtest in CI (0 failures for testall, some old failures for dtest):

          branch testall dtest
          sasi-integration testall dtest
          Show
          xedin Pavel Yaskevich added a comment - Sam Tunnicliffe Thinking more about it, we should probably proceed with implementation of the clustering that I've finished last night to conclude phase #1, since it's already an improvement over internal indexes anyway and it would take some time to have TokenTree keys with flexible size implemented which is going to speed up intersections but not necessarily satisfies-by at the end of the query, so benefit of it is uncertain at this point, because we have to scan through all row clusters anyway. So let's prioritize phase #2 and #3 over that for now, WDYT? P.S. I have been running testall and dtest in CI (0 failures for testall, some old failures for dtest): branch testall dtest sasi-integration testall dtest
          Hide
          xedin Pavel Yaskevich added a comment -

          I've pushed initial version of the clustering support for SASI into sasi-3.2-integration branch, it's not as well performing yet as it can be but is still an improvement over ClusteringColumnIndex because index intersection is used instead of filtering (although intersection currently only done on the partition key instead of partition key + clustering), so if anybody wants they can start testing it.

          Show
          xedin Pavel Yaskevich added a comment - I've pushed initial version of the clustering support for SASI into sasi-3.2-integration branch, it's not as well performing yet as it can be but is still an improvement over ClusteringColumnIndex because index intersection is used instead of filtering (although intersection currently only done on the partition key instead of partition key + clustering), so if anybody wants they can start testing it.
          Hide
          xedin Pavel Yaskevich added a comment -

          Just as a quick update, I've ported in-memory index size estimation, so the only remaining thing on the list for phase #1 is clustering support.

          Show
          xedin Pavel Yaskevich added a comment - Just as a quick update, I've ported in-memory index size estimation, so the only remaining thing on the list for phase #1 is clustering support.
          Hide
          xedin Pavel Yaskevich added a comment -

          DOAN DuyHai It uses built-in facilities for it, namely PartitionRangeReadCommand in 3.x since it returns results in the token order it doesn't have to scatter-gather right away and can do what normal read commands do.

          Show
          xedin Pavel Yaskevich added a comment - DOAN DuyHai It uses built-in facilities for it, namely PartitionRangeReadCommand in 3.x since it returns results in the token order it doesn't have to scatter-gather right away and can do what normal read commands do.
          Hide
          doanduyhai DOAN DuyHai added a comment -

          Pavel Yaskevich If you have some time, can you point me to the source code (class) where SASI manages the fetching of data on other nodes in the ring ? Jason Brown told me that SASI does not use the scatter-gather technique but fetches data by token range: https://twitter.com/doanduyhai/status/662392685706289152

          Show
          doanduyhai DOAN DuyHai added a comment - Pavel Yaskevich If you have some time, can you point me to the source code (class) where SASI manages the fetching of data on other nodes in the ring ? Jason Brown told me that SASI does not use the scatter-gather technique but fetches data by token range: https://twitter.com/doanduyhai/status/662392685706289152
          Hide
          xedin Pavel Yaskevich added a comment -
          Show
          xedin Pavel Yaskevich added a comment - Pushed to 3.2-integration branch
          Hide
          beobal Sam Tunnicliffe added a comment -

          maybe we should rename ORIGINAL into PREFIX at the same time, so we'll have - PREFIX, CONTAINS, SPARSE

          +1

          Show
          beobal Sam Tunnicliffe added a comment - maybe we should rename ORIGINAL into PREFIX at the same time, so we'll have - PREFIX, CONTAINS, SPARSE +1
          Hide
          doanduyhai DOAN DuyHai added a comment -

          Looks good to me!

          Show
          doanduyhai DOAN DuyHai added a comment - Looks good to me!
          Hide
          xedin Pavel Yaskevich added a comment -

          This got me thinking - maybe we should rename ORIGINAL into PREFIX at the same time, so we'll have - PREFIX, CONTAINS, SPARSE, WDYT?

          Show
          xedin Pavel Yaskevich added a comment - This got me thinking - maybe we should rename ORIGINAL into PREFIX at the same time, so we'll have - PREFIX, CONTAINS, SPARSE, WDYT?
          Hide
          xedin Pavel Yaskevich added a comment -

          Sounds good, Doan! we can do that.

          Show
          xedin Pavel Yaskevich added a comment - Sounds good, Doan! we can do that.
          Hide
          doanduyhai DOAN DuyHai added a comment -

          A minor remark. Shouldn't we take this integration into C* opportunity to rename the SUFFIX mode to CONTAINS ?
          Indeed, NORMAL and SPARSE indexing modes are quite self-explanatory whereas SUFFIX mode not only allows searching on suffixes but also on prefixes.

          Users can be confused by the suffix name and think it only works for suffix search

          Show
          doanduyhai DOAN DuyHai added a comment - A minor remark. Shouldn't we take this integration into C* opportunity to rename the SUFFIX mode to CONTAINS ? Indeed, NORMAL and SPARSE indexing modes are quite self-explanatory whereas SUFFIX mode not only allows searching on suffixes but also on prefixes. Users can be confused by the suffix name and think it only works for suffix search
          Hide
          xedin Pavel Yaskevich added a comment -

          Thanks for the review!

          This is pretty ironic that COMPACT STORAGE is broken now, but we are going to fit this anyway once clustering support is added. dtests are definitely a good idea, I think we should have some already and we'll definitely port them main repo, I will also add my branch to CI tonight.

          Regarding review comments (all of the changes are pushed):

          The regex matching in o.a.c.io.sstable.Component.Type::fromRepresentation throws an NPE

          Fixed.

          In SASIIndex, getMetadataReloadTask & getBlockingFlushTask should be able to just return null

          Made both return null in SASIndex and made sure there would be no NPEs in existing call sites.

          I couldn't see why a PeekingIterator is used in OnDiskIndex::search

          Moved back to Iterator<ByteBuffer> that, since PeekingIterator was a rudiment of the previous version if search.

          The use of "a" and "b" in the o.a.c.i.sasi.plan.Expression ctor seems like it could have the potential for pain when debugging.

          Renamed to "sasi", "internal". maybe it will make it a bit clearer...

          The anonymous extension of Expression in Operation::analyzeGroup can be replaced

          Fixed, Thanks for catching that!

          MemIndex::estimateSize is unused

          Is a TODO item for me as memory tracking is different in 3.0 (see my previous comment).

          It doesn't really affect anything, but just for clarity I would rename MemoryUtil.DIRECT_BYTE_BUFFER_R_CLASS to RO_DIRECT_BYTE_BUFFER_CLASS

          Makes sense and done.

          Most trivial of nits: brace placement in SchemaLoader (ln 255)

          Fixed

          Show
          xedin Pavel Yaskevich added a comment - Thanks for the review! This is pretty ironic that COMPACT STORAGE is broken now, but we are going to fit this anyway once clustering support is added. dtests are definitely a good idea, I think we should have some already and we'll definitely port them main repo, I will also add my branch to CI tonight. Regarding review comments (all of the changes are pushed): The regex matching in o.a.c.io.sstable.Component.Type::fromRepresentation throws an NPE Fixed. In SASIIndex, getMetadataReloadTask & getBlockingFlushTask should be able to just return null Made both return null in SASIndex and made sure there would be no NPEs in existing call sites. I couldn't see why a PeekingIterator is used in OnDiskIndex::search Moved back to Iterator<ByteBuffer> that, since PeekingIterator was a rudiment of the previous version if search. The use of "a" and "b" in the o.a.c.i.sasi.plan.Expression ctor seems like it could have the potential for pain when debugging. Renamed to "sasi", "internal". maybe it will make it a bit clearer... The anonymous extension of Expression in Operation::analyzeGroup can be replaced Fixed, Thanks for catching that! MemIndex::estimateSize is unused Is a TODO item for me as memory tracking is different in 3.0 (see my previous comment). It doesn't really affect anything, but just for clarity I would rename MemoryUtil.DIRECT_BYTE_BUFFER_R_CLASS to RO_DIRECT_BYTE_BUFFER_CLASS Makes sense and done. Most trivial of nits: brace placement in SchemaLoader (ln 255) Fixed
          Hide
          beobal Sam Tunnicliffe added a comment -

          This is looking pretty good.

          A problem (which isn't caught by any of the unit tests btw) is that due to the fact that under the hood 3.x considers all compact storage columns as static. This breaks interactions with sasi-indexed tables via CQL - for example, try running through the examples in the original SASI readme and you'll find querying mostly broken.

          cqlsh:demo> select first_name, last_name, age, height, created_at from sasi where first_name = 'M';
          InvalidRequest: code=2200 [Invalid query] message="Queries using 2ndary indexes don't support selecting only static columns"
          cqlsh:demo>
          cqlsh:demo>
          cqlsh:demo> select * from sasi where first_name = 'M';
          
           id | age | created_at | first_name | height | last_name
          ----+-----+------------+------------+--------+-----------
          
          (0 rows)
          

          Fortunately, I believe we can simply drop the use of COMPACT STORAGE. My (limited) testing suggests that when tables are created without it, everything that's currently implemented works as expected.

          The new SASI specific tests look good and are all green, but we obviously need to run this through CI before it's committed. On a related note, are there any dtests that may be worth adding? The utest coverage is pretty comprehensive (modulo the CQL issues) so I wouldn't say it was absolutely critical, but some multi-node & CQL based tests would be nice to have.

          Otherwise, this first phase of integration looks good to me. On initial review I found one bug and a handful of nits. I have a few scenarios I want to run through, mostly to verify how sasi interacts with some of the parts of the index subsystem that were changed in 3.0.

          Initial review comments:

          • The regex matching in o.a.c.io.sstable.Component.Type::fromRepresentation throws an NPE when it encounters an unknown name and tries to match it to a CUSTOM component.
          • In SASIIndex, getMetadataReloadTask & getBlockingFlushTask should be able to just return null (like getInitializationTask does). In the case of the former, that is true right now as the only call site is in SIM where nulls are properly handled. getBlockingFlushTask is also called from KeyCacheCqlTest which doesn't check for nulls so would need tweaking slightly. (This is totally minor, the irregularity in SASIIndex just bugged me).
          • I couldn't see why a PeekingIterator is used in OnDiskIndex::search
          • The use of "a" and "b" in the o.a.c.i.sasi.plan.Expression ctor seems like it could have the potential for pain when debugging. I'm sure that it isn't very likely we'll ever care too much & I don't have any particularly better suggestion but if you do, could these be changed to something more greppable (or extracted to constants)?
          • The anonymous extension of Expression in Operation::analyzeGroup can be replaced with perColumn.add(new Expression(controller, columnIndex).add(e.operator(), token));
          • MemIndex::estimateSize is unused
          • It doesn't really affect anything, but just for clarity I would rename MemoryUtil.DIRECT_BYTE_BUFFER_R_CLASS to RO_DIRECT_BYTE_BUFFER_CLASS
          • Most trivial of nits: brace placement in SchemaLoader (ln 255)
          Show
          beobal Sam Tunnicliffe added a comment - This is looking pretty good. A problem (which isn't caught by any of the unit tests btw) is that due to the fact that under the hood 3.x considers all compact storage columns as static. This breaks interactions with sasi-indexed tables via CQL - for example, try running through the examples in the original SASI readme and you'll find querying mostly broken. cqlsh:demo> select first_name, last_name, age, height, created_at from sasi where first_name = 'M'; InvalidRequest: code=2200 [Invalid query] message= "Queries using 2ndary indexes don't support selecting only static columns" cqlsh:demo> cqlsh:demo> cqlsh:demo> select * from sasi where first_name = 'M'; id | age | created_at | first_name | height | last_name ----+-----+------------+------------+--------+----------- (0 rows) Fortunately, I believe we can simply drop the use of COMPACT STORAGE. My (limited) testing suggests that when tables are created without it, everything that's currently implemented works as expected. The new SASI specific tests look good and are all green, but we obviously need to run this through CI before it's committed. On a related note, are there any dtests that may be worth adding? The utest coverage is pretty comprehensive (modulo the CQL issues) so I wouldn't say it was absolutely critical, but some multi-node & CQL based tests would be nice to have. Otherwise, this first phase of integration looks good to me. On initial review I found one bug and a handful of nits. I have a few scenarios I want to run through, mostly to verify how sasi interacts with some of the parts of the index subsystem that were changed in 3.0. Initial review comments: The regex matching in o.a.c.io.sstable.Component.Type::fromRepresentation throws an NPE when it encounters an unknown name and tries to match it to a CUSTOM component. In SASIIndex, getMetadataReloadTask & getBlockingFlushTask should be able to just return null (like getInitializationTask does). In the case of the former, that is true right now as the only call site is in SIM where nulls are properly handled. getBlockingFlushTask is also called from KeyCacheCqlTest which doesn't check for nulls so would need tweaking slightly. (This is totally minor, the irregularity in SASIIndex just bugged me). I couldn't see why a PeekingIterator is used in OnDiskIndex::search The use of "a" and "b" in the o.a.c.i.sasi.plan.Expression ctor seems like it could have the potential for pain when debugging. I'm sure that it isn't very likely we'll ever care too much & I don't have any particularly better suggestion but if you do, could these be changed to something more greppable (or extracted to constants)? The anonymous extension of Expression in Operation::analyzeGroup can be replaced with perColumn.add(new Expression(controller, columnIndex).add(e.operator(), token)); MemIndex::estimateSize is unused It doesn't really affect anything, but just for clarity I would rename MemoryUtil.DIRECT_BYTE_BUFFER_R_CLASS to RO_DIRECT_BYTE_BUFFER_CLASS Most trivial of nits: brace placement in SchemaLoader (ln 255)
          Hide
          xedin Pavel Yaskevich added a comment -

          I've rebased sasi-3.x-integration with the latest trunk and everything is ready for initial review, the most recent changes are:

          • Finished up minor cleanup (e.g. @Override got removed, parenthesis on the proper places, small changes in analyzers etc.);
          • Temporarily removed ignored OR, parenthesis, NOT_EQ tests;
          • Fixed minor SSTable leak problem in SASIIndexTest.

          TODO:

          • Add back memory usage tracking of memtable index (since it's significantly different in 2.0 vs. 3.x and requires base cfs memtable allocator).
          • Add clustering column support;
          Show
          xedin Pavel Yaskevich added a comment - I've rebased sasi-3.x-integration with the latest trunk and everything is ready for initial review, the most recent changes are: Finished up minor cleanup (e.g. @Override got removed, parenthesis on the proper places, small changes in analyzers etc.); Temporarily removed ignored OR, parenthesis, NOT_EQ tests; Fixed minor SSTable leak problem in SASIIndexTest. TODO: Add back memory usage tracking of memtable index (since it's significantly different in 2.0 vs. 3.x and requires base cfs memtable allocator). Add clustering column support;
          Hide
          xedin Pavel Yaskevich added a comment - - edited

          Sam Tunnicliffe Yes, I will remove OR/parenthesis features from the branch (since we have a separate repo we can port it back from), do some more cleanup, and mark this as Patch Available. I forgot to mention that there is no clustering columns support yet in the branch (only Regular), but we are working on it.

          Show
          xedin Pavel Yaskevich added a comment - - edited Sam Tunnicliffe Yes, I will remove OR/parenthesis features from the branch (since we have a separate repo we can port it back from), do some more cleanup, and mark this as Patch Available. I forgot to mention that there is no clustering columns support yet in the branch (only Regular), but we are working on it.
          Hide
          beobal Sam Tunnicliffe added a comment -

          Pavel Yaskevich that sounds to me like a reasonable approach. Does that basically make this ticket Patch Available based on your integration branch (if so, I'll be sure and wrap up review asap after the holidays)?

          Show
          beobal Sam Tunnicliffe added a comment - Pavel Yaskevich that sounds to me like a reasonable approach. Does that basically make this ticket Patch Available based on your integration branch (if so, I'll be sure and wrap up review asap after the holidays)?
          Hide
          xedin Pavel Yaskevich added a comment -

          Sam Tunnicliffe Here is the latest status: I've attempted to integrate OR/Parenthesis into the CQL3 and SelectStatement which, as I've figured, actually would still require CASSADRA-10765 to be implemented since all of the restrictions have to be constructed/checked per logical operation (in other words, per CQL3 statement we'll have to build operation graph instead of current list approach) which would require substantial changes in SelectStatement, StatementRestrictions and other query processing classes. Maybe an alternative, and granular approach, would be more appropriate in this case:

          phase #1 - SASI goes into trunk supporting AND only (in other words, having QueryPlan internalized, no changes to CQL3);
          phase #2 - implement CASSANDRA-10765 with AND support only, which would supersede restriction support (via StatementRestrictions) in CQL3;
          phase #3 - add OR support to, by that time, already global QueryPlan.

          WDYT?

          Show
          xedin Pavel Yaskevich added a comment - Sam Tunnicliffe Here is the latest status: I've attempted to integrate OR/Parenthesis into the CQL3 and SelectStatement which, as I've figured, actually would still require CASSADRA-10765 to be implemented since all of the restrictions have to be constructed/checked per logical operation (in other words, per CQL3 statement we'll have to build operation graph instead of current list approach) which would require substantial changes in SelectStatement, StatementRestrictions and other query processing classes. Maybe an alternative, and granular approach, would be more appropriate in this case: phase #1 - SASI goes into trunk supporting AND only (in other words, having QueryPlan internalized, no changes to CQL3); phase #2 - implement CASSANDRA-10765 with AND support only, which would supersede restriction support (via StatementRestrictions) in CQL3; phase #3 - add OR support to, by that time, already global QueryPlan. WDYT?
          Hide
          xedin Pavel Yaskevich added a comment -

          Just as a quick update - I've finished porting all SASI functionality except OR/parenthesis and NOT_EQ to trunk in my branch, all of the appropriate SASI tests pass now. Going to proceed with porting of new operations.

          Show
          xedin Pavel Yaskevich added a comment - Just as a quick update - I've finished porting all SASI functionality except OR/parenthesis and NOT_EQ to trunk in my branch , all of the appropriate SASI tests pass now. Going to proceed with porting of new operations.

            People

            • Assignee:
              xedin Pavel Yaskevich
              Reporter:
              xedin Pavel Yaskevich
              Reviewer:
              Sam Tunnicliffe
              Tester:
              Mikhail Stepura
            • Votes:
              12 Vote for this issue
              Watchers:
              29 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development