Cassandra / CASSANDRA-3497

BloomFilter FP ratio should be configurable or size-restricted some other way

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.0.7
    • Component/s: Core
    • Labels:
      None

      Description

      When you have a live DC and a purely analytical DC, in many situations you can have fewer nodes on the analytical side, but you end up restricted by having to hold the BloomFilters in memory, even though you have absolutely no use for them. It would be nice if you could reduce this memory requirement by tuning the desired FP ratio, or even by disabling the filters altogether.
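To give a rough sense of the memory at stake, here is a minimal sketch based on the textbook Bloom filter sizing formula. It is an approximation only: Cassandra's own sizing logic may round or cap these values differently, and the function names below are hypothetical.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Textbook optimal bits per key: -ln(p) / (ln 2)^2."""
    if not 0.0 < fp_chance <= 1.0:
        raise ValueError("fp_chance must be in (0, 1]")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def estimated_filter_bytes(num_keys: int, fp_chance: float) -> int:
    """Approximate in-memory size of a filter holding num_keys keys."""
    return math.ceil(num_keys * bloom_bits_per_key(fp_chance) / 8)


if __name__ == "__main__":
    # Illustrative key count, not a figure from this ticket.
    keys = 1_000_000_000
    for p in (0.001, 0.01, 0.1, 0.5, 1.0):
        mib = estimated_filter_bytes(keys, p) / (1024 * 1024)
        print(f"fp_chance={p:<5} -> about {mib:,.0f} MiB")
```

Under this formula the filter shrinks steadily as the accepted false-positive chance rises, and it degenerates to zero bits at a chance of 1.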

      1. CASSANDRA-1.0-3497.txt
        26 kB
        Yuki Morishita
      2. 3497-v3.txt
        21 kB
        Jonathan Ellis
      3. 3497-v4.txt
        21 kB
        Jonathan Ellis
      4. 0001-give-default-val-to-fp_chance.patch
        0.9 kB
        Yuki Morishita
      5. 0001-Add-bloom_filter_fp_chance-to-cli.patch
        2 kB
        Yuki Morishita

        Issue Links

          Activity

          Transition                   | Time In Source Status | Execution Times | Last Executer  | Last Execution Date
          Open -> Patch Available      | 37d 47m               | 1               | Yuki Morishita | 22/Dec/11 23:17
          Patch Available -> Resolved  | 17h 21m               | 1               | Jonathan Ellis | 23/Dec/11 16:39
          Resolved -> Reopened         | 8d 12h 43m            | 2               | Jonathan Ellis | 04/Jan/12 03:23
          Reopened -> Resolved         | 4d 17h 47m            | 2               | Jonathan Ellis | 05/Jan/12 23:10
          Gavin made changes -
          Workflow patch-available, re-open possible [ 12749176 ] reopen-resolved, no closed status, patch-avail, testing [ 12754090 ]
          Gavin made changes -
          Workflow no-reopen-closed, patch-avail [ 12642257 ] patch-available, re-open possible [ 12749176 ]
          Brandon Williams made changes -
          Link This issue is related to CASSANDRA-4303 [ CASSANDRA-4303 ]
          Brandon Williams added a comment -

          Note for others trying to disable their BF: despite earlier discussion on this ticket, zero is NOT disabled, but instead sets it back to the default, since 0 false positives is invalid. You actually want to set it to 1 to have the smallest possible filter.

          Jonathan Ellis added a comment -

          Makes sense, that's what I was referring to when I reviewed that patch and said "the BloomFilter#modify approach won't work." v4 / 1.0 branch should be fine.

          Ophir Radnitz added a comment -

          I actually applied the 'CASSANDRA-1.0-3497' patch, which I can see now is not the most updated one. We'll probably revisit this once 1.0.7 is out.

          Jonathan Ellis made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Jonathan Ellis added a comment -

          "Patch attached so that cli show schema or describe commands show bloom_filter_fp_chance if set."

          committed

          Jonathan Ellis added a comment -

          "We've found that many records that were inserted could not be fetched in a multiget_slice query. It seemed as if the bloom filters resulted in false negatives."

          I have trouble understanding how this could be the case, because if our BF could cause false negatives then surely we'd see that even at today's low default FP rates. This patch didn't change how the BF is used, only the parameters it's created with, nor does it try to retrofit the new BF parameters onto existing sstables.

          You did apply the v4 patch and not an earlier one, right?
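The point about false negatives can be illustrated with a toy filter. This is a minimal sketch, not Cassandra's BloomFilter class: the class name, the SHA-256 based double hashing, and the deliberately tiny size are all assumptions chosen for illustration. However badly undersized the filter is, every key that was added still tests positive; only the false-positive rate gets worse.

```python
import hashlib


class TinyBloomFilter:
    """Toy Bloom filter; one byte per bit for readability."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key: bytes):
        # Double hashing: derive num_hashes indexes from two base hashes.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


# A badly undersized filter: 64 bits for 1000 keys.
keys = [f"row-{i}".encode() for i in range(1000)]
tiny = TinyBloomFilter(num_bits=64, num_hashes=1)
for k in keys:
    tiny.add(k)

# Every inserted key is still reported as possibly present.
missing = [k for k in keys if not tiny.might_contain(k)]
print("false negatives:", len(missing))
```

A lookup checks exactly the bits that the insert set, so a false negative would require a bit to be cleared or the lookup to use different parameters from the insert.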

          Yuki Morishita made changes -
          Yuki Morishita added a comment -

          Patch attached so that cli show schema or describe commands show bloom_filter_fp_chance if set.

          Jonathan Ellis added a comment -

          "the fix patch (0001-give-default-val-to-fp_chance.patch) works for the 1.1 branch but not for 1.0"

          It's already applied to both. (Note that we've switched to git; the old svn repo is no longer maintained.)

          Ophir Radnitz added a comment -

          We've tried this patch with 1.0.6 with an fp_ratio of 0.99 (if I understand correctly, after a major compaction and with a single, albeit large, SSTable, the bloom filter has very little effect). We've found that many records that were inserted could not be fetched in a multiget_slice query. It seemed as if the bloom filters resulted in false negatives.

          By the way, the fix patch (0001-give-default-val-to-fp_chance.patch) works for the 1.1 branch but not for 1.0.

          Jonathan Ellis made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Radim Kolar added a comment -

          The FP ratio is not displayed in the output of the cli commands show schema and describe.

          Jonathan Ellis made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Jonathan Ellis added a comment -

          committed

          Yuki Morishita made changes -
          Yuki Morishita added a comment -

          Radim,

          Thanks for the report. The problem is that the new bloom_filter_fp_chance field in the avro interface definition does not have a proper default.
          I attached a patch to fix it.
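The failure mode can be modelled with a toy version of one Avro schema-resolution rule: a field that exists only in the reader's schema must carry a default, otherwise data written with the older schema cannot be read. The sketch below is not the Avro library and not the attached patch; `resolve_record` and the "fixed" field declaration are hypothetical.

```python
from typing import Any, Dict, List


def resolve_record(written: Dict[str, Any],
                   writer_fields: List[dict],
                   reader_fields: List[dict]) -> Dict[str, Any]:
    """Toy model: read a record written with writer_fields using reader_fields."""
    writer_names = {f["name"] for f in writer_fields}
    resolved = {}
    for field in reader_fields:
        name = field["name"]
        if name in writer_names:
            resolved[name] = written[name]
        elif "default" in field:
            # Field is new to the reader: fall back to its declared default.
            resolved[name] = field["default"]
        else:
            raise TypeError(f"reader field {name!r} has no default")
    return resolved


# Schema the existing node wrote with (abbreviated to one field).
writer_fields = [{"name": "keyspace", "type": "string"}]

# New field as reported in the startup error on this ticket: no default.
fp_without_default = {"name": "bloom_filter_fp_chance",
                      "type": ["double", "null"]}

# Hypothetical corrected declaration. In Avro a union default must match
# the first branch, so "null" comes first when the default is null.
fp_with_default = {"name": "bloom_filter_fp_chance",
                   "type": ["null", "double"], "default": None}

old_row = {"keyspace": "ks1"}
print(resolve_record(old_row, writer_fields, writer_fields + [fp_with_default]))
```

Passing `fp_without_default` instead raises `TypeError` in this model, which corresponds to the `AvroTypeException` seen when booting an existing node.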

          Jonathan Ellis made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Radim Kolar added a comment -

          I compiled jars with this patch, and Cassandra does not boot on an existing node:

          Opening /var/lib/cassandra/data/system/Migrations-hc-109 (757635 bytes)
          INFO [SSTableBatchOpen:1] 2011-12-24 18:26:47,326 SSTableReader.java (line 134) Opening /var/lib/cassandra/data/system/LocationInfo-hc-273 (647 bytes)
          INFO [SSTableBatchOpen:1] 2011-12-24 18:26:47,338 SSTableReader.java (line 134) Opening /var/lib/cassandra/data/system/HintsColumnFamily-hc-1 (275 bytes)
          INFO [SSTableBatchOpen:2] 2011-12-24 18:26:47,338 SSTableReader.java (line 134) Opening /var/lib/cassandra/data/system/HintsColumnFamily-hc-2 (85 bytes)
          INFO [main] 2011-12-24 18:26:47,396 DatabaseDescriptor.java (line 501) Loading schema version ad8d50b0-2cc3-11e1-0000-b1504fb874be
          ERROR [main] 2011-12-24 18:26:47,555 AbstractCassandraDaemon.java (line 372) Exception encountered during startup
          org.apache.avro.AvroTypeException: Found {"type":"record","name":"CfDef","namespace":"org.apache.cassandra.db.migration.avro","fields":[
            {"name":"keyspace","type":"string"},
            {"name":"name","type":"string"},
            {"name":"column_type","type":["string","null"]},
            {"name":"comparator_type","type":["string","null"]},
            {"name":"subcomparator_type","type":["string","null"]},
            {"name":"comment","type":["string","null"]},
            {"name":"row_cache_size","type":["double","null"]},
            {"name":"key_cache_size","type":["double","null"]},
            {"name":"read_repair_chance","type":["double","null"]},
            {"name":"replicate_on_write","type":"boolean","default":false},
            {"name":"gc_grace_seconds","type":["int","null"]},
            {"name":"default_validation_class","type":["null","string"],"default":null},
            {"name":"key_validation_class","type":["null","string"],"default":null},
            {"name":"min_compaction_threshold","type":["null","int"],"default":null},
            {"name":"max_compaction_threshold","type":["null","int"],"default":null},
            {"name":"row_cache_save_period_in_seconds","type":["int","null"],"default":0},
            {"name":"key_cache_save_period_in_seconds","type":["int","null"],"default":3600},
            {"name":"row_cache_keys_to_save","type":["null","int"],"default":null},
            {"name":"merge_shards_chance","type":["null","double"],"default":null},
            {"name":"id","type":["int","null"]},
            {"name":"column_metadata","type":[{"type":"array","items":{"type":"record","name":"ColumnDef","fields":[
              {"name":"name","type":"bytes"},
              {"name":"validation_class","type":"string"},
              {"name":"index_type","type":[{"type":"enum","name":"IndexType","symbols":["KEYS","CUSTOM"],"aliases":["org.apache.cassandra.config.avro.IndexType"]},"null"]},
              {"name":"index_name","type":["string","null"]},
              {"name":"index_options","type":["null",{"type":"map","values":"string"}],"default":null}]}},"null"]},
            {"name":"row_cache_provider","type":["string","null"],"default":"org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider"},
            {"name":"key_alias","type":["null","bytes"],"default":null},
            {"name":"compaction_strategy","type":["null","string"],"default":null},
            {"name":"compaction_strategy_options","type":["null",{"type":"map","values":"string"}],"default":null},
            {"name":"compression_options","type":["null",{"type":"map","values":"string"}],"default":null}]}, expecting {"type":"record","name":"CfDef","namespace":"org.apache.cassandra.db.migration.avro","fields":[
            {"name":"keyspace","type":"string"},
            {"name":"name","type":"string"},
            {"name":"column_type","type":["string","null"]},
            {"name":"comparator_type","type":["string","null"]},
            {"name":"subcomparator_type","type":["string","null"]},
            {"name":"comment","type":["string","null"]},
            {"name":"row_cache_size","type":["double","null"]},
            {"name":"key_cache_size","type":["double","null"]},
            {"name":"read_repair_chance","type":["double","null"]},
            {"name":"replicate_on_write","type":"boolean","default":false},
            {"name":"gc_grace_seconds","type":["int","null"]},
            {"name":"default_validation_class","type":["null","string"],"default":null},
            {"name":"key_validation_class","type":["null","string"],"default":null},
            {"name":"min_compaction_threshold","type":["null","int"],"default":null},
            {"name":"max_compaction_threshold","type":["null","int"],"default":null},
            {"name":"row_cache_save_period_in_seconds","type":["int","null"],"default":0},
            {"name":"key_cache_save_period_in_seconds","type":["int","null"],"default":3600},
            {"name":"row_cache_keys_to_save","type":["null","int"],"default":null},
            {"name":"merge_shards_chance","type":["null","double"],"default":null},
            {"name":"id","type":["int","null"]},
            {"name":"column_metadata","type":[{"type":"array","items":{"type":"record","name":"ColumnDef","fields":[
              {"name":"name","type":"bytes"},
              {"name":"validation_class","type":"string"},
              {"name":"index_type","type":[{"type":"enum","name":"IndexType","symbols":["KEYS","CUSTOM"],"aliases":["org.apache.cassandra.config.avro.IndexType"]},"null"]},
              {"name":"index_name","type":["string","null"]},
              {"name":"index_options","type":["null",{"type":"map","values":"string"}],"default":null}],"aliases":["org.apache.cassandra.config.avro.ColumnDef"]}},"null"]},
            {"name":"row_cache_provider","type":["string","null"],"default":"org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider"},
            {"name":"key_alias","type":["null","bytes"],"default":null},
            {"name":"compaction_strategy","type":["null","string"],"default":null},
            {"name":"compaction_strategy_options","type":["null",{"type":"map","values":"string"}],"default":null},
            {"name":"compression_options","type":["null",{"type":"map","values":"string"}],"default":null},
            {"name":"bloom_filter_fp_chance","type":["double","null"]}
          ],"aliases":["org.apache.cassandra.config.avro.CfDef"]}
          at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:212)
          at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
          at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:121)
          at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:138)
          at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:114)
          at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:192)
          at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:116)
          at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:142)
          at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:114)
          at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:105)
          at org.apache.cassandra.io.SerDeUtils.deserialize(SerDeUtils.java:60)
          at org.apache.cassandra.db.DefsTable.loadFromStorage(DefsTable.java:98)
          at org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:502)
          at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:179)
          at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:355)
          at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)

          Jonathan Ellis made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Reviewer jbellis
          Resolution Fixed [ 1 ]
          Jonathan Ellis added a comment -

          committed

          Yuki Morishita added a comment -

          +1

          Jonathan Ellis made changes -
          Attachment 3497-v4.txt [ 12508497 ]
          Jonathan Ellis added a comment -

          v4 attached with unbox-of-null fixed.

          Yuki Morishita added a comment -

          Jonathan,

          Yours is what I first tried, but instead I tried to do it in SSTR (SSTableReader). I think your approach is the best we can do for 1.0.x.
          One thing to point out is that it throws an NPE at SSTableWriter.java#403 when fpChance is null and is converted to a double.

          Jonathan Ellis made changes -
          Attachment 3497-v3.txt [ 12508489 ]
          Jonathan Ellis made changes -
          Attachment 3497-v3.txt [ 12508488 ]
          Jonathan Ellis made changes -
          Attachment 3497-v3.txt [ 12508488 ]
          Jonathan Ellis made changes -
          Attachment 3497-v3.txt [ 12508487 ]
          Jonathan Ellis made changes -
          Attachment 3497-v3.txt [ 12508487 ]
          Jonathan Ellis made changes -
          Attachment 3497-v3.txt [ 12508486 ]
          Jonathan Ellis made changes -
          Attachment 3497-v3.txt [ 12508486 ]
          Jonathan Ellis added a comment -

          Sorry, I didn't look closely enough the first time. The BloomFilter#modify approach won't work: when we change the BF parameters we change what bits should be set – there's no way to rebuild it with new parameters without re-inserting all the keys.

          Attached v3 that just changes the BloomFilter constructor in SSTableWriter. (So, people will have to scrub to rebuild things, but that's the best we can do.) Also changed the setting to bloom_filter_fp_chance and updated cli help.

          How does that look to you?
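Why a filter cannot simply be converted to new parameters can be sketched as follows. The index derivation here (SSHA-256 based double hashing is not used by Cassandra; the function name `bit_positions` and the sizes are assumptions for illustration) shows that the same key maps to a different set of bit positions once the bit count or hash count changes. Since the stored bit array records only which bits are set, not which keys set them, the new positions cannot be derived from the old array.

```python
import hashlib
from typing import List


def bit_positions(key: bytes, num_bits: int, num_hashes: int) -> List[int]:
    """Bit indexes a key would set in a filter with the given parameters."""
    digest = hashlib.sha256(key).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big")
    return [(h1 + i * h2) % num_bits for i in range(num_hashes)]


key = b"row-42"

# Parameters before and after a hypothetical fp chance change.
old = bit_positions(key, num_bits=15_000, num_hashes=10)
new = bit_positions(key, num_bits=5_000, num_hashes=3)

print("old filter bits:", old)
print("new filter bits:", new)
```

Rebuilding therefore means re-inserting every key, which is what rewriting the sstable (for example through scrub or compaction) does.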

          Yuki Morishita made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Yuki Morishita made changes -
          Attachment CASSANDRA-1.0-3497.txt [ 12508463 ]
          Yuki Morishita added a comment -

          OK, in the attached patch, I removed the filter_enabled option.

          Yuki Morishita made changes -
          Attachment CASSANDRA-1.0-3497.txt [ 12508458 ]
          Jonathan Ellis added a comment -

          Can we do it with a single setting?

          fp_ratio = null: use current 15-buckets-per-element filters
          fp_ratio = 0: no filter
          fp_ratio > 0: BF based on given FP probability

          Further, I think we should split this up so that for 1.0 we only worry about the null and positive cases – let's do a separate ticket for 1.1 about skipping the BF entirely.
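The three cases proposed here can be written down as a small decision function. This sketch models the proposal in this comment only; the function name and return strings are hypothetical, and a later comment on this ticket reports that the shipped behaviour treats 0 as "use the default" rather than "no filter".

```python
from typing import Optional


def describe_filter_choice(fp_ratio: Optional[float]) -> str:
    """Map the proposed single setting to the filter that would be built."""
    if fp_ratio is None:
        return "legacy filter (15 buckets per element)"
    if fp_ratio == 0:
        return "no filter"
    if 0 < fp_ratio <= 1:
        return f"filter sized for fp chance {fp_ratio}"
    raise ValueError("fp_ratio must be null or within [0, 1]")


for setting in (None, 0, 0.01, 0.5):
    print(repr(setting), "->", describe_filter_choice(setting))
```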

          Yuki Morishita made changes -
          Attachment CASSANDRA-1.0-3497.txt [ 12508458 ]
          Yuki Morishita added a comment -

          I added two new Bloom Filter related options to CFMetadata.

          • filter_enabled
            If set to false, SSTableReader uses an EMPTY bloom filter. Defaults to true.
          • fp_ratio
            If the value is greater than 0, SSTableReader adjusts the Bloom Filter based on the FP ratio and uses it. Defaults to 0.

          The BloomFilter is created and saved as usual, but when an SSTableReader is opened, you get the one based on the CF setting.

          One thing to note is that the change takes effect the next time an SSTableReader is opened, so for existing sstables you need to restart the node or compact/scrub them.

          Jonathan Ellis added a comment -

          Let's just go with a per-CF option. Brandon's right that ideally we'd like to configure it differently (ideally leaving them out entirely) in analytical DCs but I don't want to invent a totally new concept in 1.0.x, and having it per-CF (which we get via schema) is more important than having it per-DC (which we get with strategy_options).

          Brandon Williams added a comment -

          "Maybe the easiest fix is to have a node-wide setting for fp ratio in cassandra.yaml (w/ jmx interface exposed) and have different values for each datacenter?"

          Yes, I think that's good enough for the multi-datacenter scenario; however, as Radim mentioned, we also have a good use case for a per-CF threshold. We could do both, and then use whichever value is lower: the one in the CF schema or the one in the node's yaml.
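The "use whichever is lower" rule could look like the sketch below. It only illustrates this suggestion, which the ticket did not end up adopting (a per-CF option was chosen instead); the function and parameter names are hypothetical.

```python
from typing import Optional


def effective_fp_chance(cf_value: Optional[float],
                        node_value: Optional[float]) -> Optional[float]:
    """Pick the lower of the per-CF and node-wide settings, ignoring unset ones."""
    candidates = [v for v in (cf_value, node_value) if v is not None]
    return min(candidates) if candidates else None


print(effective_fp_chance(0.1, 0.01))
print(effective_fp_chance(None, 0.5))
print(effective_fp_chance(None, None))
```

Taking the lower value errs on the side of the more accurate (and larger) filter, so this rule would not by itself let an analytical node shrink a filter below what the schema asks for.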

          Yuki Morishita added a comment -

          The problem is that currently strategy_options for NTS is used purely for replication settings, for example {DC1:2, DC2:2}.
          We could do something like strategy_options={DC1:2, DC2:1, DC2:fp(0.5)} or strategy_options={DC1:2, DC2:1,fp(0.5)}, or something else that preserves backward compatibility, but I think it's complicated.

          Maybe the easiest fix is to have a node-wide setting for fp ratio in cassandra.yaml (w/ jmx interface exposed) and have different values for each datacenter?

          Jonathan Ellis made changes -
          Fix Version/s 1.0.7 [ 12319244 ]
          Fix Version/s 1.1 [ 12317615 ]
          Jonathan Ellis made changes -
          Assignee Yuki Morishita [ yukim ]
          Fix Version/s 1.1 [ 12317615 ]
          Radim Kolar added a comment -

          It would be good to have the ability to shrink Bloom filters during loading: save only the standard Cassandra Bloom filters to disk, but shrink them at load time according to the CF settings.
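One known way to shrink a Bloom filter without rebuilding it from the keys is to fold it in half with a bitwise OR. This works when bit positions are computed as hash modulo the filter size and the smaller size divides the larger one. The sketch below is a toy filter written to show the idea; it does not use Cassandra's hash functions or on-disk format, and all names in it are hypothetical.

```python
import hashlib
from typing import Iterable, List


def bit_positions(key: bytes, num_bits: int, num_hashes: int) -> List[int]:
    """Derive num_hashes bit positions for key as (hash mod num_bits)."""
    positions = []
    for i in range(num_hashes):
        digest = hashlib.sha256(i.to_bytes(4, "big") + key).digest()
        positions.append(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


def build_filter(keys: Iterable[bytes], num_bits: int, num_hashes: int) -> List[bool]:
    """Build a toy Bloom filter as a list of booleans."""
    bits = [False] * num_bits
    for key in keys:
        for pos in bit_positions(key, num_bits, num_hashes):
            bits[pos] = True
    return bits


def might_contain(bits: List[bool], key: bytes, num_hashes: int) -> bool:
    """Return False only if key is definitely absent."""
    return all(bits[p] for p in bit_positions(key, len(bits), num_hashes))


def fold_in_half(bits: List[bool]) -> List[bool]:
    """Halve the filter by OR-ing its upper half onto its lower half.

    Since (h mod m) mod (m/2) == h mod (m/2), the result equals the
    filter that would have been built directly with m/2 bits.
    """
    if len(bits) % 2 != 0:
        raise ValueError("filter size must be even to fold")
    half = len(bits) // 2
    return [bits[i] or bits[i + half] for i in range(half)]
```

The folded filter keeps the guarantee of no false negatives but has a higher false-positive rate, which is the trade-off being asked for here.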

          Radim Kolar added a comment -

          BF configuration needs to be per-CF, as in HBase. This would allow a CF used for logging to have a minimal BF if its rows are rarely read back.

          See HBase for an example:
          http://hbase.apache.org/book/blooms.html#d1161e4353
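To get a feel for how much memory a looser FP ratio saves, the standard sizing formula for an optimally configured Bloom filter is m = -n ln(p) / (ln 2)^2 bits for n keys and false-positive chance p. The sketch below applies that textbook formula only; Cassandra's real allocation may round or bucket sizes differently, and the function names are hypothetical.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally sized Bloom filter (textbook formula)."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_size_bytes(num_keys: int, fp_chance: float) -> int:
    """Approximate filter size in bytes for num_keys keys."""
    return math.ceil(num_keys * bloom_bits_per_key(fp_chance) / 8)


if __name__ == "__main__":
    for p in (0.01, 0.1, 0.5):
        print(p, round(bloom_bits_per_key(p), 2), bloom_size_bytes(1_000_000_000, p))
```

Under this formula a 1% FP chance costs about 9.6 bits per key and a 10% FP chance about 4.8, so relaxing a rarely read CF from 1% to 10% halves its filter memory.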

          Brandon Williams added a comment -

          Perhaps as a strategy_option?

          Jonathan Ellis made changes -
          Priority Major [ 3 ] Minor [ 4 ]
          Jonathan Ellis added a comment -

          Hmm, that sounds messy. How do you propose to distinguish BF configuration per-datacenter in the schema?

          Brandon Williams made changes -
          Field Original Value New Value
          Description When you have a live dc and purely analytical dc, in many situations you can have less nodes on the analytical side, but end up getting restricted by having the BloomFilters in-memory, even though so you have absolutely no use for them. It would be nice if you could reduce this memory requirement by tuning the desired FP ratio, or even just disabling them altogether. When you have a live dc and purely analytical dc, in many situations you can have less nodes on the analytical side, but end up getting restricted by having the BloomFilters in-memory, even though you have absolutely no use for them. It would be nice if you could reduce this memory requirement by tuning the desired FP ratio, or even just disabling them altogether.
          Brandon Williams created issue -

            People

            • Assignee: Yuki Morishita
            • Reporter: Brandon Williams
            • Reviewer: Jonathan Ellis
            • Votes: 2
            • Watchers: 7
