Cassandra / CASSANDRA-1555

Considerations for larger bloom filters

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.7.1
    • Component/s: None
    • Labels: None

      Description

      To (optimally) support SSTables larger than 143 million keys, we need to support bloom filters larger than 2^31 bits, which java.util.BitSet can't handle directly because its bit indices are ints.
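
      As a rough check on the numbers above: 2^31 bits spread over 143 million keys works out to about 15 bits per key. The sketch below assumes exactly 15 bits per key (an inference from those two figures, not something stated in this ticket), and the class and method names are hypothetical.

```java
public class BloomSizing {
    // Assumption: 15 bits per key, inferred from 2^31 bits / ~143 million keys.
    static final long BITS_PER_KEY = 15;

    // Largest bit count a single java.util.BitSet can address (int indices).
    static final long BITSET_LIMIT = 1L << 31;

    // Filter size in bits for a given key count; throws on long overflow.
    static long bitsNeeded(long keys) {
        return Math.multiplyExact(keys, BITS_PER_KEY);
    }

    // Largest key count whose filter still fits in the given number of bits.
    static long maxKeysFor(long bits) {
        return bits / BITS_PER_KEY;
    }

    public static void main(String[] args) {
        System.out.println(maxKeysFor(BITSET_LIMIT));
        System.out.println(bitsNeeded(200_000_000L) > BITSET_LIMIT);
    }
}
```

      Under that assumption, an SSTable with 200 million keys would already need a filter of 3 billion bits, which is past the single-BitSet limit.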

      A few options:

      • Switch to a BitSet class which supports 2^31 * 64 bits (Lucene's OpenBitSet)
      • Partition the java.util.BitSet behind our current BloomFilter
        • Straightforward bit partitioning: bit N is in bitset floor(N / 2^31), at offset N mod 2^31
        • Separate equally sized complete bloom filters for member ranges, which can be used independently or OR'd together under memory pressure.
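
      The straightforward bit-partitioning option might look roughly like the sketch below. It is a minimal illustration, not the code from any of the attached patches: the class name PartitionedBitSet, the configurable shift parameter, and the or() helper are all hypothetical. A shift of 31 matches the 2^31 partition size mentioned above; smaller shifts are allowed only to make the sketch easy to exercise.

```java
import java.util.BitSet;

public class PartitionedBitSet {
    private final int shift;            // log2 of bits per partition (31 in the description)
    private final long mask;            // selects the offset within a partition
    private final long numBits;         // total addressable bits
    private final BitSet[] partitions;  // each BitSet grows lazily as bits are set

    public PartitionedBitSet(long numBits, int shift) {
        if (numBits <= 0 || shift < 1 || shift > 31) {
            throw new IllegalArgumentException("bad size or shift");
        }
        long count = ((numBits - 1) >>> shift) + 1;
        if (count > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("too many partitions");
        }
        this.shift = shift;
        this.mask = (1L << shift) - 1;
        this.numBits = numBits;
        this.partitions = new BitSet[(int) count];
        for (int i = 0; i < partitions.length; i++) {
            partitions[i] = new BitSet();
        }
    }

    // Bit N lives in partition floor(N / 2^shift).
    public int partitionOf(long n) {
        return (int) (n >>> shift);
    }

    public void set(long n) {
        check(n);
        partitions[partitionOf(n)].set((int) (n & mask));
    }

    public boolean get(long n) {
        check(n);
        return partitions[partitionOf(n)].get((int) (n & mask));
    }

    // Bitwise OR with another set of identical shape. For bloom filters this
    // yields a filter for the union only if both use the same hash functions.
    public void or(PartitionedBitSet other) {
        if (other.numBits != numBits || other.shift != shift) {
            throw new IllegalArgumentException("shape mismatch");
        }
        for (int i = 0; i < partitions.length; i++) {
            partitions[i].or(other.partitions[i]);
        }
    }

    private void check(long n) {
        if (n < 0 || n >= numBits) {
            throw new IndexOutOfBoundsException("bit index " + n);
        }
    }
}
```

      Because each partition is an ordinary java.util.BitSet, a serializer for this layout would need to record the partition count and the 64-bit total size rather than a single int length.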

      All of these options require new approaches to serialization.

        Attachments

        1. 1555_v5.txt (120 kB, T Jake Luciani)
        2. 1555_v6.txt (119 kB, T Jake Luciani)
        3. 1555-v7.txt (119 kB, Jonathan Ellis)
        4. cassandra-1555.tgz (12 kB, Ryan King)
        5. CASSANDRA-1555v2.patch (112 kB, Ryan King)
        6. CASSANDRA-1555v3.patch.gz (20 kB, Ryan King)
        7. CASSANDRA-1555v4.patch.gz (20 kB, Ryan King)

              People

              • Assignee: Ryan King (kingryan)
              • Reporter: Stu Hood (stuhood)
              • Reviewer: T Jake Luciani
