  1. Cassandra
  2. CASSANDRA-15834

Bloom filter false positive rate calculation does not take into account true negatives

Details

    Description

      The bloom filter false positive ratio is currently computed as:

      bf_fp_ratio = false_positive_count / (false_positive_count + true_positive_count)

      However, this calculation ignores true negatives, i.e. queries where the filter correctly reports that a key is absent. (False negatives never happen with bloom filters, so they do not need to be counted.)

      For example, if there are 1000 reads for non-existent rows and 10 of them are false positives, there are no true positives, so the bloom filter false positive ratio is wrongly calculated as 10/(10 + 0) = 1.0, while it should be 10/1000 = 0.01.

      We should update the calculation to divide by the total number of bloom filter queries:

      bf_fp_ratio = false_positive_count / #bf_queries
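
      The difference between the two formulas can be checked with a small sketch. This is an illustration only, not Cassandra's actual code: the function names are hypothetical, and it assumes that #bf_queries equals false positives + true positives + true negatives, which follows from false negatives never occurring.

```python
def current_fp_ratio(false_positives: int, true_positives: int) -> float:
    """Ratio as described in the ticket: FP / (FP + TP)."""
    denominator = false_positives + true_positives
    # Guard against division by zero when nothing has been counted yet.
    return false_positives / denominator if denominator else 0.0


def proposed_fp_ratio(false_positives: int, true_positives: int,
                      true_negatives: int) -> float:
    """Proposed ratio: FP / total number of bloom filter queries."""
    # Assumption: every query ends as a false positive, a true positive,
    # or a true negative, so the total is the sum of the three counters.
    total_queries = false_positives + true_positives + true_negatives
    return false_positives / total_queries if total_queries else 0.0


# The ticket's scenario: 1000 reads for rows that do not exist,
# 10 of which pass the bloom filter as false positives.
fp, tp, tn = 10, 0, 990
print(current_fp_ratio(fp, tp))        # 1.0
print(proposed_fp_ratio(fp, tp, tn))   # 0.01
```

      With these counters the current formula reports 1.0 and the proposed one reports 0.01, matching the numbers in the description.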

      Original jira by pauloricardomg

          People

            jtgrabowski Jaroslaw Grabowski
            Brandon Williams

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 40m