Metron (Retired) / METRON-627

Add HyperLogLogPlus implementation to Stellar

Details

    • Type: Improvement
    • Status: Done
    • Priority: Major
    • Resolution: Done
    • None
    • 0.3.1
    • None

    Description

      Calculating set cardinality can be a useful tool for a security analyst. For instance, a large volume of non-unique source IP addresses hitting your network may be an indication that you are currently under attack. There have been many advancements in distinct value (DV) estimation over the years. Implementations have evolved from K-Minimum-Values (KMV), to LogLog, to HyperLogLog, and now to Google's much-improved HyperLogLogPlus algorithm. The key improvements in this latest version of the algorithm are that it:
      • moves to a 64-bit hash
      • handles sparse sets
      • is more accurate with small cardinality
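
      To illustrate the idea, here is a minimal sketch of register-based cardinality estimation. It assumes plain HyperLogLog with a 64-bit hash and a linear-counting fallback for small sets. It is not the Metron/Stellar implementation and not the full HyperLogLogPlus algorithm: the sparse representation and empirical bias correction are omitted. The class name, the `p` parameter default, and the use of SHA-256 as the hash are assumptions made for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative plain HyperLogLog using a 64-bit hash (hypothetical helper)."""

    def __init__(self, p=12):
        self.p = p                       # number of bits used as register index
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value):
        # Take the first 8 bytes of SHA-256 as a 64-bit hash (an assumption;
        # any well-mixed 64-bit hash would do).
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits,
        # or (64 - p + 1) when they are all zero.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def cardinality(self):
        # Raw estimate from the harmonic mean of 2^register.
        raw = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # For small cardinalities, fall back to linear counting.
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw


# Usage: repeated source IPs do not inflate the distinct count.
hll = HyperLogLog()
for _ in range(3):
    for i in range(10000):
        hll.add(f"10.{i // 65536}.{(i // 256) % 256}.{i % 256}")
estimate = hll.cardinality()
```

      With `p=12` the sketch uses 4096 registers, which gives a theoretical standard error of roughly 1.04 / sqrt(4096), or about 1.6%, regardless of how many values are added.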

      This Jira tracks the effort to add a HyperLogLogPlus implementation to Metron.

      References:
      https://research.neustar.biz/2013/01/24/hyperloglog-googles-take-on-engineering-hll/
      http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40671.pdf

            People

              Assignee: Michael Miklavcic (mmiklavcic)
              Reporter: Michael Miklavcic (mmiklavcic)
              Votes: 0
              Watchers: 3
