Details

    • Type: Wish
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Fix Version/s: 4.x
    • Component/s: None
    • Labels: None

      Description

      HyperLogLog and its variants have become popular in the analytics space, and Cassandra already has "read-before-write" collections (Lists), so I think it would not be too painful to add support for a HyperLogLog "collection" type. It would act similarly to CQL 3 Sets: you would be able to "set" the value and "add" an element, but not remove an element. Reading the value of a HyperLogLog collection column would return the estimated cardinality.

      HyperLogLog has a couple of properties that fit Cassandra well:

      • Adding an element is idempotent (adding an existing element doesn't change the HLL).
      • An HLL can be thought of as a CRDT, since two HLLs can be merged safely. This means we can merge two HLLs during read-repair. If that's too much work, we could even live with LWW (last-write-wins), since these counts are "estimates" after all.
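
To make the two properties above concrete, here is a minimal Python sketch of a toy HyperLogLog. It is not Cassandra code and not the linked proof of concept: the class name `HyperLogLog`, the parameter `p`, the choice of SHA-1 as the hash, and the simple small-range correction are all assumptions made for illustration. It shows that `add` is idempotent and that `merge` is a register-wise maximum, which is what makes a merge during read-repair safe in principle.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog with 2**p registers (illustrative only, assumes p >= 7)."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the first 8 bytes of a SHA-1 digest.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                      # top p bits pick the register
        rest = h & ((1 << width) - 1)         # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        # Keeping only the maximum makes repeated adds a no-op.
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Register-wise max: commutative, associative, and idempotent.
        if self.p != other.p:
            raise ValueError("precision mismatch")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def cardinality(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


replica_a, replica_b = HyperLogLog(), HyperLogLog()
for i in range(1000):
    replica_a.add(i)
for i in range(500, 1500):
    replica_b.add(i)
repaired = replica_a.merge(replica_b)
print(round(repaired.cardinality()))  # an estimate of the 1500 distinct values
```

Because the merged registers equal those of a single HLL that saw every element, the order in which replicas are merged does not matter. LWW, by contrast, would simply keep one replica's registers and could undercount.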

      There is already a proof of concept at:
      http://vilkeliskis.com/blog/2013/12/28/hacking_cassandra.html

          Activity

          michaelsembwever mck added a comment -

          Bumping to fix version 4.x, as 3.11.0 is a bug-fix only release.
            ref https://s.apache.org/EHBy

          MoonBouncer Martin Herren added a comment -

          You've got one more fanboy for this. Love the fact that Redis has a built-in HLL type.

          This could be either a data type or a table type, as with counter tables. It could also be an enhancement of counter tables so that they can hold both counter and HLL columns.

          Either way, I'd like to see this feature!

          drew_kutchar Drew Kutcharian added a comment -

          Thanks Aleksey Yeschenko

          iamaleksey Aleksey Yeschenko added a comment -

          Assigning to myself so that it doesn't get lost.


            People

            • Assignee: Unassigned
            • Reporter: drew_kutchar Drew Kutcharian
            • Votes: 4
            • Watchers: 18
