  1. Apache Arrow
  2. ARROW-14158

[C++][Compute] Implement count distinct kernel using HyperLogLog

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.0.0
    • Fix Version/s: None
    • Component/s: C++

    Description

      It may be useful to have an approximate variant of the count distinct aggregation kernel, based on HyperLogLog.

      Note: The implementation should support the merge operator.
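
      To make the merge requirement concrete, here is a minimal, self-contained sketch of a HyperLogLog state with a merge step. It is only an illustration, not a proposed Arrow implementation: the names `HllSketch`, `AddHash`, `Merge`, `Estimate` and `Mix64` are hypothetical, the default precision of 12 bits (4096 registers) is an assumption, and the hash is a plain splitmix64-style mixer standing in for whatever hash the kernel would actually use. The bias constant and small-range correction follow the Flajolet et al. paper linked below.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in 64-bit hash (splitmix64 finalizer); an assumption, not Arrow's hash.
inline uint64_t Mix64(uint64_t x) {
  x += 0x9e3779b97f4a7c15ULL;
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

// Hypothetical HyperLogLog state; assumes 7 <= precision <= 16.
class HllSketch {
 public:
  explicit HllSketch(int precision = 12)
      : p_(precision), registers_(std::size_t{1} << precision, 0) {}

  // Record one already-hashed value.
  void AddHash(uint64_t h) {
    const uint64_t idx = h >> (64 - p_);  // top p bits select the register
    const uint64_t rest = h << p_;        // remaining bits, left-aligned
    // rank = 1-based position of the first set bit in the remaining bits,
    // or (64 - p + 1) if they are all zero.
    uint8_t rank = 1;
    uint64_t mask = uint64_t{1} << 63;
    while (rank <= 64 - p_ && (rest & mask) == 0) {
      ++rank;
      mask >>= 1;
    }
    registers_[idx] = std::max(registers_[idx], rank);
  }

  // Merge another sketch into this one: element-wise max of the registers.
  // Returns false (and changes nothing) if the precisions differ.
  bool Merge(const HllSketch& other) {
    if (other.p_ != p_) return false;
    for (std::size_t i = 0; i < registers_.size(); ++i) {
      registers_[i] = std::max(registers_[i], other.registers_[i]);
    }
    return true;
  }

  // Approximate number of distinct hashes seen so far.
  double Estimate() const {
    const double m = static_cast<double>(registers_.size());
    double sum = 0.0;
    std::size_t zeros = 0;
    for (uint8_t r : registers_) {
      sum += std::ldexp(1.0, -static_cast<int>(r));  // 2^(-r)
      if (r == 0) ++zeros;
    }
    const double alpha = 0.7213 / (1.0 + 1.079 / m);  // valid for m >= 128
    const double raw = alpha * m * m / sum;
    // Small-range correction: fall back to linear counting.
    if (raw <= 2.5 * m && zeros > 0) {
      return m * std::log(m / static_cast<double>(zeros));
    }
    return raw;
  }

 private:
  int p_;
  std::vector<uint8_t> registers_;
};
```

      Because merging is an element-wise maximum, merging two partial sketches gives exactly the same registers as feeding all values into a single sketch, which is the property an aggregation kernel needs when it combines per-thread or per-batch states.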

      cc icook lidavidm

      Some resources/links:
      http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf
      https://engineering.fb.com/2018/12/13/data-infrastructure/hyperloglog/
      https://github.com/facebookincubator/velox/tree/main/velox/aggregates/hyperloglog

          People

            Assignee: Unassigned
            Reporter: aucahuasi (Percy Camilo Triveño Aucahuasi)