Calcite / CALCITE-1787

thetaSketch Support for Druid Adapter

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.14.0
    • Component/s: druid-adapter
    • Labels: None

    Description

      Currently, the Druid adapter does not support the thetaSketch aggregator, which quickly estimates the cardinality (number of distinct values) of a column. Many Druid instances support theta sketches, so I think it would be a nice feature to have.

      I've been looking at the Druid adapter, and I propose adding a new DruidType called thetaSketch, then adding logic to the getJsonAggregation method in the DruidQuery class to generate the thetaSketch aggregate. This will require access to each column's data type, so that the thetaSketch aggregate is produced only when the column's type is thetaSketch.

      Also, I've noticed that a hyperUnique DruidType is currently defined, but a hyperUnique aggregate is never produced. Since both are approximate aggregators, I could also include the logic for hyperUnique in the same change.
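
      To make the proposal concrete, here is a minimal, standalone sketch of the type-based dispatch described above. It is not the adapter's actual code: the `ColumnType` enum, the `approxDistinctJson` helper, and the column names are hypothetical stand-ins for DruidType and the logic that would live in getJsonAggregation. The JSON shape (`type`, `name`, `fieldName`) is my understanding of Druid's thetaSketch and hyperUnique aggregator specs. The sketch also does no JSON escaping, which real code would need.

```java
import java.util.HashMap;
import java.util.Map;

class SketchAggregationSketch {
  /** Hypothetical stand-in for the adapter's column-type enum. */
  enum ColumnType { STRING, LONG, FLOAT, HYPER_UNIQUE, THETA_SKETCH }

  /**
   * Builds the Druid aggregator JSON for an approximate distinct count on
   * {@code column}, or returns null when the column is not a sketch column
   * (the caller would then fall back to its existing behavior).
   */
  static String approxDistinctJson(String outputName, String column,
      Map<String, ColumnType> columnTypes) {
    ColumnType type = columnTypes.get(column);
    if (type == null) {
      throw new IllegalArgumentException("unknown column: " + column);
    }
    String aggType;
    if (type == ColumnType.THETA_SKETCH) {
      aggType = "thetaSketch";
    } else if (type == ColumnType.HYPER_UNIQUE) {
      aggType = "hyperUnique";
    } else {
      return null; // not a sketch column: no approximate aggregator emitted
    }
    return "{\"type\":\"" + aggType + "\","
        + "\"name\":\"" + outputName + "\","
        + "\"fieldName\":\"" + column + "\"}";
  }

  public static void main(String[] args) {
    // Hypothetical table metadata: column name -> column type.
    Map<String, ColumnType> types = new HashMap<>();
    types.put("user_sketch", ColumnType.THETA_SKETCH);
    types.put("user_unique", ColumnType.HYPER_UNIQUE);
    types.put("page", ColumnType.STRING);

    System.out.println(approxDistinctJson("unique_users", "user_sketch", types));
    System.out.println(approxDistinctJson("unique_users", "user_unique", types));
    System.out.println(approxDistinctJson("unique_pages", "page", types));
  }
}
```

      The key point is that the aggregator is chosen from the column's declared type rather than from the SQL alone, which is why the query builder needs access to the column metadata.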

      I'd love to hear your thoughts on my approach, and any suggestions you have for this feature.

            People

              Assignee: Zain Humayun (zhumayun)
              Reporter: Zain Humayun (zhumayun)
              Votes: 0
              Watchers: 6
