Calcite / CALCITE-1787

thetaSketch Support for Druid Adapter

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.14.0
    • Component/s: druid
    • Labels:
      None

      Description

      Currently, the Druid adapter does not support the thetaSketch aggregator, which quickly estimates the cardinality (distinct count) of a column. Many Druid instances support theta sketches, so I think it would be a nice feature to have.

      I've been looking at the Druid adapter, and I propose adding a new DruidType called thetaSketch, then adding logic to the getJsonAggregation method in class DruidQuery to generate the thetaSketch aggregate. This requires access to each column's data type, so that the thetaSketch aggregate is produced only when the column's type is thetaSketch.

      Also, I've noticed that a hyperUnique DruidType is currently defined, but a hyperUnique aggregate is never produced. Since both are approximate aggregators, I could include the logic for hyperUnique in the same change.
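      To make the proposal concrete, here is a rough, standalone sketch of the type-driven selection described above. It is not the adapter's actual code: the class SketchAggregationSketch, the ColumnType enum, and the approxDistinctJson helper are hypothetical names made up for illustration, and the JSON is built by hand rather than through whatever generator DruidQuery uses. The only parts taken from Druid itself are the aggregator type names ("thetaSketch", "hyperUnique") and the "name"/"fieldName" keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class SketchAggregationSketch {
  /** Hypothetical stand-in for the adapter's column types. */
  public enum ColumnType { STRING, LONG, FLOAT, HYPER_UNIQUE, THETA_SKETCH }

  /**
   * Returns the JSON for an approximate-distinct aggregator if the column
   * was ingested as a sketch type; otherwise returns empty so the caller
   * can fall back to its existing logic.
   */
  public static Optional<String> approxDistinctJson(
      String outputName, String fieldName, Map<String, ColumnType> columnTypes) {
    ColumnType type = columnTypes.get(fieldName);
    if (type == null) {
      return Optional.empty(); // unknown column: nothing to emit
    }
    switch (type) {
      case THETA_SKETCH:
        return Optional.of(json("thetaSketch", outputName, fieldName));
      case HYPER_UNIQUE:
        return Optional.of(json("hyperUnique", outputName, fieldName));
      default:
        return Optional.empty(); // plain column: not a sketch aggregate
    }
  }

  // Assumes names contain no characters that need JSON escaping.
  private static String json(String type, String name, String fieldName) {
    return "{\"type\":\"" + type + "\",\"name\":\"" + name
        + "\",\"fieldName\":\"" + fieldName + "\"}";
  }

  public static void main(String[] args) {
    // Hypothetical column metadata; in the adapter this would come from the table definition.
    Map<String, ColumnType> columns = new LinkedHashMap<>();
    columns.put("user_sketch", ColumnType.THETA_SKETCH);
    columns.put("user_hll", ColumnType.HYPER_UNIQUE);
    columns.put("page", ColumnType.STRING);

    System.out.println(approxDistinctJson("unique_users", "user_sketch", columns).orElse("(fallback)"));
    System.out.println(approxDistinctJson("unique_users", "user_hll", columns).orElse("(fallback)"));
    System.out.println(approxDistinctJson("unique_pages", "page", columns).orElse("(fallback)"));
  }
}
```

      Returning an empty Optional for non-sketch columns keeps the new branch from changing how existing aggregates are generated, which seems like the least invasive way to add this.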

      I'd love to hear your thoughts on my approach, and any suggestions you have for this feature.

              People

              • Assignee:
                zhumayun Zain Humayun
                Reporter:
                zhumayun Zain Humayun
              • Votes:
                0
                Watchers:
                7
