Calcite / CALCITE-1787

thetaSketch Support for Druid Adapter


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.14.0
    • Component/s: druid-adapter
    • Labels: None

    Description

      The Druid adapter does not currently support the thetaSketch aggregate type, which quickly estimates the cardinality (the number of distinct values) of a column. Many Druid instances support theta sketches, so I think this would be a useful feature.

      I've been looking at the Druid adapter, and I propose adding a new DruidType called thetaSketch, then adding logic to the getJsonAggregation method in class DruidQuery to generate the thetaSketch aggregate. This requires access to each column's data type, so that the thetaSketch aggregate is produced only when the column's type is thetaSketch.

      I've also noticed that a hyperUnique DruidType is already defined, but a hyperUnique aggregate is never produced. Since both are approximate aggregators, I could add the hyperUnique logic in the same change.
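
      To make the proposal concrete, here is a minimal sketch of the type-based dispatch I have in mind. It is not the adapter's actual code: the ApproxAggregations class, its ColumnType enum, and the approxDistinctAggregation method are hypothetical stand-ins for DruidType and getJsonAggregation, and the JSON is built with plain string formatting on the assumption that names need no escaping. The only Druid-specific parts are the "thetaSketch" and "hyperUnique" aggregator type names with their "name" and "fieldName" properties.

```java
import java.util.Optional;

/** Hypothetical sketch: picks an approximate distinct-count aggregator by column type. */
final class ApproxAggregations {
  private ApproxAggregations() {}

  /** Simplified, hypothetical stand-in for the adapter's DruidType. */
  enum ColumnType {
    LONG, STRING, HYPER_UNIQUE, THETA_SKETCH
  }

  /**
   * Returns the JSON for an approximate distinct-count aggregator, or empty
   * if the column's type is neither thetaSketch nor hyperUnique.
   *
   * Assumes name and fieldName contain no characters that need JSON escaping.
   */
  static Optional<String> approxDistinctAggregation(
      String name, String fieldName, ColumnType type) {
    final String druidAggregator;
    switch (type) {
    case THETA_SKETCH:
      druidAggregator = "thetaSketch";
      break;
    case HYPER_UNIQUE:
      druidAggregator = "hyperUnique";
      break;
    default:
      // Not a sketch column: leave the decision to the existing logic.
      return Optional.empty();
    }
    return Optional.of(
        String.format("{\"type\":\"%s\",\"name\":\"%s\",\"fieldName\":\"%s\"}",
            druidAggregator, name, fieldName));
  }
}
```

      The empty result for other column types is a deliberate choice in this sketch: it keeps the new logic from changing how aggregates on ordinary columns are generated.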

      I'd love to hear your thoughts on my approach, and any suggestions you have for this feature.

          People

            Assignee: Zain Humayun (zhumayun)
            Reporter: Zain Humayun (zhumayun)
            Votes: 0
            Watchers: 7
