  1. Calcite
  2. CALCITE-2170

Use Druid Expressions capabilities to improve the amount of work that can be pushed to Druid


    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.16.0
    • Component/s: druid
    • Labels:


      Druid 0.11 introduces built-in capabilities called Expressions, which can be used to push down expressions in projects, aggregates, and filters.

      To leverage this new feature, some changes need to be made to the Druid Calcite adapter.

      The currently supported functions and expressions are listed in the Druid
      documentation. As the docs show, an expression can be an actual tree of operators.
      Expressions can be used with Filters, Projects, Aggregates, PostAggregates and
      Having filters. For Filters, there will be a new kind of filter called the
      Expression filter. You might ask whether we can push everything as an Expression
      filter. The short answer is no, because other kinds of Druid filters perform
      better when they apply; hence the Expression filter is a plan B. To push
      expressions as Projects and Aggregates, we will use expression-based Virtual Columns.
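
      The "plan B" ordering described above can be sketched as follows. This is a minimal illustration, not the adapter's code: `FilterChooser` and `toFilterJson` are hypothetical names, predicates are passed as plain strings, and no JSON escaping is done. The JSON shapes follow Druid's selector and expression filter types.

```java
/** Hypothetical illustration; not part of the Calcite Druid adapter. */
final class FilterChooser {
  private FilterChooser() {}

  /**
   * Builds Druid filter JSON for a predicate (sketch only, no JSON escaping).
   *
   * @param column column name, or null when the predicate is not a plain column comparison
   * @param value literal compared for equality, or null
   * @param expression Druid expression string equivalent to the whole predicate
   */
  static String toFilterJson(String column, String value, String expression) {
    if (column != null && value != null) {
      // Preferred path: a native selector filter, which Druid evaluates more efficiently.
      return "{\"type\":\"selector\",\"dimension\":\"" + column
          + "\",\"value\":\"" + value + "\"}";
    }
    // Plan B: fall back to an expression filter for anything else.
    return "{\"type\":\"expression\",\"expression\":\"" + expression + "\"}";
  }
}
```

      A simple equality such as country = 'US' takes the selector path, while a predicate such as (x + y) > 10 has no native equivalent in this sketch and ends up as an expression filter.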

      The major change is merging the pushdown verification logic with the
      translation of RexCall/RexNode to Druid JSON, Druid's native physical language. The
      main driver behind this redesign is that, in order to check whether we can
      push down a tree of expressions to Druid, we have to compute the Druid expression
      string anyway. Thus, instead of having two different code paths, one for pushdown
      validation and one for JSON generation, we can have one function that does both.
      For instance, instead of having one code path that checks whether a given filter
      can be pushed and then a separate translation layer, we will have
      one function that either returns a valid Druid Filter or returns null if pushdown
      is not possible. The same idea will be applied to how we push Projects,
      Aggregates, Post Aggregates and Sort.
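
      One way a caller could use such a "translate or return null" function is sketched below. The class `PushdownSplit`, the string-typed predicates, and the idea of keeping untranslatable predicates in a residual list are assumptions made for illustration; the adapter's real rules work on RexNode trees.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration of "translate or return null"; not the adapter's real code. */
final class PushdownSplit {
  /** Druid filters that can be sent to Druid. */
  final List<String> pushed = new ArrayList<>();
  /** Predicates that could not be translated and stay outside the Druid query. */
  final List<String> residual = new ArrayList<>();

  /**
   * Splits predicates using a single translator, with no separate validation pass.
   *
   * @param predicates conjuncts, written here as plain strings for brevity
   * @param translator returns a Druid filter, or null when the predicate cannot be pushed
   */
  static PushdownSplit split(List<String> predicates, Function<String, String> translator) {
    PushdownSplit result = new PushdownSplit();
    for (String predicate : predicates) {
      String druidFilter = translator.apply(predicate);
      if (druidFilter != null) {
        // A non-null result is both the proof of pushability and the translation.
        result.pushed.add(druidFilter);
      } else {
        result.residual.add(predicate);
      }
    }
    return result;
  }
}
```

      Because the null check doubles as the validation step, the decision and the generated filter cannot drift apart the way two separate code paths can.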

      Here are the main elements/classes of the new design. First, we merge the logic that
      translates Literals/InputRef/RexCall to a Druid physical representation.
      A leaf RexNode is translated to a valid pair of Druid column and extraction function, if possible:

       /**
        * @param rexNode leaf input ref to translate to a Druid column
        * @param rowType row type
        * @param druidQuery Druid query
        * @return {@link Pair} of column name and extraction function on top of the input ref, or
        * {@code Pair.of(null, null)} when it cannot be translated to a valid Druid column
        */
       protected static Pair<String, ExtractionFunction> toDruidColumn(RexNode rexNode,
           RelDataType rowType, DruidQuery druidQuery)

      On the other hand, to convert literals to Druid literals, we will introduce:

       /**
        * @param rexNode rexNode to translate to its Druid literal equivalent
        * @param rowType row type associated with rexNode
        * @param druidQuery Druid query
        * @return non-null string, or null if it cannot be translated to a valid Druid equivalent
        */
       private static String toDruidLiteral(RexNode rexNode, RelDataType rowType,
           DruidQuery druidQuery)

      The main new functions used for node pushdown and Druid JSON generation are listed here.

      Filter pushdown verification and generation are done via


      For project pushdown added


      For Grouping pushdown added


      For Aggregation pushdown added


      For sort pushdown added

      Pushing of PostAggregates will be using Expression post Aggregates and use

      to generate expression

      For Expression computation most of the work is done here

      This static function generates a Druid expression string from a given RexNode, or
      returns null if that is not possible.

      public static String toDruidExpression(
          final RexNode rexNode,
          final RelDataType inputRowType,
          final DruidQuery druidRel)

      To support various kinds of expressions, the following interface was added.

      With it, users can implement a custom expression converter based on the SqlOperator syntax and signature.

      public interface DruidSqlOperatorConverter {
        /**
         * Returns the Calcite SQL operator corresponding to the Druid operator.
         * @return operator
         */
        SqlOperator calciteOperator();

        /**
         * Translates rexNode to a valid Druid expression.
         * @param rexNode rexNode to translate to a Druid expression
         * @param rowType row type associated with rexNode
         * @param druidQuery Druid query, used to look up related configs/fields such as timeZone
         * @return valid Druid expression, or null if it cannot convert the rexNode
         */
        @Nullable String toDruidExpression(RexNode rexNode, RelDataType rowType, DruidQuery druidQuery);
      }

      The DruidQuery class will provide
      org.apache.calcite.adapter.druid.DruidQuery#getOperatorConversionMap which is a
      map of SqlOperator to DruidSqlOperatorConverter.
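
      The interplay between the converter map and the recursive expression builder can be sketched as follows. Everything here is a self-contained stand-in: `ExprSketch`, `Node`, `Converter`, `binary` and `defaultConverters` are hypothetical, the map is keyed by operator name rather than by SqlOperator, and leaves already carry their Druid text. Only the contract is taken from the description above: look up a converter per operator and return null as soon as any part cannot be converted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-ins for RexNode and the converter interface; not Calcite classes. */
final class ExprSketch {
  private ExprSketch() {}

  /** A tiny expression tree: a leaf (column or literal text) or an operator call. */
  static final class Node {
    final String op;            // operator name, or null for a leaf
    final String leaf;          // Druid text of a leaf, e.g. "\"price\"" or "2"
    final List<Node> operands;

    private Node(String op, String leaf, List<Node> operands) {
      this.op = op;
      this.leaf = leaf;
      this.operands = operands;
    }

    static Node leaf(String text) {
      return new Node(null, text, Collections.<Node>emptyList());
    }

    static Node call(String op, Node... operands) {
      return new Node(op, null, Arrays.asList(operands));
    }
  }

  /** Mirrors the shape of the converter interface, simplified to one method. */
  interface Converter {
    /** Returns a Druid expression, or null when the call cannot be converted. */
    String toDruidExpression(Node call, Map<String, Converter> converters);
  }

  /** Returns a Druid expression string, or null if any part of the tree is unsupported. */
  static String toDruidExpression(Node node, Map<String, Converter> converters) {
    if (node.op == null) {
      return node.leaf;
    }
    Converter converter = converters.get(node.op);
    return converter == null ? null : converter.toDruidExpression(node, converters);
  }

  /** Converter for a binary infix operator; propagates null from either operand. */
  static Converter binary(String druidOp) {
    return (call, converters) -> {
      if (call.operands.size() != 2) {
        return null;
      }
      String left = toDruidExpression(call.operands.get(0), converters);
      String right = toDruidExpression(call.operands.get(1), converters);
      if (left == null || right == null) {
        return null;
      }
      return "(" + left + " " + druidOp + " " + right + ")";
    };
  }

  /** A small operator-to-converter map; a custom converter would be one more put. */
  static Map<String, Converter> defaultConverters() {
    Map<String, Converter> map = new HashMap<>();
    map.put("PLUS", binary("+"));
    map.put("TIMES", binary("*"));
    return map;
  }
}
```

      In this sketch, supporting a new operator means adding one entry to the map, and an unknown operator anywhere in the tree makes the whole expression return null, so it is not pushed.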
      Any feedback is welcome.






              • Assignee:
                bslim slim bouguerra


                • Created: