Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-2170

Use Druid Expressions capabilities to improve the amount of work that can be pushed to Druid

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.16.0
    • Component/s: druid
    • Labels:
      None

      Description

      Druid 0.11 has newly built in capabilities called Expressions that can be used to push expression like projects/aggregates/filters. 

      In order to leverage this new feature, some changes need to be done to the Druid Calcite adapter. 

      This is a link to the current supported functions and expressions in Druid
      http://druid.io/docs/latest/misc/math-expr.html
      As you can see from the Docs an expression can be an actual tree of operators,
      Expression can be used with Filters, Projects, Aggregates, PostAggregates and
      Having filters. For Filters will have new Filter kind called Filter expression.
      FYI, you might ask can we push everything as Expression Filter the short answer
      is no because, other kinds of Druid filters perform better when used, Hence
      Expression filter is a plan B sort of thing. In order to push expression as
      Projects and Aggregates we will be using Expression based Virtual Columns.

      The major change is the merging of the logic of pushdown verification code and
      the Translation of RexCall/RexNode to Druid Json, native physical language. The
      main drive behind this redesign is the fact that in order to check if we can
      push down a tree of expressions to Druid we have to compute the Druid Expression
      String anyway. Thus instead of having 2 different code paths, one for pushdown
      validation and one for Json generation we can have one function that does both.
      For instance instead of having one code path to test and check if a given filter
      can be pushed or not and then having a translation layer code, will have
      one function that either returns a valid Druid Filter or null if it is not
      possible to pushdown. The same idea will be applied to how we push Projects and
      Aggregates, Post Aggregates and Sort.

      Here are the main elements/Classes of the new design. First will be merging the logic of
      Translation of Literals/InputRex/RexCall to a Druid physical representation.
      Translate leaf RexNode to Valid pair Druid Column + Extraction functions if possible

      /**
       * @param rexNode leaf Input Ref to Druid Column
       * @param rowType row type
       * @param druidQuery druid query
       *
       * @return {@link Pair} of Column name and Extraction Function on the top of the input ref or
       * {@link Pair of(null, null)} when can not translate to valid Druid column
       */
       protected static Pair<String, ExtractionFunction> toDruidColumn(RexNode rexNode,
       RelDataType rowType, DruidQuery druidQuery
       )
      

      In the other hand, in order to Convert Literals to Druid Literals will introduce

      /**
       * @param rexNode rexNode to translate to Druid literal equivalante
       * @param rowType rowType associated to rexNode
       * @param druidQuery druid Query
       *
       * @return non null string or null if it can not translate to valid Druid equivalent
       */
      @Nullable
      private static String toDruidLiteral(RexNode rexNode, RelDataType rowType,
       DruidQuery druidQuery
      )
      

      Main new functions used to pushdown nodes and Druid Json generation.

      Filter pushdown verification and generates is done via

      org.apache.calcite.adapter.druid.DruidJsonFilter#toDruidFilters
      

      For project pushdown added

      org.apache.calcite.adapter.druid.DruidQuery#computeProjectAsScan.
      

      For Grouping pushdown added

      org.apache.calcite.adapter.druid.DruidQuery#computeProjectGroupSet.
      

      For Aggregation pushdown added

      org.apache.calcite.adapter.druid.DruidQuery#computeDruidJsonAgg
      

      For sort pushdown added

      org.apache.calcite.adapter.druid.DruidQuery#computeSort\{code}
      Pushing of PostAggregates will be using Expression post Aggregates and use
      

      org.apache.calcite.adapter.druid.DruidExpressions#toDruidExpression{code}
      to generate expression

      For Expression computation most of the work is done here

      org.apache.calcite.adapter.druid.DruidExpressions#toDruidExpression\{code}
      This static function generates Druid String expression out of a given RexNode or
      returns null if not possible.
      

      @Nullable
      public static String toDruidExpression(
      final RexNode rexNode,
      final RelDataType inputRowType,
      final DruidQuery druidRel
      )

      In order to support various kind of expressions added the following interface
      

      org.apache.calcite.adapter.druid.DruidSqlOperatorConverter{code}
      Thus user can implement custom expression converter based on the SqlOperator syntax and signature.

      public interface DruidSqlOperatorConverter {
       /**
       * Returns the calcite SQL operator corresponding to Druid operator.
       *
       * @return operator
       */
       SqlOperator calciteOperator();
       /**
       * Translate rexNode to valid Druid expression.
       * @param rexNode rexNode to translate to Druid expression
       * @param rowType row type associated with rexNode
       * @param druidQuery druid query used to figure out configs/fields related like timeZone
       *
       * @return valid Druid expression or null if it can not convert the rexNode
       */
       @Nullable String toDruidExpression(RexNode rexNode, RelDataType rowType, DruidQuery druidQuery);
      }
      

      The Druid Query Class will provide
      org.apache.calcite.adapter.druid.DruidQuery#getOperatorConversionMap which is a
      map of SqlOperator to DruidSqlOperatorConverter.
      Any feedback is welcome.

       

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                bslim slim bouguerra
                Reporter:
                bslim slim bouguerra
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: