  1. Calcite
  2. CALCITE-2696

Make it easier to configure SqlToRelConverter.Config.getInSubQueryThreshold()


Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.17.0
    • Fix Version/s: None
    • Component/s: core

    Description

      A Filter containing an IN clause is not passed to Enumerable.scan.

      I'm using the Calcite JDBC driver with my own SchemaFactory (defined by a model property) that provides a schema containing a ProjectableFilterableTable:

      String model = "inline:" //
      + "{" //
      + " version: '1.0', " //
      + " defaultSchema: 'test'," //
      + " schemas: [" //
      + " {" //
      + " name: 'test'," //
      + " type: 'custom'," //
      + " factory: '" + TestSchemaFactory.class.getName() + "'" //
      + " }"
      + " ]" //
      + "}";
      Properties properties = new Properties();
      properties.put(CalciteConnectionProperty.MODEL.camelName(), model);
      connection = DriverManager.getConnection("jdbc:calcite:", properties);
      

       

       

      class TestTable extends AbstractQueryableTable implements ProjectableFilterableTable {
      
        public Enumerable<Object[]> scan(DataContext root, List<RexNode> filters, int[] projects) {
      ...
        }
      
        ...
      }

       

      The table maps to a Java class and provides two Integer-typed columns, "value1" and "value2".

      Executing the following statement leads to quite expensive behavior in the scan method:

       

      SELECT "value" FROM "TEST_TABLE" WHERE "value1" = 1 AND "value2" in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
      

      The scan method is invoked with a filter that covers only the "value1" = 1 part; the IN clause is omitted completely. The result on the JDBC side is still valid, but in my case this leads to a full scan of a large underlying data set (millions of rows).

      Interestingly, the filter part reflecting the IN operator is provided if the number of elements in the list is below 20. This seems to be controlled by org.apache.calcite.sql2rel.SqlToRelConverter.Config#getInSubQueryThreshold. It would be very helpful if this behavior could be configured at the JDBC property level.
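
      To make the observed threshold behavior concrete, here is a minimal standalone sketch. It is not Calcite code: the class InListThresholdSketch and its methods are hypothetical names, and the rule it encodes (an IN list with fewer elements than the threshold stays in the filter as an OR chain, a longer one does not) is an assumption based only on the behavior observed in this report.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Simplified model of the observed threshold rule; hypothetical, not part of Calcite. */
class InListThresholdSketch {

  /** Assumed default, taken from the "below 20" observation in this report. */
  static final int ASSUMED_DEFAULT_THRESHOLD = 20;

  /** Returns true if an IN list of this size would still reach scan() as a filter. */
  static boolean staysInFilter(int listSize, int threshold) {
    return listSize < threshold;
  }

  /** Renders the OR chain that a short IN list is assumed to be expanded into. */
  static String expandToOr(String column, List<Integer> values) {
    return values.stream()
        .map(v -> "\"" + column + "\" = " + v)
        .collect(Collectors.joining(" OR "));
  }

  public static void main(String[] args) {
    // 19 values: below the assumed threshold, so the predicate stays in the filter.
    List<Integer> shortList = IntStream.rangeClosed(1, 19).boxed().collect(Collectors.toList());
    System.out.println(staysInFilter(shortList.size(), ASSUMED_DEFAULT_THRESHOLD));
    System.out.println(expandToOr("value2", shortList));

    // 20 values, as in the query from this report: the predicate no longer reaches scan().
    System.out.println(staysInFilter(20, ASSUMED_DEFAULT_THRESHOLD));
  }
}
```

      Under this model, raising the threshold above the list size would be enough to keep the IN predicate in the filter, which is why a JDBC-level setting for it would help.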

          People

            Assignee: Unassigned
            Reporter: Dirk Mahler (dirk.mahler)


              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 10m
