Calcite / CALCITE-3873

Use global caching for ReflectiveVisitDispatcher implementation

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.22.0
    • Fix Version/s: None
    • Component/s: core

    Description

      Out of curiosity, I used a flame graph to profile a simple query. The code snippet is shown below.

          String sql = "select empno, gender, name from EMPS where name = 'John'";
          Connection connection = null;
          Statement statement = null;
          try {
            Properties info = new Properties();
            // jsonPath(...) is a helper that is not shown in this snippet.
            info.put("model", jsonPath("smart"));
            connection = DriverManager.getConnection("jdbc:calcite:", info);
            String x = null;
            // Execute the same query 50,000 times and read every row.
            for (int i = 0; i < 50000; i++) {
              statement = connection.createStatement();
              final ResultSet resultSet = statement.executeQuery(sql);
              while (resultSet.next()) {
                x = resultSet.getInt(1)
                    + resultSet.getString(2)
                    + resultSet.getString(3);
              }
            }
          } catch (SQLException e) {
            e.printStackTrace();
          } finally {
            // close(...) is a helper that is not shown in this snippet.
            close(connection, statement);
          }
      

       

      I have attached the generated flame graph as pic1.svg. The sampled time breaks down as follows:

      3% on sql2rel,
      9% on query optimization,
      62% on code generation and implementation,
      20% on result set iteration and checking,
      …
      

      I hope this graph is informative. Since I only started learning Calcite recently, I cannot tell where tuning should begin, but one small detail in the graph caught my attention: there are many reflection invocations in Prepare#trimUnusedFields. So I spent some time trying to mitigate this small overhead.

      I optimized ReflectiveVisitDispatcher by introducing a global, size-limited Guava cache for the looked-up methods. I also added full unit tests for ReflectUtil.
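
The idea of a global, size-limited method cache can be sketched as follows. This is only an illustration under stated assumptions, not the patch attached to this issue: it uses a JDK LinkedHashMap with LRU eviction instead of Guava so that it is self-contained, the class GlobalVisitMethodCache, the bound of 128 entries, and the demo classes are hypothetical, and the lookup only walks the visitee's superclass chain, which is simpler than what ReflectUtil does.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of a process-wide, size-bounded visit-method cache. */
public final class GlobalVisitMethodCache {
  /** Assumed upper bound; the issue only says the cache size is limited. */
  private static final int MAX_ENTRIES = 128;

  /** Access-ordered map that evicts its least recently used entry when full. */
  private static final Map<Key, Method> CACHE =
      new LinkedHashMap<Key, Method>(16, 0.75f, true) {
        @Override protected boolean removeEldestEntry(Map.Entry<Key, Method> eldest) {
          return size() > MAX_ENTRIES;
        }
      };

  private GlobalVisitMethodCache() {}

  /** Cache key: visitor class, visitee class, and visit-method name. */
  private static final class Key {
    private final Class<?> visitorClass;
    private final Class<?> visiteeClass;
    private final String methodName;

    Key(Class<?> visitorClass, Class<?> visiteeClass, String methodName) {
      this.visitorClass = visitorClass;
      this.visiteeClass = visiteeClass;
      this.methodName = methodName;
    }

    @Override public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key that = (Key) o;
      return visitorClass == that.visitorClass
          && visiteeClass == that.visiteeClass
          && methodName.equals(that.methodName);
    }

    @Override public int hashCode() {
      return Objects.hash(visitorClass, visiteeClass, methodName);
    }
  }

  /** Returns the cached visit method, resolving it reflectively on a miss; null if none exists. */
  public static Method lookup(Class<?> visitorClass, Class<?> visiteeClass, String methodName) {
    Key key = new Key(visitorClass, visiteeClass, methodName);
    synchronized (CACHE) {
      Method method = CACHE.get(key);
      if (method == null) {
        method = resolve(visitorClass, visiteeClass, methodName);
        if (method != null) {
          CACHE.put(key, method);
        }
      }
      return method;
    }
  }

  /** Finds a public one-argument method whose parameter is the visitee class or a superclass. */
  private static Method resolve(Class<?> visitorClass, Class<?> visiteeClass, String methodName) {
    for (Class<?> c = visiteeClass; c != null; c = c.getSuperclass()) {
      try {
        return visitorClass.getMethod(methodName, c);
      } catch (NoSuchMethodException e) {
        // No exact match for this parameter type; try the superclass next.
      }
    }
    return null;
  }

  // Hypothetical demo types, used only to exercise the cache.
  public static class Node {}
  public static class Leaf extends Node {}
  public static class DemoVisitor {
    public String visit(Node node) {
      return "node";
    }
  }
}
```

In this sketch, calling `GlobalVisitMethodCache.lookup(DemoVisitor.class, Leaf.class, "visit")` twice returns the same Method object, because the second call is served from the shared map rather than from a fresh reflective lookup.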

      I counted the references to ReflectUtil#createMethodDispatcher and ReflectUtil#createDispatcher (see below). There are 68 possible invocations in total, so the cache size is bounded. By caching all of these methods for the lifetime of the process, we can eliminate the overhead of looking up methods via reflection.

      org.apache.calcite.rel.rel2sql.RelToSqlConverter: 18 possible invocations.
      org.apache.calcite.sql2rel.RelDecorrelator: 15 possible invocations.
      org.apache.calcite.sql2rel.RelFieldTrimmer: 11 possible invocations.
      org.apache.calcite.sql2rel.RelStructuredTypeFlattener.RewriteRelVisitor: 22 possible invocations.
      org.apache.calcite.interpreter.Interpreter.CompilerImpl: 2 possible invocations.
      

      Before the global cache was introduced, cached methods were held per ReflectiveVisitDispatcher instance. Now different ReflectiveVisitDispatcher instances, including those in different threads, can reuse the cached methods.
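
The difference between the two cache scopes can be illustrated with a small sketch. Everything here is hypothetical (CacheScopeDemo, the string keys, and the simulated lookup are stand-ins, not Calcite code), and it assumes that a fresh dispatcher is created for each use. It counts how often the simulated expensive lookup runs when each dispatcher owns its cache versus when all dispatchers share one static ConcurrentHashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical comparison of per-instance and process-wide caching. */
public final class CacheScopeDemo {
  /** Counts how many times the simulated reflective lookup actually runs. */
  static final AtomicInteger LOOKUPS = new AtomicInteger();

  private CacheScopeDemo() {}

  /** Stand-in for an expensive reflective method lookup. */
  static String expensiveLookup(String key) {
    LOOKUPS.incrementAndGet();
    return "method-for-" + key;
  }

  /** Each instance owns its cache, so every new instance starts cold. */
  static final class PerInstanceDispatcher {
    private final Map<String, String> cache = new HashMap<>();

    String lookup(String key) {
      return cache.computeIfAbsent(key, CacheScopeDemo::expensiveLookup);
    }
  }

  /** All instances, in any thread, share one static cache. */
  static final class GlobalDispatcher {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    String lookup(String key) {
      return CACHE.computeIfAbsent(key, CacheScopeDemo::expensiveLookup);
    }
  }

  /** Creates the given number of dispatchers, looks up one key in each, and returns the lookup count. */
  static int countLookups(boolean global, int dispatchers) {
    LOOKUPS.set(0);
    for (int i = 0; i < dispatchers; i++) {
      if (global) {
        new GlobalDispatcher().lookup("visit(Project)");
      } else {
        new PerInstanceDispatcher().lookup("visit(Project)");
      }
    }
    return LOOKUPS.get();
  }
}
```

With per-instance caches, the simulated lookup runs once per dispatcher; with the shared cache, it runs only the first time the key is seen in the process. The loop is single-threaded to keep the count deterministic, but `ConcurrentHashMap.computeIfAbsent` is atomic, so the shared variant is also safe to call from several threads.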

      See pic2.svg: after tuning, trimUnusedFields takes only 0.64% of the sampling time, compared with 1.38% previously. I think this will help in many other places as well.

       

      Attachments

        1. jmh_result.txt
          7 kB
          neoremind
        2. pic1.svg
          1.59 MB
          neoremind
        3. pic2.svg
          1.60 MB
          neoremind
        4. ReflectVisitorDispatcherTest.java
          2 kB
          neoremind

            People

              Assignee: Unassigned
              Reporter: neoremind
                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 10h 20m