Details
-
Improvement
-
Status: Closed
-
Minor
-
Resolution: Not A Problem
-
1.22.0
-
None
Description
Out of curiosity, I used a flame graph to profile a simple query. The code snippet looks like this:
String sql = "select empno, gender, name from EMPS where name = 'John'";
Connection connection = null;
Statement statement = null;
try {
  Properties info = new Properties();
  info.put("model", jsonPath("smart"));
  connection = DriverManager.getConnection("jdbc:calcite:", info);
  String x = null;
  for (int i = 0; i < 50000; i++) {
    statement = connection.createStatement();
    final ResultSet resultSet = statement.executeQuery(sql);
    while (resultSet.next()) {
      x = resultSet.getInt(1) + resultSet.getString(2) + resultSet.getString(3);
    }
  }
} catch (SQLException e) {
  e.printStackTrace();
} finally {
  close(connection, statement);
}
I have attached the generated flame graph as pic1.svg.
About 3% of the time is spent on sql2rel, 9% on query optimization, 62% on code generation and implementation, 20% on result set iteration and checking, …
I hope this graph is informative. Since I started learning Calcite only recently, I cannot tell where to start tuning, but one small point in the graph caught my attention: there are many reflective invocations in Prepare#trimUnusedFields. So I spent some time trying to mitigate this small overhead.
I optimized ReflectiveVisitDispatcher by introducing a global, size-limited Guava cache for methods, and I added full unit tests for ReflectUtil.
I counted the references to ReflectUtil#createMethodDispatcher and ReflectUtil#createDispatcher (see below). There are 68 possible invocations in total, so the cache size stays bounded. By caching all of the methods for the lifetime of the process, we can eliminate the overhead of looking up methods via reflection.
org.apache.calcite.rel.rel2sql.RelToSqlConverter: 18 possible invocations.
org.apache.calcite.sql2rel.RelDecorrelator: 15 possible invocations.
org.apache.calcite.sql2rel.RelFieldTrimmer: 11 possible invocations.
org.apache.calcite.sql2rel.RelStructuredTypeFlattener.RewriteRelVisitor: 22 possible invocations.
org.apache.calcite.interpreter.Interpreter.CompilerImpl: 2 possible invocations.
Before the global cache was introduced, each ReflectiveVisitDispatcher instance had its own cache; now different ReflectiveVisitDispatcher instances in different threads can reuse the cached methods.
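To illustrate the idea of a process-wide method cache shared across dispatchers and threads, here is a minimal sketch. It is not the actual patch and not Calcite's ReflectUtil: the class GlobalMethodCache, its Key type, and the lookup and resolve methods are hypothetical names. To stay dependency-free it uses a java.util.concurrent.ConcurrentHashMap instead of the size-limited Guava cache described above, so unlike that cache it is unbounded.

```java
import java.lang.reflect.Method;
import java.util.Objects;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical illustration of a global visit-method cache; not Calcite code. */
final class GlobalMethodCache {
  /** Cache key: visitor class, visitee class, and visit method name. */
  private static final class Key {
    final Class<?> visitorClass;
    final Class<?> visiteeClass;
    final String methodName;

    Key(Class<?> visitorClass, Class<?> visiteeClass, String methodName) {
      this.visitorClass = visitorClass;
      this.visiteeClass = visiteeClass;
      this.methodName = methodName;
    }

    @Override public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key k = (Key) o;
      return visitorClass == k.visitorClass
          && visiteeClass == k.visiteeClass
          && methodName.equals(k.methodName);
    }

    @Override public int hashCode() {
      return Objects.hash(visitorClass, visiteeClass, methodName);
    }
  }

  // One static map for the whole process, so every dispatcher instance and
  // every thread shares the same resolved methods. Assumption: unbounded here,
  // whereas the change described above uses a size-limited Guava cache.
  private static final ConcurrentMap<Key, Optional<Method>> CACHE =
      new ConcurrentHashMap<>();

  private GlobalMethodCache() {}

  /** Returns the cached visit method, or null if none is applicable. */
  public static Method lookup(Class<?> visitorClass, Class<?> visiteeClass,
      String methodName) {
    Key key = new Key(visitorClass, visiteeClass, methodName);
    // Optional lets a failed lookup be cached too, so it is not repeated.
    return CACHE.computeIfAbsent(key, k -> Optional.ofNullable(resolve(k)))
        .orElse(null);
  }

  /** Reflective lookup: walks up the visitee's superclasses (simplified). */
  private static Method resolve(Key k) {
    for (Class<?> c = k.visiteeClass; c != null; c = c.getSuperclass()) {
      try {
        return k.visitorClass.getMethod(k.methodName, c);
      } catch (NoSuchMethodException e) {
        // No overload for this type; try the superclass next.
      }
    }
    return null;
  }
}
```

Because the map is static, the reflective search for a given (visitor, visitee, method name) triple runs once and later calls from any thread reuse the result. A real implementation would also need to consider interfaces and eviction, which this sketch leaves out.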
See pic2.svg: after tuning, trimUnusedFields takes only 0.64% of the sampling time, compared with 1.38% previously. I think this will help in many other places.
Attachments