Description
Given a table
create table test ( pk1 varchar not null , pk2 varchar not null, pk3 varchar not null, v1 varchar, v2 varchar, CONSTRAINT TEST_PK PRIMARY KEY ( pk1, pk2, pk3 ))
Consider following three cases:
1. OrderBy of ClientScanPlan
for sql:
select v1 from (select v1,v2,pk3 from test t where pk1 = '6' order by t.v2,t.pk3,t.v1 limit 10) a order by v2,pk3
Obviously, the outer OrderBy "order by v2,pk3" should be compiled out because it matchs the inner query OrderBy "order by t.v2,t.pk3,t.v1" , but unfortunately it is not compiled out.
2. GroupBy of ClientAggregatePlan
for sql :
select v1 from (select v1,pk2,pk1 from test t where pk1 = '6' order by t.pk2,t.v1,t.pk1 limit 10) a group by pk2,v1
Obviously, the outer GroupBy "group by pk2,v1" should be orderPreserving because it matchs the inner query OrderBy "order by t.pk2,t.v1,t.pk1" , but unfortunately the isOrderPreserving() of outer GroupBy return false.
3. OrderBy of SortMergeJoinPlan(from PHOENIX-4618)
for sql:
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b
The result of the sort-merge-join is sorted on (T1.a, T1.b) and (T2.a, T2.b) at the same time.
Thus, both 1)
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T1.a, T1.b
and 2)
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T2.a, T2.b
should avoid doing an extra order-by after the sort-merge-join operation.
All the above three cases are caused by the same problem that the OrderPreservingTracker relies solely on row keys for inferring alignment between the target Expression and the source sortedness.
For following QueryCompiler.compileSingleQuery, because the inner query of above cases has OrderBy, so in line 534, local variable isInRowKeyOrder is false:
518 protected QueryPlan compileSingleQuery(StatementContext context, SelectStatement select, List<Object> binds, boolean asSubquery, boolean allowPageFilter) throws SQLException{ 519 SelectStatement innerSelect = select.getInnerSelectStatement(); 520 if (innerSelect == null) { 521 return compileSingleFlatQuery(context, select, binds, asSubquery, allowPageFilter, null, null, true); 522 } 523 524 QueryPlan innerPlan = compileSubquery(innerSelect, false); 525 TupleProjector tupleProjector = new TupleProjector(innerPlan.getProjector()); 526 innerPlan = new TupleProjectionPlan(innerPlan, tupleProjector, null); 527 528 // Replace the original resolver and table with those having compiled type info. 529 TableRef tableRef = context.getResolver().getTables().get(0); 530 ColumnResolver resolver = FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(), tableRef, innerPlan.getProjector()); 531 context.setResolver(resolver); 532 tableRef = resolver.getTables().get(0); 533 context.setCurrentTable(tableRef); 534 boolean isInRowKeyOrder = innerPlan.getGroupBy() == GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY; 535 536 return compileSingleFlatQuery(context, select, binds, asSubquery, allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder); 537 }
Because the OrderPreservingTracker relies solely on row keys, so in following line 170 of OrderByCompiler.compile , when isInRowKeyOrder is false, the OrderPreservingTracker.isOrderPreserving() checking is skipped:
169 // If we're ordering by the order returned by the scan, we don't need an order by 170 if (isInRowKeyOrder && tracker.isOrderPreserving()) { 171 if (tracker.isReverse()) { 172 // Don't use reverse scan if: 173 // 1) we're using a skip scan, as our skip scan doesn't support this yet. 174 // 2) we have the FORWARD_SCAN hint set to choose to keep loading of column 175 // families on demand versus doing a reverse scan 176 // REV_ROW_KEY_ORDER_BY scan would not take effect for a projected table, so don't return it for such table types. 177 if (context.getConnection().getQueryServices().getProps().getBoolean(QueryServices.USE_REVERSE_SCAN_ATTRIB, QueryServicesOptions.DEFAULT_USE_REVERSE_SCAN) 178 && !context.getScanRanges().useSkipScanFilter() 179 && context.getCurrentTable().getTable().getType() != PTableType.PROJECTED 180 && context.getCurrentTable().getTable().getType() != PTableType.SUBQUERY 181 && !statement.getHint().hasHint(Hint.FORWARD_SCAN)) { 182 return OrderBy.REV_ROW_KEY_ORDER_BY; 183 } 184 } else { 185 return OrderBy.FWD_ROW_KEY_ORDER_BY; 186 } 187 }
What we can do to fix this problem are :
- refactor the OrderPreservingTracker to make it inferring alignment between the target Expression and the source sortedness based on OrderByExpression , not the row keys, so we can remove the variable isInRowKeyOrder.
- add a new method getActualOutputOrderBys to the QueryPlan to get the output OrderBys of the query result , eg. for the above third SortMergeJoinPlan example ,getActualOutputOrderBys returns (T1.a, T1.b) and (T2.a, T2.b) .
Besides above three cases, a implicit semantic optimization could be implemented also:
for following sql , even the literal OrderBy of inner query is "order by t.pk1,t.pk2" . By getActualOutputOrderBys of inner QueryPlan, we can infer that the actual OrderBy of inner query is "order by t.pk1,t.pk2,t.pk3" , so the outer OrderBy "order by pk1,pk2,pk3" could be compiled out:
select pk1 from (select pk3,pk2,pk1 from test t where v1 = '6' order by t.pk1,t.pk2 limit 10) a where pk3 > '8' order by pk1,pk2,pk3
Attachments
Attachments
Issue Links
- contains
-
PHOENIX-4618 Optimize out order-by on sort-merge-join when possible
- Closed
- is related to
-
PHOENIX-7352 Improve OrderPreservingTracker to support extracting partial ordering columns for TupleProjectionPlan
- Resolved
- links to