Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Won't Fix
Affects Version/s: 2.4.0
None
None
Description
Add SQL interface support for repartitionByRange to improve data pushdown. I have tested this feature on a large table (data size: 1.1 TB, row count: 282,001,954,428).
The test SQL is:
select * from table where id=401564838907
The test results:
Mode | Input Size | Records | Total Time | Duration | Prepare Data Resource Allocation (MB-seconds) |
default | 959.2 GB | 237624395522 | 11.2 h | 1.3 min | 6496280086 |
DISTRIBUTE BY | 970.8 GB | 244642791213 | 11.4 h | 1.3 min | 10536069846 |
SORT BY | 456.3 GB | 101587838784 | 5.4 h | 31 s | 8965158620 |
DISTRIBUTE BY + SORT BY | 219.0 GB | 51723521593 | 3.3 h | 54 s | 12552656774 |
RANGE PARTITION BY | 38.5 GB | 75355144 | 45 min | 13 s | 14525275297 |
RANGE PARTITION BY + SORT BY | 17.4 GB | 14334724 | 45 min | 12 s | 16255296698 |
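The trend in the table can be illustrated with a small pure-Python model. This is only a sketch under stated assumptions, not Spark code: it assumes the reader can skip a chunk of rows when the chunk's min/max statistics for id exclude the filter value (as columnar formats with row-group statistics allow), and it uses exact quantile boundaries where Spark's range partitioner samples the data. The partition count, chunk size, id range, and helper names are all hypothetical.

```python
import bisect
import random

NUM_PARTITIONS = 10   # hypothetical partition count
CHUNK_ROWS = 1_000    # hypothetical rows per chunk (stand-in for a row group)


def split_evenly(ids, n):
    """Keep arrival order and cut the rows into n contiguous partitions."""
    size = len(ids) // n
    return [ids[i * size:(i + 1) * size] for i in range(n)]


def hash_partition(ids, n):
    """Assign each row by id modulo n (a stand-in for hash partitioning)."""
    parts = [[] for _ in range(n)]
    for x in ids:
        parts[x % n].append(x)
    return parts


def range_partition(ids, n):
    """Assign each row to one of n disjoint id ranges, keeping arrival order."""
    ordered = sorted(ids)
    size = len(ordered) // n
    bounds = [ordered[k * size] for k in range(1, n)]  # exact quantiles
    parts = [[] for _ in range(n)]
    for x in ids:
        parts[bisect.bisect_right(bounds, x)].append(x)
    return parts


def sort_within(parts):
    """Sort rows inside each partition without moving them across partitions."""
    return [sorted(p) for p in parts]


def rows_scanned(parts, target):
    """Count rows in chunks whose [min, max] range cannot rule out target."""
    total = 0
    for part in parts:
        for i in range(0, len(part), CHUNK_ROWS):
            chunk = part[i:i + CHUNK_ROWS]
            if min(chunk) <= target <= max(chunk):
                total += len(chunk)
    return total


rng = random.Random(0)
ids = rng.sample(range(10_000_000), 100_000)   # unique ids in random order
target = sorted(ids)[55_000]                   # an id that exists in the data

layouts = {
    "default": split_evenly(ids, NUM_PARTITIONS),
    "DISTRIBUTE BY": hash_partition(ids, NUM_PARTITIONS),
    "SORT BY": sort_within(split_evenly(ids, NUM_PARTITIONS)),
    "DISTRIBUTE BY + SORT BY": sort_within(hash_partition(ids, NUM_PARTITIONS)),
    "RANGE PARTITION BY": range_partition(ids, NUM_PARTITIONS),
    "RANGE PARTITION BY + SORT BY": sort_within(range_partition(ids, NUM_PARTITIONS)),
}

scanned = {name: rows_scanned(parts, target) for name, parts in layouts.items()}
for name, rows in scanned.items():
    print(f"{name:30s} {rows:>8d} rows read")
```

In this model, range partitioning confines the matching id to a single partition, and sorting within that partition narrows it further to a single chunk, which matches the direction of the Records column above. The model does not attempt to reproduce the measured sizes or timings.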