Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 0.8.0
- Component/s: Query Processor
- Labels: None
- Hadoop Flags: Reviewed
- Release Note: This patch adds support for the 'TABLESAMPLE(x PERCENT)' clause.
Description
We need better input sampling to serve at least two purposes:
1. Let users test their queries against a smaller data set.
2. Let users understand what the data look like without scanning the whole table.
A simple function that returns a subset of the input splits would help in both cases. It doesn't have to be strict sampling.
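To make the idea concrete, here is a minimal sketch of split-level sampling. It is not the code from the patch: the `Split` tuple, the `sample_splits` name, and the "take splits in order until the byte target is reached" policy are all assumptions chosen for illustration. It only shows how a non-strict sample can be taken by selecting whole splits rather than individual rows.

```python
from typing import List, Tuple

# Hypothetical representation of an input split: (name, size in bytes).
Split = Tuple[str, int]


def sample_splits(splits: List[Split], percent: float) -> List[Split]:
    """Return leading splits whose combined size reaches about `percent` of the input.

    Whole splits are kept, so the result may cover more than the requested
    percentage. This is approximate sampling, not strict row-level sampling.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")

    total_bytes = sum(size for _, size in splits)
    target_bytes = total_bytes * percent / 100.0

    chosen: List[Split] = []
    taken_bytes = 0
    for split in splits:
        # Stop as soon as the selected splits cover the requested share.
        if taken_bytes >= target_bytes:
            break
        chosen.append(split)
        taken_bytes += split[1]
    return chosen
```

Because the unit of selection is a split, a small percentage on a table with few large splits still reads at least one full split, which is consistent with the note that the sampling does not have to be strict.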