Details
Type: Improvement
Status: Open
Priority: Not a Priority
Resolution: Unresolved
Description
Currently, Hive table input splits can only be created serially.
In my tests, HiveTableInputFormat::createInputSplits took 40s~60s for a table with 1500 partitions. IMO, we could support creating splits in parallel, which would save most of that time.
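
A rough sketch of the idea, not the actual Flink code: submit one task per partition to a fixed thread pool and collect the results in submission order. The class name, the `createSplitsInParallel` helper, the `threadNum` parameter and the use of plain strings for partitions and splits are all hypothetical, chosen only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelSplitSketch {

    // Hypothetical helper: computes the splits of each partition on a fixed
    // thread pool instead of looping over the partitions serially.
    static <P, S> List<S> createSplitsInParallel(
            List<P> partitions,
            Function<P, List<S>> splitsForPartition,
            int threadNum) throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(threadNum);
        try {
            // One task per partition; futures are kept in submission order.
            List<Future<List<S>>> futures = new ArrayList<>(partitions.size());
            for (P partition : partitions) {
                Callable<List<S>> task = () -> splitsForPartition.apply(partition);
                futures.add(executor.submit(task));
            }
            // Collecting in submission order keeps the result order the same
            // as the serial version would produce.
            List<S> splits = new ArrayList<>();
            for (Future<List<S>> future : futures) {
                splits.addAll(future.get());
            }
            return splits;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in data: partition names and a fake per-partition split function.
        List<String> partitions =
                Arrays.asList("dt=2020-01-01", "dt=2020-01-02", "dt=2020-01-03");
        List<String> splits = createSplitsInParallel(
                partitions,
                p -> Arrays.asList(p + "/split-0", p + "/split-1"),
                2);
        System.out.println(splits);
    }
}
```

In a real implementation the per-partition function would be whatever work createInputSplits does for one partition today, and the thread count would presumably be configurable. How much time this saves depends on how much of the 40s~60s is spent waiting on per-partition calls rather than on CPU.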