Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Reviewed
Description
DataDrivenDBInputFormat runs a query to establish the bounding values for the splits it generates. If it is going to generate only one split (mapreduce.job.maps == 1), there is no reason to run that query, because a single split covers the whole table regardless of the bounds. Skipping the query removes the overhead it adds to a single-threaded import of a non-indexed table, where computing the bounds requires a full table scan.
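The short-circuit described above could look roughly like the following. This is a minimal, standalone sketch and not the actual patch: SplitPlanner, Split, Bounds, and planSplits are hypothetical names that stand in for the Hadoop classes, the bounding query is modeled as a Supplier, and the always-true "1=1" clause for the single-split case is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the split-generation logic; not Hadoop code.
final class SplitPlanner {

    /** One split, described by lower and upper WHERE-clause fragments. */
    static final class Split {
        final String lowerClause;
        final String upperClause;

        Split(String lowerClause, String upperClause) {
            this.lowerClause = lowerClause;
            this.upperClause = upperClause;
        }
    }

    /** Result of the bounding query, e.g. MIN and MAX of the split column. */
    static final class Bounds {
        final long min;
        final long max;

        Bounds(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    /**
     * Plans splits over an integer column. The (potentially expensive)
     * bounds query is only invoked when more than one split is requested.
     */
    static List<Split> planSplits(int numMaps, String column,
                                  Supplier<Bounds> boundsQuery) {
        List<Split> splits = new ArrayList<>();
        if (numMaps <= 1) {
            // Single split: an always-true clause selects the whole table,
            // so the bounds are never needed and the query is skipped.
            splits.add(new Split("1=1", "1=1"));
            return splits;
        }

        // Multiple splits: the bounds are needed to divide the key range.
        Bounds b = boundsQuery.get();
        long span = b.max - b.min + 1;
        // Ceiling division so that at most numMaps ranges are produced.
        long size = Math.max(1, (span + numMaps - 1) / numMaps);
        for (long lo = b.min; lo <= b.max; lo += size) {
            long hi = Math.min(lo + size - 1, b.max);
            splits.add(new Split(column + " >= " + lo, column + " <= " + hi));
        }
        return splits;
    }
}
```

Passing the bounds query in as a Supplier makes the saving explicit: when numMaps is 1 the supplier is never called, so no scan of the table takes place before the import starts.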
Attachments
Issue Links
- depends upon: MAPREDUCE-1460 Oracle support in DataDrivenDBInputFormat (Closed)
- is depended upon by: MAPREDUCE-1502 Sqoop should run mysqldump in a mapper as opposed to a user-side process (Resolved)