Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
None
Description
Currently, at the end of a job, the FileSink operator moves (renames) the temp directory to another directory, from which FetchTask fetches the result. This is done to avoid fetching potentially partial or invalid files written by failed or runaway tasks. The rename is expensive on cloud storage. It could be avoided if FetchTask were passed the set of files to read instead of the whole directory.
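The idea can be illustrated with a minimal, self-contained sketch. It is not Hive's actual implementation: the class `FetchFileSetSketch`, the helper `fetchRows`, and the file names such as `000001_0` are hypothetical and chosen only for illustration. The sketch assumes the job already knows which files were written by successful task attempts, and shows a reader that consumes exactly that set from the temp directory, so leftovers from failed or runaway attempts are ignored without any rename.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class FetchFileSetSketch {

    // Hypothetical helper: read only the files named in committedFiles,
    // ignoring anything else that happens to sit in tempDir.
    static List<String> fetchRows(Path tempDir, Set<String> committedFiles) throws IOException {
        List<String> rows = new ArrayList<>();
        // TreeSet gives a deterministic read order for the example.
        for (String name : new TreeSet<>(committedFiles)) {
            rows.addAll(Files.readAllLines(tempDir.resolve(name)));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Path tempDir = Files.createTempDirectory("hive-sketch");

        // Output of a successful task attempt.
        Files.write(tempDir.resolve("000000_0"), List.of("row1", "row2"));
        // Partial output left behind by a failed attempt of another task.
        Files.write(tempDir.resolve("000001_0"), List.of("partial"));
        // Output of the retry that succeeded.
        Files.write(tempDir.resolve("000001_1"), List.of("row3"));

        // The fetch side receives the explicit file set, not the directory.
        List<String> rows = fetchRows(tempDir, Set.of("000000_0", "000001_1"));
        System.out.println(rows); // prints [row1, row2, row3]
    }
}
```

The design choice here is that correctness comes from the file list rather than from the directory's contents, which is why the final move of the temp directory is no longer needed in this sketch.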
Attachments
Issue Links
- causes: HIVE-28530 Set files in thread safe manner in HiveSequenceFileInputFormat (Resolved)
- Dependent: HIVE-21386 Extend the fetch task enhancement done in HIVE-21279 to make it work with query result cache (Closed)
- links to