Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- 2.3.0
- None
- None
Description
After HADOOP-9652, listStatus() and globStatus() calls against a local file system directory are very slow. A user was loading data from the local file system into Hive and it took about 30 seconds. The same operation took less than a second before HADOOP-9652.
The input path contained many other files besides the input files, and strace showed a fork and exec of stat for each and every one of them. jstack confirmed that this was being done from getNativeFileLinkStatus().
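For illustration only, here is a minimal sketch of listing a directory without spawning a process per entry. It is not the patch for this issue: it assumes java.nio.file (JDK 7 or later), which may not have been an option for Hadoop while JDK6 was still supported, and the class and method names (LocalListingSketch, listInProcess, Entry) are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: collect per-entry metadata in-process instead of
// forking an external "stat" command for every file in the directory.
class LocalListingSketch {

    // Minimal stand-in for a file status record (not Hadoop's FileStatus).
    static final class Entry {
        final String name;
        final long size;
        final boolean symlink;

        Entry(String name, long size, boolean symlink) {
            this.name = name;
            this.size = size;
            this.symlink = symlink;
        }
    }

    // Reads attributes through the JVM; no child process is created.
    static List<Entry> listInProcess(Path dir) throws IOException {
        List<Entry> out = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                // NOFOLLOW_LINKS reports on the link itself, not its target.
                BasicFileAttributes attrs = Files.readAttributes(
                        p, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
                out.add(new Entry(p.getFileName().toString(),
                        attrs.size(), attrs.isSymbolicLink()));
            }
        }
        return out;
    }
}
```

With this approach the cost per entry is an in-process attribute read rather than a fork and exec, which is the overhead the strace output above points at.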
Attachments
Issue Links
- is broken by
  - HADOOP-9652 Allow RawLocalFs#getFileLinkStatus to fill in the link owner and mode if requested (Closed)
- is duplicated by
  - HADOOP-9877 Fix listing of snapshot directories in globStatus (Closed)
- is related to
  - HADOOP-10112 har file listing doesn't work with wild card (Closed)
  - HADOOP-9769 Remove org.apache.hadoop.fs.Stat when JDK6 support is dropped (Patch Available)