Details
-
New Feature
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
Incompatible change, Reviewed
-
Description
HDFS should support some type of statistics that allows an administrator to determine when a file was last accessed.
Since HDFS does not have quotas yet, users are likely to keep accumulating files in their home directories with little regard for the space they occupy. Because the namenode holds the metadata for every file in memory, this causes memory-related problems with the namenode.
Access times are costly to maintain. AFS does not maintain access times. I think DCE-DFS does maintain access times with a coarse granularity.
One proposal for HDFS would be to implement something like an "access bit".
1. This access-bit is set when a file is accessed. If the access bit is already set, then this call does not result in a transaction.
2. A FileSystem.clearAccessBits() indicates that the access bits of all files need to be cleared.
An administrator can use this mechanism (for example, from a daily cron job) to determine which files were recently used.
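The access-bit proposal can be sketched roughly as follows. This is only an illustrative in-memory model, not HDFS code: the class `AccessBitTable` and its methods `addFile`, `markAccessed`, `recentlyAccessed`, and `accessTransactions` are hypothetical names invented for this sketch, and only `clearAccessBits` echoes the name used in the proposal. It assumes one bit per file path and counts a "transaction" only when a bit flips from clear to set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of the proposed access bit; not part of HDFS. */
final class AccessBitTable {
    // One access bit per file path (assumption: paths identify files).
    private final Map<String, Boolean> accessBits = new TreeMap<>();
    // Number of accesses that actually had to be recorded.
    private long accessTransactions = 0;

    /** Registers a file with its access bit cleared. */
    void addFile(String path) {
        accessBits.put(path, Boolean.FALSE);
    }

    /**
     * Step 1 of the proposal: set the bit on access.
     * Returns true only if the bit was newly set, i.e. a transaction was needed.
     */
    boolean markAccessed(String path) {
        Boolean bit = accessBits.get(path);
        if (bit == null) {
            throw new IllegalArgumentException("unknown file: " + path);
        }
        if (bit) {
            return false; // bit already set: no transaction
        }
        accessBits.put(path, Boolean.TRUE);
        accessTransactions++;
        return true;
    }

    /** Step 2 of the proposal: clear the access bits of all files. */
    void clearAccessBits() {
        accessBits.replaceAll((path, bit) -> Boolean.FALSE);
    }

    /** Files accessed since the last clear, in path order. */
    List<String> recentlyAccessed() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : accessBits.entrySet()) {
            if (e.getValue()) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    long accessTransactions() {
        return accessTransactions;
    }
}
```

Under these assumptions, a periodic job would call `recentlyAccessed()` to collect the files used since the previous run and then `clearAccessBits()` to start a new window. Repeated reads of the same file within one window cost at most one transaction.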
Attachments
Issue Links
- blocks
-
HDFS-430 create posix-like (as far as we can) layer for Linux on top of libhdfs
- Resolved
- is depended upon by
-
HADOOP-4077 Access permissions for setting access times and modification times for files
- Closed
-
HDFS-220 Transparent archival and restore of files from HDFS
- Resolved
- is related to
-
HADOOP-4099 HFTP interface compatibility with older releases broken
- Closed
-
HADOOP-4986 FSNamesystem.getBlockLocations sets access time without holding the namespace locks
- Closed
-
HADOOP-3336 Direct a subset of namenode RPC events for audit logging
- Closed
- relates to
-
HDFS-2712 setTimes should support only for files and move the atime field down to iNodeFile.
- Patch Available
-
HDFS-258 Confirm that all block history events are available in logs
- Open