Description
For some applications, it is convenient to specify a path to cache and have HDFS automatically cache new data added under that path, without the client sending a new caching request or issuing a manual refresh command.
One example is data appended to a cached file. It would be nice to re-cache the affected block at its new length after the append, and to cache any new blocks added to the file.
Another example is a cached Hive partition directory, where a user can drop new files directly into the partition. It would be nice if these new files were cached.
In both cases, the automatic caching would happen after the file is closed, i.e., once the block replica is finalized.
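To make the two cases concrete, here is a minimal toy model in Python. It is a sketch of the requested behavior, not HDFS code: `ToyFile`, `ToyCacheScanner`, `rescan`, and the example paths are all hypothetical names invented for illustration, and block lengths are arbitrary units. It assumes a periodic rescan of the cached path that picks up new blocks and blocks whose length changed, and that skips files that are still open.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFile:
    """Hypothetical stand-in for an HDFS file: block lengths plus a closed flag."""
    block_lengths: list
    closed: bool = True


@dataclass
class ToyCacheScanner:
    """Hypothetical model of a rescan over one cached path."""
    cached_path: str
    # Maps (file path, block index) -> block length currently cached.
    cached: dict = field(default_factory=dict)

    def _covers(self, path):
        # The cached path covers itself (a file) or anything beneath it (a directory).
        root = self.cached_path.rstrip("/")
        return path == root or path.startswith(root + "/")

    def rescan(self, namespace):
        """Cache new or resized blocks of closed files; return the keys that changed."""
        changed = []
        for path, f in sorted(namespace.items()):
            if not self._covers(path) or not f.closed:
                continue  # outside the cached path, or still being written
            for index, length in enumerate(f.block_lengths):
                key = (path, index)
                if self.cached.get(key) != length:
                    # New block, or an existing block re-cached at its new length.
                    self.cached[key] = length
                    changed.append(key)
        return changed


# Case 1: a cached partition directory with one closed file of two blocks.
namespace = {"/warehouse/t/part=1/a": ToyFile([128, 40])}
scanner = ToyCacheScanner("/warehouse/t/part=1")
first = scanner.rescan(namespace)

# Case 2: an append grows block 1 and adds block 2; file "b" is still open.
namespace["/warehouse/t/part=1/a"].block_lengths = [128, 128, 10]
namespace["/warehouse/t/part=1/b"] = ToyFile([64], closed=False)
second = scanner.rescan(namespace)

# Once "b" is closed, the next rescan picks it up without a new request.
namespace["/warehouse/t/part=1/b"].closed = True
third = scanner.rescan(namespace)
```

In this sketch the second rescan touches only the block that grew and the block that was added, and the open file is ignored until it is closed, which mirrors the "after the file is closed" condition above.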
Attachments
Issue Links
- breaks
  - HDFS-5388 Loading fsimage fails to find cache pools during namenode startup. (Resolved)
- is duplicated by
  - HDFS-5313 NameNode hangs during startup trying to apply OP_ADD_PATH_BASED_CACHE_DIRECTIVE. (Resolved)
- is related to
  - HDFS-5348 Fix error message when dfs.datanode.max.locked.memory is improperly configured (Resolved)
  - HDFS-5349 DNA_CACHE and DNA_UNCACHE should be by blockId only (Resolved)
  - HDFS-5358 Add replication field to PathBasedCacheDirective (Resolved)
  - HDFS-5359 Allow LightWeightGSet#Iterator to remove elements (Resolved)
  - HDFS-5366 recaching improvements (Closed)