Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
We use files and directories prefixed with '_' to store logs, metadata and other info that might be useful to the owner of a job within the output directory. The standard input methods then ignore such files by default.
HADOOP-2391 led to some discussion of the '_' convention in output directories. Not all developers' input formats support it. We should review the convention and document it well so that future input methods support it, or else come up with an alternative approach.
My hope is that after some discussion we will close this bug by creating a documentation patch explaining the convention.
It sounds like the convention is implemented via some input filter classes. We should discuss whether this generic solution is helping or obscuring the intent of the convention. Perhaps we should just have a non-configurable filter, so that '_'-prefixed files are treated the way most Unix tools treat '.'-prefixed files: skipped unless explicitly requested.
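To make the proposal concrete, here is a minimal sketch of what a fixed, non-configurable filter could look like. It is not Hadoop's actual filter code: the class name HiddenFileFilter and its methods are hypothetical, and it works on plain path strings so that it runs without any Hadoop dependency. It assumes the rule is applied only to the final path component and that '.'-prefixed names are hidden as well.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not a Hadoop class.
final class HiddenFileFilter {
    private HiddenFileFilter() {}

    /** Returns true if the final path component should be read as job input. */
    static boolean accept(String path) {
        // Strip trailing slashes so "out/_logs/" is treated like "out/_logs".
        int end = path.length();
        while (end > 0 && path.charAt(end - 1) == '/') {
            end--;
        }
        String trimmed = path.substring(0, end);
        // Only the last component is checked (an assumption of this sketch).
        String name = trimmed.substring(trimmed.lastIndexOf('/') + 1);
        return !name.isEmpty() && !name.startsWith("_") && !name.startsWith(".");
    }

    /** Keeps only the paths that the filter accepts, preserving order. */
    static List<String> visibleInputs(List<String> paths) {
        return paths.stream()
                .filter(HiddenFileFilter::accept)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
                "out/part-00000", "out/_logs", "out/.part-00000.crc", "out/part-00001");
        // Prints [out/part-00000, out/part-00001]
        System.out.println(visibleInputs(listing));
    }
}
```

Because the rule is hard-coded rather than configurable, every input method that lists files through such a helper would skip job logs and metadata in the same way, which is the behaviour the documentation patch would need to describe.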
Issue Links
- is related to HADOOP-2391 Speculative Execution race condition with output paths (Closed)