Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-2715

Review and document '_' prefix convention in input directories

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • documentation

    Description

      We use files and directories prefixed with '_' to store logs, metadata and other info that might be useful to the owner of a job within the output directory. The standard input methods then ignore such files by default.

      HADOOP-2391 lead to some discussion of the '_' convention in output directories. No all developers input formats are supporting this. We should review the convention and document it well so that future input methods support it. Or we should come up with an alternate approach.

      My hope is that after some discuss we will close this bug by creating a documentation patch explaining the convention.

      It sounds like the convention is implemented via some input filter classes. We should discuss if this generic solution is helping or obscuring the intent of the convention. Perhaps we should just have a non-configurable filter, so '_' prefixed files are treated like '.' prefixed files by most unix tools.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              eric14 Eric Baldeschwieler
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: