could you lay out your alternative proposal for a conf option? I could rename the conf and make it so it takes a delimited set of kv pairs, e.g. ".snapshot=.user-snapshot,.some-new-reserved=.renamed-new-reserved", but I felt that was kind of ugly. I wanted full sub rather than prefix here for flexibility.
This is not what I have in mind. /<path>/<reserved_file> could be renamed to /<path>/<reserved_file>+<configured_rename_suffix>. The user finds all the renamed files (from the log) and renames them once the system comes up, if necessary. In fact, come to think of it, the suffix should probably not be configurable either. The system can choose a convention where the rename suffix is ".<layout_version>.reserved_renamed_after_upgrade". The user must run -upgrade with a new option that allows renaming of reserved files.
The advantages of this are:
- At any time after the upgrade (until the user renames them), all the renamed files can be found
- The probability of conflict of file names during upgrade is lower
- Unnecessary configuration changes are avoided
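To make the suffix convention concrete, here is a minimal sketch of what I mean. It is not existing HDFS code: the class name, method names, and the layout version value used in the usage note are all hypothetical. It renames every reserved path component, not only the last one, which is also an assumption on my part. The lookup runs over a plain list of paths for illustration, whereas in practice the user would take the renamed paths from the log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the fixed-suffix rename; not the HDFS implementation. */
final class ReservedRenameSketch {
    // Fixed, non-configurable marker from the convention proposed above.
    static final String MARKER = "reserved_renamed_after_upgrade";

    /** Builds ".<layout_version>.reserved_renamed_after_upgrade". */
    static String suffix(int layoutVersion) {
        return "." + layoutVersion + "." + MARKER;
    }

    /** Appends the suffix to every path component that is a reserved name. */
    static String renameReserved(String path, Set<String> reserved, int layoutVersion) {
        // Limit -1 keeps leading and trailing empty components, so the
        // slashes survive the round trip through split and join.
        String[] parts = path.split("/", -1);
        for (int i = 0; i < parts.length; i++) {
            if (reserved.contains(parts[i])) {
                parts[i] = parts[i] + suffix(layoutVersion);
            }
        }
        return String.join("/", parts);
    }

    /** Returns the paths that still have a component ending in the suffix. */
    static List<String> findRenamed(List<String> paths, int layoutVersion) {
        String s = suffix(layoutVersion);
        List<String> hits = new ArrayList<>();
        for (String p : paths) {
            for (String part : p.split("/")) {
                if (part.endsWith(s)) {
                    hits.add(p);
                    break;
                }
            }
        }
        return hits;
    }
}
```

With an illustrative layout version of -48, renameReserved("/a/.snapshot/b", Set.of(".snapshot"), -48) gives /a/.snapshot.-48.reserved_renamed_after_upgrade/b. Because the layout version is part of the suffix, a clash with an existing user file is unlikely, and no configuration key is needed.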
With just a prefix mutation, I could easily imagine having to run some operation after the NN had started up to then find all of the renamed paths and again rename them to some other name with the prefix removed. That's a pain that we shouldn't put our users through.
I do not understand what the pain is. It is just renaming the files. Also, what if a user wants to rename .snapshot in one directory to x and .snapshot in another directory to y, based on the context of how the file is being used?
Empirically we rarely add reserved names (only two in the lifetime of HDFS so far) and I don't anticipate adding many more
If we had looked at this same question last year, we would have said we would never have reserved names in HDFS at all. There are some that I plan to add. I want to add some directories in the future for storing file-system-specific information. This is how I envision moving the fsimage, and possibly finalized editlog segments, into HDFS itself.
... but rather to do as Andrew suggests and add a conf option per reserved name, and add the code necessary for each one as it comes up....
I do not think adding one configuration option for each reserved name is right. In fact, if possible, we should avoid configuration changes altogether for renames that are required only once, during the upgrade to a version that adds reserved file names.