Details
- Type: New Feature
- Status: Resolved
- Priority: Minor
- Resolution: Duplicate
- 0.24.0
- None
- None
Description
Getting mapping scripts right is surprisingly hard, and if a script is wrong in production, bad things happen. It would be good to have something simpler for beginners, and one that is trivial for a machine to generate from infrastructure data.
I propose adding an alternative mapper, one driven by a Java properties file:
- the specific topology mapper must be identified for loading
- it uses another key to identify the properties file to load. This is checked for on startup; if missing, fail.
- one property, perhaps "default-rack", identifies the default rack mapping for any host not in the list
- every other entry lists a hostname to rack mapping
- hostname mapping is done on the first component of the FQDN, to be less brittle to domain resolution.
Example
default-rack=/rack1
host1=/rack1
host2=/rack1
host3=/rack2
host4=/rack2
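Since the description says such a file should be trivial for a machine to generate from infrastructure data, here is a minimal sketch of a generator. The class name `RackFileGenerator` and the in-memory inventory map are hypothetical, and the sketch assumes hostnames and rack paths contain no characters that need escaping in the properties format.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: renders a host-to-rack inventory as properties text. */
class RackFileGenerator {

    /** Emits the default rack first, then one "host=rack" line per host, sorted by hostname. */
    static String render(String defaultRack, Map<String, String> hostToRack) {
        StringBuilder out = new StringBuilder();
        out.append("default-rack=").append(defaultRack).append('\n');
        // TreeMap gives a stable, sorted order so regenerated files diff cleanly.
        for (Map.Entry<String, String> e : new TreeMap<>(hostToRack).entrySet()) {
            out.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed inventory; in practice this would come from infrastructure data.
        Map<String, String> inventory = new TreeMap<>();
        inventory.put("host1", "/rack1");
        inventory.put("host2", "/rack1");
        inventory.put("host3", "/rack2");
        inventory.put("host4", "/rack2");
        System.out.print(render("/rack1", inventory));
    }
}
```

Run with the inventory above, this should reproduce the example file shown earlier.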
Implementation
- add a new mapper that builds a concurrent hash map
- read in every entry in the specified property file, add it to the map
- when queried, extract the hostname (i.e. everything before the first ".")
- match that in the hash table, return if found
- if not found: return the default rack
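The implementation steps above can be sketched roughly as follows. This is a standalone illustration, not Hadoop code: the class name `PropertyFileRackMapper` is hypothetical, the `resolve` method only mirrors the list-in, list-out shape of Hadoop's `DNSToSwitchMapping.resolve`, and failing when "default-rack" is absent is an assumption the proposal does not spell out.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a properties-driven rack mapper. */
class PropertyFileRackMapper {
    static final String DEFAULT_RACK_KEY = "default-rack";

    private final Map<String, String> hostToRack = new ConcurrentHashMap<>();
    private final String defaultRack;

    PropertyFileRackMapper(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        defaultRack = props.getProperty(DEFAULT_RACK_KEY);
        if (defaultRack == null) {
            // Assumption: a missing default rack is treated as a startup failure.
            throw new IOException("missing property: " + DEFAULT_RACK_KEY);
        }
        // Every other entry is a hostname-to-rack mapping.
        for (String key : props.stringPropertyNames()) {
            if (!DEFAULT_RACK_KEY.equals(key)) {
                hostToRack.put(key, props.getProperty(key));
            }
        }
    }

    /** Everything before the first "."; a dotted IP address would be truncated the same way. */
    static String shortName(String host) {
        int dot = host.indexOf('.');
        return dot < 0 ? host : host.substring(0, dot);
    }

    /** Returns one rack per input name, falling back to the default rack. */
    List<String> resolve(List<String> names) {
        List<String> racks = new ArrayList<>(names.size());
        for (String name : names) {
            racks.add(hostToRack.getOrDefault(shortName(name), defaultRack));
        }
        return racks;
    }

    public static void main(String[] args) throws IOException {
        String file = "default-rack=/rack1\nhost1=/rack1\nhost3=/rack2\n";
        PropertyFileRackMapper mapper = new PropertyFileRackMapper(new StringReader(file));
        // prints [/rack1, /rack2, /rack1]
        System.out.println(mapper.resolve(
            Arrays.asList("host1.example.com", "host3", "unknown.example.org")));
    }
}
```

The sketch reads from a `Reader` so it can be exercised without touching the filesystem; a real mapper would open the file named by the configuration key described earlier.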
Feature creep would be to poll this file for changes at a specified frequency and pick up the changes when they occur. This would require removing the caching topology mapper that wraps all others in the NameNode (NN) and ResourceManager (RM).
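For illustration only, the polling idea might look something like the sketch below. The class `ReloadingRackTable` and its methods are hypothetical. It detects changes by comparing the file's modification time, which is one possible approach rather than anything the proposal specifies, and it ignores the caching wrapper mentioned above.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a rack table that re-reads its file when the file changes. */
class ReloadingRackTable {
    private final Path file;
    private volatile Map<String, String> table = Collections.emptyMap();
    private volatile FileTime lastLoaded;

    ReloadingRackTable(Path file) throws IOException {
        this.file = file;
        reloadIfChanged(); // initial load; fails fast if the file is unreadable
    }

    /** Reloads only when the modification time differs; returns true if a reload happened. */
    synchronized boolean reloadIfChanged() throws IOException {
        FileTime modified = Files.getLastModifiedTime(file);
        if (modified.equals(lastLoaded)) {
            return false;
        }
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(in);
        }
        Map<String, String> fresh = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            fresh.put(key, props.getProperty(key));
        }
        table = fresh; // swap in the whole snapshot so readers never see a half-built map
        lastLoaded = modified;
        return true;
    }

    /** Looks up an already-shortened hostname, falling back to the "default-rack" entry. */
    String rackFor(String shortHost) {
        Map<String, String> snapshot = table;
        return snapshot.getOrDefault(shortHost, snapshot.get("default-rack"));
    }

    /** Starts a daemon thread that checks the file at the given period. */
    ScheduledExecutorService startPolling(long periodSeconds) {
        ScheduledExecutorService poller = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "rack-table-poller");
            t.setDaemon(true);
            return t;
        });
        poller.scheduleWithFixedDelay(() -> {
            try {
                reloadIfChanged();
            } catch (IOException e) {
                // Keep serving the previous table if the file is temporarily unreadable.
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return poller;
    }
}
```

Swapping a complete snapshot keeps lookups lock-free. Any cache layered in front of such a table would still return stale racks, which is the concern raised above about the caching mapper.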
Issue Links
- depends upon: HDFS-2492 BlockManager cross-rack replication checks only work for ScriptBasedMapping (Open)
- duplicates: HADOOP-7030 Add TableMapping topology implementation to read host to rack mapping from a file (Closed)
- relates to: HADOOP-7030 Add TableMapping topology implementation to read host to rack mapping from a file (Closed)