It looks like the two main candidates for sharing between YARN and HDFS Router-based federation are:
- Generic State Store: Right now, in HDFS-10467 we use a generic State Store that stores entities. To store a new piece of information, we just need to add a new serialization/deserialization method. I'm pretty sure we could abstract this even further, and it would then be fairly easy to use in YARN. Actually, I think this could even be extended to components like the RMStateStore.
- Membership maintenance: Our proposal in HDFS-10467 is to follow a pull model to avoid changes in the Namenode. However, this brings additional problems such as fault tolerance: a Router can die while the Namenode it monitors is still running. To mitigate this, we made Routers able to monitor multiple Namenodes. Then, when the Routers read the membership information from the State Store, they use a quorum to unify the views. I'm not sure whether this approach makes sense for YARN federation, but it's worth considering. The other challenge would be making this service abstract enough.
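To make the quorum step concrete, here is a rough sketch of how a Router could unify the views that several Routers reported for one Namenode. This is not the code in the HDFS-10467 patch: the class name `QuorumView`, the method `resolve`, the string states and the strict-majority rule are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; names and the majority rule are assumptions,
// not taken from the HDFS-10467 patch.
final class QuorumView {

    /**
     * Unifies the states that different Routers reported for ONE Namenode.
     *
     * @param stateByRouter Router id mapped to the state it reported
     *                      (for example "ACTIVE" or "STANDBY")
     * @return the state reported by a strict majority of the Routers,
     *         or empty if there are no reports or no state has a majority
     */
    static Optional<String> resolve(Map<String, String> stateByRouter) {
        // Count one vote per Router for the state it reported.
        Map<String, Integer> votes = new HashMap<>();
        for (String state : stateByRouter.values()) {
            votes.merge(state, 1, Integer::sum);
        }
        // Strict majority: more than half of the reporting Routers.
        final int needed = stateByRouter.size() / 2 + 1;
        return votes.entrySet().stream()
                .filter(e -> e.getValue() >= needed)
                .map(Map.Entry::getKey)
                .findFirst();
    }

    public static void main(String[] args) {
        // Two of three Routers agree, so the unified view is ACTIVE.
        Map<String, String> reports = Map.of(
                "router1", "ACTIVE",
                "router2", "ACTIVE",
                "router3", "STANDBY");
        System.out.println(resolve(reports)); // prints Optional[ACTIVE]
    }
}
```

Keying the reports by Router id means a Router that reports twice still gets a single vote. What to do when no state reaches a majority (keep the previous view, retry, mark the Namenode as unavailable) is left open here, and is probably one of the things that would need to be configurable if this service were shared with YARN.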
Subru Krishnan, Carlo Curino, Giovanni Matteo Fumarola, any thoughts on this?