At the moment we use ZooKeeper as the distributed coordinator for the JobManager high availability services. In cloud-native environments, however, more and more users prefer Kubernetes as the underlying scheduler backend and object storage as the storage medium, and neither of these requires a ZooKeeper deployment.
As a result, in K8s setups users have to deploy and maintain a ZooKeeper cluster solely to avoid the JobManager being a single point of failure (SPOF). This ticket proposes a simplified FileSystem HA implementation with leader election removed, which saves the effort of deploying ZooKeeper.
To achieve this, we plan to:
- Introduce a FileSystemHaServices that implements the HighAvailabilityServices interface.
- Replace the Deployment with a StatefulSet to ensure at-most-one semantics, i.e. at most one JobManager instance runs at any time, preventing concurrent access to the underlying FileSystem.
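
To illustrate the idea of HA without leader election, here is a minimal sketch of a file-backed store that a restarted JobManager could recover from. The class name FileSystemHaSketch and its put/get methods are hypothetical and are not part of Flink's HighAvailabilityServices interface; a local directory stands in for the real storage, and the sketch assumes the single-writer guarantee that the StatefulSet is meant to provide.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

/**
 * Hypothetical sketch only: a key/value store on a file system with no
 * leader election. Correctness relies on the assumption that at most one
 * JobManager writes to haDir at a time.
 */
class FileSystemHaSketch {
    private final Path haDir;

    FileSystemHaSketch(Path haDir) throws IOException {
        // Create the HA directory if it does not exist yet.
        this.haDir = Files.createDirectories(haDir);
    }

    /** Persists a value by writing a temp file and renaming it into place. */
    void put(String key, String value) throws IOException {
        Path tmp = Files.createTempFile(haDir, key, ".tmp");
        Files.write(tmp, value.getBytes(StandardCharsets.UTF_8));
        // The rename makes the new value visible in one step, so a reader
        // never observes a partially written file.
        Files.move(tmp, haDir.resolve(key), StandardCopyOption.ATOMIC_MOVE);
    }

    /** Returns the stored value, or empty if nothing was persisted for key. */
    Optional<String> get(String key) throws IOException {
        Path file = haDir.resolve(key);
        if (!Files.exists(file)) {
            return Optional.empty();
        }
        byte[] bytes = Files.readAllBytes(file);
        return Optional.of(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

A restarted process would construct a new FileSystemHaSketch over the same directory and call get to recover what the previous instance stored. No lock or election is involved, which is why the at-most-one guarantee from the StatefulSet matters.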