Details
Type: Wish
Status: Accepted
Priority: Major
Resolution: Unresolved
Description
Package distribution in a large-scale cluster is a non-trivial problem, and it will get worse once we start to support container images (MESOS-2840). Imagine O(10000) agents simultaneously fetching an O(GB) container image from HDFS, which holds only 3 replicas of the image file.
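To put rough numbers on that scenario, here is a back-of-envelope sketch. The figures (exactly 10,000 agents, a 1 GB image, 10 Gbit/s of serving capacity per replica) are assumptions chosen for illustration, not measurements, and `fetch_time_seconds` is a hypothetical helper, not part of Mesos or HDFS.

```python
def fetch_time_seconds(num_agents, image_bytes, num_sources, source_bandwidth_bps):
    """Lower bound on the time for all agents to finish downloading when
    every byte has to be served by one of `num_sources` servers."""
    total_bits = num_agents * image_bytes * 8
    return total_bits / (num_sources * source_bandwidth_bps)


# Assumed figures, for illustration only: 10,000 agents, a 1 GB image,
# 3 replicas, each able to serve 10 Gbit/s.
seconds = fetch_time_seconds(10_000, 10**9, 3, 10 * 10**9)
print(f"{seconds / 60:.1f} minutes")  # prints "44.4 minutes"
```

Under these assumptions the replicas are the bottleneck for about three quarters of an hour, even if they serve at full line rate the whole time. With a P2P protocol the serving capacity grows with the number of agents that already hold pieces of the image, so it is no longer fixed at 3 sources.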
It would be great if Mesos could support fetching packages using a P2P protocol (e.g., BitTorrent). The P2P clients could also be configured to be locality aware (e.g., preferring peers in the same rack).
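One possible reading of "locality aware" is simply to try same-rack peers before any others. The sketch below illustrates that idea only; `Peer`, `order_peers`, and the host and rack names are hypothetical and do not correspond to any existing Mesos or BitTorrent client API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    host: str
    rack: str


def order_peers(local_rack, peers):
    """Return peers with same-rack peers first.

    sorted() is stable, so peers keep their original relative order
    within the same-rack group and within the other-rack group.
    """
    return sorted(peers, key=lambda p: p.rack != local_rack)


# Hypothetical peer list as seen by an agent in rack-a.
peers = [
    Peer("agent-7", "rack-b"),
    Peer("agent-3", "rack-a"),
    Peer("agent-9", "rack-a"),
]
preferred = order_peers("rack-a", peers)
print([p.host for p in preferred])  # ['agent-3', 'agent-9', 'agent-7']
```

A real client would probably still keep a few cross-rack connections so that pieces can enter a rack in the first place; this sketch only shows the preference ordering.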
Content distribution using P2P protocols is not new. It has been discussed, used in production, and proven successful:
http://www.ebaytechblog.com/2012/01/31/bittorrent-for-package-distribution-in-the-enterprise/
https://blog.twitter.com/2010/murder-fast-datacenter-code-deploys-using-bittorrent
https://torrentfreak.com/facebook-uses-bittorrent-and-they-love-it-100625/