One of the requirements for EMR is that we have distcp in some form. This issue is to implement distributed copy to/from a blobstore.
While this could be implemented in a lower layer of the stack (e.g. Hadoop), multiple services (e.g. NoSQL stores) would benefit from having it available at the Whirr level.