Description
We need to improve AWS S3 support. S3 differs from HDFS in the following ways:
- No move operation. Move is emulated by copy and remove.
- Slow directory listing
- Locality is irrelevant (i.e., access is always remote)
- Eventual consistency
Treating S3 as if it were just another HDFS implementation can degrade performance significantly. We need to mitigate these degradation points.
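As a rough illustration of the first point, here is a minimal sketch of a move emulated by copy and remove. It runs against an in-memory toy store, not the real S3 API; `ToyObjectStore`, `emulated_move`, and the key names are hypothetical and exist only for this example.

```python
class ToyObjectStore:
    """In-memory stand-in for a flat key/value object store (hypothetical)."""

    def __init__(self):
        self.objects = {}      # key -> bytes
        self.bytes_copied = 0  # total payload bytes rewritten by copy()

    def put(self, key, data):
        self.objects[key] = data

    def copy(self, src, dst):
        data = self.objects[src]
        self.objects[dst] = data
        self.bytes_copied += len(data)

    def delete(self, key):
        del self.objects[key]

    def list_prefix(self, prefix):
        # A "directory" is only a key prefix, so listing scans the keys.
        return sorted(k for k in self.objects if k.startswith(prefix))


def emulated_move(store, src_prefix, dst_prefix):
    """Move every object under src_prefix by copying it, then deleting the original."""
    moved = 0
    for key in store.list_prefix(src_prefix):
        new_key = dst_prefix + key[len(src_prefix):]
        store.copy(key, new_key)
        store.delete(key)
        moved += 1
    return moved


store = ToyObjectStore()
store.put("staging/part-0", b"a" * 1000)
store.put("staging/part-1", b"b" * 2000)
moved = emulated_move(store, "staging/", "final/")
```

In this sketch the cost of the move grows with the number of objects and the bytes they hold (`store.bytes_copied`), whereas an HDFS rename is a metadata-only operation.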
This is an umbrella issue to track the subtasks.