Great discussion, I'm enjoying it (although an hour with a whiteboard would be even better).
My eventual (no pun intended!) plan was going to be to evolve the interfaces and separation of responsibilities for AbstractS3AccessPolicy and S3Store such that S3Store never makes its own internal metadata calls (like the internal getFileStatus calls you mentioned).
Yeah, I was trying to come up with a way to do this. My goals are clean / minimal code, and near-optimal performance (want upstream s3a to be very fast).
The crux of the problem seems that the top level logic of things like mkdirs()/rename() etc. need to call getFileStatus(), and those internal getFileStatus() calls are subject to the same source-of-truth and retry policies as the public API. In this case, the only way I can think to really separate the top-level logic for FileSystem ops (e.g. mkdir) from the policy on MetadataStore, is to build some sort of execution plan in top level logic then pass to a policy layer to execute it. You'd have three layers, roughly, FileSystem top-level, Raw S3 I/O, Policy / execution. This seems like it would be slower and way over-complicated (some steps in execution are conditional, you end up with a pseudo-language). There is also the AOP approach that s3mper took, but I think we can do better since we can modify the upstream code, and our goals are a bit more ambitious (more than just listStatus(), plus ability to use MetadataStore as source of truth).
I thought it would introduce complicated control flow in a lot of the S3AFileSystem methods. However, refactorings like this are always subjective, and it's entirely possible that I was wrong.
This is true. Seems like the classic performance vs. complexity tradeoff. I don't think you are wrong at all, just a question of priorities. I'm willing to sacrifice a little code separation for significant performance benefit, and possibly a more complete solution (e.g. if internal getFileStatus() call misses a file above mkdirs(path) because that file creation was eventually consistent).
If you prefer, we can go that way
That is my preference at this stage. It was not as bad as I thought it would be when we prototyped it. Also I like doing refactoring / generalization after seeing some concrete cases in code (i.e. start by complicating top level S3AFileSystem logic).
My preference would probably change if I could think of a clean way to handle the internal getFileStatus() calls.
and later revisit the larger split I proposed here in patch 001 if it's deemed necessary.
I think we do need to break up S3AFileSystem. I'll be very supportive of future refactorization here, in general.
Really the only portion of patch 001 that I consider an absolute must is the S3ClientFactory work. I think it's vital to the project that we have the ability to mock S3 interactions to simulate eventual consistency.
Yes, I like this part of it. Being able to mock the S3 service out would be awesome, and I didn't notice any real downside.