> I think the extended version of the API would help in doing incremental distcp when hdfs-append is supported.
Thanks for the use case! An append-savvy incremental distcp might first use listStatus to get the lengths and dates of all files on both filesystems, then work out which files had grown longer while their creation dates had not changed, indicating that they had been appended to. A batch call could then fetch the block locations of just the newly appended sections, and these would be used to construct splits that localize well. Does that sound right?
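To make the comparison step concrete, here is a minimal sketch of how the appended ranges might be computed once both listings are in hand. It deliberately avoids the Hadoop classes: `FileInfo`, `AppendedRange` and `findAppended` are hypothetical names invented for illustration, `FileInfo` stands in for whatever listStatus returns per file, and it assumes the earlier copy preserved each file's date on the target so the two dates can be compared directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class AppendDetector {
    /** Hypothetical stand-in for the per-file fields a listStatus call would return. */
    static final class FileInfo {
        final long length;
        final long date; // assumed to be preserved on the target by the previous copy
        FileInfo(long length, long date) {
            this.length = length;
            this.date = date;
        }
    }

    /** Byte range that exists on the source but not yet on the target. */
    static final class AppendedRange {
        final String path;
        final long start;
        final long length;
        AppendedRange(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /** Returns one range per file that grew on the source while its date stayed the same. */
    static List<AppendedRange> findAppended(Map<String, FileInfo> source,
                                            Map<String, FileInfo> target) {
        List<AppendedRange> ranges = new ArrayList<>();
        for (Map.Entry<String, FileInfo> entry : source.entrySet()) {
            FileInfo src = entry.getValue();
            FileInfo dst = target.get(entry.getKey());
            if (dst == null) {
                continue; // new file: needs a full copy, which this sketch does not handle
            }
            if (src.date == dst.date && src.length > dst.length) {
                // Only the tail beyond the target's current length needs copying.
                ranges.add(new AppendedRange(entry.getKey(), dst.length,
                                             src.length - dst.length));
            }
        }
        return ranges;
    }
}
```

The resulting (path, start, length) triples are what would then be handed to the batch block-location call.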
In this case we would not list directories, but rather always pass in a list of individual files. The mapping from inputs to outputs would be 1:1 so it could take the form:
A corollary is that it does not make sense to pass start/end positions for a directory, although these could be ignored.
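For discussion only, the 1:1 mapping could be pictured with the following sketch. It is not the signature proposed above, just one hypothetical shape: `Request`, `BlockRange` and `locate` are made-up names, blocks are simulated with a fixed block size rather than looked up on a namenode, and host information is omitted entirely.

```java
import java.util.ArrayList;
import java.util.List;

class BatchLocations {
    /** One input: a single file plus the byte range of interest. */
    static final class Request {
        final String path;
        final long start;
        final long length;
        Request(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /** Simulated block: offset within the file and block length (hosts omitted). */
    static final class BlockRange {
        final long offset;
        final long length;
        BlockRange(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Result i describes the blocks overlapping request i, so inputs map 1:1 to outputs. */
    static List<List<BlockRange>> locate(List<Request> requests, long blockSize) {
        List<List<BlockRange>> results = new ArrayList<>();
        for (Request r : requests) {
            List<BlockRange> blocks = new ArrayList<>();
            if (r.length > 0) {
                long end = r.start + r.length;
                // Start at the block boundary at or before the requested start.
                long offset = (r.start / blockSize) * blockSize;
                for (; offset < end; offset += blockSize) {
                    blocks.add(new BlockRange(offset, blockSize));
                }
            }
            results.add(blocks);
        }
        return results;
    }
}
```

Because every request names an individual file, the start and length always have a well-defined meaning, which is exactly what would be lost if a directory were passed in.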
Do we want to try to develop a single Swiss-army-knife batch call, or add operation-optimized calls as we go?