>This sounds rather ad-hoc. What is the use case?
One use case is backing up a number of directories, say /user1/data, /user2/data, /user3/data, etc., during off-peak hours every day. Each of these directories may contain a large number of files/bytes. If we simply run distcp, it cannot finish copying everything within a single day.
Also, since DistCp currently copies files sequentially, files in /user1/data will be copied first and the other directories will have to wait. The other users will be unhappy.
If distcp supported a limit option, we could do something like
distcp /user1/data limit 100GB, 1000000 files
distcp /user2/data limit 100GB, 1000000 files
These commands will be executed every day. Suppose /user1/data contains 5 files as follows
Then distcp will copy file1 and file2 on the first day. On the second day, since file1 and file2 already exist at the destination, distcp will copy file3 and file4. User1 can expect all files to be copied within 3 days.
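
To make the day-by-day behavior concrete, here is a rough Python simulation of the proposed limit semantics. It is only a sketch, not DistCp code: the function names are made up, the file sizes (50GB each) are hypothetical values chosen so that two files fit under a 100GB limit, and it assumes that a run stops at the first file that would exceed either limit.

```python
GB = 1024 ** 3


def plan_daily_copy(source_files, already_copied, max_bytes, max_files):
    """Pick the next files to copy, in listing order, within both limits."""
    selected = []
    total_bytes = 0
    for name, size in source_files:
        if name in already_copied:
            continue  # file already exists at the destination, skip it
        if len(selected) + 1 > max_files or total_bytes + size > max_bytes:
            break  # assumed behavior: stop at the first file over a limit
        selected.append(name)
        total_bytes += size
    return selected


def simulate(source_files, max_bytes, max_files):
    """Run the limited copy once per day until everything is copied."""
    copied = set()
    schedule = []
    while len(copied) < len(source_files):
        batch = plan_daily_copy(source_files, copied, max_bytes, max_files)
        if not batch:
            raise ValueError("next file does not fit within the limits")
        copied.update(batch)
        schedule.append(batch)
    return schedule


# Hypothetical sizes, not taken from the original example.
files = [(f"file{i}", 50 * GB) for i in range(1, 6)]
schedule = simulate(files, max_bytes=100 * GB, max_files=1_000_000)

for day, batch in enumerate(schedule, start=1):
    print(f"day {day}: {batch}")
```

With these assumed sizes the schedule is file1 and file2 on day 1, file3 and file4 on day 2, and file5 on day 3, which matches the 3 days described above.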