This may be impossible to fix without a complete restructuring of bulk import.
There's a race condition between the update of the metadata with a bulk-file-loaded flag, and the closing of the transaction. The current code keeps this window very small, but it is still possible.
Another "fix" is to never move files to the failed directory: always copy them. However, the race condition is just moved from the Master to the Garbage Collector.
The work-around now is to increase the number of retries to a very high number.