When running ValidatesRunner tests with the ULR, artifacts are never deleted. Because a new job is run per test, this quickly uses up a large amount of disk storage (over 20 GB per execution). The machine running these tests often runs out of disk space as a result, and tests then start failing.
To avoid this issue, the ULR should be modified to delete these artifacts after they have been staged. Flink already does this, so the infrastructure exists.
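
The cleanup being asked for could look roughly like the sketch below. It is only an illustration of the pattern (one staging directory per job, removed when the job ends, whether it passed or failed) and is not the ULR's or Flink's actual code. The names `job_staging_dir` and `run_job` and the `artifact.jar` file are hypothetical, and the sketch assumes artifacts are staged to a local directory.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def job_staging_dir(root):
    """Yield a fresh per-job staging directory and delete it on exit.

    Hypothetical helper; assumes artifacts are staged on local disk.
    """
    staging = Path(tempfile.mkdtemp(prefix="job-", dir=root))
    try:
        yield staging
    finally:
        # Remove the staged artifacts even if the job failed.
        shutil.rmtree(staging, ignore_errors=True)


def run_job(root):
    """Hypothetical stand-in for one test's job; returns the directory it used."""
    with job_staging_dir(root) as staging:
        # Stage a dummy artifact, as a real job would stage its jars.
        (staging / "artifact.jar").write_bytes(b"\0" * 1024)
        return staging
```

With cleanup tied to the lifetime of each job, disk usage stays bounded by the jobs currently running rather than growing with every test in the suite.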