I did not include a cancellation mechanism in the DataBags themselves because it was not clear to me that it would be necessary.
The only point at which a significant amount of time can be spent in the DataBag code is in the add() method while a spill is occurring. At that moment execution is either in Arrays.sort() (SortedDataBag and DistinctDataBag) or in the process of serializing tuples to disk. Given the anticipated spill thresholds (1,000-100,000 tuples, or memory in the 10-100 MB range) and the fact that the disk I/O is sequential (and thus fast), those operations should complete on the order of tens of seconds, so supporting cancellation inside them seemed like an unnecessary complication. Any physical query operator using the DataBag can then cancel as soon as the spill finishes (QueryIterSort passes the cancel request to its embedded iterator, which then throws the QueryCancellationException on the next iteration).
After the add phase is complete and QueryIterSort starts returning results, cancellation is handled by the superclass (QueryIteratorBase).
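To make that flow concrete, here is a minimal sketch of the idea, not the actual ARQ code. All of the class names (SketchBag, SketchCancellableInput, SketchSortOperator, SketchCancelledException) are hypothetical stand-ins, and the "spill" is just an in-memory sort. The point is only that the bag has no cancellation logic of its own: a cancel request is recorded on the embedded input iterator and surfaces on the next iteration, after any spill in progress has finished.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a spilling bag; note it knows nothing about cancellation. */
class SketchBag {
    private final int spillThreshold;
    private final List<Integer> memory = new ArrayList<>();
    int spillCount = 0;

    SketchBag(int spillThreshold) { this.spillThreshold = spillThreshold; }

    void add(Integer item) {
        memory.add(item);
        if (memory.size() >= spillThreshold) {
            spill(); // always runs to completion
        }
    }

    private void spill() {
        Collections.sort(memory); // stands in for the sort plus serialization to disk
        memory.clear();
        spillCount++;
    }
}

/** Hypothetical exception, playing the role of the query cancellation exception. */
class SketchCancelledException extends RuntimeException {
    SketchCancelledException() { super("cancelled"); }
}

/** Hypothetical embedded iterator: it throws on the next iteration after cancel(). */
class SketchCancellableInput implements Iterator<Integer> {
    private final Iterator<Integer> source;
    private volatile boolean cancelled = false;

    SketchCancellableInput(Iterator<Integer> source) { this.source = source; }

    void cancel() { cancelled = true; }

    private void check() {
        if (cancelled) throw new SketchCancelledException();
    }

    @Override public boolean hasNext() { check(); return source.hasNext(); }
    @Override public Integer next() { check(); return source.next(); }
}

/** Hypothetical sort operator: forwards cancel to its input, never to the bag. */
class SketchSortOperator {
    private final SketchCancellableInput input;
    final SketchBag bag;

    SketchSortOperator(Iterator<Integer> source, int spillThreshold) {
        this.input = new SketchCancellableInput(source);
        this.bag = new SketchBag(spillThreshold);
    }

    void cancel() { input.cancel(); }

    /** Moves one item from the input into the bag; returns false when the input is exhausted. */
    boolean step() {
        if (!input.hasNext()) return false; // throws here if a cancel was requested
        bag.add(input.next());
        return true;
    }
}

class SketchCancelDemo {
    public static void main(String[] args) {
        SketchSortOperator op =
            new SketchSortOperator(Arrays.asList(5, 3, 8, 1, 9, 2).iterator(), 3);
        for (int i = 0; i < 4; i++) op.step(); // the third add triggers one spill
        op.cancel();
        try {
            op.step();
        } catch (SketchCancelledException e) {
            System.out.println("cancelled after " + op.bag.spillCount + " completed spill(s)");
        }
    }
}
```

In a real multi-threaded setting the cancel request would arrive from another thread while add() is inside a spill; the volatile flag in the sketch is what would let the adding thread see it on its next iteration.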
Porting the tests meant that they now exercise QueryIterSort with its embedded DataBag, to be sure that the temporary files are cleaned up when the iterator is cancelled. So they are not really testing cancellation on the DataBag per se, but rather the new QueryIterSort.
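The kind of check those tests make can be sketched roughly as follows. This is an illustration under assumptions, not the ported test code: SpillFileBag and SpillingSortSketch are hypothetical names, the spill threshold of 2 is arbitrary, and the only behaviour shown is that cancelling the operator closes the bag and removes its temporary spill files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bag that spills to temporary files and deletes them on close(). */
class SpillFileBag implements AutoCloseable {
    private final int spillThreshold;
    private final List<String> memory = new ArrayList<>();
    final List<Path> spillFiles = new ArrayList<>();

    SpillFileBag(int spillThreshold) { this.spillThreshold = spillThreshold; }

    void add(String item) {
        memory.add(item);
        if (memory.size() >= spillThreshold) spill();
    }

    private void spill() {
        try {
            Path file = Files.createTempFile("sketch-bag", ".tmp");
            spillFiles.add(file);      // track first so close() can always clean up
            Files.write(file, memory); // one item per line
            memory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public void close() {
        for (Path file : spillFiles) {
            try {
                Files.deleteIfExists(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        memory.clear();
    }
}

/** Hypothetical sort operator: cancelling it releases the embedded bag. */
class SpillingSortSketch {
    final SpillFileBag bag = new SpillFileBag(2); // arbitrary threshold for the sketch
    private boolean cancelled = false;

    void add(String item) {
        if (cancelled) throw new IllegalStateException("cancelled");
        bag.add(item);
    }

    void cancel() {
        cancelled = true;
        bag.close(); // temporary files must not outlive a cancelled iterator
    }
}

class SpillCleanupDemo {
    public static void main(String[] args) {
        SpillingSortSketch op = new SpillingSortSketch();
        for (String s : new String[] {"c", "a", "d", "b", "e"}) op.add(s);
        List<Path> files = new ArrayList<>(op.bag.spillFiles);
        op.cancel();
        for (Path f : files) {
            System.out.println(f.getFileName() + " still exists: " + Files.exists(f));
        }
    }
}
```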