Currently, vertex mutations are applied in a single-threaded manner while preparing a superstep for execution. This is extremely inefficient, especially in the out-of-core case, because we have to go over all the partitions, load each one, and check whether it has any applicable mutations. There is also unexpected behavior when a mutation is applied to a vertex and messages are delivered to that vertex at the same time (the current implementation fails the job in that case).
To fix and optimize this, mutations should be applied in a multi-threaded fashion, right before the processing of each partition starts. Also, when a partition migrates, its mutation requests should migrate along with the partition and its messages.
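
The proposed flow can be illustrated with a minimal sketch. This is not Giraph code: the class PartitionMutationSketch, the Mutations holder, and the map-based partition are all hypothetical stand-ins, and the "computation" is a placeholder that increments each vertex value. The sketch only shows the idea that pending mutations are keyed by partition, so they can travel with it, and are applied by the worker thread that is about to process that partition.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; none of these names are part of the Giraph API. */
final class PartitionMutationSketch {

    /** Pending mutation requests for one partition (assumed shape). */
    static final class Mutations {
        final Set<Long> addVertices = new HashSet<>();
        final Set<Long> removeVertices = new HashSet<>();
    }

    /** Applies the partition's pending mutations, then processes it. */
    static void processPartition(Map<Long, Integer> partition, Mutations pending) {
        if (pending != null) {
            for (long id : pending.removeVertices) {
                partition.remove(id);
            }
            for (long id : pending.addVertices) {
                partition.putIfAbsent(id, 0);
            }
        }
        // Placeholder for the real vertex computation.
        partition.replaceAll((id, value) -> value + 1);
    }

    /** Runs one superstep; each partition is handled by exactly one task. */
    static void runSuperstep(Map<Integer, Map<Long, Integer>> partitions,
                             Map<Integer, Mutations> pendingByPartition,
                             int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Map.Entry<Integer, Map<Long, Integer>> entry : partitions.entrySet()) {
                // Hand the partition's mutations to the task that processes it.
                Mutations pending = pendingByPartition.remove(entry.getKey());
                Map<Long, Integer> partition = entry.getValue();
                futures.add(pool.submit(() -> processPartition(partition, pending)));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for completion and surface failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Superstep interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Partition processing failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because mutations are looked up only when a partition is about to be processed, an out-of-core implementation along these lines would not need a separate pass that loads every partition just to check for mutations.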