In this case, it is not possible to identify in which phase the counter mutation failed.
That's right, you can't identify in which phase the counter mutation failed. But given how counters currently work, we can't send you that information: the timeout is sent by the coordinator, which only gets acks once everything is finished, so if it doesn't get acks, it doesn't know which phase we're in. We'd need to change the protocol used internally, as suggested a long time ago in CASSANDRA-3199, but we've so far decided that the ROI for that wasn't good enough (mostly due to the huge headache of making this change while maintaining backward compatibility and rolling upgrades). Note in particular that even doing that wouldn't avoid the timeout. It would just make a tiny bit more info available to the coordinator when it happens, and that info might not even be enough to tell whether the counter update has been persisted or not.
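To make the point above concrete, here is a toy model of why the coordinator can't tell the phases apart. It is only a sketch, not Cassandra's actual code: the phase names (`leader_local_write`, `replicate_to_peers`), the `Outcome` type and both functions are made up for illustration. The only thing it takes from the real protocol is that a single ack is sent once everything is finished.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical phase names, for illustration only.
PHASES = ("leader_local_write", "replicate_to_peers")


@dataclass
class Outcome:
    acked: bool                  # the only thing the coordinator observes
    failed_phase: Optional[str]  # ground truth, never sent to the coordinator
    persisted_on_leader: bool    # ground truth, never sent to the coordinator


def run_counter_write(fail_in: Optional[str]) -> Outcome:
    """Run the phases in order; an ack is produced only if all of them finish."""
    persisted = False
    for phase in PHASES:
        if phase == fail_in:
            # The write stalls here, so no ack is ever sent.
            return Outcome(False, phase, persisted)
        if phase == "leader_local_write":
            persisted = True
    return Outcome(True, None, persisted)


def coordinator_view(outcome: Outcome) -> str:
    """The coordinator sees an ack or nothing; no ack becomes a timeout."""
    return "success" if outcome.acked else "timeout"


early = run_counter_write(fail_in="leader_local_write")
late = run_counter_write(fail_in="replicate_to_peers")

# Both failures look identical from the coordinator...
print(coordinator_view(early), coordinator_view(late))
# ...even though the increment was persisted in one case and not the other.
print(early.persisted_on_leader, late.persisted_on_leader)
```

In this model the two failing runs produce the same coordinator view while differing in whether the update was persisted, which is why a timeout on a counter write can't safely be treated as either "applied" or "not applied".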
Overall, closing this issue as not a problem. Yes, whenever a node dies, some counter inserts can time out during the window it takes for the failure detector to mark that node dead, even if you have, in theory, enough nodes alive to fulfill the CL requirements. And yes, that's sad. But it's unfortunately an intrinsic limitation of the counter design for which we don't have a solution.
Or to put it another way, this is working as designed, which doesn't mean we disagree that this is a weakness of said design.