Here is a patch. Some of the edits are purely for code clarity.
I expanded the scope so that AbstractSecondPassGroupingCollector.needsScores() is determined from the constructor args instead of always returning true (this is an optimization and addresses a TODO comment). I did not do the same for the other collectors. Do you think this is okay, or would you prefer a separate issue that covers the wider scope?
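To make the idea concrete, here is a minimal sketch of deriving the score requirement once in the constructor rather than hard-coding true. It is not the code in the patch: the class is a hypothetical stand-in with no Lucene dependency, and the three boolean inputs (whether the within-group sort needs scores, and the getScores / getMaxScores flags) are my assumption about which constructor args matter.

```java
// Hypothetical stand-in, NOT the Lucene class: it only models the decision
// of whether scores are required, based on assumed constructor arguments.
final class SecondPassCollectorSketch {
    private final boolean needsScores;

    SecondPassCollectorSketch(boolean withinGroupSortNeedsScores,
                              boolean getScores,
                              boolean getMaxScores) {
        // Scores are needed if the caller asked for them, asked for the max
        // score, or sorts within each group by a score-dependent criterion.
        this.needsScores = getScores || getMaxScores || withinGroupSortNeedsScores;
    }

    // Previously this would have been an unconditional "return true".
    boolean needsScores() {
        return needsScores;
    }
}
```

With all three inputs false, the collector reports that it does not need scores, which is what lets the searcher skip scoring work.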
The Solr side was tricky to debug and fix. I ended up refactoring TopGroupsResultTransformer to extract near-duplicate code. I strengthened the test in TestDistributedGrouping so that it actually tests score-ordered groups, which it didn't before because the commenter (you?) thought distributed IDF was necessary, when in fact just returning some field value as the score works fine. I ran into SOLR-6612 but avoided it by adding "maxScore" to the "handle" map as a "SKIP". I spent a little time trying to fix it but stopped myself as it was becoming a distraction. Some small improvements in the issue might reflect that attempt.