I set up Java Flight Recorder to profile indexing a 650MB JSON file using bin/post on a 2-shard, 2-replica setup. The profile shows that the JavaBinCodec.writeExternString(String) method contributes a lot of garbage during indexing in SolrCloud. More specifically, it contributes ~1GB of HashMap$Node objects and ~450MB of HashMap$Node objects.
Most of this allocation happens because every request is serialized using a new instance of JavaBinUpdateRequestCodec, which internally allocates a new HashMap for storing the extern strings.
We should explore keeping a global extern string map to eliminate redundant allocations.
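To make the difference concrete, here is a minimal sketch contrasting a per-request extern string table with a shared one. It is not Solr's actual code: ExternStringSketch, PerRequestTable, and SharedTable are hypothetical names, and the 1-based index scheme is an assumption chosen for illustration. It also ignores how a reader would resolve indices from a shared table, which a real change would have to address.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; these classes do not exist in Solr.
final class ExternStringSketch {

    /** Per-request table: each request allocates a new HashMap and new nodes. */
    static final class PerRequestTable {
        private final Map<String, Integer> ids = new HashMap<>();

        /** Returns the 1-based index for s, assigning the next index if s is new. */
        int indexOf(String s) {
            Integer id = ids.get(s);
            if (id == null) {
                id = ids.size() + 1;
                ids.put(s, id);
            }
            return id;
        }
    }

    /** Shared table: one map reused by all requests, safe for concurrent writers. */
    static final class SharedTable {
        private final ConcurrentHashMap<String, Integer> ids = new ConcurrentHashMap<>();
        private final AtomicInteger next = new AtomicInteger();

        /** Returns the existing index for s, or atomically assigns the next one. */
        int indexOf(String s) {
            return ids.computeIfAbsent(s, k -> next.incrementAndGet());
        }

        /** Number of distinct strings seen so far. */
        int size() {
            return ids.size();
        }
    }
}
```

With the shared table, a field name that repeats across requests is inserted once and only looked up afterwards, so map nodes are no longer allocated per request. The trade-off is that the map lives for the lifetime of the process and must tolerate concurrent access.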