Details
Bug
Status: Resolved
Critical
Resolution: Fixed
1.2.0
None
Description
Currently, the NettyBlockTransferService uses the total number of cores on the machine as both the number of threads and the number of buffer arenas to create. The arena count is the more troubling of the two: it can lead to significant extra allocation of heap and direct memory when an executor is small relative to the whole machine. For instance, on a machine with 32 cores, we will allocate (32 cores * 16MB per arena = 512MB) * 2 for client and server = 1GB of direct and heap memory. This is a large overhead if the executor is only using, say, 8 of those cores.
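The arithmetic above can be checked with a small sketch. It is only an illustration of the estimate in this description, not Spark or Netty code: the class and method names (ArenaOverhead, estimateBytes) are hypothetical, the 16MB-per-arena figure is taken from the description, and the 8-core case assumes the arena count would be tied to the cores the executor actually uses.

```java
public class ArenaOverhead {
    // Assumption from the description: each arena accounts for 16MB.
    static final long ARENA_BYTES = 16L * 1024 * 1024;

    /**
     * Estimated bytes reserved by buffer arenas.
     * numArenas: arenas per pool (one per core in the behavior described).
     * numPools:  number of pools (2 here: one client, one server).
     */
    static long estimateBytes(int numArenas, int numPools) {
        return numArenas * ARENA_BYTES * numPools;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        int machineCores = 32;  // what the arena count is based on in this report
        int executorCores = 8;  // what the executor actually uses (hypothetical sizing)
        int pools = 2;          // client and server

        System.out.println("Sized by machine cores:  "
                + estimateBytes(machineCores, pools) / mb + " MB");
        System.out.println("Sized by executor cores: "
                + estimateBytes(executorCores, pools) / mb + " MB");
    }
}
```

Under these assumptions the estimate drops from 1024 MB to 256 MB when the arena count follows the 8 cores in use rather than the 32 cores on the machine.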