In my cluster, we are suffering from OOM of shuffle-service.
We found that a lot of executors are fetching blocks from a single shuffle-service. Analyzing the memory, we found that the blockIds(shuffle_shuffleId_mapId_reduceId) takes about 1.5GBytes.
In current code, chunks are fetched from shuffle service in two steps:
Step-1. Send OpenBlocks, which contains the blocks list to to fetch;
Step-2. Fetch the consecutive chunks from shuffle-service by streamId and chunkIndex
Thus memory cost can be improved for step-1.