Spark / SPARK-22147

BlockId.hashCode allocates a StringBuilder/String on each call

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: Block Manager
    • Labels: None

      Description

      The base class BlockId defines hashCode and equals for all its subclasses in terms of name. This keeps the definitions of the different ID types very concise. The downside is redundant allocation: every hashCode call allocates a StringBuilder and a String. While I don't think this is a major issue, it is still a bit disappointing to increase GC pressure on the driver for nothing. In our machine learning workloads, we've seen as much as 10% of all allocations on the driver come from BlockId.hashCode calls made for BlockManagerMasterEndpoint.blockLocations.
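
      The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual Spark source: the class names NameHashedId, NameHashedRddId, FieldHashedId and FieldHashedRddId, the NameCalls counter, and the "rdd_" name format are all assumptions made for the example. The counter stands in for the allocation, since each call to name builds a new string. The second variant shows one possible alternative, where the base class leaves hashCode and equals to the compiler-generated case class versions; the ticket text itself does not say which fix was applied.

```scala
// Hypothetical stand-ins for illustration only; these are not the Spark classes.

// Counts how often `name` is evaluated. Each evaluation concatenates strings,
// which allocates a StringBuilder and a String.
object NameCalls {
  var count: Int = 0
}

// Variant 1: the base class hashes and compares through `name`.
// Because a concrete hashCode/equals is inherited from a base class,
// the compiler does not generate field-based versions for the case class.
sealed abstract class NameHashedId {
  def name: String
  override def hashCode: Int = name.hashCode
  override def equals(other: Any): Boolean = other match {
    case o: NameHashedId => getClass == o.getClass && name == o.name
    case _ => false
  }
}

final case class NameHashedRddId(rddId: Int, splitIndex: Int) extends NameHashedId {
  override def name: String = {
    NameCalls.count += 1
    "rdd_" + rddId + "_" + splitIndex
  }
}

// Variant 2 (one possible alternative): no overrides in the base class, so the
// case class gets compiler-generated equals/hashCode based on its fields.
sealed abstract class FieldHashedId {
  def name: String
}

final case class FieldHashedRddId(rddId: Int, splitIndex: Int) extends FieldHashedId {
  override def name: String = {
    NameCalls.count += 1
    "rdd_" + rddId + "_" + splitIndex
  }
}

object BlockIdHashSketch {
  def main(args: Array[String]): Unit = {
    val byName = NameHashedRddId(1, 2)
    NameCalls.count = 0
    byName.hashCode
    byName.hashCode
    // Two hashCode calls evaluated `name` twice.
    println(s"name-based hashCode evaluated name ${NameCalls.count} times")

    val byFields = FieldHashedRddId(1, 2)
    NameCalls.count = 0
    byFields.hashCode
    byFields.hashCode
    // The generated hashCode never touches `name`.
    println(s"field-based hashCode evaluated name ${NameCalls.count} times")
  }
}
```

      In a hash map keyed by such IDs, every lookup or insert calls hashCode at least once, so with the first variant the cost is paid on each map access.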

            People

            • Assignee: Sergei Lebedev (lebedev)
            • Reporter: Sergei Lebedev (lebedev)
            • Votes: 0
            • Watchers: 3

              Dates

              • Created:
                Updated:
                Resolved: