Spark / SPARK-22147

BlockId.hashCode allocates a StringBuilder/String on each call


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: Block Manager, Spark Core
    • Labels: None

    Description

      The base class BlockId defines hashCode and equals for all its subclasses in terms of name. This keeps the definitions of the different ID types very concise. The downside, however, is redundant allocation: name is built on demand, so every hashCode call allocates a StringBuilder and a String. While I don't think this is a major issue, it is still a bit disappointing to increase GC pressure on the driver for nothing. For our machine learning workloads, we've seen as much as 10% of all allocations on the driver coming from BlockId.hashCode calls made for BlockManagerMasterEndpoint.blockLocations.
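
A minimal sketch of the pattern described above, and of one possible way to avoid the allocation. All class names here are hypothetical, simplified stand-ins rather than Spark's actual definitions, and the second variant is only one option (letting the case class derive hashCode and equals from its fields), not necessarily the change that resolved this issue.

```scala
// Hypothetical stand-in for the pattern in the description: hashCode and
// equals are defined once in the base class, in terms of `name`.
sealed abstract class NameHashedId {
  def name: String
  override def toString: String = name
  // `name` is rebuilt on every call, so each hashCode allocates a String.
  override def hashCode: Int = name.hashCode
  override def equals(other: Any): Boolean = other match {
    case o: NameHashedId => getClass == o.getClass && name == o.name
    case _ => false
  }
}

final case class NameHashedRddId(rddId: Int, splitIndex: Int) extends NameHashedId {
  override def name: String = "rdd_" + rddId + "_" + splitIndex
}

// Assumed alternative: the base class no longer defines hashCode/equals,
// so the compiler-generated case class versions (based on the fields) apply.
sealed abstract class FieldHashedId {
  def name: String
  override def toString: String = name
}

final case class FieldHashedRddId(rddId: Int, splitIndex: Int) extends FieldHashedId {
  override def name: String = "rdd_" + rddId + "_" + splitIndex
}

object BlockIdSketch {
  def main(args: Array[String]): Unit = {
    val locations = new java.util.HashMap[FieldHashedId, String]()
    locations.put(FieldHashedRddId(1, 2), "executor-1")
    // The lookup hashes the two Int fields and never builds the name string.
    println(locations.get(FieldHashedRddId(1, 2)))
  }
}
```

Both variants keep toString equal to name, so logging and serialization by name are unaffected. The trade-off is that IDs of different types with the same name are no longer considered equal in the second variant, which only matters if such collisions are possible.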

          People

            Assignee: Sergei Lebedev
            Reporter: Sergei Lebedev