Spark / SPARK-18197

Optimise AppendOnlyMap implementation

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.6.2, 2.0.1
    • Fix Version/s: 2.1.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      This improvement works by performing the cheapest comparison test first. We observed a 1% performance improvement on PageRank (HiBench large) with this change.
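      The idea of testing the cheapest condition first can be sketched as follows. This is a minimal illustration, not the actual Spark patch: it assumes the cheap test is the empty-slot (null) check and the expensive one is the key's equals call, and the class TinyAppendOnlyMap with its fixed capacity is hypothetical.

```java
import java.util.function.BiFunction;

// Hypothetical, simplified open-addressing map; NOT Spark's AppendOnlyMap.
final class TinyAppendOnlyMap<K, V> {
    private static final int CAPACITY = 64; // fixed power of two; this sketch never resizes
    private final Object[] data = new Object[2 * CAPACITY]; // key at 2*i, value at 2*i+1
    private int size = 0;

    /** Inserts or updates the value for key; updateFunc receives (hadValue, oldValue). */
    @SuppressWarnings("unchecked")
    V changeValue(K key, BiFunction<Boolean, V, V> updateFunc) {
        if (key == null) {
            throw new IllegalArgumentException("null keys are not supported in this sketch");
        }
        int pos = key.hashCode() & (CAPACITY - 1);
        int step = 1;
        while (true) {
            Object curKey = data[2 * pos];
            if (curKey == null) {
                // Cheapest test first: one reference comparison against null.
                if (size == CAPACITY - 1) {
                    // Always keep one slot empty so probing terminates.
                    throw new IllegalStateException("sketch map is full");
                }
                V newValue = updateFunc.apply(false, null);
                data[2 * pos] = key;
                data[2 * pos + 1] = newValue;
                size++;
                return newValue;
            } else if (key == curKey || key.equals(curKey)) {
                // More expensive test second: equals may be a virtual call
                // that compares contents (e.g. String.equals).
                V newValue = updateFunc.apply(true, (V) data[2 * pos + 1]);
                data[2 * pos + 1] = newValue;
                return newValue;
            } else {
                // Collision with a different key: probe the next slot.
                pos = (pos + step) & (CAPACITY - 1);
                step++;
            }
        }
    }

    int size() {
        return size;
    }
}
```

      With this ordering, a probe that lands on an empty slot never calls equals at all, which is consistent with String.equals no longer appearing among the top symbols in the second profile below. Whether that accounts for the whole gain is not established by the profiles alone.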

      tprof output before the change follows; AppendOnlyMap.changeValue is where the optimisation occurs:

      PID 256059 22.86    java_1fa29
          MOD  81337  7.26     JITCODE
           SYM  11250  1.00      java/io/ObjectOutputStream.writeObject0(Ljava/lang/Object;Z)V_7fe098983af4
           SYM   8053  0.72      org/apache/spark/util/collection/AppendOnlyMap.changeValue(Ljava/lang/Object;Lscala/Function2;)Ljava/lang/Object;_7fe098c211e8
           SYM   5175  0.46      java/lang/String.equals(Ljava/lang/Object;)Z_7fe0989eb2e8
           SYM   3616  0.32      org/apache/spark/util/SizeEstimator$.estimate(Ljava/lang/Object;Ljava/util/IdentityHashMap;)J_7fe098bc35a8
           SYM   3235  0.29      org/apache/spark/util/collection/ExternalSorter$$anonfun$4$$anon$6.compare(Ljava/lang/Object;Ljava/lang/Object;)I_7fe098c855a8
           SYM   3182  0.28      java/io/ObjectInputStream$BlockDataInputStream.readUTFBody(J)Ljava/lang/String;_7fe098980ec8
           SYM   3111  0.28      org/apache/spark/util/SizeEstimator$SearchState.enqueue(Ljava/lang/Object;)V_7fe0989f2920
      

      tprof output after the change; AppendOnlyMap.changeValue drops from 8053 (0.72) to 2786 (0.25):

      MOD  56804  5.07     JITCODE
           SYM   8766  0.78      java/io/ObjectOutputStream.writeObject0(Ljava/lang/Object;Z)V_7f0088bb2034
           SYM   5746  0.51      java/io/ObjectStreamClass.lookup(Ljava/lang/Class;Z)Ljava/io/ObjectStreamClass;_7f0088944ae8
           SYM   3378  0.30      java/io/ObjectInputStream.readObject0(Z)Ljava/lang/Object;_7f0088c5ed00
           SYM   3121  0.28      java/io/ObjectInputStream$BlockDataInputStream.readUTFBody(J)Ljava/lang/String;_7f0088c6de08
           SYM   2857  0.26      org/apache/spark/storage/BufferReleasingInputStream.read([BII)I_7f0088b3e3a8
           SYM   2786  0.25      org/apache/spark/util/collection/AppendOnlyMap.changeValue(Ljava/lang/Object;Lscala/Function2;)Ljava/lang/Object;_7f008899b048
      

          Activity

          apachespark Apache Spark added a comment -

          User 'a-roberts' has created a pull request for this issue:
          https://github.com/apache/spark/pull/15714


            People

            • Assignee:
              aroberts Adam Roberts
              Reporter:
              aroberts Adam Roberts