Giraph / GIRAPH-616

Decouple vertices and edges in DiskBackedPartitionStore and avoid writing back edges when the algorithm does not change topology.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Many algorithms work on a static graph. In these cases, when running with the out-of-core graph, we end up writing back edges that have not changed since we read them. By decoupling vertices and edges, we can write back only the freshly computed vertex values.
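The idea in the description can be sketched as follows. This is a minimal illustration, not the code in DiskBackedPartitionStore: the class name `DecoupledPartitionSpill`, the file naming, and the use of plain text lines for vertex values and edges are all assumptions made for the example. Vertex values and edges go to separate files, and the edge file is rewritten only when the graph is not declared static or no edge file exists yet.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical sketch of a partition spilled as two files; not Giraph's implementation. */
class DecoupledPartitionSpill {
  private final Path vertexFile;
  private final Path edgeFile;
  private final boolean staticGraph;
  private int edgeWrites = 0;

  DecoupledPartitionSpill(Path dir, int partitionId, boolean staticGraph) {
    // File names are illustrative only.
    this.vertexFile = dir.resolve("partition-" + partitionId + "_vertices");
    this.edgeFile = dir.resolve("partition-" + partitionId + "_edges");
    this.staticGraph = staticGraph;
  }

  /** Writes vertex values on every call; writes edges only when they may have changed. */
  void offload(List<String> vertexValues, List<String> edges) throws IOException {
    Files.write(vertexFile, vertexValues);
    // With a static graph, the edge file from the first spill is still valid.
    if (!staticGraph || !Files.exists(edgeFile)) {
      Files.write(edgeFile, edges);
      edgeWrites++;
    }
  }

  /** Number of times the edge file was actually written. */
  int edgeWrites() {
    return edgeWrites;
  }

  List<String> loadVertexValues() throws IOException {
    return Files.readAllLines(vertexFile);
  }

  List<String> loadEdges() throws IOException {
    return Files.readAllLines(edgeFile);
  }
}
```

With `staticGraph` set to true, repeated calls to `offload` rewrite only the vertex file after the first spill. With it set to false, both files are rewritten every time, which corresponds to the behaviour before the decoupling.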

      1. GIRAPH-616.diff
        11 kB
        Claudio Martella
      2. GIRAPH-616.diff
        11 kB
        Claudio Martella

          Activity

          Claudio Martella added a comment -

          Implements the decoupling inside the OOC graph and introduces the isStaticGraph parameter (defaults to false).

          Passes mvn verify.

          Claudio Martella added a comment -

          We get the expected improvement, but there is a lot of variability in these benchmarks, so it is difficult to assess exactly how much we win. We certainly don't introduce a regression.

          PageRankBenchmark, 1M vertices, 100 edges each

          trunk in-memory:

          13/04/10 19:11:36 INFO mapred.JobClient: Total (milliseconds)=65181
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 3 (milliseconds)=4383
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 4 (milliseconds)=4007
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 10 (milliseconds)=476
          13/04/10 19:11:36 INFO mapred.JobClient: Setup (milliseconds)=9780
          13/04/10 19:11:36 INFO mapred.JobClient: Shutdown (milliseconds)=83
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 7 (milliseconds)=3828
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 9 (milliseconds)=3707
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 0 (milliseconds)=10283
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 8 (milliseconds)=3374
          13/04/10 19:11:36 INFO mapred.JobClient: Input superstep (milliseconds)=7695
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 6 (milliseconds)=4275
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 5 (milliseconds)=4119
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 2 (milliseconds)=3944
          13/04/10 19:11:36 INFO mapred.JobClient: Superstep 1 (milliseconds)=5224

          GIRAPH-616 OOC graph (maxPartitions=2), isStaticGraph = false:

          13/04/10 19:06:08 INFO mapred.JobClient: Total (milliseconds)=131783
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 3 (milliseconds)=11603
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 4 (milliseconds)=8596
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 10 (milliseconds)=10196
          13/04/10 19:06:08 INFO mapred.JobClient: Setup (milliseconds)=7960
          13/04/10 19:06:08 INFO mapred.JobClient: Shutdown (milliseconds)=97
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 7 (milliseconds)=9283
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 9 (milliseconds)=6139
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 0 (milliseconds)=10843
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 8 (milliseconds)=7047
          13/04/10 19:06:08 INFO mapred.JobClient: Input superstep (milliseconds)=25272
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 6 (milliseconds)=6272
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 5 (milliseconds)=13822
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 2 (milliseconds)=5456
          13/04/10 19:06:08 INFO mapred.JobClient: Superstep 1 (milliseconds)=9193

          GIRAPH-616 OOC graph (maxPartitions=2), isStaticGraph = true:

          13/04/10 19:14:03 INFO mapred.JobClient: Total (milliseconds)=82618
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 3 (milliseconds)=5542
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 4 (milliseconds)=6629
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 10 (milliseconds)=3500
          13/04/10 19:14:03 INFO mapred.JobClient: Setup (milliseconds)=8515
          13/04/10 19:14:03 INFO mapred.JobClient: Shutdown (milliseconds)=80
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 7 (milliseconds)=5353
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 9 (milliseconds)=5251
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 0 (milliseconds)=4734
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 8 (milliseconds)=5496
          13/04/10 19:14:03 INFO mapred.JobClient: Input superstep (milliseconds)=6628
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 6 (milliseconds)=9124
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 5 (milliseconds)=5817
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 2 (milliseconds)=5504
          13/04/10 19:14:03 INFO mapred.JobClient: Superstep 1 (milliseconds)=10438

          Maja Kabiljo added a comment -

          Awesome results!
          If you were to try with a bigger graph, maybe the variance would be smaller? Do you know where the variance comes from?

          The patch looks good to me. Let's have GIRAPH-613 figured out first, and then we can commit this.

          Claudio Martella added a comment -

          Maybe my cluster is introducing the variance, as I also experienced it with trunk in-memory. It could be the network, the jobtracker, or something else. As you said, a bigger job/graph should probably stabilise things. I'm going to try that tomorrow.

          Claudio Martella added a comment -

          Re-ran the experiments, this time with a bigger graph and more workers. The results are more stable. I had to increase the heap space for in-memory to achieve good performance. With the same memory as with OOC it would still reach the end of the computation, but it would run slower, presumably due to higher pressure on the GC.

          PR, 50M vertices, 100 edges each, 60 workers.

          GIRAPH-616 in-memory

          13/04/11 23:45:16 INFO mapred.JobClient: Total (milliseconds)=582072
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 3 (milliseconds)=47174
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 4 (milliseconds)=49080
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 10 (milliseconds)=1648
          13/04/11 23:45:16 INFO mapred.JobClient: Setup (milliseconds)=20144
          13/04/11 23:45:16 INFO mapred.JobClient: Shutdown (milliseconds)=198
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 7 (milliseconds)=50510
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 9 (milliseconds)=46757
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 0 (milliseconds)=53758
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 8 (milliseconds)=54245
          13/04/11 23:45:16 INFO mapred.JobClient: Input superstep (milliseconds)=47222
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 6 (milliseconds)=64889
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 5 (milliseconds)=52122
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 2 (milliseconds)=46064
          13/04/11 23:45:16 INFO mapred.JobClient: Superstep 1 (milliseconds)=48257

          GIRAPH-616 isStatic=true maxPartitionsInMemory=2

          13/04/11 23:14:52 INFO mapred.JobClient: Total (milliseconds)=644252
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 3 (milliseconds)=52543
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 4 (milliseconds)=54847
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 10 (milliseconds)=16012
          13/04/11 23:14:52 INFO mapred.JobClient: Setup (milliseconds)=20257
          13/04/11 23:14:52 INFO mapred.JobClient: Shutdown (milliseconds)=242
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 7 (milliseconds)=52789
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 9 (milliseconds)=52341
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 0 (milliseconds)=51049
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 8 (milliseconds)=56641
          13/04/11 23:14:52 INFO mapred.JobClient: Input superstep (milliseconds)=47426
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 6 (milliseconds)=53835
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 5 (milliseconds)=58083
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 2 (milliseconds)=62154
          13/04/11 23:14:52 INFO mapred.JobClient: Superstep 1 (milliseconds)=66027

          GIRAPH-616 isStatic=false maxPartitionsInMemory=2

          13/04/11 23:02:10 INFO mapred.JobClient: Giraph Timers
          13/04/11 23:02:10 INFO mapred.JobClient: Total (milliseconds)=764215
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 3 (milliseconds)=72673
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 4 (milliseconds)=62751
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 10 (milliseconds)=25774
          13/04/11 23:02:10 INFO mapred.JobClient: Setup (milliseconds)=25106
          13/04/11 23:02:10 INFO mapred.JobClient: Shutdown (milliseconds)=54
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 7 (milliseconds)=64634
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 9 (milliseconds)=67493
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 0 (milliseconds)=49969
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 8 (milliseconds)=73192
          13/04/11 23:02:10 INFO mapred.JobClient: Input superstep (milliseconds)=53890
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 6 (milliseconds)=69300
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 5 (milliseconds)=60797
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 2 (milliseconds)=64752
          13/04/11 23:02:10 INFO mapred.JobClient: Superstep 1 (milliseconds)=73824
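To put the "Total (milliseconds)" figures of the three 50M-vertex runs above side by side, the following sketch computes the slowdown of each out-of-core run relative to the in-memory run, and the fraction of time saved by isStatic=true over isStatic=false. The class and method names are made up for this example; the input numbers are the totals from the logs above, and since each configuration was run once, the output should be read as indicative only.

```java
/** Hypothetical helper for comparing the benchmark totals quoted in this issue. */
class BenchmarkOverhead {
  /** Running time relative to a baseline (1.0 means equal). */
  static double ratio(long millis, long baselineMillis) {
    return (double) millis / baselineMillis;
  }

  /** Fraction of the "before" time that the "after" run saves. */
  static double savedFraction(long beforeMillis, long afterMillis) {
    return (double) (beforeMillis - afterMillis) / beforeMillis;
  }

  public static void main(String[] args) {
    long inMemory = 582072L;      // GIRAPH-616 in-memory
    long oocStatic = 644252L;     // isStatic=true, maxPartitionsInMemory=2
    long oocNonStatic = 764215L;  // isStatic=false, maxPartitionsInMemory=2

    System.out.printf("OOC static vs in-memory:     %.2fx%n", ratio(oocStatic, inMemory));
    System.out.printf("OOC non-static vs in-memory: %.2fx%n", ratio(oocNonStatic, inMemory));
    System.out.printf("Saved by isStatic=true:      %.1f%%%n",
        100.0 * savedFraction(oocNonStatic, oocStatic));
  }
}
```

On these totals the out-of-core run costs roughly 1.11x the in-memory time with isStatic=true and roughly 1.31x with isStatic=false, so skipping the edge write-back saves about 16% of the out-of-core running time in this single comparison.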

          Maja Kabiljo added a comment -

          Maybe Alessandro Presta has some comments, since he wrote the out-of-core partition stuff. Otherwise, +1 from me.

          Claudio Martella added a comment -

          Actually this class was rewritten by me when I made it LRU. I'm committing this one in 24h to give Alessandro time to comment.

          Maja Kabiljo added a comment -

          Oh sorry, missed that.

          Alessandro Presta added a comment -

          Seems like a good improvement. I'm not up to speed on the recent changes in Vertex so I can't comment in detail.

          Hudson added a comment -

          Integrated in Giraph-trunk-Commit #911 (See https://builds.apache.org/job/Giraph-trunk-Commit/911/)
          GIRAPH-616: Decouple vertices and edges in DiskBackedPartitionStore and avoid writing back edges when the algorithm does not change topology. (Revision 228edbbd798f7718a5a7ccbcfd35c22e812be761)
          GIRAPH-616 (Revision 39f3591365cbe15b74c82574787d0592681d8ba5)

          Result = SUCCESS
          claudio : http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=228edbbd798f7718a5a7ccbcfd35c22e812be761
          Files :

          • giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
          • giraph-core/src/main/java/org/apache/giraph/partition/DiskBackedPartitionStore.java
          • giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java

          claudio : http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=39f3591365cbe15b74c82574787d0592681d8ba5
          Files :

          • CHANGELOG
          Claudio Martella added a comment -

          Given the nice results, out of curiosity I ran the same benchmark with out-of-core messages as well. In these tests we currently keep in memory 2 partitions out of the 60 assigned to each worker (about 3%), and each worker produces (and receives) on average 83M messages per superstep (5B edges / 60 workers). So I ran the tests adding giraph.maxMessagesInMemory=2490000, which is 3% of that. The results follow. I would be curious to see how long it would take to run the same number of iterations on the same graph with the same number of tasks with MR.

          13/04/12 08:45:43 INFO mapred.JobClient: Total (milliseconds)=2132200
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 3 (milliseconds)=205329
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 4 (milliseconds)=198965
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 10 (milliseconds)=109850
          13/04/12 08:45:43 INFO mapred.JobClient: Setup (milliseconds)=25407
          13/04/12 08:45:43 INFO mapred.JobClient: Shutdown (milliseconds)=83
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 7 (milliseconds)=200026
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 9 (milliseconds)=203015
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 0 (milliseconds)=110034
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 8 (milliseconds)=200514
          13/04/12 08:45:43 INFO mapred.JobClient: Input superstep (milliseconds)=40560
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 6 (milliseconds)=204376
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 5 (milliseconds)=199704
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 2 (milliseconds)=204082
          13/04/12 08:45:43 INFO mapred.JobClient: Superstep 1 (milliseconds)=230250

          The results are not bad, even though each superstep lasts around 3 times longer. If you consider that we are keeping less than 10MB of messages in memory per worker (assuming a 4-byte float each), that is quite understandable given the scale. I think that overall we keep less in memory than the default buffers of MR (for sorting and for I/O). I'd like to test merging the DiskBackedMessageStore files in the background, to see if reducing the number of files and disk seeks makes a difference (but wouldn't the merge overall cost as much I/O as we do now?).
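The arithmetic behind the giraph.maxMessagesInMemory value and the "less than 10MB" estimate in this comment can be reproduced with a short sketch. The class and method names are hypothetical. The payload estimate counts only the 4-byte float per message mentioned in the comment, so object and buffer overhead are deliberately ignored.

```java
/** Hypothetical helper reproducing the message-budget arithmetic from the comment. */
class MessageBudget {
  /** One message per edge per superstep, spread evenly over the workers. */
  static long messagesPerWorker(long vertices, long edgesPerVertex, int workers) {
    return vertices * edgesPerVertex / workers;
  }

  /** Number of messages allowed in memory, as an integer percentage of the total. */
  static long inMemoryLimit(long messagesPerWorker, int percent) {
    return messagesPerWorker * percent / 100;
  }

  /** Raw payload size only; per-message overhead is not included. */
  static long payloadBytes(long messages, int bytesPerMessage) {
    return messages * bytesPerMessage;
  }

  public static void main(String[] args) {
    long perWorker = messagesPerWorker(50_000_000L, 100, 60);
    System.out.println("messages per worker per superstep: " + perWorker);

    // The comment rounds the per-worker figure to 83M before taking 3%.
    long limit = inMemoryLimit(83_000_000L, 3);
    System.out.println("-Dgiraph.maxMessagesInMemory=" + limit);
    System.out.println("in-memory payload bytes: " + payloadBytes(limit, 4));
  }
}
```

Taking 3% of the rounded 83M figure gives the 2490000 used in the run, and at 4 bytes per message that is 9,960,000 bytes of payload, which matches the "less than 10MB" estimate.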


            People

            • Assignee: Claudio Martella
            • Reporter: Claudio Martella
            • Votes: 0
            • Watchers: 4
