Hama
  1. Hama
  2. HAMA-556

Graph package to support stopping the interations when the node changes are within the tolerance value as in the case of page rank

    Details

    • Type: New Feature New Feature
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: 0.5.0
    • Component/s: bsp core
    • Labels:
      None

      Description

      Currently in the graph package, the iteration continues till the nodes are no longer updated (absolutely) or the maximum number of iterations has reached. It doesn't support testing the node changes with some tolerance and then stop the iterations as in the case of page rank.

      The above scenario might be applicable besides page rank also.

      org.apache.hama.graph.GraphJobRunner#bsp()

      while (updated && iteration < maxIteration)

      { ...... ...... }
      1. HAMA-556.patch
        40 kB
        Thomas Jungblut
      2. HAMA-556_1.patch
        53 kB
        Thomas Jungblut

        Activity

        Hide
        Hudson added a comment -

        Integrated in Hama-Nightly #541 (See https://builds.apache.org/job/Hama-Nightly/541/)
        HAMA-556: Graph package to support stopping the interations when the node changes are within the tolerance value as in the case of page rank (Revision 1335670)

        Result = SUCCESS
        tjungblut :
        Files :

        • /incubator/hama/trunk/CHANGES.txt
        • /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java
        • /incubator/hama/trunk/core/src/test/java/org/apache/hama/bsp/TestCheckpoint.java
        • /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/ExampleDriver.java
        • /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/MindistSearch.java
        • /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/PageRank.java
        • /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/PiEstimator.java
        • /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/RandBench.java
        • /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/SSSP.java
        • /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/CombineExampleTest.java
        • /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/MindistSearchTest.java
        • /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/PageRankTest.java
        • /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/PiEstimatorTest.java
        • /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/RandBenchTest.java
        • /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/util/PagerankTextToSeqTest.java
        • /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/util/SSSPTextToSeqTest.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/AbsDiffAggregator.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/AbstractAggregator.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/Aggregator.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/AverageAggregator.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/GraphJob.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/GraphJobRunner.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/MaxAggregator.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/MinAggregator.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/Vertex.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/VertexInterface.java
        • /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/VertexWritable.java
        Show
        Hudson added a comment - Integrated in Hama-Nightly #541 (See https://builds.apache.org/job/Hama-Nightly/541/ ) HAMA-556 : Graph package to support stopping the interations when the node changes are within the tolerance value as in the case of page rank (Revision 1335670) Result = SUCCESS tjungblut : Files : /incubator/hama/trunk/CHANGES.txt /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java /incubator/hama/trunk/core/src/test/java/org/apache/hama/bsp/TestCheckpoint.java /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/ExampleDriver.java /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/MindistSearch.java /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/PageRank.java /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/PiEstimator.java /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/RandBench.java /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/SSSP.java /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/CombineExampleTest.java /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/MindistSearchTest.java /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/PageRankTest.java /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/PiEstimatorTest.java /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/RandBenchTest.java /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/util/PagerankTextToSeqTest.java /incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/util/SSSPTextToSeqTest.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/AbsDiffAggregator.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/AbstractAggregator.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/Aggregator.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/AverageAggregator.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/GraphJob.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/GraphJobRunner.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/MaxAggregator.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/MinAggregator.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/Vertex.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/VertexInterface.java /incubator/hama/trunk/graph/src/main/java/org/apache/hama/graph/VertexWritable.java
        Hide
        Thomas Jungblut added a comment -

        Thanks for all that review and help

        Show
        Thomas Jungblut added a comment - Thanks for all that review and help
        Hide
        Thomas Jungblut added a comment -

        Just committed.
        Added the line in example driver and some comments on the aggregators.
        Also fixed the TestCheckpoint job that was missing the NullOutputformat and therefore threw an exception which was catched, but had not tested anything then.

        Hope I'm not adding too many side effects.

        Show
        Thomas Jungblut added a comment - Just committed. Added the line in example driver and some comments on the aggregators. Also fixed the TestCheckpoint job that was missing the NullOutputformat and therefore threw an exception which was catched, but had not tested anything then. Hope I'm not adding too many side effects.
        Hide
        Thomas Jungblut added a comment -

        Yes I know that, but that is what we have the SRC apache release for. Or our good documentation.

        But if you say so, we could abstract getValue(), finalizeAggregate seems optional to me, if you use the aggregate with two parameters aggregate is also optional.
        Thanks for your opinion!

        Show
        Thomas Jungblut added a comment - Yes I know that, but that is what we have the SRC apache release for. Or our good documentation. But if you say so, we could abstract getValue(), finalizeAggregate seems optional to me, if you use the aggregate with two parameters aggregate is also optional. Thanks for your opinion!
        Hide
        Suraj Menon added a comment -

        I don't know from top of my mind. Completely understand the scrolling pain on empty function definitions. I think there are lot of programmers who do not like overriding a non virtual function. If the programmer does not have access to source code, he does not know what the functions do and would be tentative in super() calls in the body of overriding function definition. Not a show stopper.

        Show
        Suraj Menon added a comment - I don't know from top of my mind. Completely understand the scrolling pain on empty function definitions. I think there are lot of programmers who do not like overriding a non virtual function. If the programmer does not have access to source code, he does not know what the functions do and would be tentative in super() calls in the body of overriding function definition. Not a show stopper.
        Hide
        Thomas Jungblut added a comment -

        Because it is not very clean code to let a user override all functions.
        For some aggregators you just need aggregate, for fancier methods you need finalizeAggregate and aggregate.

        I'm a friend of empty default methods in non-instantiatable classes and to override the methods really needed instead of crapping the code of a user with empty methods. BTW is this a known pattern?

        Show
        Thomas Jungblut added a comment - Because it is not very clean code to let a user override all functions. For some aggregators you just need aggregate, for fancier methods you need finalizeAggregate and aggregate. I'm a friend of empty default methods in non-instantiatable classes and to override the methods really needed instead of crapping the code of a user with empty methods. BTW is this a known pattern?
        Hide
        Suraj Menon added a comment -

        Any reason why AbstractAggregator, which is abstract class, has empty implementation for aggregate(..), finalizeAggregate() and getValue() and not defined as abstract?

        Show
        Suraj Menon added a comment - Any reason why AbstractAggregator, which is abstract class, has empty implementation for aggregate(..), finalizeAggregate() and getValue() and not defined as abstract?
        Hide
        Edward J. Yoon added a comment -

        +1

        Show
        Edward J. Yoon added a comment - +1
        Hide
        Thomas Jungblut added a comment -

        come on, this is a single line edit. I don't want to generate the patch for that again
        I am going to commit this patch tonight and add this line.

        Show
        Thomas Jungblut added a comment - come on, this is a single line edit. I don't want to generate the patch for that again I am going to commit this patch tonight and add this line.
        Hide
        Edward J. Yoon added a comment -

        MindistSearch seems missed in exampledriver.

        Show
        Edward J. Yoon added a comment - MindistSearch seems missed in exampledriver.
        Hide
        Thomas Jungblut added a comment -

        +1

        Show
        Thomas Jungblut added a comment - +1
        Hide
        Edward J. Yoon added a comment -

        Looks good to me.
        I'd like to commit this to TRUNK. What do you think?

        Show
        Edward J. Yoon added a comment - Looks good to me. I'd like to commit this to TRUNK. What do you think?
        Hide
        Thomas Jungblut added a comment -

        Final fixes

        Pagerank works again, connected components test also says it works.

        Testcases are all fine.

        But we have to make a refactoring issue, the graphrunner is a whole mess (also my fault).

        Also a good follow up should be that the graph can be optionally repaired, I have explained it on the userlist:

        http://mail-archives.apache.org/mod_mbox/incubator-hama-user/201204.mbox/%3CCAJ-%3DysmSdYoX5t1pPjGi0KXWM%3DYVN2s8CoCgw5V3jXU3F74MHw%40mail.gmail.com%3E

        Show
        Thomas Jungblut added a comment - Final fixes Pagerank works again, connected components test also says it works. Testcases are all fine. But we have to make a refactoring issue, the graphrunner is a whole mess (also my fault). Also a good follow up should be that the graph can be optionally repaired, I have explained it on the userlist: http://mail-archives.apache.org/mod_mbox/incubator-hama-user/201204.mbox/%3CCAJ-%3DysmSdYoX5t1pPjGi0KXWM%3DYVN2s8CoCgw5V3jXU3F74MHw%40mail.gmail.com%3E
        Hide
        Thomas Jungblut added a comment -

        Added aggregators, not tested more than a localrunner with a single task.

        The error accumulation works, however I have somehow broken pagerank: Sum is: 0.7475461787313158

        I have to investigate it further, but I have seen that the testcase was useless anyways:

        https://svn.apache.org/repos/asf/incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/PageRankTest.java

        In the current trunk version it just checks if the sum is in between 0 and 1, which is always the case.
        This does really not say that pagerank is working correct.

        Seems that this got broken when we translated the old pagerank.

        Should we put the aggregators into another package? Same with the combiners?

        BTW I added my connected component special vertex It is much more cleaner than that of giraph.

        Maybe we can add a list of aggregators, instead of just one later.

        IMHO, the not working pagerank is a blocker.

        Show
        Thomas Jungblut added a comment - Added aggregators, not tested more than a localrunner with a single task. The error accumulation works, however I have somehow broken pagerank: Sum is: 0.7475461787313158 I have to investigate it further, but I have seen that the testcase was useless anyways: https://svn.apache.org/repos/asf/incubator/hama/trunk/examples/src/test/java/org/apache/hama/examples/PageRankTest.java In the current trunk version it just checks if the sum is in between 0 and 1, which is always the case. This does really not say that pagerank is working correct. Seems that this got broken when we translated the old pagerank. Should we put the aggregators into another package? Same with the combiners? BTW I added my connected component special vertex It is much more cleaner than that of giraph. Maybe we can add a list of aggregators, instead of just one later. IMHO, the not working pagerank is a blocker.
        Hide
        Thomas Jungblut added a comment -

        It was the wrong patch :/

        Anyways, my "hack" seems to be not very accurate for those things, I think we should add aggregators in this issue and let it be done via a master task that does that aggregation.

        So the graph job has a list of aggregators, and after compute the value of each vertex will be "observed" or aggregated and after all vertices has been iterated, we are messaging a master task with the aggregated values.
        It applies the aggregation once again and then make it available to the vertices in the next superstep again.

        This way we could also handle the number of updated vertices, it will then just be a special SumAggregator.

        The big fail in this scenario is that we need an additional sync operation for the master task. I will need some time, maybe we can get the same without it.

        Show
        Thomas Jungblut added a comment - It was the wrong patch :/ Anyways, my "hack" seems to be not very accurate for those things, I think we should add aggregators in this issue and let it be done via a master task that does that aggregation. So the graph job has a list of aggregators, and after compute the value of each vertex will be "observed" or aggregated and after all vertices has been iterated, we are messaging a master task with the aggregated values. It applies the aggregation once again and then make it available to the vertices in the next superstep again. This way we could also handle the number of updated vertices, it will then just be a special SumAggregator. The big fail in this scenario is that we need an additional sync operation for the master task. I will need some time, maybe we can get the same without it.
        Hide
        Thomas Jungblut added a comment -

        Too early in the morning and I guess it will not work like I think it would..
        I have a shot later again.

        Show
        Thomas Jungblut added a comment - Too early in the morning and I guess it will not work like I think it would.. I have a shot later again.
        Hide
        Thomas Jungblut added a comment -

        BTW where is the activeness of a vertex? this is totally not included currently.
        The Pregel API don't include this kind of convergence, this needs supervision of the graph and can't be handled so good with it. Therefore "aggregators" should be used.

        Show
        Thomas Jungblut added a comment - BTW where is the activeness of a vertex? this is totally not included currently. The Pregel API don't include this kind of convergence, this needs supervision of the graph and can't be handled so good with it. Therefore "aggregators" should be used.
        Hide
        Thomas Jungblut added a comment - - edited

        We can use the logic in our SuperstepBSP's Superstep class where you can halt the computation by overriding a boolean-returning function.

        Anyways, we could translate this to the superstep bsp right? So once we have fault-tolerance working we can test it with the graph algorithms.

        Show
        Thomas Jungblut added a comment - - edited We can use the logic in our SuperstepBSP's Superstep class where you can halt the computation by overriding a boolean-returning function. Anyways, we could translate this to the superstep bsp right? So once we have fault-tolerance working we can test it with the graph algorithms.

          People

          • Assignee:
            Thomas Jungblut
            Reporter:
            praveen sripati
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development