Giraph / GIRAPH-14

Support for the Facebook Hadoop branch

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.1.0
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels: None

      Description

      I've been working with Joe Xie on getting Giraph to run on the Facebook Hadoop branch. He verified today that the examples worked on their cluster. I need to clean up my changes a little and will then submit a diff. As a side note, does anyone know how we can get Hudson support for Giraph?

      1. facebook.txt (10 kB, Avery Ching)
      2. facebook2.txt (10 kB, Avery Ching)
      3. facebook3.patch (12 kB, Avery Ching)

          Activity

          Hyunsik Choi added a comment -

          The link below is for Hudson:
          http://wiki.apache.org/general/Hudson

          I'll create another issue about it.

          Avery Ching added a comment -

          Thanks Hyunsik.

          Avery Ching added a comment -

          Supports the Facebook version of Hadoop with mvn -Dhadoop=facebook -Dhadoop.jar.path=<path to jar> <mvn command>
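
The build invocation described above can be sketched as a small shell helper. This is only an illustration: the `-Dhadoop=facebook` and `-Dhadoop.jar.path` properties come from the comment above, while the function name `giraph_fb_mvn_cmd`, its argument order, and the jar-existence check are hypothetical additions. The helper prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of Giraph): prints the Maven command for a
# build against the Facebook Hadoop branch.
# Only -Dhadoop=facebook and -Dhadoop.jar.path are taken from this issue.
giraph_fb_mvn_cmd() {
    # Require a jar path plus at least one Maven goal (e.g. "package").
    if [ "$#" -lt 2 ]; then
        echo "usage: giraph_fb_mvn_cmd <path to jar> <mvn command>..." >&2
        return 2
    fi
    jar_path=$1
    shift
    # Fail early if the Hadoop jar is missing, instead of failing inside Maven.
    if [ ! -f "$jar_path" ]; then
        echo "Hadoop jar not found: $jar_path" >&2
        return 1
    fi
    # Remaining arguments are passed through as the Maven goals.
    echo "mvn -Dhadoop=facebook -Dhadoop.jar.path=$jar_path $*"
}
```

For example, `giraph_fb_mvn_cmd /path/to/hadoop-0.20.1-core.jar package` prints the full `mvn` command line for a package build, provided the jar exists at that (placeholder) path.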

          Joe Xie added a comment -

          Great job, Avery! I can compile on my side too, and it works with the examples on the cluster. I still get some test errors during mvn package and mvn test; let me know if you have any updates.
          Joe Xie

          Tests in error:
          testSuperstepBalancer(org.apache.giraph.TestVertexRangeBalancer)
          testBspCheckpoint(org.apache.giraph.TestManualCheckpoint)
          testSingleFault(org.apache.giraph.TestAutoCheckpoint)
          testInstantiateVertex(org.apache.giraph.TestBspBasic)
          testLocalJobRunnerConfig(org.apache.giraph.TestBspBasic)
          testBspFail(org.apache.giraph.TestBspBasic)
          testBspSuperStep(org.apache.giraph.TestBspBasic)
          testBspMsg(org.apache.giraph.TestBspBasic)
          testEmptyVertexInputFormat(org.apache.giraph.TestBspBasic)
          testBspCombiner(org.apache.giraph.TestBspBasic)
          testBspPageRank(org.apache.giraph.TestBspBasic)
          testBspShortestPaths(org.apache.giraph.TestBspBasic)
          testMutateGraph(org.apache.giraph.TestMutateGraphVertex)
          testContinue(org.apache.giraph.TestJsonBase64Format)
          testMatchingType(org.apache.giraph.TestVertexTypes)
          testDerivedMatchingType(org.apache.giraph.TestVertexTypes)
          testDerivedInputFormatType(org.apache.giraph.TestVertexTypes)
          testMismatchingVertex(org.apache.giraph.TestVertexTypes)
          testMismatchingCombiner(org.apache.giraph.TestVertexTypes)
          testJsonBase64FormatType(org.apache.giraph.TestVertexTypes)

          Joe Xie added a comment -

          Thank you, Avery! Maybe my test file in the trunk is obsolete; let me know whether you can reproduce the errors.

          Avery Ching added a comment -

          It's good to hear that you can run it on your cluster. As for the unit tests, that is strange. I was able to reproduce the same failures and will look into a fix.

          Avery Ching added a comment -

          Looks like I needed to change the groupId so that the right dependencies are pulled in for Hadoop. Please try this one out. The unit tests all passed for me.

          (e.g. mvn -Dhadoop=facebook -Dhadoop.jar.path=/Users/aching/Desktop/hadoop-0.20.1-core.jar package)

          Joe Xie added a comment -

          Thank you, Avery! The unit tests passed for me too.

          Avery Ching added a comment -

          Great to hear it! When one of the committers gets a chance to review, I can commit.

          Jakob Homan added a comment -

          I'm not up to date on FB's distribution. Is it available to the public? If we're going to support this, should instructions be given in the README, as there are for the non-secure build option? What's the long-term story for these API support #ifdefs? At the moment it's a clever solution to a vexing problem, but longer term it would be good to have a solution that doesn't leave code as comments.

          Avery Ching added a comment -

          I believe that Facebook's distro is online (https://github.com/facebook/hadoop-20-warehouse). The long-term story is to factor the version-specific parts out into modules and then compile them based on the user profile. Then we don't have to "munge" anything anymore. At least that's what I've thought of for now; I'm open to better solutions. Pre-processing will become unmaintainable if we have to support every version of Hadoop. That being said, we should support the big customers of Giraph, and that likely includes Facebook as well.

          I'll add instructions to the README and submit a new patch.

          Avery Ching added a comment -

          Updated with README instructions for building with the Facebook Hadoop release.

          Jakob Homan added a comment -

          Applied the patch and verified everything still works as expected on the regular build. +1.

          Avery Ching added a comment -

          Committed, with changelog addition.


            People

            • Assignee: Avery Ching
            • Reporter: Avery Ching
            • Votes: 0
            • Watchers: 1
