GIRAPH-244

Vertex API redesign

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      This is an effort to rationalize the Giraph API. I've put together a few issues that we've talked about lately. I'm focusing on making Giraph development even more intuitive and less error-prone, and fixing a few potential sources of bugs.

      I'm sorry this is a big patch, but most of these issues are intertwined, and I think a single patch might be easier to review and integrate.

      Here's an account of the changes:

      Vertex API:

      • Renamed BasicVertex to Vertex (as I understand, we used to have both and then Vertex was removed).
      • Switched to Iterables instead of Iterators for both edges and messages. This makes code more concise for both implementors (no need to call .iterator() on collections) and users (can use foreach syntax). See also GIRAPH-221.
      • Added SimpleVertex and SimpleMutableVertex classes, where there are no edge values and the iterable to be implemented is getNeighbors(). We don’t have multiple inheritance, so the only way I could think of was to have SimpleVertex extend Vertex, SimpleMutableVertex extend MutableVertex, and duplicate the code for the edges iterables.
        Also, due to type erasure, one still has to deal with Edge objects in SimpleMutableVertex#initialize. Overall I think this is still an improvement over the current situation.
      • Added id and value fields to the base Vertex class. All other classes were either writing the same boilerplate again and again, or using primitive fields and then creating Writables on the fly (inefficient; there was even a TODO about that). If there are any actually useful customizations here, I’ve yet to see them.
        Also removed redundant “Vertex” from getters/setters (compare vertex.getId() with vertex.getVertexId()).
      • Made halt a private field, and added a wakeUp() method to re-activate a vertex. isHalted()/voteToHalt()/wakeUp() are just more semantically-charged getter/setters.
      • Renamed number of vertices/edges in graph to getTotalNum*. The previous naming (getNumEdges) was arguably confusing. If this one sucks too, please suggest a better one.
      • Default implementations of hasEdge(), getEdgeValue(), getNumEdges(), readFields(), write(), toString(): the implementor can still optimize when there is a good opportunity. Currently we are duplicating a lot of code (see GIRAPH-238) and potentially introducing bugs (see GIRAPH-239).
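
      To illustrate the shape of the Vertex API changes listed above, here is a minimal, self-contained sketch. It is not the code in the patch: SketchVertex, SketchEdge and ListBackedVertex are hypothetical names, and plain Java type parameters stand in for Giraph's Writable types. It shows id/value fields in the base class, the halt flag behind voteToHalt()/wakeUp()/isHalted(), edges exposed as an Iterable, and default hasEdge()/getEdgeValue()/getNumEdges() built on that Iterable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical edge type for this sketch (not Giraph's Edge class).
final class SketchEdge<I, E> {
  private final I targetVertexId;
  private final E value;

  SketchEdge(I targetVertexId, E value) {
    this.targetVertexId = targetVertexId;
    this.value = value;
  }

  I getTargetVertexId() { return targetVertexId; }
  E getValue() { return value; }
}

// Hypothetical stand-in for the redesigned base class.
abstract class SketchVertex<I, V, E> {
  private I id;          // stored once in the base class
  private V value;       // no per-subclass boilerplate
  private boolean halt;  // private; only reachable through the methods below

  void initialize(I id, V value) {
    this.id = id;
    this.value = value;
  }

  I getId() { return id; }
  V getValue() { return value; }
  void setValue(V value) { this.value = value; }

  void voteToHalt() { halt = true; }
  void wakeUp() { halt = false; }
  boolean isHalted() { return halt; }

  // The one method a subclass has to provide.
  abstract Iterable<SketchEdge<I, E>> getEdges();

  // Default implementations: linear scans that subclasses may override
  // when their storage allows something faster.
  boolean hasEdge(I targetVertexId) {
    for (SketchEdge<I, E> edge : getEdges()) {
      if (edge.getTargetVertexId().equals(targetVertexId)) {
        return true;
      }
    }
    return false;
  }

  E getEdgeValue(I targetVertexId) {
    for (SketchEdge<I, E> edge : getEdges()) {
      if (edge.getTargetVertexId().equals(targetVertexId)) {
        return edge.getValue();
      }
    }
    return null; // no such edge
  }

  int getNumEdges() {
    int count = 0;
    for (SketchEdge<I, E> ignored : getEdges()) {
      count++;
    }
    return count;
  }
}

// A subclass only supplies storage and the iterable; everything else is inherited.
final class ListBackedVertex<I, V, E> extends SketchVertex<I, V, E> {
  private final List<SketchEdge<I, E>> edges = new ArrayList<>();

  void addEdge(I targetVertexId, E edgeValue) {
    edges.add(new SketchEdge<>(targetVertexId, edgeValue));
  }

  @Override
  Iterable<SketchEdge<I, E>> getEdges() {
    return edges; // a List is already an Iterable, no .iterator() call needed
  }
}
```

      Because getEdges() returns an Iterable, both the defaults and user code can use the foreach syntax directly.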

      HashMapVertex:

      • Switched representation from Map<I, Edge<I, E>> to Map<I, E> (GIRAPH-242)
      • Only override methods that can be optimized.
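
      As a rough illustration of the Map<I, E> representation, the sketch below keeps edge values directly in a map keyed by target vertex id, so the lookup methods are the ones worth overriding. MapBackedEdges is a hypothetical name, and Map.Entry stands in for an edge object; the actual HashMapVertex in the patch may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of storing edges as Map<I, E> (target id -> edge value).
final class MapBackedEdges<I, E> {
  private final Map<I, E> edgeValues = new HashMap<>();

  void addEdge(I targetVertexId, E edgeValue) {
    edgeValues.put(targetVertexId, edgeValue);
  }

  // These three benefit from the map: expected constant time instead of a scan.
  boolean hasEdge(I targetVertexId) {
    return edgeValues.containsKey(targetVertexId);
  }

  E getEdgeValue(I targetVertexId) {
    return edgeValues.get(targetVertexId);
  }

  int getNumEdges() {
    return edgeValues.size();
  }

  // Iteration exposes (target id, value) pairs without a separate Edge per entry.
  Iterable<Map.Entry<I, E>> getEdges() {
    return edgeValues.entrySet();
  }
}
```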

      EdgeListVertex:

      • Switched representation from two sorted lists to a single list of Edge<I, E> (see GIRAPH-243). The main benefit is that iterating over edges (target id and value) is now O(n) instead of O(n log n). Mutations are still slow and should generally be discouraged.
      • Only override methods that can be optimized.
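
      The single-list representation can be sketched along these lines. EdgeListSketch is a hypothetical name and Map.Entry again stands in for Edge<I, E>; this is an illustration under those assumptions, not the EdgeListVertex from the patch. Iteration is one pass over the list, while removing an edge requires a linear scan, which is why mutations remain slow.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of storing edges as one list of (target id, value) pairs.
final class EdgeListSketch<I, E> {
  private final List<Map.Entry<I, E>> edges = new ArrayList<>();

  void addEdge(I targetVertexId, E edgeValue) {
    edges.add(new SimpleImmutableEntry<>(targetVertexId, edgeValue));
  }

  // Iterating yields id and value together in a single O(n) pass.
  Iterable<Map.Entry<I, E>> getEdges() {
    return edges;
  }

  // Mutation: linear scan to find the edge, then removal from the list.
  // Returns the removed edge value, or null if there was no such edge.
  E removeEdge(I targetVertexId) {
    for (Iterator<Map.Entry<I, E>> it = edges.iterator(); it.hasNext();) {
      Map.Entry<I, E> edge = it.next();
      if (edge.getKey().equals(targetVertexId)) {
        it.remove();
        return edge.getValue();
      }
    }
    return null;
  }
}
```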

      Small nits:

      • Our code conventions say we should try to avoid abbreviations, so I eliminated a few (req -> request, msg -> message).
      • Uniformly refer to the endpoint of an edge as targetVertex (before we had a mix of destVertex and targetVertex).
      • You will notice some rearranged imports. That’s just my IDE trying to be helpful (see GIRAPH-230).

      Attachments

        1. 0001-GIRAPH-244-regression-for-non-secure-Hadoop.patch
          1 kB
          Jaeho Shin
        2. GIRAPH-244.patch
          282 kB
          Alessandro Presta
        3. GIRAPH-244.patch
          279 kB
          Alessandro Presta
        4. GIRAPH-244.patch
          267 kB
          Alessandro Presta
        5. GIRAPH-244.patch
          265 kB
          Alessandro Presta
        6. GIRAPH-244.patch
          269 kB
          Alessandro Presta
        7. GIRAPH-244.patch
          278 kB
          Alessandro Presta
        8. GIRAPH-244.patch
          257 kB
          Alessandro Presta
        9. GIRAPH-244.patch
          254 kB
          Alessandro Presta
        10. GIRAPH-244.patch
          245 kB
          Alessandro Presta
        11. GIRAPH-244.patch
          244 kB
          Alessandro Presta
        12. GIRAPH-244.patch
          236 kB
          Alessandro Presta
        13. GIRAPH-244.patch
          242 kB
          Alessandro Presta
        14. GIRAPH-244.patch
          242 kB
          Alessandro Presta
        15. GIRAPH-244.patch
          238 kB
          Alessandro Presta
        16. GIRAPH-244.patch
          237 kB
          Alessandro Presta
        17. GIRAPH-244.patch
          237 kB
          Alessandro Presta

            People

              apresta Alessandro Presta
              Votes: 0
              Watchers: 10
