  1. Giraph
  2. GIRAPH-250

Let workers contending for InputSplits during INPUT_SUPERSTEP guess better, choose quicker.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.0
    • Component/s: bsp, graph, zookeeper
    • Labels:
      None

      Description

      The job logs make it clear that workers scanning for the master-created znodes that mark an InputSplit as available to claim (and read) all start with very similar lists of znode names, and all iterate from index 0 through the list at the same time.

      What you see in the logs is a long run of misses, followed finally by a hit somewhere. If each worker still iterates the whole list but starts from a different position (see the patch; it's a simple change that takes the hash code of the worker hostname plus its index, modulo the size of the list of claimable splits), the workers (mostly) begin at different parts of the input split list. This lowers contention dramatically and lets every worker claim at least its first input split more quickly. This seems to work very well so far.
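
      To illustrate the idea, here is a minimal sketch of a hash-based starting offset. It is not the code from the attached patches: the class name SplitScanOrder, the methods startIndex and rotatedOrder, and the example hostnames are all hypothetical, and it assumes "hostname + index" means adding the worker's integer index to the hostname's hash code before taking the modulus.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not taken from the GIRAPH-250 patches.
final class SplitScanOrder {

    // Pick a starting position in the split list from the worker's hostname and index.
    // Assumption: the offset is (hostname.hashCode() + workerIndex) mod splitCount.
    static int startIndex(String hostname, int workerIndex, int splitCount) {
        if (splitCount <= 0) {
            throw new IllegalArgumentException("splitCount must be positive");
        }
        // Use long arithmetic so the sum cannot overflow, and floorMod so a
        // negative hash code still yields an index in [0, splitCount).
        long seed = (long) hostname.hashCode() + workerIndex;
        return (int) Math.floorMod(seed, (long) splitCount);
    }

    // Return the same splits in wrap-around order beginning at 'start', so a
    // worker still visits every split exactly once.
    static <T> List<T> rotatedOrder(List<T> splits, int start) {
        int n = splits.size();
        List<T> order = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            order.add(splits.get((start + i) % n));
        }
        return order;
    }

    public static void main(String[] args) {
        // Placeholder znode names; real paths are created by the master.
        List<String> splits = Arrays.asList("split-0", "split-1", "split-2", "split-3", "split-4");
        String[] hosts = {"worker-a.example.com", "worker-b.example.com"};
        for (int i = 0; i < hosts.length; i++) {
            int start = startIndex(hosts[i], i, splits.size());
            System.out.println(hosts[i] + " scans " + rotatedOrder(splits, start));
        }
    }
}
```

      Two workers can still land on the same offset, so this only reduces contention on the first claim attempts; it does not remove it. Each worker also still has to check whether a split is already claimed before reading it.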

      Passes mvn verify, etc.

        Attachments

        1. GIRAPH-250-3.patch
          3 kB
          Eli Reisman
        2. GIRAPH-250-2.patch
          3 kB
          Eli Reisman
        3. GIRAPH-250-1.patch
          2 kB
          Eli Reisman

            People

            • Assignee:
              initialcontext Eli Reisman
              Reporter:
              initialcontext Eli Reisman
            • Votes:
              0
              Watchers:
              4
