GIRAPH-128

RPC port from BasicRPCCommunications should be only a starting port, and retried

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.1.0
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels:
      None

      Description

      Currently Giraph computes the RPC port as a base port plus the task partition. This doesn't work well when multiple Giraph jobs run simultaneously in the same Hadoop cluster (port conflict). At the same time, it is nice to use this simple algorithm because it makes it very easy to debug problems (you can find the troublesome mapper from the RPC port number). I will be proposing a simple scheme to retry with another port. I will round the total number of mappers up to the next power of 10 (let's call that number Z). Then I will increment the port number by Z after each failed attempt, retrying up to 20 tries. If you have enough ports, this scheme would guarantee that up to 20 mappers / node would be supported. It should be sufficient for most clusters. At the same time, we still maintain the easy debugging method, since it's still easy to figure out the mapper partition from the port (port % Z = map partition).
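
      The retry scheme described above can be sketched roughly as follows. This is only an illustration, not the code from the attached patches: the class name PortRetrySketch, its method names, and the use of java.net.ServerSocket for binding are all assumptions made for the example. It also assumes the base port is a multiple of Z, which is what makes port % Z recover the map partition.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical sketch of the proposed scheme; not the actual Giraph code. */
final class PortRetrySketch {
    /** Maximum number of ports to try, as proposed in the description. */
    static final int MAX_TRIES = 20;
    /** Highest valid TCP port number. */
    static final int MAX_PORT = 65535;

    /** Returns Z: the smallest power of 10 that is >= totalMappers. */
    static int roundUpToPowerOfTen(int totalMappers) {
        int z = 1;
        while (z < totalMappers) {
            z *= 10;
        }
        return z;
    }

    /** Port tried on the given attempt (attempt 0 is the original port). */
    static int candidatePort(int basePort, int partition, int z, int attempt) {
        return basePort + partition + attempt * z;
    }

    /**
     * Tries up to MAX_TRIES ports, stepping by Z after each failure.
     * Assumes basePort is a multiple of Z so that port % Z == partition.
     */
    static ServerSocket bindWithRetry(int basePort, int partition,
                                      int totalMappers) throws IOException {
        int z = roundUpToPowerOfTen(totalMappers);
        IOException lastFailure = null;
        for (int attempt = 0; attempt < MAX_TRIES; attempt++) {
            int port = candidatePort(basePort, partition, z, attempt);
            if (port > MAX_PORT) {
                break; // ran out of valid port numbers
            }
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                lastFailure = e; // port likely in use; try the next one
            }
        }
        throw new IOException("No free RPC port found for partition "
            + partition + " within " + MAX_TRIES + " tries", lastFailure);
    }
}
```

      For example, with 37 mappers Z is 100, so with a base port of 30000 the mapper for partition 7 would try 30007, 30107, 30207, and so on, and every one of those ports still ends in 07.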

        Attachments

        1. GIRAPH-128.4.patch
          12 kB
          Avery Ching
        2. GIRAPH-128.3.patch
          12 kB
          Avery Ching
        3. GIRAPH-128.2.patch
          10 kB
          Avery Ching

            People

            • Assignee:
              aching Avery Ching
              Reporter:
              aching Avery Ching
            • Votes:
              0
              Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved: