  Giraph / GIRAPH-552

HBaseVertexInputFormat is ignoring region locality on input superstep

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Abandoned
    • Affects Version/s: 1.0.0
    • Fix Version/s: None
    • Component/s: graph
    • Labels:
      None

      Description

      During the input superstep, data for different regions is needlessly transferred across the network instead of each worker giving preference to machine-local regions when they are available.

      On moderate to large graphs (5 million vertices, 10 million edges) we've noticed this causing resource contention, ZooKeeper timeouts, and other issues that often freeze the input superstep until the tasks are manually killed on the task tracker hosts.

      This doesn't happen for TextVertexInputFormat subclasses. Perhaps it has to do with each instance of the HBaseVertexInputFormat subclass delegating to a private TableInputFormat instance.
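
      For illustration only, the locality preference described above could look like the sketch below. It assumes each split reports the hosts that store its data (in Hadoop, InputSplit.getLocations() exposes this). LocalitySketch, Split, and pickSplit are hypothetical names and are not part of Giraph or HBase; this is not the actual split-assignment code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not Giraph's actual split-assignment code. */
final class LocalitySketch {

    /** Stand-in for an input split: a name plus the hosts that store its data. */
    static final class Split {
        final String name;
        final List<String> hosts;

        Split(String name, String... hosts) {
            this.name = name;
            this.hosts = Arrays.asList(hosts);
        }
    }

    /**
     * Returns the first split hosted on workerHost, or the first split
     * in the list when none is local. Returns null for an empty list.
     */
    static Split pickSplit(List<Split> unclaimed, String workerHost) {
        for (Split s : unclaimed) {
            if (s.hosts.contains(workerHost)) {
                return s; // machine-local region: data can be read locally
            }
        }
        // No local split left: fall back to a remote one.
        return unclaimed.isEmpty() ? null : unclaimed.get(0);
    }

    public static void main(String[] args) {
        List<Split> splits = Arrays.asList(
            new Split("region-a", "host1"),
            new Split("region-b", "host2"));
        System.out.println(pickSplit(splits, "host2").name);
        System.out.println(pickSplit(splits, "host9").name);
    }
}
```

      A worker on host2 would claim region-b before any remote region. If the locations reported by the delegated TableInputFormat never reach this kind of check, every split looks remote, which would match the behavior reported here.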

            People

            • Assignee:
              Unassigned
            • Reporter:
              Brian Femiano (bfem)
            • Votes:
              0
            • Watchers:
              1
