  1. REEF
  2. REEF-568

Work around the federated YARN node reports problem

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.13
    • Component/s: None
    • Labels:
      None

      Description

      When REEF runs on a federated YARN cluster, the node reports that YARN sends us are unreliable.
      Just after initializing our YARN client library (hadoop-yarn-client-2.4.0), we ask for the RUNNING nodes in the cluster to populate our own Resource Catalog.
      YARN replies with the nodes of an arbitrary sub-cluster: sometimes the correct one (where the containers will be placed), and sometimes a different one. As a result, the application fails intermittently.
      For example, we populate our Resource Catalog with nodes in sub-cluster 1, but the allocations are actually made on sub-cluster 2, so we fail.

      We need a workaround for this issue, as the YARN folks are not sure when they will have a proper fix.
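
      The ticket does not say what the workaround should look like. One possible direction, sketched below under stated assumptions, is to treat the initial node reports only as a seed and to register any unknown host lazily when a container is actually allocated on it, instead of failing. The class LazyResourceCatalog, its methods, and the placeholder rack name are hypothetical and are not part of REEF or YARN; the sketch uses plain Java collections and does not call the YARN client at all.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical catalog that tolerates node reports from the wrong sub-cluster.
 * Not REEF's actual Resource Catalog; names and behavior are assumptions.
 */
final class LazyResourceCatalog {

  /** Placeholder rack used when an allocation carries no rack information (assumption). */
  static final String DEFAULT_RACK = "/default-rack";

  /** Maps host name to rack name. */
  private final Map<String, String> rackByHost = new ConcurrentHashMap<>();

  /** Seeds the catalog from the initial node reports, which may describe the wrong sub-cluster. */
  void seed(final Map<String, String> reportedNodes) {
    rackByHost.putAll(reportedNodes);
  }

  /**
   * Called when a container is allocated on a host. An unknown host is added
   * to the catalog instead of being treated as an error.
   *
   * @return true if the host was not in the catalog and has now been added
   */
  boolean onContainerAllocated(final String host, final String rackHint) {
    final String rack = rackHint == null ? DEFAULT_RACK : rackHint;
    return rackByHost.putIfAbsent(host, rack) == null;
  }

  /** Returns the rack recorded for a host, if the host is known. */
  Optional<String> rackOf(final String host) {
    return Optional.ofNullable(rackByHost.get(host));
  }

  /** Returns the number of hosts currently in the catalog. */
  int size() {
    return rackByHost.size();
  }
}
```

      With this shape, a catalog seeded with sub-cluster 1 nodes would still accept an allocation on a sub-cluster 2 host: the first onContainerAllocated call for that host adds it, and later calls leave the existing entry untouched. Stale entries from the wrong sub-cluster remain in the catalog, which may or may not be acceptable.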

              People

              • Assignee:
                nachoacano Ignacio Cano
                Reporter:
                nachoacano Ignacio Cano