  1. REEF (Retired)
  2. REEF-568

Work around the federated YARN node reports problem

Details

    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • None
    • 0.13
    • None
    • None

    Description

      When using REEF with YARN Federation, there is a problem with the node reports that YARN sends us.
      Right after initializing our YARN client library (hadoop-yarn-client-2.4.0), we ask for the RUNNING nodes in the cluster in order to populate our own Resource Catalog.
      YARN replies with the nodes of an arbitrary sub-cluster: sometimes the correct one (where the containers will be placed), and sometimes a different one. As a result, the application fails intermittently.
      For example, we populate our Resource Catalog with the nodes of sub-cluster 1, but the allocations are actually made on sub-cluster 2. The allocated nodes are then missing from the catalog, so we fail.

      We need a workaround for this issue, as the YARN folks are not sure when they will have the right fix.
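
      The failure mode described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: ResourceCatalogSketch, its methods, and the node names are hypothetical and are not REEF's actual classes. In the real driver the initial list would come from the YARN client's node report query (presumably YarnClient.getNodeReports(NodeState.RUNNING)), which is replaced here by a plain collection of node ids. The lookupOrRegister method shows one possible shape for a workaround, registering a node the first time an allocation mentions it; the issue itself does not say which workaround was chosen.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a resource catalog; not REEF's actual class. */
final class ResourceCatalogSketch {
    // Maps a node id to the origin we recorded for it (illustrative only).
    private final Map<String, String> nodes = new HashMap<>();

    /** Populates the catalog from an initial node report. */
    void populate(Collection<String> nodeIds, String origin) {
        for (String id : nodeIds) {
            nodes.put(id, origin);
        }
    }

    /** Strict lookup: empty if the node was not in the initial report. */
    Optional<String> lookup(String nodeId) {
        return Optional.ofNullable(nodes.get(nodeId));
    }

    /**
     * Lenient lookup (assumed workaround): a node that shows up in an
     * allocation but not in the catalog is registered on first sight.
     */
    String lookupOrRegister(String nodeId) {
        return nodes.computeIfAbsent(nodeId, id -> "registered-on-allocation");
    }
}
```

      With a catalog populated from sub-cluster 1 only, a strict lookup of a sub-cluster 2 node comes back empty, which corresponds to the failure described above. The lenient lookup adds the node instead of failing.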

            People

              Assignee: Ignacio Cano (nachoacano)
              Reporter: Ignacio Cano (nachoacano)
              Votes: 0
              Watchers: 1

              Dates

                Created:
                Updated:
                Resolved: