Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: REEF Driver
    • Labels:

      Description

      For a long-running service deployed on REEF, we need a way of consistently addressing the Driver for the service.

      Consider deploying it on YARN. YARN will start the Driver in an arbitrary YARN container, so the Driver has no well-known hostname that clients can use as an endpoint. Ideally, such an endpoint should remain constant, even across service restarts. (This is analogous to Hadoop MR being configured with fs.defaultFS=hdfs://{address}.)

      To address this problem, we need a service for registering and querying the Driver location. To make this work, we need two interfaces, one at the Driver and one at the Client:

      • Register the Driver location at Driver
      • Query the Driver location at the client side with a well-known name

      We can inject an appropriate implementation depending on the environment. For example, we can use the YARN RM. If ZooKeeper is deployed, we can use ZooKeeper instead.
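
As a rough illustration of the two operations above (register at the Driver, query at the client), here is a minimal Python sketch. The names `DriverLocationRegistry` and `InMemoryRegistry` are hypothetical and not part of REEF; the in-memory backend only stands in for an environment-specific one such as a YARN RM or ZooKeeper implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DriverLocationRegistry(ABC):
    """Hypothetical name service mapping a well-known name to a Driver address."""

    @abstractmethod
    def register(self, name: str, address: str) -> None:
        """Called by the Driver once it knows its own host and port."""

    @abstractmethod
    def lookup(self, name: str) -> Optional[str]:
        """Called by the client; returns None if no Driver is registered."""


class InMemoryRegistry(DriverLocationRegistry):
    """Stand-in backend for local testing only (an assumption of this sketch)."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, name: str, address: str) -> None:
        # A restarted Driver overwrites its previous entry, so the
        # well-known name stays stable while the address changes.
        self._locations[name] = address

    def lookup(self, name: str) -> Optional[str]:
        return self._locations.get(name)
```

Because clients depend only on the abstract interface, the backend can be swapped per environment without changing client code.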

      One concrete approach Markus Weimer suggested is the following.

      The YARN protocol allows each application master to register a "tracking URL". That is an HTTP endpoint unique to each AM. You can query for that URL using the YARN protocol, or even its REST variant in newer versions of YARN. The flow would be like this:

      Your client library queries YARN for all running applications and finds the one with the right JobID (we currently form them as "reef-[DriverIdentifier]", but we could drop the "reef-").

      It then queries for the URL of that Application Master and connects to it. This initial HTTP(S) call could, for example, return the connection details for a TCP socket connection.
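
The client-side flow above could look roughly like the following Python sketch, which uses the ResourceManager REST endpoint `/ws/v1/cluster/apps`. It assumes the "reef-[DriverIdentifier]" identifier is exposed as the application's `name` field, and that the RM web address (typically port 8088) is known to the client. The function names are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def find_tracking_url(apps_response: dict, driver_identifier: str) -> Optional[str]:
    """Pick the tracking URL of the app named "reef-<driver_identifier>".

    Assumes the REEF job identifier appears as the YARN application name.
    """
    job_name = "reef-" + driver_identifier
    # The RM may return {"apps": null} when nothing matches, so guard each level.
    apps = (apps_response.get("apps") or {}).get("app") or []
    for app in apps:
        if app.get("name") == job_name:
            return app.get("trackingUrl")
    return None


def query_running_apps(rm_address: str) -> dict:
    """Fetch all RUNNING applications from the RM REST API as a dict."""
    url = "http://" + rm_address + "/ws/v1/cluster/apps?states=RUNNING"
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A client would call `find_tracking_url(query_running_apps("rm-host:8088"), "my-service")` and then issue its first HTTP(S) request against the returned URL; the host and service names here are placeholders.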

      We have support for doing this in REEF. The module "reef-webserver" contains an implementation. It will register itself with YARN and you can register additional HTTP handlers with it. When you do, please don't use the "/reef/" namespace, as we use that for our own REEF REST APIs.

        Activity

        tdbaker Tobin Baker added a comment -

        +1
        This could be useful for Myria (a parallel database deployed as a fault-tolerant service on EC2, with a long-running Driver).


          People

          • Assignee:
            Unassigned
            Reporter:
            bgchun Byung-Gon Chun
          • Votes:
            1
            Watchers:
            1
