Hadoop Common
  1. Hadoop Common
  2. HADOOP-1985

Abstract node to switch mapping into a topology service class used by namenode and jobtracker

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Hide
      This issue introduces rack awareness for map tasks. It also moves the rack resolution logic to the central servers - NameNode & JobTracker. The administrator can specify a loadable class given by topology.node.switch.mapping.impl to specify the class implementing the logic for rack resolution. The class must implement a method - resolve(List<String> names), where names is the list of DNS-names/IP-addresses that we want resolved. The return value is a list of resolved network paths of the form /foo/rack, where rack is the rackID where the node belongs to and foo is the switch where multiple racks are connected, and so on. The default implementation of this class is packaged along with hadoop and points to org.apache.hadoop.net.ScriptBasedMapping and this class loads a script that can be used for rack resolution. The script location is configurable. It is specified by topology.script.file.name and defaults to an empty script. In the case where the script name is empty, /default-rack is returned for all dns-names/IP-addresses. The loadable topology.node.switch.mapping.impl provides administrators fleixibilty to define how their site's node resolution should happen.
      For mapred, one can also specify the level of the cache w.r.t the number of levels in the resolved network path - defaults to two. This means that the JobTracker will cache tasks at the host level and at the rack level.
      Known issue: the task caching will not work with levels greater than 2 (beyond racks). This bug is tracked in HADOOP-3296.
      Show
      This issue introduces rack awareness for map tasks. It also moves the rack resolution logic to the central servers - NameNode & JobTracker. The administrator can specify a loadable class given by topology.node.switch.mapping.impl to specify the class implementing the logic for rack resolution. The class must implement a method - resolve(List<String> names), where names is the list of DNS-names/IP-addresses that we want resolved. The return value is a list of resolved network paths of the form /foo/rack, where rack is the rackID where the node belongs to and foo is the switch where multiple racks are connected, and so on. The default implementation of this class is packaged along with hadoop and points to org.apache.hadoop.net.ScriptBasedMapping and this class loads a script that can be used for rack resolution. The script location is configurable. It is specified by topology.script.file.name and defaults to an empty script. In the case where the script name is empty, /default-rack is returned for all dns-names/IP-addresses. The loadable topology.node.switch.mapping.impl provides administrators fleixibilty to define how their site's node resolution should happen. For mapred, one can also specify the level of the cache w.r.t the number of levels in the resolved network path - defaults to two. This means that the JobTracker will cache tasks at the host level and at the rack level. Known issue: the task caching will not work with levels greater than 2 (beyond racks). This bug is tracked in HADOOP-3296 .

      Description

      In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker. Currently the namenode asks the data nodes for this info and they run a local script to answer this question. In our environment and others that I know of there is no reason to push this to each node. It is easier to maintain a centralized script that maps node DNS names to switch strings.

      I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings. We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes. We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.

      Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

      1. jobinprogress.patch
        11 kB
        Devaraj Das
      2. 1985.v9.patch
        82 kB
        Devaraj Das
      3. 1985.v6.patch
        83 kB
        Devaraj Das
      4. 1985.v5.patch
        78 kB
        Devaraj Das
      5. 1985.v4.patch
        78 kB
        Devaraj Das
      6. 1985.v3.patch
        81 kB
        Devaraj Das
      7. 1985.v25.patch
        101 kB
        Devaraj Das
      8. 1985.v24.patch
        101 kB
        Devaraj Das
      9. 1985.v23.patch
        100 kB
        Devaraj Das
      10. 1985.v20.patch
        100 kB
        Devaraj Das
      11. 1985.v2.patch
        80 kB
        Devaraj Das
      12. 1985.v19.patch
        99 kB
        Devaraj Das
      13. 1985.v11.patch
        88 kB
        Devaraj Das
      14. 1985.v10.patch
        81 kB
        Devaraj Das
      15. 1985.v1.patch
        78 kB
        Devaraj Das
      16. 1985.new.patch
        27 kB
        Devaraj Das

        Issue Links

          Activity

          eric baldeschwieler created issue -
          Devaraj Das made changes -
          Field Original Value New Value
          Assignee Devaraj Das [ devaraj ]
          Devaraj Das made changes -
          Attachment 1985.new.patch [ 12369805 ]
          Devaraj Das made changes -
          Attachment 1985.v1.patch [ 12371666 ]
          Devaraj Das made changes -
          Fix Version/s 0.16.0 [ 12312740 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Attachment 1985.v2.patch [ 12371728 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Attachment 1985.v3.patch [ 12372151 ]
          Devaraj Das made changes -
          Attachment 1985.v4.patch [ 12372415 ]
          Devaraj Das made changes -
          Component/s mapred [ 12310690 ]
          Component/s dfs [ 12310710 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Attachment 1985.v5.patch [ 12372631 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Attachment 1985.v6.patch [ 12373076 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Attachment jobinprogress.patch [ 12373306 ]
          Devaraj Das made changes -
          Attachment 1985.v9.patch [ 12373500 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Attachment 1985.v10.patch [ 12373749 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Assignee Devaraj Das [ devaraj ] Doug Cutting [ cutting ]
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Assignee Doug Cutting [ cutting ] Devaraj Das [ devaraj ]
          Devaraj Das made changes -
          Attachment 1985.v11.patch [ 12374045 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Amar Kamat made changes -
          Link This issue blocks HADOOP-2119 [ HADOOP-2119 ]
          Nigel Daley made changes -
          Fix Version/s 0.17.0 [ 12312913 ]
          Fix Version/s 0.16.0 [ 12312740 ]
          Amar Kamat made changes -
          Link This issue blocks HADOOP-2729 [ HADOOP-2729 ]
          Amar Kamat made changes -
          Link This issue blocks HADOOP-2812 [ HADOOP-2812 ]
          Devaraj Das made changes -
          Attachment 1985.v19.patch [ 12375651 ]
          Devaraj Das made changes -
          Attachment 1985.v20.patch [ 12375946 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Attachment 1985.v23.patch [ 12376403 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Attachment 1985.v24.patch [ 12376595 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Attachment 1985.v25.patch [ 12376740 ]
          Devaraj Das made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Robert Chansler made changes -
          Hadoop Flags [Incompatible change]
          Devaraj Das made changes -
          Hadoop Flags [Incompatible change] [Incompatible change, Reviewed]
          Release Note This issue introduces rack awareness for map tasks. It also moves the rack resolution logic to the central servers - NameNode & JobTracker. The administrator can specify a loadable class given by topology.node.switch.mapping.impl to specify the class implementing the logic for rack resolution. The class must implement a method - resolve(List<String> names), where names is the list of DNS-names/IP-addresses that we want resolved. The return value is a list of resolved network paths of the form /foo/rack, where rack is the rackID where the node belongs to and foo is the switch where multiple racks are connected, and so on. The default implementation of this class is packaged along with hadoop and points to org.apache.hadoop.net.ScriptBasedMapping and this class loads a script that can be used for rack resolution. The script location is configurable. It is specified by topology.script.file.name and defaults to an empty script. In the case where the script name is empty, /default-rack is returned for all dns-names/IP-addresses. The loadable topology.node.switch.mapping.impl provides administrators fleixibilty to define how their site's node resolution should happen.
          For mapred, one can also specify the level of the cache w.r.t the number of levels in the resolved network path - defaults to two. This means that the JobTracker will cache tasks at the host level and at the rack level.
          Devaraj Das made changes -
          Release Note This issue introduces rack awareness for map tasks. It also moves the rack resolution logic to the central servers - NameNode & JobTracker. The administrator can specify a loadable class given by topology.node.switch.mapping.impl to specify the class implementing the logic for rack resolution. The class must implement a method - resolve(List<String> names), where names is the list of DNS-names/IP-addresses that we want resolved. The return value is a list of resolved network paths of the form /foo/rack, where rack is the rackID where the node belongs to and foo is the switch where multiple racks are connected, and so on. The default implementation of this class is packaged along with hadoop and points to org.apache.hadoop.net.ScriptBasedMapping and this class loads a script that can be used for rack resolution. The script location is configurable. It is specified by topology.script.file.name and defaults to an empty script. In the case where the script name is empty, /default-rack is returned for all dns-names/IP-addresses. The loadable topology.node.switch.mapping.impl provides administrators fleixibilty to define how their site's node resolution should happen.
          For mapred, one can also specify the level of the cache w.r.t the number of levels in the resolved network path - defaults to two. This means that the JobTracker will cache tasks at the host level and at the rack level.
          This issue introduces rack awareness for map tasks. It also moves the rack resolution logic to the central servers - NameNode & JobTracker. The administrator can specify a loadable class given by topology.node.switch.mapping.impl to specify the class implementing the logic for rack resolution. The class must implement a method - resolve(List<String> names), where names is the list of DNS-names/IP-addresses that we want resolved. The return value is a list of resolved network paths of the form /foo/rack, where rack is the rackID where the node belongs to and foo is the switch where multiple racks are connected, and so on. The default implementation of this class is packaged along with hadoop and points to org.apache.hadoop.net.ScriptBasedMapping and this class loads a script that can be used for rack resolution. The script location is configurable. It is specified by topology.script.file.name and defaults to an empty script. In the case where the script name is empty, /default-rack is returned for all dns-names/IP-addresses. The loadable topology.node.switch.mapping.impl provides administrators fleixibilty to define how their site's node resolution should happen.
          For mapred, one can also specify the level of the cache w.r.t the number of levels in the resolved network path - defaults to two. This means that the JobTracker will cache tasks at the host level and at the rack level.
          Known issue: the task caching will not work with levels greater than 2 (beyond racks). This bug is tracked in HADOOP-3296.
          Hadoop Flags [Reviewed, Incompatible change] [Incompatible change, Reviewed]
          Nigel Daley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s dfs [ 12310710 ]
          Owen O'Malley made changes -
          Component/s mapred [ 12310690 ]
          Steve Loughran made changes -
          Link This issue is related to HDFS-891 [ HDFS-891 ]
          Gavin made changes -
          Link This issue blocks HADOOP-2119 [ HADOOP-2119 ]
          Gavin made changes -
          Link This issue is depended upon by HADOOP-2119 [ HADOOP-2119 ]
          Gavin made changes -
          Link This issue blocks MAPREDUCE-267 [ MAPREDUCE-267 ]
          Gavin made changes -
          Link This issue is depended upon by MAPREDUCE-267 [ MAPREDUCE-267 ]

            People

            • Assignee:
              Devaraj Das
              Reporter:
              eric baldeschwieler
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development