BIGTOP-2555: hadoop charms should use bind-host overrides

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: deployment
    • Labels:
      None

      Description

      Make use of the bind-host overrides from BIGTOP-2554.

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user kwmonroe opened a pull request:

          https://github.com/apache/bigtop/pull/153

          BIGTOP-2555: hadoop charms should use bind-host overrides

          • remove unused import
          • Bind NN and RM to 0.0.0.0 (all interfaces) using the hiera options from
            BIGTOP-2554. This fixes a problem where lxd would bind the apps to
            `facter fqdn` which may not be resolvable to other containers in the lxd env.
          • Note: since we recommend colocating NN and RM, make sure the RM repeats
            the NN overrides so the RM puppet apply doesn't lose hdfs-site.xml config
            from the NN puppet apply.
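The note about repeating the NN overrides can be illustrated with a small Python sketch. It assumes each puppet apply rewrites the site files from only the overrides it is handed, which is a simplified model rather than the real puppet behaviour. The dictionary keys are real Hadoop property names used as stand-ins for the hiera options from BIGTOP-2554, whose actual names are not shown in this thread; `merge_overrides` and `simulate_apply` are hypothetical helpers.

```python
# Stand-ins for the hiera overrides: real Hadoop property names, grouped by
# the site file they end up in. The actual hiera keys are not shown here.
NAMENODE_OVERRIDES = {
    "hdfs-site.xml": {
        "dfs.namenode.rpc-bind-host": "0.0.0.0",
        "dfs.namenode.servicerpc-bind-host": "0.0.0.0",
        "dfs.namenode.http-bind-host": "0.0.0.0",
        "dfs.namenode.https-bind-host": "0.0.0.0",
    },
}
RESOURCEMANAGER_OVERRIDES = {
    "yarn-site.xml": {"yarn.resourcemanager.bind-host": "0.0.0.0"},
}


def merge_overrides(*layers):
    """Combine several override layers without mutating any of them."""
    merged = {}
    for layer in layers:
        for filename, props in layer.items():
            merged.setdefault(filename, {}).update(props)
    return merged


def simulate_apply(overrides):
    """Simplified model of one puppet apply: every site file is rewritten
    from scratch using only the overrides passed to this run."""
    site_files = ("hdfs-site.xml", "yarn-site.xml")
    return {name: dict(overrides.get(name, {})) for name in site_files}


# An RM-only apply on a colocated unit drops the NN bind-host settings.
rm_only = simulate_apply(RESOURCEMANAGER_OVERRIDES)
# Repeating the NN overrides in the RM apply keeps both files intact.
both = simulate_apply(merge_overrides(NAMENODE_OVERRIDES, RESOURCEMANAGER_OVERRIDES))
```

In this model `rm_only["hdfs-site.xml"]` comes back empty, while `both` carries the bind-host settings for both daemons.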

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/juju-solutions/bigtop bug/BIGTOP-2555/bind-host-overrides

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/bigtop/pull/153.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #153



          kwmonroe Kevin W Monroe added a comment -

          Charms using this patch are available in the ~bigdata-dev namespace. I've tested on lxd and azure using:

          https://api.jujucharms.com/charmstore/v5/hadoop-processing/archive/bundle-dev.yaml

          githubbot ASF GitHub Bot added a comment -

          Github user ktsakalozos commented on the issue:

          https://github.com/apache/bigtop/pull/153

          LGTM +1 Tested on lxd and exposed services are indeed listening on the correct interface.

          As you can see we still have some services on 127.0.1.1 but we can handle any problems we may have in future PRs

          `
          Active Internet connections (only servers)
          Proto Recv-Q Send-Q Local Address           Foreign Address         State
          tcp        0      0 0.0.0.0:35482           0.0.0.0:*               LISTEN
          tcp        0      0 127.0.1.1:10020         0.0.0.0:*               LISTEN
          tcp        0      0 0.0.0.0:8649            0.0.0.0:*               LISTEN
          tcp        0      0 127.0.1.1:19888         0.0.0.0:*               LISTEN
          tcp        0      0 0.0.0.0:10033           0.0.0.0:*               LISTEN
          tcp        0      0 0.0.0.0:8020            0.0.0.0:*               LISTEN
          tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN
          tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
          tcp6       0      0 :::8088                 :::*                    LISTEN
          tcp6       0      0 127.0.1.1:20888         :::*                    LISTEN
          tcp6       0      0 :::8025                 :::*                    LISTEN
          tcp6       0      0 :::8030                 :::*                    LISTEN
          tcp6       0      0 :::8032                 :::*                    LISTEN
          tcp6       0      0 :::8033                 :::*                    LISTEN
          tcp6       0      0 fe80::1:13128           :::*                    LISTEN
          tcp6       0      0 :::22                   :::*                    LISTEN
          udp        0      0 0.0.0.0:68              0.0.0.0:*
          udp        0      0 0.0.0.0:8649            0.0.0.0:*
          udp        0      0 0.0.0.0:37509           0.0.0.0:*
          `
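Spotting the leftover loopback bindings in a listing like this can be automated. The following Python sketch is only an illustration: `loopback_listeners` is a hypothetical helper, the sample text is a subset of the lines above, and it assumes the whitespace-separated column layout of `netstat` output shown here.

```python
import ipaddress

# A subset of the listing above, used as sample input.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:35482 0.0.0.0:* LISTEN
tcp 0 0 127.0.1.1:10020 0.0.0.0:* LISTEN
tcp 0 0 127.0.1.1:19888 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:8020 0.0.0.0:* LISTEN
tcp6 0 0 :::8088 :::* LISTEN
tcp6 0 0 fe80::1:13128 :::* LISTEN
udp 0 0 0.0.0.0:68 0.0.0.0:*
"""


def loopback_listeners(netstat_text):
    """Return (address, port) pairs for LISTEN sockets bound to loopback."""
    found = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers and rows without a LISTEN state (for example udp).
        if len(fields) < 6 or fields[-1] != "LISTEN":
            continue
        # The port follows the last colon, which also works for IPv6.
        host, _, port = fields[3].rpartition(":")
        try:
            address = ipaddress.ip_address(host)
        except ValueError:
            continue  # not a literal IP address
        if address.is_loopback:
            found.append((host, int(port)))
    return found


flagged = loopback_listeners(SAMPLE)
```

For the sample, `flagged` holds the two `127.0.1.1` entries on ports 10020 and 19888. The wildcard addresses `0.0.0.0` and `::` are not reported.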

          githubbot ASF GitHub Bot added a comment -

          Github user johnsca commented on the issue:

          https://github.com/apache/bigtop/pull/153

          I tested on aws and it works fine, but on lxd I see, like Kostantinos, two services still listening on 127.0.1.1. Furthermore, terasort fails with the following in the RM log: http://pastebin.ubuntu.com/23379093/

          githubbot ASF GitHub Bot added a comment -

          Github user kwmonroe commented on the issue:

          https://github.com/apache/bigtop/pull/153

          Thanks for the eyeballs @ktsakalozos and @johnsca! I missed a mapred.jobhistory binding. You will no longer see any 127.0.x.y bindings.

          Also, thanks @johnsca for pointing out the terasort failure. This was due to a dns / hostname issue in lxd environments and was fixed with:

          https://github.com/juju-solutions/layer-apache-bigtop-base/commit/31c4273ade95fe8ee2212800e5ed6e8d56429ec7
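A quick way to check for this class of dns / hostname problem is to test whether a given name resolves at all. The Python sketch below is not taken from the linked commit. `resolves` is a hypothetical helper, and the injectable `resolver` argument is an assumption made so the function can be exercised without network access.

```python
import socket


def resolves(hostname, resolver=socket.gethostbyname):
    """Return True if `resolver` can map `hostname` to an address."""
    try:
        resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        return False
    return True
```

Calling `resolves(socket.getfqdn())` on one unit only shows that the name resolves locally. To reproduce the problem described in this issue, the check would need to run from a different container, using the first container's FQDN.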

          githubbot ASF GitHub Bot added a comment -

          Github user kwmonroe commented on the issue:

          https://github.com/apache/bigtop/pull/153

          Charms in the ~bigdata-dev namespace have been refreshed, and I verified that terasort works on lxd again:
          ```
          results:
            meta:
              composite:
                direction: asc
                units: secs
                value: "385"
              start: 2016-10-25T22:19:00Z
              stop: 2016-10-25T22:25:25Z
            results:
              raw: '

                {"GC time elapsed (ms)": "19102", "Launched reduce tasks": "1", "Shuffled Maps ": "8", "FILE: Number of bytes written": "2081067810", "Physical memory (bytes) snapshot": "4834197504", "Total megabyte-seconds taken by all reduce tasks": "240579584", "Rack-local map tasks": "1", "HDFS: Number of large read operations": "0", "Failed Shuffles": "0", "Reduce output records": "10000000", "Map input records": "10000000", "Total vcore-seconds taken by all map tasks": "1682215", "WRONG_REDUCE": "0", "Spilled Records": "20000000", "Total time spent by all reduces in occupied slots (ms)": "234941", "FILE: Number of read operations": "0", "BAD_ID": "0", "Input split bytes": "1040", "Reduce input groups": "10000000", "Total megabyte-seconds taken by all map tasks": "1722588160", "HDFS: Number of read operations": "27", "Map output materialized bytes": "1040000048", "Bytes Read": "1000000000", "FILE: Number of bytes read": "1040000012", "CONNECTION": "0", "Combine output records": "0", "Total vcore-seconds taken by all reduce tasks": "234941", "Total time spent by all map tasks (ms)": "1682215", "CPU time spent (ms)": "179730", "Map output bytes": "1020000000", "Bytes Written": "1000000000", "IO_ERROR": "0", "Merged Map outputs": "8", "FILE: Number of write operations": "0", "Total time spent by all maps in occupied slots (ms)": "1682215", "Launched map tasks": "14", "Killed map tasks": "6", "Reduce shuffle bytes": "1040000048", "HDFS: Number of write operations": "2", "Map output records": "10000000", "HDFS: Number of bytes written": "1000000000", "Combine input records": "0", "FILE: Number of large read operations": "0", "Data-local map tasks": "13", "Total committed heap usage (bytes)": "4217896960", "Virtual memory (bytes) snapshot": "25333989376", "WRONG_LENGTH": "0", "Reduce input records": "10000000", "Total time spent by all reduce tasks (ms)": "234941", "HDFS: Number of bytes read": "1000001040", "WRONG_MAP": "0"}

                '
          status: completed
          timing:
            completed: 2016-10-25 22:25:27 +0000 UTC
            enqueued: 2016-10-25 22:18:47 +0000 UTC
            started: 2016-10-25 22:18:47 +0000 UTC
          ```
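The composite value of 385 seconds can be cross-checked against the `start` and `stop` timestamps in the output. A minimal Python sketch, where `elapsed_seconds` is a hypothetical helper and the timestamp format is assumed to match the one shown above:

```python
from datetime import datetime

# Timestamp layout used by the start/stop fields in the action output.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def elapsed_seconds(start, stop):
    """Whole seconds between two UTC timestamps in TIMESTAMP_FORMAT."""
    begin = datetime.strptime(start, TIMESTAMP_FORMAT)
    end = datetime.strptime(stop, TIMESTAMP_FORMAT)
    return int((end - begin).total_seconds())


duration = elapsed_seconds("2016-10-25T22:19:00Z", "2016-10-25T22:25:25Z")
```

From 22:19:00 to 22:25:25 is 6 minutes 25 seconds, so `duration` is 385, matching the reported `value`.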

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/bigtop/pull/153


            People

            • Assignee:
              kwmonroe Kevin W Monroe
              Reporter:
              kwmonroe Kevin W Monroe
            • Votes:
              0
              Watchers:
              2
