Apache Twill / TWILL-237

Twill uses an HDFS HAUtil API that is not compatible with Hadoop 2.8

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: yarn
    • Labels:
      None

      Description

      Twill uses hdfs.HAUtil APIs that are supposed to be private to HDFS. The signature of isLogicalUri subsequently changed (actually, the method was renamed) in Hadoop 2.8.

      I will post a patch for now that supports both the old and new names, but I think references to private HDFS interfaces and classes should eventually be removed from Twill.
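
One way to support both names is to resolve whichever one exists at class-load time and hide the result behind a plain static method, in the spirit of the MethodHandle approach mentioned in the pull request notes. The following is a minimal sketch, not the actual patch: `RenamedApi` is a hypothetical stand-in for the Hadoop class, `java.util.Properties` stands in for Hadoop's `Configuration`, and the body of the stand-in method is invented purely so the example runs without Hadoop on the classpath.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.net.URI;
import java.util.Properties;

final class LogicalUriCompat {

  /** Hypothetical stand-in for a library class whose static method was renamed. */
  static final class RenamedApi {
    // This "release" only ships the second candidate name; the first one is absent.
    // The body is invented for the demo: a URI without a port counts as logical.
    static boolean useLogicalUri(Properties conf, URI uri) {
      return uri.getPort() == -1;
    }
  }

  // Resolved once, when this class is initialized.
  private static final MethodHandle HANDLE =
      resolve(RenamedApi.class, "isLogicalUri", "useLogicalUri");

  private LogicalUriCompat() { }

  /** Returns a handle for the first candidate name that exists on the owner class. */
  static MethodHandle resolve(Class<?> owner, String... candidateNames) {
    MethodType type = MethodType.methodType(boolean.class, Properties.class, URI.class);
    for (String name : candidateNames) {
      try {
        return MethodHandles.lookup().findStatic(owner, name, type);
      } catch (NoSuchMethodException | IllegalAccessException e) {
        // Not available under this name; try the next candidate.
      }
    }
    throw new IllegalStateException("No candidate method found on " + owner.getName());
  }

  /** Static API exposed to callers; the handle itself stays private. */
  static boolean useLogicalUri(Properties conf, URI uri) {
    try {
      return (boolean) HANDLE.invokeExact(conf, uri);
    } catch (Throwable t) {
      throw new RuntimeException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println(useLogicalUri(new Properties(), URI.create("hdfs://nameservice1/tmp")));
  }
}
```

Callers only see `LogicalUriCompat.useLogicalUri(...)`, so the lookup logic can be deleted later without touching call sites once the private dependency is removed.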


          Activity

          chtyim Terence Yim added a comment -

          Thanks Yuliya Feldman for your contribution.

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/twill/pull/55

          githubbot ASF GitHub Bot added a comment -

          Github user yufeldman commented on a diff in the pull request:

          https://github.com/apache/twill/pull/55#discussion_r131224027

          — Diff: twill-yarn/src/main/java/org/apache/twill/filesystem/FileContextLocationUtil.java —
          @@ -0,0 +1,77 @@
          +/*
          + * Licensed to the Apache Software Foundation (ASF) under one
          + * or more contributor license agreements. See the NOTICE file
          + * distributed with this work for additional information
          + * regarding copyright ownership. The ASF licenses this file
          + * to you under the Apache License, Version 2.0 (the
          + * "License"); you may not use this file except in compliance
          + * with the License. You may obtain a copy of the License at
          + *
          + * http://www.apache.org/licenses/LICENSE-2.0
          + *
          + * Unless required by applicable law or agreed to in writing, software
          + * distributed under the License is distributed on an "AS IS" BASIS,
          + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
          + * See the License for the specific language governing permissions and
          + * limitations under the License.
          + */
          +package org.apache.twill.filesystem;
          +
          +import com.google.common.base.Throwables;
          +
          +import org.apache.hadoop.conf.Configuration;
          +import org.apache.hadoop.hdfs.HAUtil;
          +
          +import java.lang.invoke.MethodHandle;
          +import java.lang.invoke.MethodHandles;
          +import java.lang.invoke.MethodType;
          +import java.net.URI;
          +
          +/**
          + * Utility class.
          + */
          +final class FileContextLocationUtil {
          — End diff –

          Thank you for the final review. I have addressed your comment.

          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/twill/pull/55#discussion_r131208032

          — Diff: twill-yarn/src/main/java/org/apache/twill/filesystem/FileContextLocationUtil.java —
          @@ -0,0 +1,77 @@
          +/*
          + * Licensed to the Apache Software Foundation (ASF) under one
          + * or more contributor license agreements. See the NOTICE file
          + * distributed with this work for additional information
          + * regarding copyright ownership. The ASF licenses this file
          + * to you under the Apache License, Version 2.0 (the
          + * "License"); you may not use this file except in compliance
          + * with the License. You may obtain a copy of the License at
          + *
          + * http://www.apache.org/licenses/LICENSE-2.0
          + *
          + * Unless required by applicable law or agreed to in writing, software
          + * distributed under the License is distributed on an "AS IS" BASIS,
          + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
          + * See the License for the specific language governing permissions and
          + * limitations under the License.
          + */
          +package org.apache.twill.filesystem;
          +
          +import com.google.common.base.Throwables;
          +
          +import org.apache.hadoop.conf.Configuration;
          +import org.apache.hadoop.hdfs.HAUtil;
          +
          +import java.lang.invoke.MethodHandle;
          +import java.lang.invoke.MethodHandles;
          +import java.lang.invoke.MethodType;
          +import java.net.URI;
          +
          +/**
          + * Utility class.
          + */
          +class FileContextLocationUtil {
          — End diff –

          Util class should be `final`.

          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/twill/pull/55#discussion_r131207866

          — Diff: twill-yarn/src/main/java/org/apache/twill/filesystem/FileContextLocation.java —
          @@ -162,7 +163,8 @@ public URI toURI() {
          // append "port" to the path URI, while the DistributedFileSystem always use the cluster logical
          // name, which doesn't allow having port in it.
          URI uri = path.toUri();
          - if (HAUtil.isLogicalUri(locationFactory.getConfiguration(), uri)) {
          +
          + if (FileContextLocationUtil.useLogicalUri(locationFactory.getConfiguration(), uri)) {
          — End diff –

          That makes sense.

          githubbot ASF GitHub Bot added a comment -

          Github user yufeldman commented on a diff in the pull request:

          https://github.com/apache/twill/pull/55#discussion_r124117726

          — Diff: twill-yarn/src/main/java/org/apache/twill/filesystem/FileContextLocation.java —
          @@ -162,7 +163,8 @@ public URI toURI() {
          // append "port" to the path URI, while the DistributedFileSystem always use the cluster logical
          // name, which doesn't allow having port in it.
          URI uri = path.toUri();
          - if (HAUtil.isLogicalUri(locationFactory.getConfiguration(), uri)) {
          +
          + if (FileContextLocationUtil.useLogicalUri(locationFactory.getConfiguration(), uri)) {
          — End diff –

          I am not sure it is as simple as that. We may need to change equals() to compare based on toURI() rather than path, or somehow keep path and toURI() in sync within that class. Otherwise the parts become inconsistent, and I am not sure that unit test failures would be the only issue.
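
The consistency concern can be illustrated with a small sketch in which equals() and hashCode() are both derived from toURI(), so two objects that report the same URI are always equal. `PortAgnosticLocation` is a hypothetical class written for this illustration, not Twill's `FileContextLocation`, and its port-dropping normalization is an assumption about what toURI() would return.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical location type; equality follows the normalized URI, not the raw one. */
final class PortAgnosticLocation {
  private final URI raw;

  PortAgnosticLocation(URI raw) {
    this.raw = raw;
  }

  /** Normalized view: the stored URI with any explicit port dropped. */
  URI toURI() {
    if (raw.getPort() == -1) {
      return raw;
    }
    try {
      return new URI(raw.getScheme(), raw.getUserInfo(), raw.getHost(), -1,
                     raw.getPath(), raw.getQuery(), raw.getFragment());
    } catch (URISyntaxException e) {
      throw new IllegalStateException(e);
    }
  }

  @Override
  public boolean equals(Object o) {
    // Compare the normalized URIs so equals() cannot disagree with toURI().
    return o instanceof PortAgnosticLocation
        && toURI().equals(((PortAgnosticLocation) o).toURI());
  }

  @Override
  public int hashCode() {
    // Must be derived from the same value that equals() compares.
    return toURI().hashCode();
  }
}
```

If equals() compared the raw field instead, a location created with a port and one created without would report the same toURI() yet be unequal, which is the kind of inconsistency described above.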

          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/twill/pull/55#discussion_r124066638

          — Diff: twill-yarn/src/main/java/org/apache/twill/filesystem/FileContextLocation.java —
          @@ -162,7 +163,8 @@ public URI toURI() {
          // append "port" to the path URI, while the DistributedFileSystem always use the cluster logical
          // name, which doesn't allow having port in it.
          URI uri = path.toUri();
          - if (HAUtil.isLogicalUri(locationFactory.getConfiguration(), uri)) {
          +
          + if (FileContextLocationUtil.useLogicalUri(locationFactory.getConfiguration(), uri)) {
          — End diff –

          Yes. I think it should be safe to remove the check and always return a URI without the port.

          githubbot ASF GitHub Bot added a comment -

          Github user yufeldman commented on a diff in the pull request:

          https://github.com/apache/twill/pull/55#discussion_r124066342

          — Diff: twill-yarn/src/main/java/org/apache/twill/filesystem/FileContextLocation.java —
          @@ -162,7 +163,8 @@ public URI toURI() {
          // append "port" to the path URI, while the DistributedFileSystem always use the cluster logical
          // name, which doesn't allow having port in it.
          URI uri = path.toUri();
          - if (HAUtil.isLogicalUri(locationFactory.getConfiguration(), uri)) {
          +
          + if (FileContextLocationUtil.useLogicalUri(locationFactory.getConfiguration(), uri)) {
          — End diff –

          Should we just get rid of the check in this case?

          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/twill/pull/55#discussion_r124058716

          — Diff: twill-yarn/src/main/java/org/apache/twill/filesystem/FileContextLocation.java —
          @@ -162,7 +163,8 @@ public URI toURI() {
          // append "port" to the path URI, while the DistributedFileSystem always use the cluster logical
          // name, which doesn't allow having port in it.
          URI uri = path.toUri();
          - if (HAUtil.isLogicalUri(locationFactory.getConfiguration(), uri)) {
          +
          + if (FileContextLocationUtil.useLogicalUri(locationFactory.getConfiguration(), uri)) {
          — End diff –

          It seems like we actually don't need to check, but rather just always strip off the port from the `URI`, since the `FileContext` or `FileSystem` in Hadoop will always determine it internally based on the `Configuration`.

          In fact, that would make the returned `URI` more predictable.
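
Unconditionally stripping the port could look roughly like the sketch below, which uses only `java.net.URI`. The class and method names (`UriPorts`, `stripPort`) are hypothetical, and this is an illustration of the idea rather than the code that was merged.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriPorts {
  private UriPorts() { }

  /** Returns the URI with any explicit port removed; URIs without a port are returned as-is. */
  static URI stripPort(URI uri) {
    if (uri.getPort() == -1) {
      return uri;
    }
    try {
      // A port of -1 tells java.net.URI to leave the port out of the authority.
      return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(), -1,
                     uri.getPath(), uri.getQuery(), uri.getFragment());
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    // prints hdfs://namenode.example.com/apps/twill
    System.out.println(stripPort(URI.create("hdfs://namenode.example.com:8020/apps/twill")));
  }
}
```

Because the result no longer depends on whether the configured file system is logical (HA) or not, the same input path always produces the same URI shape.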

          githubbot ASF GitHub Bot added a comment -

          GitHub user yufeldman opened a pull request:

          https://github.com/apache/twill/pull/55

          (TWILL-237) Twill is using hdfs HAUtil api that is nont-compatible wi…

          …th hadoop 2.8

          + Use Java's MethodHandle (dynamic lang support) rather than Method (reflection) in FileContextLocationUtil
          + Expose static API rather than the handle itself

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/yufeldman/twill branch-TWILL-237

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/twill/pull/55.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #55


          commit fe615a3af55ec9fd29640f3c03cb56de36062d96
          Author: Sudheesh Katkam <sudheesh@dremio.com>
          Date: 2017-06-16T22:42:18Z

          (TWILL-237) Twill is using hdfs HAUtil api that is nont-compatible with hadoop 2.8

          + Use Java's MethodHandle (dynamic lang support) rather than Method (reflection) in FileContextLocationUtil
          + Expose static API rather than the handle itself



            People

            • Assignee:
              yufeldman Yuliya Feldman
              Reporter:
              yufeldman Yuliya Feldman