Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.0
    • Component/s: None
    • Labels: None
    • Environment: UNIX, Linux
    • Hadoop Flags: Reviewed

      Description

      If a UNIX user has misconfigured groups, either locally or through LDAP, then Hadoop will not be able to start.

      Attachments

      1. drwhointardis.patch
         2 kB
         Alex Loddengaard
      2. drwhointardis_v2.patch
         2 kB
         Alex Loddengaard

        Activity

        Vijay added a comment -

        I am facing a similar issue - this is a Linux box with KEON-BoKS configured. It seems it is not fetching the user id.

        uid=76948(f1234) gid=15929(admin) groups=15929(admin)

        $ hadoop dfs -ls
        Warning: $HADOOP_HOME is deprecated.

        Bad connection to FS. command aborted. exception: failure to login

        Owen O'Malley made changes -
        Component/s: dfs [ 12310710 ]
        Nigel Daley made changes -
        Status: Resolved [ 5 ] -> Closed [ 6 ]
        Alex Loddengaard added a comment -

        Haha! Are you crying about the choice of defaults or about my first patch being submitted?

        Allen Wittenauer added a comment -

        cries

        Hudson added a comment -

        Integrated in Hadoop-trunk #670 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/670/)
        . Set defaults for user, group in UnixUserGroupInformation so
        login fails more predictably when misconfigured. Contributed by Alex Loddengaard.

        Alex Loddengaard added a comment -

        Thanks, Chris. It's exciting to get my first patch in. Sweeeeeet.

        Chris Douglas added a comment -

        I don't think so... in environments where this wouldn't work before, now it doesn't work in a consistent way.

        Tsz Wo Nicholas Sze added a comment -

        Is this an incompatible change?

        Chris Douglas made changes -
        Fix Version/s: 0.20.0 [ 12313438 ]
        Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
        Hadoop Flags: [Reviewed]
        Resolution: Fixed [ 1 ]
        Chris Douglas added a comment -

        Nevermind. I just committed this. Thanks, Alex

        Chris Douglas added a comment -

        Sorry, by references, I meant {@link #DEFAULT_USERNAME} in the javadoc for UnixUserGroupInformation::login
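
        For illustration, a minimal sketch of the kind of javadoc reference being discussed; the class and method below are hypothetical stand-ins for the example, not the actual Hadoop code:

        // Hypothetical stand-in class, only to show a {@link} reference to a public constant.
        public class LoginSketch {
          /** Fallback user name used when the system user cannot be determined. */
          public static final String DEFAULT_USERNAME = "DrWho";

          /**
           * Logs the current user in.
           * If the user name cannot be determined, {@link #DEFAULT_USERNAME} is used,
           * which only resolves as a javadoc reference if the constant is visible.
           */
          public static String login() {
            return DEFAULT_USERNAME;
          }
        }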

        Alex Loddengaard made changes -
        Attachment: drwhointardis_v2.patch [ 12394589 ]
        Alex Loddengaard added a comment -

        Attaching a new patch that makes the default username and group constants public instead of private.

        Agreed that it would be nice to link to these with JavaDoc.

        Chris Douglas added a comment -

        I decided to make DEFAULT_USERNAME and DEFAULT_GROUP private constants, just to be sure to not pollute the public interface.

        It's a little confusing that they're referenced in the javadoc, then... I'd lean towards making them public, but wouldn't insist on it. If public, changing the mentions to be references would also be a good idea.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12393318/drwhointardis.patch
        against trunk revision 711482.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3533/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3533/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3533/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3533/console

        This message is automatically generated.

        Alex Loddengaard made changes -
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Assignee: Alex Loddengaard [ alexlod ]
        Alex Loddengaard made changes -
        Attachment: drwhointardis.patch [ 12393318 ]
        Alex Loddengaard added a comment -

        Submitting a patch to set a default name and group if an Exception occurs when fetching these from UNIX. The default username is "DrWho" and the default group is "Tardis".

        I am not submitting a test, because UnixUserGroupInformation#getUnixUserName() and UnixUserGroupInformation#getUnixGroups() are static methods, which stops me from extending UnixUserGroupInformation and overriding these methods to throw Exceptions. I considered creating a test by creating binaries for "whoami" and "groups" and adding them to my PATH, but I figured this was rather cumbersome for such a simple patch. Let me know if you disagree and I can make binaries for these.

        I decided to make DEFAULT_USERNAME and DEFAULT_GROUP private constants, just to be sure to not pollute the public interface.
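
        For illustration, a standalone sketch of the approach described above, not the actual Hadoop patch: it shells out to "whoami" and "groups" and falls back to "DrWho"/"Tardis" when either command fails, so a misconfigured system fails in a predictable way. Everything here beyond the DEFAULT_USERNAME/DEFAULT_GROUP values is an assumption made for the example.

        import java.io.BufferedReader;
        import java.io.InputStreamReader;

        public class DefaultIdentitySketch {
          static final String DEFAULT_USERNAME = "DrWho";
          static final String DEFAULT_GROUP = "Tardis";

          // Run a command and return its first line of output, failing on a non-zero exit.
          static String run(String cmd) throws Exception {
            Process p = Runtime.getRuntime().exec(cmd);
            try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
              String line = r.readLine();
              if (p.waitFor() != 0 || line == null || line.trim().isEmpty()) {
                throw new RuntimeException(cmd + " failed");
              }
              return line.trim();
            }
          }

          public static void main(String[] args) {
            String user, group;
            // Fall back to the defaults whenever the underlying UNIX command errs.
            try { user = run("whoami"); } catch (Exception e) { user = DEFAULT_USERNAME; }
            try { group = run("groups").split("\\s+")[0]; } catch (Exception e) { group = DEFAULT_GROUP; }
            System.out.println(user + ":" + group);
          }
        }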

        Alex Loddengaard added a comment -

        > One problem with doing diagnostics for configuration failures and errors in the underlying OS/JRE is that they are hard to reliably translate. If you look for strings, your code breaks on every OS/JRE update. If you take a guess, you end up with those Windows programs that assume that all failures are due to being out of memory, because that was a common fault. But if it isn't the fault, it is even more misleading.

        You bring up a great point, Steve. For this reason I'm convinced that perhaps creating a better error message isn't the best solution.

        > What we can do is what you have done: place the stack trace on a bugrep where it is searchable. The next time someone gets this error and searches for it, they will find this page. So we should add the instructions on what to do to make the error message go away.

        In my case, I was working at a large corporation, and my IT department was unable to resolve the problem. I had to drop the internal Ubuntu derivative, along with LDAP, certain tools, etc., and switch to pure Ubuntu. My solution was rather cumbersome, but my efforts to Google for solutions to my problem only led to dead ends and frustration. I even sent an email to the users list with no luck.

        +1 to DrWho:Tardis

        Doug Cutting added a comment -

        > +1 for the return of DrWho !

        But what would the default group be? "Tardis"?

        Owen O'Malley added a comment -

        +1 for the return of DrWho !

        Historical background:

        Hadoop used to default to "Dr Who" as the user id, if it couldn't figure out your username from the system properties. It was removed because of the space and the assertion that it never happened. Introduced in HADOOP-52 and removed by HADOOP-2833.

        steve_l added a comment -

        One problem with doing diagnostics for configuration failures and errors in the underlying OS/JRE is that they are hard to reliably translate. If you look for strings, your code breaks on every OS/JRE update. If you take a guess, you end up with those Windows programs that assume that all failures are due to being out of memory, because that was a common fault. But if it isn't the fault, it is even more misleading.

        What we can do is what you have done: place the stack trace on a bugrep where it is searchable. The next time someone gets this error and searches for it, they will find this page. So we should add the instructions on what to do to make the error message go away.

        Alex Loddengaard added a comment -

        Perhaps this isn't a bug, but I still think a simple workaround such as using a default group when 'groups' errs is a good idea. Or at the very least Hadoop should supply a better error message, something along the lines of "Hadoop is unable to get user group information about you." Thoughts?

        Tsz Wo Nicholas Sze added a comment -

        > Is this a bug in hadoop?

        Currently, "groups" is required external command for Hadoop. So, this is not a bug. See also the User Identity section in the Permissions User and Administrator Guide.

        Dipanjan Das added a comment - edited

        I have been trying to install hadoop on my linux machine to create a development environment, and I have been following instructions on this page:

        http://wiki.apache.org/hadoop/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29

        However, I could not proceed with the installation because a group ID that my system associates with my user name breaks hadoop. As an example, if I run:

        $ bin/hadoop dfs -ls

        I get the following error:

        08/10/16 13:10:13 WARN fs.FileSystem: uri=hdfs://localhost:54310
        javax.security.auth.login.LoginException: Login failed: id: cannot find name for group ID 1106734464

        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
        at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1353)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1289)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:203)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:108)
        at org.apache.hadoop.fs.FsShell.init(FsShell.java:87)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:1717)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:1866)
        Bad connection to FS. command aborted.

        Here, "1106734464" is a group ID that my username belongs to. When I type

        $ groups

        I get

        hadoop id: cannot find name for group ID 1106734464
        1106734464

        Here, "hadoop" is a username that I created for running hadoop. Similarly, if I run:

        $ id

        I get:

        uid=501(hadoop) gid=501(hadoop) groups=501(hadoop),1106734464

        Is this a bug in hadoop?

        Alex Loddengaard added a comment -

        I originally stumbled on this problem when using an internal derivative of Ubuntu, with funky LDAP settings. Running Hadoop would generate the following error:

        ERROR dfs.NameNode: java.io.IOException: javax.security.auth.login.LoginException: Login failed: id: cannot find name for group ID

        The last portion of the error, "Login failed: id: cannot find name for group ID," would also occur when I ran the "groups" binary from the command line.

        I propose using a default group in the case that getting a user's groups fails.

        Alex Loddengaard created issue -

          People

          • Assignee: Alex Loddengaard
          • Reporter: Alex Loddengaard
          • Votes: 0
          • Watchers: 7

            Dates

            • Created:
            • Updated:
            • Resolved:
