
HADOOP-4368: Superuser privileges required to do "df"

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.18.1
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      New filesystem shell command -df reports capacity, space used and space free. Any user may execute this command without special privileges.

      Description

      Superuser privileges are required in DFS in order to get the file system statistics (FSNamesystem.java, getStats method). This means that when HDFS is mounted via fuse-dfs as a non-root user, "df" reports 16 exabytes total and 0 free instead of the correct amounts.

      As far as I can tell, there's no need to require superuser privileges to see the file system size (historically, Unix does not require this).

      To fix this, simply comment out the privilege check in the getStats method.
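
      For illustration only, the guard in question amounts to something like the following minimal sketch; the helper name checkSuperuserPrivilege(), the field names, and the long[] layout are assumptions for this sketch, not the committed code:

        // Hypothetical sketch of FSNamesystem.getStats(); helper and field
        // names are assumed for illustration.
        public synchronized long[] getStats() {
          // checkSuperuserPrivilege();  // the check this issue proposes removing
          return new long[] {capacityTotal, capacityUsed, capacityRemaining};
        }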

      1. fuse_statfs_trunk.patch
        1 kB
        Brian Bockelman
      2. fuse_statfs.patch
        1 kB
        Brian Bockelman
      3. hadoop4368.fsstatus.patch
        7 kB
        Craig Macdonald
      4. hadoop4368.fsstatus.v2.patch
        9 kB
        Craig Macdonald
      5. hadoop4368.fsstatus.v3.patch
        16 kB
        Craig Macdonald
      6. hadoop4368.fsstatus.v4.patch
        19 kB
        Craig Macdonald
      7. hadoop4368.fsstatus.v5.patch
        19 kB
        Craig Macdonald
      8. hadoop4368.fsstatus.v6.patch
        20 kB
        Craig Macdonald
      9. hadoop4368.fsstatus.v7.patch
        24 kB
        Craig Macdonald
      10. Test plan for hadoop fs -df.html
        5 kB
        Ravi Phulari

          Activity

          Pete Wyckoff added a comment -

          Hi Brian,
          do you know why they (who?) added this check?

          Pete

          Brian Bockelman added a comment -

          No clue - I suspect that it was checked in when permissions were added to the FSNamesystem.

          I emailed the core-user list, but no dice.

          Doug Cutting added a comment -

          >I suspect that it was checked in when permissions were added to the FSNamesystem.

          Did you verify this? 'svn blame' tells all.

          http://tinyurl.com/42uf5d

          which leads to HADOOP-2659, indicating the folks you might direct your question towards. core-dev might be the more appropriate list at this point.

          Pete Wyckoff added a comment -

          > core-dev might be the more appropriate list at this point

          I emailed core-dev about this with the subject "Requiring super user for getStats"

          Konstantin Shvachko added a comment -

          You are talking about DFSAdmin -report command, right?
          In general DFSAdmin commands should require superuser privileges. If -report should be opened to regular users it should not be a DFSAdmin command but rather a regular shell command. So opening getStats for all users means moving it to the FSShell command group.

          Tsz Wo Nicholas Sze added a comment -

          > In general DFSAdmin commands should require superuser privileges.
          +1

          When I worked on HADOOP-2659, I added checks for superuser for all DFSAdmin commands.

          Brian Bockelman added a comment -

          Hey Konstantin, Nicholas,

          The DFSAdmin -report command gives you a lot more information than getStats does. getStats only returns the values of capacity/free/used; these are kept in internal counters and are not recomputed when the function is executed. Traditionally, knowing how large the file system is has not been a protected operation in Unix.

          I guess it would make sense to include the getStats command in the FSShell command group.

          Pete Wyckoff added a comment -

          > In general DFSAdmin commands should require superuser privileges.

          But not for commands where, in the POSIX world, superuser is not required. I think the default should be to follow POSIX requirements unless there's a compelling reason in HDFS to require more.

          Konstantin Shvachko added a comment -

          shv> opening getStats for all users means moving it to the FSShell command group.

          Doug Cutting added a comment -

          Pete, no one disagrees with your thesis that getStats should not be restricted to the superuser. The solution is to move it from DFSAdmin to FSShell. Unfortunately it's not quite that simple, since this is currently DFS-specific and FSShell is generic. But capacity is a fairly generic operation, so a proper fix might also add a FileSystem method like:

          public FsStatus FileSystem#getStatus()
          
          public class FsStatus {
            public long getCapacity();
            public long getFree();
            public long getUsed();
          }
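
          For context, a hypothetical caller of such an API might look like the sketch below; the method and accessor names follow the proposal above, and the committed API may differ:

            import org.apache.hadoop.conf.Configuration;
            import org.apache.hadoop.fs.FileSystem;
            import org.apache.hadoop.fs.FsStatus;

            public class DfExample {
              public static void main(String[] args) throws Exception {
                FileSystem fs = FileSystem.get(new Configuration());
                FsStatus status = fs.getStatus();   // proposed API, names assumed
                System.out.println("capacity=" + status.getCapacity()
                    + " free=" + status.getFree() + " used=" + status.getUsed());
              }
            }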
          
          Allen Wittenauer added a comment -

          I argue that anything that doesn't require superuser shouldn't be under dfsadmin, to make the distinction clear.

          One thing that has me concerned... what sort of load on the name node does generating the data for -report create? If it is significant, then it should remain privileged.

          Craig Macdonald added a comment -

          @Allen
          The capacity operations on the namenode being discussed here are all lightweight, simply fetching three instance variables from the NameNode.

          @Doug
          So there is an existing DiskStatus already in DistributedFileSystem, which has a very similar API. Could this class be promoted to FileSystem?

          Doug Cutting added a comment -

          > Could this class be promoted to FileSystem?

          Yes. It would be better renamed FsStatus, and 'dfsUsed' renamed just 'used', as indicated above. The default implementation might return Long.MAX_VALUE for capacity and free, and zero for used.
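
          A minimal sketch of that default behaviour; the constructor argument order is an assumption for illustration:

            // In FileSystem (sketch): concrete filesystems such as
            // DistributedFileSystem would override this with real numbers.
            public FsStatus getStatus() throws IOException {
              // Unknown filesystem: conservatively report "unbounded and empty".
              return new FsStatus(Long.MAX_VALUE, 0, Long.MAX_VALUE);  // capacity, used, free
            }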

          Brian Bockelman added a comment -

          Hey all,

          This has gotten a bit worse under 0.19.0. In 0.19.0, "df" simply doesn't report anything for Hadoop as a normal user but works as root. I.e., for normal user:

          [brian@red ~]$ df -h
          Filesystem Size Used Avail Use% Mounted on
          /dev/sda3 254G 34G 208G 14% /
          /dev/sda1 99M 14M 81M 15% /boot
          none 7.9G 0 7.9G 0% /dev/shm
          nfs03:/mnt/raid 4.1T 1.7T 2.4T 41% /mnt/nfs03
          nfs04:/mnt/raid 3.2T 1.1T 2.1T 34% /mnt/nfs04
          dcache-pnfs.unl.edu:/fs
          391M 79M 278M 22% /pnfs/unl.edu

          For root,

          [brian@red ~]$ sudo df -h
          Filesystem Size Used Avail Use% Mounted on
          /dev/sda3 254G 34G 208G 14% /
          /dev/sda1 99M 14M 81M 15% /boot
          none 7.9G 0 7.9G 0% /dev/shm
          nfs03:/mnt/raid 4.1T 1.7T 2.4T 41% /mnt/nfs03
          nfs04:/mnt/raid 3.2T 1.1T 2.1T 34% /mnt/nfs04
          dcache-pnfs.unl.edu:/fs
          391M 79M 278M 22% /pnfs/unl.edu
          fuse 140T 122T 140T 47% /mnt/hadoop

          Brian Bockelman added a comment -

          Attached a patch for 0.19.0.

          This patch adds a "connect as superuser" method which statfs uses.

          Brian Bockelman added a comment -

          Patch against trunk, post-Pete's reorganization of the fuse-dfs source.

          Craig Macdonald added a comment -

          We cannot add an FsStatus method to FileSystem in the 0.19 branch; however, we can in trunk. In this case, I vote +1 for accepting Brian's patch for the 0.19 branch and inclusion in the 0.19.1 release.

          For trunk, the FsStatus should be added to FileSystem.

          Konstantin Shvachko added a comment -

          I am confused. Does that mean that anybody will be able to connect to HDFS as a superuser via fuse?
          It seemed like a rather simple move of a command from one class to another. Why would you want to cut corners like that?

          Craig Macdonald added a comment -

          > Does that mean that anybody will be able to connect to HDFS as a superuser via fuse?

          No. Brian's patch allows the statfs call (as represented by the df command) to work correctly, by calling the appropriate libhdfs methods as superuser, before dropping the connection.

          Moving the appropriate API from DistributedFileSystem to FileSystem is indeed fairly straightforward. I was raising the question of whether that would be permitted for 0.19.1, as I view the current proposed patch as an interim fix.

          Tsz Wo Nicholas Sze added a comment -

          doConnectAsSuperuser(..) won't work very well since the constant "superuser" is not used in HDFS and the constant "supergroup" is a configurable parameter.

          Raghu Angadi added a comment -

          my 2c: This policy should not be worked around in fuse_dfs. It should be fixed in HDFS. fuse should handle errors (including permissions errors) for any commands normally.

          +1 for not requiring superuser for df/stats in HDFS.

          Pete Wyckoff added a comment -

          More generally, is there a design doc or JIRA for where permissions are going in HDFS, so we can see how these jibe with fuse-dfs?

          Tsz Wo Nicholas Sze added a comment -

          The main JIRA for permissions is HADOOP-1298. There is also a user guide in the forrest doc.

          Pete Wyckoff added a comment -

          I will look at those, but what about authorization and future change plans? Integration with other systems - NIS, LDAP, ...

          Brian Bockelman added a comment -

          Hm - one of our sysadmins pointed out that the patch doesn't compile correctly because I submitted the wrong version of it.

          Nicholas, what's the best way to determine the user / group I should be using?

          Might as well get the patch right when I resubmit it.

          Tsz Wo Nicholas Sze added a comment -

          > Nicholas, what's the best way to determine the user / group I should be using?

          We currently use shell commands "whoami" and "bash -c groups" to get the username and group information. See org.apache.hadoop.util.Shell.
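
          For illustration, the shell-based lookup amounts to something like this generic sketch using plain ProcessBuilder; it is not the actual org.apache.hadoop.util.Shell code:

            import java.io.BufferedReader;
            import java.io.IOException;
            import java.io.InputStreamReader;

            public class UnixUserInfo {
              // Run a command and return the first line of its stdout.
              static String firstLine(String... cmd) throws IOException, InterruptedException {
                Process p = new ProcessBuilder(cmd).start();
                BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
                try {
                  String line = r.readLine();
                  p.waitFor();
                  return line;
                } finally {
                  r.close();
                }
              }

              public static void main(String[] args) throws Exception {
                System.out.println("user:   " + firstLine("whoami"));
                System.out.println("groups: " + firstLine("bash", "-c", "groups"));
              }
            }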

          Brian Bockelman added a comment -

          Nah - I meant: how do we determine what the name of the superuser should be? I want to connect to Hadoop as superuser/supergroup, but you said hardcoding "superuser"/"supergroup" would not work in all cases.

          Tsz Wo Nicholas Sze added a comment -

          > I want to connect to Hadoop as superuser/supergroup, ...

          This sounds impossible or should not be allowed. Otherwise, it is very easy to hack the system.

          Currently, the superusers are either the user starting the namenode or the users who belong to the group specified by the conf property dfs.permissions.supergroup (the default is "supergroup"). There is no way for clients to get the superuser accounts in runtime.

          See also http://hadoop.apache.org/core/docs/r0.19.0/hdfs_permissions_guide.html#The+Super-User

          Craig Macdonald added a comment -

          This patch in effect corrects the Java API:

          • adds FsStatus in FileSystem, and makes DFSClient use this
          • DiskStatus in DistributedFileSystem is deprecated, but extends FsStatus.
          • FSNamesystem#getStats() no longer requires superuser privilege - all it does is return the values of three variables, so it does not add considerable load to the name node.
          • As suggested by Doug, the default FileSystem implementation returns Long.MAX_VALUE for free and capacity, and 0 for used. This could be fixed for LocalFileSystem using the java.io.File API; however, it depends on which partition of the local filesystem you want to query.
            • In this case, perhaps getStats(Path p) would be more appropriate? DistributedFileSystem would ignore the Path parameter (see the sketch below).
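
          As an illustration of that last sub-point, a sketch of a Path-aware variant; all names are hypothetical, and java.io.File's space methods require Java 6:

            import java.io.File;
            import java.io.IOException;

            // Sketch of a LocalFileSystem override; the default FileSystem
            // implementation would simply ignore the Path argument.
            public FsStatus getStatus(Path p) throws IOException {
              File partition = new File(p.toUri().getPath());
              long capacity = partition.getTotalSpace();             // Java 6+
              long free = partition.getFreeSpace();
              return new FsStatus(capacity, capacity - free, free);  // capacity, used, free (order assumed)
            }
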
          Doug Cutting added a comment -

          This is looking good. A few nits:

          • FsStatus should not be nested, but rather a standalone class.
          • FsStatus needs Javadoc.
          • We should add unit tests for this new public API (you're welcome, Nigel). These should be added as new test methods in existing unit test classes.
          Craig Macdonald added a comment -

          @Doug

          Will make these changes; however, before I do, do you have any comment on the last issue I raised for LocalFileSystem etc.?

          Craig Macdonald added a comment -

          Addressed Doug's comments. Also made FsStatus Writable, because FileStatus is also Writable.

          For testing, I tried asserting that capacity = free + used, but this doesn't seem to be the case for HDFS, and could feasibly be untrue on other filesystem implementations, so I omitted this from the test.
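
          For reference, a sketch of the weaker assertions that do remain safe inside a JUnit test; method names follow the API discussed above:

            // Sketch: capacity == free + used need not hold (e.g. non-DFS use of
            // the underlying disks), so assert only the weak invariants.
            FsStatus status = fs.getStatus();
            assertTrue(status.getCapacity() >= 0);
            assertTrue(status.getUsed() >= 0 && status.getUsed() <= status.getCapacity());
            assertTrue(status.getFree() >= 0 && status.getFree() <= status.getCapacity());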

          Brian Bockelman added a comment -

          Craig,

          Thank you very much for helping out. Regrettably, I have not had the time to contribute/write this patch myself.

          Brian

          Tsz Wo Nicholas Sze added a comment -

          We should also move the command from DFSAdmin to FsShell as mentioned in a previous comment.

          Doug Cutting added a comment -

          Craig> any comment on the last issue I raised for LocalFileSystem etc?

          Hmm. I guess it makes sense to make it FileSystem#getStats(Path), so that LocalFileSystem can report different stats for different volumes. Should we also define FileSystem#getStats() to call this with the root path by default as a convenience?
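
          That convenience could be as small as the following sketch; naming varies through this thread, and getStatus is used here for consistency with the earlier proposal:

            // In FileSystem (sketch): the no-argument form delegates to the root path.
            public FsStatus getStatus() throws IOException {
              return getStatus(new Path("/"));
            }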

          Craig Macdonald added a comment -

          Updated patch attached, addressing Nicholas' and Doug's comments.

          @Nicholas: I have added a -df command to FsShell, which has output similar to GNU coreutils' df command. I left the DFSAdmin output as it was, since I thought it was (a) useful, and (b) gives more interpretation of the reported figures for the HDFS scenario than FsShell#df() should.

          NB: The java.io.File getTotalSpace() etc. API works fine for normally mounted local and NFS partitions on Linux, but for partitions mounted by autofs, I found it always gave 0s. This seems to be a problem with these API methods, which are new in Java 6. I couldn't find reference to such a problem anywhere on the web.

          Two questions:
          1. Should I provide ant-patch output myself?
          2. Should I fix libhdfs in this issue as well?

          Doug Cutting added a comment -

          > 1. Should I provide ant-patch output myself?

          Sure, it never hurts to do this.

          > 2. Should I fix libhdfs in this issue as well?

          If you like, but I don't see it as a requirement for this issue.

          Craig Macdonald added a comment -

          @Doug: Brian, the original poster, was concerned about lookups from fuse_dfs. In actual fact the problem lies with libhdfs, and with the provision of an adequate Java API. In this case, fixing libhdfs as well as the Java API should resolve the entire issue.

          Attached a work-in-progress patch, which attempts to add libhdfs support for the FsStatus API. If anyone knows JNI, perhaps they can explain why it is calling append(Path f, int bufferSize) instead of getStatus()? I'm mystified.

          $  ant -Dcompile.c++=true -Dlibhdfs=true clean compile test-c++-libhdfs
          <snip>
               [exec] Exception in thread "main" java.lang.NullPointerException
               [exec]     at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:89)
               [exec]     at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:154)
               [exec]     at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:185)
               [exec]     at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:531)
               [exec] Call to org.apache.hadoop.fs.FsStatus::getCapacity failed!
               [exec] hdfsGetCapacity: -1
               [exec] Exception in thread "main" java.lang.NullPointerException
               [exec]     at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:89)
               [exec]     at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:154)
               [exec]     at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:185)
               [exec]     at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:531)
               [exec] Call to org.apache.hadoop.fs.FsStatus::getUsed failed!
          <snip>
          
          Craig Macdonald added a comment -

          Changed components to dfs, libhdfs

          Craig Macdonald added a comment -

          New patch attached, all working, all tests passing. This patch is ready for review.

          Tsz Wo Nicholas Sze added a comment -

          The patch looks good to me. The only problem I can see is that FSNamesystem.getStats() should no longer be declared with "throws IOException". Similarly, NameNode.getStats() should no longer be declared with "throws IOException".

          Some minor suggestions:

          • remove getCapacityTotal(), getCapacityUsed(), getCapacityUsedPercent(), getCapacityUsedNonDFS() and getCapacityRemaining() in FSNamesystem. These methods are mostly used in a test; all their uses can be replaced by getStats().
          • deprecate getRawCapacity() and getRawUsed() in DistributedFileSystem and remove totalRawCapacity() and totalRawUsed() in DFSClient.

          It would be great if we could remove DiskStatus instead of deprecating it. However, it seems we can't because it is public.

          Thank you for working on this, Craig.

          Craig Macdonald added a comment -

          Canceling patch until I address Nicholas' (straightforward) comments.

          Craig Macdonald added a comment -

          @Nicholas: It appears that getCapacityTotal(), getCapacityUsed() and getCapacityRemaining() are all exported in the FSNamesystemMBean interface. This is no longer in the public javadoc, but still exists.

          Tsz Wo Nicholas Sze added a comment -

          > It appears that getCapacityTotal(), getCapacityUsed() and getCapacityRemaining() are all exported in the FSNamesystemMBean interface.

          How about we use getStats() in FSNamesystemMBean?

          Craig Macdonald added a comment -

          Here's what I propose:

          • getCapacityTotal(), getCapacityUsed() and getCapacityRemaining() remain in FSNamesystem, but are implemented using the return value of getStats() (see the sketch below).
          • I don't know JMX, but I think deprecating and ultimately removing the above getCapacity*() methods in favour of getStats() would make using the relevant statistics more difficult?
          • Other changes to DistributedFileSystem and DFSClient remain as in your original review.
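
          A sketch of the first bullet, under the assumption that getStats() returns the three counters as a long[] in the order capacity, used, remaining:

            // Sketch: keep the FSNamesystemMBean accessors, derived from getStats().
            public long getCapacityTotal()     { return getStats()[0]; }
            public long getCapacityUsed()      { return getStats()[1]; }
            public long getCapacityRemaining() { return getStats()[2]; }
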
          Tsz Wo Nicholas Sze added a comment -

          > I don't know JMX, but I think deprecating and ultimately removing the above getCapacity*() methods in favour of getStats() would make using the relevant statistics more difficult?

          I see your points. Let's keep these methods.

          Craig Macdonald added a comment -

          Updated patch addressing agreed changes.

          Tsz Wo Nicholas Sze added a comment -

          • FsStatus.java is not included in hadoop4368.fsstatus.v6.patch.
          • When deprecating methods, please also add the annotation tag @Deprecated.

          Everything else looks good.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12399122/hadoop4368.fsstatus.v6.patch
          against trunk revision 739416.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 46 javac compiler warnings (more than the trunk's current 2563 warnings).

          -1 findbugs. The patch appears to cause Findbugs to fail.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3785/testReport/
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3785/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3785/console

          This message is automatically generated.

          Craig Macdonald added a comment -

          Ignore the Hadoop QA results; the patch did not include all files. An improved patch (v7) is now attached.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12399212/hadoop4368.fsstatus.v7.patch
          against trunk revision 739416.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3788/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3788/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3788/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3788/console

          This message is automatically generated.

          Craig Macdonald added a comment -

          The contrib test that failed is org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.TestStartAtOffset.testStartAfterOffset (by timeout), which is not related to this patch.

          I think this patch is ready.

          Tsz Wo Nicholas Sze added a comment -

          Craig, please add a release note since this is a new feature (a new FsShell command).

          Tsz Wo Nicholas Sze added a comment -

          I just committed this. Thanks, Craig!

          Craig Macdonald added a comment -

          Added release note.

          Hudson added a comment -

          Integrated in Hadoop-trunk #756 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/756/)
          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21.

          Ravi Phulari added a comment -

          Attaching test plan.


            People

            • Assignee: Craig Macdonald
            • Reporter: Brian Bockelman
            • Votes: 0
            • Watchers: 7


                Time Tracking

                Estimated: 10m
                Remaining: 10m
                Logged: Not Specified
