Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

Description

      Currently permissions are unsupported by fuse-dfs.

      This manifests itself as two issues:

      • Users accessing a fuse-dfs mount do so as the user running the fuse_dfs executable. In this case, it would be better to run fuse-dfs as some privileged user, and use Hadoop API calls to determine whether the current user is privileged enough to perform the action.
      • Users cannot view/change permissions on the mounted volume. See HADOOP-3264
Attachments

    1. getlogininfo.c (3 kB) - Craig Macdonald

Issue Links

Activity

          Craig Macdonald added a comment -

          Done - I've split this issue into four sub-issues. Two are requirements for fuse-dfs, and there are two corresponding issues in libhdfs.

          Pete Wyckoff added a comment -

          +1

          Craig Macdonald added a comment - edited

          Pete,

          There are a few issues here:

          What does FUSE support:

          • FUSE exposes the identity (uid/gid) of the user performing each operation (see the sketch below).
          • FUSE allows attributes and permissions to be retrieved and set.
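
          For reference, a minimal sketch of the first point, assuming the standard FUSE 2.x C API (the dfs_getattr handler name below is hypothetical, not taken from fuse-dfs):

          #define FUSE_USE_VERSION 26
          #include <fuse.h>
          #include <string.h>
          #include <sys/stat.h>

          /* FUSE hands every callback the identity of the calling process via
           * fuse_get_context(); a permission-aware fuse-dfs would map this
           * uid/gid to a Hadoop username and groups before touching the DFS. */
          static int dfs_getattr(const char *path, struct stat *st)
          {
              struct fuse_context *ctx = fuse_get_context();
              uid_t caller_uid = ctx->uid;   /* user issuing this request */
              gid_t caller_gid = ctx->gid;   /* that user's primary group */

              (void)path; (void)caller_uid; (void)caller_gid;

              /* Placeholder metadata; a real handler would fetch the file
               * status from the DFS as the calling user. */
              memset(st, 0, sizeof(*st));
              st->st_mode = S_IFDIR | 0755;
              return 0;
          }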

          What does fuse-dfs not support, but should:

          • fuse-dfs should access the DFS as the requesting user ("asuser"), thereby upholding the Hadoop security model
          • fuse-dfs should allow permissions to be set & retrieved from the DFS

          fuse-dfs is bound by the API provided by libhdfs. In this case, the missing features are:

          • libhdfs should allow connection to the DFS as a given user
          • libhdfs should allow permissions to be set and retrieved. HADOOP-3264

          I propose splitting this JIRA up into several issues:

          • fuse-dfs user impersonation
          • fuse-dfs permissions set & get

          And add the missing libhdfs-dependent JIRA:

          • Allow connecting username to be specified in libhdfs

          This JIRA will remain as an overview issue.

          Pete Wyckoff added a comment -

          Is this issue really about "respecting" permissions in the fuse module, or also about the ability to set permissions? Maybe it should be divided in two, since it seems the underlying APIs needed in libhdfs may end up as two JIRAs themselves?

          Craig Macdonald added a comment -

          In all cases, a process will sometimes need to have an 'asuser' privilege - which means it needs to access the DFS as other, arbitrary users. Perhaps the libhdfs API would not need to change in the future, but the implementation to support the permissions may change.

          The crucial point is how Hadoop tells whether the current user has the 'asuser' privilege.

          Also, I'm using uid_t/gid_t to designate a user; (char* username, char** groupnames) would be OK by me as well.

          C

          Allen Wittenauer added a comment - edited

          /thinking out loud

          I wonder if this is the correct long-term approach for if/when Hadoop gets a real authentication API. I know it doesn't exist yet, but...

          I guess I'm concerned about adding APIs that might be deprecated sooner rather than later.

          Craig Macdonald added a comment -

          Code to determine:

          • uid -> username
          • uid -> username, all group names

          This code could be used to set the hadoop.job.ugi property to obtain the correct FileSystem object for each user accessing the DFS.
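
          For reference, a minimal sketch of that kind of lookup, assuming plain POSIX/glibc calls (the attached getlogininfo.c is not reproduced here and may differ in detail):

          #include <sys/types.h>
          #include <grp.h>
          #include <pwd.h>
          #include <stdio.h>

          /* Resolve a uid to "username,group1,group2,..." - the comma-separated
           * format taken by the hadoop.job.ugi property. */
          int print_login_info(uid_t uid)
          {
              struct passwd *pw = getpwuid(uid);
              if (pw == NULL)
                  return -1;                /* unknown uid */

              gid_t groups[64];
              int ngroups = 64;
              if (getgrouplist(pw->pw_name, pw->pw_gid, groups, &ngroups) < 0)
                  return -1;                /* more than 64 groups; a real
                                               implementation would retry */

              printf("%s", pw->pw_name);
              for (int i = 0; i < ngroups; i++) {
                  struct group *gr = getgrgid(groups[i]);
                  if (gr != NULL)
                      printf(",%s", gr->gr_name);
              }
              printf("\n");
              return 0;
          }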

          Craig Macdonald added a comment - edited

          Ok, this would require some minor changes in libhdfs API.

          Currently, the connection API is:

          hdfsFS hdfsConnect(const char* host, tPort port);
          

          I suggest adding one or two additional API calls, to allow connection as a given user:

          /** 
            * hdfsConnect - Connect to a hdfs file system as the specified user, and all
            * of his/her groups
            */
          hdfsFS hdfsConnect(const char* host, tPort port, uid_t uid);
          
          /** 
            * hdfsConnect - Connect to a hdfs file system as the specified user, and only the specified group
            */
          hdfsFS hdfsConnect(const char* host, tPort port, uid_t uid, gid_t gid);
          
          

          This would require libhdfs to achieve two tasks:

          • For a given uid, determine the username and all his/her groups (names), and use these to access a FileSystem object
          • For a given uid and gid, determine the username and the specified group's name, and use these to access a FileSystem object

          Code to achieve this is attached.
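
          For illustration only, a rough sketch of how fuse-dfs might consume such a call. The uid-taking hdfsConnect is just the proposal above, not an existing libhdfs function, and since C has no overloading a real header would need a distinct name for it; the type stand-ins avoid depending on hdfs.h:

          #define FUSE_USE_VERSION 26
          #include <fuse.h>
          #include <stdint.h>
          #include <sys/types.h>

          typedef uint16_t tPort;    /* stand-in for the hdfs.h port type */
          typedef void    *hdfsFS;   /* stand-in for the opaque hdfs.h handle */

          /* Proposed (not yet existing) connection call, restated from above. */
          hdfsFS hdfsConnect(const char *host, tPort port, uid_t uid);

          /* Open a DFS connection carrying the identity of whichever user
           * triggered the current FUSE operation.  A real fuse-dfs would cache
           * one connection per uid rather than reconnect on every request. */
          static hdfsFS connect_as_caller(const char *host, tPort port)
          {
              uid_t uid = fuse_get_context()->uid;
              return hdfsConnect(host, port, uid);
          }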

          Doug Cutting added a comment -

          > Doug, so you're suggesting essentially a FileSystem object open for each user accessing the DFS?

          Yes. FileSystem.get() caches instances based on username, and the RPC system caches connections based on username. So applications should not need to do more than make sure they pass a Configuration that contains the user's credentials. Currently, credentials are set/get in a Configuration only by UnixUserGroupInformation.java.

          Craig Macdonald added a comment -

          Doug, so you're suggesting essentially a FileSystem object open for each user accessing the DFS?

          The alternative is having mounts for every user (which is still in the spirit of fuse).

          Doug Cutting added a comment -

          > it would be better to run fuse-dfs as some privileged user, and use Hadoop API calls to determine whether the current user is privileged

          This sounds potentially dangerous and expensive. If the fuse code has the username making the request, it can set it in the configuration passed to FileSystem.get(URI, Configuration).


People

    • Assignee: Unassigned
    • Reporter: Craig Macdonald
    • Votes: 0
    • Watchers: 3
