Hadoop Common
  1. Hadoop Common
  2. HADOOP-3264

libhdfs does not provide permissions in API

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Duplicate
    • Affects Version/s: 0.16.2
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      There is no support in libhdfs methods for obtaining or setting the permissions, owner or group-owner of files in hdfs.

      1. HADOOP-3264.2.txt
        15 kB
        Pete Wyckoff
      2. libhdfs_permissions.v1.patch
        6 kB
        Craig Macdonald
      3. libhdfs_permissions.v2.patch
        6 kB
        Craig Macdonald

        Issue Links

          Activity

          Craig Macdonald created issue -
          Hide
          Craig Macdonald added a comment -

          First stab at a patch for this. This patches the 'get' type-method, and the hdfs_test program. No 'set' type-methods yet.

          Show
          Craig Macdonald added a comment - First stab at a patch for this. This patches the 'get' type-method, and the hdfs_test program. No 'set' type-methods yet.
          Craig Macdonald made changes -
          Field Original Value New Value
          Attachment libhdfs_permissions.v1.patch [ 12380390 ]
          Craig Macdonald made changes -
          Attachment libhdfs_permissions.v1.patch [ 12380390 ]
          Hide
          Craig Macdonald added a comment -

          First stab at a patch for this issue. Supplements the hdfsFileInfo struct with permissions/owner/group information, and appropriately updates hdfs_test program. Does not support the changing of any permissions/ownership/group information.

          (license granted this time)

          Show
          Craig Macdonald added a comment - First stab at a patch for this issue. Supplements the hdfsFileInfo struct with permissions/owner/group information, and appropriately updates hdfs_test program. Does not support the changing of any permissions/ownership/group information. (license granted this time)
          Craig Macdonald made changes -
          Attachment libhdfs_permissions.v1.patch [ 12380391 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          There is a bug in the patch: In permission_disp, we need

          short permissionsId = (permissions >> (i * 3)) & (short)7;
          
          Show
          Tsz Wo Nicholas Sze added a comment - There is a bug in the patch: In permission_disp, we need short permissionsId = (permissions >> (i * 3)) & ( short )7;
          Hide
          Craig Macdonald added a comment - - edited

          Updated patch addresses Nicholas' comment. My bit shifting arithmetic never was any good. Displayed permissions now match expected default permissions (rwxr-xr-x for directories, rw-r-r- for files).

          Show
          Craig Macdonald added a comment - - edited Updated patch addresses Nicholas' comment. My bit shifting arithmetic never was any good. Displayed permissions now match expected default permissions (rwxr-xr-x for directories, rw-r- r - for files).
          Craig Macdonald made changes -
          Attachment libhdfs_permissions.v2.patch [ 12380427 ]
          Hide
          Craig Macdonald added a comment -

          For the set methods, what sort of methods should be designed?

          Option 1:

          One method that takes a reference to a hdfsFileInfo struct and updates all details of the path, or the details that are not null.

          int hdfsSetPathInfo(hdfsFS fs, const char* path, hdfsFileInfo* pathinfo);
          

          Option 2:

          Various methods for setting various properties of a file. Of note, there is a
          hdfsSetReplication method already.

          int hdfsSetPathOwner(hdfsFS fs, const char* path, char* owner);
          int hdfsSetPathGroup(hdfsFS fs, const char* path, char* owner);
          int hdfsSetPathPermissions(hdfsFS fs, const char* path, short permissions);
          
          int hdfsSetPathLastMod(hdfsFS fs, const char* path, tTime newtime);
          
          
          Show
          Craig Macdonald added a comment - For the set methods, what sort of methods should be designed? Option 1: One method that takes a reference to a hdfsFileInfo struct and updates all details of the path, or the details that are not null. int hdfsSetPathInfo(hdfsFS fs, const char * path, hdfsFileInfo* pathinfo); Option 2: Various methods for setting various properties of a file. Of note, there is a hdfsSetReplication method already. int hdfsSetPathOwner(hdfsFS fs, const char * path, char * owner); int hdfsSetPathGroup(hdfsFS fs, const char * path, char * owner); int hdfsSetPathPermissions(hdfsFS fs, const char * path, short permissions); int hdfsSetPathLastMod(hdfsFS fs, const char * path, tTime newtime);
          Hide
          Craig Macdonald added a comment -

          Pinging... does anyone have any thoughts on the proposed methods?

          Show
          Craig Macdonald added a comment - Pinging... does anyone have any thoughts on the proposed methods?
          Craig Macdonald made changes -
          Link This issue blocks HADOOP-3536 [ HADOOP-3536 ]
          Hide
          Pete Wyckoff added a comment -

          I like the latter methods as the APIs are better defined and called out, but I think the former single method is more in the spirit of hadoop and having a smaller #of apis but with params that can be extended.

          But, I do think the latter 4 methods are a better defined API. What happens when the pathinfo struct changes and old code doesn't know to set some needed field but still compiles against a newer version that's expecting that field. Unless the new code takes are to check the hdfsFileInfo structure is properly set, things could go haywire. e.g., adding last accessed int which is not set could result in last accessed being set to 1970.

          – pete

          Show
          Pete Wyckoff added a comment - I like the latter methods as the APIs are better defined and called out, but I think the former single method is more in the spirit of hadoop and having a smaller #of apis but with params that can be extended. But, I do think the latter 4 methods are a better defined API. What happens when the pathinfo struct changes and old code doesn't know to set some needed field but still compiles against a newer version that's expecting that field. Unless the new code takes are to check the hdfsFileInfo structure is properly set, things could go haywire. e.g., adding last accessed int which is not set could result in last accessed being set to 1970. – pete
          Hide
          Craig Macdonald added a comment -

          I'm not particularly opinionated on either options. Could we rely on .so versioning to flag incompatible API changes? I cant really comment on the versioning of C APIs.

          The current implemented methods in libhdfs that are related are

          hdfsFileInfo *hdfsGetPathInfo(hdfsFS fs, const char* path);
          char*** hdfsGetHosts(hdfsFS fs, const char* path, 
                      tOffset start, tOffset length);
          int hdfsSetReplication(hdfsFS fs, const char* path, int16_t replication);
          

          so in a sense, we have both kinds of methods going on here.

          Arun is the original libhdfs contributor, so perhaps he can comment on a preferred API style?

          C

          Show
          Craig Macdonald added a comment - I'm not particularly opinionated on either options. Could we rely on .so versioning to flag incompatible API changes? I cant really comment on the versioning of C APIs. The current implemented methods in libhdfs that are related are hdfsFileInfo *hdfsGetPathInfo(hdfsFS fs, const char * path); char *** hdfsGetHosts(hdfsFS fs, const char * path, tOffset start, tOffset length); int hdfsSetReplication(hdfsFS fs, const char * path, int16_t replication); so in a sense, we have both kinds of methods going on here. Arun is the original libhdfs contributor, so perhaps he can comment on a preferred API style? C
          Craig Macdonald made changes -
          Link This issue blocks HADOOP-3766 [ HADOOP-3766 ]
          Craig Macdonald made changes -
          Link This issue blocks HADOOP-3536 [ HADOOP-3536 ]
          Craig Macdonald made changes -
          Link This issue is related to HADOOP-3536 [ HADOOP-3536 ]
          Ben Slusky made changes -
          Link This issue is related to HADOOP-3549 [ HADOOP-3549 ]
          Hide
          Pete Wyckoff added a comment -

          I now think the best thing is to go "posix-esque"

          
          int chown(hdfsFS fs, const char *path, uid_t owner, gid_t group);
          int chmod(hdfsFS fs, const char *path, mode_t mode)
          int utime(hdfsFS fs, const char *filename, const struct utimbuf *buf);
          
          

          Now, if we're worried that uid_t, gid_t, mode_t, utimebuf are not portable to windows cygwin or others, then we could just define our own versions of these, but probably to whatever extent possible, make them identical to posix ones.

          • pete
          Show
          Pete Wyckoff added a comment - I now think the best thing is to go "posix-esque" int chown(hdfsFS fs, const char *path, uid_t owner, gid_t group); int chmod(hdfsFS fs, const char *path, mode_t mode) int utime(hdfsFS fs, const char *filename, const struct utimbuf *buf); Now, if we're worried that uid_t, gid_t, mode_t, utimebuf are not portable to windows cygwin or others, then we could just define our own versions of these, but probably to whatever extent possible, make them identical to posix ones. pete
          Hide
          Pete Wyckoff added a comment -

          Here's what I was thinking. This is not prime time yet as it's not tested, (but it compiles ).

          No unit tests or anything yet, so not really submitting, but thought i would post what i added to Craig's patch.

          – pete

          Show
          Pete Wyckoff added a comment - Here's what I was thinking. This is not prime time yet as it's not tested, (but it compiles ). No unit tests or anything yet, so not really submitting, but thought i would post what i added to Craig's patch. – pete
          Pete Wyckoff made changes -
          Attachment HADOOP-3264.2.txt [ 12389626 ]
          Hide
          Craig Macdonald added a comment -

          Pete,

          This looks very good. I have a few minor comments:

          • it appears that some of the changes from HADOOP-3963 have bled into your patch. Can you recreate the patch once HADOOP-3963 is committed?
          • src/c++/libhdfs/hdfs_test.c - I think that some of my work in progress changes are missing from your patch.
          • We need to add some test cases. I guess the test run will depend on the user running the test - can we test chmod if there are no other users to chmod a file to?

          C

          Show
          Craig Macdonald added a comment - Pete, This looks very good. I have a few minor comments: it appears that some of the changes from HADOOP-3963 have bled into your patch. Can you recreate the patch once HADOOP-3963 is committed? src/c++/libhdfs/hdfs_test.c - I think that some of my work in progress changes are missing from your patch. We need to add some test cases. I guess the test run will depend on the user running the test - can we test chmod if there are no other users to chmod a file to? C
          Hide
          Pete Wyckoff added a comment -

          > it appears that some of the changes from HADOOP-3963 have bled into your patch. Can you recreate the patch once HADOOP-3963 is committed?

          I know - I should get my act together and use git

          Let me take your test cases and add to them. I only included hdfs.c/h because I had 3963 and 3344 applied

          Show
          Pete Wyckoff added a comment - > it appears that some of the changes from HADOOP-3963 have bled into your patch. Can you recreate the patch once HADOOP-3963 is committed? I know - I should get my act together and use git Let me take your test cases and add to them. I only included hdfs.c/h because I had 3963 and 3344 applied
          Pete Wyckoff made changes -
          Assignee Pete Wyckoff [ wyckoff ]
          Pete Wyckoff made changes -
          Link This issue is duplicated by HADOOP-4104 [ HADOOP-4104 ]
          Pete Wyckoff made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Doug Cutting made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s libhdfs [ 12311345 ]

            People

            • Assignee:
              Pete Wyckoff
              Reporter:
              Craig Macdonald
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development