Hive
  1. Hive
  2. HIVE-2936

Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: Metastore
    • Labels:
      None

      Description

      When the Hive Metastore creates a subdirectory in the Hive warehouse for
      a new table it does so with the default HDFS permissions derived from dfs.umask or dfs.umaskmode. There should be a option to inherit the permissions of the parent directory (default warehouse or custom database directory) so that the table directories have the same permissions as the database directories.

      1. HIVE-2936-2.patch
        8 kB
        Rohini Palaniswamy
      2. ASF.LICENSE.NOT.GRANTED--HIVE-2504.patch
        2 kB
        Rohini Palaniswamy
      3. ASF.LICENSE.NOT.GRANTED--HIVE-2504.patch
        7 kB
        Rohini Palaniswamy
      4. ASF.LICENSE.NOT.GRANTED--HIVE-2504-1.patch
        8 kB
        Rohini Palaniswamy

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          Integrated in Hive-trunk-hadoop2 #54 (See https://builds.apache.org/job/Hive-trunk-hadoop2/54/)
          HIVE-2936 : Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory (Rohini via Ashutosh Chauhan) (Revision 1325791)

          Result = ABORTED
          hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325791
          Files :

          • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          • /hive/trunk/conf/hive-default.xml.template
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java
          • /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestEmbeddedHiveMetaStore.java
          • /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
          Show
          Hudson added a comment - Integrated in Hive-trunk-hadoop2 #54 (See https://builds.apache.org/job/Hive-trunk-hadoop2/54/ ) HIVE-2936 : Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory (Rohini via Ashutosh Chauhan) (Revision 1325791) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325791 Files : /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java /hive/trunk/conf/hive-default.xml.template /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestEmbeddedHiveMetaStore.java /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
          Hide
          Ashutosh Chauhan added a comment -

          This issue is closed now. It was released with the fix in 0.9.0. If there is a problem, please open a new jira and link this one with that.

          Show
          Ashutosh Chauhan added a comment - This issue is closed now. It was released with the fix in 0.9.0. If there is a problem, please open a new jira and link this one with that.
          Hide
          Hudson added a comment -

          Integrated in Hive-trunk-h0.21 #1372 (See https://builds.apache.org/job/Hive-trunk-h0.21/1372/)
          HIVE-2936 : Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory (Rohini via Ashutosh Chauhan) (Revision 1325791)

          Result = ABORTED
          hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325791
          Files :

          • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          • /hive/trunk/conf/hive-default.xml.template
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java
          • /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestEmbeddedHiveMetaStore.java
          • /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
          Show
          Hudson added a comment - Integrated in Hive-trunk-h0.21 #1372 (See https://builds.apache.org/job/Hive-trunk-h0.21/1372/ ) HIVE-2936 : Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory (Rohini via Ashutosh Chauhan) (Revision 1325791) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325791 Files : /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java /hive/trunk/conf/hive-default.xml.template /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestEmbeddedHiveMetaStore.java /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
          Hide
          Ashutosh Chauhan added a comment -

          Committed.

          Show
          Ashutosh Chauhan added a comment - Committed.
          Hide
          Ashutosh Chauhan added a comment -

          +1 will commit, if tests pass.

          Show
          Ashutosh Chauhan added a comment - +1 will commit, if tests pass.
          Hide
          Rohini Palaniswamy added a comment -

          Patch with Ashutosh's review comments incorporated.

          Show
          Rohini Palaniswamy added a comment - Patch with Ashutosh's review comments incorporated.
          Hide
          Ashutosh Chauhan added a comment -
          if (this.inheritPerms) {
                  try {
                     return fs.getFileStatus(f).isDir();
                  } catch (FileNotFoundException fnfe) {}
                }
          
                boolean success = fs.mkdirs(f);
          
                if (this.inheritPerms && success) {
                  // Set the permission of parent directory.
                  fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission());
                }
                return success;
          

          this improves on urs by just doing 1 call for "inheritPerms on existing path."

          Show
          Ashutosh Chauhan added a comment - if ( this .inheritPerms) { try { return fs.getFileStatus(f).isDir(); } catch (FileNotFoundException fnfe) {} } boolean success = fs.mkdirs(f); if ( this .inheritPerms && success) { // Set the permission of parent directory. fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission()); } return success; this improves on urs by just doing 1 call for "inheritPerms on existing path."
          Hide
          Rohini Palaniswamy added a comment -
          • The else call is not required. If the path already existed and was a directory, mkdirs returns true else it throws FileAlreadyExistsException. That is why need to check before if it exists. If the directory already existed, we should not be setting the parent's permission on it.
          • The fs.exists(f.getParent()) needs to be done before the fs.mkdirs() else it will always be true. One thing that can be done is remove fs.exists(f.getParent()) check and always do a setPermission. If the parent directory was created during the mkdirs(), then the setPermission call will be redundant and just set the same permission again but if the parent existed, then instead of two calls there will be only one call.
          if (this.inheritPerms && fs.exists(f)) {
                  return fs.getFileStatus(f).isDir();
                }
                boolean success = fs.mkdirs(f);
                if (this.inheritPerms && success) {
                  // Set the permission of parent directory.
                  fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission());
                }
                return success;
          

          With this, without inheritPerms

          • 1 call to create the directory. fs.mkdirs() will throw an exception if the path already existed and was a file.

          with inheritPerms

          • There will be 2 calls if the path already existed.
          • There will be 3 calls, to check and create the path and set the permission.

          Does that sound ok?

          Show
          Rohini Palaniswamy added a comment - The else call is not required. If the path already existed and was a directory, mkdirs returns true else it throws FileAlreadyExistsException. That is why need to check before if it exists. If the directory already existed, we should not be setting the parent's permission on it. The fs.exists(f.getParent()) needs to be done before the fs.mkdirs() else it will always be true. One thing that can be done is remove fs.exists(f.getParent()) check and always do a setPermission. If the parent directory was created during the mkdirs(), then the setPermission call will be redundant and just set the same permission again but if the parent existed, then instead of two calls there will be only one call. if ( this .inheritPerms && fs.exists(f)) { return fs.getFileStatus(f).isDir(); } boolean success = fs.mkdirs(f); if ( this .inheritPerms && success) { // Set the permission of parent directory. fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission()); } return success; With this, without inheritPerms 1 call to create the directory. fs.mkdirs() will throw an exception if the path already existed and was a file. with inheritPerms There will be 2 calls if the path already existed. There will be 3 calls, to check and create the path and set the permission. Does that sound ok?
          Hide
          Ashutosh Chauhan added a comment -

          I see. Agree on both points. Another point is this increases number on calls on nn, it will be good to reduce that if possible. How about following:

            boolean success = fs.mkdirs(f);
                if(success) {
                  if (this.inheritPerms && fs.exists(f.getParent())) {
                    try {
                      fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission());
                    } catch (IOException ioe) {
                      LOG.equals("Failed to set permissions");
                      success = false;
                    }
          
                  }
                } else {
                  return fs.getFileStatus(f).isDir();
                }
                return success;
          

          How about this. The case of returning false if you fail to set Permissions is not clear. I return false, what you think ?

          Show
          Ashutosh Chauhan added a comment - I see. Agree on both points. Another point is this increases number on calls on nn, it will be good to reduce that if possible. How about following: boolean success = fs.mkdirs(f); if (success) { if ( this .inheritPerms && fs.exists(f.getParent())) { try { fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission()); } catch (IOException ioe) { LOG.equals( "Failed to set permissions" ); success = false ; } } } else { return fs.getFileStatus(f).isDir(); } return success; How about this. The case of returning false if you fail to set Permissions is not clear. I return false, what you think ?
          Hide
          Rohini Palaniswamy added a comment -
          • The parent does not exist scenario can happen when you are creating databases with a location specified. Had the parent existence check because if you are creating "CREATE DATABASE IF NOT EXISTS db1 LOCATION '/projects/myproj/data/db1'"; , there are chances that data directory does not exist. In that case, instead of finding the first top level directory that exists and using its permission, I let the directory be created with the default dfs umask applied permission.
          • It is not safe to use fs.mkdirs(path, permission) because the dfs umask is applied on that permission in DFSClient which is not desired. We have been bitten by wrong permission issues because of using that API. It is always safer to do mkdirs() and then do a setPermission() if you are dealing with HDFS.
          Show
          Rohini Palaniswamy added a comment - The parent does not exist scenario can happen when you are creating databases with a location specified. Had the parent existence check because if you are creating "CREATE DATABASE IF NOT EXISTS db1 LOCATION '/projects/myproj/data/db1'"; , there are chances that data directory does not exist. In that case, instead of finding the first top level directory that exists and using its permission, I let the directory be created with the default dfs umask applied permission. It is not safe to use fs.mkdirs(path, permission) because the dfs umask is applied on that permission in DFSClient which is not desired. We have been bitten by wrong permission issues because of using that API. It is always safer to do mkdirs() and then do a setPermission() if you are dealing with HDFS.
          Hide
          Ashutosh Chauhan added a comment -

          Looks good, couple of comments:

          • I dont see a need of parent existence check, parent will always exists. / is parent of itself.
          • fs.mkdirs() takes permission as an argument, so use that signature instead of first creating dirs and then setting perms.

          So, something like following:

           boolean success;
                if (this.inheritPerms) {
                  success = fs.mkdirs(f, fs.getFileStatus(f.getParent()).getPermission());
                } else {
                  success = fs.mkdirs(f);
                }
                return success;
          
          Show
          Ashutosh Chauhan added a comment - Looks good, couple of comments: I dont see a need of parent existence check, parent will always exists. / is parent of itself. fs.mkdirs() takes permission as an argument, so use that signature instead of first creating dirs and then setting perms. So, something like following: boolean success; if ( this .inheritPerms) { success = fs.mkdirs(f, fs.getFileStatus(f.getParent()).getPermission()); } else { success = fs.mkdirs(f); } return success;
          Hide
          Rohini Palaniswamy added a comment -

          Created a clone of HIVE-2904 to address some issues with inheriting permissions. Please refer to HIVE-2904 for prior history and discussions.

          Description of changes:
          Added hive.warehouse.subdir.inherit.perms which is false by default. The default behaviour of hive-0.8 stays. i.e directories will be created with the permissions of dfs.umask or dfs.umaskmode. If hive.warehouse.subdir.inherit.perms is set to true, then table directories will inherit the permission of the default warehouse or the custom database location. This comes in handy when you have databases created with different permissions or if the warehouse directory has permissions like 775.

          Show
          Rohini Palaniswamy added a comment - Created a clone of HIVE-2904 to address some issues with inheriting permissions. Please refer to HIVE-2904 for prior history and discussions. Description of changes: Added hive.warehouse.subdir.inherit.perms which is false by default. The default behaviour of hive-0.8 stays. i.e directories will be created with the permissions of dfs.umask or dfs.umaskmode. If hive.warehouse.subdir.inherit.perms is set to true, then table directories will inherit the permission of the default warehouse or the custom database location. This comes in handy when you have databases created with different permissions or if the warehouse directory has permissions like 775.

            People

            • Assignee:
              Rohini Palaniswamy
              Reporter:
              Rohini Palaniswamy
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development