Details

    • Type: Sub-task Sub-task
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Hide
      Introduces new hive config parameter -
      hive.security.command.whitelist .
      This can use used to restrict the set of commands that can be run. Currently supported command list is - "set,reset,dfs,add,delete" and by default all these commands are supported. If you want to restrict any of these commands, you can set the config to a value that does not have the command in it.
      Show
      Introduces new hive config parameter - hive.security.command.whitelist . This can use used to restrict the set of commands that can be run. Currently supported command list is - "set,reset,dfs,add,delete" and by default all these commands are supported. If you want to restrict any of these commands, you can set the config to a value that does not have the command in it.

      Description

      From here: https://issues.apache.org/jira/browse/HIVE-5253?focusedCommentId=13782220&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13782220

      I think we should afford admins who want to disable this functionality the ability to do so. Since such admins might want to disable other commands such as add or dfs, it wouldn't be much trouble to allow them to do this as well. For example we could have a configuration option "hive.available.commands" (or similar) which specified add,set,delete,reset, etc by default. Then check this value in CommandProcessorFactory. It would probably make sense to add this property to the restrict list.

      1. HIVE-5400.patch
        7 kB
        Brock Noland
      2. HIVE-5400.patch
        23 kB
        Brock Noland
      3. HIVE-5400.patch
        26 kB
        Brock Noland

        Issue Links

          Activity

          Brock Noland created issue -
          Hide
          Edward Capriolo added a comment -

          Do we have a way to enforce this? Is their any command property that a user can not edit?

          ssh hacker@hive
          #bin/hive
          hive > set "hive.available.commands"="all"

          Show
          Edward Capriolo added a comment - Do we have a way to enforce this? Is their any command property that a user can not edit? ssh hacker@hive #bin/hive hive > set "hive.available.commands"="all"
          Hide
          Brock Noland added a comment -

          Not for users that invoke hive directly, but I know many environments which use the restrict list + hive server 2 to disallow setting of many properties.

          Show
          Brock Noland added a comment - Not for users that invoke hive directly, but I know many environments which use the restrict list + hive server 2 to disallow setting of many properties.
          Hide
          Edward Capriolo added a comment -

          It is important to note that Apache Hive does not have to concern itself with how this feature functions with external tools like Sentry. Hive's security model is optimistic and by no means definitive. I do not see it as a blocker because none of the other Processors provide any security (including the !) which allows people the ability to launch local commands.

          That being said, I do not have issue building the feature in such a way that allows some levels of control. Which is simple because in essence this is a very small patch.

          Show
          Edward Capriolo added a comment - It is important to note that Apache Hive does not have to concern itself with how this feature functions with external tools like Sentry. Hive's security model is optimistic and by no means definitive. I do not see it as a blocker because none of the other Processors provide any security (including the !) which allows people the ability to launch local commands. That being said, I do not have issue building the feature in such a way that allows some levels of control. Which is simple because in essence this is a very small patch.
          Hide
          Edward Capriolo added a comment -

          Not for users that invoke hive directly, but I know many environments which use the restrict list + hive server 2 to disallow setting of many properties.

          Ok then the feature seems fairly simple to create and manage then.

          Show
          Edward Capriolo added a comment - Not for users that invoke hive directly, but I know many environments which use the restrict list + hive server 2 to disallow setting of many properties. Ok then the feature seems fairly simple to create and manage then.
          Edward Capriolo made changes -
          Field Original Value New Value
          Assignee Edward Capriolo [ appodictic ]
          Hide
          Brock Noland added a comment -

          It is important to note that Apache Hive does not have to concern itself with how this feature functions with external tools like Sentry.

          Sorry I wasn't more clear...since Sentry is so new I was specifically referring to production deployments sans Sentry.

          Ok then the feature seems fairly simple to create and manage then.

          Yeah it should be a pretty small patch. Unless you are already working on it, I would have no issue taking this one up.

          Show
          Brock Noland added a comment - It is important to note that Apache Hive does not have to concern itself with how this feature functions with external tools like Sentry. Sorry I wasn't more clear...since Sentry is so new I was specifically referring to production deployments sans Sentry. Ok then the feature seems fairly simple to create and manage then. Yeah it should be a pretty small patch. Unless you are already working on it, I would have no issue taking this one up.
          Hide
          Brock Noland added a comment -

          Attached is a patch which implements this for HS2 only. I think that makes sense as Admins will not be able to stop Hive CLI users from bypassing this mechanism anyway.

          Show
          Brock Noland added a comment - Attached is a patch which implements this for HS2 only. I think that makes sense as Admins will not be able to stop Hive CLI users from bypassing this mechanism anyway.
          Brock Noland made changes -
          Attachment HIVE-5400.patch [ 12606178 ]
          Hide
          Brock Noland added a comment -

          Marking PA for HiveQA.

          Show
          Brock Noland added a comment - Marking PA for HiveQA.
          Brock Noland made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Edward Capriolo added a comment -

          Brock Noland

          Lets slow down a second. I think patching in this support for only hs2 is short sided. I think we do want to bring this code all the way down to the CLI, even if a local mode CLI can avoid this protection, I think completely skipping the local mode code path is the wrong way. Also I do not like the hard codes here:

          String[] commands = {"set", "dfs", "add", "delete"};
          

          We have already abstractions like Processors and a class that acts as a switchboard, I think they should have a way of describing what types of commands they provide (enum possibly), and then letting the switch board make the choice.

          Lets come up with a clean design that makes sense in the long run and is manageable not just something we hack in.

          Show
          Edward Capriolo added a comment - Brock Noland Lets slow down a second. I think patching in this support for only hs2 is short sided. I think we do want to bring this code all the way down to the CLI, even if a local mode CLI can avoid this protection, I think completely skipping the local mode code path is the wrong way. Also I do not like the hard codes here: String [] commands = { "set" , "dfs" , "add" , "delete" }; We have already abstractions like Processors and a class that acts as a switchboard, I think they should have a way of describing what types of commands they provide (enum possibly), and then letting the switch board make the choice. Lets come up with a clean design that makes sense in the long run and is manageable not just something we hack in.
          Hide
          Edward Capriolo added a comment -

          Did not mean "hack in" in a bad way. But we do not want a lot of strings and have connect the dots between seemingly unrelated classes as to why a feature is working or not.

          Show
          Edward Capriolo added a comment - Did not mean "hack in" in a bad way. But we do not want a lot of strings and have connect the dots between seemingly unrelated classes as to why a feature is working or not.
          Brock Noland made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Brock Noland added a comment -

          I am fine with implementing this for both HS2 and CLI/HS1 and since we use strings for CLi, HS1, and HS2 at present. I can add an enum which will be used by all three.

          Show
          Brock Noland added a comment - I am fine with implementing this for both HS2 and CLI/HS1 and since we use strings for CLi, HS1, and HS2 at present. I can add an enum which will be used by all three.
          Hide
          Edward Capriolo added a comment -

          Well lets do this. Lets extend the Command interface class.

          interface Command {
            public List<String> provides = { set, bla , bla }
          }
          

          Then lets let the CommandProcessorFactory look a the users session state, the hive conf, and determine to give back a command or throw an exception.

          This maye be harder then it sounds but it feels like a clean way to do it. This allows us to plug in new commands without much hassle.

          Show
          Edward Capriolo added a comment - Well lets do this. Lets extend the Command interface class. interface Command { public List< String > provides = { set, bla , bla } } Then lets let the CommandProcessorFactory look a the users session state, the hive conf, and determine to give back a command or throw an exception. This maye be harder then it sounds but it feels like a clean way to do it. This allows us to plug in new commands without much hassle.
          Hide
          Brock Noland added a comment -

          Hey,

          Yep, I had a patch in progress when you commented. It's slightly different than the suggestion but achieves the same result: Adding commands will be easier since single switch board is used and a test will fail if a command is added but either hive.available.commands isn't updated or the switch board isn't updated.

          The new patch works for all execution methods and removes the independent switchboard logic from HS2.

          Show
          Brock Noland added a comment - Hey, Yep, I had a patch in progress when you commented. It's slightly different than the suggestion but achieves the same result: Adding commands will be easier since single switch board is used and a test will fail if a command is added but either hive.available.commands isn't updated or the switch board isn't updated. The new patch works for all execution methods and removes the independent switchboard logic from HS2.
          Hide
          Brock Noland added a comment -
          Show
          Brock Noland added a comment - Review Item: https://reviews.apache.org/r/14447/
          Brock Noland made changes -
          Attachment HIVE-5400.patch [ 12606401 ]
          Brock Noland made changes -
          Remote Link This issue links to "Review Board (Web Link)" [ 12901 ]
          Brock Noland made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Edward Capriolo added a comment -

          I will look at this later tonight.

          Show
          Edward Capriolo added a comment - I will look at this later tonight.
          Hide
          Brock Noland added a comment -

          Latest patch from RB. Thanks Carl Steinbach for the review!

          Show
          Brock Noland added a comment - Latest patch from RB. Thanks Carl Steinbach for the review!
          Brock Noland made changes -
          Attachment HIVE-5400.patch [ 12606456 ]
          Hide
          Edward Capriolo added a comment -

          Question. Would we be better or returning CommandProcessorResponse's or throwing a new type of exception? I am struggling to rationalize SQLExceptions thrown from this part of hive code.

          Show
          Edward Capriolo added a comment - Question. Would we be better or returning CommandProcessorResponse's or throwing a new type of exception? I am struggling to rationalize SQLExceptions thrown from this part of hive code.
          Hide
          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12606456/HIVE-5400.patch

          SUCCESS: +1 4046 tests passed

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1001/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1001/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          Show
          Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12606456/HIVE-5400.patch SUCCESS: +1 4046 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1001/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1001/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated.
          Hide
          Brock Noland added a comment -

          Edward,

          Thank you very much for taking a look at this patch!

          Would we be better or returning CommandProcessorResponse's

          Several of the execution methods will do further processing before invoking the processor so I don't feel we should invoke the processor and then return the result. Additionally in general it's common to use a factory to get the "executor" and then call the "executor" in the callers context as it sees fit. This approach fits that paradigm.

          or throwing a new type of exception? I am struggling to rationalize SQLExceptions thrown from this part of hive code.

          I certainly empathize with your feeling but I don't feel Hive we have the correct hierarchy to implement anything "better" at this point. In regards to throwing a SQLException I don't see what the advantage of throwing a separate exception would be. We are indicating the correct SQL State. But more importantly two callers are only going to log it regardless of the exception type and the third (HS2) is simply going to convert it to a HiveSQLException. I'd prefer to throw a HiveSQLException but that is a member of the service module and since service depends on ql it would create a circular dependency.

          If at some point in the future we create additional layers of abstraction between the "tool" interface and the execution implementation then I could see an improved exception hierarchy. However, I don't feel that is a blocker for this patch. If there are concrete suggestions on this aspect I think it'd be a good use of a follow-on JIRA. The patch available at present improves admin's control, improves our switchboard logic, improves maintainability, and deletes almost as much code as it adds.

          Show
          Brock Noland added a comment - Edward, Thank you very much for taking a look at this patch! Would we be better or returning CommandProcessorResponse's Several of the execution methods will do further processing before invoking the processor so I don't feel we should invoke the processor and then return the result. Additionally in general it's common to use a factory to get the "executor" and then call the "executor" in the callers context as it sees fit. This approach fits that paradigm. or throwing a new type of exception? I am struggling to rationalize SQLExceptions thrown from this part of hive code. I certainly empathize with your feeling but I don't feel Hive we have the correct hierarchy to implement anything "better" at this point. In regards to throwing a SQLException I don't see what the advantage of throwing a separate exception would be. We are indicating the correct SQL State. But more importantly two callers are only going to log it regardless of the exception type and the third (HS2) is simply going to convert it to a HiveSQLException. I'd prefer to throw a HiveSQLException but that is a member of the service module and since service depends on ql it would create a circular dependency. If at some point in the future we create additional layers of abstraction between the "tool" interface and the execution implementation then I could see an improved exception hierarchy. However, I don't feel that is a blocker for this patch. If there are concrete suggestions on this aspect I think it'd be a good use of a follow-on JIRA. The patch available at present improves admin's control, improves our switchboard logic, improves maintainability, and deletes almost as much code as it adds.
          Hide
          Edward Capriolo added a comment -

          Agreed on the SQLException leave it as is. There is one more idea I want to pitch. Does it make more sense to implement a blacklist then a whitelist?

          Generally we fall on the side of leaving "dangerous" things on and not limiting features. A good example is hive.strict.mode. It should be on by default it all production deployments, but we have it off for the purposes of unit testing. Maybe I am biased here, but as a person who used hadoop before "security" I would rather things worked out of the box and I could turn them off later, other then the opposite.

          Again this is just a thought, and if you like the whitelist better lets just keep this.

          Show
          Edward Capriolo added a comment - Agreed on the SQLException leave it as is. There is one more idea I want to pitch. Does it make more sense to implement a blacklist then a whitelist? Generally we fall on the side of leaving "dangerous" things on and not limiting features. A good example is hive.strict.mode. It should be on by default it all production deployments, but we have it off for the purposes of unit testing. Maybe I am biased here, but as a person who used hadoop before "security" I would rather things worked out of the box and I could turn them off later, other then the opposite. Again this is just a thought, and if you like the whitelist better lets just keep this.
          Hide
          Edward Capriolo added a comment -

          Actually never mind the white list blacklist thing. It is the same difference and the code is already written. +1 will chew it over a bit and then commit in the morning.

          Show
          Edward Capriolo added a comment - Actually never mind the white list blacklist thing. It is the same difference and the code is already written. +1 will chew it over a bit and then commit in the morning.
          Hide
          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12606456/HIVE-5400.patch

          SUCCESS: +1 4046 tests passed

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1008/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1008/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          Show
          Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12606456/HIVE-5400.patch SUCCESS: +1 4046 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1008/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1008/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated.
          Hide
          Edward Capriolo added a comment -

          Committed. Thanks Brock

          Show
          Edward Capriolo added a comment - Committed. Thanks Brock
          Edward Capriolo made changes -
          Assignee Edward Capriolo [ appodictic ] Brock Noland [ brocknoland ]
          Fix Version/s 0.13.0 [ 12324986 ]
          Hide
          Edward Capriolo added a comment -

          Thanks again Brock.

          Show
          Edward Capriolo added a comment - Thanks again Brock.
          Edward Capriolo made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          Thejas M Nair added a comment -

          Adding documentation in release notes. cc [~lefty@hortonworks.com]

          Show
          Thejas M Nair added a comment - Adding documentation in release notes. cc [~lefty@hortonworks.com]
          Thejas M Nair made changes -
          Release Note Introduces new hive config parameter -
          hive.security.command.whitelist .
          This can use used to restrict the set of commands that can be run. Currently supported command list is - "set,reset,dfs,add,delete" and by default all these commands are supported. If you want to restrict any of these commands, you can set the config to a value that does not have the command in it.
          Hide
          Lefty Leverenz added a comment -

          Got it, thanks Thejas M Nair. I'll add hive.security.command.whitelist to the Configuration Properties doc in the Authentication/Authorization section.

          Does it belong in the subsection "Hive Client Security" or does it also apply to metastore security (in which case it could have a new subsection, perhaps at the beginning of Authentication/Authorization)?

          Show
          Lefty Leverenz added a comment - Got it, thanks Thejas M Nair . I'll add hive.security.command.whitelist to the Configuration Properties doc in the Authentication/Authorization section. Does it belong in the subsection "Hive Client Security" or does it also apply to metastore security (in which case it could have a new subsection, perhaps at the beginning of Authentication/Authorization)?
          Thejas M Nair made changes -
          Link This issue relates to HIVE-4887 [ HIVE-4887 ]
          Hide
          Thejas M Nair added a comment -

          [~lefty@hortonworks.com] Yes, that section sounds good. It belongs to the subsection "Hive Client Security".
          Thanks for updating the docs!

          Show
          Thejas M Nair added a comment - [~lefty@hortonworks.com] Yes, that section sounds good. It belongs to the subsection "Hive Client Security". Thanks for updating the docs!
          Hide
          Lefty Leverenz added a comment -

          Added this at the end of the Hive Client Security section:

          hive.security.command.whitelist

          • Default Value: set,reset,dfs,add,delete
          • Added In: Hive 0.13.0 with HIVE-5400

          Comma separated list of non-SQL Hive commands that users are authorized to execute. This can be used to restrict the set of authorized commands. The currently supported command list is "set,reset,dfs,add,delete" and by default all these commands are authorized. To restrict any of these commands, set hive.security.command.whitelist to a value that does not have the command in it.

          Show
          Lefty Leverenz added a comment - Added this at the end of the Hive Client Security section: hive.security.command.whitelist Default Value: set,reset,dfs,add,delete Added In: Hive 0.13.0 with HIVE-5400 Comma separated list of non-SQL Hive commands that users are authorized to execute. This can be used to restrict the set of authorized commands. The currently supported command list is "set,reset,dfs,add,delete" and by default all these commands are authorized. To restrict any of these commands, set hive.security.command.whitelist to a value that does not have the command in it.
          Hide
          Lefty Leverenz added a comment -

          The patch adds hive.security.command.whitelist & hive.conf.restricted.list to hive-default.xml.template, but hive.conf.restricted.list is not documented in the wiki.

          • Should it be in the wiki?
          • If so, which release added it?

          Trivia: both descriptions spell "separated" wrong ("seperated"):

          <property>
          + <name>hive.security.command.whitelist</name>
          + <value>set,reset,dfs,add,delete</value>
          + <description>Comma seperated list of non-SQL Hive commands users are authorized to execute</description>
          +</property>
          +
          +<property>
          + <name>hive.conf.restricted.list</name>
          + <value></value>
          + <description>Comma seperated list of configuration options which are immutable at runtime</description>
          +</property>

          Show
          Lefty Leverenz added a comment - The patch adds hive.security.command.whitelist & hive.conf.restricted.list to hive-default.xml.template, but hive.conf.restricted.list is not documented in the wiki. Should it be in the wiki? If so, which release added it? Trivia: both descriptions spell "separated" wrong ("seperated"): <property> + <name>hive.security.command.whitelist</name> + <value>set,reset,dfs,add,delete</value> + <description>Comma seperated list of non-SQL Hive commands users are authorized to execute</description> +</property> + +<property> + <name>hive.conf.restricted.list</name> + <value></value> + <description>Comma seperated list of configuration options which are immutable at runtime</description> +</property>
          Hide
          Brock Noland added a comment -

          Nice. I have always had trouble with that word.

          I created https://issues.apache.org/jira/browse/HIVE-5879 for that.

          Show
          Brock Noland added a comment - Nice. I have always had trouble with that word. I created https://issues.apache.org/jira/browse/HIVE-5879 for that.
          Brock Noland made changes -
          Link This issue is related to HIVE-5879 [ HIVE-5879 ]
          Hide
          Thejas M Nair added a comment -

          but hive.conf.restricted.list is not documented in the wiki.

          Should it be in the wiki?

          Yes

          If so, which release added it?

          hive 0.11 . (patch HIVE-2935)

          Show
          Thejas M Nair added a comment - but hive.conf.restricted.list is not documented in the wiki. Should it be in the wiki? Yes If so, which release added it? hive 0.11 . (patch HIVE-2935 )
          Hide
          Lefty Leverenz added a comment -

          Documented hive.conf.restricted.list and put it in a new section with hive.security.command.whitelist:

          Show
          Lefty Leverenz added a comment - Documented hive.conf.restricted.list and put it in a new section with hive.security.command.whitelist: Authentication/Authorization: Restricted List and Whitelist

            People

            • Assignee:
              Brock Noland
              Reporter:
              Brock Noland
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development