Hive / HIVE-7342

Support hiveserver2, metastore specific config files

    Details

    • Release Note:
      Adds support for server specific config files.

      The HiveMetastore server reads hive-site.xml as well as hivemetastore-site.xml configuration files that are available in $HIVE_CONF_DIR or on the classpath. If the metastore is being used in embedded mode (i.e., hive.metastore.uris is not set or is empty) in the hive command line or HiveServer2, hivemetastore-site.xml gets loaded by the parent process as well.
      The value of hive.metastore.uris is examined to determine this, and the value should be set appropriately in hive-site.xml.
      Certain metastore configuration parameters, such as hive.metastore.sasl.enabled, hive.metastore.kerberos.principal, hive.metastore.execute.setugi and hive.metastore.thrift.framed.transport.enabled, are used by the metastore client as well as the server. For such common parameters it is better to set the values in hive-site.xml, which helps keep them consistent.
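The embedded-mode rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the release note, not Hive's actual implementation: the function names are hypothetical, the metastore address is a made-up example, and treating a whitespace-only value as empty is an assumption.

```python
METASTORE_URIS = "hive.metastore.uris"


def is_embedded_metastore(hive_site):
    """Embedded mode: hive.metastore.uris is not set or is empty.

    Assumption: a whitespace-only value is treated like an empty one.
    """
    uris = hive_site.get(METASTORE_URIS)
    return uris is None or uris.strip() == ""


def files_for_parent_process(hive_site):
    """Config files a parent process (hive command line) would read under this rule."""
    files = ["hive-site.xml"]
    if is_embedded_metastore(hive_site):
        files.append("hivemetastore-site.xml")
    return files


# Hypothetical remote metastore address, for illustration only.
remote = {METASTORE_URIS: "thrift://metastore.example.com:9083"}

print(files_for_parent_process({}))
print(files_for_parent_process(remote))
```

Because the decision is made from hive.metastore.uris, the sketch reads that key from the hive-site.xml settings, which matches the advice to set it there.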

      HiveServer2 reads hive-site.xml as well as hiveserver2-site.xml that are available in $HIVE_CONF_DIR or on the classpath.
      If HiveServer2 is using the metastore in embedded mode, hivemetastore-site.xml is also loaded.

      The order of precedence of the config sources is as follows (later ones have higher precedence):
      hive-site.xml -> hivemetastore-site.xml -> hiveserver2-site.xml -> '-hiveconf' commandline parameters
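As a rough illustration of this precedence order, the following Python sketch merges four layers so that later layers override earlier ones and records which source supplied each final value. The parameter names (example.a to example.d) and the resolve helper are hypothetical; real Hive performs this layering inside HiveConf, not with plain dictionaries.

```python
def resolve(layers):
    """Merge (source, settings) layers; later layers override earlier ones."""
    merged, origin = {}, {}
    for source, settings in layers:
        for key, value in settings.items():
            merged[key] = value
            origin[key] = source
    return merged, origin


# Hypothetical parameters, each redefined by a different number of layers.
LAYERS = [
    ("hive-site.xml", {"example.a": "1", "example.b": "1", "example.c": "1", "example.d": "1"}),
    ("hivemetastore-site.xml", {"example.b": "2", "example.c": "2", "example.d": "2"}),
    ("hiveserver2-site.xml", {"example.c": "3", "example.d": "3"}),
    ("-hiveconf", {"example.d": "4"}),
]

merged, origin = resolve(LAYERS)
for key in sorted(merged):
    print(key, "=", merged[key], "from", origin[key])
```

Each parameter ends up with the value from the last layer that defines it, so a value given with -hiveconf wins over all three files.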


      Description

      There is currently a single configuration file for all components in Hive, i.e., components such as the Hive CLI, HiveServer2 and the metastore all read from the same hive-site.xml.
      It would be useful to have a server specific config file, so that you can set a different configuration value for one server. For example, you might want to enable authorization checks for HiveServer2 while disabling the checks for the Hive CLI. The workaround today is to add any component specific configuration as a commandline (-hiveconf) argument.

      Using server specific config files (e.g. hiveserver2-site.xml, hivemetastore-site.xml) that override the entries in hive-site.xml will make the configuration much easier to manage.
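The authorization example from the description could look roughly like the sketch below. It parses two site files in the Hadoop-style configuration/property/name/value layout and overlays the server specific one on the base one. The file contents are invented for illustration, hive.security.authorization.enabled is used only as an example property, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative file contents, not taken from any real deployment.
HIVE_SITE = """<configuration>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>false</value>
  </property>
</configuration>"""

HIVESERVER2_SITE = """<configuration>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
</configuration>"""


def parse_site_xml(text):
    """Return {name: value} for every <property> element in a site file."""
    root = ET.fromstring(text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def effective_config(*site_xml_texts):
    """Overlay site files in the order given; later files override earlier ones."""
    merged = {}
    for text in site_xml_texts:
        merged.update(parse_site_xml(text))
    return merged


cli_conf = effective_config(HIVE_SITE)                     # Hive CLI reads only the base file
hs2_conf = effective_config(HIVE_SITE, HIVESERVER2_SITE)   # HiveServer2 adds its own file

print("CLI:", cli_conf["hive.security.authorization.enabled"])
print("HS2:", hs2_conf["hive.security.authorization.enabled"])
```

With this layout the base file keeps the shared default, and only the server that needs a different value carries an override.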

      1. HIVE-7342.2.patch
        22 kB
        Thejas M Nair
      2. HIVE-7342.1.patch
        16 kB
        Thejas M Nair

          Activity

          Lefty Leverenz added a comment -

          Okay, thanks Thejas.

          Thejas M Nair added a comment -

          I think the documentation in the main Hive configuration section is sufficient. There is no separate configuration section in the HCat CLI document.

          Lefty Leverenz added a comment -

          Is any HCatalog documentation needed? Sushanth's comment above: "HCatCLI should mimic HiveCLI in behaviour." HCatalog CLI

          Thejas M Nair added a comment -

          Added doc in https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-ConfiguringHive

          Thejas M Nair added a comment -

          This has been fixed in 0.14 release. Please open new jira if you see any issues.

          Thejas M Nair added a comment -

          I missed adding some files to svn as part of my first commit, added them now.

          Thejas M Nair added a comment -

          Patch committed to trunk.
          Thanks for the reviews Jason, Sushanth and Prasad!

          Prasad Mujumdar added a comment -

          I guess treating hive-site as base file should be sufficient. It's unlikely that you will have variety of metastore setups (embedded and remote, or secure and unsecure) in a single deployment. Thanks for updating the release notes!

          +1

          Thejas M Nair added a comment -

          Prasad Mujumdar
          Added a note about that in release note section, so that it can be included in documentation as well.

          Thejas M Nair added a comment -

          The server will load both base and server specific configs, the client will only load the base config.

          hive-site.xml is the base config file. It gets loaded by clients and servers. So it would be the right place for such config parameters.

          Prasad Mujumdar added a comment -

          Thejas M Nair The patch looks fine to me.
          Just wondering if it would make sense to further split the metastore config into client (or base) and server. There are common configs like setugi, enableSasl, etc. that need to be in sync on both client and server. If those are available in a common file, it will be less prone to incompatible configs. The server will load both base and server specific configs; the client will only load the base config.

          Thejas M Nair added a comment -

          Good point. I will add a note in the release note saying that hive-site.xml should be used to set hive.metastore.uris, as that is what is used to determine whether the metastore is going to be used in remote or embedded mode.

          Sushanth Sowmyan added a comment -

          I'm +1 on the .2.patch.

          Having said that, I also wanted to bring up one more possible corner case that seems to exist. I do not think it needs solving (at least not right away; it might be worth opening a follow-up to declare whether a confvar can be overridden by a conf or not), but it exists anyway.

          Consider the case of a hive client, that loads a hive-site.xml, and finds that the metastore uris parameter is blank. Thus, it expects that it is a local metastore operation, and thus, loads hivemetastore-site.xml. In this config, however, say the uris parameter is not blank. Then, at connect time, that hive client will act upon the uris value, which tells it that it is not a local metastore, and connects out to the referred url. However, this hive client has now loaded the metastore conf for other parameters, while still being a remote client that talks to a metastore.
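The corner case above can be reproduced with a small Python model. It is a sketch of the scenario as described in this comment, not of Hive's code: the loader functions, the server-only parameter name and the metastore address are all hypothetical.

```python
URIS = "hive.metastore.uris"


def load_client_conf(hive_site, hivemetastore_site):
    """Load hive-site.xml; add hivemetastore-site.xml only if uris looks blank."""
    conf = dict(hive_site)
    loaded = ["hive-site.xml"]
    if not (conf.get(URIS) or "").strip():
        conf.update(hivemetastore_site)
        loaded.append("hivemetastore-site.xml")
    return conf, loaded


def connects_remotely(conf):
    """At connect time the client acts on whatever uris value it ended up with."""
    return bool((conf.get(URIS) or "").strip())


hive_site = {URIS: ""}  # blank, so the client expects a local metastore
hivemetastore_site = {
    URIS: "thrift://metastore.example.com:9083",   # hypothetical address
    "example.server.only.param": "server-value",   # hypothetical parameter
}

conf, loaded = load_client_conf(hive_site, hivemetastore_site)
print("loaded:", loaded)
print("connects remotely:", connects_remotely(conf))
print("server-only param visible to client:", "example.server.only.param" in conf)
```

In this model the client ends up as a remote client that has nevertheless picked up the metastore-side settings, which is the mismatch the comment describes.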

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12655084/HIVE-7342.2.patch

          ERROR: -1 due to 2 failed/errored test(s), 5721 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
          org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/739/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/739/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-739/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 2 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12655084

          Thejas M Nair added a comment -

          Sushanth Sowmyan Thanks for prompting me to take a closer look at the precedence! I found an issue, here is the updated patch.

          HIVE-7342.2.patch - With the earlier patch, hivemetastore-site.xml would take precedence over hiveserver2-site.xml if an embedded metastore is used with HiveServer2, as hivemetastore-site.xml was getting added later.
          With this change, HiveConf initialization itself checks whether an embedded metastore is used and loads hivemetastore-site.xml. This way the order of adding the resources to the Configuration always remains the same.

          The patch also adds tests for both embedded and remote metastore modes.

          The order of precedence (later one takes precedence):
          hive-site.xml -> hivemetastore-site.xml -> hiveserver2-site.xml -> HiveConf.ConfVars set through system properties (same as ones set through -hiveconf cmdline params)
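Why the order in which resources are added matters can be shown with a toy merge in Python. The sketch assumes a simple last-writer-wins merge and a single hypothetical parameter defined in all three files. The "earlier" order is inferred from the description of the first patch (hivemetastore-site.xml added after hiveserver2-site.xml) and is not taken from the patch itself.

```python
def merge(files, contents):
    """Apply files in order; a later file overrides an earlier one."""
    conf = {}
    for name in files:
        conf.update(contents.get(name, {}))
    return conf


# One hypothetical parameter, defined differently in each file.
CONTENTS = {
    "hive-site.xml": {"example.param": "from-hive-site"},
    "hivemetastore-site.xml": {"example.param": "from-hivemetastore-site"},
    "hiveserver2-site.xml": {"example.param": "from-hiveserver2-site"},
}

# Inferred order with the first patch for HiveServer2 with an embedded metastore.
EARLIER_ORDER = ["hive-site.xml", "hiveserver2-site.xml", "hivemetastore-site.xml"]
# Order described for the updated patch.
FIXED_ORDER = ["hive-site.xml", "hivemetastore-site.xml", "hiveserver2-site.xml"]

print("earlier:", merge(EARLIER_ORDER, CONTENTS)["example.param"])
print("fixed:  ", merge(FIXED_ORDER, CONTENTS)["example.param"])
```

Deciding about the embedded metastore during initialization keeps one fixed order for every process, so the same file always wins a conflict.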

          Sushanth Sowmyan added a comment -

          HCatCLI should mimic HiveCLI in behaviour. WebHCat uses HCatCLI to perform HCat operations if I remember correctly.

          HCatIF/HCatOF were designed with an expectation of a remote metastore, but if it behaves the way you suggest, then that's appropriate.

          That said, I will admit to having some worries about what happens when a parameter is specified in multiple configs, and what the resolution order winds up being. I would like to look through this patch as well.

          Thejas M Nair added a comment -

          does this have effect on HCatCLI, HCat and WebHCat?

          In remote metastore mode, there would not be any impact. If an embedded metastore is used with any of the above, and hivemetastore-site.xml is in the conf dir (or on the classpath in general), it will add that config as well (overriding any existing params that are redefined).

          cc Sushanth Sowmyan

          Eugene Koifman added a comment -

          does this have effect on HCatCLI, HCat and WebHCat?

          Jason Dere added a comment -

          +1

          Thejas M Nair added a comment -

          Yes, you are right about the configs used. The server side config will also be used when the server is used in embedded mode.

          Regarding: Beeline, remote mode: - no hive*xml is used in this mode.

          Jason Dere added a comment -

          Just trying to understand which apps will load which configs:

          • HiveCLI, remote metastore: hive-site
          • HiveCLI, embedded metastore: hive-site, metastore-site
          • Beeline, embedded mode: hive-site, hiveserver2-site, metastore-site (if embedded metastore)
          • Beeline, remote mode: ?
          • Metastore Server: hive-site, metastore-site
          • HiveServer2, remote metastore (I've heard this is not a recommended configuration?): hive-site, hiveserver2-site
          • HiveServer2, embedded metastore: hive-site, metastore-site, hiveserver2-site
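The list above can be restated as a small lookup function in Python, with the Beeline remote case filled in from the reply in this thread (no hive*xml is used in that mode). The application labels and the function name are hypothetical, and the files are returned in precedence order rather than in the order the list uses.

```python
def config_files(app, embedded_metastore=False):
    """Config files read by each application, per the list in this comment.

    app is one of the hypothetical labels: "hivecli", "beeline-embedded",
    "beeline-remote", "metastore-server", "hiveserver2".
    """
    known = {"hivecli", "beeline-embedded", "beeline-remote", "metastore-server", "hiveserver2"}
    if app not in known:
        raise ValueError("unknown app: " + app)
    if app == "beeline-remote":
        return []  # per the reply: no hive*xml is used in this mode
    files = ["hive-site.xml"]
    if app == "metastore-server" or embedded_metastore:
        files.append("hivemetastore-site.xml")
    if app in ("hiveserver2", "beeline-embedded"):
        files.append("hiveserver2-site.xml")
    return files


for app, embedded in [
    ("hivecli", False),
    ("hivecli", True),
    ("beeline-embedded", True),
    ("beeline-remote", False),
    ("metastore-server", False),
    ("hiveserver2", False),
    ("hiveserver2", True),
]:
    print(app, "embedded metastore" if embedded else "remote metastore", config_files(app, embedded))
```

For the metastore server the embedded_metastore flag has no effect, since that process always reads hivemetastore-site.xml.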
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12654371/HIVE-7342.1.patch

          ERROR: -1 due to 3 failed/errored test(s), 5702 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/703/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/703/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-703/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 3 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12654371


            People

            • Assignee:
              Thejas M Nair
              Reporter:
              Thejas M Nair
            • Votes:
              0
              Watchers:
              6
