Index: conf/hive-default.xml.template =================================================================== --- conf/hive-default.xml.template (revision 1548369) +++ conf/hive-default.xml.template (working copy) @@ -30,7 +30,7 @@ -1 The default number of reduce tasks per job. Typically set to a prime close to the number of available hosts. Ignored when - mapred.job.tracker is "local". Hadoop set this to 1 by default, whereas hive uses -1 as its default value. + mapred.job.tracker is "local". Hadoop sets this to 1 by default, whereas Hive uses -1 as its default value. By setting this property to -1, Hive will automatically figure out what should be the number of reducers. @@ -46,7 +46,7 @@ 999 max number of reducers will be used. If the one specified in the configuration parameter mapred.reduce.tasks is - negative, hive will use this one as the max number of reducers when + negative, Hive will use this one as the max number of reducers when automatically determining the number of reducers. @@ -59,15 +59,15 @@ hive.cli.print.current.db false - Whether to include the current database in the hive prompt. + Whether to include the current database in the Hive prompt. hive.cli.prompt hive Command line prompt configuration value. Other hiveconf can be used in - this configuration value. Variable substitution will only be invoked at the hive - cli startup. + this configuration value. Variable substitution will only be invoked at the Hive + CLI startup. @@ -75,7 +75,7 @@ -1 The number of columns to use when formatting output generated by the DESCRIBE PRETTY table_name command. If the value of this property - is -1, then hive will use the auto-detected terminal width. + is -1, then Hive will use the auto-detected terminal width. @@ -93,16 +93,16 @@ hive.test.mode false - whether hive is running in test mode. If yes, it turns on sampling and prefixes the output tablename + Whether Hive is running in test mode. If yes, it turns on sampling and prefixes the output tablename. 
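These reducer settings are normally overridden per deployment in hive-site.xml rather than edited in this template. A minimal sketch of the defaults described above (values are illustrative, not tuning advice):

```xml
<!-- Sketch for hive-site.xml; values mirror the template defaults. -->
<property>
  <name>mapred.reduce.tasks</name>
  <!-- -1 lets Hive figure out the number of reducers itself -->
  <value>-1</value>
</property>
<property>
  <name>hive.exec.reducers.max</name>
  <!-- cap applied when the reducer count is determined automatically -->
  <value>999</value>
</property>
```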
hive.test.mode.prefix test_ - if hive is running in test mode, prefixes the output table by this string + if Hive is running in test mode, prefixes the output table by this string - + @@ -112,19 +112,19 @@ hive.test.mode.samplefreq 32 - if hive is running in test mode and table is not bucketed, sampling frequency + if Hive is running in test mode and table is not bucketed, sampling frequency hive.test.mode.nosamplelist - if hive is running in test mode, dont sample the above comma seperated list of tables + if Hive is running in test mode, don't sample the above comma separated list of tables hive.metastore.uris - Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore. + Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore. @@ -226,7 +226,7 @@ datanucleus.cache.level2 false - Use a level 2 cache. Turn this off if metadata is changed independently of hive metastore server + Use a level 2 cache. Turn this off if metadata is changed independently of Hive metastore server @@ -262,13 +262,13 @@ hive.metastore.event.listeners - list of comma seperated listeners for metastore events. + list of comma separated listeners for metastore events. hive.metastore.partition.inherit.table.properties - list of comma seperated keys occurring in table properties which will get inherited to newly created partitions. * implies all the keys will get inherited. + list of comma separated keys occurring in table properties which will get inherited to newly created partitions. * implies all the keys will get inherited. @@ -294,7 +294,7 @@ If true (default is false), ALTER TABLE operations which change the type of a column (say STRING) to an incompatible type (say MAP<STRING, STRING>) are disallowed. - RCFile default serde (ColumnarSerde) serializes the values in such a way that the + RCFile default SerDe (ColumnarSerDe) serializes the values in such a way that the datatypes can be converted from string to any type. 
The map is also serialized as a string, which can be read as a string as well. However, with any binary serialization, this is not true. Blocking the ALTER TABLE prevents ClassCastExceptions @@ -376,7 +376,7 @@ hive.default.rcfile.serde org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe - The default SerDe hive will use for the rcfile format + The default SerDe Hive will use for the RCFile format @@ -411,7 +411,7 @@ perform the 2 group bys. This makes sense if map-side aggregation is turned off. However, with map-side aggregation, it might be useful in some cases to treat the 2 inserts independently, thereby performing the query above in 2MR jobs instead of 3 (due to spraying by distinct key first). - If this parameter is turned off, we dont consider the fact that the distinct key is the same across + If this parameter is turned off, we don't consider the fact that the distinct key is the same across different MR jobs. @@ -449,7 +449,7 @@ hive.session.history.enabled false - Whether to log hive query, query plan, runtime statistics etc + Whether to log Hive query, query plan, runtime statistics etc. @@ -505,7 +505,7 @@ a union is performed for the 2 joins generated above. So unless the same skewed key is present in both the joined tables, the join for the skewed key will be performed as a map-side join. - The main difference between this paramater and hive.optimize.skewjoin is that this parameter + The main difference between this parameter and hive.optimize.skewjoin is that this parameter uses the skew information stored in the metastore to optimize the plan at compile time itself. If there is no skew information in the metadata, this parameter will not have any effect. Both hive.optimize.skewjoin.compiletime and hive.optimize.skewjoin should be set to true. @@ -529,14 +529,14 @@ The merge is triggered if either of hive.merge.mapfiles or hive.merge.mapredfiles is set to true. 
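As the description above notes, hive.optimize.skewjoin.compiletime relies on skew information recorded in the metastore and should be enabled together with the runtime flag. A hedged hive-site.xml sketch:

```xml
<!-- Both flags on, per the recommendation in the description. -->
<property>
  <name>hive.optimize.skewjoin.compiletime</name>
  <!-- uses metastore skew info to rewrite the plan at compile time -->
  <value>true</value>
</property>
<property>
  <name>hive.optimize.skewjoin</name>
  <!-- runtime skew handling when no compile-time skew info exists -->
  <value>true</value>
</property>
```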
If the user has set hive.merge.mapfiles to true and hive.merge.mapredfiles to false, the idea was the number of reducers are few, so the number of files anyway are small. However, with this optimization, - we are increasing the number of files possibly by a big margin. So, we merge aggresively. + we are increasing the number of files possibly by a big margin. So, we merge aggressively. hive.mapred.supports.subdirectories false - Whether the version of hadoop which is running supports sub-directories for tables/partitions. - Many hive optimizations can be applied if the hadoop version supports sub-directories for + Whether the version of Hadoop which is running supports sub-directories for tables/partitions. + Many Hive optimizations can be applied if the Hadoop version supports sub-directories for tables/partitions. It was added by MAPREDUCE-1501 @@ -576,16 +576,16 @@ This can lead to explosion across map-reduce boundary if the cardinality of T is very high, and map-side aggregation does not do a very good job. - This parameter decides if hive should add an additional map-reduce job. If the grouping set + This parameter decides if Hive should add an additional map-reduce job. If the grouping set cardinality (4 in the example above), is more than this value, a new MR job is added under the - assumption that the orginal group by will reduce the data size. + assumption that the original group by will reduce the data size. hive.join.emit.interval 1000 - How many rows in the right-most join operand Hive should buffer before emitting the join result. + How many rows in the right-most join operand Hive should buffer before emitting the join result. @@ -605,7 +605,7 @@ false Whether to enable skew join optimization. The algorithm is as follows: At runtime, detect the keys with a large skew. Instead of - processing those keys, store them temporarily in a hdfs directory. In a follow-up map-reduce + processing those keys, store them temporarily in an HDFS directory. 
In a follow-up map-reduce job, process those skewed keys. The same key need not be skewed for all the tables, and so, the follow-up map-reduce job (for the skewed keys) would be much faster, since it would be a map-join. @@ -639,7 +639,7 @@ hive.mapred.mode nonstrict - The mode in which the hive operations are being performed. + The mode in which the Hive operations are being performed. In strict mode, some risky queries are not allowed to run. They include: Cartesian Product. No partition being picked up for a query. @@ -653,7 +653,7 @@ hive.enforce.bucketmapjoin false If the user asked for bucketed map-side join, and it cannot be performed, - should the query fail or not ? For eg, if the buckets in the tables being joined are + should the query fail or not? For example, if the buckets in the tables being joined are not a multiple of each other, bucketed map-side join cannot be performed, and the query will fail if hive.enforce.bucketmapjoin is set to true. @@ -688,13 +688,13 @@ hive.exec.compress.output false - This controls whether the final outputs of a query (to a local/hdfs file or a hive table) is compressed. The compression codec and other options are determined from hadoop config variables mapred.output.compress* + This controls whether the final outputs of a query (to a local/HDFS file or a Hive table) are compressed. The compression codec and other options are determined from Hadoop config variables mapred.output.compress* hive.exec.compress.intermediate false - This controls whether intermediate files produced by hive between multiple map-reduce jobs are compressed. The compression codec and other options are determined from hadoop config variables mapred.output.compress* + This controls whether intermediate files produced by Hive between multiple map-reduce jobs are compressed. 
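The two compression switches above only say whether to compress; the codec itself comes from the mapred.output.compress* family the descriptions mention. A sketch (the Gzip codec is one common choice, shown here as an assumption):

```xml
<property>
  <name>hive.exec.compress.output</name>
  <!-- compress final query outputs -->
  <value>true</value>
</property>
<property>
  <name>hive.exec.compress.intermediate</name>
  <!-- compress files passed between map-reduce jobs -->
  <value>true</value>
</property>
<property>
  <name>mapred.output.compression.codec</name>
  <!-- illustrative codec choice, not a requirement -->
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
```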
The compression codec and other options are determined from Hadoop config variables mapred.output.compress* @@ -718,7 +718,7 @@ hive.counters.group.name HIVE - The name of counter group for internal hive variables(CREATED_FILE, FATAL_ERROR, etc.) + The name of counter group for internal Hive variables (CREATED_FILE, FATAL_ERROR, etc.) @@ -760,7 +760,7 @@ hive.metastore.init.hooks - A comma separated list of hooks to be invoked at the beginning of HMSHandler initialization. Aninit hook is specified as the name of Java class which extends org.apache.hadoop.hive.metastore.MetaStoreInitListener. + A comma separated list of hooks to be invoked at the beginning of HMSHandler initialization. An init hook is specified as the name of Java class which extends org.apache.hadoop.hive.metastore.MetaStoreInitListener. @@ -820,13 +820,13 @@ hive.mapjoin.localtask.max.memory.usage 0.90 - This number means how much memory the local task can take to hold the key/value into in-memory hash table; If the local task's memory usage is more than this number, the local task will be abort by themself. It means the data of small table is too large to be hold in the memory. + This number means how much memory the local task can take to hold the key/value into an in-memory hash table. If the local task's memory usage is more than this number, the local task will abort by itself. It means the data of the small table is too large to be held in memory. hive.mapjoin.followby.gby.localtask.max.memory.usage 0.55 - This number means how much memory the local task can take to hold the key/value into in-memory hash table when this map join followed by a group by; If the local task's memory usage is more than this number, the local task will be abort by themself. It means the data of small table is too large to be hold in the memory. + This number means how much memory the local task can take to hold the key/value into an in-memory hash table when this map join is followed by a group by. 
If the local task's memory usage is more than this number, the local task will abort by itself. It means the data of the small table is too large to be held in memory. @@ -838,14 +838,14 @@ hive.auto.convert.join false - Whether Hive enable the optimization about converting common join into mapjoin based on the input file size + Whether Hive enables the optimization about converting common join into mapjoin based on the input file size hive.auto.convert.join.noconditionaltask true - Whether Hive enable the optimization about converting common join into mapjoin based on the input file - size. If this paramater is on, and the sum of size for n-1 of the tables/partitions for a n-way join is smaller than the + Whether Hive enables the optimization about converting common join into mapjoin based on the input file + size. If this parameter is on, and the sum of size for n-1 of the tables/partitions for an n-way join is smaller than the specified size, the join is directly converted to a mapjoin (there is no conditional task). @@ -862,13 +862,13 @@ hive.script.auto.progress false - Whether Hive Tranform/Map/Reduce Clause should automatically send progress information to TaskTracker to avoid the task getting killed because of inactivity. Hive sends progress information when the script is outputting to stderr. This option removes the need of periodically producing stderr messages, but users should be cautious because this may prevent infinite loops in the scripts to be killed by TaskTracker. + Whether Hive Transform/Map/Reduce Clause should automatically send progress information to TaskTracker to avoid the task getting killed because of inactivity. Hive sends progress information when the script is outputting to stderr. This option removes the need to periodically produce stderr messages, but users should be cautious because this may prevent scripts with infinite loops from being killed by TaskTracker. 
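Enabling the unconditional map-join conversion described above usually involves a size threshold as well; hive.auto.convert.join.noconditionaltask.size is not shown in this hunk and is included here as an assumption. A sketch:

```xml
<property>
  <name>hive.auto.convert.join</name>
  <value>true</value>
</property>
<property>
  <name>hive.auto.convert.join.noconditionaltask</name>
  <value>true</value>
</property>
<property>
  <!-- assumed companion property; bytes, illustrative value -->
  <name>hive.auto.convert.join.noconditionaltask.size</name>
  <value>10000000</value>
</property>
```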
hive.script.serde org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - The default serde for trasmitting input data to and reading output data from the user scripts. + The default SerDe for transmitting input data to and reading output data from the user scripts. @@ -917,7 +917,7 @@ stream.stderr.reporter.prefix reporter: - Streaming jobs that log to stardard error with this prefix can log counter or status information. + Streaming jobs that log to standard error with this prefix can log counter or status information. @@ -941,7 +941,7 @@ hive.udtf.auto.progress false - Whether Hive should automatically send progress information to TaskTracker when using UDTF's to prevent the task getting killed because of inactivity. Users should be cautious because this may prevent TaskTracker from killing tasks with infinte loops. + Whether Hive should automatically send progress information to TaskTracker when using UDTF's to prevent the task getting killed because of inactivity. Users should be cautious because this may prevent TaskTracker from killing tasks with infinite loops. @@ -1003,7 +1003,7 @@ hive.optimize.bucketingsorting true - If hive.enforce.bucketing or hive.enforce.sorting is true, dont create a reducer for enforcing + If hive.enforce.bucketing or hive.enforce.sorting is true, don't create a reducer for enforcing bucketing/sorting for queries of the form: insert overwrite table T2 select * from T1; where T1 and T2 are bucketed/sorted by the same keys into the same number of buckets. @@ -1045,10 +1045,10 @@ hive.auto.convert.sortmerge.join.to.mapjoin false If hive.auto.convert.sortmerge.join is set to true, and a join was converted to a sort-merge join, - this parameter decides whether each table should be tried as a big table, and effectviely a map-join should be + this parameter decides whether each table should be tried as a big table, and effectively a map-join should be tried. 
That would create a conditional task with n+1 children for an n-way join (1 child for each table as the - big table), and the backup task will be the sort-merge join. In some casess, a map-join would be faster than a - sort-merge join, if there is no advantage of having the output bucketed and sorted. For eg. if a very big sorted + big table), and the backup task will be the sort-merge join. In some cases, a map-join would be faster than a + sort-merge join, if there is no advantage of having the output bucketed and sorted. For example, if a very big sorted and bucketed table with few files (say 10 files) is being joined with a very small sorted and bucketed table with few files (10 files), the sort-merge join will only use 10 mappers, and a simple map-only join might be faster if the complete small table can fit in memory, and a map-join can be performed. @@ -1058,7 +1058,7 @@ hive.metastore.ds.connection.url.hook - Name of the hook to use for retriving the JDO connection URL. If empty, the value in javax.jdo.option.ConnectionURL is used + Name of the hook to use for retrieving the JDO connection URL. If empty, the value in javax.jdo.option.ConnectionURL is used @@ -1070,7 +1070,7 @@ hive.metastore.ds.retry.interval 1000 - The number of miliseconds between metastore retry attempts + The number of milliseconds between metastore retry attempts @@ -1094,25 +1094,25 @@ hive.metastore.sasl.enabled false - If true, the metastore thrift interface will be secured with SASL. Clients must authenticate with Kerberos. + If true, the metastore Thrift interface will be secured with SASL. Clients must authenticate with Kerberos. hive.metastore.thrift.framed.transport.enabled false - If true, the metastore thrift interface will use TFramedTransport. When false (default) a standard TTransport is used. + If true, the metastore Thrift interface will use TFramedTransport. When false (default) a standard TTransport is used. 
hive.metastore.kerberos.keytab.file - The path to the Kerberos Keytab file containing the metastore thrift server's service principal. + The path to the Kerberos Keytab file containing the metastore Thrift server's service principal. hive.metastore.kerberos.principal hive-metastore/_HOST@EXAMPLE.COM - The service principal for the metastore thrift server. The special string _HOST will be replaced automatically with the correct host name. + The service principal for the metastore Thrift server. The special string _HOST will be replaced automatically with the correct host name. @@ -1198,13 +1198,13 @@ hive.exec.default.partition.name __HIVE_DEFAULT_PARTITION__ - The default partition name in case the dynamic partition column value is null/empty string or anyother values that cannot be escaped. This value must not contain any special character used in HDFS URI (e.g., ':', '%', '/' etc). The user has to be aware that the dynamic partition value should not contain this value to avoid confusions. + The default partition name in case the dynamic partition column value is null/empty string or any other values that cannot be escaped. This value must not contain any special character used in HDFS URI (e.g., ':', '%', '/' etc). The user has to be aware that the dynamic partition value should not contain this value to avoid confusions. hive.stats.dbclass counter - The storage that stores temporary hive statistics. Currently, jdbc, hbase, counter and custom type is supported + The storage that stores temporary Hive statistics. Currently, jdbc, hbase, counter and custom type are supported. @@ -1216,13 +1216,13 @@ hive.stats.jdbcdriver org.apache.derby.jdbc.EmbeddedDriver - The JDBC driver for the database that stores temporary hive statistics. + The JDBC driver for the database that stores temporary Hive statistics. 
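The stats properties above work as a set when the jdbc backend is chosen. A hedged sketch using the Derby values that appear in this template (the jdbc:derby form for hive.stats.dbclass is an assumption based on the "jdbc" type the description names; the template default is counter):

```xml
<property>
  <name>hive.stats.dbclass</name>
  <!-- assumed form for the jdbc backend; template default is counter -->
  <value>jdbc:derby</value>
</property>
<property>
  <name>hive.stats.jdbcdriver</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
</property>
<property>
  <name>hive.stats.dbconnectionstring</name>
  <value>jdbc:derby:;databaseName=TempStatsStore;create=true</value>
</property>
```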
hive.stats.dbconnectionstring jdbc:derby:;databaseName=TempStatsStore;create=true - The default connection string for the database that stores temporary hive statistics. + The default connection string for the database that stores temporary Hive statistics. @@ -1252,14 +1252,14 @@ hive.stats.retries.wait 3000 - The base waiting window (in milliseconds) before the next retry. The actual wait time is calculated by baseWindow * failues baseWindow * (failure 1) * (random number between [0.0,1.0]). + The base waiting window (in milliseconds) before the next retry. The actual wait time is calculated by baseWindow * failures + baseWindow * (failures + 1) * (random number between [0.0,1.0]). hive.stats.reliable false Whether queries will fail because stats cannot be collected completely accurately. - If this is set to true, reading/writing from/into a partition may fail becuase the stats + If this is set to true, reading/writing from/into a partition may fail because the stats could not be computed accurately. @@ -1301,7 +1301,7 @@ hive.support.concurrency false - Whether hive supports concurrency or not. A zookeeper instance must be up and running for the default hive lock manager to support read-write locks. + Whether Hive supports concurrency or not. A ZooKeeper instance must be up and running for the default Hive lock manager to support read-write locks. @@ -1325,25 +1325,25 @@ hive.zookeeper.quorum - The list of zookeeper servers to talk to. This is only needed for read/write locks. + The list of ZooKeeper servers to talk to. This is only needed for read/write locks. hive.zookeeper.client.port 2181 - The port of zookeeper servers to talk to. This is only needed for read/write locks. + The port of ZooKeeper servers to talk to. This is only needed for read/write locks. hive.zookeeper.session.timeout 600000 - Zookeeper client's session timeout. The client is disconnected, and as a result, all locks released, if a heartbeat is not sent in the timeout. 
+ ZooKeeper client's session timeout. The client is disconnected, and as a result, all locks released, if a heartbeat is not sent in the timeout. hive.zookeeper.namespace hive_zookeeper_namespace - The parent node under which all zookeeper nodes are created. + The parent node under which all ZooKeeper nodes are created. @@ -1355,7 +1355,7 @@ fs.har.impl org.apache.hadoop.hive.shims.HiveHarFileSystem - The implementation for accessing Hadoop Archives. Note that this won't be applicable to Hadoop vers less than 0.20 + The implementation for accessing Hadoop Archives. Note that this won't be applicable to Hadoop versions less than 0.20 @@ -1367,13 +1367,13 @@ hive.fetch.output.serde org.apache.hadoop.hive.serde2.DelimitedJSONSerDe - The serde used by FetchTask to serialize the fetch output. + The SerDe used by FetchTask to serialize the fetch output. hive.exec.mode.local.auto false - Let hive determine whether to run in local mode automatically + Let Hive determine whether to run in local mode automatically @@ -1443,19 +1443,19 @@ hive.conf.validation true - Eables type checking for registered hive configurations + Enables type checking for registered Hive configurations hive.security.authorization.enabled false - enable or disable the hive client authorization + enable or disable the Hive client authorization hive.security.authorization.manager org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider - the hive client authorization manager class name. + The Hive client authorization manager class name. The user defined authorization class should implement interface org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider. 
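Turning on the concurrency support described above requires pointing Hive at a running ZooKeeper ensemble. A sketch (host names are hypothetical):

```xml
<property>
  <name>hive.support.concurrency</name>
  <!-- the default lock manager needs ZooKeeper for read-write locks -->
  <value>true</value>
</property>
<property>
  <name>hive.zookeeper.quorum</name>
  <!-- hypothetical hosts; replace with your ensemble -->
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
<property>
  <name>hive.zookeeper.client.port</name>
  <value>2181</value>
</property>
```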
@@ -1516,13 +1516,13 @@ hive.security.command.whitelist set,reset,dfs,add,delete - Comma seperated list of non-SQL Hive commands users are authorized to execute + Comma separated list of non-SQL Hive commands users are authorized to execute hive.conf.restricted.list - Comma seperated list of configuration options which are immutable at runtime + Comma separated list of configuration options which are immutable at runtime @@ -1537,13 +1537,13 @@ hive.error.on.empty.partition false - Whether to throw an excpetion if dynamic partition insert generates empty results. + Whether to throw an exception if dynamic partition insert generates empty results. hive.index.compact.file.ignore.hdfs false - True the hdfs location stored in the index file will be igbored at runtime. + When true the HDFS location stored in the index file will be ignored at runtime. If the data got moved or the name of the cluster got changed, the index data should still be usable. @@ -1634,7 +1634,7 @@ hive.exec.concatenate.check.index true - If this sets to true, hive will throw error when doing + If this is set to true, Hive will throw an error when doing 'alter table tbl_name [partSpec] concatenate' on a table/partition that has indexes on it. The reason the user wants to set this to true is because it can help the user avoid handling all index drop, recreation, @@ -1666,7 +1666,7 @@ hive.autogen.columnalias.prefix.includefuncname false - Whether to include function name in the column alias auto generated by hive. + Whether to include function name in the column alias auto generated by Hive. 
@@ -1678,7 +1678,7 @@ hive.start.cleanup.scratchdir false - To cleanup the hive scratchdir while starting the hive server + To cleanup the Hive scratchdir while starting the Hive Server @@ -1713,8 +1713,9 @@ hive.exec.driver.run.hooks - A comma separated list of hooks which implement HiveDriverRunHook and will be run at the - beginning and end of Driver.run, these will be run in the order specified + A comma separated list of hooks which implement HiveDriverRunHook + and will be run at the beginning and end of Driver.run; these will be run in + the order specified. @@ -1732,7 +1733,7 @@ false This adds an option to escape special chars (newlines, carriage returns and - tabs) when they are passed to the user script. This is useful if the hive tables + tabs) when they are passed to the user script. This is useful if the Hive tables can contain data that contains special characters. @@ -1741,9 +1742,9 @@ hive.exec.rcfile.use.explicit.header true - If this is set the header for RC Files will simply be RCF. If this is not + If this is set the header for RCFiles will simply be RCF. If this is not set the header will be that borrowed from sequence files, e.g. SEQ- followed - by the input and output RC File formats. + by the input and output RCFile formats. @@ -1779,7 +1780,7 @@ Some select queries can be converted to single FETCH task minimizing latency. Currently the query should be single sourced not having any subquery and should not have - any aggregations or distincts (which incurrs RS), lateral views and joins. + any aggregations or distincts (which incurs RS), lateral views and joins. 1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only 2. more : SELECT, FILTER, LIMIT only (TABLESAMPLE, virtual columns) @@ -1799,8 +1800,8 @@ hive.fetch.task.aggr false - Aggregation queries with no group-by clause (for example, select count(*) from src) executes - final aggregations in single reduce task. 
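The minimal/more levels listed above belong to Hive's fetch-conversion setting; the property name is not visible in this hunk, so hive.fetch.task.conversion is given here as an assumption. A sketch:

```xml
<property>
  <!-- assumed property name for the minimal/more levels described above -->
  <name>hive.fetch.task.conversion</name>
  <!-- minimal: partition-column FILTER + LIMIT only; more: SELECT/FILTER/LIMIT -->
  <value>more</value>
</property>
```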
If this is set true, hive delegates final aggregation + Aggregation queries with no group-by clause (for example, select count(*) from src) execute + final aggregations in a single reduce task. If this is set to true, Hive delegates the final aggregation stage to the fetch task, possibly decreasing the query time. @@ -1826,7 +1827,7 @@ hive.hmshandler.retry.interval 1000 - The number of miliseconds between HMSHandler retry attempts + The number of milliseconds between HMSHandler retry attempts @@ -1838,7 +1839,7 @@ hive.server.tcp.keepalive true - Whether to enable TCP keepalive for the Hive server. Keepalive will prevent accumulation of half-open connections. + Whether to enable TCP keepalive for the Hive Server. Keepalive will prevent accumulation of half-open connections. @@ -1977,7 +1978,7 @@ must be a proper implementation of the interface org.apache.hive.service.auth.PasswdAuthenticationProvider. HiveServer2 will call its Authenticate(user, passed) method to authenticate requests. - The implementation may optionally extend the Hadoop's + The implementation may optionally extend Hadoop's org.apache.hadoop.conf.Configured class to grab Hive's Configuration object. @@ -2018,8 +2019,8 @@ hive.server2.enable.doAs true - Setting this property to true will have hive server2 execute - hive operations as the user making the calls to it. + Setting this property to true will have HiveServer2 execute + Hive operations as the user making the calls to it. 
@@ -2027,9 +2028,9 @@ hive.server2.table.type.mapping CLASSIC - This setting reflects how HiveServer will report the table types for JDBC and other - client implementations that retrieves the available tables and supported table types - HIVE : Exposes the hive's native table tyes like MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW + This setting reflects how HiveServer2 will report the table types for JDBC and other + client implementations that retrieve the available tables and supported table types + HIVE : Exposes Hive's native table types like MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW CLASSIC : More generic types like TABLE and VIEW @@ -2038,11 +2039,11 @@ hive.server2.thrift.sasl.qop auth Sasl QOP value; Set it to one of following values to enable higher levels of - protection for hive server2 communication with clients. + protection for HiveServer2 communication with clients. "auth" - authentication only (default) "auth-int" - authentication plus integrity protection "auth-conf" - authentication plus integrity and confidentiality protection - This is applicable only hive server2 is configured to use kerberos authentication. + This is applicable only if HiveServer2 is configured to use Kerberos authentication. @@ -2086,7 +2087,7 @@ hive.compute.query.using.stats false - When set to true hive will answer few queries like count(1) purely using stats + When set to true Hive will answer a few queries like count(1) purely using stats stored in metastore. For basic stats collection turn on the config hive.stats.autogather to true. For more advanced stats collection need to run analyze table queries. @@ -2098,7 +2099,7 @@ Enforce metastore schema version consistency. True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic - schema migration attempt. Users are required to manully migrate schema after Hive upgrade which ensures + schema migration attempt. 
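For a Kerberos-secured HiveServer2, the impersonation and QOP settings above combine as follows; auth-conf is shown only to illustrate the strongest level, not as a recommendation:

```xml
<property>
  <name>hive.server2.enable.doAs</name>
  <!-- execute operations as the calling user -->
  <value>true</value>
</property>
<property>
  <!-- effective only when HiveServer2 uses Kerberos, per the description -->
  <name>hive.server2.thrift.sasl.qop</name>
  <!-- auth | auth-int | auth-conf -->
  <value>auth-conf</value>
</property>
```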
Users are required to manually migrate the schema after a Hive upgrade, which ensures proper metastore schema migration. (Default) False: Warn if the version information stored in metastore doesn't match the one from the Hive jars.