Details
- Type: Sub-task
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Fix Version: 3.0.0
- Labels: None
Description
It would be convenient for testing to have a switch that enables the behavior where all suitable tables (currently ORC and not sorted) are automatically created with transactional=true, i.e. full ACID.
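A sketch of how such a switch could be enabled (the property name follows the CREATE_TABLES_AS_ACID discussion in the comments below; verify the exact key against the committed HiveConf/MetastoreConf before relying on it):

```
<!-- hive-site.xml: opt every eligible managed, non-sorted ORC table into full ACID.
     Property name assumed from this patch's discussion; confirm in HiveConf. -->
<property>
  <name>hive.create.as.acid</name>
  <value>true</value>
</property>
```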
Attachments
- HIVE-18294.05.patch (14 kB, Eugene Koifman)
- HIVE-18294.04.patch (16 kB, Eugene Koifman)
- HIVE-18294.03.patch (16 kB, Eugene Koifman)
- HIVE-18294.01.patch (20 kB, Eugene Koifman)
Activity
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12902716/HIVE-18294.03.patch
ERROR: -1 due to no test(s) being added or modified.
ERROR: -1 due to 50 failed/errored test(s), 11528 tests executed
Failed tests:
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynpart_sort_optimization_acid2] (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_orig_table] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_all] (batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_conversions] (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_decimal] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] (batchId=22)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_vectorization_original] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_orig_table] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mm_conversions] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[acid_vectorization_original_tez] (batchId=103)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=102)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=102)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vector_join_part_col_char] (batchId=102)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1] (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_10] (batchId=138)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketsortoptimize_insert_7] (batchId=128)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=113)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=248)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=209)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadData (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataNonAcid2AcidConversion (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataNonAcid2AcidConversionVectorized (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataPartitioned (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataUpdate (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataUpdateVectorized (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataVectorized (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.testAbort (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.testMultiStatement (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.testMultiStatementVectorized (batchId=257)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testNoBuckets (batchId=278)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testNonAcidToAcidVectorzied (batchId=278)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testToAcidConversion02 (batchId=278)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testNoBuckets (batchId=278)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testNonAcidToAcidVectorzied (batchId=278)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testToAcidConversion02 (batchId=278)
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.checkExpectedLocks (batchId=291)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=226)
org.apache.hive.hcatalog.streaming.TestStreaming.testTableValidation (batchId=200)
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8309/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8309/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8309/
Messages:
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 50 tests failed
This message is automatically generated.
ATTACHMENT ID: 12902716 - PreCommit-HIVE-Build
ekoifman I'm no expert in the area... but I feel that this option is closer to "hive" than to the "metastore". Is there any benefit to adding pieces on both sides for this?
I feel that moving this to the Hive side might also make it easier to enable for CTAS.
kgyrtkirk, I think a metastore listener is like a database trigger - it gets activated no matter how the change is made. For example, if someone used the Thrift API directly to create a table in the metastore, the listener would still get activated. Moving this to Hive would work better for CTAS - that is why it's in both places, but the Hive side only works for commands that go through SemanticAnalyzer.
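The "listener as database trigger" point can be illustrated with a minimal, plain-Java sketch (not Hive code; `TableStore` and its methods are hypothetical stand-ins for the metastore and its listener hooks):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Why a metastore listener acts like a database trigger: every code path that
// creates a table funnels through the same store method, so registered
// listeners fire regardless of which client made the call.
public class ListenerDemo {
    static class TableStore {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void addListener(Consumer<String> l) {
            listeners.add(l);
        }

        // Whether invoked by a query compiler or by a raw Thrift client,
        // the listeners always run here.
        void createTable(String name) {
            for (Consumer<String> l : listeners) {
                l.accept(name);
            }
        }
    }

    public static void main(String[] args) {
        TableStore store = new TableStore();
        List<String> validated = new ArrayList<>();
        store.addListener(validated::add);

        store.createTable("t1"); // "SQL path"
        store.createTable("t2"); // "direct Thrift path" - listener still fires

        System.out.println(validated); // [t1, t2]
    }
}
```

By contrast, logic placed only in SemanticAnalyzer would be one caller among many, which is the trade-off being discussed.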
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
Prechecks | |||
0 | findbugs | 0m 0s | Findbugs executables are not available. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
master Compile Tests | |||
0 | mvndep | 1m 27s | Maven dependency ordering for branch |
+1 | mvninstall | 5m 50s | master passed |
+1 | compile | 1m 51s | master passed |
+1 | checkstyle | 1m 12s | master passed |
+1 | javadoc | 2m 1s | master passed |
Patch Compile Tests | |||
0 | mvndep | 0m 21s | Maven dependency ordering for patch |
+1 | mvninstall | 2m 9s | the patch passed |
+1 | compile | 1m 46s | the patch passed |
+1 | javac | 1m 46s | the patch passed |
-1 | checkstyle | 0m 16s | standalone-metastore: The patch generated 1 new + 209 unchanged - 0 fixed = 210 total (was 209) |
-1 | checkstyle | 0m 39s | ql: The patch generated 2 new + 1089 unchanged - 0 fixed = 1091 total (was 1089) |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | javadoc | 1m 56s | the patch passed |
Other Tests | |||
+1 | asflicense | 0m 12s | The patch does not generate ASF License warnings. |
20m 26s |
Subsystem | Report/Notes |
---|---|
Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
Build tool | maven |
Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
git revision | master / 00212e0 |
Default Java | 1.8.0_111 |
checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8327/yetus/diff-checkstyle-standalone-metastore.txt |
checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8327/yetus/diff-checkstyle-ql.txt |
modules | C: common standalone-metastore ql U: . |
Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8327/yetus.txt |
Powered by | Apache Yetus http://yetus.apache.org |
This message was automatically generated.
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12902910/HIVE-18294.04.patch
ERROR: -1 due to no test(s) being added or modified.
ERROR: -1 due to 16 failed/errored test(s), 11528 tests executed
Failed tests:
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=160)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_10] (batchId=138)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketsortoptimize_insert_7] (batchId=128)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=248)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=209)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=226)
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8327/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8327/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8327/
Messages:
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
This message is automatically generated.
ATTACHMENT ID: 12902910 - PreCommit-HIVE-Build
auto_join25 has same failure in
https://builds.apache.org/job/PreCommit-HIVE-Build/8320/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_auto_join25_/
mvn test -Dtest=TestSparkPerfCliDriver -Dqfile=query39.q runs fine locally
no related failures
gates, could you review please?
hmm... interesting. Is there any setup in which a Thrift client is not Hive but can work with ACID tables?
"only work for commands that go through SemanticAnalyzer" - sorry, but I don't really understand which commands don't go that way; could you please give me some more detail?
I don't know of any specific example where a Thrift client is not Hive, but that doesn't mean one can't exist.
Our Thrift interface is public - anyone can call it directly to perform any operation on the metastore, including creating a table. That's why the listener concept is useful.
Why mirror CREATE_TABLES_AS_ACID in HiveConf in addition to MetastoreConf? Can't everything use the Metastore value? That way we don't have to worry about different parts of the code reading different values. I know I haven't cleaned up things in HiveConf yet, but I plan to deprecate all the HiveConf values mirrored in MetastoreConf and shift the Hive code to use the MetastoreConf values.
Other than that, looks good.
I would prefer to only have the prop in MetastoreConf and the logic in TransactionalValidationListener - that would have been clean but there is some pushback on this. See HIVE-18285.
Unfortunately, the CTAS command writes the data first (this part needs to know whether it's doing an ACID write) and only then creates the metastore object, which causes the listener to run. So I was forced to also put the same logic in SemanticAnalyzer.
I don't think the Conf object in SemanticAnalyzer has the MetastoreConf keys - but I would have to run a test to make sure; it's a little hazy at this point.
You're correct that the value will have to be put in the conf file on both HS2 (or wherever the server code is running) and on the metastore (assuming they are running on separate servers). But if the correct value is in the config file, the MetastoreConf methods will properly extract it. They were designed with exactly this in mind, so that we don't have to duplicate the conf enums. So you should be able to use MetastoreConf.getVar() to pull a value out of hive-site.xml.
This jira is resolved and released with Hive 3.0. If you find an issue with it, please create a new jira.
ekoifman gates In TransactionalValidationListener.makeAcid() there is the following code:
if (!TableType.MANAGED_TABLE.toString().equalsIgnoreCase(newTable.getTableType())) {
  //todo should this check be in conformToAcid()?
  LOG.info("Could not make " + Warehouse.getQualifiedName(newTable) + " acid: it's " + newTable.getTableType());
  return;
}
What is the expected behavior for tables that have MANAGED_TABLE type but have EXTERNAL parameter set? Should these be excluded as well?
public enum TableType {
  MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW
}
I expected the table type to be one of the above. I'm not sure what the EXTERNAL parameter indicates and how it differs from TableType.EXTERNAL_TABLE.
ekoifman There are two ways of specifying external tables: using tableType and using the table parameter EXTERNAL set to true. Some parts of the code use one method and some use the other; HIVE-19253 has some discussion about it. I am trying to clean this up a bit, so I need to understand the intention in the various places that use it.
OK, since I didn't even know about this table parameter, it's safe to assume all of my code checks TableType and ignores the table parameter. I would vote for TableType if you are trying to standardize.
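The ambiguity under discussion can be sketched outside Hive. This is a self-contained, hypothetical helper (not Hive's actual API): a table can look "managed" by its TableType while still carrying the legacy EXTERNAL=TRUE parameter, so a check that standardizes on one signal must decide what to do with the other.

```java
import java.util.HashMap;
import java.util.Map;

public class ExternalCheck {
    // Mirrors the enum quoted above.
    enum TableType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW }

    // Hypothetical helper: treats a table as external if EITHER signal says so,
    // i.e. the TableType or the legacy EXTERNAL=TRUE table parameter.
    static boolean isExternal(TableType type, Map<String, String> params) {
        if (type == TableType.EXTERNAL_TABLE) {
            return true;
        }
        return "TRUE".equalsIgnoreCase(params.getOrDefault("EXTERNAL", "FALSE"));
    }

    public static void main(String[] args) {
        Map<String, String> plain = new HashMap<>();
        Map<String, String> legacy = new HashMap<>();
        legacy.put("EXTERNAL", "TRUE");

        System.out.println(isExternal(TableType.MANAGED_TABLE, plain));  // false
        System.out.println(isExternal(TableType.EXTERNAL_TABLE, plain)); // true
        // The ambiguous case from the comment above: MANAGED_TABLE type,
        // but the legacy parameter says external.
        System.out.println(isExternal(TableType.MANAGED_TABLE, legacy)); // true
    }
}
```

A check like makeAcid() that looks only at TableType would treat the third case as a candidate for ACID conversion, which is exactly the question raised above.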