HIVE-22566

Drop table involved in materialized view leaves the table in inconsistent state


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: HiveServer2
    • Labels: None

    Description

      If you try to drop a table that is part of the definition of an existing materialized view, the table is not dropped. That is the desired outcome, since the table is still referenced by the materialized view.

      However, the "drop" call is still issued against the table; it partially executes before failing, leaving the table in an inconsistent state.

       

      Repro:

      -------

      1) Create tables:

       

      CREATE TABLE emps (
        empid INT,
        deptno INT,
        name VARCHAR(256),
        salary FLOAT,
        hire_date TIMESTAMP)
      STORED AS ORC
      TBLPROPERTIES ('transactional'='true');

      CREATE TABLE depts (
        deptno INT,
        deptname VARCHAR(256),
        locationid INT)
      STORED AS ORC
      TBLPROPERTIES ('transactional'='true');
      

       

      2) Create the MV:

       

      CREATE MATERIALIZED VIEW mv1 AS
      SELECT empid, deptname, hire_date
      FROM emps JOIN depts ON (emps.deptno = depts.deptno)
      WHERE hire_date >= '2016-01-01';
      

       

      3) The backend database contains the following at this point:

       

      mysql> select TBL_ID, DB_ID, SD_ID, TBL_NAME, TBL_TYPE from TBLS where DB_ID=16;
      +--------+-------+-------+----------+-------------------+
      | TBL_ID | DB_ID | SD_ID | TBL_NAME | TBL_TYPE          |
      +--------+-------+-------+----------+-------------------+
      |     81 |    16 |    81 | emps     | MANAGED_TABLE     |
      |     83 |    16 |    83 | depts    | MANAGED_TABLE     |
      |     84 |    16 |    84 | mv1      | MATERIALIZED_VIEW |
      +--------+-------+-------+----------+-------------------+
      3 rows in set (0.00 sec)
      

       

      4) Let's drop the 'emps' table:

       

      0: jdbc:hive2://c1122-node2.squadron.support.> drop table emps;
      INFO  : Compiling command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b): drop table emps
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
      INFO  : Completed compiling command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b); Time taken: 0.05 seconds
      INFO  : Executing command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b): drop table emps
      INFO  : Starting task [Stage-0:DDL] in serial mode
      INFO  : Completed executing command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b); Time taken: 10.281 seconds
      INFO  : OK
      No rows affected (16.949 seconds)
      

      No error is displayed.

       

      5) List tables:

       

      0: jdbc:hive2://c1122-node2.squadron.support.> show tables;
      INFO  : Compiling command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413): show tables
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
      INFO  : Completed compiling command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413); Time taken: 0.041 seconds
      INFO  : Executing command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413): show tables
      INFO  : Starting task [Stage-0:DDL] in serial mode
      INFO  : Completed executing command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413); Time taken: 0.016 seconds
      INFO  : OK
      +-----------+
      | tab_name  |
      +-----------+
      | depts     |
      | emps      |
      +-----------+
      2 rows selected (0.08 seconds)
      

       

      6) Now, from the backend database's point of view:

       

      mysql> select TBL_ID, DB_ID, SD_ID, TBL_NAME, TBL_TYPE from TBLS where DB_ID=16;
      +--------+-------+-------+----------+-------------------+
      | TBL_ID | DB_ID | SD_ID | TBL_NAME | TBL_TYPE          |
      +--------+-------+-------+----------+-------------------+
      |     81 |    16 |  NULL | emps     | MANAGED_TABLE     |
      |     83 |    16 |    83 | depts    | MANAGED_TABLE     |
      |     84 |    16 |    84 | mv1      | MATERIALIZED_VIEW |
      +--------+-------+-------+----------+-------------------+
      3 rows in set (0.00 sec)
      

      The table is left with NULL in SD_ID, making it unusable.
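
      The reason the delete from TBLS could not complete is the dependency row that the materialized view keeps in MV_TABLES_USED (see the MV_TABLES_USED_FK2 constraint in the Metastore log below). A quick backend check along the following lines, sketched here with column names assumed from that foreign key, should show the row that still references 'emps' (TBL_ID = 81):

      -- Sketch only; the exact MV_TABLES_USED column set may vary by metastore schema version.
      -- No output is reproduced here.
      select * from MV_TABLES_USED where TBL_ID = 81;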

       

      7) From the Metastore log:

       

      2019-12-02T20:00:25,545 INFO  [pool-6-thread-195]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(907)) - 196: source:172.25.34.150 drop_table : tbl=hive.mvs.emps
      2019-12-02T20:00:25,545 INFO  [pool-6-thread-195]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(349)) - ugi=hive	ip=172.25.34.150	cmd=source:172.25.34.150 drop_table : tbl=hive.mvs.emps	
      2019-12-02T20:00:25,580 INFO  [pool-6-thread-195]: metastore.ObjectStore$RetryingExecutor (ObjectStore.java:run(9966)) - Attempting to acquire the DB log notification lock: 0 out of 10 retries
      javax.jdo.JDODataStoreException: Error executing SQL query "select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update".
      	at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) ~[datanucleus-api-jdo-4.2.4.jar:?]
      	at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) ~[datanucleus-api-jdo-4.2.4.jar:?]
      	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216) ~[datanucleus-api-jdo-4.2.4.jar:?]
      	at org.apache.hadoop.hive.metastore.ObjectStore.lambda$lockForUpdate$0(ObjectStore.java:9936) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:9963) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:9938) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:10002) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source) ~[?:?]
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
      	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
      	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at com.sun.proxy.$Proxy28.addNotificationEvent(Unknown Source) [?:?]
      	at org.apache.hive.hcatalog.listener.DbNotificationListener.process(DbNotificationListener.java:968) [hive-hcatalog-server-extensions-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hive.hcatalog.listener.DbNotificationListener.onDropTable(DbNotificationListener.java:198) [hive-hcatalog-server-extensions-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$19.notify(MetaStoreListenerNotifier.java:99) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:273) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:335) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2670) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:2842) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
      	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
      	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at com.sun.proxy.$Proxy30.drop_table_with_environment_context(Unknown Source) [?:?]
      	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15533) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15517) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_112]
      	at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) [hadoop-common-3.1.1.3.1.0.0-78.jar:?]
      	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
      	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
      Caused by: java.sql.BatchUpdateException: Cannot delete or update a parent row: a foreign key constraint fails ("hive"."MV_TABLES_USED", CONSTRAINT "MV_TABLES_USED_FK2" FOREIGN KEY ("TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
      	at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2058) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1471) ~[mysql-connector-java.jar:?]
      	at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125) ~[HikariCP-2.6.1.jar:?]
      	at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java) ~[HikariCP-2.6.1.jar:?]
      	at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:366) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:676) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:319) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:211) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:633) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) ~[datanucleus-core-4.1.17.jar:?]
      	at org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.query.Query.execute(Query.java:1726) ~[datanucleus-core-4.1.17.jar:?]
      	at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374) ~[datanucleus-api-jdo-4.2.4.jar:?]
      	... 37 more
      Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot delete or update a parent row: a foreign key constraint fails ("hive"."MV_TABLES_USED", CONSTRAINT "MV_TABLES_USED_FK2" FOREIGN KEY ("TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_112]
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_112]
      	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_112]
      	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_112]
      	at com.mysql.jdbc.Util.handleNewInstance(Util.java:411) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.Util.getInstance(Util.java:386) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1041) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4187) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2820) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2159) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2462) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2010) ~[mysql-connector-java.jar:?]
      	at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1471) ~[mysql-connector-java.jar:?]
      	at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125) ~[HikariCP-2.6.1.jar:?]
      	at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java) ~[HikariCP-2.6.1.jar:?]
      	at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:366) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:676) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:319) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:211) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:633) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) ~[datanucleus-core-4.1.17.jar:?]
      	at org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) ~[datanucleus-rdbms-4.1.19.jar:?]
      	at org.datanucleus.store.query.Query.execute(Query.java:1726) ~[datanucleus-core-4.1.17.jar:?]
      	at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374) ~[datanucleus-api-jdo-4.2.4.jar:?]
      	... 37 more
      
      
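      The innermost exception pinpoints the cause: the DELETE of the 'emps' row from TBLS is rejected because MV_TABLES_USED still holds a foreign key to it, while the storage descriptor reference (SD_ID) has already been cleared, as seen in step 6. The constraint quoted in the error corresponds roughly to the following definition in the backend MySQL schema (reconstructed from the error message, not copied from the schema scripts):

      ALTER TABLE `MV_TABLES_USED`
        ADD CONSTRAINT `MV_TABLES_USED_FK2`
        FOREIGN KEY (`TBL_ID`) REFERENCES `TBLS` (`TBL_ID`);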

       

       

      8) If you try to query the table:

       

      0: jdbc:hive2://c1122-node2.squadron.support.> select * from emps;
      Error: Error while compiling statement: FAILED: SemanticException Unable to fetch table emps. null (state=42000,code=40000)
      

       

      It fails as expected.

      9) If you try to query the MV:

      0: jdbc:hive2://c1122-node2.squadron.support.> select * from mv1;
      INFO  : Compiling command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2): select * from mv1
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:mv1.empid, type:int, comment:null), FieldSchema(name:mv1.deptname, type:varchar(256), comment:null), FieldSchema(name:mv1.hire_date, type:timestamp, comment:null)], properties:null)
      INFO  : Completed compiling command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2); Time taken: 0.229 seconds
      INFO  : Executing command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2): select * from mv1
      INFO  : Completed executing command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2); Time taken: 0.01 seconds
      INFO  : OK
      +------------+---------------+----------------+
      | mv1.empid  | mv1.deptname  | mv1.hire_date  |
      +------------+---------------+----------------+
      +------------+---------------+----------------+
      No rows selected (0.276 seconds)
      

      It does not fail: the underlying data has not changed, and the materialized view is still treated as valid.

       

      10) Insert data into the "depts" table and rebuild the MV:

      $ INSERT INTO TABLE depts VALUES (101,'IT',25);
      $ INSERT INTO TABLE depts VALUES (102,'Eng',11);
      
      0: jdbc:hive2://c1122-node2.squadron.support.> ALTER MATERIALIZED VIEW mvs.mv1 REBUILD;
      Error: Error while compiling statement: FAILED: SemanticException Unable to fetch table emps. null (state=42000,code=40000)
      

      This fails as expected.
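
      For completeness, here is a sketch of how to avoid the partial drop in the first place (this is not part of the original report): drop the dependent materialized view before dropping the base table, so the metastore never rejects a DROP TABLE halfway through.

      -- Sketch; assumes the 'mvs' database used in the repro above.
      DROP MATERIALIZED VIEW mvs.mv1;
      DROP TABLE mvs.emps;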

       

      Attachments

        1. HIVE-22566.02.patch
          8 kB
          Pablo Junge
        2. HIVE-22566.05.patch
          14 kB
          Pablo Junge
        3. HIVE-22566.patch
          7 kB
          Pablo Junge

            People

              Assignee: Pablo Junge
              Reporter: Pablo Junge
              Votes: 0
              Watchers: 4
