  1. IMPALA
  2. IMPALA-1416

Queries fail with metastore exception after upgrade and compute stats

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Not A Bug
    • Affects Version/s: Impala 2.0
    • Fix Version/s: Impala 2.0.1
    • Component/s: None
    • Labels: None

      Description

      Several users have reported that, after upgrading to Impala 2.0 and Hive 0.13.1, queries fail on any table on which they have run COMPUTE STATS.

      After running COMPUTE STATS, they get the following error when querying those tables:

      AnalysisException: Failed to load metadata for table: default.stats_test
      CAUSED BY: TableLoadingException: Failed to load metadata for table: stats_test
      CAUSED BY: TTransportException: null
      

      The Hive metastore log has the following error:

      2014-10-24 09:37:02,862 INFO  metastore.HiveMetaStore (HiveMetaStore.java:logInfo(632)) - 183: source:/172.24.16.192 get_table_statistics_req: db=default table=stats_test
      2014-10-24 09:37:02,862 INFO  HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(314)) - ugi=hive ip=/172.24.16.192 cmd=source:/172.24.16.192 get_table_statistics_req: db=default table=stats_test
      2014-10-24 09:37:02,873 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(251)) - Thrift error occurred during processing of message.
      org.apache.thrift.protocol.TProtocolException: Cannot write a TUnion with no set value!
          at org.apache.thrift.TUnion$TUnionStandardScheme.write(TUnion.java:240)
          at org.apache.thrift.TUnion$TUnionStandardScheme.write(TUnion.java:213)
          at org.apache.thrift.TUnion.write(TUnion.java:152)
          at org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj$ColumnStatisticsObjStandardScheme.write(ColumnStatisticsObj.java:550)
          at org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj$ColumnStatisticsObjStandardScheme.write(ColumnStatisticsObj.java:488)
          at org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj.write(ColumnStatisticsObj.java:414)
          at org.apache.hadoop.hive.metastore.api.TableStatsResult$TableStatsResultStandardScheme.write(TableStatsResult.java:388)
          at org.apache.hadoop.hive.metastore.api.TableStatsResult$TableStatsResultStandardScheme.write(TableStatsResult.java:338)
          at org.apache.hadoop.hive.metastore.api.TableStatsResult.write(TableStatsResult.java:288)
          at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_statistics_req_result$get_table_statistics_req_resultStandardScheme.write(ThriftHiveMetastore.java)
          at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_statistics_req_result$get_table_statistics_req_resultStandardScheme.write(ThriftHiveMetastore.java)
          at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_statistics_req_result.write(ThriftHiveMetastore.java)
          at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
          at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
          at org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48)
          at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:744)
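
      Reading the trace: the failure appears to occur while the metastore serializes its reply to get_table_statistics_req, at the point where a ColumnStatisticsObj writes a union-typed member that has no field set. The sketch below is a toy Python model of that failure mode only. It is not Thrift or Hive code; ToyUnion, ProtocolError, serialize_column_stats, and the field names are hypothetical and exist purely for illustration.

```python
class ProtocolError(Exception):
    """Hypothetical stand-in for a serialization-time protocol error."""


class ToyUnion:
    """Toy tagged union: exactly one field must be set before it can be written."""

    # Hypothetical field names, chosen only for this illustration.
    ALLOWED_FIELDS = ("long_stats", "double_stats", "string_stats")

    def __init__(self):
        self.set_field = None
        self.value = None

    def set(self, field, value):
        if field not in self.ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {field}")
        self.set_field = field
        self.value = value

    def write(self):
        # A union with no field set has nothing to put on the wire,
        # so serialization is refused rather than emitting an empty value.
        if self.set_field is None:
            raise ProtocolError("Cannot write a union with no set value")
        return {self.set_field: self.value}


def serialize_column_stats(columns):
    """Serialize (column_name, ToyUnion) pairs; one unset union fails the whole reply."""
    return [{"name": name, "stats": union.write()} for name, union in columns]
```

      In this model a single column whose statistics union was left unset makes the entire response unserializable, which would be consistent with the client seeing only a bare transport error (TTransportException: null) rather than a descriptive message.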
      

      Additional notes from the list:

      Dropping and re-creating the tables and partitions in Hive restores access (if you're fortunate enough to have external tables), but running COMPUTE STATS on the recreated table renders it inaccessible again. Table metadata and data remain accessible in Hive even when they are inaccessible in Impala. We can compute stats in Hive without issue, and Impala seems to be able to use them, as the 'Warning: Missing relevant stats...' message disappears from query profiles.

            People

            • Assignee: Henry Robinson (henryr)
            • Reporter: Matthew Jacobs (mjacobs)
            • Votes: 2
            • Watchers: 5
