IMPALA-8243

ConcurrentModificationException in Catalog stress tests


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 3.1.0
    • Fix Version/s: Impala 3.2.0
    • Component/s: Catalog
    • Labels: None

    Description

      The following is the full stack trace from the Catalog server logs.

      14:09:29.474424 14829 jni-util.cc:256] java.util.ConcurrentModificationException
      java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
      java.util.ArrayList$Itr.next(ArrayList.java:851)
      org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1449)
      org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1278)
      org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1144)
      org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1062)
      org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:919)
      org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:815)
      org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:862)
      org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:759)
      org.apache.impala.thrift.TPartialPartitionInfo.write(TPartialPartitionInfo.java:665)
      org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:731)
      org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:624)
      org.apache.impala.thrift.TPartialTableInfo.write(TPartialTableInfo.java:543)
      org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:977)
      org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:857)
      org.apache.impala.thrift.TGetPartialCatalogObjectResponse.write(TGetPartialCatalogObjectResponse.java:739)
      org.apache.thrift.TSerializer.serialize(TSerializer.java:79)
      org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:233)
      

      It looks like the bug is in the following piece of code.

      /**
       * Returns a Hive-compatible partition object that may be used in calls to the
       * metastore.
       */
      public org.apache.hadoop.hive.metastore.api.Partition toHmsPartition() {
        if (cachedMsPartitionDescriptor_ == null) return null;
        Preconditions.checkNotNull(table_.getNonPartitionFieldSchemas());
        // Update the serde library class based on the currently used file format.
        org.apache.hadoop.hive.metastore.api.StorageDescriptor storageDescriptor =
            new org.apache.hadoop.hive.metastore.api.StorageDescriptor(
                table_.getNonPartitionFieldSchemas(),  // <== reference to the live field schema list
                getLocation(),
                cachedMsPartitionDescriptor_.sdInputFormat,
                cachedMsPartitionDescriptor_.sdOutputFormat,
                cachedMsPartitionDescriptor_.sdCompressed,
                // ... remaining constructor arguments elided in the original report

It appears we are leaking a reference to nonPartFieldSchemas_ into the Thrift object. Once the thread leaves the lock scope, some other thread (load(), for example) can modify the source list, at which point the serialization code can throw a ConcurrentModificationException.
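
For context, the exception is the fail-fast check in ArrayList's iterator: the Thrift write() path iterates the column list, and any structural modification made after the iterator was created makes the next call to next() throw. The snippet below is a hypothetical, single-threaded reduction of that mechanism (the names are made up, not Impala code); in the Catalog server the modification would come from a concurrent thread such as load().

      import java.util.ArrayList;
      import java.util.Iterator;
      import java.util.List;

      public class ComodificationDemo {
        public static void main(String[] args) {
          // Stands in for the shared nonPartFieldSchemas_ list owned by the table.
          List<String> fieldSchemas = new ArrayList<>(List.of("col_a", "col_b"));

          // Thrift serialization walks the list with an iterator, much like
          // StorageDescriptorStandardScheme.write() in the stack above.
          Iterator<String> it = fieldSchemas.iterator();
          it.next();

          // A concurrent load() mutating the same list instance is modeled here
          // as a structural modification between iteration steps.
          fieldSchemas.add("col_c");

          // The fail-fast check detects the change and throws.
          it.next();  // java.util.ConcurrentModificationException
        }
      }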

While the stack above is specific to Catalog-v2, it is possible that other threads can race in a similar fashion.
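
One way to close the race (a sketch only, using generic types and a made-up snapshotUnderLock helper rather than the real FieldSchema/StorageDescriptor classes, and not necessarily the committed fix) is to copy the shared list while still holding the table lock, so the Thrift response serializes a private snapshot:

      import java.util.ArrayList;
      import java.util.Collections;
      import java.util.List;

      final class SnapshotExample {
        // Copy the shared list while the caller still holds the table lock; the
        // copy is private to the response being built, so later mutations by
        // load() cannot invalidate the serializer's iterator.
        static <T> List<T> snapshotUnderLock(List<T> shared) {
          return Collections.unmodifiableList(new ArrayList<>(shared));
        }

        public static void main(String[] args) {
          List<String> nonPartFieldSchemas = new ArrayList<>(List.of("col_a", "col_b"));
          List<String> forThrift = snapshotUnderLock(nonPartFieldSchemas);

          // A later reload no longer affects what the serializer iterates over.
          nonPartFieldSchemas.add("col_c");
          forThrift.forEach(System.out::println);  // prints col_a, col_b
        }
      }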

          People

            Assignee: bharathv (Bharath Vissapragada)
            Reporter: bharathv (Bharath Vissapragada)
            Votes: 0
            Watchers: 4
