IMPALA-4902

Concurrent DDL may fail with a ConcurrentModificationException

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.9.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Catalog
    • Labels:

      Description

      Concurrent DDL operations may occasionally fail with a ConcurrentModificationException.
      Commands that modify the tblproperties map of a table or partition are affected, for example:

      • COMPUTE STATS
      • COMPUTE INCREMENTAL STATS
      • DROP STATS
      • ALTER TABLE SET TBLPROPERTIES
      • ALTER TABLE SET CACHED
      • Possibly others

      The logs will contain a stack trace like the one below, where the exception is thrown in Thrift-generated code while iterating over a HashMap.
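The failure mode described above can be reproduced in isolation. The following is a minimal, single-threaded sketch and not Impala code: the class name `FailFastDemo`, its methods, and the property keys are hypothetical and chosen only for illustration. It walks a `HashMap` the way a serializer would and adds a new key partway through, standing in for a DDL operation on another thread that changes the same map at that moment.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    // Hypothetical stand-in for the properties map of a table or partition.
    public static Map<String, String> sampleProperties() {
        Map<String, String> props = new HashMap<>();
        props.put("numRows", "100");
        props.put("totalSize", "4096");
        return props;
    }

    // Iterates over the map and inserts a new key during the first step,
    // imitating a concurrent writer. Returns true if the iterator detected
    // the change and threw ConcurrentModificationException.
    public static boolean modifiedDuringIteration(Map<String, String> props) {
        try {
            boolean modified = false;
            for (Map.Entry<String, String> entry : props.entrySet()) {
                if (!modified) {
                    // Adding a new key is a structural modification.
                    props.put("illustrative_new_key", "0");
                    modified = true;
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modifiedDuringIteration(sampleProperties()));
    }
}
```

`HashMap` iterators are fail-fast: they check for structural changes on each call to `next()` and throw when one is found. With real threads the same check fires only when the timing lines up, which would be consistent with the failures being intermittent.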

      Original Description
      Recently, the stress test was modified to include COMPUTE STATS statements with a variety of MT_DOP options.

      Occasionally those fail with:

      I0208 03:06:26.634202 67490 jni-util.cc:169] java.util.ConcurrentModificationException
              at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
              at java.util.HashMap$EntryIterator.next(HashMap.java:962)
              at java.util.HashMap$EntryIterator.next(HashMap.java:960)
              at org.apache.impala.thrift.THdfsPartition$THdfsPartitionStandardScheme.write(THdfsPartition.java:1832)
              at org.apache.impala.thrift.THdfsPartition$THdfsPartitionStandardScheme.write(THdfsPartition.java:1546)
              at org.apache.impala.thrift.THdfsPartition.write(THdfsPartition.java:1389)
              at org.apache.impala.thrift.THdfsTable$THdfsTableStandardScheme.write(THdfsTable.java:1243)
              at org.apache.impala.thrift.THdfsTable$THdfsTableStandardScheme.write(THdfsTable.java:1071)
              at org.apache.impala.thrift.THdfsTable.write(THdfsTable.java:940)
              at org.apache.impala.thrift.TTable$TTableStandardScheme.write(TTable.java:1628)
              at org.apache.impala.thrift.TTable$TTableStandardScheme.write(TTable.java:1399)
              at org.apache.impala.thrift.TTable.write(TTable.java:1208)
              at org.apache.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1241)
              at org.apache.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1098)
              at org.apache.impala.thrift.TCatalogObject.write(TCatalogObject.java:938)
              at org.apache.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:487)
              at org.apache.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:421)
              at org.apache.impala.thrift.TGetAllCatalogObjectsResponse.write(TGetAllCatalogObjectsResponse.java:365)
              at org.apache.thrift.TSerializer.serialize(TSerializer.java:79)
              at org.apache.impala.service.JniCatalog.getCatalogObjects(JniCatalog.java:124)
      I0208 03:06:26.653929 67490 status.cc:114] ConcurrentModificationException: null
          @          0x11c5241  (unknown)
          @          0x1597bfe  (unknown)
          @          0x11b50b3  (unknown)
          @          0x1177fbb  (unknown)
          @          0x1192c33  (unknown)
          @          0x1191ba2  (unknown)
          @          0x1190ed9  (unknown)
          @          0x118fedb  (unknown)
          @          0x13430ce  (unknown)
          @          0x15e804b  (unknown)
          @          0x15ef024  (unknown)
          @          0x15eef67  (unknown)
          @          0x15eeec2  (unknown)
          @          0x1a5661a  (unknown)
          @       0x3b81a079d1  (unknown)
          @       0x3b816e88fd  (unknown)
      E0208 03:06:26.654156 67490 catalog-server.cc:282] ConcurrentModificationException: null
      

      (To quantify "occasionally": last night's run had 3 such failures out of 2031 COMPUTE STATS statements, out of 10,000 total statements.)

      I haven't yet been able to figure out whether some MT_DOP setting makes this more likely to happen, or whether it requires multiple COMPUTE STATS statements running concurrently.

              People

              • Assignee:
                alex.behm Alexander Behm
              • Reporter:
                mikesbrown Michael Brown
