Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 3.1.0, 3.1.2, 3.2.0
- Fix Version/s: None
- Component/s: None
Description
There is a race condition between creating a Hive client and loading classes that do not appear in the spark.sql.hive.metastore.sharedPrefixes configuration. For example, we confirmed that the code fails with the following configuration:
spark.sql.hive.metastore.version        0.13.0
spark.sql.hive.metastore.jars           maven
spark.sql.hive.metastore.sharedPrefixes <string that does not include the com.amazonaws prefix>
spark.hadoop.fs.s3a.impl                org.apache.hadoop.fs.s3a.S3AFileSystem
And code:
-- Prerequisite commands to set up the table
-- drop table if exists ivan_test_2;
-- create table ivan_test_2 (a int, part string) using csv location 's3://bucket/hive-test' partitioned by (part);
-- insert into ivan_test_2 values (1, 'a');

-- Command that triggers failure
ALTER TABLE ivan_test_2 ADD PARTITION (part='b') LOCATION 's3://bucket/hive-test'
Stacktrace (line numbers might differ):
21/12/22 04:37:05 DEBUG IsolatedClientLoader: shared class: org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
21/12/22 04:37:05 DEBUG IsolatedClientLoader: shared class: org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
21/12/22 04:37:05 DEBUG IsolatedClientLoader: hive class: com.amazonaws.auth.EnvironmentVariableCredentialsProvider - null
21/12/22 04:37:05 ERROR S3AFileSystem: Failed to initialize S3AFileSystem for path s3://bucket/hive-test
java.io.IOException: From option fs.s3a.aws.credentials.provider java.lang.ClassNotFoundException: Class com.amazonaws.auth.EnvironmentVariableCredentialsProvider not found
    at org.apache.hadoop.fs.s3a.S3AUtils.loadAWSProviderClasses(S3AUtils.java:725)
    at org.apache.hadoop.fs.s3a.S3AUtils.createAWSCredentialProviderSet(S3AUtils.java:688)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:411)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
    at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
    at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:112)
    at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:144)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createLocationForAddedPartition(HiveMetaStore.java:1993)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_core(HiveMetaStore.java:1865)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_req(HiveMetaStore.java:1910)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
    at com.sun.proxy.$Proxy58.add_partitions_req(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.add_partitions(HiveMetaStoreClient.java:457)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
    at com.sun.proxy.$Proxy59.add_partitions(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.createPartitions(Hive.java:1514)
    at org.apache.spark.sql.hive.client.Shim_v0_13.createPartitions(HiveShim.scala:773)
    at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createPartitions$1(HiveClientImpl.scala:683)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:346)
    at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$retryLocked$1(HiveClientImpl.scala:247)
    at org.apache.spark.sql.hive.client.HiveClientImpl.synchronizeOnObject(HiveClientImpl.scala:283)
    at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:239)
    at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:326)
    at org.apache.spark.sql.hive.client.HiveClientImpl.createPartitions(HiveClientImpl.scala:676)
    at org.apache.spark.sql.hive.client.PoolingHiveClient.$anonfun$createPartitions$1(PoolingHiveClient.scala:345)
    at org.apache.spark.sql.hive.client.PoolingHiveClient.$anonfun$createPartitions$1$adapted(PoolingHiveClient.scala:344)
    at org.apache.spark.sql.hive.client.PoolingHiveClient.withHiveClient(PoolingHiveClient.scala:112)
    at org.apache.spark.sql.hive.client.PoolingHiveClient.createPartitions(PoolingHiveClient.scala:344)
    at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$createPartitions$1(HiveExternalCatalog.scala:1170)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$withClient$2(HiveExternalCatalog.scala:150)
    at org.apache.spark.sql.hive.HiveExternalCatalog.maybeSynchronized(HiveExternalCatalog.scala:111)
    at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$withClient$1(HiveExternalCatalog.scala:149)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:148)
    at org.apache.spark.sql.hive.HiveExternalCatalog.createPartitions(HiveExternalCatalog.scala:1152)
    at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createPartitions(ExternalCatalogWithListener.scala:213)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.createPartitions(SessionCatalog.scala:1552)
    at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.$anonfun$run$16(ddl.scala:530)
    at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.$anonfun$run$16$adapted(ddl.scala:529)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scala:943)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
    at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.run(ddl.scala:529)
    ...
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: Class com.amazonaws.auth.EnvironmentVariableCredentialsProvider not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2571)
    at org.apache.hadoop.fs.s3a.S3AUtils.loadAWSProviderClasses(S3AUtils.java:722)
    ... 118 more
Based on the stack trace, the call goes through the withHiveState method, which sets IsolatedClientLoader.classLoader as the class loader on the Hadoop configuration. All of the Hadoop configuration and file system code runs as a closure within withHiveState.
The Hadoop configuration then uses that class loader to load classes via conf.getClassByNameOrNull(). When the isolated class loader tries to load a class that is not shared (the com.amazonaws prefix is not in sharedPrefixes), it only searches the jars loaded by the isolated loader itself, without falling back to the base class loader.
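To illustrate the failure mode, here is a minimal, hypothetical sketch of prefix-based class loader isolation. It is not Spark's actual IsolatedClientLoader code, and all names in it are illustrative: classes matching a shared prefix are delegated to the base loader, while everything else is resolved only against the isolated jars, so a non-shared class that exists only in the base loader cannot be found.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class IsolationSketch {
    // Hypothetical sketch, not Spark's implementation: a child loader that
    // delegates to the base loader only for classes matching a shared prefix.
    static class IsolatedLoader extends URLClassLoader {
        private final ClassLoader base;
        private final String[] sharedPrefixes;

        IsolatedLoader(URL[] isolatedJars, ClassLoader base, String[] sharedPrefixes) {
            super(isolatedJars, null); // null parent: no automatic upward delegation
            this.base = base;
            this.sharedPrefixes = sharedPrefixes;
        }

        private boolean isShared(String name) {
            for (String p : sharedPrefixes) {
                if (name.startsWith(p)) return true;
            }
            return false;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (isShared(name)) {
                return base.loadClass(name); // shared class: ask the base loader
            }
            // "hive class": searched only in the isolated jars, never the base loader
            return super.loadClass(name, resolve);
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader base = IsolationSketch.class.getClassLoader();
        // Empty jar list, mimicking the AWS SDK being absent from the Hive jars.
        IsolatedLoader loader = new IsolatedLoader(
                new URL[0], base,
                new String[] {"java.", "org.apache.hadoop.fs.s3a."});

        // Shared prefix: delegated to the base loader and found.
        System.out.println(loader.loadClass("java.lang.String").getName());

        // Not shared: only the (empty) isolated jar list is searched, so loading
        // fails even though the base loader could resolve the class.
        try {
            loader.loadClass("com.amazonaws.auth.EnvironmentVariableCredentialsProvider");
        } catch (ClassNotFoundException e) {
            System.out.println("not found: " + e.getMessage());
        }
    }
}
```

Run against an empty isolated jar list, the shared lookup succeeds while the non-shared lookup throws ClassNotFoundException, mirroring the "hive class: com.amazonaws.auth.EnvironmentVariableCredentialsProvider - null" debug line above.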
To mitigate the issue, you need to set:
- spark.sql.hive.metastore.sharedPrefixes <string that includes the com.amazonaws prefix>
Or update the configured credential providers so that com.amazonaws.auth.EnvironmentVariableCredentialsProvider is not loaded.
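As a concrete example of the workaround, the setting could look like the following; the prefixes after com.amazonaws are, to the best of my knowledge, the documented defaults for spark.sql.hive.metastore.sharedPrefixes, and may differ across Spark versions, so verify against your version's documentation:

```
spark.sql.hive.metastore.sharedPrefixes com.amazonaws,com.mysql.jdbc,org.postgresql,com.microsoft.sqlserver,oracle.jdbc
```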
I think we should revisit the mechanism for sharing classes in IsolatedClientLoader, and perhaps only enforce that Hive classes are loaded first, rather than loading them in full isolation.
Issue Links
- relates to HADOOP-17372: S3A AWS Credential provider loading gets confused with isolated classloaders (Resolved)