Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Not A Bug
- Environment: spark-2.4.4-bin-hadoop2.7
Description
When I used Hudi to create a Hudi table and write it to S3, I used the Maven snippet below, which is recommended by https://hudi.apache.org/s3_hoodie.html:
<dependency>
  <groupId>org.apache.hudi</groupId>
  <artifactId>hudi-spark-bundle</artifactId>
  <version>0.5.0-incubating</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <version>2.7.3</version>
</dependency>
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk</artifactId>
  <version>1.10.34</version>
</dependency>
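For sbt builds, the equivalent coordinates would be something like the following (a sketch using the same versions as the Maven snippet above):

libraryDependencies ++= Seq(
  "org.apache.hudi" % "hudi-spark-bundle" % "0.5.0-incubating",
  "org.apache.hadoop" % "hadoop-aws" % "2.7.3",
  "com.amazonaws" % "aws-java-sdk" % "1.10.34"
)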
I also added the following configuration:
sc.hadoopConfiguration.set("fs.defaultFS", "s3://niketest1")
sc.hadoopConfiguration.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
sc.hadoopConfiguration.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
sc.hadoopConfiguration.set("fs.s3.awsAccessKeyId", "xxxxxx")
sc.hadoopConfiguration.set("fs.s3.awsSecretAccessKey", "xxxxx")
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "xxxxxx")
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "xxxxx")
My Spark version is spark-2.4.4-bin-hadoop2.7, and I run the code below:

val hudiOptions = Map[String, String](
  HoodieWriteConfig.TABLE_NAME -> "hudi12",
  DataSourceWriteOptions.OPERATION_OPT_KEY -> DataSourceWriteOptions.INSERT_OPERATION_OPT_VAL,
  DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY -> "rider",
  DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY -> DataSourceWriteOptions.MOR_STORAGE_TYPE_OPT_VAL)
val hudiTablePath = "s3://niketest1/hudi_test/hudi12"
df.write.format("org.apache.hudi").options(hudiOptions).mode(SaveMode.Overwrite).save(hudiTablePath)
The following exception occurs:
java.lang.IllegalArgumentException: BlockAlignedAvroParquetWriter does not support scheme s3n
  at org.apache.hudi.common.io.storage.HoodieWrapperFileSystem.getHoodieScheme(HoodieWrapperFileSystem.java:109)
  at org.apache.hudi.common.io.storage.HoodieWrapperFileSystem.convertToHoodiePath(HoodieWrapperFileSystem.java:85)
  at org.apache.hudi.io.storage.HoodieParquetWriter.<init>(HoodieParquetWriter.java:57)
  at org.apache.hudi.io.storage.HoodieStorageWriterFactory.newParquetStorageWriter(HoodieStorageWriterFactory.java:60)
  at org.apache.hudi.io.storage.HoodieStorageWriterFactory.getStorageWriter(HoodieStorageWriterFactory.java:44)
  at org.apache.hudi.io.HoodieCreateHandle.<init>(HoodieCreateHandle.java:70)
  at org.apache.hudi.func.CopyOnWriteLazyInsertIterable$CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteLazyInsertIterable.java:137)
  at org.apache.hudi.func.CopyOnWriteLazyInsertIterable$CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteLazyInsertIterable.java:125)
  at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38)
  at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:120)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
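Judging from the top frame, the exception comes from a scheme check inside HoodieWrapperFileSystem.getHoodieScheme, which apparently rejects schemes it does not recognize. A minimal sketch of that kind of check (illustrative only, not Hudi's actual source; the supported-scheme set below is an assumption):

// Sketch of a scheme whitelist check like the one suggested by the stack
// trace above; the set of supported schemes is assumed for illustration.
val supportedSchemes = Set("file", "hdfs", "s3")

def getHoodieScheme(scheme: String): String =
  if (supportedSchemes.contains(scheme)) "hoodie-" + scheme
  else throw new IllegalArgumentException(
    s"BlockAlignedAvroParquetWriter does not support scheme $scheme")

// getHoodieScheme("s3n") would then throw, matching the exception above.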
Can anyone tell me what causes this exception? I tried using org.apache.hadoop.fs.s3.S3FileSystem in place of org.apache.hadoop.fs.s3native.NativeS3FileSystem for the "fs.s3.impl" setting, but a different exception occurred, and it seems org.apache.hadoop.fs.s3.S3FileSystem targets Hadoop 2.6.
Thanks in advance.
Issue Links
- is depended upon by: HUDI-901 Bug Bash 0.6.0 Tracking Ticket (Resolved)