Details
- Sub-task
- Status: Open
- Major
- Resolution: Unresolved
- 0.9.0
- None
- None
Description
Test the Presto integration in an HDFS environment in addition to S3.
Blockers faced so far
bdscheller, I tried to apply your Presto patch to test MOR queries on Presto. I built a Docker image from your patch and used that image in the Hudi local Docker environment. I observed a couple of issues there:
- I got NoClassDefFoundError for these classes:
- org/apache/parquet/avro/AvroSchemaConverter
- org/apache/parquet/hadoop/ParquetFileReader
- org/apache/parquet/io/InputFile
- org/apache/parquet/format/TypeDefinedOrder
I was able to get around the first three errors by shading org.apache.parquet inside hudi-presto-bundle and changing presto-hive to depend on hudi-presto-bundle. However, shading didn't help for the last one because it's already a Thrift-generated class. I am wondering whether you also ran into similar issues while testing on S3.
Could you please elaborate on your test setup so we can do something similar for HDFS as well? If we need to add more changes to hudi-presto-bundle, we would need to prioritize that for the 0.5.3 release ASAP.
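To narrow down which JAR, if any, supplies each of the four classes listed above, a small probe like the following might help. This is only a diagnostic sketch: the class name `ClasspathProbe` and the marker strings it returns are made up for illustration, and it assumes the probe is run with the same JARs on the classpath that the presto-hive plugin sees (Presto loads plugins through separate class loaders, so a plain `java -cp` run is an approximation).

```java
import java.security.CodeSource;
import java.util.List;

// Hypothetical helper, not part of Hudi or Presto.
final class ClasspathProbe {
    // Class names from the NoClassDefFoundError reports (slashes written as dots).
    static final List<String> SUSPECTS = List.of(
            "org.apache.parquet.avro.AvroSchemaConverter",
            "org.apache.parquet.hadoop.ParquetFileReader",
            "org.apache.parquet.io.InputFile",
            "org.apache.parquet.format.TypeDefinedOrder");

    // Returns the JAR or directory a class was loaded from, or a marker if it cannot be loaded.
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers.
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes have no code source, so report that case explicitly.
            return src == null ? "<bootstrap or unknown>" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "MISSING (" + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        for (String name : SUSPECTS) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running it once against the unshaded classpath and once against the classpath containing hudi-presto-bundle should show whether a class is absent altogether or is being picked up from an unexpected JAR. If the bundle relocates org.apache.parquet to a different package prefix, the relocated names would have to be probed instead.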
Issue Links
- blocks: HUDI-305 Presto MOR "_rt" queries only reads base parquet file (Resolved)