Spark / SPARK-36669

Fail to load Lz4 codec

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.2.0
    • Component/s: SQL
    • Labels: None

    Description

      Spark currently uses the shaded client libraries from Hadoop 3.3.1. In Hadoop Common 3.3.1, lz4 is a provided dependency used by Lz4Codec, but it is not excluded from relocation in the shaded libraries. As a result, using lz4 as the Parquet codec fails with the following exception even when lz4 is included as a dependency:

      [info]   Cause: java.lang.NoClassDefFoundError: org/apache/hadoop/shaded/net/jpountz/lz4/LZ4Factory
      [info]   at org.apache.hadoop.io.compress.lz4.Lz4Compressor.<init>(Lz4Compressor.java:66)
      [info]   at org.apache.hadoop.io.compress.Lz4Codec.createCompressor(Lz4Codec.java:119)
      [info]   at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:152)
      [info]   at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:168)
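
      The trace suggests that the shaded Hadoop classes look up LZ4Factory under its relocated package name, which an ordinary lz4 jar does not provide. A quick way to check which of the two names is visible on a given classpath might look like the sketch below. The class name Lz4ClasspathProbe and the helper isLoadable are hypothetical and not part of Spark or Hadoop; the relocated name is taken from the trace, and net.jpountz.lz4.LZ4Factory is assumed to be the name an unshaded lz4 jar provides.

```java
public class Lz4ClasspathProbe {
    // Name assumed to be provided by an unshaded lz4 jar.
    static final String PLAIN = "net.jpountz.lz4.LZ4Factory";
    // Relocated name reported in the NoClassDefFoundError above.
    static final String RELOCATED =
        "org.apache.hadoop.shaded.net.jpountz.lz4.LZ4Factory";

    // Returns true if the class can be found, without running its static initializers.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, Lz4ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(PLAIN + " loadable: " + isLoadable(PLAIN));
        System.out.println(RELOCATED + " loadable: " + isLoadable(RELOCATED));
    }
}
```

      If the plain name is loadable and the relocated one is not, the classpath matches the situation described here: lz4 is present, but not under the name the shaded client asks for.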
       

       

      I have already submitted a PR to Hadoop to fix this. Until that fix is released, Spark must either downgrade to Hadoop 3.3.0 or revert to the non-shaded Hadoop client libraries.

            People

              Assignee: L. C. Hsieh (viirya)
              Reporter: L. C. Hsieh (viirya)
              Votes: 0
              Watchers: 3