CarbonData / CARBONDATA-989

Decompression error when loading 'gz' and 'bz2' data into a table

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0, 1.1.1
    • Component/s: None
    • Labels: None
    • Environment:
      spark 2.1.0
      hadoop 2.6.0 - CDH 5.5.2

      Description

      Run the following commands in the Spark shell:
      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.CarbonSession._
      val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://nsha/user/ranmx/test/carbon")
      carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, city string, age Int) STORED BY 'carbondata'")
      carbon.sql("LOAD DATA inpath '/ranmx/test/sh.csv.bz2' INTO TABLE test_table")

      The LOAD DATA statement fails with the following error:
      17/04/26 11:11:26 ERROR LoadTable: main
      java.lang.NullPointerException
      at org.apache.hadoop.io.compress.bzip2.Bzip2Factory.isNativeBzip2Loaded(Bzip2Factory.java:54)
      at org.apache.hadoop.io.compress.bzip2.Bzip2Factory.getBzip2DecompressorType(Bzip2Factory.java:120)
      at org.apache.hadoop.io.compress.BZip2Codec.getDecompressorType(BZip2Codec.java:242)
      at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176)
      at org.apache.hadoop.io.compress.CompressionCodec$Util.createInputStreamWithCodecPool(CompressionCodec.java:157)
      at org.apache.hadoop.io.compress.BZip2Codec.createInputStream(BZip2Codec.java:157)
      at org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:139)
      at org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:104)
      at org.apache.carbondata.core.util.CarbonUtil.readHeader(CarbonUtil.java:1273)
      at org.apache.carbondata.spark.util.CommonUtil$.getCsvHeaderColumns(CommonUtil.scala:319)
      at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:474)
      ...
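
      The trace shows the failure occurring while CarbonUtil.readHeader opens a decompressing stream to read the CSV header. To rule out a corrupt input file, the header of a 'gz' file can be read independently of Hadoop's codec classes. The following is a minimal sketch, not CarbonData's code: it assumes a local copy of the file, uses the JDK's GZIPInputStream as a stand-in for the Hadoop codec, and the names GzHeaderCheck and readGzHeader are hypothetical. The JDK has no bzip2 decoder, so this covers only the 'gz' case.

```scala
import java.io.{BufferedReader, FileInputStream, InputStreamReader}
import java.nio.charset.StandardCharsets
import java.util.zip.GZIPInputStream

// Hypothetical helper for illustration; not part of CarbonData or Hadoop.
object GzHeaderCheck {
  // Returns the first line of a gzip-compressed text file on the local
  // filesystem, or None if the decompressed content is empty.
  // Throws an IOException if the file is not valid gzip data.
  def readGzHeader(path: String): Option[String] = {
    val raw = new FileInputStream(path)
    try {
      val gz = new GZIPInputStream(raw)
      try {
        val reader = new BufferedReader(
          new InputStreamReader(gz, StandardCharsets.UTF_8))
        // readLine returns null at end of stream; Option(null) is None.
        Option(reader.readLine())
      } finally gz.close()
    } finally raw.close()
  }
}
```

      For example, GzHeaderCheck.readGzHeader("/tmp/sh.csv.gz") (an assumed local path) returns the header line if the file decompresses cleanly. If it does, the input file itself is unlikely to be the cause of the error above.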

            People

            • Assignee: Ran Mingxuan (ranmx)
            • Reporter: Ran Mingxuan (ranmx)

                Time Tracking

                • Original Estimate: 24h
                • Remaining Estimate: 19h 20m
                • Time Spent: 4h 40m