CarbonData / CARBONDATA-2446

Load data from parquet table always fails


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.3.1
    • Fix Version/s: None
    • Component/s: data-load

    Description

   I have a parquet table and a carbon table. The table has 1 billion rows.
      parquet table :
      ============
      CREATE TABLE mc_idx3(
      COL_1 integer,
      COL_2  integer,
      COL_3  string,
      COL_4  integer,
      COL_5  string,
      COL_6  string,
      COL_7   string,
      COL_8   string,
      COL_9   integer,
      COL_10 long,
      COL_11 string,
      COL_12 string,
      COL_13 string,
      COL_14 string,
      COL_15 integer,
      COL_16 string,
      COL_17 Timestamp )
      STORED AS PARQUET;
      ==============
      carbon table:
      ===============
      CREATE TABLE mc_idxok_cd1(
      COL_1 integer,
      COL_2  integer,
      COL_3  string,
      COL_4  integer,
      COL_5  string,
      COL_6  string,
      COL_7   string,
      COL_8   string,
      COL_9   integer,
      COL_10 long,
      COL_11 string,
      COL_12 string,
      COL_13 string,
      COL_14 string,
      COL_15 integer,
      COL_16 string,
      COL_17 Timestamp )
      STORED BY 'carbondata'
      TBLPROPERTIES (
      'SORT_COLUMNS'='COL_17,COL_1');
      =============
      When I run `insert into table mc_idxok_cd1 select * from mc_idx3`,
      it always fails.
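One way to narrow this down is to load a bounded slice of the source table first, to see whether the failure depends on the 1-billion-row volume or occurs on any insert. This is only a diagnostic sketch; the COL_1 range below is a hypothetical filter, not from the original report:

```sql
-- Sketch (assumption): insert a bounded subset of mc_idx3 to test
-- whether the copy failure is volume-dependent. The BETWEEN bounds
-- are illustrative; pick a range that matches your COL_1 values.
INSERT INTO TABLE mc_idxok_cd1
SELECT * FROM mc_idx3
WHERE COL_1 BETWEEN 0 AND 1000000;
```

If the small insert succeeds, the problem is more likely environmental (local disk or store capacity during the copy step) than a schema mismatch.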
      ERROR LOG:
      org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: There is an unexpected error: org.apache.carbondata.core.datastore.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
      at org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:123)
      at org.apache.carbondata.processing.loading.DataLoadExecutor.execute(DataLoadExecutor.java:51)
      at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:390)
      at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.internalCompute(NewCarbonDataLoadRDD.scala:353)
      at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
      at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
      at org.apache.spark.scheduler.Task.run(Task.scala:108)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: org.apache.carbondata.core.datastore.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
      at org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.processingComplete(DataWriterProcessorStepImpl.java:162)
      at org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.finish(DataWriterProcessorStepImpl.java:148)
      at org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:112)
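The trace ends in "Problem while copying file from local store to carbon store": the data writer stages carbon files in a local directory on the executor and then copies them to the final store path, so a common environmental cause is the staging directory running out of space. A minimal check, assuming the default local directory falls back to `/tmp` when `SPARK_LOCAL_DIRS` is unset (an assumption, not from the report):

```shell
# Diagnostic sketch: report free space on the local staging directory.
# LOCAL_DIR default is an assumption; point it at your actual
# spark.local.dir / SPARK_LOCAL_DIRS location.
LOCAL_DIR="${SPARK_LOCAL_DIRS:-/tmp}"
# df -Pk gives POSIX-portable output in KB; field 4 of row 2 is "Available".
avail_kb=$(df -Pk "$LOCAL_DIR" | awk 'NR==2 {print $4}')
echo "available space on $LOCAL_DIR: ${avail_kb} KB"
```

Running the same check against the carbon store path (e.g. via `hdfs dfs -df` when the store is on HDFS) covers the destination side of the copy.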



          People

            Assignee: Unassigned
            Reporter: oceaneast ocean
            Votes: 0
            Watchers: 2
