CarbonData / CARBONDATA-2880

Huge data Load fails when sort_scope is no_sort


    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

      Description

      Huge data load fails when sort_scope is set to no_sort for a carbon table.

      CTAS also fails for a huge data load with the no_sort option.

      Steps:
      1. Create a carbon table with tblproperties ('sort_scope'='no_sort').
      2. Load huge data with the default memory configurations.

      Expected: The load succeeds.

      Observed: The load fails.
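      The steps above can be sketched as follows (table, column, and path names are illustrative, not taken from the actual reproduction):

      ```sql
      -- Hypothetical repro: a carbon table created with the no_sort scope.
      CREATE TABLE big_table (id INT, name STRING, value DOUBLE)
      STORED AS carbondata
      TBLPROPERTIES ('sort_scope'='no_sort');

      -- Loading a sufficiently large file with default memory settings
      -- is the step that fails.
      LOAD DATA INPATH 'hdfs://path/to/huge_data.csv' INTO TABLE big_table;

      -- CTAS with no_sort reportedly fails the same way for huge data.
      CREATE TABLE big_table_ctas
      STORED AS carbondata
      TBLPROPERTIES ('sort_scope'='no_sort')
      AS SELECT * FROM big_table;
      ```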

       java.io.IOException: org.apache.carbondata.core.scan.executor.exception.QueryExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.carbondata.core.memory.MemoryException: Not enough memory
      at org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.close(VectorizedCarbonRecordReader.java:173)
      at org.apache.carbondata.spark.rdd.QueryTaskCompletionListener.onTaskCompletion(QueryTaskCompletionListener.scala:41)
      at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
      at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
      at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
      at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
      at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
      at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
      at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
      at org.apache.spark.scheduler.Task.run(Task.scala:109)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      Caused by: org.apache.carbondata.core.scan.executor.exception.QueryExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.carbondata.core.memory.MemoryException: Not enough memory
      at org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.finish(AbstractQueryExecutor.java:661)
      at org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.close(VectorizedCarbonRecordReader.java:171)
      ... 14 more
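
      Since the MemoryException originates in CarbonData's unsafe (off-heap) memory manager, one workaround to try, assuming the default working-memory limit is what is being exhausted, is raising it in carbon.properties. The property name is from the CarbonData configuration reference; the value below is illustrative and this is not a confirmed fix for the bug:

      ```properties
      # Sketch only: increase the off-heap working memory (in MB) available
      # to CarbonData load/query processing. Value is illustrative.
      carbon.unsafe.working.memory.in.mb=2048
      ```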


            People

            • Assignee: Unassigned
            • Reporter: Surbhi Joshi (surbhijoshi)
            • Votes: 0
            • Watchers: 1
