  1. CarbonData
  2. CARBONDATA-2136

Exception displays while loading data with BAD_RECORDS_ACTION = REDIRECT

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: data-load
    • Labels: None
    • Environment: spark 2.1

    Description

      Exception displays while loading data with BAD_RECORDS_ACTION = REDIRECT

      Steps to reproduce:

      1) Create the table:

      CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1',"TABLE_BLOCKSIZE"= "256 MB",'SORT_SCOPE'='NO_SORT','NO_INVERTED_INDEX'='CUST_ID,CUST_NAME,Double_COLUMN1,DECIMAL_COLUMN2');

      2) Load Data:

      LOAD DATA INPATH 'hdfs://localhost:54310/Data/uniqdata/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='REDIRECT','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
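To check whether the failure is specific to REDIRECT, the same load could be repeated with each documented BAD_RECORDS_ACTION value (FORCE, REDIRECT, IGNORE, FAIL). The following is only a sketch for generating those statements. The helper name build_load_statement is hypothetical, and each statement would presumably need to run against a freshly created table so that earlier loads do not affect the result.

```python
# Values documented for the BAD_RECORDS_ACTION load option.
ACTIONS = ("FORCE", "REDIRECT", "IGNORE", "FAIL")


def build_load_statement(csv_path, table, action, file_header):
    """Render a LOAD DATA statement shaped like the one in step 2.

    Hypothetical helper for bisecting the bug report; it does not talk
    to Spark or CarbonData, it only builds the SQL text.
    """
    if action not in ACTIONS:
        raise ValueError("unknown BAD_RECORDS_ACTION: %r" % (action,))
    options = [
        ("DELIMITER", ","),
        ("QUOTECHAR", '"'),
        ("BAD_RECORDS_ACTION", action),
        ("FILEHEADER", file_header),
    ]
    rendered = ", ".join("'%s'='%s'" % (key, value) for key, value in options)
    return "LOAD DATA INPATH '%s' INTO TABLE %s OPTIONS(%s);" % (
        csv_path, table, rendered)


if __name__ == "__main__":
    header = ("CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,"
              "BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,"
              "Double_COLUMN2,INTEGER_COLUMN1")
    path = "hdfs://localhost:54310/Data/uniqdata/2000_UniqData.csv"
    for action in ACTIONS:
        print(build_load_statement(path, "uniqdata", action, header))
```

If only the REDIRECT variant fails, that would point at the redirect handling rather than at the table definition or the input file.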

      Expected Result: data should be loaded successfully.

      Actual Result:

      Error: java.lang.Exception: DataLoad failure: There is an unexpected error: unable to generate the mdkey (state=,code=0)
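When triaging this, it may help to know whether the attached CSV contains any rows that would plausibly be treated as bad records at all. Below is a minimal sketch, assuming a local copy of the attachment and a deliberately narrow definition of "suspect": a row whose field count differs from FILEHEADER, or whose CUST_ID does not parse as an integer. CarbonData's own bad-record rules cover more cases than this, and find_suspect_rows is a hypothetical helper, not part of any CarbonData tooling.

```python
import csv

# Column names copied from the FILEHEADER option in the LOAD DATA statement.
FILEHEADER = ("CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,"
              "BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,"
              "Double_COLUMN2,INTEGER_COLUMN1").split(",")


def find_suspect_rows(lines, header=FILEHEADER):
    """Return (row_number, reason) pairs for rows that look malformed.

    Assumption: only the field count and the CUST_ID column are checked;
    empty CUST_ID values are let through.
    """
    suspects = []
    # Same delimiter and quote character as in the LOAD DATA options.
    reader = csv.reader(lines, delimiter=",", quotechar='"')
    for row_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            suspects.append(
                (row_no, "expected %d fields, got %d" % (len(header), len(row))))
            continue
        cust_id = row[0].strip()
        if cust_id:
            try:
                int(cust_id)
            except ValueError:
                suspects.append(
                    (row_no, "CUST_ID is not an integer: %r" % cust_id))
    return suspects


if __name__ == "__main__":
    # Hypothetical local path to a downloaded copy of the attachment.
    with open("2000_UniqData.csv", newline="") as handle:
        for row_no, reason in find_suspect_rows(handle):
            print(row_no, reason)
```

An empty result would suggest the exception is raised even when there is nothing to redirect, which would be worth stating in the report.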

       

      3) ThriftServer logs: 

      18/02/06 16:38:11 INFO SparkExecuteStatementOperation: Running query 'LOAD DATA INPATH 'hdfs://localhost:54310/Data/uniqdata/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='REDIRECT','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')' with 87eb4af5-e485-4a0b-bcae-6589f1252291
      18/02/06 16:38:11 INFO CarbonSparkSqlParser: Parsing command: LOAD DATA INPATH 'hdfs://localhost:54310/Data/uniqdata/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='REDIRECT','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
      18/02/06 16:38:11 INFO CarbonLateDecodeRule: pool-23-thread-41 skip CarbonOptimizer
      18/02/06 16:38:11 INFO CarbonLateDecodeRule: pool-23-thread-41 Skip CarbonOptimizer
      18/02/06 16:38:11 INFO HiveMetaStore: 42: get_table : db=bug tbl=uniqdata
      18/02/06 16:38:11 INFO audit: ugi=hduser ip=unknown-ip-addr cmd=get_table : db=bug tbl=uniqdata
      18/02/06 16:38:11 INFO HiveMetaStore: 42: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
      18/02/06 16:38:11 INFO ObjectStore: ObjectStore, initialize called
      18/02/06 16:38:11 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
      18/02/06 16:38:11 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
      18/02/06 16:38:11 INFO ObjectStore: Initialized ObjectStore
      18/02/06 16:38:11 INFO CatalystSqlParser: Parsing command: array<string>
      18/02/06 16:38:11 INFO CarbonLoadDataCommand: pool-23-thread-41 Deleting stale folders if present for table bug.uniqdata
      18/02/06 16:38:11 INFO CarbonLoadDataCommand: pool-23-thread-41 Initiating Direct Load for the Table : (bug.uniqdata)
      18/02/06 16:38:12 INFO HdfsFileLock: pool-23-thread-41 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/tablestatus.lock
      18/02/06 16:38:12 INFO HdfsFileLock: pool-23-thread-41 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/Segment_1.lock
      18/02/06 16:38:12 INFO DeleteLoadFolders: pool-23-thread-41 Info: Deleted the load 1
      18/02/06 16:38:12 INFO DeleteLoadFolders: pool-23-thread-41 Info: Segment lock on segment:1 is released
      18/02/06 16:38:12 INFO DataLoadingUtil$: pool-23-thread-41 Table status lock has been successfully acquired.
      18/02/06 16:38:12 INFO HdfsFileLock: pool-23-thread-41 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/tablestatus.lock
      18/02/06 16:38:12 INFO CarbonLockUtil: pool-23-thread-41 Table status lock has been successfully released
      18/02/06 16:38:12 WARN DeleteLoadFolders: pool-23-thread-41 Files are not found in segment hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/Fact/Part0/Segment_0 it seems, files are already being deleted
      18/02/06 16:38:12 WARN DeleteLoadFolders: pool-23-thread-41 Files are not found in segment hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/Fact/Part0/Segment_1 it seems, files are already being deleted
      18/02/06 16:38:12 INFO HdfsFileLock: pool-23-thread-41 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/tablestatus.lock
      18/02/06 16:38:12 INFO CarbonLoaderUtil: pool-23-thread-41 Acquired lock for tablebug.uniqdata for table status updation
      18/02/06 16:38:12 INFO HdfsFileLock: pool-23-thread-41 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/tablestatus.lock
      18/02/06 16:38:12 INFO CarbonLoaderUtil: pool-23-thread-41 Table unlocked successfully after table status updationbug.uniqdata
      18/02/06 16:38:12 INFO GlobalDictionaryUtil$: pool-23-thread-41 Generate global dictionary from source data files!
      18/02/06 16:38:12 INFO MemoryStore: Block broadcast_36 stored as values in memory (estimated size 293.6 KB, free 2.5 GB)
      18/02/06 16:38:12 INFO MemoryStore: Block broadcast_36_piece0 stored as bytes in memory (estimated size 24.7 KB, free 2.5 GB)
      18/02/06 16:38:12 INFO BlockManagerInfo: Added broadcast_36_piece0 in memory on 192.168.2.160:44339 (size: 24.7 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO SparkContext: Created broadcast 36 from NewHadoopRDD at GlobalDictionaryUtil.scala:377
      18/02/06 16:38:12 INFO SparkContext: Starting job: collect at GlobalDictionaryUtil.scala:755
      18/02/06 16:38:12 INFO FileInputFormat: Total input paths to process : 1
      18/02/06 16:38:12 INFO DAGScheduler: Registering RDD 98 (RDD at CarbonRDD.scala:33)
      18/02/06 16:38:12 INFO DAGScheduler: Got job 22 (collect at GlobalDictionaryUtil.scala:755) with 10 output partitions
      18/02/06 16:38:12 INFO DAGScheduler: Final stage: ResultStage 30 (collect at GlobalDictionaryUtil.scala:755)
      18/02/06 16:38:12 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 29)
      18/02/06 16:38:12 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 29)
      18/02/06 16:38:12 INFO DAGScheduler: Submitting ShuffleMapStage 29 (CarbonBlockDistinctValuesCombineRDD[98] at RDD at CarbonRDD.scala:33), which has no missing parents
      18/02/06 16:38:12 INFO MemoryStore: Block broadcast_37 stored as values in memory (estimated size 11.3 KB, free 2.5 GB)
      18/02/06 16:38:12 INFO MemoryStore: Block broadcast_37_piece0 stored as bytes in memory (estimated size 5.8 KB, free 2.5 GB)
      18/02/06 16:38:12 INFO BlockManagerInfo: Added broadcast_37_piece0 in memory on 192.168.2.160:44339 (size: 5.8 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO SparkContext: Created broadcast 37 from broadcast at DAGScheduler.scala:996
      18/02/06 16:38:12 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 29 (CarbonBlockDistinctValuesCombineRDD[98] at RDD at CarbonRDD.scala:33)
      18/02/06 16:38:12 INFO TaskSchedulerImpl: Adding task set 29.0 with 1 tasks
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 0.0 in stage 29.0 (TID 92, localhost, executor driver, partition 0, ANY, 6597 bytes)
      18/02/06 16:38:12 INFO Executor: Running task 0.0 in stage 29.0 (TID 92)
      18/02/06 16:38:12 INFO NewHadoopRDD: Input split: hdfs://localhost:54310/Data/uniqdata/2000_UniqData.csv:0+376223
      18/02/06 16:38:12 INFO Executor: Finished task 0.0 in stage 29.0 (TID 92). 1343 bytes result sent to driver
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 0.0 in stage 29.0 (TID 92) in 108 ms on localhost (executor driver) (1/1)
      18/02/06 16:38:12 INFO TaskSchedulerImpl: Removed TaskSet 29.0, whose tasks have all completed, from pool
      18/02/06 16:38:12 INFO DAGScheduler: ShuffleMapStage 29 (RDD at CarbonRDD.scala:33) finished in 0.107 s
      18/02/06 16:38:12 INFO DAGScheduler: looking for newly runnable stages
      18/02/06 16:38:12 INFO DAGScheduler: running: Set()
      18/02/06 16:38:12 INFO DAGScheduler: waiting: Set(ResultStage 30)
      18/02/06 16:38:12 INFO DAGScheduler: failed: Set()
      18/02/06 16:38:12 INFO DAGScheduler: Submitting ResultStage 30 (CarbonGlobalDictionaryGenerateRDD[100] at RDD at CarbonRDD.scala:33), which has no missing parents
      18/02/06 16:38:12 INFO MemoryStore: Block broadcast_38 stored as values in memory (estimated size 10.7 KB, free 2.5 GB)
      18/02/06 16:38:12 INFO MemoryStore: Block broadcast_38_piece0 stored as bytes in memory (estimated size 5.4 KB, free 2.5 GB)
      18/02/06 16:38:12 INFO BlockManagerInfo: Added broadcast_38_piece0 in memory on 192.168.2.160:44339 (size: 5.4 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO SparkContext: Created broadcast 38 from broadcast at DAGScheduler.scala:996
      18/02/06 16:38:12 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 30 (CarbonGlobalDictionaryGenerateRDD[100] at RDD at CarbonRDD.scala:33)
      18/02/06 16:38:12 INFO TaskSchedulerImpl: Adding task set 30.0 with 10 tasks
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 0.0 in stage 30.0 (TID 93, localhost, executor driver, partition 0, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 1.0 in stage 30.0 (TID 94, localhost, executor driver, partition 1, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 2.0 in stage 30.0 (TID 95, localhost, executor driver, partition 2, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 3.0 in stage 30.0 (TID 96, localhost, executor driver, partition 3, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO Executor: Running task 0.0 in stage 30.0 (TID 93)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-26 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/8c24301c-143c-47f9-8bc7-4bf122787b70.lock
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO Executor: Running task 1.0 in stage 30.0 (TID 94)
      18/02/06 16:38:12 INFO Executor: Running task 2.0 in stage 30.0 (TID 95)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-28 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/fa04704a-f022-4d7e-b8e9-078357d50e84.lock
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-27 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/c65a6129-0e3d-4686-8176-b82c200e9e81.lock
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO Executor: Running task 3.0 in stage 30.0 (TID 96)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-29 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/e1872e3d-3c67-4cc5-9356-2e6c614725db.lock
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for cust_id
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-26
      columnName: cust_id
      columnId: 8c24301c-143c-47f9-8bc7-4bf122787b70
      new distinct values count: 0
      combine lists: 2
      create dictionary cache: 1
      sort list, distinct and write: 1
      write sort info: 0
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for cust_name
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for active_emui_version
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for bigint_column1
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-29
      columnName: bigint_column1
      columnId: e1872e3d-3c67-4cc5-9356-2e6c614725db
      new distinct values count: 0
      combine lists: 4
      create dictionary cache: 2
      sort list, distinct and write: 1
      write sort info: 0
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-28
      columnName: active_emui_version
      columnId: fa04704a-f022-4d7e-b8e9-078357d50e84
      new distinct values count: 0
      combine lists: 3
      create dictionary cache: 4
      sort list, distinct and write: 1
      write sort info: 0
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-27
      columnName: cust_name
      columnId: c65a6129-0e3d-4686-8176-b82c200e9e81
      new distinct values count: 0
      combine lists: 2
      create dictionary cache: 6
      sort list, distinct and write: 1
      write sort info: 0
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-26 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/8c24301c-143c-47f9-8bc7-4bf122787b70.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary cust_id Unlocked Successfully.
      18/02/06 16:38:12 INFO Executor: Finished task 0.0 in stage 30.0 (TID 93). 1728 bytes result sent to driver
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 4.0 in stage 30.0 (TID 97, localhost, executor driver, partition 4, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 0.0 in stage 30.0 (TID 93) in 123 ms on localhost (executor driver) (1/10)
      18/02/06 16:38:12 INFO Executor: Running task 4.0 in stage 30.0 (TID 97)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-26 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/4108c641-a578-4415-b3ed-6ccb1a587d9b.lock
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-27 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/c65a6129-0e3d-4686-8176-b82c200e9e81.lock
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-28 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/fa04704a-f022-4d7e-b8e9-078357d50e84.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary cust_name Unlocked Successfully.
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary active_emui_version Unlocked Successfully.
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-29 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/e1872e3d-3c67-4cc5-9356-2e6c614725db.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary bigint_column1 Unlocked Successfully.
      18/02/06 16:38:12 INFO Executor: Finished task 2.0 in stage 30.0 (TID 95). 1728 bytes result sent to driver
      18/02/06 16:38:12 INFO Executor: Finished task 1.0 in stage 30.0 (TID 94). 1728 bytes result sent to driver
      18/02/06 16:38:12 INFO Executor: Finished task 3.0 in stage 30.0 (TID 96). 1728 bytes result sent to driver
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 5.0 in stage 30.0 (TID 98, localhost, executor driver, partition 5, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO Executor: Running task 5.0 in stage 30.0 (TID 98)
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 6.0 in stage 30.0 (TID 99, localhost, executor driver, partition 6, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO Executor: Running task 6.0 in stage 30.0 (TID 99)
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 7.0 in stage 30.0 (TID 100, localhost, executor driver, partition 7, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-29 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/34778415-43d0-4d19-917b-52d079c9284f.lock
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-27 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/08a717d5-4a69-4fde-a14f-5f3bd762c149.lock
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 2.0 in stage 30.0 (TID 95) in 135 ms on localhost (executor driver) (2/10)
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 1.0 in stage 30.0 (TID 94) in 136 ms on localhost (executor driver) (3/10)
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 3.0 in stage 30.0 (TID 96) in 136 ms on localhost (executor driver) (4/10)
      18/02/06 16:38:12 INFO Executor: Running task 7.0 in stage 30.0 (TID 100)
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-28 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/67332d44-9548-473d-b171-ea410122f773.lock
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for bigint_column2
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-26
      columnName: bigint_column2
      columnId: 4108c641-a578-4415-b3ed-6ccb1a587d9b
      new distinct values count: 0
      combine lists: 1
      create dictionary cache: 1
      sort list, distinct and write: 2
      write sort info: 0
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for decimal_column1
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for double_column1
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for decimal_column2
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-27
      columnName: decimal_column2
      columnId: 08a717d5-4a69-4fde-a14f-5f3bd762c149
      new distinct values count: 0
      combine lists: 3
      create dictionary cache: 2
      sort list, distinct and write: 6
      write sort info: 0
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-29
      columnName: decimal_column1
      columnId: 34778415-43d0-4d19-917b-52d079c9284f
      new distinct values count: 0
      combine lists: 3
      create dictionary cache: 2
      sort list, distinct and write: 6
      write sort info: 0
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-28
      columnName: double_column1
      columnId: 67332d44-9548-473d-b171-ea410122f773
      new distinct values count: 0
      combine lists: 1
      create dictionary cache: 8
      sort list, distinct and write: 0
      write sort info: 0
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-26 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/4108c641-a578-4415-b3ed-6ccb1a587d9b.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary bigint_column2 Unlocked Successfully.
      18/02/06 16:38:12 INFO Executor: Finished task 4.0 in stage 30.0 (TID 97). 1801 bytes result sent to driver
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 8.0 in stage 30.0 (TID 101, localhost, executor driver, partition 8, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 4.0 in stage 30.0 (TID 97) in 118 ms on localhost (executor driver) (5/10)
      18/02/06 16:38:12 INFO Executor: Running task 8.0 in stage 30.0 (TID 101)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-26 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/8b0db910-ca20-4a95-9f9c-fe94fa460b27.lock
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO BlockManagerInfo: Removed broadcast_33_piece0 on 192.168.2.160:44339 in memory (size: 5.8 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO BlockManagerInfo: Removed broadcast_34_piece0 on 192.168.2.160:44339 in memory (size: 5.4 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO BlockManagerInfo: Removed broadcast_35_piece0 on 192.168.2.160:44339 in memory (size: 30.7 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO BlockManagerInfo: Removed broadcast_37_piece0 on 192.168.2.160:44339 in memory (size: 5.8 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-28 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/67332d44-9548-473d-b171-ea410122f773.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary double_column1 Unlocked Successfully.
      18/02/06 16:38:12 INFO Executor: Finished task 7.0 in stage 30.0 (TID 100). 1801 bytes result sent to driver
      18/02/06 16:38:12 INFO BlockManagerInfo: Removed broadcast_31_piece0 on 192.168.2.160:44339 in memory (size: 30.6 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO TaskSetManager: Starting task 9.0 in stage 30.0 (TID 102, localhost, executor driver, partition 9, ANY, 6314 bytes)
      18/02/06 16:38:12 INFO Executor: Running task 9.0 in stage 30.0 (TID 102)
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 7.0 in stage 30.0 (TID 100) in 123 ms on localhost (executor driver) (6/10)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-28 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/a5d84076-9ba2-4c63-88cc-6881452eccfb.lock
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
      18/02/06 16:38:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      18/02/06 16:38:12 INFO BlockManagerInfo: Removed broadcast_32_piece0 on 192.168.2.160:44339 in memory (size: 24.7 KB, free: 2.5 GB)
      18/02/06 16:38:12 INFO ContextCleaner: Cleaned shuffle 6
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-27 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/08a717d5-4a69-4fde-a14f-5f3bd762c149.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary decimal_column2 Unlocked Successfully.
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-29 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/34778415-43d0-4d19-917b-52d079c9284f.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary decimal_column1 Unlocked Successfully.
      18/02/06 16:38:12 INFO Executor: Finished task 5.0 in stage 30.0 (TID 98). 1801 bytes result sent to driver
      18/02/06 16:38:12 INFO Executor: Finished task 6.0 in stage 30.0 (TID 99). 1801 bytes result sent to driver
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 5.0 in stage 30.0 (TID 98) in 133 ms on localhost (executor driver) (7/10)
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 6.0 in stage 30.0 (TID 99) in 133 ms on localhost (executor driver) (8/10)
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for double_column2
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-26
      columnName: double_column2
      columnId: 8b0db910-ca20-4a95-9f9c-fe94fa460b27
      new distinct values count: 0
      combine lists: 0
      create dictionary cache: 2
      sort list, distinct and write: 0
      write sort info: 0
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Successfully able to get the dictionary lock for integer_column1
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Executor task launch worker-28
      columnName: integer_column1
      columnId: a5d84076-9ba2-4c63-88cc-6881452eccfb
      new distinct values count: 0
      combine lists: 2
      create dictionary cache: 2
      sort list, distinct and write: 2
      write sort info: 0
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-26 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/8b0db910-ca20-4a95-9f9c-fe94fa460b27.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary double_column2 Unlocked Successfully.
      18/02/06 16:38:12 INFO Executor: Finished task 8.0 in stage 30.0 (TID 101). 1728 bytes result sent to driver
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 8.0 in stage 30.0 (TID 101) in 204 ms on localhost (executor driver) (9/10)
      18/02/06 16:38:12 INFO HdfsFileLock: Executor task launch worker-28 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/a5d84076-9ba2-4c63-88cc-6881452eccfb.lock
      18/02/06 16:38:12 INFO CarbonGlobalDictionaryGenerateRDD: Dictionary integer_column1 Unlocked Successfully.
      18/02/06 16:38:12 INFO Executor: Finished task 9.0 in stage 30.0 (TID 102). 1728 bytes result sent to driver
      18/02/06 16:38:12 INFO TaskSetManager: Finished task 9.0 in stage 30.0 (TID 102) in 200 ms on localhost (executor driver) (10/10)
      18/02/06 16:38:12 INFO TaskSchedulerImpl: Removed TaskSet 30.0, whose tasks have all completed, from pool
      18/02/06 16:38:12 INFO DAGScheduler: ResultStage 30 (collect at GlobalDictionaryUtil.scala:755) finished in 0.457 s
      18/02/06 16:38:12 INFO DAGScheduler: Job 22 finished: collect at GlobalDictionaryUtil.scala:755, took 0.601328 s
      18/02/06 16:38:12 INFO GlobalDictionaryUtil$: pool-23-thread-41 generate global dictionary successfully
      18/02/06 16:38:12 AUDIT CarbonDataRDDFactory$: [knoldus][hduser][Thread-1336]Data load request has been received for table bug.uniqdata
      18/02/06 16:38:13 WARN CarbonDataProcessorUtil: pool-23-thread-41 sort scope is set to NO_SORT
      18/02/06 16:38:13 INFO HdfsFileLock: pool-23-thread-41 HDFS lock path:hdfs://localhost:54310/opt/prestocarbonStore/bug/uniqdata/Segment_2.lock
      18/02/06 16:38:13 INFO CommonUtil$: pool-23-thread-41 [Block Distribution]
      18/02/06 16:38:13 INFO CommonUtil$: pool-23-thread-41 totalInputSpaceConsumed: 376223 , defaultParallelism: 4
      18/02/06 16:38:13 INFO CommonUtil$: pool-23-thread-41 mapreduce.input.fileinputformat.split.maxsize: 16777216
      18/02/06 16:38:13 INFO FileInputFormat: Total input paths to process : 1
      18/02/06 16:38:13 INFO DistributionUtil$: pool-23-thread-41 Executors configured : 1
      18/02/06 16:38:13 INFO DistributionUtil$: pool-23-thread-41 Total Time taken to ensure the required executors : 1
      18/02/06 16:38:13 INFO DistributionUtil$: pool-23-thread-41 Time elapsed to allocate the required executors: 0
      18/02/06 16:38:13 INFO CarbonDataRDDFactory$: pool-23-thread-41 Total Time taken in block allocation: 1
      18/02/06 16:38:13 INFO CarbonDataRDDFactory$: pool-23-thread-41 Total no of blocks: 1, No.of Nodes: 1
      18/02/06 16:38:13 INFO CarbonDataRDDFactory$: pool-23-thread-41 #Node: knoldus no.of.blocks: 1

      18/02/06 16:38:13 INFO SparkContext: Starting job: collect at CarbonDataRDDFactory.scala:1092
      18/02/06 16:38:13 INFO DAGScheduler: Got job 23 (collect at CarbonDataRDDFactory.scala:1092) with 1 output partitions
      18/02/06 16:38:13 INFO DAGScheduler: Final stage: ResultStage 31 (collect at CarbonDataRDDFactory.scala:1092)
      18/02/06 16:38:13 INFO DAGScheduler: Parents of final stage: List()
      18/02/06 16:38:13 INFO DAGScheduler: Missing parents: List()
      18/02/06 16:38:13 INFO DAGScheduler: Submitting ResultStage 31 (NewCarbonDataLoadRDD[101] at RDD at CarbonRDD.scala:33), which has no missing parents
      18/02/06 16:38:13 INFO NewCarbonDataLoadRDD: Preferred Location for split : knoldus
      18/02/06 16:38:13 INFO MemoryStore: Block broadcast_39 stored as values in memory (estimated size 38.3 KB, free 2.5 GB)
      18/02/06 16:38:13 INFO MemoryStore: Block broadcast_39_piece0 stored as bytes in memory (estimated size 30.7 KB, free 2.5 GB)
      18/02/06 16:38:13 INFO BlockManagerInfo: Added broadcast_39_piece0 in memory on 192.168.2.160:44339 (size: 30.7 KB, free: 2.5 GB)
      18/02/06 16:38:13 INFO SparkContext: Created broadcast 39 from broadcast at DAGScheduler.scala:996
      18/02/06 16:38:13 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 31 (NewCarbonDataLoadRDD[101] at RDD at CarbonRDD.scala:33)
      18/02/06 16:38:13 INFO TaskSchedulerImpl: Adding task set 31.0 with 1 tasks
      18/02/06 16:38:13 INFO TaskSetManager: Starting task 0.0 in stage 31.0 (TID 103, localhost, executor driver, partition 0, ANY, 6880 bytes)
      18/02/06 16:38:13 INFO Executor: Running task 0.0 in stage 31.0 (TID 103)
      18/02/06 16:38:13 INFO NewCarbonDataLoadRDD: Input split: knoldus
      18/02/06 16:38:13 INFO NewCarbonDataLoadRDD: The Block Count in this node :1
      18/02/06 16:38:13 INFO SparkPartitionLoader: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] Temp location for loading data: /tmp/carbon82072469095359_0
      18/02/06 16:38:13 WARN CarbonDataProcessorUtil: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] sort scope is set to NO_SORT
      18/02/06 16:38:13 INFO AbstractDataLoadProcessorStep: Thread-1137 Rows processed in step Data Writer : 0
      18/02/06 16:38:13 INFO AbstractDataLoadProcessorStep: Thread-1138 Rows processed in step Data Converter : 0
      18/02/06 16:38:13 INFO AbstractDataLoadProcessorStep: Thread-1139 Rows processed in step Input Processor : 0
      18/02/06 16:38:13 INFO DataLoadExecutor: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] Data Loading is started for table uniqdata
      18/02/06 16:38:13 WARN CarbonDataProcessorUtil: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] sort scope is set to NO_SORT
      18/02/06 16:38:13 INFO CarbonFactDataHandlerColumnar: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] Initializing writer executors
      18/02/06 16:38:13 INFO CarbonFactDataHandlerColumnar: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] Columns considered as NoInverted Index are cust_id,cust_name,decimal_column2,double_column1,
      18/02/06 16:38:13 INFO CarbonFactDataHandlerColumnar: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] Number of rows per column blocklet 32000
      18/02/06 16:38:13 INFO AbstractFactDataWriter: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] Total file size: 268435456 and dataBlock Size: 241591911
      18/02/06 16:38:13 INFO AbstractFactDataWriter: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] Randomly choose factdata temp location: /tmp/carbon82072469095359_0/Fact/Part0/Segment_2/0
      18/02/06 16:38:13 ERROR CarbonRowDataWriterProcessorStepImpl: [Executor task launch worker-28][partitionID:uniqdata;queryID:82072464882259] Failed for table: uniqdata in DataWriterProcessorStepImpl
      org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: unable to generate the mdkey
      at org.apache.carbondata.processing.loading.steps.CarbonRowDataWriterProcessorStepImpl.processBatch(CarbonRowDataWriterProcessorStepImpl.java:281)
      at org.apache.carbondata.processing.loading.steps.CarbonRowDataWriterProcessorStepImpl.doExecute(CarbonRowDataWriterProcessorStepImpl.java:167)
      at org.apache.carbondata.processing.loading.steps.CarbonRowDataWriterProcessorStepImpl.execute(CarbonRowDataWriterProcessorStepImpl.java:122)
      at org.apache.carbondata.processing.loading.DataLoadExecutor.execute(DataLoadExecutor.java:51)
      at org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD$$anon$1.<init>(NewCarbonDataLoadRDD.scala:246)
      at org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD.internalCompute(NewCarbonDataLoadRDD.scala:221)
      at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
      at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
      at org.apache.spark.scheduler.Task.run(Task.scala:99)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
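Most of the log above is INFO output from dictionary generation; the lines relevant to the failure are the WARN and ERROR entries and the stack trace that follows. A small sketch for pulling those out of a saved ThriftServer log is shown below. It assumes every entry starts with the "yy/MM/dd HH:mm:ss LEVEL Logger:" prefix seen above and that unprefixed lines belong to the preceding entry; extract_problems is a hypothetical helper name.

```python
import re

# Matches the prefix seen in the ThriftServer output above, e.g.
# "18/02/06 16:38:13 ERROR CarbonRowDataWriterProcessorStepImpl: ..."
LOG_LINE = re.compile(
    r"^\s*(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>[^:\s]+): (?P<msg>.*)$"
)


def extract_problems(lines, levels=("WARN", "ERROR")):
    """Collect entries at the given levels, with their continuation lines.

    Continuation lines (for example stack frames) are attached to the
    most recent matching entry under the "details" key.
    """
    problems = []
    current = None
    for raw in lines:
        match = LOG_LINE.match(raw)
        if match:
            current = None  # a new entry ends any previous continuation
            if match.group("level") in levels:
                current = {
                    "timestamp": match.group("ts"),
                    "level": match.group("level"),
                    "logger": match.group("logger"),
                    "message": match.group("msg"),
                    "details": [],
                }
                problems.append(current)
        elif current is not None and raw.strip():
            current["details"].append(raw.strip())
    return problems


if __name__ == "__main__":
    # Hypothetical file name for a saved copy of the log.
    with open("thriftserver.log") as handle:
        for entry in extract_problems(handle):
            print(entry["timestamp"], entry["level"], entry["logger"])
            for line in entry["details"]:
                print("    " + line)
```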

       

      Attachments

        1. 2000_UniqData.csv
          367 kB
          Vandana Yadav

            People

              Assignee: Jatin Demla
              Reporter: Vandana Yadav (Vandana7)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 8h 10m