CarbonData / CARBONDATA-1786

NullPointerException when loading data into a table, and NULL values returned when fetching data


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: data-load
    • Labels: None
    • Environment: spark 2.1

    Description

      A NullPointerException is thrown while loading data into the table, and NULL values are returned when the data is fetched afterwards.

      Steps to reproduce:
      1) Create table:
      CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");

      2) Load data:
      LOAD DATA INPATH 'hdfs://localhost:54310/Data/uniqdata/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'='/' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','TIMESTAMPFORMAT'='yyyy-mm-dd hh:mm:ss');

      3) Expected result: the data should be loaded into the table successfully.

      4) Actual result: the load fails with an error:
      Error: java.lang.NullPointerException (state=,code=0)

      logs:
      java.lang.NullPointerException
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
      at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
      at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.delete(AbstractDFSCarbonFile.java:142)
      at org.apache.carbondata.processing.util.DeleteLoadFolders.physicalFactAndMeasureMetadataDeletion(DeleteLoadFolders.java:79)
      at org.apache.carbondata.processing.util.DeleteLoadFolders.deleteLoadFoldersFromFileSystem(DeleteLoadFolders.java:134)
      at org.apache.carbondata.spark.rdd.DataManagementFunc$.deleteLoadsAndUpdateMetadata(DataManagementFunc.scala:188)
      at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:281)
      at org.apache.spark.sql.execution.command.management.LoadTableCommand.loadData(LoadTableCommand.scala:347)
      at org.apache.spark.sql.execution.command.management.LoadTableCommand.processData(LoadTableCommand.scala:183)
      at org.apache.spark.sql.execution.command.management.LoadTableCommand.run(LoadTableCommand.scala:64)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
      at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
      at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
      at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
      at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
      at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
      at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
      at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
      at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
      at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
      at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:220)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      17/11/21 12:12:21 ERROR SparkExecuteStatementOperation: Error running hive query:
      org.apache.hive.service.cli.HiveSQLException: java.lang.NullPointerException
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:258)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)

      5) Execute query:
      select * from uniqdata;

      6) Expected result: the query should return zero rows, since the load was not successful.

      7) Actual result: it returns the rows of the pre-existing table, with every column shown as NULL (a diagnostic sketch follows the output below):

      NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
      NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
      NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
      NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
      NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
      NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL

      --------------------------------------------------------------------------------------------------------------------------------------------------------------
      45,117 rows selected (2.222 seconds)
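
      As a follow-up diagnostic (not part of the original report), the state of the failed load can be checked with CarbonData's segment-management commands. The sketch below assumes the uniqdata table created above and that the SHOW SEGMENTS / CLEAN FILES data-management commands are available in the deployed CarbonData/Spark 2.1 setup:

      -- List every load (segment) of the table together with its status;
      -- a failed or partially written load should show a non-success status here.
      SHOW SEGMENTS FOR TABLE uniqdata;

      -- Remove the data left behind by loads that are marked as failed or
      -- partially deleted, so that later queries do not pick up stale files.
      CLEAN FILES FOR TABLE uniqdata;

      If the NULL rows still appear after the cleanup, the load status recorded in the table's metadata would be the next thing to inspect.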

      Attachments

        1. 2000_UniqData.csv (367 kB, Vandana Yadav)



            People

              Assignee: anubhavtarar (anubhav tarar)
              Reporter: Vandana7 (Vandana Yadav)
              Votes: 0
              Watchers: 1


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 20m