Uploaded image for project: 'CarbonData'
  1. CarbonData
  2. CARBONDATA-3845

Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 2.0.0
    • 2.1.0
    • data-query
    • None
    • Spark 2.3.2

    Description

      Steps and Issue-

      0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='', 'BUCKET_COLUMNS'='');
      Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)

       Same issue present if bucket_number is empty.

      0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='', 'BUCKET_COLUMNS'='test');
      Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)

      Log-

      2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | Error executing query, currentState RUNNING,  | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | Error executing query, currentState RUNNING,  | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)java.lang.NumberFormatException: For input string: "" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:592) at java.lang.Integer.parseInt(Integer.java:615) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) at scala.collection.immutable.StringOps.toInt(StringOps.scala:29) at org.apache.carbondata.spark.CarbonOption.bucketNumber$lzycompute(CarbonOption.scala:61) at org.apache.carbondata.spark.CarbonOption.bucketNumber(CarbonOption.scala:61) at org.apache.spark.sql.parser.CarbonSpark2SqlParser.getBucketFields(CarbonSpark2SqlParser.scala:765) at org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:382) at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235) at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:382) at org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69) at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) at org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120) at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)2020-06-05 01:52:32,635 | ERROR | [HiveServer2-Background-Pool: Thread-102] | Error running hive query:  | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179)org.apache.hive.service.cli.HiveSQLException: java.lang.NumberFormatException: For input string: "" at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:269) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)2020-06-05 01:52:32,800 | INFO  | [HiveServer2-Handler-Pool: Thread-97] | Asked to cancel job group 72fed8a2-70af-4af1-8d17-b2f1e562138c | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)

      Attachments

        Activity

          People

            Unassigned Unassigned
            chetdb Chetan Bhat
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 4h
                4h