CarbonData / CARBONDATA-2277

Filter on default values is not working


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 1.4.0
    • Fix Version: 1.4.0
    • Component: core
    • Labels: None
    • Environment: Spark-2.2

    Description

      0: jdbc:hive2://localhost:10000> create table testFilter(data int) stored by 'carbondata';
      +---------+
      | Result  |
      +---------+
      +---------+
      No rows selected (1.231 seconds)
      0: jdbc:hive2://localhost:10000> insert into testFilter values(22);
      +---------+
      | Result  |
      +---------+
      +---------+
      No rows selected (3.726 seconds)
      0: jdbc:hive2://localhost:10000> alter table testFilter add columns(c1 int) TBLPROPERTIES('DEFAULT.VALUE.c1' = '25');
      +---------+
      | Result  |
      +---------+
      +---------+
      No rows selected (1.761 seconds)
      0: jdbc:hive2://localhost:10000> select * from testFilter;
      +-------+-----+
      | data  | c1  |
      +-------+-----+
      | 22    | 25  |
      +-------+-----+
      1 row selected (0.85 seconds)

      0: jdbc:hive2://localhost:10000> select * from testFilter where c1=25;
      Error: java.nio.BufferUnderflowException (state=,code=0)
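
      The same failure reproduces programmatically over JDBC. A minimal sketch, assuming the standard Hive JDBC driver (org.apache.hive:hive-jdbc) is on the classpath and that the Thrift server accepts anonymous connections; the class name and empty credentials are illustrative:

      import java.sql.Connection;
      import java.sql.DriverManager;
      import java.sql.Statement;

      public class FilterDefaultValueRepro {
          public static void main(String[] args) throws Exception {
              try (Connection conn = DriverManager.getConnection(
                          "jdbc:hive2://localhost:10000", "", "");
                   Statement stmt = conn.createStatement()) {
                  stmt.execute("create table testFilter(data int) stored by 'carbondata'");
                  stmt.execute("insert into testFilter values(22)");
                  stmt.execute("alter table testFilter add columns(c1 int) "
                          + "TBLPROPERTIES('DEFAULT.VALUE.c1' = '25')");
                  // Filtering on the added column's default value fails server-side;
                  // the BufferUnderflowException surfaces here as a SQLException.
                  stmt.executeQuery("select * from testFilter where c1=25");
              }
          }
      }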

      Stack Trace:

      18/03/25 13:34:08 INFO CarbonLateDecodeRule: pool-20-thread-8 skip CarbonOptimizer
      18/03/25 13:34:08 INFO CarbonLateDecodeRule: pool-20-thread-8 Skip CarbonOptimizer
      18/03/25 13:34:08 INFO TableInfo: pool-20-thread-8 Table block size not specified for default_testfilter. Therefore considering the default value 1024 MB
      18/03/25 13:34:08 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
      java.nio.BufferUnderflowException
      at java.nio.Buffer.nextGetIndex(Buffer.java:506)
      at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:412)
      at org.apache.carbondata.core.util.DataTypeUtil.getMeasureObjectFromDataType(DataTypeUtil.java:117)
      at org.apache.carbondata.core.scan.filter.executer.RestructureEvaluatorImpl.isMeasureDefaultValuePresentInFilterValues(RestructureEvaluatorImpl.java:113)
      at org.apache.carbondata.core.scan.filter.executer.RestructureExcludeFilterExecutorImpl.<init>(RestructureExcludeFilterExecutorImpl.java:43)
      at org.apache.carbondata.core.scan.filter.FilterUtil.getExcludeFilterExecuter(FilterUtil.java:281)
      at org.apache.carbondata.core.scan.filter.FilterUtil.createFilterExecuterTree(FilterUtil.java:147)
      at org.apache.carbondata.core.scan.filter.FilterUtil.createFilterExecuterTree(FilterUtil.java:158)
      at org.apache.carbondata.core.scan.filter.FilterUtil.getFilterExecuterTree(FilterUtil.java:1296)
      at org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMap.prune(BlockletDataMap.java:644)
      at org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMap.prune(BlockletDataMap.java:685)
      at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:74)
      at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getDataBlocksOfSegment(CarbonTableInputFormat.java:739)
      at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:666)
      at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:426)
      at org.apache.carbondata.spark.rdd.CarbonScanRDD.getPartitions(CarbonScanRDD.scala:96)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
      at scala.Option.getOrElse(Option.scala:121)
      at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
      at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
      at scala.Option.getOrElse(Option.scala:121)
      at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
      at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
      at scala.Option.getOrElse(Option.scala:121)
      at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
      at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
      at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:935)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
      at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
      at org.apache.spark.rdd.RDD.collect(RDD.scala:934)
      at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:275)
      at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2371)
      at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
      at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2765)
      at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2370)
      at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2375)
      at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2375)
      at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2778)
      at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2375)
      at org.apache.spark.sql.Dataset.collect(Dataset.scala:2351)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:235)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      18/03/25 13:34:08 ERROR SparkExecuteStatementOperation: Error running hive query:
      org.apache.hive.service.cli.HiveSQLException: java.nio.BufferUnderflowException
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:258)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)

       

            People

              Assignee: Jatin Demla
              Reporter: Jatin Demla
              Votes: 0
              Watchers: 1


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h