[CARBONDATA-2574] MV Datamap - MV is not working if there is aggregate function with group by and without any projections. - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: data-query
Labels:
Environment:
3 Node Opensource ANT cluster.

Description

User query is not fetching data from the MV datamap, if there is aggregate function with group by and without any projections.

Test queries:(In Spark-shell)

scala> carbon.sql("CREATE TABLE originTable (empno int, empname String, designation String, doj Timestamp,workgroupcategory int, workgroupcategoryname String, deptno int, deptname String,projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int,utilization int,salary int) STORED BY 'org.apache.carbondata.format'").show(200,false)

++

scala> carbon.sql("LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"','timestampformat'='dd-MM-yyyy')").show(200,false)

++

scala> carbon.sql("create datamap Mv_misscol using 'mv' as select sum(salary) from origintable group by empno").show(200,false)

++

scala> carbon.sql("rebuild datamap Mv_misscol").show(200,false)

++

scala> carbon.sql("explain select sum(salary) from origintable group by empno").show(200,false)

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

plan

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

== CarbonData Profiler ==

Table Scan on origintable

- total blocklets: 1

- filter: none

- pruned by Main DataMap

- skipped blocklets: 0

|

== Physical Plan ==

*HashAggregate(keys=empno#3850, functions=sum(cast(salary#3863 as bigint)))

+- Exchange hashpartitioning(empno#3850, 200)

+- *HashAggregate(keys=empno#3850, functions=partial_sum(cast(salary#3863 as bigint)))

+- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :prasanna, Table name :origintable, Schema :Some(StructType(StructField(empno,IntegerType,true), StructField(empname,StringType,true), StructField(designation,StringType,true), StructField(doj,TimestampType,true), StructField(workgroupcategory,IntegerType,true), StructField(workgroupcategoryname,StringType,true), StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), StructField(projectcode,IntegerType,true), StructField(projectjoindate,TimestampType,true), StructField(projectenddate,TimestampType,true), StructField(attendance,IntegerType,true), StructField(utilization,IntegerType,true), StructField(salary,IntegerType,true))) ] prasanna.origintableempno#3850,salary#3863|

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

scala> carbon.sql("select sum(salary) from origintable group by empno").show(200,false)

-----------

sum(salary)

-----------

7124

9054

5040

13547

11254

13245

9574

5040

11248

7245

-----------

scala> carbon.sql("show datamap on table origintable").show(200,false)

-----------------------------------------

DataMapName

ClassName

Associated Table

-----------------------------------------

Mv_misscol

mv

prasanna.Mv_misscol_table

-----------------------------------------

It is working fine, if there is a projection, also present with the aggregate function along with the group by. Please see the attached document for more details.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

data.csv
04/Jun/18 12:49
1 kB
Prasanna Ravichandran
MV_aggregate_without_projection_and_with_groupby.docx
04/Jun/18 12:50
45 kB
Prasanna Ravichandran

Issue Links

is a child of

CARBONDATA-2628 Fix MV defects

Open

MV Datamap - MV is not working if there is aggregate function with group by and without any projections.

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates