Description
>>CREATE TABLE table1 (
    deviceInformationId int,
    channelsId string,
    props map<key:int,value:string>)
  STORED BY 'org.apache.carbondata.format'

>>insert into table1 select 10, 'channel1', map(1,'user1', 101,'root')
Format of the data to be read from CSV, with '$' as the level-1 delimiter and map keys terminated by '#':
>>load data local inpath '/tmp/data.csv' into table1 options (
    'COMPLEX_DELIMITER_LEVEL_1'='$',
    'COMPLEX_DELIMITER_LEVEL_2'=':',
    'COMPLEX_DELIMITER_FOR_KEY'='#')

CSV contents:
20,channel2,2#user2$100#usercommon
30,channel3,3#user3$100#usercommon
40,channel4,4#user3$100#usercommon

>>select channelId, props[100] from table1 where deviceInformationId > 10;
20, usercommon
30, usercommon
40, usercommon

>>select channelId, props from table1 where props[2] = 'user2';
20, {2,'user2', 100,'usercommon'}
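To make the delimiter semantics concrete, here is a minimal Python sketch (not CarbonData code) of how one map field from the CSV above is split into key/value pairs: the level-1 delimiter separates pairs, and the key delimiter separates each key from its value. The function name `parse_map_field` is illustrative, not part of any CarbonData API.

```python
def parse_map_field(field, pair_delim="$", key_delim="#"):
    """Split a complex CSV field into a dict using the load-option
    delimiters: COMPLEX_DELIMITER_LEVEL_1 ('$') separates key-value
    pairs, COMPLEX_DELIMITER_FOR_KEY ('#') separates key from value."""
    result = {}
    for pair in field.split(pair_delim):
        key, value = pair.split(key_delim, 1)
        result[int(key)] = value  # map<int,string> per the table schema
    return result

# Third column of the row "20,channel2,2#user2$100#usercommon":
print(parse_map_field("2#user2$100#usercommon"))
# -> {2: 'user2', 100: 'usercommon'}
```

This mirrors why `props[100]` returns `usercommon` for every loaded row in the example query.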
The following cases need to be handled:
| Sub feature | Pending activity | Remarks |
|---|---|---|
| Basic Maptype support | Develop | Create table DDL, load map data from CSV, `select * from maptable` |
| Maptype lookup in projection and filter | Develop | Projections and filters need execution at the Spark layer |
| NULL values, UDFs, Describe support | Develop | |
| Compaction support | Test + fix | As compaction works at the byte level, no code changes are required; test cases need to be added |
| Insert into table | Develop | Source table data containing map data needs conversion from the Spark data type to string, as Carbon takes strings as input rows |
| Support DDL for Map fields in Dictionary Include and Dictionary Exclude | Develop | CarbonDictionaryDecoder also needs to handle this |
| Support multilevel Map | Develop | Currently the DDL is validated to allow only 2 levels; remove this restriction |
| Support Map value as a measure | Develop | Currently array and struct support only dimensions, which needs to change |
| Support Alter table to add and remove a Map column | Develop | Implement the DDL; requires default-value handling |
| Push down Map lookup projections to Carbon | Develop | An optimization for when the map contains many values |
| Push down Map lookup filters to Carbon | Develop | An optimization for when the map contains many values |
| Update Map values | Develop | Update a map value |
Design suggestion:
Map can be internally stored as Array<Struct<key,value>>, so only a conversion to the Map data type is required when handing data to Spark. The schema will have a new column of map type, similar to Array.
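A minimal Python sketch of the suggested representation (illustrative only, not CarbonData internals): the map is held as a list of (key, value) structs, and the only work at read time is converting that array into a map before handing rows to Spark.

```python
def array_of_structs_to_map(structs):
    """Convert the internal Array<Struct<key,value>> representation
    into a map, as suggested for the Spark-facing conversion."""
    return {key: value for key, value in structs}

# Internal storage for the props column of row 20 in the example:
internal = [(2, "user2"), (100, "usercommon")]
print(array_of_structs_to_map(internal))
# -> {2: 'user2', 100: 'usercommon'}
```

Storing the map as an array of structs lets the existing complex-type write path (used for Array and Struct) be reused, with the Map-specific conversion confined to the query side.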
Attachments
Issue Links
- duplicates CARBONDATA-737 Add Map datatype support as Hive (Closed)
Sub-tasks:

| # | Sub-task | Status | Assignee |
|---|---|---|---|
| 1 | Basic Maptype support | Open | Kunal Kapoor |
| 2 | Load DDL support for Map DataType | Resolved | Unassigned |
| 3 | Create Table DDL support for Map DataType | Resolved | Unassigned |
| 4 | SDK support for Map DataType | Resolved | Manish Gupta |
| 5 | Add support for complex map type through spark carbon file format API | Resolved | Manish Gupta |
| 6 | Create DDL Support for Map Type | Resolved | MANISH NALLA |