[CARBONDATA-888] Dictionary include / exclude option in dataframe writer - ASF JIRA


Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 1.2.0
Fix Version/s: 1.2.0
Component/s: spark-integration
Labels: None
Environment: HDP 2.5, Spark 1.6

Description

When creating a CarbonData table from a dataframe, it is currently not possible to specify which columns should be included in or excluded from the dictionary. An option is required to specify this, as below:

df.write.format("carbondata")
.option("tableName", "test")
.option("compress","true")
.option("dictionary_include","intcol1,intcol2")
.option("dictionary_exclude","stringcol1,stringcol2")
.mode(SaveMode.Overwrite)
.save()

We have a lot of integer columns that are dimensions. dataframe.save is used to create tables quickly instead of writing DDLs, and it would be nice to have this feature when running POCs.
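For comparison, the same intent can be expressed in DDL through table properties. The sketch below shows one way the two requested writer options could be turned into such a clause. It assumes the options map onto the DICTIONARY_INCLUDE and DICTIONARY_EXCLUDE table properties of CREATE TABLE; the object DictionaryOptions and its methods are hypothetical names used for illustration and are not part of the CarbonData API or of the linked pull requests.

```scala
// Sketch only: DictionaryOptions, parseColumns and toTblProperties are
// hypothetical helpers, not part of CarbonData.
object DictionaryOptions {

  // Split a comma-separated option value into trimmed, non-empty column names.
  def parseColumns(value: String): Seq[String] =
    value.split(",").map(_.trim).filter(_.nonEmpty).toSeq

  // Render the two writer options as a TBLPROPERTIES clause.
  // Assumption: a column may not appear in both lists, so overlap is rejected.
  def toTblProperties(options: Map[String, String]): String = {
    val include = options.get("dictionary_include").map(parseColumns).getOrElse(Seq.empty)
    val exclude = options.get("dictionary_exclude").map(parseColumns).getOrElse(Seq.empty)

    val overlap = include.intersect(exclude).mkString(", ")
    require(overlap.isEmpty,
      "Columns listed in both dictionary_include and dictionary_exclude: " + overlap)

    // Emit only the properties that actually have columns.
    val props = Seq(
      "DICTIONARY_INCLUDE" -> include,
      "DICTIONARY_EXCLUDE" -> exclude
    ).collect { case (key, cols) if cols.nonEmpty =>
      val joined = cols.mkString(",")
      s"'$key'='$joined'"
    }

    if (props.isEmpty) "" else props.mkString("TBLPROPERTIES (", ", ", ")")
  }
}
```

Passing a map with dictionary_include set to "intcol1,intcol2" and dictionary_exclude set to "stringcol1,stringcol2" yields a clause that could be appended to a hand-written CREATE TABLE statement. Options that are absent or empty produce no clause, which matches the current behaviour where the writer picks the dictionary columns itself.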

Issue Links

links to

GitHub Pull Request #769

GitHub Pull Request #786

People

Assignee: Sanoj MG

Reporter: Sanoj MG

Votes: 0

Watchers: 2

Dates

Created: 08/Apr/17 21:26

Updated: 18/Sep/17 09:04

Resolved: 18/Sep/17 09:04

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 3.5h