  1. CarbonData
  2. CARBONDATA-888

Dictionary include / exclude option in dataframe writer

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: spark-integration
    • Labels: None
    • Environment: HDP 2.5, Spark 1.6

    Description

      When creating a CarbonData table from a DataFrame, it is currently not possible to specify which columns should be included in or excluded from the dictionary. An option is required to specify this, as shown below:

      df.write.format("carbondata")
      .option("tableName", "test")
      .option("compress","true")
      .option("dictionary_include","intcol1,intcol2")
      .option("dictionary_exclude","stringcol1,stringcol2")
      .mode(SaveMode.Overwrite)
      .save()

      We have a lot of integer columns that are dimensions. dataframe.save is used to create tables quickly instead of writing DDL statements, and this feature would be useful when running POCs.
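
      To make the request concrete, here is a minimal sketch of how the two proposed writer options could be translated into the DICTIONARY_INCLUDE and DICTIONARY_EXCLUDE table properties of a CarbonData CREATE TABLE statement. The object DictionaryOptionsSketch and its methods are hypothetical names chosen for illustration and are not part of CarbonData. The validation rules (unknown columns and columns listed on both sides are rejected) are assumptions, not behaviour confirmed by this issue.

```scala
// Hypothetical helper, not part of CarbonData: it only builds the DDL string
// that a DataFrame writer could issue when the two options are supplied.
object DictionaryOptionsSketch {

  /** Splits a comma-separated option value into trimmed, non-empty column names. */
  def parseColumns(value: Option[String]): Seq[String] =
    value.toSeq.flatMap(_.split(",")).map(_.trim).filter(_.nonEmpty)

  /** Builds a CREATE TABLE statement that carries the dictionary table properties. */
  def buildCreateTable(
      tableName: String,
      schema: Seq[(String, String)], // (column name, SQL type)
      options: Map[String, String]): String = {
    val include = parseColumns(options.get("dictionary_include"))
    val exclude = parseColumns(options.get("dictionary_exclude"))

    // Assumed validation: every listed column must exist in the schema,
    // and no column may appear in both lists.
    val known = schema.map(_._1).toSet
    val unknown = (include ++ exclude).filterNot(known.contains)
    require(unknown.isEmpty, "Unknown columns: " + unknown.mkString(", "))
    val both = include.intersect(exclude)
    require(both.isEmpty, "Columns in both lists: " + both.mkString(", "))

    // Emit a property only when its column list is non-empty.
    val props = Seq("DICTIONARY_INCLUDE" -> include, "DICTIONARY_EXCLUDE" -> exclude)
      .collect { case (key, cols) if cols.nonEmpty =>
        "'" + key + "'='" + cols.mkString(",") + "'"
      }

    val columns = schema.map { case (name, tpe) => name + " " + tpe }.mkString(", ")
    val tblProps =
      if (props.isEmpty) "" else props.mkString(" TBLPROPERTIES (", ", ", ")")
    "CREATE TABLE IF NOT EXISTS " + tableName + " (" + columns + ")" +
      " STORED BY 'carbondata'" + tblProps
  }
}
```

      Calling buildCreateTable with the table name "test", a schema of two INT and two STRING columns, and the options map from the request would produce a single DDL string with both properties in its TBLPROPERTIES clause. This is the statement a user would otherwise have to write by hand.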

          People

            Assignee: Sanoj MG (sanoj_mg)
            Reporter: Sanoj MG (sanoj_mg)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 3.5h