  1. Spark
  2. SPARK-30023

Spark partitionBy saves as columnName={value} | Can it be only the column value?


Details

    • Type: Question
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.4.3
    • Fix Version/s: None
    • Component/s: Spark Core, SQL

    Description

      I am using Scala and Spark.

      I have a DataFrame with columns named "year", "month" and "date", plus many other columns that are not relevant here.

       

      Code snippet:

      df.write.partitionBy("year", "month", "date").format("csv").option("header", "true").save(outPath)
       

      I expect the output to be saved in a hierarchical folder structure:
       
      2016/11/15/file*.csv

       

      but the files are being saved as:

       
       
      year=2016/month=11/date=15/file*.csv

       

      Is there any way to remove the column name from the directory structure and keep only the column value?
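
      For context: as far as I know, partitionBy always writes Hive-style column=value directories and the writer has no option to drop the column name. One possible workaround is to rename the directories after the write finishes. Below is a minimal sketch of that idea, assuming the output is on a local filesystem. StripPartitionNames, valueOnly and renameAll are hypothetical names made up for illustration, not part of Spark. For HDFS or S3 the same logic would have to go through the Hadoop FileSystem API instead of java.nio.

```scala
import java.nio.file.{Files, Path}
import scala.collection.JavaConverters._

// Hypothetical helper, not part of Spark: post-processes a partitionBy output directory.
object StripPartitionNames {
  // Matches Hive-style directory names such as "year=2016" and captures the value part.
  private val HiveStyle = "[^=]+=(.+)".r

  /** "year=2016" becomes "2016"; names that are not column=value are returned unchanged. */
  def valueOnly(dirName: String): String = dirName match {
    case HiveStyle(value) => value
    case other            => other
  }

  /** Renames every column=value directory under `root` to just its value. */
  def renameAll(root: Path): Unit = {
    val stream = Files.walk(root)
    val dirs =
      try stream.iterator().asScala.filter(p => Files.isDirectory(p) && p != root).toList
      finally stream.close()

    // Deepest directories first, so a parent is renamed only after its children.
    dirs.sortBy(p => -p.getNameCount).foreach { dir =>
      val oldName = dir.getFileName.toString
      val newName = valueOnly(oldName)
      if (newName != oldName) Files.move(dir, dir.resolveSibling(newName))
    }
  }
}
```

      Calling StripPartitionNames.renameAll(java.nio.file.Paths.get(outPath)) after save(outPath) returns should turn year=2016/month=11/date=15 into 2016/11/15. Two caveats: Spark does not write the partition columns into the CSV files themselves, so after renaming those values exist only in the path, and Spark's partition discovery will no longer recognise the directories as partitions when reading the data back.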

       

       

          People

            Assignee: Unassigned
            Reporter: ShivaKumar SS (shivakumar.ss)