Spark / SPARK-14726

Support for sampling when inferring schema in CSV data source

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      I am currently using the CSV data source and getting used to Spark 2.0, which ships with a built-in CSV data source.

      I noticed that the CSV data source infers the schema by scanning all of the data, whereas the JSON data source supports a sampling ratio option.

      It would be great if the CSV data source had this option too (or is it already supported?).
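
      To make the request concrete, here is a minimal sketch of sampling-based schema inference. It is plain Python rather than Spark code, and it is not how Spark implements inference. The names `infer_type`, `infer_schema`, and `sampling_ratio` are hypothetical, chosen only to mirror the idea behind the JSON source's `samplingRatio` option.

```python
import csv
import io
import random


def infer_type(values):
    """Return the narrowest type name that fits every value: int, float, else str."""
    if not values:
        # Nothing was sampled, so fall back to the most general type.
        return "str"
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in values:
                cast(v)
            return name
        except ValueError:
            continue
    return "str"


def infer_schema(csv_text, sampling_ratio=1.0, seed=0):
    """Infer column types from a random subset of the data rows.

    sampling_ratio is a hypothetical parameter: each data row is kept
    with that probability, so 1.0 scans every row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rng = random.Random(seed)
    sampled = [row for row in reader if rng.random() < sampling_ratio]
    return {
        col: infer_type([row[i] for row in sampled])
        for i, col in enumerate(header)
    }


data = "id,price,name\n1,9.5,apple\n2,3,pear\n3,7.25,plum\n"
full_schema = infer_schema(data)                         # scans every row
sampled_schema = infer_schema(data, sampling_ratio=0.5)  # scans about half
```

      The trade-off is that a sample can miss the one row that would widen a column's type (for example, a `price` column that looks like `int` until a decimal value appears), so a sampled schema is cheaper to compute but may be narrower than the full-scan schema.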

          People

            Assignee: Unassigned
            Reporter: Bomi Kim (bomikim)
            Votes: 0
            Watchers: 2