Spark / SPARK-2927

Add a conf to configure if we always read Binary columns stored in Parquet as String columns

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: SQL
    • Labels: None

    Description

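      According to the Parquet spec (https://github.com/Parquet/parquet-format), "strings are stored as byte arrays (binary) with a UTF8 annotation". However, if the program that wrote the data does not add this annotation, Spark SQL reads those columns back as binary values instead of string values. A conf is needed to tell Spark SQL to always read such binary columns as strings.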
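
      The behaviour described above can be sketched in a few lines. This is only an illustration of the decision, not Spark's implementation: the helper names `resolve_sql_type` and `decode_value` and the `binary_as_string` flag are hypothetical. In released Spark versions the corresponding option is documented as `spark.sql.parquet.binaryAsString` (default `false`).

```python
from typing import Optional, Union


def resolve_sql_type(physical_type: str,
                     annotation: Optional[str],
                     binary_as_string: bool = False) -> str:
    """Pick a SQL type for a Parquet BINARY column.

    Hypothetical helper for illustration only; it is not Spark's code.
    """
    if physical_type != "BINARY":
        raise ValueError("this sketch only handles BINARY columns")
    # A writer that follows the spec marks strings with the UTF8 annotation.
    if annotation == "UTF8":
        return "string"
    # No annotation: fall back to the conf (assumed flag name).
    return "string" if binary_as_string else "binary"


def decode_value(raw: bytes, sql_type: str) -> Union[str, bytes]:
    """Decode raw bytes as UTF-8 text only when the column resolved to string."""
    return raw.decode("utf-8") if sql_type == "string" else raw


# A column written without the UTF8 annotation:
default_type = resolve_sql_type("BINARY", None)
forced_type = resolve_sql_type("BINARY", None, binary_as_string=True)
print(decode_value(b"spark", default_type))
print(decode_value(b"spark", forced_type))
```

      With the flag off, the unannotated column stays binary and the value comes back as bytes; with the flag on, the same bytes are decoded as a UTF-8 string. Columns that already carry the UTF8 annotation resolve to string either way.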

            People

              Assignee: Yin Huai (yhuai)
              Reporter: Yin Huai (yhuai)
              Votes: 0
              Watchers: 3
