[SPARK-18860] Update Parquet to 1.9.0

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      This issue aims to update Parquet to 1.9.0 and remove the hack that works around a limitation in Parquet 1.8.1.

      -  // !! HACK ALERT !!
      -  //
      -  // PARQUET-363 & PARQUET-278: parquet-mr 1.8.1 doesn't allow constructing empty GroupType,
      -  // which prevents us to avoid selecting any columns for queries like `SELECT COUNT(*) FROM t`.
      -  // This issue has been fixed in parquet-mr 1.8.2-SNAPSHOT.
      -  //
      -  // To workaround this problem, here we first construct a `MessageType` with a single dummy
      -  // field, and then remove the field to obtain an empty `MessageType`.
      -  //
      -  // TODO Reverts this change after upgrading parquet-mr to 1.8.2+
         val EMPTY_MESSAGE = Types
             .buildMessage()
      -      .required(PrimitiveType.PrimitiveTypeName.INT32).named("dummy")
             .named(ParquetSchemaConverter.SPARK_PARQUET_SCHEMA_NAME)
      -  EMPTY_MESSAGE.getFields.clear()
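
      For reference, a minimal sketch of what the code might look like once the lines marked `-` above are removed. It assumes a parquet-mr version that includes the PARQUET-363 fix is on the classpath, and it substitutes the literal "spark_schema" for `ParquetSchemaConverter.SPARK_PARQUET_SCHEMA_NAME` (an assumption about that constant's value). The object name `EmptyMessageSketch` is hypothetical.

```scala
import org.apache.parquet.schema.{MessageType, Types}

// Hypothetical wrapper object, for illustration only.
object EmptyMessageSketch {
  // Assumed stand-in for ParquetSchemaConverter.SPARK_PARQUET_SCHEMA_NAME.
  val SchemaName: String = "spark_schema"

  // Build a message type with no fields directly, without the dummy-field workaround.
  // This relies on the builder accepting an empty group (PARQUET-363).
  val EmptyMessage: MessageType = Types.buildMessage().named(SchemaName)

  def main(args: Array[String]): Unit = {
    // An empty message type should report zero fields.
    assert(EmptyMessage.getFieldCount == 0)
    println(EmptyMessage)
  }
}
```

      According to the removed comment, the same builder call on parquet-mr 1.8.1 fails because empty group types are not allowed, which is why the dummy field was added and then cleared.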
      

            People

              Assignee: Unassigned
              Reporter: Dongjoon Hyun (dongjoon)
              Votes: 0
              Watchers: 3
