  1. Spark
  2. SPARK-45334

Remove misleading comment in parquetSchemaConverter


Details

    • Type: Documentation
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL

    Description

      While debugging a Parquet issue, I was reading the Spark code as a reference and came across a misleading comment, which is still present in the latest version.

      Types
        .buildGroup(repetition).as(LogicalTypeAnnotation.listType())
        .addField(Types
          .buildGroup(REPEATED)
          // "array" is the name chosen by parquet-hive (1.7.0 and prior version)
          .addField(convertField(StructField("array", elementType, nullable)))
          .named("bag"))
        .named(field.name) 

      The comment above is misleading, since Hive always uses "array_element" as the name.
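
      To make the mismatch concrete, here is a minimal sketch that renders the list layout built by the snippet above next to a layout using the element name Hive uses. It does not call any Spark or Parquet API: `ListLayouts`, `ListNaming` and `render` are hypothetical helpers written only for this illustration. The outer group is hard-coded as `optional`, the element is assumed nullable, and reusing "bag" as the repeated group name for Hive is an assumption as well.

```scala
// Illustrative only: ListLayouts, ListNaming and render are hypothetical helpers,
// not part of Spark or parquet-mr.
object ListLayouts {
  // Names of the repeated group and of the element field inside a LIST-annotated group.
  final case class ListNaming(repeatedGroup: String, element: String)

  // Names produced by the Spark snippet quoted in this issue.
  val sparkLegacy: ListNaming = ListNaming("bag", "array")

  // Element name this issue attributes to Hive; the "bag" group name is an assumption.
  val hive: ListNaming = ListNaming("bag", "array_element")

  // Renders a list column with nullable elements as Parquet schema text.
  def render(field: String, elementType: String, naming: ListNaming): String =
    s"""optional group $field (LIST) {
       |  repeated group ${naming.repeatedGroup} {
       |    optional $elementType ${naming.element};
       |  }
       |}""".stripMargin

  def main(args: Array[String]): Unit = {
    println(render("f", "int32", sparkLegacy))
    println(render("f", "int32", hive))
  }
}
```

      The two rendered schemas differ only in the innermost field name ("array" versus "array_element"), which is why the comment attributing "array" to parquet-hive is confusing.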

      The comment was introduced by this PR https://github.com/apache/spark/pull/14399 and relates to this issue https://issues.apache.org/jira/browse/SPARK-16777

      Furthermore, the parquet-hive module has been removed from the parquet-mr project https://issues.apache.org/jira/browse/PARQUET-1676 

      I suggest removing this comment and will submit a PR later.


            People

              amoylan Mengran Lan
              amoylan Mengran Lan
              Hyukjin Kwon Hyukjin Kwon