SPARK-16926: Partition columns are present in columns metadata for partition but not table


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.1, 2.1.0
    • Component/s: SQL
    • Labels: None

    Description

      A change introduced in SPARK-14388 removes partition columns from the column metadata of tables, but not from that of partitions. This causes TableReader to conclude that the schemas differ and to create an unnecessary conversion object inspector, taking the else codepath in TableReader below:

          val soi = if (rawDeser.getObjectInspector.equals(tableDeser.getObjectInspector)) {
            rawDeser.getObjectInspector.asInstanceOf[StructObjectInspector]
          } else {
            ObjectInspectorConverters.getConvertedOI(
              rawDeser.getObjectInspector,
              tableDeser.getObjectInspector).asInstanceOf[StructObjectInspector]
          }
      

      Printing the properties as debug output confirms the difference for the Hive table.

      Table properties (tableDesc.getProperties):

      16/08/04 20:36:58 DEBUG HadoopTableReader: columns.types, string:bigint:string:bigint:bigint:array<string>
      

      Partition properties (partProps):

      16/08/04 20:36:58 DEBUG HadoopTableReader: columns.types, string:bigint:string:bigint:bigint:array<string>:string:string:string
      

      The final three string columns are the partition columns.
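The discrepancy can be surfaced directly by diffing the two columns.types property strings above. A minimal sketch (the strings are copied from the debug output; the naive ':' split is illustrative only and would break on nested struct types, which do not appear here):

```scala
// Compares the table's and the partition's "columns.types" property
// to isolate the partition columns that appear only in the partition
// metadata.
object ColumnTypesDiff {
  // Values taken from the DEBUG HadoopTableReader output above
  val tableTypes = "string:bigint:string:bigint:bigint:array<string>"
  val partTypes  = "string:bigint:string:bigint:bigint:array<string>:string:string:string"

  // Naive split on ':' works for these flat types (no struct<...> fields)
  def columns(types: String): Seq[String] = types.split(':').toSeq

  // Columns present in the partition metadata but absent from the table's
  def extraPartitionColumns: Seq[String] =
    columns(partTypes).drop(columns(tableTypes).length)

  def main(args: Array[String]): Unit =
    println(extraPartitionColumns.mkString(","))
}
```

Running this prints the three trailing string columns, matching the partition columns described above.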


          People

            Assignee: Brian Cho (chobrian)
            Reporter: Brian Cho (chobrian)
            Votes: 0
            Watchers: 4
