[SPARK-20450] Unexpected first-query schema inference cost with 2.1.1 RC - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.1.1
Fix Version/s: 2.1.1
Component/s: SQL
Labels: None

Description

https://issues.apache.org/jira/browse/SPARK-19611 fixes a regression from 2.0 where Spark silently fails to read case-sensitive fields when the table properties do not contain a case-sensitive schema. The fix is to detect this situation, infer the schema from the data files, and write the case-sensitive schema into the metastore.

However, this can incur an unexpected performance hit the first time such a table is queried. The false-positive rate is also high: the check fires for any table that lacks a saved case-sensitive schema, and most such tables don't actually have case-sensitive fields.
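To make the cost concrete, here is a minimal sketch of the detect, infer, and save flow described above. It is plain Python, not Spark code: `Table`, `SCHEMA_KEY`, `infer_schema_from_files`, and `resolve_schema` are hypothetical names chosen for illustration, and the "inference" is a stand-in for whatever file scanning the real implementation performs.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical property name; the real key used by Spark is not shown in this issue.
SCHEMA_KEY = "case_sensitive_schema"


@dataclass
class Table:
    """Hypothetical stand-in for a metastore table entry."""
    metastore_columns: List[str]  # column names as stored (lower-cased) in the metastore
    file_columns: List[str]       # column names as written in the data files
    properties: Dict[str, str] = field(default_factory=dict)


inference_runs = 0  # counts how often the expensive path is taken


def infer_schema_from_files(table: Table) -> List[str]:
    """Stand-in for the expensive step of reading schemas from the data files."""
    global inference_runs
    inference_runs += 1
    return list(table.file_columns)


def resolve_schema(table: Table) -> List[str]:
    """Return the case-sensitive schema, inferring and saving it if it is missing."""
    saved = table.properties.get(SCHEMA_KEY)
    if saved is not None:
        return saved.split(",")
    # No saved schema: this is the "problematic table" case, whether or not
    # the files actually contain mixed-case column names.
    inferred = infer_schema_from_files(table)
    table.properties[SCHEMA_KEY] = ",".join(inferred)
    return inferred


# A table with only lower-case columns still pays for inference once (false positive).
plain = Table(metastore_columns=["id", "name"], file_columns=["id", "name"])
resolve_schema(plain)  # first query: inference runs
resolve_schema(plain)  # later queries: served from the saved property

# A table that genuinely needs the fix.
mixed = Table(metastore_columns=["userid"], file_columns=["userId"])
resolve_schema(mixed)
```

In this sketch the first `resolve_schema` call on `plain` runs inference even though the inferred names match the metastore names exactly, which is the false-positive cost this issue is about.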

Issue Links

is related to: SPARK-19611 Spark 2.1.0 breaks some Hive tables backed by case-sensitive data files (Resolved)

links to: [Github] Pull Request #17749 (ericl)

People

Assignee: Eric Liang

Reporter: Eric Liang

Votes: 0

Watchers: 2

Dates

Created: 24/Apr/17 19:25

Updated: 24/Apr/17 23:23

Resolved: 24/Apr/17 23:23