[SPARK-21912] ORC/Parquet table should not create invalid column names


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, creating an ORC data source table with an invalid column name aborts the write job. We should prevent this by raising AnalysisException up front, as Parquet data source tables already do.

      scala> sql("CREATE TABLE orc1 USING ORC AS SELECT 1 `a b`")
      17/09/04 13:28:21 ERROR Utils: Aborting task
      java.lang.IllegalArgumentException: Error: : expected at the position 8 of 'struct<a b:int>' but ' ' is found.
      	at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:360)
      ...
      17/09/04 13:28:21 WARN FileOutputCommitter: Could not delete file:/Users/dongjoon/spark-release/spark-master/spark-warehouse/orc1/_temporary/0/_temporary/attempt_20170904132821_0001_m_000000_0
      17/09/04 13:28:21 ERROR FileFormatWriter: Job job_20170904132821_0001 aborted.
      17/09/04 13:28:21 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
      org.apache.spark.SparkException: Task failed while writing rows.
      
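The fix this issue describes can be sketched as a field-name check run at analysis time, before any writer task starts, similar to the check Spark's Parquet path already performs. The object and method names below are illustrative stand-ins, not Spark's actual internals; the invalid-character set mirrors the characters Hive's `struct<...>` type-string parser cannot handle, and the plain `IllegalArgumentException` stands in for Spark's `AnalysisException`.

```scala
// Hypothetical sketch of an eager column-name check for ORC tables.
// Assumptions: object/method names and the exception type are illustrative;
// the character set is modeled on what Hive's TypeInfo parser rejects.
object ColumnNameCheck {
  // Characters that break Hive's type-string parsing of 'struct<name:type>'.
  private val invalidChars = " ,;{}()\n\t="

  def checkFieldName(name: String): Unit = {
    if (name.exists(invalidChars.contains(_))) {
      throw new IllegalArgumentException(
        s"""Attribute name "$name" contains invalid character(s). """ +
        "Please use an alias to rename it before writing.")
    }
  }
}

// Usage: "a b" fails fast at analysis time instead of mid-write.
// ColumnNameCheck.checkFieldName("a b")  // throws IllegalArgumentException
```

Running such a check over every field of the output schema during analysis turns the late task-level `IllegalArgumentException` in the log above into a single, early error.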


      People

        Assignee: Dongjoon Hyun
        Reporter: Dongjoon Hyun
        Votes: 0
        Watchers: 3
