Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 0.7.1, 0.10.0
- Fix Version/s: None
- Component/s: None
- Labels: None
Description
I am working with a custom InputFormat and a custom SerDe where it sometimes makes sense to have two external tables with different schemas and properties but the same location. When such tables are used in the same query, a RuntimeException may occur.
I realize that with Hive's built-in adapters, it may not ever be useful to create two external tables with the same location. The following example is nonsensical but it can be used to easily reproduce the error:
CREATE EXTERNAL TABLE f (fk STRING, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
LOCATION '/local/data/';
CREATE EXTERNAL TABLE IF NOT EXISTS p (pk STRING)
LOCATION '/local/data/';
SELECT p.pk
FROM p LEFT OUTER JOIN f
ON p.pk = f.fk;
Place a file named data.txt in /local/data with the following contents:
k1 apple
k2 orange
k2 pear
Running the query then produces the following error:
Caused by: java.lang.RuntimeException: cannot find field fk from [0:pk]
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:346)
at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168)
...
Ideally this should be supported by Hive as it is useful for semi-structured documents (e.g. JSON, XML) where multiple big "relations" may be contained in the same file. However, if adding support is infeasible, it would be nice to detect this condition statically and raise a more meaningful error from the client process.
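To make the static check concrete, here is a minimal sketch of what such detection could look like. It is not Hive code: the function name find_shared_locations, the plain dict of table locations, and the list of tables referenced by the query are all hypothetical stand-ins for whatever the metastore and query planner would actually provide. It only assumes that each table's location is known before the job is submitted.

```python
from collections import defaultdict
from posixpath import normpath


def find_shared_locations(table_locations, tables_in_query):
    """Return {location: [tables]} for locations used by two or more
    distinct tables referenced in the same query.

    table_locations: hypothetical mapping of table name -> LOCATION string.
    tables_in_query: hypothetical list of table names the query references.
    """
    by_location = defaultdict(list)
    # A set avoids flagging a self-join, where one table appears twice.
    for table in set(tables_in_query):
        # Normalize so '/local/data/' and '/local/data' compare equal.
        # Assumption: plain paths; real Hive locations are URIs and would
        # need proper URI normalization instead.
        location = normpath(table_locations[table])
        by_location[location].append(table)
    return {
        location: sorted(names)
        for location, names in by_location.items()
        if len(names) > 1
    }


# Hypothetical metadata mirroring the example tables f and p above.
locations = {"f": "/local/data/", "p": "/local/data", "other": "/local/other/"}
conflicts = find_shared_locations(locations, ["p", "f"])
for location, names in conflicts.items():
    print(f"Tables {', '.join(names)} share location {location}")
```

A check along these lines could run in the client process and raise a descriptive error naming the conflicting tables, instead of letting the job fail later with the RuntimeException shown above.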
Issue Links
- is related to HIVE-24920 TRANSLATED_TO_EXTERNAL tables may write to the same location (Closed)