Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
See HIVE-15397.
There are multiple complementary ways to handle this properly:
1) Enhance MetadataOnly to recognize when table emptiness affects the result, and apply the optimization only to query patterns that are safe (or use one of the approaches below only in the unsafe cases).
2) Create the original InputFormat (IF) during compilation, get a record reader from it, and check whether it is empty. This seems to be the only bulletproof method in terms of correctness, but it may break because setup and access differ between tasks and compilation. It may also have security implications, e.g. if compilation runs in HS2 and its permissions differ from those of the tasks.
3) Instead of using NullIF for this case, inject a limit into the table scan (either as a limit in the plan, or hacked into the TS operator itself specifically for this feature) and keep the original InputFormat. That way, instead of 0 or 1 null rows, the scan would return 0 or 1 rows from the original split while still avoiding large scans, which is the goal.
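To make the difference between the null-row substitution and option 3 concrete, here is a minimal standalone sketch. It does not use Hive's classes: `MetadataOnlySketch`, `oneNullRow`, `limit`, and `countRows` are hypothetical names, and plain iterators stand in for record readers. It assumes the problematic behavior is that the substituted source yields a row even when the underlying table has none, and shows that a limited scan over the original data preserves emptiness.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class MetadataOnlySketch {

    /** Hypothetical stand-in for the null-row substitution: always one null row. */
    static Iterator<String> oneNullRow() {
        return Collections.singletonList((String) null).iterator();
    }

    /** Wraps the original row source and stops after maxRows rows (option 3). */
    static <T> Iterator<T> limit(final Iterator<T> rows, final int maxRows) {
        return new Iterator<T>() {
            private int returned = 0;

            @Override
            public boolean hasNext() {
                return returned < maxRows && rows.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                returned++;
                return rows.next();
            }
        };
    }

    /** Drains the iterator and returns how many rows it produced. */
    static <T> int countRows(Iterator<T> rows) {
        int n = 0;
        while (rows.hasNext()) {
            rows.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> emptyTable = Collections.emptyList();
        List<String> bigTable = Arrays.asList("r1", "r2", "r3");

        // Null-row substitution ignores the table contents entirely.
        System.out.println(countRows(oneNullRow()));                    // 1
        // Limited scan of the original source: an empty table stays empty.
        System.out.println(countRows(limit(emptyTable.iterator(), 1))); // 0
        // A non-empty table yields one real row, and the scan stops early.
        System.out.println(countRows(limit(bigTable.iterator(), 1)));   // 1
    }
}
```

In this sketch the limited wrapper reads at most one row from the underlying source, so the cost stays bounded while the 0-versus-1 distinction comes from the real data rather than from a synthetic row.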
Issue Links
- is a clone of HIVE-15397 metadata-only queries may return incorrect results with empty tables (Closed)