[CALCITE-1208] Improve two-level column structure handling - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.7.0
Fix Version/s: 1.9.0
Component/s: core
Labels:
- phoenix

Description

Calcite now supports nested column structures in parsing and validation by representing inner-level columns as a RexFieldAccess based on a RexInputRef. However, it does not flatten the inner-level structure during wildcard expansion, which then causes an UnsupportedOperationException in Avatica.

The idea is to take this nested structure into account when resolving columns, but to flatten the structure when translating to RelNode/RexNode.
For example, if the table structure is defined as

VARCHAR K0,
VARCHAR C1,
RecordType(INTEGER C0, INTEGER C1) F0,
RecordType(INTEGER C0, INTEGER C2) F1

, it should be viewed as a flat type like

VARCHAR K0,
VARCHAR C1,
INTEGER F0.C0,
INTEGER F0.C1,
INTEGER F1.C0,
INTEGER F1.C2

, so that:
1) Column reference "K0" is translated as $0
2) Column reference "F0.C1" is translated as $3
3) Wildcard "*" is translated as: $0, $1, $2, $3, $4, $5
4) Complex-column wildcard "F1.*" is translated as: $4, $5
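
The flattening and wildcard expansion described above can be sketched in a few lines of Java. This is only an illustration of the intended ordinals, not Calcite code: the names FlattenSketch, Column, flatten and expandStar are hypothetical, and the table is the example from this description. Note that with this layout "F1.*" covers positions 4 and 5, consistent with "F1.C0" being $4.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names exist in Calcite.
final class FlattenSketch {
    /** A first-level column; subFields is empty for a non-nested column. */
    record Column(String name, List<String> subFields) {
        static Column scalar(String name) {
            return new Column(name, List.of());
        }
        static Column struct(String name, String... subFields) {
            return new Column(name, List.of(subFields));
        }
    }

    /** The example table from the description. */
    static final List<Column> TABLE = List.of(
        Column.scalar("K0"),
        Column.scalar("C1"),
        Column.struct("F0", "C0", "C1"),
        Column.struct("F1", "C0", "C2"));

    /** Returns the flat field names; the list index is the $n ordinal. */
    static List<String> flatten(List<Column> table) {
        List<String> flat = new ArrayList<>();
        for (Column c : table) {
            if (c.subFields().isEmpty()) {
                flat.add(c.name());
            } else {
                for (String sub : c.subFields()) {
                    flat.add(c.name() + "." + sub);
                }
            }
        }
        return flat;
    }

    /** Expands "*" (structName == null) or "structName.*" to flat ordinals. */
    static List<Integer> expandStar(List<Column> table, String structName) {
        List<String> flat = flatten(table);
        List<Integer> ordinals = new ArrayList<>();
        for (int i = 0; i < flat.size(); i++) {
            if (structName == null || flat.get(i).startsWith(structName + ".")) {
                ordinals.add(i);
            }
        }
        return ordinals;
    }

    public static void main(String[] args) {
        System.out.println(flatten(TABLE));
        System.out.println(expandStar(TABLE, null));
        System.out.println(expandStar(TABLE, "F1"));
    }
}
```
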
We would like to resolve columns according to the following rules (here we only consider the "suffix" part of the qualified names, which means table resolution is already done at this point):
a) A two-part column name is matched with its first-level column name and its second-level column name. For example, "F1.C0" corresponds to $4; "F1.X" will throw a column not found error.
b) A single-part column name is matched against non-nested columns first and, if there is no match, it is then matched against the second-level column names. For example, "C1" will be matched as "$1" instead of "$3", since non-nested columns have a higher priority; "C2" will be matched as "$5"; "C0" will lead to an ambiguous column error, since it exists under both "F0" and "F1".
c) We would also like to have a way of defining a "default first-level column" so that it takes precedence in column resolving over other first-level columns. For example, if "F0" is defined as default, "C0" will not cause an ambiguous column error, but will instead be matched as "$2".
d) A reference to a first-level column alone, without a wildcard, is not allowed, e.g., "F1".
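
Rules a) to d) can be illustrated with a small resolver over the flat layout. This is a sketch under assumptions, not the proposed implementation: ResolveSketch and resolve are hypothetical names, the flat field list is hard-coded from the example table, and errors are reported as IllegalArgumentException for brevity. It also assumes that non-nested columns still win over the default first-level column, which the rules imply but do not state outright.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names exist in Calcite.
final class ResolveSketch {
    /** Flat layout of the example table; the list index is the $n ordinal. */
    static final List<String> FLAT =
        List.of("K0", "C1", "F0.C0", "F0.C1", "F1.C0", "F1.C2");

    /** First-level columns that have nested fields. */
    static final List<String> STRUCTS = List.of("F0", "F1");

    /**
     * Resolves a one-part or two-part column name to its flat ordinal.
     * defaultStruct may be null when no default first-level column is defined.
     */
    static int resolve(String name, String defaultStruct) {
        // Rule d: a bare first-level column reference is not allowed.
        if (STRUCTS.contains(name)) {
            throw new IllegalArgumentException("not allowed without wildcard: " + name);
        }
        // Rule a (two-part names) and the first half of rule b
        // (non-nested columns) are both exact matches on the flat names.
        int exact = FLAT.indexOf(name);
        if (exact >= 0) {
            return exact;
        }
        if (name.contains(".")) {
            throw new IllegalArgumentException("column not found: " + name);
        }
        // Rule c: the default first-level column takes precedence.
        if (defaultStruct != null) {
            int inDefault = FLAT.indexOf(defaultStruct + "." + name);
            if (inDefault >= 0) {
                return inDefault;
            }
        }
        // Second half of rule b: match against second-level names.
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < FLAT.size(); i++) {
            if (FLAT.get(i).endsWith("." + name)) {
                matches.add(i);
            }
        }
        if (matches.isEmpty()) {
            throw new IllegalArgumentException("column not found: " + name);
        }
        if (matches.size() > 1) {
            throw new IllegalArgumentException("ambiguous column: " + name);
        }
        return matches.get(0);
    }
}
```

With this sketch, resolve("C1", null) yields 1 and resolve("C2", null) yields 5, while resolve("C0", null) fails as ambiguous and resolve("C0", "F0") yields 2.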

Issue Links

is depended upon by

CALCITE-1356 Release Calcite 1.9.0

Closed

is duplicated by

CALCITE-555 Differentiate between column not found and ambiguous column exceptions

Closed

is related to

DRILL-4682 Allow full schema identifier in SELECT clause

Open

relates to

CALCITE-555 Differentiate between column not found and ambiguous column exceptions

Closed

CALCITE-1322 Wrong prefix number in DelegatingScope.fullyQualify()

Open

CALCITE-1378 ArrayIndexOutOfBoundsException in sql-to-rel conversion for two-level columns

Closed

CALCITE-1431 RelDataTypeFactoryImpl.copyType() did not copy StructKind

Closed

CALCITE-1425 Support two-level column structure in INSERT/UPDATE/MERGE

Closed

CALCITE-1379 When expanding STAR, expand sub-fields in RecordType columns of StructKind.PEEK_FIELDS and StructKind.PEEK_FIELDS_DEFAULT

Closed

People

Assignee: Julian Hyde

Reporter: Wei Xue

Votes: 0

Watchers: 5

Dates

Created: 20/Apr/16 02:03

Updated: 27/Feb/24 22:23

Resolved: 16/Sep/16 08:46