Details
-
Bug
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
1.16.0
-
None
-
None
-
None
Description
The sqltypeof() function was introduced in Drill 1.14 to work around limitations of the original typeof() function. The function is mentioned in Learning Apache Drill, Chapter 8, page 152:
ELECT sqlTypeOf(columns) AS cols_type, modeOf(columns) AS cols_mode FROM `csv/cust.csv` LIMIT 1; +--------------------+------------+ | cols_type | cols_mode | +--------------------+------------+ | CHARACTER VARYING | ARRAY | +--------------------+------------+
When the same query is run against the just-released Drill 1.17, we get the wrong results:
+-----------+-----------+ | cols_type | cols_mode | +-----------+-----------+ | ARRAY | ARRAY | +-----------+-----------+
The definition of sqlTypeOf() is that it should return the type portion of the columns (type, mode) major type. Clearly, it is no longer doing so for arrays. As a result, there is no function to obtain the data type for arrays.
The problem also shows up in the query from page 158:
SELECT a, b, sqlTypeOf(b) AS b_type, modeof(b) AS b_mode FROM `gen/70kmissing.json` WHERE mod(a, 70000) = 1;
Expected (table from the book with Drill 1.14 results):
+--------+-------+----------+-----------+ | a | b | b_type | b_mode | +--------+-------+----------+-----------+ | 1 | null | INTEGER | NULLABLE | +--------+-------+----------+-----------+
Actual Drill 1.17 results:
+-------+-----------+-------------------+----------+ | a | b | b_type | b_mode | +-------+-----------+-------------------+----------+ | 1 | null | NULL | NULLABLE | +-------+-----------+-------------------+----------+
(Second line of table is omitted because something else changed, not relevant to this ticket.)
The above might not actually be a bug, however if someone has changed the type of missing columns from the old INT to a newer (untyped) NULL. But, an indirect test suggests that the column is still `INT` and the function is wrong:
SELECT a, b FROM `gen/70kdouble.json` WHERE b IS NOT NULL ORDER BY a;
Data:
{a: 1} ... {a: 69999} {a: 70001, b: 10.5}
Error:
Error: UNSUPPORTED_OPERATION ERROR: Schema changes not supported in External Sort. Please enable Union type. Previous schema BatchSchema [fields=[[`a` (BIGINT:OPTIONAL)], [`b` (INT:OPTIONAL)]], selectionVector=NONE] Incoming schema BatchSchema [fields=[[`a` (BIGINT:OPTIONAL)], [`b` (FLOAT8:OPTIONAL)]], selectionVector=NONE]
Oddly, however, the query on page 160 works as expected:
SELECT sqlTypeOf(a) AS a_type, modeOf(a) AS a_mode FROM `json/all-null.json` LIMIT 1;
+---------+----------+ | a_type | a_mode | +---------+----------+ | INTEGER | NULLABLE | +---------+----------+
Someone will have to do some investigating to understand the current behaviour.