Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
1.17.0
-
None
-
None
Description
With a parquet containing an array field, for example:
apache drill 1.17> CREATE TABLE dfs.TEST.`example_any_pqt` AS (SELECT 'foo' AS a, 'bar' b, split('foo,bar',',') as c); apache drill 1.17> SELECT *, typeof(c) AS type, sqltypeof(c) AS sql_type FROM dfs.TEST.`example_any_pqt`; +-----+-----+---------------+---------+----------+ | a | b | c | type | sql_type | +-----+-----+---------------+---------+----------+ | foo | bar | ["foo","bar"] | VARCHAR | ARRAY | +-----+-----+---------------+---------+----------+
The next request work well
apache drill 1.17> SELECT * FROM (SELECT a, any_value(c) FROM dfs.TEST.`example_any_pqt` GROUP BY a) ORDER BY a; +-----+---------------+ | a | EXPR$1 | +-----+---------------+ | foo | ["foo","bar"] | +-----+---------------+
But the next request (with the same struct as the previous request) failed
apache drill 1.17> SELECT * FROM (SELECT a, b, any_value(c) FROM dfs.TEST.`example_any_pqt` GROUP BY a, b) ORDER BY a; Error: UNSUPPORTED_OPERATION ERROR: Schema changes not supported in External Sort. Please enable Union type. Previous schema BatchSchema [fields=[[`a` (VARCHAR:OPTIONAL)], [`b` (VARCHAR:OPTIONAL)], [`EXPR$2` (NULL:OPTIONAL)]], selectionVector=NONE] Incoming schema BatchSchema [fields=[[`a` (VARCHAR:OPTIONAL)], [`b` (VARCHAR:OPTIONAL)], [`EXPR$2` (VARCHAR:REPEATED), children=([`$data$` (VARCHAR:REQUIRED)])]], selectionVector=NONE] Fragment 0:0
Note that the same request without the order by works well. It's also possible to use intermediate table and apply the ORDER BY in a second time.
apache drill 1.17> SELECT * FROM (SELECT a, b, any_value(c) FROM dfs.TEST.`example_any_pqt` GROUP BY a, b); +-----+-----+---------------+ | a | b | EXPR$2 | +-----+-----+---------------+ | foo | bar | ["foo","bar"] | +-----+-----+---------------+ apache drill 1.17> CREATE TABLE dfs.TEST.`ok_pqt` AS (SELECT * FROM (SELECT a, b, any_value(c) FROM dfs.TEST.`example_any_pqt` GROUP BY a, b)); +----------+---------------------------+ | Fragment | Number of records written | +----------+---------------------------+ | 0_0 | 1 | +----------+---------------------------+ apache drill 1.17> SELECT * FROM dfs.TEST.`ok_pqt` ORDER BY a; +-----+-----+---------------+ | a | b | EXPR$2 | +-----+-----+---------------+ | foo | bar | ["foo","bar"] | +-----+-----+---------------+