Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Description
REPEATED_COUNT does not work on a JSON field that contains an array of maps.
JSON file
drill$ cat /Users/jccote/repeated_count.json
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
select
0: jdbc:drill:zk=local> select repeated_count(mapArray) from dfs.`/Users/jccote/repeated_count.json`;
error
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize incoming schema. Errors:
Error in expression at index -1. Error: Missing function implementation: [repeated_count(MAP-REPEATED)]. Full expression: --UNKNOWN EXPRESSION--..
Fragment 0:0
[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] (state=,code=0)
The same issue occurs with an array of arrays. For the JSON file
{"id": 1, "array": [[1, 2], [1, 3], [2, 3]]}
{"id": 2, "array": []}
{"id": 3, "array": [[2, 3], [1, 3, 4]]}
{"id": 4, "array": [[1], [2], [3, 4], [5], [6]]}
{"id": 5, "array": [[1, 2, 3], [4, 5], [6], [7], [8, 9], [2, 3], [2, 3], [2, 3], [2]]}
{"id": 6, "array": [[1, 2], [3], [4], [5]]}
{"id": 7, "array": []}
{"id": 8, "array": [[1], [2], [3]]}
the following error is shown:
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(array) from `arrayOfArrays.json`;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize incoming schema. Errors:
Error in expression at index -1. Error: Missing function implementation: [repeated_count(LIST-REPEATED)]. Full expression: --UNKNOWN EXPRESSION--..
Fragment 0:0
[Error Id: 12b81b85-c84b-4773-8427-48b80098cafe on qa102-45.qa.lab:31010] (state=,code=0)
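For reference, the results these two queries would be expected to return can be sketched in plain Java. This is only an illustration of the assumed semantics (REPEATED_COUNT returns the number of top-level elements, so each nested map or inner array counts once); the class `RepeatedCountSketch` and the method `repeatedCount` are hypothetical names, not part of Drill.

```java
import java.util.List;
import java.util.Map;

public class RepeatedCountSketch {

    // Hypothetical helper: number of top-level elements in a repeated value.
    // Assumption: a nested map or inner list counts as a single element.
    public static int repeatedCount(List<?> repeated) {
        return repeated.size();
    }

    public static void main(String[] args) {
        // "mapArray" from one record of repeated_count.json
        List<Map<String, String>> mapArray =
                List.of(Map.of("name", "foo"), Map.of("name", "foo"));
        System.out.println("mapArray: " + repeatedCount(mapArray)); // mapArray: 2

        // "array" column of the array-of-arrays file, ids 1 through 8
        List<List<List<Integer>>> rows = List.of(
                List.of(List.of(1, 2), List.of(1, 3), List.of(2, 3)),
                List.of(),
                List.of(List.of(2, 3), List.of(1, 3, 4)),
                List.of(List.of(1), List.of(2), List.of(3, 4), List.of(5), List.of(6)),
                List.of(List.of(1, 2, 3), List.of(4, 5), List.of(6), List.of(7),
                        List.of(8, 9), List.of(2, 3), List.of(2, 3), List.of(2, 3),
                        List.of(2)),
                List.of(List.of(1, 2), List.of(3), List.of(4), List.of(5)),
                List.of(),
                List.of(List.of(1), List.of(2), List.of(3)));

        // Prints one count per record: 3, 0, 2, 5, 9, 4, 0, 3
        for (int i = 0; i < rows.size(); i++) {
            System.out.println("id " + (i + 1) + ": " + repeatedCount(rows.get(i)));
        }
    }
}
```

Under that assumption, every record of the first file would yield 2, and the empty arrays (ids 2 and 7) would yield 0 rather than an error.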
Looking at org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions, it looks like the map and list implementations are not enabled yet.
// TODO - need to confirm that these work SMP: They do not
@FunctionTemplate(name = "repeated_count", scope = FunctionTemplate.FunctionScope.SIMPLE)
public static class RepeatedLengthMap implements DrillSimpleFunc {
...
// TODO - need to confirm that these work SMP: They do not
@FunctionTemplate(name = "repeated_count", scope = FunctionTemplate.FunctionScope.SIMPLE)
public static class RepeatedLengthList implements DrillSimpleFunc {
Also extend REPEATED_COUNT to the other REPEATED types. With this change, Drill's REPEATED_COUNT function supports the following REPEATED types: RepeatedBit, RepeatedInt, RepeatedBigInt, RepeatedFloat4, RepeatedFloat8, RepeatedDate, RepeatedTimeStamp, RepeatedTime, RepeatedIntervalDay, RepeatedIntervalYear, RepeatedInterval, RepeatedVarChar, RepeatedVarBinary, RepeatedVarDecimal, RepeatedDecimal9, RepeatedDecimal18, RepeatedDecimal28Sparse, RepeatedDecimal38Sparse, RepeatedList, RepeatedMap