  1. Apache Drill
  2. DRILL-4858

REPEATED_COUNT on an array of maps and an array of arrays is not implemented

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.16.0
    • Component/s: None
    • Labels: None

    Description

      REPEATED_COUNT does not work on JSON data that contains an array of maps.

      JSON file:

      drill$ cat /Users/jccote/repeated_count.json 
      {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
      {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
      {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
      {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
      {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
      {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
      {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
      {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": "foo"}
      

      Query:

      0: jdbc:drill:zk=local> select repeated_count(mapArray) from dfs.`/Users/jccote/repeated_count.json`;
      

      Error:

      Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize incoming schema.  Errors:
       
      Error in expression at index -1.  Error: Missing function implementation: [repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..
      
      Fragment 0:0
      
      [Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] (state=,code=0)
      

      The same issue occurs for an array of arrays. For this JSON file:

      {"id": 1, "array": [[1, 2], [1, 3], [2, 3]]}
      {"id": 2, "array": []}
      {"id": 3, "array": [[2, 3], [1, 3, 4]]}
      {"id": 4, "array": [[1], [2], [3, 4], [5], [6]]}
      {"id": 5, "array": [[1, 2, 3], [4, 5], [6], [7], [8, 9], [2, 3], [2, 3], [2, 3], [2]]}
      {"id": 6, "array": [[1, 2], [3], [4], [5]]}
      {"id": 7, "array": []}
      {"id": 8, "array": [[1], [2], [3]]}
      

      the following error is shown:

      0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(array) from `arrayOfArrays.json`;
      Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize incoming schema.  Errors:
      
      Error in expression at index -1.  Error: Missing function implementation: [repeated_count(LIST-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..
      
      Fragment 0:0
      
      [Error Id: 12b81b85-c84b-4773-8427-48b80098cafe on qa102-45.qa.lab:31010] (state=,code=0)
      

      In org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions, the map and list variants of repeated_count appear not to be enabled yet:

        // TODO - need to confirm that these work   SMP: They do not
        @FunctionTemplate(name = "repeated_count", scope = FunctionTemplate.FunctionScope.SIMPLE)
        public static class RepeatedLengthMap implements DrillSimpleFunc {
      ...
        // TODO - need to confirm that these work   SMP: They do not
        @FunctionTemplate(name = "repeated_count", scope = FunctionTemplate.FunctionScope.SIMPLE)
        public static class RepeatedLengthList implements DrillSimpleFunc {
      

      Also make the REPEATED_COUNT function support the other REPEATED types, so that it covers all of the following: RepeatedBit, RepeatedInt, RepeatedBigInt, RepeatedFloat4, RepeatedFloat8, RepeatedDate, RepeatedTimeStamp, RepeatedTime, RepeatedIntervalDay, RepeatedIntervalYear, RepeatedInterval, RepeatedVarChar, RepeatedVarBinary, RepeatedVarDecimal, RepeatedDecimal9, RepeatedDecimal18, RepeatedDecimal28Sparse, RepeatedDecimal38Sparse, RepeatedList, RepeatedMap
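
      For reference, a minimal sketch of the result one would presumably expect once REPEATED_COUNT handles nested types. It assumes the function counts only the top-level elements of the repeated value, so each inner array or map counts as one element. The class RepeatedCountSketch and its methods are hypothetical illustrations written in plain Java; they are not part of Drill's API and do not show how Drill implements the function.

      ```java
      import java.util.ArrayList;
      import java.util.List;

      // Hypothetical illustration only; not Drill code.
      class RepeatedCountSketch {

          // Assumed semantics: the number of top-level elements in a repeated value.
          // A nested array or map counts as a single element.
          static int repeatedCount(List<?> repeated) {
              return repeated.size();
          }

          // The "array" column of the array-of-arrays sample file, ids 1 through 8.
          static List<List<List<Integer>>> sampleRows() {
              return List.of(
                  List.of(List.of(1, 2), List.of(1, 3), List.of(2, 3)),
                  List.of(),
                  List.of(List.of(2, 3), List.of(1, 3, 4)),
                  List.of(List.of(1), List.of(2), List.of(3, 4), List.of(5), List.of(6)),
                  List.of(List.of(1, 2, 3), List.of(4, 5), List.of(6), List.of(7), List.of(8, 9),
                          List.of(2, 3), List.of(2, 3), List.of(2, 3), List.of(2)),
                  List.of(List.of(1, 2), List.of(3), List.of(4), List.of(5)),
                  List.of(),
                  List.of(List.of(1), List.of(2), List.of(3))
              );
          }

          // Applies repeatedCount to every sample row, in file order.
          static List<Integer> sampleCounts() {
              List<Integer> counts = new ArrayList<>();
              for (List<List<Integer>> row : sampleRows()) {
                  counts.add(repeatedCount(row));
              }
              return counts;
          }

          public static void main(String[] args) {
              List<Integer> counts = sampleCounts();
              for (int i = 0; i < counts.size(); i++) {
                  System.out.println("id=" + (i + 1) + " repeated_count=" + counts.get(i));
              }
          }
      }
      ```

      Under the same assumption, every row of repeated_count.json would yield 2 for mapArray, since each row holds two maps.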

          People

            bohdan Bohdan Kazydub
            jccote jean-claude
            Vitalii Diravka Vitalii Diravka