Uploaded image for project: 'Apache Arrow'
  1. Apache Arrow
  2. ARROW-18403

[C++] Error consuming Substrait plan which uses count function: "only unary aggregate functions are currently supported"

    XMLWordPrintableJSON

Details

    Description

      ARROW-17523 added support for the Substrait extension function "count", but when I write code which produces a Substrait plan which calls it, and then try to run it in Acero, I get an error.

      The plan:

      message of type 'substrait.Plan' with 3 fields set
      extension_uris {
        extension_uri_anchor: 1
        uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml"
      }
      extension_uris {
        extension_uri_anchor: 2
        uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml"
      }
      extension_uris {
        extension_uri_anchor: 3
        uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml"
      }
      extensions {
        extension_function {
          extension_uri_reference: 3
          function_anchor: 2
          name: "count"
        }
      }
      relations {
        rel {
          aggregate {
            input {
              project {
                common {
                  emit {
                    output_mapping: 9
                    output_mapping: 10
                    output_mapping: 11
                    output_mapping: 12
                    output_mapping: 13
                    output_mapping: 14
                    output_mapping: 15
                    output_mapping: 16
                    output_mapping: 17
                  }
                }
                input {
                  read {
                    base_schema {
                      names: "int"
                      names: "dbl"
                      names: "dbl2"
                      names: "lgl"
                      names: "false"
                      names: "chr"
                      names: "verses"
                      names: "padded_strings"
                      names: "some_negative"
                      struct_ {
                        types {
                          i32 {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                        types {
                          fp64 {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                        types {
                          fp64 {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                        types {
                          bool_ {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                        types {
                          bool_ {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                        types {
                          string {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                        types {
                          string {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                        types {
                          string {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                        types {
                          fp64 {
                            nullability: NULLABILITY_NULLABLE
                          }
                        }
                      }
                    }
                    local_files {
                      items {
                        uri_file: "file:///tmp/RtmpsBsoZJ/file1915f604cff4a"
                        parquet {
                        }
                      }
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                      }
                    }
                    root_reference {
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                        field: 1
                      }
                    }
                    root_reference {
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                        field: 2
                      }
                    }
                    root_reference {
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                        field: 3
                      }
                    }
                    root_reference {
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                        field: 4
                      }
                    }
                    root_reference {
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                        field: 5
                      }
                    }
                    root_reference {
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                        field: 6
                      }
                    }
                    root_reference {
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                        field: 7
                      }
                    }
                    root_reference {
                    }
                  }
                }
                expressions {
                  selection {
                    direct_reference {
                      struct_field {
                        field: 8
                      }
                    }
                    root_reference {
                    }
                  }
                }
              }
            }
            groupings {
              grouping_expressions {
                selection {
                  direct_reference {
                    struct_field {
                      field: 3
                    }
                  }
                  root_reference {
                  }
                }
              }
            }
            measures {
              measure {
                function_reference: 2
                phase: AGGREGATION_PHASE_INITIAL_TO_RESULT
                output_type {
                  i64 {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                invocation: AGGREGATION_INVOCATION_ALL
              }
            }
          }
        }
      }
      

      The error:

      Error: NotImplemented: Only unary aggregate functions are currently supported
      /home/nic2/arrow/cpp/src/arrow/engine/substrait/relation_internal.cc:587  converter(aggregate_call)
      /home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:153  FromProto(plan_rel.has_root() ? plan_rel.root().input() : plan_rel.rel(), ext_set, conversion_options)
      

      I have no idea what the "phase" and "invocation" fields above do, but previous attempts to get Acero to consume this plan led to errors due to me using default values instead of the ones specified there (e.g. "Not Implemented: Unsupported aggregation phase 'AGGREGATION_PHASE_UNSPECIFIED'"), so I just changed them to see if it helped.

      Attachments

        Issue Links

          Activity

            People

              felipeo Felipe Oliveira
              thisisnic Nicola Crane
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 6h 50m
                  6h 50m