  1. Apache Drill
  2. DRILL-3462

There appears to be no way to have complex intermediate state

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      After spending several frustrating days on the problem (see also DRILL-3461), it appears that there is no viable idiom for building an aggregator that has internal state that is anything more than a scalar.

      What is needed is:

      1) The ability to allocate a Repeated* type for use in a Workspace variable. Currently, using new works to get the basic structure, but there is no good way to allocate the corresponding vector.

      2) The ability to use and to allocate a ComplexWriter in the Workspace variables.

      3) The ability to write a UDAF that supports multi-phase aggregation. It would be just fine if I simply have to write a combine method on my UDAF class. I don't think that there is any way to infer such a combiner from the parameters and workspace variables. An alternative API would be to have a form of the output function that is given an Iterable<OutputClass>, but that is probably much less efficient than simply having a combine method that is called repeatedly.
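
      As an illustration of item 3, here is a minimal sketch in plain Java of an aggregator with non-scalar intermediate state and an explicit combine method. It has no Drill dependencies: the CollectingAggregator class and its add, combine, and output methods are hypothetical names chosen for this example and are not part of Drill's UDAF API. The list stands in for the kind of Repeated* workspace state described in item 1.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical aggregator shape; none of these names exist in Drill. */
final class CollectingAggregator {
    // Non-scalar intermediate state, standing in for a Repeated* workspace value.
    private final List<Long> values = new ArrayList<>();

    /** First phase: called once per input row within a single partial aggregation. */
    void add(long value) {
        values.add(value);
    }

    /** Merge phase: called repeatedly to fold another partial state into this one. */
    void combine(CollectingAggregator other) {
        values.addAll(other.values);
    }

    /** Final phase: emit a copy of the accumulated state as the result. */
    List<Long> output() {
        return new ArrayList<>(values);
    }
}

public class TwoPhaseDemo {
    public static void main(String[] args) {
        // Two partial aggregations, as if run on separate fragments.
        CollectingAggregator a = new CollectingAggregator();
        CollectingAggregator b = new CollectingAggregator();
        a.add(1);
        a.add(2);
        b.add(3);

        // The merge step folds b's partial state into a.
        a.combine(b);
        System.out.println(a.output()); // prints [1, 2, 3]
    }
}
```

      The point of the sketch is only that combine needs access to the full intermediate state of both sides, which is why it seems unlikely that it could be inferred from the parameter and workspace declarations alone.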

          People

            Assignee: Unassigned
            Reporter: Ted Dunning (tdunning)