CouchDB / COUCHDB-403

User-defined GroupRowsFun

Details

    • Type: Wish
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Committers Level (Medium to Hard)

    Description

      CouchDB has hard-coded functionality for grouping. From the user's point of view: group_level=N will truncate Array keys to the first N elements, and that's it.
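
For reference, the existing behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not CouchDB's actual code; the function name truncate_key is made up, and leaving non-array keys untouched is an assumption.

```python
def truncate_key(key, group_level):
    """Truncate an array key to its first `group_level` elements.

    Assumption: non-array keys are passed through unchanged.
    """
    if isinstance(key, list):
        return key[:group_level]
    return key


# Three hypothetical [year, month, day] keys grouped at level 2.
keys = [["2009", "06", "01"], ["2009", "06", "15"], ["2009", "07", "02"]]
truncated = [truncate_key(k, 2) for k in keys]
```

With group_level=2 the first two keys collapse to the same group key, which is all the current mechanism can express.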

      It would be wonderful if application-specific grouping functions could be added. Useful examples include:

      • for string keys, truncate to the first N characters (e.g. group by first 3 letters of surname)
      • for numeric keys, trunc(k/N) (e.g. dividing by 100 gives buckets of 0..99, 100..199, 200..299 etc)
      • combine with group_level: e.g. truncate array to first two elements plus the third element divided by 100

      ["string1","string2",Number,"rest"] => ["string1","string2",trunc(Number/100)]

      • for numeric keys: use trunc(log(V) * N) for exponential buckets
      • for hexadecimal-string keys: right-shift N places
      • ...etc
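
The examples in the list above could look something like this, sketched in Python for brevity rather than Erlang. All function names are hypothetical. "Right-shift N places" is interpreted here as dropping the last N hex digits, which is an assumption; shifting by N bits would be a different function.

```python
import math


def prefix(key, n):
    # String key: keep the first n characters.
    return key[:n]


def bucket(key, n):
    # Numeric key: trunc(k/N), e.g. n=100 maps 100..199 to 1.
    return math.trunc(key / n)


def array_bucket(key, n):
    # ["string1", "string2", Number, "rest"]
    #   => ["string1", "string2", trunc(Number/n)]
    return key[:2] + [math.trunc(key[2] / n)]


def log_bucket(key, n):
    # Numeric key (must be > 0): exponential buckets via trunc(log(V) * N).
    return math.trunc(math.log(key) * n)


def hex_shift(key, n):
    # Hex-string key: drop the last n hex digits (assumed meaning of
    # "right-shift N places").
    return key[:-n] if n > 0 else key
```

In every case n plays the role that group_level plays today.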

      In each case N would be a parameter chosen at query time, like group_level is now.

      It would be sufficient just to have a hook to statically link Erlang functions to do this. There would then need to be two new HTTP parameters: one to choose the grouping function and one for any arguments it needs.
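
As a rough sketch of what such a hook might look like, again in Python rather than Erlang: a fixed table of named functions, selected by two query parameters. The parameter names group_fun and group_arg are invented for illustration and do not exist in CouchDB; the argument is assumed to be a single integer.

```python
import math
from urllib.parse import parse_qs

# Statically registered grouping functions (hypothetical names).
GROUP_FUNS = {
    "prefix": lambda key, n: key[:n],
    "bucket": lambda key, n: math.trunc(key / n),
}


def resolve_group_fun(query_string):
    """Return a one-argument truncation function chosen by the query.

    Assumes both hypothetical parameters are present and that
    group_arg is an integer.
    """
    params = parse_qs(query_string)
    fun = GROUP_FUNS[params["group_fun"][0]]
    arg = int(params["group_arg"][0])
    return lambda key: fun(key, arg)


by_surname = resolve_group_fun("group_fun=prefix&group_arg=3")
by_hundreds = resolve_group_fun("group_fun=bucket&group_arg=100")
```

An unknown function name raises KeyError here; a real server would presumably turn that into a 400 response.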

      Theoretically this function could also be handed off to the external view server so the logic could be written in JavaScript or whatever, but I think it would be too slow in practice.

      Note: group truncation functions would need to meet certain constraints to work with the grouping logic. Something like:
      K1 <= K2 implies grouptrunc(K1) <= grouptrunc(K2)
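
A candidate function could be spot-checked against this constraint on sample keys, along these lines. This is a sketch only: it uses Python's own ordering as a stand-in for CouchDB's view collation, which is an assumption, and the helper name is made up.

```python
def is_order_preserving(grouptrunc, sample_keys):
    """Check K1 <= K2 implies grouptrunc(K1) <= grouptrunc(K2) on samples.

    Because <= is transitive, comparing adjacent keys in sorted order
    is enough to cover every pair in the sample.
    """
    ordered = sorted(sample_keys)
    truncated = [grouptrunc(k) for k in ordered]
    return all(a <= b for a, b in zip(truncated, truncated[1:]))


samples = [50, 150, 210]
ok = is_order_preserving(lambda k: k // 100, samples)   # buckets 0, 1, 2
bad = is_order_preserving(lambda k: k % 100, samples)   # 50, 50, 10
```

Passing on a sample does not prove the property in general, but a failure (as with the modulo function) shows that the function would scatter one group across non-adjacent rows.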

      The current implementation is not structured exactly like that. As far as I can see, there's one function that compares keys for equality by looking at the first N elements (GroupRowsFun), and another function that truncates them when emitting them (RespFun). For adding bolt-on functions it would be more convenient just to define a single group key truncation function.
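
To illustrate why one truncation function would be enough, here is a sketch in Python of deriving both pieces from it. The names group_rows_fun and resp_fun only mirror GroupRowsFun and RespFun; their signatures here are guesses, not the real Erlang ones.

```python
def make_group_funs(grouptrunc):
    """Derive an equality test and an emit function from one truncator."""
    def group_rows_fun(key_a, key_b):
        # Two rows belong to the same group if their truncated keys match.
        return grouptrunc(key_a) == grouptrunc(key_b)

    def resp_fun(key):
        # The key reported for a group is the truncated key.
        return grouptrunc(key)

    return group_rows_fun, resp_fun


def group_rows(rows, grouptrunc, reduce_fun):
    """Group (key, value) rows, already sorted by key, and reduce each group."""
    group_rows_fun, resp_fun = make_group_funs(grouptrunc)
    out, current = [], []
    for key, value in rows:
        # Close the current group when the next key no longer matches it.
        if current and not group_rows_fun(current[0][0], key):
            out.append((resp_fun(current[0][0]), reduce_fun([v for _, v in current])))
            current = []
        current.append((key, value))
    if current:
        out.append((resp_fun(current[0][0]), reduce_fun([v for _, v in current])))
    return out


rows = [(5, 1), (42, 1), (120, 1), (199, 1), (250, 1)]
result = group_rows(rows, lambda k: k // 100, sum)
```

Here the rows fall into buckets 0, 1 and 2 with counts 2, 2 and 1.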

          People

            Assignee: Unassigned
            Reporter: Brian Candler (candlerb)