Solr / SOLR-5972

Add new statistics facet capabilities to StatsComponent facet: limit, sort and missing.

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved

    Description

      I thought it would be very useful to be able to limit and sort the StatsComponent facet response.
      I chose to implement this in StatsComponent rather than in the Analytics component because Analytics doesn't support distributed queries yet.

      The default for limit is -1, which returns all facet values.
      The default for sort is no sorting.
      The default for missing is true.
      So if you use the stats component exactly as before, the response is unchanged from the current behavior.
      If you ask for sort or limit, the missing facet value comes last, as in regular faceting.
      Supported sort types: min, max, sum and countdistinct for stats fields, and count and index for facet fields (all sort type names are lowercase).
      Sort directions asc and desc are supported.
      Sorting by multiple fields is supported.
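
      The rules above can be illustrated with a small in-memory sketch. This is not the code from the attached patch; the function name order_facets, the dictionary model of facet buckets, and the use of None for the missing bucket are all hypothetical. It also assumes that "no sorting" means the buckets keep their incoming order and that the missing bucket does not count against the limit, neither of which is stated in this issue.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: facet value -> {stat name -> value}.
# The key None stands for the "missing" bucket.
Bucket = Dict[str, float]


def order_facets(
    buckets: Dict[Optional[str], Bucket],
    sort: List[Tuple[str, str]],  # e.g. [("sum", "desc"), ("max", "asc")]
    limit: int = -1,              # -1 returns all facet values
    missing: bool = True,         # include the missing bucket
) -> List[Optional[str]]:
    named = [k for k in buckets if k is not None]
    # Apply the sort keys from least to most significant. Python's sort is
    # stable (also with reverse=True), so earlier keys in `sort` win ties.
    for stat, direction in reversed(sort):
        descending = direction == "desc"
        if stat == "index":
            named.sort(reverse=descending)  # lexicographic by facet value
        else:
            named.sort(key=lambda k: buckets[k][stat], reverse=descending)
    if limit >= 0:
        named = named[:limit]
    # Assumption: the missing bucket is appended last and is not limited.
    if missing and None in buckets:
        named.append(None)
    return named


buckets = {
    "bob": {"sum": 300, "max": 200},
    "alice": {"sum": 500, "max": 300},
    None: {"sum": 50, "max": 50},
    "carol": {"sum": 300, "max": 250},
}
top_two = order_facets(buckets, [("sum", "desc")], limit=2)
by_sum_then_max = order_facets(buckets, [("sum", "desc"), ("max", "desc")])
```

      Here top_two holds the two largest sums followed by the missing bucket, and by_sum_then_max breaks the tie between the two equal sums using max.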

      Our example use case is employees' monthly salaries:

      The following query returns the 10 most "expensive" employees:
      "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum desc&f.employee_name.stats.facet.limit=10"
      The following query returns the 10 least "expensive" employees:
      "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum asc&f.employee_name.stats.facet.limit=10"
      The following query returns the employee who received the highest salary ever:
      "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary max desc&f.employee_name.stats.facet.limit=1"
      The following query returns the employee who received the lowest salary ever:
      "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary min asc&f.employee_name.stats.facet.limit=1"
      The following query returns the first 10 employees in lexicographic order:
      "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name index asc&f.employee_name.stats.facet.limit=10"
      The following query returns the 10 employees who have worked for the longest period (the most monthly salary records):
      "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name count desc&f.employee_name.stats.facet.limit=10"
      The following query returns the 10 employees whose salaries vary the most (the most distinct salary values):
      "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary countdistinct desc&f.employee_name.stats.facet.limit=10"
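
      Since the sort value contains spaces, a client would normally URL-encode these queries. A minimal sketch of building one with the Python standard library follows. The helper name stats_facet_query is hypothetical, and the f.<field>.stats.facet.sort and f.<field>.stats.facet.limit parameters are the ones proposed in this issue, so they only take effect on a Solr build with the patch applied.

```python
from urllib.parse import parse_qs, urlencode


def stats_facet_query(stats_field, facet_field, sort, limit=-1, q="*:*"):
    """Build an encoded query string for a sorted, limited stats facet."""
    prefix = f"f.{facet_field}.stats.facet"
    params = [
        ("q", q),
        ("stats", "true"),
        ("stats.field", stats_field),
        ("stats.facet", facet_field),
        (f"{prefix}.sort", sort),        # e.g. "salary sum desc"
        (f"{prefix}.limit", str(limit)),  # -1 means no limit
    ]
    return urlencode(params)


# The 10 most "expensive" employees from the examples above.
query = stats_facet_query("salary", "employee_name", "salary sum desc", limit=10)
decoded = parse_qs(query)
```

      The resulting string can be appended to the select handler URL of whichever core or collection holds the salary documents.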

      I have attached a patch implementing this in StatsComponent.

      Attachments

        1. SOLR-5972_multivalue_docvalue.patch
          2 kB
          Lyubov Romanchuk
        2. SOLR-5972.patch
          40 kB
          Elran Dvir
        3. SOLR-5972.patch
          40 kB
          Elran Dvir

            People

              Assignee: Unassigned
              Reporter: Elran Dvir (elrand)
              Votes: 0
              Watchers: 5