Solr / SOLR-1023

StatsComponent should support dates (and other non-numeric fields)

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: None
    • Labels:
      None
    • Environment:

      Mac OS 10.5, java version "1.5.0_16"

      Description

      Currently, the StatsComponent only supports single-value numeric fields:

      http://wiki.apache.org/solr/StatsComponent

      Trying to use it with a date field, I get an exception like: java.lang.NumberFormatException: For input string: "2009-01-27T20:04:04Z"

      Trying to use it with a string field, I get an error 400: "Stats are valid for single valued numeric values."
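
The exception above is what one would expect when a date value is handed to a numeric parser. The following is a minimal Java sketch of that failure mode, not Solr's own code: the class name DateParseDemo and the helper parsesAsDouble are made up for illustration, and java.time.Instant requires Java 8 or later (newer than the environment listed above).

```java
import java.time.Instant;

// Hypothetical demo class; not part of Solr.
class DateParseDemo {

    // Returns true if the text can be parsed as a double, false otherwise.
    static boolean parsesAsDouble(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String value = "2009-01-27T20:04:04Z";

        // A numeric parser rejects the ISO-8601 timestamp.
        System.out.println("parses as double: " + parsesAsDouble(value));

        // A date-aware parser accepts it and yields a comparable value.
        Instant instant = Instant.parse(value);
        System.out.println("epoch millis: " + instant.toEpochMilli());
    }
}
```

Once the value is parsed as a date rather than a number, it can be ordered like any other comparable value.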

      For constructing date facets it would be very useful to be able to get the minimum and maximum date from a DateField within a set of documents. In general, it could be useful to get the minimum and maximum from any field type that can be compared, though that's of less importance.
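
To make the request concrete, here is a rough Java sketch of minimum and maximum over any comparable type. It is only an illustration of the idea, not a proposed implementation: the class MinMax and its methods are hypothetical names, and the sample dates are invented.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; illustrates the request, not any actual patch.
class MinMax<T extends Comparable<? super T>> {
    T min;   // stays null until a non-null value is seen
    T max;

    // Folds one value into the running minimum and maximum.
    void accumulate(T value) {
        if (value == null) {
            return;   // documents without a value are skipped
        }
        if (min == null || value.compareTo(min) < 0) {
            min = value;
        }
        if (max == null || value.compareTo(max) > 0) {
            max = value;
        }
    }

    // Convenience factory that scans a whole list of values.
    static <T extends Comparable<? super T>> MinMax<T> of(List<T> values) {
        MinMax<T> result = new MinMax<>();
        for (T v : values) {
            result.accumulate(v);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Instant> dates = Arrays.asList(
            Instant.parse("2009-01-27T20:04:04Z"),
            Instant.parse("2008-06-01T00:00:00Z"),
            Instant.parse("2009-03-15T12:30:00Z"));
        MinMax<Instant> range = MinMax.of(dates);
        System.out.println(range.min + " .. " + range.max);
    }
}
```

The same class works unchanged for strings, since String is also Comparable; that is the sense in which the feature could extend to any field type that can be compared.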

      1. SOLR-1023-CHANGES.TXT.branch_3x.patch
        0.5 kB
        Gunnlaugur Thor Briem
      2. SOLR-1023-CHANGES.TXT.branch_3x.patch
        0.5 kB
        Gunnlaugur Thor Briem
      3. SOLR-1023-CHANGES.TXT.trunk.patch
        0.5 kB
        Gunnlaugur Thor Briem
      4. SOLR-1023-against-branch_3x.svn.patch
        33 kB
        Gunnlaugur Thor Briem
      5. SOLR-1023.patch
        34 kB
        Ryan McKinley
      6. SOLR-1023.patch
        33 kB
        Ryan McKinley
      7. SOLR-1023.patch
        33 kB
        Gunnlaugur Thor Briem
      8. SOLR-1023-against-branch_3x.svn.patch
        32 kB
        Gunnlaugur Thor Briem
      9. SOLR-1023-against-lucene_3_4_0.patch
        32 kB
        Gunnlaugur Thor Briem
      10. stats-component-path-nightly-2009-10-08.patch
        25 kB
        Mark Holland
      11. SOLR-1023.patch
        30 kB
        Chris Male

          Activity

          Shalin Shekhar Mangar added a comment -

          Marking for 1.5 because although it is useful, there is no patch yet.

          Rafał Kuć added a comment -

          I should be able to supply a patch adding date support for StatsComponent, and for string fields too. What I am wondering is which other statistics, apart from minimum and maximum, would be useful? I'm thinking about count and missing. Any other ideas?

          Chris Male added a comment -

          I have attached a patch that adds support for String and Date fields. To support these I have also made some improvements in the underlying architecture so that it is more extensible. It is now possible to easily add statistics for other field types if desired in the future.

          I have also updated the test class to include tests for String and Date fields.

          Mark Holland added a comment -

          If anyone is interested I've attached a patch that patches against nightly 2009-10-08.

          Peter Wolanin added a comment -

          Thanks Mark - I'm disappointed that this didn't get into 1.4, but will try the patch.

          Hoss Man added a comment -

          Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

          http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

          Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. Email notifications were suppressed.

          A unique token for finding these 240 issues in the future: hossversioncleanup20100527

          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Robert Muir added a comment -

          3.4 -> 3.5

          Gunnlaugur Thor Briem added a comment -

          Updated, fixed and cleaned-up patch against 3.4.0 (applies cleanly against branch_3x and trunk as well). All tests pass. Can we get this in?

          Gunnlaugur Thor Briem added a comment -

          Fixing filename (mixup in the issue ID)

          Ryan McKinley added a comment -

          Gunnlaugur, can you post an svn patch for trunk?

          I can try to sort out the git patch if not....

          Gunnlaugur Thor Briem added a comment -

          That took a bit of conflict resolution (I was quite wrong about the above patch applying cleanly to trunk), but here it is.

          Gunnlaugur Thor Briem added a comment -

          ... and against svn branch_3x

          Gunnlaugur Thor Briem added a comment -

          Patches with correctly formatted names this time (sorry)

          Ryan McKinley added a comment -

          This is an updated patch that uses BytesRef and FieldType.toObject() rather than Strings and internal conversion.

          Ryan McKinley added a comment -

          Sorry, I attached the wrong file.

          Ryan McKinley added a comment -

          Hmm, I just realized that the BytesRef improvements will not work in 3x because SchemaField does not expose toObject().

          I'd like to commit to trunk with the BytesRef improvement, then apply the Strings version to 3.x. This would not be a normal merge though, so I don't know what people think about that...

          Ryan McKinley added a comment -

          I committed this to trunk in #1201855

          I am unable to get things to merge with 3.x – anyone want to take a stab at that?

          Simon Willnauer added a comment -

          "I am unable to get things to merge with 3.x – anyone want to take a stab at that?"

          What's the problem, Ryan?

          Ryan McKinley added a comment -

          I ran:

          ryan@xps /cygdrive/c/workspace/apache/lucene-3x/solr
          $ svn merge -c 1201855 https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/
          --- Merging r1201855 into '.':
          U    core\src\test\org\apache\solr\handler\component\StatsComponentTest.java
          Conflict discovered in 'core/src/java/org/apache/solr/request/UnInvertedField.java'.
          Select: (p) postpone, (df) diff-full, (e) edit,
                  (mc) mine-conflict, (tc) theirs-conflict,
                  (s) show all options: p
          C    core\src\java\org\apache\solr\request\UnInvertedField.java
          Conflict discovered in 'core/src/java/org/apache/solr/handler/component/FieldFacetStats.java'.
          Select: (p) postpone, (df) diff-full, (e) edit,
                  (mc) mine-conflict, (tc) theirs-conflict,
                  (s) show all options: p
          C    core\src\java\org\apache\solr\handler\component\FieldFacetStats.java
          A    core\src\java\org\apache\solr\handler\component\StatsValuesFactory.java
          U    core\src\java\org\apache\solr\handler\component\StatsValues.java
          Conflict discovered in 'core/src/java/org/apache/solr/handler/component/StatsComponent.java'.
          Select: (p) postpone, (df) diff-full, (e) edit,
                  (mc) mine-conflict, (tc) theirs-conflict,
                  (s) show all options: p
          C    core\src\java\org\apache\solr\handler\component\StatsComponent.java
          Summary of conflicts:
            Text conflicts: 3
          

          but resolving the conflicts is more than I'm up for at the moment.

          Simon Willnauer added a comment -

          "but resolving the conflicts is more than I'm up for at the moment."

          LOL - I will see if I can do it tomorrow...

          Hoss Man added a comment -

          I'm confused: if we have a patch against trunk (already committed) and we have a patch against 3x (attached), can't we just apply the 3x patch to the 3x branch and do a props-only merge to record it?

          Or is the logic in the two patches fundamentally different?

          Gunnlaugur Thor Briem added a comment (edited) -

          No, pretty similar.

          In my two patches the only difference was that the trunk version used BytesRef where the branch_3x one used String.

          Here are the changes Ryan made from my trunk patch:

          (a) started passing around SchemaField rather than FieldType, I think specifically in order to call ft.toObject(sf, value) in StatsValuesFactory. But that toObject method isn't available in branch_3x as he pointed out, so the whole FieldType to SchemaField change is probably not needed there. Much of the diff stems from this.

          (b) generalized class DoubleStatsValues extends AbstractStatsValues<Double> to class NumericStatsValues extends AbstractStatsValues<Number>

          (c) made DateStatsValues extend AbstractStatsValues<Date> (as it should) instead of AbstractStatsValues<String>, with the simplifications that allowed.

          (d) fixed some indentation bloopers from me.

          I've updated my branch_3x SVN patch with changes (b), (c) and (d), attaching.
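
For readers who have not opened the patches, the shape described in (b) and (c) can be sketched roughly as follows. This is a simplified, hypothetical illustration and not the committed code: every class name ending in Sketch, the compare method, and the choice of tracked statistics are assumptions made for this example.

```java
import java.util.Date;

// Hypothetical stand-in for a generic base class such as AbstractStatsValues<T>.
abstract class StatsSketch<T> {
    long count;     // number of non-null values seen
    long missing;   // number of null (absent) values seen
    T min;
    T max;

    // Subclasses decide how two values of the field type are ordered.
    abstract int compare(T a, T b);

    void accumulate(T value) {
        if (value == null) {
            missing++;
            return;
        }
        count++;
        if (min == null || compare(value, min) < 0) {
            min = value;
        }
        if (max == null || compare(value, max) > 0) {
            max = value;
        }
    }
}

// Numeric variant typed on Number, so any numeric type can be fed in.
class NumericStatsSketch extends StatsSketch<Number> {
    double sum;

    @Override
    int compare(Number a, Number b) {
        return Double.compare(a.doubleValue(), b.doubleValue());
    }

    @Override
    void accumulate(Number value) {
        super.accumulate(value);
        if (value != null) {
            sum += value.doubleValue();
        }
    }
}

// Date variant typed on Date rather than String, so no string conversion is needed.
class DateStatsSketch extends StatsSketch<Date> {
    @Override
    int compare(Date a, Date b) {
        return a.compareTo(b);
    }
}

class StatsSketchDemo {
    public static void main(String[] args) {
        NumericStatsSketch numbers = new NumericStatsSketch();
        numbers.accumulate(3);      // boxed as Integer
        numbers.accumulate(1.5);    // boxed as Double
        numbers.accumulate(null);   // counted as missing
        System.out.println(numbers.min + " " + numbers.max + " " + numbers.count + " " + numbers.missing);
    }
}
```

Typing the date variant on Date means the comparison is done on the date values themselves, which is the simplification change (c) refers to.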

          Gunnlaugur Thor Briem added a comment -

          Updated patch against branch_3x SVN, matching Ryan's trunk patch except for FieldType.toObject usage and affiliated FieldType->SchemaField changes

          Gunnlaugur Thor Briem added a comment -

          Gah, I did it again! (.diff extension instead of .patch, this time uploading .patch)

          Simon Willnauer added a comment -

          Backported to 3.x in revision 1202104.

          Gunnlaugur Thor Briem added a comment -

          We all forgot to add entries in CHANGES.TXT in the patch, see attached for trunk and branch_3x.

          Gunnlaugur Thor Briem added a comment -

          (minor) Fix branch_3x CHANGES.TXT patch to be from root

          Simon Willnauer added a comment -

          Added all CHANGES entries, thanks Gunnlaugur.

          Ryan McKinley added a comment -

          thanks Gunnlaugur and Simon!

          Uwe Schindler added a comment -

          Bulk close after 3.5 is released


            People

            • Assignee:
              Ryan McKinley
              Reporter:
              Peter Wolanin
            • Votes:
              3
              Watchers:
              5
