SOLR-1023: StatsComponent should support dates (and other non-numeric fields)

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: None
    • Labels: None
    • Environment: Mac OS 10.5, java version "1.5.0_16"

    Description

      Currently, the StatsComponent only supports single-value numeric fields:

      http://wiki.apache.org/solr/StatsComponent

      Trying to use it with a date field, I get an exception like: java.lang.NumberFormatException: For input string: "2009-01-27T20:04:04Z"

      Trying to use it with a string field, I get an error 400: "Stats are valid for single valued numeric values."

      For constructing date facets it would be very useful to be able to get the minimum and maximum date from a DateField within a set of documents. In general, it could be useful to get the minimum and maximum from any field type that can be compared, though that's of less importance.
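
      As an illustration of the request above (not Solr code), a minimal Java sketch of tracking the minimum and maximum over any comparable type might look like the following. The class name MinMaxSketch and its methods are hypothetical, and java.time.Instant is only a stand-in for a DateField value.

```java
import java.time.Instant;

/** Hypothetical helper, not part of Solr: tracks min and max for any Comparable type. */
public class MinMaxSketch<T extends Comparable<? super T>> {
    private T min;
    private T max;

    /** Folds one value into the running min/max; null values are ignored. */
    public void accumulate(T value) {
        if (value == null) {
            return;
        }
        if (min == null || value.compareTo(min) < 0) {
            min = value;
        }
        if (max == null || value.compareTo(max) > 0) {
            max = value;
        }
    }

    public T getMin() {
        return min;
    }

    public T getMax() {
        return max;
    }

    public static void main(String[] args) {
        // Sample values in the same ISO-8601 form as the date in the exception above.
        String[] raw = {
            "2009-01-27T20:04:04Z",
            "2008-12-01T00:00:00Z",
            "2009-03-15T12:30:00Z"
        };
        MinMaxSketch<Instant> dates = new MinMaxSketch<>();
        for (String s : raw) {
            dates.accumulate(Instant.parse(s));
        }
        System.out.println("min=" + dates.getMin() + " max=" + dates.getMax());
    }
}
```

      The same class works unchanged for String values, since only compareTo is required.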

      Attachments

        1. stats-component-path-nightly-2009-10-08.patch
          25 kB
          Mark Holland
        2. SOLR-1023-CHANGES.TXT.trunk.patch
          0.5 kB
          Gunnlaugur Thor Briem
        3. SOLR-1023-CHANGES.TXT.branch_3x.patch
          0.5 kB
          Gunnlaugur Thor Briem
        4. SOLR-1023-CHANGES.TXT.branch_3x.patch
          0.5 kB
          Gunnlaugur Thor Briem
        5. SOLR-1023-against-lucene_3_4_0.patch
          32 kB
          Gunnlaugur Thor Briem
        6. SOLR-1023-against-branch_3x.svn.patch
          32 kB
          Gunnlaugur Thor Briem
        7. SOLR-1023-against-branch_3x.svn.patch
          33 kB
          Gunnlaugur Thor Briem
        8. SOLR-1023.patch
          30 kB
          Chris Male
        9. SOLR-1023.patch
          33 kB
          Gunnlaugur Thor Briem
        10. SOLR-1023.patch
          33 kB
          Ryan McKinley
        11. SOLR-1023.patch
          34 kB
          Ryan McKinley

          Activity

            shalin Shalin Shekhar Mangar added a comment -

            Marking for 1.5 because, although it is useful, there is no patch yet.

            gro Rafał Kuć added a comment -

            I should be able to supply a patch adding date support for StatsComponent, and for string fields too. What I am wondering is which other statistics, apart from minimum and maximum, would be useful. I'm thinking about count and missing. Any other ideas?
            cmale Chris Male added a comment -

            I have attached a patch that adds support for String and Date fields. To support these I have also made some improvements in the underlying architecture so that it is more extensible. It is now possible to easily add statistics for other field types if desired in the future.

            I have also updated the test class to include tests for String and Date fields.

            evilmango Mark Holland added a comment -

            If anyone is interested I've attached a patch that patches against nightly 2009-10-08.

            pwolanin Peter Wolanin added a comment -

            Thanks Mark - I'm disappointed that this didn't get into 1.4, but will try the patch.


            hossman Chris M. Hostetter added a comment -

            Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

            http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

            Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. Email notifications were suppressed.

            A unique token for finding these 240 issues in the future: hossversioncleanup20100527

            rcmuir Robert Muir added a comment -

            Bulk move 3.2 -> 3.3

            rcmuir Robert Muir added a comment -

            3.4 -> 3.5


            gthb Gunnlaugur Thor Briem added a comment -

            Updated, fixed and cleaned-up patch against 3.4.0 (applies cleanly against branch_3x and trunk as well). All tests pass. Can we get this in?

            gthb Gunnlaugur Thor Briem added a comment -

            Fixing filename (mixup in the issue ID)

            ryantxu Ryan McKinley added a comment -

            Gunnlaugur, can you post an svn patch for trunk?

            I can try to sort out the git patch if not....


            gthb Gunnlaugur Thor Briem added a comment -

            That took a bit of conflict resolution (I was quite wrong about the above patch applying cleanly to trunk), but here it is.

            gthb Gunnlaugur Thor Briem added a comment -

            ... and against svn branch_3x

            gthb Gunnlaugur Thor Briem added a comment -

            Patches with correctly formatted names this time (sorry)

            ryantxu Ryan McKinley added a comment -

            this is an updated patch that uses BytesRef and FieldType.toObject() rather than Strings and internal conversion

            ryantxu Ryan McKinley added a comment -

            sorry attached the wrong file

            ryantxu Ryan McKinley added a comment -

            hmm – just realized that the BytesRef improvements will not work in 3x because SchemaField does not expose toObject()

            I'd like to commit to trunk with the BytesRef improvement, then apply the Strings version to 3.x – this would not be a normal merge though, so I don't know what people think about that...

            ryantxu Ryan McKinley added a comment -

            I committed this to trunk in #1201855

            I am unable to get things to merge with 3.x – anyone want to take a stab at that?


            simonw Simon Willnauer added a comment -

            "I am unable to get things to merge with 3.x – anyone want to take a stab at that?"

            what's the problem ryan?

            ryantxu Ryan McKinley added a comment -

            I ran:

            ryan@xps /cygdrive/c/workspace/apache/lucene-3x/solr
            $ svn merge -c 1201855 https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/
            --- Merging r1201855 into '.':
            U    core\src\test\org\apache\solr\handler\component\StatsComponentTest.java
            Conflict discovered in 'core/src/java/org/apache/solr/request/UnInvertedField.java'.
            Select: (p) postpone, (df) diff-full, (e) edit,
                    (mc) mine-conflict, (tc) theirs-conflict,
                    (s) show all options: p
            C    core\src\java\org\apache\solr\request\UnInvertedField.java
            Conflict discovered in 'core/src/java/org/apache/solr/handler/component/FieldFacetStats.java'.
            Select: (p) postpone, (df) diff-full, (e) edit,
                    (mc) mine-conflict, (tc) theirs-conflict,
                    (s) show all options: p
            C    core\src\java\org\apache\solr\handler\component\FieldFacetStats.java
            A    core\src\java\org\apache\solr\handler\component\StatsValuesFactory.java
            U    core\src\java\org\apache\solr\handler\component\StatsValues.java
            Conflict discovered in 'core/src/java/org/apache/solr/handler/component/StatsComponent.java'.
            Select: (p) postpone, (df) diff-full, (e) edit,
                    (mc) mine-conflict, (tc) theirs-conflict,
                    (s) show all options: p
            C    core\src\java\org\apache\solr\handler\component\StatsComponent.java
            Summary of conflicts:
              Text conflicts: 3
            

            but resolving the conflicts is more than I'm up for at the moment.

            simonw Simon Willnauer added a comment -

            "but resolving the conflicts is more than I'm up for at the moment."

            LOL - I will see if I can do it tomorrow...

            hossman Chris M. Hostetter added a comment -

            I'm confused: if we have a patch against trunk (already committed) and we have a patch against 3x (attached), can't we just apply the 3x patch to the 3x branch and do a props-only merge to record it?

            or is the logic in the two patches fundamentally different?

            gthb Gunnlaugur Thor Briem added a comment - - edited

            No, pretty similar.

            In my two patches the only difference was that the trunk version used BytesRef where the branch_3x one used String.

            Here are the changes Ryan made from my trunk patch:

            (a) started passing around SchemaField rather than FieldType, I think specifically in order to call ft.toObject(sf, value) in StatsValuesFactory. But that toObject method isn't available in branch_3x as he pointed out, so the whole FieldType to SchemaField change is probably not needed there. Much of the diff stems from this.

            (b) generalized class DoubleStatsValues extends AbstractStatsValues<Double> to class NumericStatsValues extends AbstractStatsValues<Number>

            (c) made DateStatsValues extend AbstractStatsValues<Date> (as it should) instead of AbstractStatsValues<String>, with the simplifications that allowed.

            (d) fixed some indentation bloopers from me.

            I've updated my branch_3x SVN patch with changes (b), (c) and (d), attaching.
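
            To make the shape of changes (b) and (c) concrete, here is a rough, self-contained Java sketch. Only the class names and type parameters (AbstractStatsValues<T>, NumericStatsValues, DateStatsValues) come from the comment above; the fields, the compare hook, the sum field and the enclosing StatsValuesSketch class are assumptions made for illustration and do not reproduce the committed Solr code.

```java
import java.util.Date;

/** Illustrative sketch only; the internals are assumed and are not the Solr implementation. */
public class StatsValuesSketch {

    /** Shared min/max/count bookkeeping, parameterized on the value type. */
    public abstract static class AbstractStatsValues<T> {
        protected T min;
        protected T max;
        protected long count;

        public void accumulate(T value) {
            count++;
            if (min == null || compare(value, min) < 0) {
                min = value;
            }
            if (max == null || compare(value, max) > 0) {
                max = value;
            }
        }

        /** Assumed hook: each subclass decides how two values are ordered. */
        protected abstract int compare(T a, T b);

        public T getMin() { return min; }
        public T getMax() { return max; }
        public long getCount() { return count; }
    }

    /** (b): one class for all Number subtypes instead of a Double-only class. */
    public static class NumericStatsValues extends AbstractStatsValues<Number> {
        private double sum; // assumed extra statistic for numeric fields

        @Override
        public void accumulate(Number value) {
            super.accumulate(value);
            sum += value.doubleValue();
        }

        @Override
        protected int compare(Number a, Number b) {
            return Double.compare(a.doubleValue(), b.doubleValue());
        }

        public double getSum() { return sum; }
    }

    /** (c): dates are handled as Date objects rather than as Strings. */
    public static class DateStatsValues extends AbstractStatsValues<Date> {
        @Override
        protected int compare(Date a, Date b) {
            return a.compareTo(b);
        }
    }

    public static void main(String[] args) {
        NumericStatsValues numbers = new NumericStatsValues();
        numbers.accumulate(3);
        numbers.accumulate(1.5);
        numbers.accumulate(7L);
        System.out.println("numeric min=" + numbers.getMin() + " max=" + numbers.getMax());

        DateStatsValues dates = new DateStatsValues();
        dates.accumulate(new Date(2000L));
        dates.accumulate(new Date(1000L));
        System.out.println("date min=" + dates.getMin().getTime() + " max=" + dates.getMax().getTime());
    }
}
```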


            gthb Gunnlaugur Thor Briem added a comment -

            Updated patch against branch_3x SVN, matching Ryan's trunk patch except for FieldType.toObject usage and affiliated FieldType->SchemaField changes

            gthb Gunnlaugur Thor Briem added a comment -

            Gah, I did it again! (.diff extension instead of .patch, this time uploading .patch)

            simonw Simon Willnauer added a comment -

            backported to 3.x in revision 1202104

            gthb Gunnlaugur Thor Briem added a comment -

            We all forgot to add entries in CHANGES.TXT in the patch, see attached for trunk and branch_3x.

            gthb Gunnlaugur Thor Briem added a comment -

            (minor) Fix branch_3x CHANGES.TXT patch to be from root

            simonw Simon Willnauer added a comment -

            added all changes entries, thanks Gunnlaugur

            ryantxu Ryan McKinley added a comment -

            thanks Gunnlaugur and Simon!

            uschindler Uwe Schindler added a comment -

            Bulk close after 3.5 is released


            People

              Assignee: ryantxu Ryan McKinley
              Reporter: pwolanin Peter Wolanin
              Votes: 3
              Watchers: 5
