Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Add an option to compute user specified percentiles when computing stats

      Example...

      stats.field={!percentiles='1,2,98,99,99.999'}price
      
      1. SOLR-6350.patch
        58 kB
        Hoss Man
      2. SOLR-6350.patch
        47 kB
        Xu Zhang
      3. SOLR-6350.patch
        40 kB
        Xu Zhang
      4. SOLR-6350.patch
        33 kB
        Hoss Man
      5. SOLR-6350-xu.patch
        21 kB
        Xu Zhang
      6. SOLR-6350-xu.patch
        12 kB
        Xu Zhang
      7. SOLR-6350-Xu.patch
        121 kB
        Xu Zhang
      8. SOLR-6350-Xu.patch
        121 kB
        Xu Zhang

        Issue Links

          Activity

          Hide
          Hoss Man added a comment -

          Proposed implementation...

          • Add Ted Dunnings t-digest as a dependency
          • Build up an ArrayDigets in NumericStatsValues.accumulate()
          • in distributed mode, each shard should return ArrayDigest.asSmallBytes() base64 encoded - the coordinator node should use ArrayDigest.fromBytes() & AbstractTDigest.add(TDigest) to merge the results
          • percentiles should not be enabled by default - they really need a param to control which percentiles (if any) should be computed.
          Show
          Hoss Man added a comment - Proposed implementation... Add Ted Dunnings t-digest as a dependency Build up an ArrayDigets in NumericStatsValues.accumulate() in distributed mode, each shard should return ArrayDigest.asSmallBytes() base64 encoded - the coordinator node should use ArrayDigest.fromBytes() & AbstractTDigest.add(TDigest) to merge the results percentiles should not be enabled by default - they really need a param to control which percentiles (if any) should be computed.
          Hide
          Xu Zhang added a comment -

          First try, followed the proposed implementation. Added dependency, t-digest in single mode, and a quick-and-dirty test. Will take a look at distributed mode tomorrow.

          Also need to figure out how to use t-diges (initial page size, compression, whether to compress, and when to compress).

          Show
          Xu Zhang added a comment - First try, followed the proposed implementation. Added dependency, t-digest in single mode, and a quick-and-dirty test. Will take a look at distributed mode tomorrow. Also need to figure out how to use t-diges (initial page size, compression, whether to compress, and when to compress).
          Hide
          Xu Zhang added a comment -

          Implement Percentiles for distributed mode and test.

          Show
          Xu Zhang added a comment - Implement Percentiles for distributed mode and test.
          Hide
          Xu Zhang added a comment - - edited

          Probably we should give compression parameter to users.

          The T-Digest algorithm use compression parameter to control the trade-off between the size of the T-Digest and the accuracy which quantiles are estimated. Users should be able to pick the trade-off by themselves.

          And probably AVL t-digest implementation is better.

          Show
          Xu Zhang added a comment - - edited Probably we should give compression parameter to users. The T-Digest algorithm use compression parameter to control the trade-off between the size of the T-Digest and the accuracy which quantiles are estimated. Users should be able to pick the trade-off by themselves. And probably AVL t-digest implementation is better.
          Hide
          Jennifer Stumpf added a comment -

          We have a tremendous need for percentiles, especially for 5.0. We were formally using the patch from 3583 for 4.10.3, but we are getting ready to deploy with 5.0.
          Is there a plan for continuing with this improvement? Have you seen performance issues or just anticipate them (WRT the compression parameter)?

          Show
          Jennifer Stumpf added a comment - We have a tremendous need for percentiles, especially for 5.0. We were formally using the patch from 3583 for 4.10.3, but we are getting ready to deploy with 5.0. Is there a plan for continuing with this improvement? Have you seen performance issues or just anticipate them (WRT the compression parameter)?
          Hide
          Xu Zhang added a comment -

          Sorry for losing tracking of this Jira. I can work on this this weekend.

          Hoss Man Any quick feedback?

          Show
          Xu Zhang added a comment - Sorry for losing tracking of this Jira. I can work on this this weekend. Hoss Man Any quick feedback?
          Hide
          Hoss Man added a comment -

          Is there a plan for continuing with this improvement?

          i absolutely plan to get percentiles into StatsComponent – it is "next" on my todo list to review as soon as SOLR-6349 is resolved (i thought that was already marked as a blocker - i'll fix that) which will hopefully be in the next few days.

          Hoss Man Any quick feedback?

          Sorry Xu, i've been heads down on SOLR-6349 and haven't even looked at your patch here yet – if you have time/interest in working on it more, maybe review & revise the current patch it in the context of the latest patch in SOLR-6349, so that once SOLR-6349 is committed there's a SOLR-6350 patch that applies clean?

          Show
          Hoss Man added a comment - Is there a plan for continuing with this improvement? i absolutely plan to get percentiles into StatsComponent – it is "next" on my todo list to review as soon as SOLR-6349 is resolved (i thought that was already marked as a blocker - i'll fix that) which will hopefully be in the next few days. Hoss Man Any quick feedback? Sorry Xu, i've been heads down on SOLR-6349 and haven't even looked at your patch here yet – if you have time/interest in working on it more, maybe review & revise the current patch it in the context of the latest patch in SOLR-6349 , so that once SOLR-6349 is committed there's a SOLR-6350 patch that applies clean?
          Hide
          Xu Zhang added a comment -

          Sure. With pleasure

          New patch that have the Hoss's new change about solr6349. I will continue to do more tests around this and improve this.

          Show
          Xu Zhang added a comment - Sure. With pleasure New patch that have the Hoss's new change about solr6349. I will continue to do more tests around this and improve this.
          Hide
          Xu Zhang added a comment -

          Add another test case with isShard= true and distributed test case.
          The patch included Hoss's patch for Solr-6349, against trunk.

          Will keep updating and improve

          Show
          Xu Zhang added a comment - Add another test case with isShard= true and distributed test case. The patch included Hoss's patch for Solr-6349, against trunk. Will keep updating and improve
          Hide
          Xu Zhang added a comment -

          Almost same, fix existing tests.

          Show
          Xu Zhang added a comment - Almost same, fix existing tests.
          Hide
          Hoss Man added a comment -

          I spent some time today massaging Xu's latest patch to apply cleanly against trunk.

          Xu: please review and see if i missed anything?

          A few notes...

          • I didn't do any in depth review yet, but i did sprinkle some nocommits arround related to things that jumped out at me while resolving conflicts
          • StatsComponentTest.testPercentiles is failing for me (see below)
            • does this fail for you? did i break something merging the patch?
          • I didn't notice any distributed test, not sure if that's something still needing done, or if that was just because of a mistake in creating the patch file and "new files" weren't included.
          • I tried to do some manual experimentation with this patch, and got really bizare results...
            • every percentile i requested came back as "NaN"
            • if i requested more then one percentile value, i got a redundent percentiles block in the response for each, eg...
              ...&stats.field={!percentiles="1,99,99.9"}price
              
                "stats":{
                  "stats_fields":{
                    "price":{
                      "percentiles":[
                        "1","NaN",
                        "99","NaN",
                        "99.9","NaN"],
                      "percentiles":[
                        "1","NaN",
                        "99","NaN",
                        "99.9","NaN"],
                      "percentiles":[
                        "1","NaN",
                        "99","NaN",
                        "99.9","NaN"]}}}}
              
            • these results reproduced with both bin/solr -e techproducts and bin/solr -e cloud (in the later case using some synthetic data (generated by the same script used in SOLR-6349)

          Test failure i mentioned...

             [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=StatsComponentTest -Dtests.method=testPercentiles -Dtests.seed=EA16B33E5685B22 -Dtests.slow=true -Dtests.locale=mk_MK -Dtests.timezone=Pacific/Niue -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
             [junit4] FAILURE 1.27s | StatsComponentTest.testPercentiles <<<
             [junit4]    > Throwable #1: java.lang.AssertionError: Difference too much
             [junit4]    > 	at __randomizedtesting.SeedInfo.seed([EA16B33E5685B22:3A66CB07A30CD216]:0)
             [junit4]    > 	at org.apache.solr.handler.component.StatsComponentTest.testPercentiles(StatsComponentTest.java:1447)
             [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
          

          BTW: no need ot put your name as a suffix in the patch filename – convention is to just name the patch after the jira. The only reason to worry about distinguishing the file names is in cases where you explicitly posting a "variant" patch (ie: a strawman that you feel/know is broken and shouldn't be taken seriously long term, an alternative proposal to some other existing patch, an orthogonal patch only containing tests or some other independent changes, etc...)

          Show
          Hoss Man added a comment - I spent some time today massaging Xu's latest patch to apply cleanly against trunk. Xu: please review and see if i missed anything? A few notes... I didn't do any in depth review yet, but i did sprinkle some nocommits arround related to things that jumped out at me while resolving conflicts StatsComponentTest.testPercentiles is failing for me (see below) does this fail for you? did i break something merging the patch? I didn't notice any distributed test, not sure if that's something still needing done, or if that was just because of a mistake in creating the patch file and "new files" weren't included. I tried to do some manual experimentation with this patch, and got really bizare results... every percentile i requested came back as "NaN" if i requested more then one percentile value, i got a redundent percentiles block in the response for each, eg... ...&stats.field={!percentiles="1,99,99.9"}price "stats":{ "stats_fields":{ "price":{ "percentiles":[ "1","NaN", "99","NaN", "99.9","NaN"], "percentiles":[ "1","NaN", "99","NaN", "99.9","NaN"], "percentiles":[ "1","NaN", "99","NaN", "99.9","NaN"]}}}} these results reproduced with both bin/solr -e techproducts and bin/solr -e cloud (in the later case using some synthetic data (generated by the same script used in SOLR-6349 ) Test failure i mentioned... [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=StatsComponentTest -Dtests.method=testPercentiles -Dtests.seed=EA16B33E5685B22 -Dtests.slow=true -Dtests.locale=mk_MK -Dtests.timezone=Pacific/Niue -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1 [junit4] FAILURE 1.27s | StatsComponentTest.testPercentiles <<< [junit4] > Throwable #1: java.lang.AssertionError: Difference too much [junit4] > at __randomizedtesting.SeedInfo.seed([EA16B33E5685B22:3A66CB07A30CD216]:0) [junit4] > at org.apache.solr.handler.component.StatsComponentTest.testPercentiles(StatsComponentTest.java:1447) [junit4] > at java.lang.Thread.run(Thread.java:745) BTW: no need ot put your name as a suffix in the patch filename – convention is to just name the patch after the jira. The only reason to worry about distinguishing the file names is in cases where you explicitly posting a "variant" patch (ie: a strawman that you feel/know is broken and shouldn't be taken seriously long term, an alternative proposal to some other existing patch, an orthogonal patch only containing tests or some other independent changes, etc...)
          Hide
          Xu Zhang added a comment -

          This patch has the latest trunk change, and should fix NaN errors.

          Show
          Xu Zhang added a comment - This patch has the latest trunk change, and should fix NaN errors.
          Hide
          Xu Zhang added a comment -

          Thanks so much Hoss .

          I think I fix the NaN errors, which is because my patch doesn't compute percentile stats with the new trunk change. This also shows the edge case: when user asking percentiles for empty document set, we will give NaN.

          I didn't do any in depth review yet, but i did sprinkle some nocommits arround related to things that jumped out at me while resolving conflicts

          Some work seems quite dirty to me, will spend some time to improve that. For example, we have a test case which will test all stats combinations, I just exclude percentiles right now, which is quite awful.

          And another thing is I didn't do too much performance tests around this. There are plenty of parameters for Tdigest. I just pick a default number and ArrayDigest.

          I didn't notice any distributed test, not sure if that's something still needing done, or if that was just because of a mistake in creating the patch file and "new files" weren't included.

          I just added 4 simple test cases in the TestDistributedSearch.java. I probably missed them in my last patch. Have them back.

          BTW: no need ot put your name as a suffix in the patch filename – convention is to just name the patch after the jira. The only reason to worry about distinguishing the file names is in cases where you explicitly posting a "variant" patch (ie: a strawman that you feel/know is broken and shouldn't be taken seriously long term, an alternative proposal to some other existing patch, an orthogonal patch only containing tests or some other independent changes, etc...)

          Ha, thanks a lot.

          Show
          Xu Zhang added a comment - Thanks so much Hoss . I think I fix the NaN errors, which is because my patch doesn't compute percentile stats with the new trunk change. This also shows the edge case: when user asking percentiles for empty document set, we will give NaN. I didn't do any in depth review yet, but i did sprinkle some nocommits arround related to things that jumped out at me while resolving conflicts Some work seems quite dirty to me, will spend some time to improve that. For example, we have a test case which will test all stats combinations, I just exclude percentiles right now, which is quite awful. And another thing is I didn't do too much performance tests around this. There are plenty of parameters for Tdigest. I just pick a default number and ArrayDigest. I didn't notice any distributed test, not sure if that's something still needing done, or if that was just because of a mistake in creating the patch file and "new files" weren't included. I just added 4 simple test cases in the TestDistributedSearch.java. I probably missed them in my last patch. Have them back. BTW: no need ot put your name as a suffix in the patch filename – convention is to just name the patch after the jira. The only reason to worry about distinguishing the file names is in cases where you explicitly posting a "variant" patch (ie: a strawman that you feel/know is broken and shouldn't be taken seriously long term, an alternative proposal to some other existing patch, an orthogonal patch only containing tests or some other independent changes, etc...) Ha, thanks a lot.
          Hide
          Hoss Man added a comment -

          (FYI, haven't looked at latest patch, just replying to comments)

          This also shows the edge case: when user asking percentiles for empty document set, we will give NaN.

          I think we should probably return 'null' for each percentile in that case?

          For example, we have a test case which will test all stats combinations, I just exclude percentiles right now, which is quite awful.

          On the test side, we can just add a map of the "input" params for each stat (for most it will be "true" for percentiles it will be the comma seperated string)

          I'm still not really comfortable with how those inpts are parsed though ... ultimately i'd like to refactor all of that stuff and push it down into the StatsValuesFactories (so each factor has an API returning what Stats it supports, failures are produced if you request an unsupported stat) – but for now, maybe we can just introduce a boolean parseParams(StatsField) into each Stat - most Stat instances could use a default impl that would look something like...

          /** return value of true means user is requesting this stat */
          boolean parseParams(StatsField sf) {
            return sf.getLocalParams().getBool(this.getName());
          }
          

          ...but percentiles could be more interesting? ...

          /** return value of true means user is requesting this stat */
          boolean parseParams(StatsField sf) {
            String input = sf.getLocalParams().get(this.getName());
            if (null ! = input) {
              sf.setTDigetsOptions(input);
              return true;
            }
            return false;
          }
          

          ...what do you think?

          And another thing is I didn't do too much performance tests around this. There are plenty of parameters for Tdigest. I just pick a default number and ArrayDigest.

          Yeah, i definitely think we should make those options configurable via another local param percentilOptions="..." (or maybe a suffix on the list of percentiles?)

          Show
          Hoss Man added a comment - (FYI, haven't looked at latest patch, just replying to comments) This also shows the edge case: when user asking percentiles for empty document set, we will give NaN. I think we should probably return 'null' for each percentile in that case? For example, we have a test case which will test all stats combinations, I just exclude percentiles right now, which is quite awful. On the test side, we can just add a map of the "input" params for each stat (for most it will be "true" for percentiles it will be the comma seperated string) I'm still not really comfortable with how those inpts are parsed though ... ultimately i'd like to refactor all of that stuff and push it down into the StatsValuesFactories (so each factor has an API returning what Stats it supports, failures are produced if you request an unsupported stat) – but for now, maybe we can just introduce a boolean parseParams(StatsField) into each Stat - most Stat instances could use a default impl that would look something like... /** return value of true means user is requesting this stat */ boolean parseParams(StatsField sf) { return sf.getLocalParams().getBool( this .getName()); } ...but percentiles could be more interesting? ... /** return value of true means user is requesting this stat */ boolean parseParams(StatsField sf) { String input = sf.getLocalParams().get( this .getName()); if ( null ! = input) { sf.setTDigetsOptions(input); return true ; } return false ; } ...what do you think? And another thing is I didn't do too much performance tests around this. There are plenty of parameters for Tdigest. I just pick a default number and ArrayDigest. Yeah, i definitely think we should make those options configurable via another local param percentilOptions="..." (or maybe a suffix on the list of percentiles?)
          Hide
          Hoss Man added a comment -

          Actaully .. Xu: it looks like your latest patch doesn't include the nocommits i added based on my cursory review of your earlier work – i'm confused as to what happened here ... did you ignore my changes and generate a completely new patch from trunk?

          (there's not a lot of info lost there, it's easy to revive those notes – i just want to make sure i understand what's going on ensure and that moving forward we don't have deviating work ... going back to my earlier comment about patch naming conventions: patches with the same name should build/evolve from the earlier versions, and include those existing changes)

          Show
          Hoss Man added a comment - Actaully .. Xu: it looks like your latest patch doesn't include the nocommits i added based on my cursory review of your earlier work – i'm confused as to what happened here ... did you ignore my changes and generate a completely new patch from trunk? (there's not a lot of info lost there, it's easy to revive those notes – i just want to make sure i understand what's going on ensure and that moving forward we don't have deviating work ... going back to my earlier comment about patch naming conventions: patches with the same name should build/evolve from the earlier versions, and include those existing changes)
          Hide
          Xu Zhang added a comment -

          Thanks Hoss.

          I think I screwed it up. Really sorry about it. Will fix it tonight.

          Show
          Xu Zhang added a comment - Thanks Hoss. I think I screwed it up. Really sorry about it. Will fix it tonight.
          Hide
          simpleBread added a comment -

          I think my last patch just fix the bug, and I did exclude your cursory review.

          Show
          simpleBread added a comment - I think my last patch just fix the bug, and I did exclude your cursory review.
          Hide
          Hoss Man added a comment -

          I think I screwed it up. Really sorry about it. Will fix it tonight.

          no worries - i appreciate all your effort here ... I trust you to update your own patch to trunk more then i trust me – i just want to make sure we aren't losing track of anything.

          Show
          Hoss Man added a comment - I think I screwed it up. Really sorry about it. Will fix it tonight. no worries - i appreciate all your effort here ... I trust you to update your own patch to trunk more then i trust me – i just want to make sure we aren't losing track of anything.
          Hide
          Xu Zhang added a comment -

          I deleted my last patch. This has my last fix and has Hoss's change.

          Show
          Xu Zhang added a comment - I deleted my last patch. This has my last fix and has Hoss's change.
          Hide
          Xu Zhang added a comment -

          This also shows the edge case: when user asking percentiles for empty document set, we will give NaN.
          I think we should probably return 'null' for each percentile in that case?

          Sure, will add some test cases for this .

          I'm still not really comfortable with how those inpts are parsed though ... ultimately i'd like to refactor all of that stuff and push it down into the StatsValuesFactories (so each factor has an API returning what Stats it supports, failures are produced if you request an unsupported stat) – but for now, maybe we can just introduce a boolean parseParams(StatsField) into each Stat - most Stat instances could use a default impl that would look something like...

          /** return value of true means user is requesting this stat */
          boolean parseParams(StatsField sf) {
            return sf.getLocalParams().getBool(this.getName());
          }
          

          ...but percentiles could be more interesting? ...

          /** return value of true means user is requesting this stat */
          boolean parseParams(StatsField sf) {
            String input = sf.getLocalParams().get(this.getName());
            if (null ! = input) {
              sf.setTDigetsOptions(input);
              return true;
            }
            return false;
          }
          

          ...what do you think?

          +1

          Yeah, i definitely think we should make those options configurable via another local param percentilOptions="..." (or maybe a suffix on the list of percentiles?)

          I think "percentilOptions" would be really nice. I will improve the patch based on Hoss's comments, maybe tomorrow.

          Show
          Xu Zhang added a comment - This also shows the edge case: when user asking percentiles for empty document set, we will give NaN. I think we should probably return 'null' for each percentile in that case? Sure, will add some test cases for this . I'm still not really comfortable with how those inpts are parsed though ... ultimately i'd like to refactor all of that stuff and push it down into the StatsValuesFactories (so each factor has an API returning what Stats it supports, failures are produced if you request an unsupported stat) – but for now, maybe we can just introduce a boolean parseParams(StatsField) into each Stat - most Stat instances could use a default impl that would look something like... /** return value of true means user is requesting this stat */ boolean parseParams(StatsField sf) { return sf.getLocalParams().getBool( this .getName()); } ...but percentiles could be more interesting? ... /** return value of true means user is requesting this stat */ boolean parseParams(StatsField sf) { String input = sf.getLocalParams().get( this .getName()); if ( null ! = input) { sf.setTDigetsOptions(input); return true ; } return false ; } ...what do you think? +1 Yeah, i definitely think we should make those options configurable via another local param percentilOptions="..." (or maybe a suffix on the list of percentiles?) I think "percentilOptions" would be really nice. I will improve the patch based on Hoss's comments, maybe tomorrow.
          Hide
          Xu Zhang added a comment - - edited

          Update based on Hoss's suggestions.
          1, Modify test.
          2, add parseParams in to Stat
          3, add compression into local parameters for percentiles, similar to Elastic

          Show
          Xu Zhang added a comment - - edited Update based on Hoss's suggestions. 1, Modify test. 2, add parseParams in to Stat 3, add compression into local parameters for percentiles, similar to Elastic
          Hide
          Yonik Seeley added a comment -

          Nice work! It would be great to have this percentile functionality for the new JSON Facet API / Facet Module as well (SOLR-7214).
          We should see how we can share some of the implementation here.

          Show
          Yonik Seeley added a comment - Nice work! It would be great to have this percentile functionality for the new JSON Facet API / Facet Module as well ( SOLR-7214 ). We should see how we can share some of the implementation here.
          Hide
          Hoss Man added a comment -

          Xu, your laast patch looked pretty good, but while review & tightening up the tests i noticed a few more improvements.

          I think this version is ready to commit.

          Changes since last patch...

          • alphabetize ive-versions.properties to make precommit happy
          • added SHA1, LICENSE, and NOTICE for tdigest
          • reverted unneccessary whitespace changes in CommonParams
          • added some javadocs to FieldStatsInfo
          • StatsField...
            • switched from ArrayDigest to AVLTreeDigest per recomendations in the t-digest docs
            • renamed "compression" localparam to "tdigestCompression"
              • as an expert level op, we're assuming they know about the tdigets impl details
            • simplified tdigestCompression parsing to leverage SolrParam methods
            • renamed "getGetTdigestCompression()" to "getTdigestCompression()"
            • corrected getTdigestCompression() return type and local var to be double (not int)
          • StatsValuesFactory...
            • use calculateStats & includeInResponse consistently like all other stats
            • remove unused "protected Map<String,String> percentiles = new LinkedHashMap<>()"
            • fixed usage of "getGetTdigestCompression()" (new name, see above)
            • eliminated smallByteSize() from the serialization logic in addTypeSpecificStats (see comments)
          • StatsComponentTest...
            • testIndividualStatLocalParams
              • tightened assertions on individual percentile values
            • testPercentiles
              • switched to using queryAndResponse so we don't need substring for janky xml parsing
              • hardened asserts to ensure exact order of percentiles in response
              • simplified asserts that non-numerics don't support percentiles
          • TestDistributedSearch...
            • beefed up assertions
            • included sanity testing of FieldStatsInfo.getPercentiles()

          We should see how we can share some of the implementation here.

          That sounds like something that would be a much bigger scope then just percentiles, and should be tracked in it's own Jira.

          Show
          Hoss Man added a comment - Xu, your laast patch looked pretty good, but while review & tightening up the tests i noticed a few more improvements. I think this version is ready to commit. Changes since last patch... alphabetize ive-versions.properties to make precommit happy added SHA1, LICENSE, and NOTICE for tdigest reverted unneccessary whitespace changes in CommonParams added some javadocs to FieldStatsInfo StatsField... switched from ArrayDigest to AVLTreeDigest per recomendations in the t-digest docs renamed "compression" localparam to "tdigestCompression" as an expert level op, we're assuming they know about the tdigets impl details simplified tdigestCompression parsing to leverage SolrParam methods renamed "getGetTdigestCompression()" to "getTdigestCompression()" corrected getTdigestCompression() return type and local var to be double (not int) StatsValuesFactory... use calculateStats & includeInResponse consistently like all other stats remove unused "protected Map<String,String> percentiles = new LinkedHashMap<>()" fixed usage of "getGetTdigestCompression()" (new name, see above) eliminated smallByteSize() from the serialization logic in addTypeSpecificStats (see comments) StatsComponentTest... testIndividualStatLocalParams tightened assertions on individual percentile values testPercentiles switched to using queryAndResponse so we don't need substring for janky xml parsing hardened asserts to ensure exact order of percentiles in response simplified asserts that non-numerics don't support percentiles TestDistributedSearch... beefed up assertions included sanity testing of FieldStatsInfo.getPercentiles() We should see how we can share some of the implementation here. That sounds like something that would be a much bigger scope then just percentiles, and should be tracked in it's own Jira.
          Hide
          Yonik Seeley added a comment -

          That sounds like something that would be a much bigger scope then just percentiles, and should be tracked in it's own Jira.

          It's not about implementing it here... it's about keeping it in mind - it can change implementations.

          Show
          Yonik Seeley added a comment - That sounds like something that would be a much bigger scope then just percentiles, and should be tracked in it's own Jira. It's not about implementing it here... it's about keeping it in mind - it can change implementations.
          Hide
          Steve Molloy added a comment -

          Agree with the separate issue to track this, created SOLR-7296 for it.

          Show
          Steve Molloy added a comment - Agree with the separate issue to track this, created SOLR-7296 for it.
          Hide
          ASF subversion and git services added a comment -

          Commit 1668922 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1668922 ]

          SOLR-6350: StatsComponent now supports Percentiles

          Show
          ASF subversion and git services added a comment - Commit 1668922 from hossman@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1668922 ] SOLR-6350 : StatsComponent now supports Percentiles
          Hide
          ASF subversion and git services added a comment -

          Commit 1668926 from hossman@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1668926 ]

          SOLR-6350: StatsComponent now supports Percentiles (merge r1668922)

          Show
          ASF subversion and git services added a comment - Commit 1668926 from hossman@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1668926 ] SOLR-6350 : StatsComponent now supports Percentiles (merge r1668922)
          Hide
          Timothy Potter added a comment -

          Bulk close after 5.1 release

          Show
          Timothy Potter added a comment - Bulk close after 5.1 release

            People

            • Assignee:
              Hoss Man
              Reporter:
              Hoss Man
            • Votes:
              1 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development