SOLR-3793

duplicate (deleted) documents included in result set when using field faceting with fq

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 4.0-BETA
    • Fix Version/s: 4.0
    • Component/s: None
    • Labels:
      None

      Description

      Günter Hipler reported on the solr-user mailing list that he was seeing inconsistencies in facet counts compared to the numFound when drilling down onto those facets (using "fq"). In particular: when adding an "fq" such as `fq={!term+f%3DnavNetwork}nebis`, the resulting numFound was higher than the number of docs reported by the facet constraint for nebis in the base request.

      I've been able to trivially reproduce this using the example data from Solr 4.0-BETA, trunk@r1381400, and branch_4x@r1381400 (details in comment to follow)

      Important things to note from Günter's email thread with his assessment of the problem...

      https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201208.mbox/%3CCAM_U7jfDpNrGfmWmNtNACHCDCJw4YB-rLBBvRW_WP_jdOb_cgw@mail.gmail.com%3E

      The behaviour is not consistent. Some of the facets provide the correct result, some not. What I can't say for sure: The behaviour was correct (if I'm not wrong) once the whole index was newly created. After running some updates I got these results.

      I'm going to setup a new index with the Lucene 4.0 version from March (to be more exactly: it's version 4.0-2012-03-09_11-29-20) to see what are the results even in case of frequent updates ... the version deployed in march doesn't contain the error I now come across in Beta4.0

      1. SOLR-3793.patch
        3 kB
        Yonik Seeley

        Activity

        Hoss Man added a comment -

        Steps to reproduce...

        1) Start with a clean install of 4.0-BETA, containing a completely empty example index, and run Solr...

        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example$ ls -a solr/collection1/data/
        .  ..
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example$ java -jar start.jar 
        2012-09-05 12:59:56.596:INFO:oejs.Server:jetty-8.1.2.v20120308
        ...
        

        2) In another window, index all sample documents...

        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ java -jar post.jar *.xml
        ...
        

        3) Observe the results of a simple query faceting on "cat", as well as the results of filtering on one of those cat values...

        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":2},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":8},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        

        4) Re-index some of the sample documents, forcing a new segment to be created, as well as some deletions...

        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ java -jar post.jar ipod_*S
        ...
        

        5) Observe that while the "simple" results are unchanged, the filtered request now includes duplicate (deleted?) documents in the result set...

        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":2},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":2},
          "response":{"numFound":6,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        

        Interesting things to note...

        1) Stopping & restarting Jetty does not make the problem go away, which initially suggested to me that the problem is not related to any sort of stale caching of filters/docsets. However, if you stop & restart Jetty, or even just issue a commit, and then re-issue the same two requests in reverse order, then no duplicates are included. Do another commit, send the requests in the (original) problematic order, and the problem reappears...

        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":6},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":2},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ java -Ddata=args -jar post.jar '<commit/>'
        SimplePostTool version 1.5
        POSTing args to http://localhost:8983/solr/update..
        COMMITting Solr index changes to http://localhost:8983/solr/update..
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":2},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":3},
          "response":{"numFound":6,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",6,
                "connector",4,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        

        2) Optimizing seems to eliminate the problem completely, suggesting that the root cause is definitely related to multiple segments containing deletions.

        3) Bizarrely, the problem seems to be specific to faceting: using the same steps, with the same simple queries & fq, but leaving out the facet params, the duplicate documents are not returned...

        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ java -jar post.jar *.xml
        ...
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":13},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          }}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":10},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          }}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ java -jar post.jar ipod_*
        ...
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":2},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          }}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":3},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          }}
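
        The symptom in the transcripts above (repeated ids in "docs", and a filtered numFound larger than the facet count the base request reported for the same value) can be checked mechanically. The following is a minimal Python sketch and not part of the original report: the helper names are hypothetical, and it assumes only the wt=json response shape shown above, with facet counts as a flat [value, count, ...] list. The sample data is a trimmed copy of the two step 5 responses.

```python
import json
from collections import Counter


def duplicate_ids(response):
    """Return the ids that occur more than once in response['response']['docs']."""
    counts = Counter(doc["id"] for doc in response["response"]["docs"])
    return sorted(doc_id for doc_id, n in counts.items() if n > 1)


def facet_count(response, field, value):
    """Look up one count in the flat [value, count, value, count, ...] facet list."""
    flat = response["facet_counts"]["facet_fields"][field]
    pairs = dict(zip(flat[0::2], flat[1::2]))
    return pairs.get(value, 0)


def check_drill_down(base, filtered, field, value):
    """Compare a base request with the same request plus fq=field:value."""
    return {
        "duplicates": duplicate_ids(filtered),
        "expected_num_found": facet_count(base, field, value),
        "actual_num_found": filtered["response"]["numFound"],
    }


# Trimmed copies of the step 5 responses from the report.
BASE = json.loads("""
{"response": {"numFound": 3, "start": 0, "docs": [
   {"id": "IW-02"}, {"id": "F8V7067-APL-KIT"}, {"id": "MA147LL/A"}]},
 "facet_counts": {"facet_fields": {
   "cat": ["electronics", 3, "connector", 2, "music", 1]}}}
""")
FILTERED = json.loads("""
{"response": {"numFound": 6, "start": 0, "docs": [
   {"id": "IW-02"}, {"id": "IW-02"}, {"id": "F8V7067-APL-KIT"},
   {"id": "F8V7067-APL-KIT"}, {"id": "MA147LL/A"}]}}
""")

if __name__ == "__main__":
    print(check_drill_down(BASE, FILTERED, "cat", "electronics"))
```

        A healthy index should give an empty "duplicates" list and equal expected and actual counts; for the step 5 data the two counts differ and two ids repeat.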
        
        Hoss Man added a comment -

        The problem seems to be specific to using UnInvertedField via facet.method=fc for faceting, but only if you have not already used facet.method=enum (no idea how using UnInvertedField during the faceting could affect the total result set)...

        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ java -Ddata=args -jar post.jar '<commit/>'
        SimplePostTool version 1.5
        POSTing args to http://localhost:8983/solr/update..
        COMMITting Solr index changes to http://localhost:8983/solr/update..
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&facet.method=enum&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":4},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&facet.method=enum&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":3},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&facet.method=fc&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":3},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&facet.method=fc&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":1},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ java -Ddata=args -jar post.jar '<commit/>'
        SimplePostTool version 1.5
        POSTing args to http://localhost:8983/solr/update..
        COMMITting Solr index changes to http://localhost:8983/solr/update..
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&facet.method=fc&wt=json&indent=true'
        {
          "responseHeader":{
            "status":0,
            "QTime":5},
          "response":{"numFound":3,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",3,
                "connector",2,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        hossman@frisbee:~/tmp/apache-solr-4.0.0-BETA/solr/example/exampledocs$ curl 'http://localhost:8983/solr/select?echoParams=none&q=ipod&rows=5&fl=id&facet=true&facet.field=cat&facet.mincount=1&facet.method=fc&wt=json&indent=true&fq=cat:electronics'
        {
          "responseHeader":{
            "status":0,
            "QTime":2},
          "response":{"numFound":6,"start":0,"docs":[
              {
                "id":"IW-02"},
              {
                "id":"IW-02"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"F8V7067-APL-KIT"},
              {
                "id":"MA147LL/A"}]
          },
          "facet_counts":{
            "facet_queries":{},
            "facet_fields":{
              "cat":[
                "electronics",6,
                "connector",4,
                "music",1]},
            "facet_dates":{},
            "facet_ranges":{}}}
        
        Hoss Man added a comment -

        confirmed this affects trunk & branch_4x
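
        To repeat the enum versus fc comparison against another build, the request URLs can be generated rather than retyped. This is a hedged Python sketch, not something from the original thread: the function names are hypothetical, the parameters are copied from the curl commands above, and the default host and port are simply the ones used in those transcripts.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr/select"  # same endpoint as the transcripts


def build_url(facet_method, fq=None, base_url=BASE_URL):
    """Build the faceted 'ipod' query from the report for one facet.method."""
    params = [
        ("echoParams", "none"),
        ("q", "ipod"),
        ("rows", "5"),
        ("fl", "id"),
        ("facet", "true"),
        ("facet.field", "cat"),
        ("facet.mincount", "1"),
        ("facet.method", facet_method),
        ("wt", "json"),
        ("indent", "true"),
    ]
    if fq is not None:
        params.append(("fq", fq))
    return base_url + "?" + urlencode(params)


def request_pair(facet_method, fq="cat:electronics"):
    """Base request first, then the filtered one: the order that showed the bug."""
    return [build_url(facet_method), build_url(facet_method, fq)]


if __name__ == "__main__":
    for method in ("enum", "fc"):
        for url in request_pair(method):
            print(url)
```

        Per the transcripts above, request order and commits matter: the duplicates only showed up when the fc pair was the first thing sent after a commit, so a commit is needed between the enum pair and the fc pair.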

        Yonik Seeley added a comment -

        Here's a patch to fix the problem.

        The issue was when UnInvertedField faceting cached big terms as filters, it failed to set/use liveDocs. Later, an "fq" was used that retrieved the filter from the cache and used that filter as liveDocs, bringing deleted docs back from the dead.
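
        The mechanism described here can be illustrated with a toy model. This Python sketch is not Solr code and does not reproduce the patch: the class and method names are hypothetical, and it uses one flat index instead of per-segment readers. It only shows how a term's doc set cached without masking by liveDocs, and later used in place of liveDocs for an "fq", lets deleted documents back into the results.

```python
class ToyIndex:
    """Toy single-index model; a doc number is a position in self.docs."""

    def __init__(self, docs, deleted):
        self.docs = docs              # list of (id, set of terms)
        self.deleted = set(deleted)   # doc numbers marked as deleted
        self.filter_cache = {}        # term -> cached set of doc numbers

    def live_docs(self):
        return set(range(len(self.docs))) - self.deleted

    def postings(self, term):
        """Raw postings: every doc containing the term, deleted or not."""
        return {n for n, (_id, terms) in enumerate(self.docs) if term in terms}

    def cache_big_term(self, term, use_live_docs):
        """Stand-in for faceting caching a 'big term' as a filter."""
        docset = self.postings(term)
        if use_live_docs:
            docset &= self.live_docs()
        self.filter_cache[term] = docset

    def search(self, q_term, fq_term=None):
        """With an fq, the cached filter is used in place of liveDocs."""
        if fq_term is None:
            accept = self.live_docs()
        else:
            if fq_term not in self.filter_cache:
                self.cache_big_term(fq_term, use_live_docs=True)
            accept = self.filter_cache[fq_term]
        hits = self.postings(q_term) & accept
        return [self.docs[n][0] for n in sorted(hits)]


def build_index():
    """Three docs indexed, then re-indexed: docs 0-2 are the deleted old copies."""
    terms = {"ipod", "electronics"}
    ids = ["IW-02", "F8V7067-APL-KIT", "MA147LL/A"]
    return ToyIndex([(doc_id, terms) for doc_id in ids] * 2, deleted=[0, 1, 2])


if __name__ == "__main__":
    buggy = build_index()
    buggy.cache_big_term("electronics", use_live_docs=False)  # faceting ran first
    print(len(buggy.search("ipod", "electronics")))

    fixed = build_index()
    fixed.cache_big_term("electronics", use_live_docs=True)
    print(len(fixed.search("ipod", "electronics")))
```

        In this model the order dependence seen earlier falls out naturally: if the filtered request runs first, the filter is cached with deletions already masked, and the later faceting request finds it in the cache instead of creating the bad entry.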

        Yonik Seeley added a comment -

        Committed to trunk and 4x
        http://svn.apache.org/viewvc?rev=1381568&view=rev
        http://svn.apache.org/viewvc?rev=1381569&view=rev

        Guenter Hipler added a comment -

        Thanks Yonik,
        I'm going to test it with the upcoming nightly build
        Günter

        Commit Tag Bot added a comment -

        [branch_4x commit] Yonik Seeley
        http://svn.apache.org/viewvc?view=revision&revision=1381569

        SOLR-3793: use livedocs when caching big terms

        Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee: Yonik Seeley
          • Reporter: Hoss Man
          • Votes: 0
          • Watchers: 5
