Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Not a Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Converting a Solr SearchComponent from using filters to using equivalent queries causes its test cases to fail. The reasons are unclear, and I've tried several things to isolate the problem, but with no luck so far.

      The failure manifests itself by the constructed query returning no documents whatsoever, at least within the Solr context it's being used in. The equivalent filter version of the component works properly (as you can see by the passed test).

      To run the test:

      cd solr/contrib/mcf
      ant test

      I will attach both filter and query versions of the code, as patches.

      1. query.diff
        31 kB
        Karl Wright
      2. filter.diff
        31 kB
        Karl Wright

        Issue Links

          Activity

          Karl Wright created issue -
          Hide
          Karl Wright added a comment -

          The filter version of the search component

          Show
          Karl Wright added a comment - The filter version of the search component
          Karl Wright made changes -
          Field Original Value New Value
          Attachment filter.diff [ 12496135 ]
          Hide
          Karl Wright added a comment -

          The query version of the search component, which fails the test

          Show
          Karl Wright added a comment - The query version of the search component, which fails the test
          Karl Wright made changes -
          Attachment query.diff [ 12496136 ]
          Hide
          Karl Wright added a comment -

          Apologies for not finishing this ticket promptly, but minutes after creating it I lost internet and phone service for seven hours.

          Some notes:
          (1) The filter-based plugin passes the tests, the query-based one fails two out of the four tests. I've been concentrating initially on the failure that comes from ManifoldCFSecurityFilterTest.java line 100. Here's the error:

          [junit] ------------- Standard Error -----------------
              [junit] +((-allow_token_share:* -deny_token_share:*) allow_token_share:token1 -deny_token_share:token1) +((-allow_token_document:* -deny_token_document:*) allow_token_document:token1 -deny_token_document:token1)
              [junit] 22/09/2011 08:26:50 ? org.apache.solr.SolrTestCaseJ4 assertQ
              [junit] SEVERE: REQUEST FAILED: xpath=//*[@numFound='3']
              [junit]     xml response was: <?xml version="1.0" encoding="UTF-8"?>
              [junit] <response>
              [junit] <lst name="responseHeader"><int name="status">0</int><int name="QTime">116</int><lst name="params"><str name="echoParams">all</str><str name="fl">id</str><str name="q">*:*</str><str name="qt">/mcf</str><str name="UserTokens">token1</str><str name="mcf">true</str></lst></lst><result name="response" numFound="0" start="0"></result>
              [junit] </response>
          

          This output includes the actual query I am generating for this test, which looks correct to me.

          (2) The test indexes 5 documents into four access token fields. These documents are summarized below.

              //             |     share    |   document
              //             |--------------|--------------
              //             | allow | deny | allow | deny
              // ------------+-------+------+-------+------
              // da12        |       |      | 1, 2  |
              // ------------+-------+------+-------+------
              // da13-dd3    |       |      | 1,3   | 3
              // ------------+-------+------+-------+------
              // sa123-sd13  | 1,2,3 | 1, 3 |       |
              // ------------+-------+------+-------+------
              // sa3-sd1-da23| 3     | 1    | 2,3   |
              // ------------+-------+------+-------+------
              // notoken     |       |      |       |
              // ------------+-------+------+-------+------
          
          

          (2) Initially the query-based plugin directly corresponded to the filter based one. Still failed in exactly the same way. Here's the diff between the filter-based one and the initial query-based one:

          Index: src/java/org/apache/solr/mcf/ManifoldCFSecurityFilter.java
          ===================================================================
          --- src/java/org/apache/solr/mcf/ManifoldCFSecurityFilter.java	(revision 1173895)
          +++ src/java/org/apache/solr/mcf/ManifoldCFSecurityFilter.java	(working copy)
          @@ -150,7 +150,8 @@
                 userAccessTokens = getAccessTokens(authenticatedUserName);
               }
           
          -    BooleanFilter bf = new BooleanFilter();
          +    BooleanQuery bq = new BooleanQuery();
          +    //bf.setMaxClauseCount(100000);
               
               if (userAccessTokens.size() == 0)
               {
          @@ -159,28 +160,26 @@
                 // (fieldAllowShare is empty AND fieldDenyShare is empty AND fieldAllowDocument is empty AND fieldDenyDocument is empty)
                 // We're trying to map to:  -(fieldAllowShare:*) , which should be pretty efficient in Solr because it is negated.  If this turns out not to be so, then we should
                 // have the SolrConnector inject a special token into these fields when they otherwise would be empty, and we can trivially match on that token.
          -      bf.add(new FilterClause(new QueryWrapperFilter(new WildcardQuery(new Term(fieldAllowShare,"*"))),BooleanClause.Occur.MUST_NOT));
          -      bf.add(new FilterClause(new QueryWrapperFilter(new WildcardQuery(new Term(fieldDenyShare,"*"))),BooleanClause.Occur.MUST_NOT));
          -      bf.add(new FilterClause(new QueryWrapperFilter(new WildcardQuery(new Term(fieldAllowDocument,"*"))),BooleanClause.Occur.MUST_NOT));
          -      bf.add(new FilterClause(new QueryWrapperFilter(new WildcardQuery(new Term(fieldDenyDocument,"*"))),BooleanClause.Occur.MUST_NOT));
          +      bq.add(new WildcardQuery(new Term(fieldAllowShare,"*")),BooleanClause.Occur.MUST_NOT);
          +      bq.add(new WildcardQuery(new Term(fieldDenyShare,"*")),BooleanClause.Occur.MUST_NOT);
          +      bq.add(new WildcardQuery(new Term(fieldAllowDocument,"*")),BooleanClause.Occur.MUST_NOT);
          +      bq.add(new WildcardQuery(new Term(fieldDenyDocument,"*")),BooleanClause.Occur.MUST_NOT);
               }
               else
               {
                 // Extend the query appropriately for each user access token.
          -      bf.add(new FilterClause(calculateCompleteSubfilter(fieldAllowShare,fieldDenyShare,userAccessTokens),BooleanClause.Occur.MUST));
          -      bf.add(new FilterClause(calculateCompleteSubfilter(fieldAllowDocument,fieldDenyDocument,userAccessTokens),BooleanClause.Occur.MUST));
          +      bq.add(calculateCompleteSubquery(fieldAllowShare,fieldDenyShare,userAccessTokens),BooleanClause.Occur.MUST);
          +      bq.add(calculateCompleteSubquery(fieldAllowDocument,fieldDenyDocument,userAccessTokens),BooleanClause.Occur.MUST);
               }
           
               // Concatenate with the user's original query.
          -    //FilteredQuery query = new FilteredQuery(rb.getQuery(),bf);
          -    //rb.setQuery(query);
               List<Query> list = rb.getFilters();
               if (list == null)
               {
                 list = new ArrayList<Query>();
                 rb.setFilters(list);
               }
          -    list.add(new ConstantScoreQuery(bf));
          +    list.add(new ConstantScoreQuery(bq));
             }
           
             @Override
          @@ -193,28 +192,27 @@
             * ((fieldAllowShare is empty AND fieldDenyShare is empty) OR fieldAllowShare HAS token1 OR fieldAllowShare HAS token2 ...)
             *     AND fieldDenyShare DOESN'T_HAVE token1 AND fieldDenyShare DOESN'T_HAVE token2 ...
             */
          -  protected Filter calculateCompleteSubfilter(String allowField, String denyField, List<String> userAccessTokens)
          +  protected Query calculateCompleteSubquery(String allowField, String denyField, List<String> userAccessTokens)
             {
          -    BooleanFilter bf = new BooleanFilter();
          +    BooleanQuery bq = new BooleanQuery();
          +    bq.setMaxClauseCount(1000000);
               
               // Add a clause for each token.  This will be added directly to the main filter (as a deny test), as well as to an OR's subclause (as an allow test).
          -    BooleanFilter orFilter = new BooleanFilter();
          +    BooleanQuery orQuery = new BooleanQuery();
          +    orQuery.setMaxClauseCount(1000000);
          +
               // Add the empty-acl case
          -    BooleanFilter subUnprotectedClause = new BooleanFilter();
          -    subUnprotectedClause.add(new FilterClause(new QueryWrapperFilter(new WildcardQuery(new Term(allowField,"*"))),BooleanClause.Occur.MUST_NOT));
          -    subUnprotectedClause.add(new FilterClause(new QueryWrapperFilter(new WildcardQuery(new Term(denyField,"*"))),BooleanClause.Occur.MUST_NOT));
          -    orFilter.add(new FilterClause(subUnprotectedClause,BooleanClause.Occur.SHOULD));
          +    BooleanQuery subUnprotectedClause = new BooleanQuery();
          +    subUnprotectedClause.add(new WildcardQuery(new Term(allowField,"*")),BooleanClause.Occur.MUST_NOT);
          +    subUnprotectedClause.add(new WildcardQuery(new Term(denyField,"*")),BooleanClause.Occur.MUST_NOT);
          +    orQuery.add(subUnprotectedClause,BooleanClause.Occur.SHOULD);
               for (String accessToken : userAccessTokens)
               {
          -      TermsFilter tf = new TermsFilter();
          -      tf.addTerm(new Term(allowField,accessToken));
          -      orFilter.add(new FilterClause(tf,BooleanClause.Occur.SHOULD));
          -      tf = new TermsFilter();
          -      tf.addTerm(new Term(denyField,accessToken));
          -      bf.add(new FilterClause(tf,BooleanClause.Occur.MUST_NOT));
          +      orQuery.add(new TermQuery(new Term(allowField,accessToken)),BooleanClause.Occur.SHOULD);
          +      bq.add(new TermQuery(new Term(denyField,accessToken)),BooleanClause.Occur.MUST_NOT);
               }
          -    bf.add(new FilterClause(orFilter,BooleanClause.Occur.MUST));
          -    return bf;
          +    bq.add(orQuery,BooleanClause.Occur.MUST);
          +    return bq;
             }
          

          After this I experimented for a bit, flattening out one level of BooleanQuery, commenting out the ConstantScoreQuery wrapper, and adding output to stderr of the generated query. None of this fixed things, although the test behavior did change somewhat, which was another worrisome sign. The failure at ManifoldCFSecurityFilterTest.java line 100 still occurred just the same, however.

          This is as far as I've gotten so far. If you need a more condensed test case, I'm happy to try and produce one, but I am unfamiliar with the Lucene-side testing infrastructure so I'd need to be pointed at an example to emulate.

          Show
          Karl Wright added a comment - Apologies for not finishing this ticket promptly, but minutes after creating it I lost internet and phone service for seven hours. Some notes: (1) The filter-based plugin passes the tests, the query-based one fails two out of the four tests. I've been concentrating initially on the failure that comes from ManifoldCFSecurityFilterTest.java line 100. Here's the error: [junit] ------------- Standard Error ----------------- [junit] +((-allow_token_share:* -deny_token_share:*) allow_token_share:token1 -deny_token_share:token1) +((-allow_token_document:* -deny_token_document:*) allow_token_document:token1 -deny_token_document:token1) [junit] 22/09/2011 08:26:50 ? org.apache.solr.SolrTestCaseJ4 assertQ [junit] SEVERE: REQUEST FAILED: xpath= //*[@numFound='3'] [junit] xml response was: <?xml version= "1.0" encoding= "UTF-8" ?> [junit] <response> [junit] <lst name= "responseHeader" >< int name= "status" >0</ int >< int name= "QTime" >116</ int ><lst name= "params" ><str name= "echoParams" >all</str><str name= "fl" >id</str><str name= "q" >*:*</str><str name= "qt" >/mcf</str><str name= "UserTokens" >token1</str><str name= "mcf" > true </str></lst></lst><result name= "response" numFound= "0" start= "0" ></result> [junit] </response> This output includes the actual query I am generating for this test, which looks correct to me. (2) The test indexes 5 documents into four access token fields. These documents are summarized below. // | share | document // |--------------|-------------- // | allow | deny | allow | deny // ------------+-------+------+-------+------ // da12 | | | 1, 2 | // ------------+-------+------+-------+------ // da13-dd3 | | | 1,3 | 3 // ------------+-------+------+-------+------ // sa123-sd13 | 1,2,3 | 1, 3 | | // ------------+-------+------+-------+------ // sa3-sd1-da23| 3 | 1 | 2,3 | // ------------+-------+------+-------+------ // notoken | | | | // ------------+-------+------+-------+------ (2) Initially the query-based plugin directly corresponded to the filter based one. Still failed in exactly the same way. Here's the diff between the filter-based one and the initial query-based one: Index: src/java/org/apache/solr/mcf/ManifoldCFSecurityFilter.java =================================================================== --- src/java/org/apache/solr/mcf/ManifoldCFSecurityFilter.java (revision 1173895) +++ src/java/org/apache/solr/mcf/ManifoldCFSecurityFilter.java (working copy) @@ -150,7 +150,8 @@ userAccessTokens = getAccessTokens(authenticatedUserName); } - BooleanFilter bf = new BooleanFilter(); + BooleanQuery bq = new BooleanQuery(); + //bf.setMaxClauseCount(100000); if (userAccessTokens.size() == 0) { @@ -159,28 +160,26 @@ // (fieldAllowShare is empty AND fieldDenyShare is empty AND fieldAllowDocument is empty AND fieldDenyDocument is empty) // We're trying to map to: -(fieldAllowShare:*) , which should be pretty efficient in Solr because it is negated. If this turns out not to be so, then we should // have the SolrConnector inject a special token into these fields when they otherwise would be empty, and we can trivially match on that token. - bf.add( new FilterClause( new QueryWrapperFilter( new WildcardQuery( new Term(fieldAllowShare, "*" ))),BooleanClause.Occur.MUST_NOT)); - bf.add( new FilterClause( new QueryWrapperFilter( new WildcardQuery( new Term(fieldDenyShare, "*" ))),BooleanClause.Occur.MUST_NOT)); - bf.add( new FilterClause( new QueryWrapperFilter( new WildcardQuery( new Term(fieldAllowDocument, "*" ))),BooleanClause.Occur.MUST_NOT)); - bf.add( new FilterClause( new QueryWrapperFilter( new WildcardQuery( new Term(fieldDenyDocument, "*" ))),BooleanClause.Occur.MUST_NOT)); + bq.add( new WildcardQuery( new Term(fieldAllowShare, "*" )),BooleanClause.Occur.MUST_NOT); + bq.add( new WildcardQuery( new Term(fieldDenyShare, "*" )),BooleanClause.Occur.MUST_NOT); + bq.add( new WildcardQuery( new Term(fieldAllowDocument, "*" )),BooleanClause.Occur.MUST_NOT); + bq.add( new WildcardQuery( new Term(fieldDenyDocument, "*" )),BooleanClause.Occur.MUST_NOT); } else { // Extend the query appropriately for each user access token. - bf.add( new FilterClause(calculateCompleteSubfilter(fieldAllowShare,fieldDenyShare,userAccessTokens),BooleanClause.Occur.MUST)); - bf.add( new FilterClause(calculateCompleteSubfilter(fieldAllowDocument,fieldDenyDocument,userAccessTokens),BooleanClause.Occur.MUST)); + bq.add(calculateCompleteSubquery(fieldAllowShare,fieldDenyShare,userAccessTokens),BooleanClause.Occur.MUST); + bq.add(calculateCompleteSubquery(fieldAllowDocument,fieldDenyDocument,userAccessTokens),BooleanClause.Occur.MUST); } // Concatenate with the user's original query. - //FilteredQuery query = new FilteredQuery(rb.getQuery(),bf); - //rb.setQuery(query); List<Query> list = rb.getFilters(); if (list == null ) { list = new ArrayList<Query>(); rb.setFilters(list); } - list.add( new ConstantScoreQuery(bf)); + list.add( new ConstantScoreQuery(bq)); } @Override @@ -193,28 +192,27 @@ * ((fieldAllowShare is empty AND fieldDenyShare is empty) OR fieldAllowShare HAS token1 OR fieldAllowShare HAS token2 ...) * AND fieldDenyShare DOESN'T_HAVE token1 AND fieldDenyShare DOESN'T_HAVE token2 ... */ - protected Filter calculateCompleteSubfilter( String allowField, String denyField, List< String > userAccessTokens) + protected Query calculateCompleteSubquery( String allowField, String denyField, List< String > userAccessTokens) { - BooleanFilter bf = new BooleanFilter(); + BooleanQuery bq = new BooleanQuery(); + bq.setMaxClauseCount(1000000); // Add a clause for each token. This will be added directly to the main filter (as a deny test), as well as to an OR's subclause (as an allow test). - BooleanFilter orFilter = new BooleanFilter(); + BooleanQuery orQuery = new BooleanQuery(); + orQuery.setMaxClauseCount(1000000); + // Add the empty-acl case - BooleanFilter subUnprotectedClause = new BooleanFilter(); - subUnprotectedClause.add( new FilterClause( new QueryWrapperFilter( new WildcardQuery( new Term(allowField, "*" ))),BooleanClause.Occur.MUST_NOT)); - subUnprotectedClause.add( new FilterClause( new QueryWrapperFilter( new WildcardQuery( new Term(denyField, "*" ))),BooleanClause.Occur.MUST_NOT)); - orFilter.add( new FilterClause(subUnprotectedClause,BooleanClause.Occur.SHOULD)); + BooleanQuery subUnprotectedClause = new BooleanQuery(); + subUnprotectedClause.add( new WildcardQuery( new Term(allowField, "*" )),BooleanClause.Occur.MUST_NOT); + subUnprotectedClause.add( new WildcardQuery( new Term(denyField, "*" )),BooleanClause.Occur.MUST_NOT); + orQuery.add(subUnprotectedClause,BooleanClause.Occur.SHOULD); for ( String accessToken : userAccessTokens) { - TermsFilter tf = new TermsFilter(); - tf.addTerm( new Term(allowField,accessToken)); - orFilter.add( new FilterClause(tf,BooleanClause.Occur.SHOULD)); - tf = new TermsFilter(); - tf.addTerm( new Term(denyField,accessToken)); - bf.add( new FilterClause(tf,BooleanClause.Occur.MUST_NOT)); + orQuery.add( new TermQuery( new Term(allowField,accessToken)),BooleanClause.Occur.SHOULD); + bq.add( new TermQuery( new Term(denyField,accessToken)),BooleanClause.Occur.MUST_NOT); } - bf.add( new FilterClause(orFilter,BooleanClause.Occur.MUST)); - return bf; + bq.add(orQuery,BooleanClause.Occur.MUST); + return bq; } After this I experimented for a bit, flattening out one level of BooleanQuery, commenting out the ConstantScoreQuery wrapper, and adding output to stderr of the generated query. None of this fixed things, although the test behavior did change somewhat, which was another worrisome sign. The failure at ManifoldCFSecurityFilterTest.java line 100 still occurred just the same, however. This is as far as I've gotten so far. If you need a more condensed test case, I'm happy to try and produce one, but I am unfamiliar with the Lucene-side testing infrastructure so I'd need to be pointed at an example to emulate.
          Hide
          Michael McCandless added a comment -

          This doesn't sound good!

          Maybe you could boil it down to Lucene only test? Maybe use src/test/org/apache/lucene/search/TestBooleanQuery.java as an example?

          Show
          Michael McCandless added a comment - This doesn't sound good! Maybe you could boil it down to Lucene only test? Maybe use src/test/org/apache/lucene/search/TestBooleanQuery.java as an example?
          Hide
          Uwe Schindler added a comment -

          From your example patch, i see only MUST_NOT clauses. This cannot work.

          As I was working on BooleanFilter yesterday, I know that BooleanFilter has a hack to support pure-negative queries. BooleanQuery with only negative clauses always returns no results (I was thinking about removing the stupid hack in BF, because it cannot correctly handle deleted docs.... - its still there because backwards compatibility)

          Show
          Uwe Schindler added a comment - From your example patch, i see only MUST_NOT clauses. This cannot work. As I was working on BooleanFilter yesterday, I know that BooleanFilter has a hack to support pure-negative queries. BooleanQuery with only negative clauses always returns no results (I was thinking about removing the stupid hack in BF, because it cannot correctly handle deleted docs.... - its still there because backwards compatibility)
          Hide
          Karl Wright added a comment - - edited

          From your example patch, i see only MUST_NOT clauses. This cannot work.

          I agree that the first clause must have a MatchAllQuery added, but that's not the failure I'm looking at. The case in question involves the SECOND clause:

                 // Extend the query appropriately for each user access token.
          -      bf.add(new FilterClause(calculateCompleteSubfilter(fieldAllowShare,fieldDenyShare,userAccessTokens),BooleanClause.Occur.MUST));
          -      bf.add(new FilterClause(calculateCompleteSubfilter(fieldAllowDocument,fieldDenyDocument,userAccessTokens),BooleanClause.Occur.MUST));
          +      bq.add(calculateCompleteSubquery(fieldAllowShare,fieldDenyShare,userAccessTokens),BooleanClause.Occur.MUST);
          +      bq.add(calculateCompleteSubquery(fieldAllowDocument,fieldDenyDocument,userAccessTokens),BooleanClause.Occur.MUST);
          

          The first clause is used only when a user has no tokens, which has a test but is not the one I've singled out.

          The actual query being generated in this case is:

              [junit] +((-allow_token_share:* -deny_token_share:*) allow_token_share:token1 -deny_token_share:token1) +((-allow_token_document:* -deny_token_document:*) allow_token_document:token1 -deny_token_document:token1)
          
          

          Does this look like it is constructed incorrectly to you? Or are you saying that the subclause "(-allow_token_share:* -deny_token_share:*)" has the same problem? I'll experiment with this and see what I get...

          Show
          Karl Wright added a comment - - edited From your example patch, i see only MUST_NOT clauses. This cannot work. I agree that the first clause must have a MatchAllQuery added, but that's not the failure I'm looking at. The case in question involves the SECOND clause: // Extend the query appropriately for each user access token. - bf.add( new FilterClause(calculateCompleteSubfilter(fieldAllowShare,fieldDenyShare,userAccessTokens),BooleanClause.Occur.MUST)); - bf.add( new FilterClause(calculateCompleteSubfilter(fieldAllowDocument,fieldDenyDocument,userAccessTokens),BooleanClause.Occur.MUST)); + bq.add(calculateCompleteSubquery(fieldAllowShare,fieldDenyShare,userAccessTokens),BooleanClause.Occur.MUST); + bq.add(calculateCompleteSubquery(fieldAllowDocument,fieldDenyDocument,userAccessTokens),BooleanClause.Occur.MUST); The first clause is used only when a user has no tokens, which has a test but is not the one I've singled out. The actual query being generated in this case is: [junit] +((-allow_token_share:* -deny_token_share:*) allow_token_share:token1 -deny_token_share:token1) +((-allow_token_document:* -deny_token_document:*) allow_token_document:token1 -deny_token_document:token1) Does this look like it is constructed incorrectly to you? Or are you saying that the subclause "(-allow_token_share:* -deny_token_share:*)" has the same problem? I'll experiment with this and see what I get...
          Hide
          Uwe Schindler added a comment -

          I opened LUCENE-3451 about the stupid BooleanFilter hack that is inconsistent and confuses people.

          Show
          Uwe Schindler added a comment - I opened LUCENE-3451 about the stupid BooleanFilter hack that is inconsistent and confuses people.
          Uwe Schindler made changes -
          Link This issue is superceded by LUCENE-3451 [ LUCENE-3451 ]
          Hide
          Karl Wright added a comment -

          Yup, adding MatchAllDocsQuery() in the appropriate places makes filters and queries consistent again, and the tests pass.

          Show
          Karl Wright added a comment - Yup, adding MatchAllDocsQuery() in the appropriate places makes filters and queries consistent again, and the tests pass.
          Karl Wright made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Not A Problem [ 8 ]

            People

            • Assignee:
              Unassigned
              Reporter:
              Karl Wright
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development