Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      I think it would be useful to add a way to exclude field from the Solr response. If I have for example 100 stored fields and I want to return all of them but one, it would be handy to list just the field I want to exclude instead of the 99 fields for inclusion through fl.

      1. SOLR-3191.patch
        67 kB
        Andrea Gazzarini
      2. SOLR-3191.patch
        67 kB
        Andrea Gazzarini
      3. SOLR-3191.patch
        55 kB
        Kuntal Ganguly

        Activity

        Hide
        Jan Høydahl added a comment -

        Kuntal Ganguly, there is a problem with your last patch from august, it does not include the Expectation interface.

        Show
        Jan Høydahl added a comment - Kuntal Ganguly , there is a problem with your last patch from august, it does not include the Expectation interface.
        Hide
        Andrea Gazzarini added a comment -

        Hi Jan,
        Yes but with an old version of Solr (4.4.0 if I remember well) and I don't
        use all of the stuff offered by the patch (e.g. exclusion globs, exclusion
        with aliasing) just a simple fl=-huge_field which was exactly the
        reason why at that time I started working on this issue

        Show
        Andrea Gazzarini added a comment - Hi Jan, Yes but with an old version of Solr (4.4.0 if I remember well) and I don't use all of the stuff offered by the patch (e.g. exclusion globs, exclusion with aliasing) just a simple fl=-huge_field which was exactly the reason why at that time I started working on this issue
        Hide
        Jan Høydahl added a comment -

        Anyone have been using the patch in production?

        Show
        Jan Høydahl added a comment - Anyone have been using the patch in production?
        Hide
        Roman Kliewer added a comment -

        I think this feature should be higher prioritized since the atomic updates do require all fields stored.

        In my case there are millions of documents and i do need to frequently update the ACL field, so storing of the default search field is required. This causes of course a much slower search because the default search field is returned every time since it can not be excluded and almost all other fields are dynamic.

        IMHO the absence of this feature renders atomic update feature completely unusable.

        Show
        Roman Kliewer added a comment - I think this feature should be higher prioritized since the atomic updates do require all fields stored . In my case there are millions of documents and i do need to frequently update the ACL field, so storing of the default search field is required. This causes of course a much slower search because the default search field is returned every time since it can not be excluded and almost all other fields are dynamic. IMHO the absence of this feature renders atomic update feature completely unusable.
        Hide
        Andrea Gazzarini added a comment -

        Hi Jan,
        if you refer to my long comment above, the two last points are questions
        about the granularity of unit tests, so nothing important, from a
        functional point of view.

        Show
        Andrea Gazzarini added a comment - Hi Jan, if you refer to my long comment above, the two last points are questions about the granularity of unit tests, so nothing important, from a functional point of view.
        Hide
        Jan Høydahl added a comment -

        Are there known issues with this patch? I see a few TODO's

        Show
        Jan Høydahl added a comment - Are there known issues with this patch? I see a few TODO's
        Kuntal Ganguly made changes -
        Attachment SOLR-3191.patch [ 12664348 ]
        Hide
        Kuntal Ganguly added a comment - - edited

        Updated the existing patch for working with custom transformer like - [mytrans5 jf=pinprojectid if=type:Projects mf=project_accessibilitytype] , that was working properly without this patch. Updated the onTransformer method in-order to make transformer( inbuilt, alias or custom) work properly. Use-case for the above transformer is mentioned above in my previous comment section.
        Also added two test cases for it.

        Show
        Kuntal Ganguly added a comment - - edited Updated the existing patch for working with custom transformer like - [mytrans5 jf=pinprojectid if=type:Projects mf=project_accessibilitytype] , that was working properly without this patch. Updated the onTransformer method in-order to make transformer( inbuilt, alias or custom) work properly. Use-case for the above transformer is mentioned above in my previous comment section. Also added two test cases for it.
        Hide
        Kuntal Ganguly added a comment -

        Erik Hatcher Yeah i will upload the patch along with the test cases for transformer fix tomorrow.

        Show
        Kuntal Ganguly added a comment - Erik Hatcher Yeah i will upload the patch along with the test cases for transformer fix tomorrow.
        Hide
        Andrea Gazzarini added a comment -

        I meant "additional test methods"

        Show
        Andrea Gazzarini added a comment - I meant "additional test methods"
        Hide
        Andrea Gazzarini added a comment -

        Hi guys, I created that patch some time ago, if you need something I will
        be happy to give a help.

        In my comments above you can find all tests involved with the fl
        exclusion, I think additional tests should go there.

        Andrea

        Show
        Andrea Gazzarini added a comment - Hi guys, I created that patch some time ago, if you need something I will be happy to give a help. In my comments above you can find all tests involved with the fl exclusion, I think additional tests should go there. Andrea
        Hide
        Erik Hatcher added a comment -

        Kuntal Ganguly - could you attach a full patch against Solr trunk (or branch_4x) so that the latest success working code is available easily? Thanks! Test cases with a custom transformer would be great if you could add those as well. I have interest in this issue, but it looks a bit hairy to review, so best if the latest is consolidated.

        Show
        Erik Hatcher added a comment - Kuntal Ganguly - could you attach a full patch against Solr trunk (or branch_4x) so that the latest success working code is available easily? Thanks! Test cases with a custom transformer would be great if you could add those as well. I have interest in this issue, but it looks a bit hairy to review, so best if the latest is consolidated.
        Hide
        Kuntal Ganguly added a comment -

        I modify the code to support inbuilt alias transformer as well as custom transformer:

        Modify the method:

        /**

        • Logic to handle Custom as well as inbuilt Doc Transformer.
        • */

        void onTransformer(final StringBuilder expressionBuffer, final SolrQueryRequest request, final DocTransformers augmenters)
        throws SyntaxError {
        if(expressionBuffer!=null){
        String fl_Content=expressionBuffer.toString();
        if(fl_Content.contains("[") && fl_Content.contains("]")){

        if(fl_Content.indexOf("[")==0){
        //Logic to Parse Custom Transformers

        final Map<String,String> augmenterCustomArgs = new HashMap<String,String>();
        QueryParsing.parseLocalParams(expressionBuffer.toString(), 0, augmenterCustomArgs, request.getParams(), "[", CLOSE_BRACKET);
        final String augmenterCustomName = augmenterCustomArgs.remove("type");
        final String customDisp='['+augmenterCustomName+']';
        final TransformerFactory customFactory = request.getCore().getTransformerFactory(augmenterCustomName);
        if( customFactory != null )

        { MapSolrParams augmenterCustomParams = new MapSolrParams( augmenterCustomArgs ); augmenters.addTransformer( customFactory.create(customDisp, augmenterCustomParams, request) ); }

        }else if(fl_Content.indexOf("[")>0 && fl_Content.contains(":[")){
        //Logic to Parse In_Built Transformers

        final String alias = getExpressionAlias(expressionBuffer);
        final String transfomerExpression = getExpressionValue(expressionBuffer);
        final Map<String,String> augmenterArgs = new HashMap<String,String>();
        QueryParsing.parseLocalParams(transfomerExpression, 0, augmenterArgs, request.getParams(), "[", CLOSE_BRACKET);
        final String augmenterName = augmenterArgs.remove("type");
        final String aliasThatWillBeUsed = (alias != null) ? alias : OPEN_BRACKET + augmenterName + CLOSE_BRACKET;
        final TransformerFactory factory = request.getCore().getTransformerFactory(augmenterName);
        if (factory != null)

        { augmenters.addTransformer(factory.create(aliasThatWillBeUsed, new MapSolrParams(augmenterArgs), request)); onInclusionLiteralExpression(expressionBuffer, augmenters, false, false); }

        }

        }

        }
        }

        Show
        Kuntal Ganguly added a comment - I modify the code to support inbuilt alias transformer as well as custom transformer: Modify the method: /** Logic to handle Custom as well as inbuilt Doc Transformer. */ void onTransformer(final StringBuilder expressionBuffer, final SolrQueryRequest request, final DocTransformers augmenters) throws SyntaxError { if(expressionBuffer!=null){ String fl_Content=expressionBuffer.toString(); if(fl_Content.contains(" [") && fl_Content.contains("] ")){ if(fl_Content.indexOf("[")==0){ //Logic to Parse Custom Transformers final Map<String,String> augmenterCustomArgs = new HashMap<String,String>(); QueryParsing.parseLocalParams(expressionBuffer.toString(), 0, augmenterCustomArgs, request.getParams(), "[", CLOSE_BRACKET); final String augmenterCustomName = augmenterCustomArgs.remove("type"); final String customDisp=' ['+augmenterCustomName+'] '; final TransformerFactory customFactory = request.getCore().getTransformerFactory(augmenterCustomName); if( customFactory != null ) { MapSolrParams augmenterCustomParams = new MapSolrParams( augmenterCustomArgs ); augmenters.addTransformer( customFactory.create(customDisp, augmenterCustomParams, request) ); } }else if(fl_Content.indexOf("[")>0 && fl_Content.contains(":[")){ //Logic to Parse In_Built Transformers final String alias = getExpressionAlias(expressionBuffer); final String transfomerExpression = getExpressionValue(expressionBuffer); final Map<String,String> augmenterArgs = new HashMap<String,String>(); QueryParsing.parseLocalParams(transfomerExpression, 0, augmenterArgs, request.getParams(), "[", CLOSE_BRACKET); final String augmenterName = augmenterArgs.remove("type"); final String aliasThatWillBeUsed = (alias != null) ? alias : OPEN_BRACKET + augmenterName + CLOSE_BRACKET; final TransformerFactory factory = request.getCore().getTransformerFactory(augmenterName); if (factory != null) { augmenters.addTransformer(factory.create(aliasThatWillBeUsed, new MapSolrParams(augmenterArgs), request)); onInclusionLiteralExpression(expressionBuffer, augmenters, false, false); } } } } }
        Hide
        Kuntal Ganguly added a comment -

        I have tried this patch on Solr 4.2.1 and 4.9, but this patch is not working with custom doc transformer.

        I have a custom transformer that does the following:

        The Custom transformer for joining between two entities and merging fields from second entity into the response of 1st entity is deployed in both local and staging boxes.

        Usecase:

        Say Entity-1 type=ProjectDocument and Entity-2 type=Projects.Now we want to search on certain fields of Entity-1 (document_createbyname),
        then join with Entity-2 based on common field/fields between two entities (pinprojectid)and finally fetch and merge field/fields of Entity-2(projecttype) into the response of Entity-1.

        Sample Query:

        Local::
        http://10.10.87.91:4983/solr/collection1/select?q=type:ProjectDocument AND document_createbyname:"Sudipta Sarkar"
        &fl=*,[mytrans5 jf=pinprojectid if=type:Projects mf=projecttype]

        Staging::
        http://10.1.1.230:6983/solr/collection1/select?q=type:ProjectDocument AND document_documenttitle:testing&&fl=*,[mytrans5 jf=pinprojectid if=type:Projects mf=pwaccountid,projectaddress2,projecttype,projectenddate,project_city]

        Details of transformer

        Transformer name->mytrans5
        It has three parameter:

        1. jf (Joining Field- Mandatory)= can be single or multi-value separated by comma. This should contain common fields between two entities.

        2. mf (Merge Field- Mandatory)= can be single or multi-value separated by comma. This fields will be fetch from 2nd entity and merge with 1st entity response.

        3. if (Identity Field- Optional)= single value. This fields specify the second entity uniquely from 1st entity (say type:Projects). If this fields is not provided during query,this will then read default value from configuration file. If default value is also missing,then it will take the first value from merge field parameter.

        [mytrans5 jf=pinprojectid,pwaccountid if=type:Projects mf=projecttype,projectenddate,project_city]

        Can you suggest me what to do? Because the transformer is working fine without this patch properly as expected.

        Show
        Kuntal Ganguly added a comment - I have tried this patch on Solr 4.2.1 and 4.9, but this patch is not working with custom doc transformer. I have a custom transformer that does the following: The Custom transformer for joining between two entities and merging fields from second entity into the response of 1st entity is deployed in both local and staging boxes. Usecase: Say Entity-1 type=ProjectDocument and Entity-2 type=Projects.Now we want to search on certain fields of Entity-1 (document_createbyname), then join with Entity-2 based on common field/fields between two entities (pinprojectid)and finally fetch and merge field/fields of Entity-2(projecttype) into the response of Entity-1. Sample Query: Local:: http://10.10.87.91:4983/solr/collection1/select?q=type:ProjectDocument AND document_createbyname:"Sudipta Sarkar" &fl=*, [mytrans5 jf=pinprojectid if=type:Projects mf=projecttype] Staging:: http://10.1.1.230:6983/solr/collection1/select?q=type:ProjectDocument AND document_documenttitle:testing&&fl=*, [mytrans5 jf=pinprojectid if=type:Projects mf=pwaccountid,projectaddress2,projecttype,projectenddate,project_city] Details of transformer Transformer name->mytrans5 It has three parameter: 1. jf (Joining Field- Mandatory)= can be single or multi-value separated by comma. This should contain common fields between two entities. 2. mf (Merge Field- Mandatory)= can be single or multi-value separated by comma. This fields will be fetch from 2nd entity and merge with 1st entity response. 3. if (Identity Field- Optional)= single value. This fields specify the second entity uniquely from 1st entity (say type:Projects). If this fields is not provided during query,this will then read default value from configuration file. If default value is also missing,then it will take the first value from merge field parameter. [mytrans5 jf=pinprojectid,pwaccountid if=type:Projects mf=projecttype,projectenddate,project_city] Can you suggest me what to do? Because the transformer is working fine without this patch properly as expected.
        Shalin Shekhar Mangar made changes -
        Assignee Shalin Shekhar Mangar [ shalinmangar ]
        Hide
        Shalin Shekhar Mangar added a comment -

        I don't have time right now to review this. I assigned it to myself because there was a lot of public interest but no assignee. However it looks like a couple of other committers have interest in this issue as well. I can only look at this after a few weeks so if no one takes it up, then I will.

        Show
        Shalin Shekhar Mangar added a comment - I don't have time right now to review this. I assigned it to myself because there was a lot of public interest but no assignee. However it looks like a couple of other committers have interest in this issue as well. I can only look at this after a few weeks so if no one takes it up, then I will.
        Andrea Gazzarini made changes -
        Attachment SOLR-3191.patch [ 12617644 ]
        Hide
        Andrea Gazzarini added a comment -

        New version of the patch – ReturnFields without constants.

        Show
        Andrea Gazzarini added a comment - New version of the patch – ReturnFields without constants.
        Hide
        Andrea Gazzarini added a comment -

        Heh - I see my use of "interface" was ambiguous.

        Heheh...and in addition there's my wonderful english.

        I understand what you mean and I agree. The only problem I see if we want to remove ReturnFields is that it is referenced 63 times (moreless) in source code so the patch would have a more delicate impact (at the moment the patch changed just 3 isolated files).

        Otherwise, from an OO perspective, what Ryan Ernst said is right so we could just remove the constants in the abstract class.

        Show
        Andrea Gazzarini added a comment - Heh - I see my use of "interface" was ambiguous. Heheh...and in addition there's my wonderful english. I understand what you mean and I agree. The only problem I see if we want to remove ReturnFields is that it is referenced 63 times (moreless) in source code so the patch would have a more delicate impact (at the moment the patch changed just 3 isolated files). Otherwise, from an OO perspective, what Ryan Ernst said is right so we could just remove the constants in the abstract class.
        Hide
        Yonik Seeley added a comment -

        I agree with you regarding the abstract class: ReturnFields should be an interface;

        Heh - I see my use of "interface" was ambiguous. I actually meant the higher level text (http, etc) interface to Solr is important, not the java level. It's that interface that really counts... being able to pass fl=foo, _important, -bar, -_big
        So I was questioning why there were even two classes, rather than just a single one. If a super-expert user wants to make their own customized version, they can copy-n-paste the code, but it's so far into the outliers we shouldn't be designing around it.

        Show
        Yonik Seeley added a comment - I agree with you regarding the abstract class: ReturnFields should be an interface; Heh - I see my use of "interface" was ambiguous. I actually meant the higher level text (http, etc) interface to Solr is important, not the java level. It's that interface that really counts... being able to pass fl=foo, _important, -bar, - _big So I was questioning why there were even two classes, rather than just a single one. If a super-expert user wants to make their own customized version, they can copy-n-paste the code, but it's so far into the outliers we shouldn't be designing around it.
        Hide
        Ryan Ernst added a comment -

        a) ReturnFields as interface

        ReturnFields is an abstract class so that changes to the interface can not break concrete implementations. It was originally added in SOLR-4226.

        b) Constants removed and put in SolrReturnFields as Robert suggested?

        Yes Robert Muir is right that there shouldn't be constants in the ReturnFields. Leave that up to the concrete implementations. I also think making it pluggable would be very nice, but should be a separate issue.

        Show
        Ryan Ernst added a comment - a) ReturnFields as interface ReturnFields is an abstract class so that changes to the interface can not break concrete implementations. It was originally added in SOLR-4226 . b) Constants removed and put in SolrReturnFields as Robert suggested? Yes Robert Muir is right that there shouldn't be constants in the ReturnFields. Leave that up to the concrete implementations. I also think making it pluggable would be very nice, but should be a separate issue.
        Hide
        Andrea Gazzarini added a comment -

        Thank you very much Yonik, very happy to hear that.

        I agree with you regarding the abstract class: ReturnFields should be an interface; I left it as a class because I don't know the past behind the code so my concern was to respect and preserve as much as possible what others did (not for SolrReturnFields but here there were a lot if things that required changes)

        So, do you want me to update the patch with

        a) ReturnFields as interface
        b) Constants removed and put in SolrReturnFields as Robert suggested?

        Show
        Andrea Gazzarini added a comment - Thank you very much Yonik, very happy to hear that. I agree with you regarding the abstract class: ReturnFields should be an interface; I left it as a class because I don't know the past behind the code so my concern was to respect and preserve as much as possible what others did (not for SolrReturnFields but here there were a lot if things that required changes) So, do you want me to update the patch with a) ReturnFields as interface b) Constants removed and put in SolrReturnFields as Robert suggested?
        Hide
        Yonik Seeley added a comment -

        I like the direction you're going Adrea!
        Frankly, I don't understand why there is an abstract class at all, or why we need plugabillity here at all. This should really be about interface and making the simple things simple. The edge cases like fields starting with a "-" can be handled as edge cases (via field() function or whatever) and should not dominate the design here.

        Show
        Yonik Seeley added a comment - I like the direction you're going Adrea! Frankly, I don't understand why there is an abstract class at all, or why we need plugabillity here at all. This should really be about interface and making the simple things simple. The edge cases like fields starting with a "-" can be handled as edge cases (via field() function or whatever) and should not dominate the design here.
        Hide
        Andrea Gazzarini added a comment -

        > the abstract ReturnFields should not have constants. its abstract, it has no syntax. that stuff needs to be moved to an implementation class.

        No problem at all, I can reattach the patch

        > Separately, SolrReturnFields.java i think has already reached the limit where its totally un-understandable. The original discussion on this issue talked about having a separate implementation, but it seems instead it was all piled into ReturnFields.

        Sorry I didn't find those assumptions in this issue. It's all into (Solr)ReturnFields class because all the described logic is there. I can separate the parser (which from "some" perspectives I could agree with you, inner classes complicates the reading) and the builder logic if you think is better.

        > I think we should make the ReturnFields implementation something that can be plugged into the solrconfig without writing java code and provide alternatives (for example: one that is simple and where the code can be understood)

        If things need to be done in this way I can try to do that. Again, I didn't because in this issue there's no trace about that "pluggabilty". Also please clarify me about what you consider understandable so I can follow that direction.

        Show
        Andrea Gazzarini added a comment - > the abstract ReturnFields should not have constants. its abstract, it has no syntax. that stuff needs to be moved to an implementation class. No problem at all, I can reattach the patch > Separately, SolrReturnFields.java i think has already reached the limit where its totally un-understandable. The original discussion on this issue talked about having a separate implementation, but it seems instead it was all piled into ReturnFields. Sorry I didn't find those assumptions in this issue. It's all into (Solr)ReturnFields class because all the described logic is there. I can separate the parser (which from "some" perspectives I could agree with you, inner classes complicates the reading) and the builder logic if you think is better. > I think we should make the ReturnFields implementation something that can be plugged into the solrconfig without writing java code and provide alternatives (for example: one that is simple and where the code can be understood) If things need to be done in this way I can try to do that. Again, I didn't because in this issue there's no trace about that "pluggabilty". Also please clarify me about what you consider understandable so I can follow that direction.
        Hide
        Robert Muir added a comment -

        the abstract ReturnFields should not have constants. its abstract, it has no syntax. that stuff needs to be moved to an implementation class.

        Separately, SolrReturnFields.java i think has already reached the limit where its totally un-understandable. The original discussion on this issue talked about having a separate implementation, but it seems instead it was all piled into ReturnFields.

        What to do about fields that actually start with "-" ? Remember, not everyone uses the lucene query parser. The assumptions listed on the beginning of this issue about that are totally wrong.

        I think we should make the ReturnFields implementation something that can be plugged into the solrconfig without writing java code and provide alternatives (for example: one that is simple and where the code can be understood)

        Show
        Robert Muir added a comment - the abstract ReturnFields should not have constants. its abstract, it has no syntax. that stuff needs to be moved to an implementation class. Separately, SolrReturnFields.java i think has already reached the limit where its totally un-understandable. The original discussion on this issue talked about having a separate implementation, but it seems instead it was all piled into ReturnFields. What to do about fields that actually start with "-" ? Remember, not everyone uses the lucene query parser. The assumptions listed on the beginning of this issue about that are totally wrong. I think we should make the ReturnFields implementation something that can be plugged into the solrconfig without writing java code and provide alternatives (for example: one that is simple and where the code can be understood)
        Shalin Shekhar Mangar made changes -
        Assignee Shalin Shekhar Mangar [ shalinmangar ]
        Andrea Gazzarini made changes -
        Field Original Value New Value
        Attachment SOLR-3191.patch [ 12615801 ]
        Hide
        Andrea Gazzarini added a comment -

        Patch for field exclusion in fl

        Show
        Andrea Gazzarini added a comment - Patch for field exclusion in fl
        Hide
        Andrea Gazzarini added a comment -

        Hi all, I attached a patch for the fl exclusion feature.
        Sorry for the very long post and for delay. I will try to be as short as possible, there are a lot of things that need to be considered but in general I would say that the whole stuff is easier to try than to explain.

        Token types

        Lets start with token types. The following tokens are supported (just a short list, each type is subsequently explained):

        • literal inclusion: name, id, title, subject (can be aliased like alias:fieldname);
        • inclusion glob: na*, n?m*, *me (cannot be aliased because the expression could match more than one fields);
        • *: as before, that means "all real fields": internally is managed as a special case of inclusion glob;
        • literal exclusion: -name, -id, -title (cannot be aliased, doesn't make sense)
        • exclusion glob: -na*, -n?m*, -*me (cannot be aliased, doesn't make sense); as special case -* is ignored.
        • transformers: [explain], [docid], [shard] (can be aliased like alias:[docid])
        • functions: sum(1,1), "literal value" (can be aliased like alias:sum(1,1), myalias:"literal value")
        Literal inclusion

        A literal inclusion declares a (real) field that must be returned in response. Supports aliasing (myalias:fieldname)
        Examples:

        • name
        • this.is.a.valid.field (see ReturnFieldsTest.testDotInFieldName)
        • this-is-a-valid-field (see ReturnFieldsTest.testHyphenInFieldName)
        • this$is$a$valid$field (see ReturnFieldsTest.testDollarInFieldName)
        • #foo_s,fl (see ReturnFieldsTest.testFunkyFieldNames)
        Inclusion glob

        An inclusion glob declares an expression using * and / or ?. Fields matching that expression will be returned in response.
        Examples (see ReturnFieldsTest.testWildCards / testReturnAllFields / testReturnOnlyName)

        • na*
        • n?m?
        • n*
        • *me
        • *m?

        Aliases cannot be applied to inclusion globs because an expression could match more than one fields. The expression wont' be discarded, just the alias part. So for example

        fl=alias:na* will collect the na* inclusion glob

        Literal exclusion

        A literal exclusion declares a field that will be excluded in response. Basically it is a literal inclusion prefixed by -
        Examples (see ReturnFieldsTest.testReturnOnlyName / testReturnOnlyNameAndTitle / testReturnAllRealFieldsExceptName):

        • -name
        • -this.is.a.valid.field (see ReturnFieldsTest.testDotInFieldName)
        • -this-is-a-valid-field (see ReturnFieldsTest.testHyphenInFieldName)
        • -this$is$a$valid$field (see ReturnFieldsTest.testDollarInFieldName)
        • -#foo_s,-fl (see ReturnFieldsTest.testFunkyFieldNames)

        Aliasing is not supported so

        fl=-alias:fieldname

        is silently discarded. Note that in this case or in this one:

        fl=-a1:a -a2:b -a3:c

        all tokens are invalid so the expressions will be resolved in

        fl=

        and therefore standard procedure applies (see last testcase on ReturnFieldsTest.testReturnAllRealFields)

        exclusion glob

        An exclusion glob is an expression that uses * and / or ?. Fields matching that expression will be excluded from response.
        Examples (see ReturnFieldsTest.testReturnOnlyName / testReturnOnlyNameAndTitle / testReturnAllRealFieldsExceptName):

        • -na*
        • -n?m?
        • -n*
        • -*me
        • -*m?

        As literal exclusions, you cannot use aliases in these expressions, they will be silently discarded (see last testcase on ReturnFieldsTest.testReturnAllRealFields)

        transformers

        Just one important thing: invalid transformers are ignored.
        Examples (you can see all the assertions below in ReturnsFieldsTest.testTransformers):

        • fl=[explain] (one transformer)
        • fl=[docid] [explain] (two transformers)
        • fl=[shard] id (a transformer and a real field)
        • fl=[xxxxxx] (an invalid transformer. That will have the same effect of having an empty fl)
        • fl=[xxxxxx] [yyyyy] (two invalid transformers. That will have the same effect of having an empty fl)
        • fl=[docid] [shard] [xxxx] (the invalid transformer will be ignored so only [docid] and [shard] will be evaluated
        • fl=myalias:[docid] (aliased transformer)
        • fl=alias:[yyyyyy] (invalid aliased transformer - will be ignored, see last test on ReturnFieldsTest.testReturnAllRealFields)
        Functions (and literals)

        All functions described here http://wiki.apache.org/solr/FunctionQuery#Available_Functions
        can be declared in "fl" parameter. They can be aliased too.
        Examples (see ReturnFieldsTest.testFunctions)

        • fl=sum(1,1)
        • fl="this is a literal", 'this is another literal", 1.2
        • fl=pippo:"this is an aliased literal", pluto:'this is another aliased literal", paperino:1.2
        • fl= {!func}add($v1,$v2)&v1=10&v2=13.0
          - fl=alias:{!func}

          add($v1,$v2)&v1=10&v2=13.0

        WARNING: differently from transformers, in case of invalid function an exception will be thrown.

        Aliases

        Aliases are supported on

        • literal field names (e.g. alias:fieldname)
        • transformers (e.g. alias:[docid])
        • functions (e.g. alias:max(1,2))

        and are not supported on

        • inclusion globs (because expression could match multiple fields and alias in response must be uniquely associated to a field. The alias part of the token expression will be ignored, the glob expression collected)
        • literal exclusion (doesn't make sense, will be silently ignored)
        • exclusion globs (doesn't make sense, will be silently ignored)

        General rules

        • an inclusion (literal or glob) is ignored if in case of *.
          • fl=* name na* (return all fields, name and na* fields are already included)
          • fl=name * (same as before)
          • any exclusion token will be ignored if an inclusion has been defined before
          • fl=name -id (returns all fields except id so "name", with other real fields, is implicitly included)
        • an inclusion token will clear all exclusion
          • fl=-id name (means "returns only name", no matter what exclusions are defined before)
          • fl=-id *
          • the * expression is ignored in case of an exclusion (literal or glob)
          • * -name (the * doesn't make sense, or in other words is implicit: returns all fields except name)
          • -name * (same as before)
        • score, transformers and functions doesn't change the behaviour described above, they are just added to response
          • fl=score, *, name
          • fl=*, [docid], name, sum(1,1), myfunctionalias:"literal value"
          • fl=name -id [shard], exchange:1.34
        • an empty or null fl will execute as "*" (all real fields). Note that we can be in this situation if we pass an empty fl, no fl or if all tokens in fl are discarded because invalid.
        • some other rules like multiplicity on fl parameter (see ReturnFieldsTest.testManyParameters), token separator (comma or whitespaces - see ReturnFieldsTest.testWhitespace and ReturnFieldsTest.testSeparators) remain the same.

        Some (technical) consideration

        1. I'm not sure about the test methods granularity level: some test methods, although grouping the same concept, contain a lot of sub-cases / scenarios and sometimes is hard to capture the functional behaviour (see for example ReturnFieldsTest.testTransformers or ReturnFieldsTest.testAliases). Maybe doing a test method for each sub-case should be better? I modified the ReturnFieldsTest for adding additional test cases but I followed the existing approach...let me know what you think about a much fine level of granularity..in case I can try to make some refactoring in that direction

        2. There are two (at least it seems to me) classes that are testing ReturnFields behaviour : ReturnFieldsTest and TestPseudoReturnFields...maybe all those test methods could be part of one test case? I don't know the reason behind the choice of having two classes so I'm just asking, the TestPseudoReturnFields is still there and all tests are green

        Any feedback is warmly welcome.
        Andrea

        Show
        Andrea Gazzarini added a comment - Hi all, I attached a patch for the fl exclusion feature. Sorry for the very long post and for delay. I will try to be as short as possible, there are a lot of things that need to be considered but in general I would say that the whole stuff is easier to try than to explain. Token types Lets start with token types. The following tokens are supported (just a short list, each type is subsequently explained): literal inclusion: name, id, title, subject (can be aliased like alias:fieldname); inclusion glob: na*, n?m*, *me (cannot be aliased because the expression could match more than one fields); *: as before, that means "all real fields": internally is managed as a special case of inclusion glob; literal exclusion: -name, -id, -title (cannot be aliased, doesn't make sense) exclusion glob: -na*, -n?m*, -*me (cannot be aliased, doesn't make sense); as special case -* is ignored. transformers: [explain], [docid], [shard] (can be aliased like alias:[docid]) functions: sum(1,1), "literal value" (can be aliased like alias:sum(1,1), myalias:"literal value") Literal inclusion A literal inclusion declares a (real) field that must be returned in response. Supports aliasing (myalias:fieldname) Examples: name this.is.a.valid.field (see ReturnFieldsTest.testDotInFieldName) this-is-a-valid-field (see ReturnFieldsTest.testHyphenInFieldName) this$is$a$valid$field (see ReturnFieldsTest.testDollarInFieldName) #foo_s,fl (see ReturnFieldsTest.testFunkyFieldNames) Inclusion glob An inclusion glob declares an expression using * and / or ?. Fields matching that expression will be returned in response. Examples (see ReturnFieldsTest.testWildCards / testReturnAllFields / testReturnOnlyName) na* n?m? n* *me *m? Aliases cannot be applied to inclusion globs because an expression could match more than one fields. The expression wont' be discarded, just the alias part. So for example fl=alias:na* will collect the na* inclusion glob Literal exclusion A literal exclusion declares a field that will be excluded in response. Basically it is a literal inclusion prefixed by - Examples (see ReturnFieldsTest.testReturnOnlyName / testReturnOnlyNameAndTitle / testReturnAllRealFieldsExceptName): -name -this.is.a.valid.field (see ReturnFieldsTest.testDotInFieldName) -this-is-a-valid-field (see ReturnFieldsTest.testHyphenInFieldName) -this$is$a$valid$field (see ReturnFieldsTest.testDollarInFieldName) -#foo_s,-fl (see ReturnFieldsTest.testFunkyFieldNames) Aliasing is not supported so fl=-alias:fieldname is silently discarded. Note that in this case or in this one: fl=-a1:a -a2:b -a3:c all tokens are invalid so the expressions will be resolved in fl= and therefore standard procedure applies (see last testcase on ReturnFieldsTest.testReturnAllRealFields) exclusion glob An exclusion glob is an expression that uses * and / or ?. Fields matching that expression will be excluded from response. Examples (see ReturnFieldsTest.testReturnOnlyName / testReturnOnlyNameAndTitle / testReturnAllRealFieldsExceptName): -na* -n?m? -n* -*me -*m? As literal exclusions, you cannot use aliases in these expressions, they will be silently discarded (see last testcase on ReturnFieldsTest.testReturnAllRealFields) transformers Just one important thing: invalid transformers are ignored. Examples (you can see all the assertions below in ReturnsFieldsTest.testTransformers): fl=[explain] (one transformer) fl=[docid] [explain] (two transformers) fl=[shard] id (a transformer and a real field) fl=[xxxxxx] (an invalid transformer. That will have the same effect of having an empty fl) fl=[xxxxxx] [yyyyy] (two invalid transformers. That will have the same effect of having an empty fl) fl=[docid] [shard] [xxxx] (the invalid transformer will be ignored so only [docid] and [shard] will be evaluated fl=myalias:[docid] (aliased transformer) fl=alias:[yyyyyy] (invalid aliased transformer - will be ignored, see last test on ReturnFieldsTest.testReturnAllRealFields) Functions (and literals) All functions described here http://wiki.apache.org/solr/FunctionQuery#Available_Functions can be declared in "fl" parameter. They can be aliased too. Examples (see ReturnFieldsTest.testFunctions) fl=sum(1,1) fl="this is a literal", 'this is another literal", 1.2 fl=pippo:"this is an aliased literal", pluto:'this is another aliased literal", paperino:1.2 fl= {!func}add($v1,$v2)&v1=10&v2=13.0 - fl=alias:{!func} add($v1,$v2)&v1=10&v2=13.0 WARNING: differently from transformers, in case of invalid function an exception will be thrown. Aliases Aliases are supported on literal field names (e.g. alias:fieldname) transformers (e.g. alias:[docid]) functions (e.g. alias:max(1,2)) and are not supported on inclusion globs (because expression could match multiple fields and alias in response must be uniquely associated to a field. The alias part of the token expression will be ignored, the glob expression collected) literal exclusion (doesn't make sense, will be silently ignored) exclusion globs (doesn't make sense, will be silently ignored) General rules an inclusion (literal or glob) is ignored if in case of *. fl=* name na* (return all fields, name and na* fields are already included) fl=name * (same as before) any exclusion token will be ignored if an inclusion has been defined before fl=name -id (returns all fields except id so "name", with other real fields, is implicitly included) an inclusion token will clear all exclusion fl=-id name (means "returns only name", no matter what exclusions are defined before) fl=-id * the * expression is ignored in case of an exclusion (literal or glob) * -name (the * doesn't make sense, or in other words is implicit: returns all fields except name) -name * (same as before) score, transformers and functions doesn't change the behaviour described above, they are just added to response fl=score, *, name fl=*, [docid], name, sum(1,1), myfunctionalias:"literal value" fl=name -id [shard], exchange:1.34 an empty or null fl will execute as "*" (all real fields). Note that we can be in this situation if we pass an empty fl, no fl or if all tokens in fl are discarded because invalid. some other rules like multiplicity on fl parameter (see ReturnFieldsTest.testManyParameters), token separator (comma or whitespaces - see ReturnFieldsTest.testWhitespace and ReturnFieldsTest.testSeparators) remain the same. Some (technical) consideration 1. I'm not sure about the test methods granularity level: some test methods, although grouping the same concept, contain a lot of sub-cases / scenarios and sometimes is hard to capture the functional behaviour (see for example ReturnFieldsTest.testTransformers or ReturnFieldsTest.testAliases). Maybe doing a test method for each sub-case should be better? I modified the ReturnFieldsTest for adding additional test cases but I followed the existing approach...let me know what you think about a much fine level of granularity..in case I can try to make some refactoring in that direction 2. There are two (at least it seems to me) classes that are testing ReturnFields behaviour : ReturnFieldsTest and TestPseudoReturnFields...maybe all those test methods could be part of one test case? I don't know the reason behind the choice of having two classes so I'm just asking, the TestPseudoReturnFields is still there and all tests are green Any feedback is warmly welcome. Andrea
        Hide
        Andrea Gazzarini added a comment -

        Hi, Some days ago I built a working implementation of ReturnFields some days ago. It's not yet ready because (in my free time) I'm trying to enumerate all scenarios in ReturnFieldsTest.
        Hope to submit a patch soon.

        Show
        Andrea Gazzarini added a comment - Hi, Some days ago I built a working implementation of ReturnFields some days ago. It's not yet ready because (in my free time) I'm trying to enumerate all scenarios in ReturnFieldsTest. Hope to submit a patch soon.
        Hide
        Erick Erickson added a comment -

        No body has put up any kind of patch for this, it's just a suggestion at this point.

        If you'd like to supply one that would be great!

        Show
        Erick Erickson added a comment - No body has put up any kind of patch for this, it's just a suggestion at this point. If you'd like to supply one that would be great!
        Hide
        David Morana added a comment -

        This is exactly one of the features I was looking for.
        If possible can you make a patch of this for Solr v4.0 Final?
        Thanks,

        Show
        David Morana added a comment - This is exactly one of the features I was looking for. If possible can you make a patch of this for Solr v4.0 Final? Thanks,
        Hide
        Andrea Gazzarini added a comment - - edited

        Hi all, I'm going to complete the implementation of the field exclusion (using trunk code) but in order to complete I have some questions (so probably I'll have to change a little bit something on my code):

        • New ReturnFields implementor or SolrReturnFields change?
          At the moment I created another class so the old implementation (SolrReturnFields) is still there. There's no code duplication (or just a little bit that can be remove if that is ok by changing the SolrReturnFields too) because the fl parsing uses a different logic (basically no QueryParsing.StrParser).
        • SolrReturnFields:83 support for fl=' ' => *,score
          There's a comment below this line about an old feature that could be removed. On the wiki there's no mention about that so can I remove that?
        • About glob
          What would be the right behaviour (see below)? A full name expansion support or just *aaa bbb*? Does SOLR need that complexity? I started playing with SOLR in 2009 and honestly I never used globs in fl so I have no concrete experience for taking a decision.
          • The Wiki talks about just one example (with trailing wildcard)
          • org.apache.solr.search.ReturnFieldsTest.testWilcards() considers only beginning and trailing wildcard cases (e.g. *aaa, bbb*)
          • But the code supports a wide range of globs (e.g. a*a, a?a, n*m*,n*m?)
        • fl expressions (probably I will attach some other "ambiguous" case later)
          What would be the expected behaviour in these cases?
          • * -name test
          • -name test *
          • -name * test
          • -* I understand that should be wrong but what would be the correct behaviour? SyntaxError?
          • name name:manu (at the moment it seems the "aliased" field always wins)
          • pippo:name pippo:manu ( at the moment it seems the last alias wins)

        For latest three I believe a SyntaxError would be more appropriate

        Show
        Andrea Gazzarini added a comment - - edited Hi all, I'm going to complete the implementation of the field exclusion (using trunk code) but in order to complete I have some questions (so probably I'll have to change a little bit something on my code): New ReturnFields implementor or SolrReturnFields change? At the moment I created another class so the old implementation (SolrReturnFields) is still there. There's no code duplication (or just a little bit that can be remove if that is ok by changing the SolrReturnFields too) because the fl parsing uses a different logic (basically no QueryParsing.StrParser). SolrReturnFields:83 support for fl=' ' => *,score There's a comment below this line about an old feature that could be removed. On the wiki there's no mention about that so can I remove that? About glob What would be the right behaviour (see below)? A full name expansion support or just *aaa bbb*? Does SOLR need that complexity? I started playing with SOLR in 2009 and honestly I never used globs in fl so I have no concrete experience for taking a decision. The Wiki talks about just one example (with trailing wildcard) org.apache.solr.search.ReturnFieldsTest.testWilcards() considers only beginning and trailing wildcard cases (e.g. *aaa, bbb*) But the code supports a wide range of globs (e.g. a*a, a?a, n*m*,n*m?) fl expressions (probably I will attach some other "ambiguous" case later) What would be the expected behaviour in these cases? * -name test -name test * -name * test -* I understand that should be wrong but what would be the correct behaviour? SyntaxError? name name:manu (at the moment it seems the "aliased" field always wins) pippo:name pippo:manu ( at the moment it seems the last alias wins) For latest three I believe a SyntaxError would be more appropriate
        Hide
        Bill Bell added a comment -

        This is a very useful feature.... I have 80 fields and want to just eclude one large field...

        Show
        Bill Bell added a comment - This is a very useful feature.... I have 80 fields and want to just eclude one large field...
        Hide
        Jan Høydahl added a comment -

        Luca Cavanna, now that SOLR-2719 is fixed, I think it should be green lights for this, if you'd like to attempt a patch. I don't know if the code from UserFields class may be helpful at all..

        Show
        Jan Høydahl added a comment - Luca Cavanna , now that SOLR-2719 is fixed, I think it should be green lights for this, if you'd like to attempt a patch. I don't know if the code from UserFields class may be helpful at all..
        Hide
        Luca Cavanna added a comment -

        Thanks for your feedbacks!
        I had thought about the same solution too...I was just afraid of adding complexity to the new fl syntax, but if you think that's the way to go, great, and I'd like to contribute on this!
        My concern is just about SOLR-2719. Don't you think we should correct it and add some kind of consistency before adding new features? What do you think about adding some kind of validation for field names? Shall we open a new issue for that?

        Show
        Luca Cavanna added a comment - Thanks for your feedbacks! I had thought about the same solution too...I was just afraid of adding complexity to the new fl syntax, but if you think that's the way to go, great, and I'd like to contribute on this! My concern is just about SOLR-2719 . Don't you think we should correct it and add some kind of consistency before adding new features? What do you think about adding some kind of validation for field names? Shall we open a new issue for that?
        Hide
        Yonik Seeley added a comment -

        +1, looks great Jan!

        Show
        Yonik Seeley added a comment - +1, looks great Jan!
        Hide
        Jan Høydahl added a comment -

        I think by definition a field name cannot start with "-" since even the Lucene query syntax must support the alternative NOT syntax -inStock:true. I therefore think we should borrow this syntax over here, just as we are doing for dismax uf in SOLR-3026. It should be easy enough to parse.

        Examples:

        q=foo&fl=-title           # All fields except title (include * is implicit)
        q=foo&fl=-title,-body     # All fields except title and body
        q=foo&fl=-*_s             # All fields except those matching *_s pattern
        q=foo&fl=foo,-bar         # Would be the same as saying only &fl=foo, since an explicit field resets the list
        q=foo&fl=-bar,foo         # same as above. Order not important?
        q=foo&fl=-foo,foo         # same as above. Probably all the exclude fields should be parsed last and override positive?
        q=foo&fl=*,-title         # same as only -title as * is implicit include when only exclude fields listed?
        q=foo&fl=-*               # Hmm, makes no sense, if you're not returning anything you should ask for rows=0 ??
        q=foo&fl=*z,-*xyz         # Include all fields ending in "z" except those ending in "xyz", negative rule applied last
        
        Show
        Jan Høydahl added a comment - I think by definition a field name cannot start with "-" since even the Lucene query syntax must support the alternative NOT syntax -inStock:true . I therefore think we should borrow this syntax over here, just as we are doing for dismax uf in SOLR-3026 . It should be easy enough to parse. Examples: q=foo&fl=-title # All fields except title (include * is implicit) q=foo&fl=-title,-body # All fields except title and body q=foo&fl=-*_s # All fields except those matching *_s pattern q=foo&fl=foo,-bar # Would be the same as saying only &fl=foo, since an explicit field resets the list q=foo&fl=-bar,foo # same as above. Order not important? q=foo&fl=-foo,foo # same as above. Probably all the exclude fields should be parsed last and override positive? q=foo&fl=*,-title # same as only -title as * is implicit include when only exclude fields listed? q=foo&fl=-* # Hmm, makes no sense, if you're not returning anything you should ask for rows=0 ?? q=foo&fl=*z,-*xyz # Include all fields ending in "z" except those ending in "xyz", negative rule applied last
        Hide
        Luca Cavanna added a comment -

        I previously left a comment on the SOLR-2444 asking if we can add an exclusion syntax there, but I'm honestly afraid of that now, given also the SOLR-2719 regression.
        I was thinking to keep it simple, for example using a new parameter like fl.ex or something, may I know what are your thoughts?

        Show
        Luca Cavanna added a comment - I previously left a comment on the SOLR-2444 asking if we can add an exclusion syntax there, but I'm honestly afraid of that now, given also the SOLR-2719 regression. I was thinking to keep it simple, for example using a new parameter like fl.ex or something, may I know what are your thoughts?
        Luca Cavanna created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            Luca Cavanna
          • Votes:
            23 Vote for this issue
            Watchers:
            26 Start watching this issue

            Dates

            • Created:
              Updated:

              Development