Solr / SOLR-1298

FunctionQuery results as pseudo-fields

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels: None

      Description

      It would be helpful if the results of FunctionQueries could be added as fields to a document.

      Couple of options here:
      1. Run FunctionQuery as part of relevance score and add that piece to the document
      2. Run the function (not really a query) during Document/Field retrieval

      1. SOLR-1298.patch (13 kB, Chris Male)
      2. SOLR-1298-FieldValues.patch (12 kB, Chris Male)

          Activity

          Erik Hatcher added a comment -

          I would only add that we probably don't want to add this to the actual response document returned, but rather attach them to an additional section like highlighting works. Same really for score, actually too, but I digress.

          Yonik Seeley added a comment -

          Returning stuff in separate sections is a pain for correlation though.
          See SOLR-705 for a proposal to add a meta section to documents.

          Andrzej Bialecki added a comment -

          Not sure about not adding it - what fields are returned is selectable, right? And it's not possible to obtain this information otherwise. Some time ago I implemented this for a client - it was before SOLR-243, but I used the same idea, i.e. to use a subclass of IndexReader that returns documents with added function fields (and score).

          Chris Male added a comment -

          In my patch in SOLR-773, I tackled this issue by creating the idea of a FieldValueSource, which mapped a name of a pseudo-field to an arbitrary source of data which could be computed at runtime. For me it was distances, but it could also be the results of a FunctionQuery. Since there was a mapping of name to data, it was possible to include or exclude the FieldValueSources from adding their information to the search results through the fl parameter.

          Erik Hatcher added a comment -

          SOLR-705 meta it is! Sorry, I had come across that one a while ago and liked it, but had forgotten about it. +1

          Noble Paul added a comment -

          +1

          I guess it should be returned like any other normal field

          Noble Paul added a comment -

          we should also let search components add extra fields to the document.

          Grant Ingersoll added a comment -

          Dang, you know it's bad when you wake up in the morning and the first thing that comes into your head is what the interface should look like for some new feature in Solr.

          Alas, having just finished SOLR-1297, I think we should simply make the &fl parameter be able to parse functions and, if need be, they can be materialized/executed as they are being retrieved by the Writer (using SOLR-1650 if implemented).

          Thus, the interface for this would be:

          &fl=sum(x, y),id,a,b,c,score
          

          or

          &fl=id,sum(x, y),score
          
          &fl=*,sum(x, y),score
          

          So, the output would be:

          ...
          <str name="id">foo</str>
          <float name="sum(x,y)">40</float>
          <float name="score">0.343</float>
          ...
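
To make the proposed interface concrete, here is a minimal Python sketch of how an fl value might be split into plain field names and function expressions. The helper names split_fl and classify, and the rule that any entry containing "(" is a function, are assumptions made for illustration; this is not Solr's actual parser.

```python
def split_fl(fl):
    """Split an fl value on commas that are not inside parentheses."""
    entries, current, depth = [], [], 0

    def flush():
        entry = "".join(current).strip()
        if entry:
            entries.append(entry)
        current.clear()

    for ch in fl:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            flush()  # a top-level comma ends the current entry
        else:
            current.append(ch)
    flush()
    return entries


def classify(entries):
    """Label each entry as a function or a field (hypothetical rule)."""
    return [("function" if "(" in e else "field", e) for e in entries]
```

The comma inside sum(x, y) is kept because it appears at parenthesis depth 1, so the function call survives as a single entry.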
          
          Grant Ingersoll added a comment -

          Quoting Noble Paul: we should also let search components add extra fields to the document.

          I think we could handle this via the ResponseBuilder by storing an <id, <name, value>> pairing in a map that the ResponseWriter could then consult when it needs it as it's streaming out the results. Tricky part is what to do when there are no ids, I suppose.
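
A rough Python sketch of that idea follows: components record extra values keyed by document id, and the writer merges them in while streaming. The class ExtraFields and the function write_docs are hypothetical stand-ins, not Solr classes, and documents without an id are simply passed through unchanged, which is only one possible answer to the "no ids" question.

```python
from collections import defaultdict


class ExtraFields:
    """Hypothetical stand-in for an <id, <name, value>> map on the ResponseBuilder."""

    def __init__(self):
        self._by_id = defaultdict(dict)

    def add(self, doc_id, name, value):
        self._by_id[doc_id][name] = value

    def for_doc(self, doc_id):
        # Documents without an id cannot be correlated; return nothing for them.
        if doc_id is None:
            return {}
        return dict(self._by_id.get(doc_id, {}))


def write_docs(docs, extra, id_field="id"):
    """Yield each document with its extra fields merged in, one at a time."""
    for doc in docs:
        merged = dict(doc)
        merged.update(extra.for_doc(doc.get(id_field)))
        yield merged
```

Because write_docs is a generator, the documents are still produced one by one rather than being collected first.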

          Grant Ingersoll added a comment -

          Chris, could you isolate this particular part of your patch from SOLR-773?

          Chris Male added a comment -

          Hi Grant,

          I certainly can. I hadn't thought about having a function as an fl parameter value, but that makes a lot of sense and I can support that through my work as well. I'll work on extracting the code today and will get a patch here ASAP.

          Uri Boness added a comment -

          Quoting Chris Male: I certainly can. I hadn't thought about having a function as an fl parameter value, but that makes a lot of sense and I can support that through my work as well. I'll work on extracting the code today and will get a patch here ASAP.

          As far as I recall, specifying the functions in the fl parameter should still work with the FieldValueSource as it is at the moment. The registry lets you register any value under any string key; in this case the string key is the function.

          Uri Boness added a comment -

          Chris, another thing. You might want to update the FieldValueSource solution to work with SOLR-1644 (instead of the request context)

          Chris Male added a comment -

          Hi Uri,

          Yup, functions as fl parameters work straight away with the FieldValueSource, so no changes are required there. I will first chuck up a patch without SOLR-1644 so that it can be immediately reviewed, then I'll dive into how to update it to 1644 and will create another patch then.

          Chris Male added a comment -

          Attaching the patch taken from my SOLR-773 patch. Adds in a FieldValueSource and FieldValueSourceRegistry, changes the SolrIndexSearcher to use FieldValueSources when building a document, and hooks this process into the ResponseWriters.

          Chris Male added a comment -

          Attached new patch which changes the names from FieldValueSource to FieldValues, and FieldValueSourceRegistry to FieldValuesRegistry, to avoid confusion with ValueSource.

          Yonik Seeley added a comment -

          A few comments and random thoughts on this feature in general:

          • Think scalability... there should be a way to keep things streamable. Some people will want to retrieve values for many documents (10K, 100K, or their whole index). But of course there should be a way for a component to simply add values calculated all at once too.
          • For performance, providers of field values should be able to operate on multiple documents at once. For example, providers may want to sort big blocks of docids and access in docid order for better performance (important for anything that accesses the index). A value provider that needs to access another system would want to send multiple IDs in a batch.
          • Field value providers should be given context, including optionally the set of fields for the current document, and probably the request and response objects
          • Perhaps this should be more generalized in that the value provider be a document mutator - it should be able to also change or remove other fields. I believe this would also allow stuff like per-field security. Field value providers should also be able to add multiple fields - it may not know ahead of time what extra fields a document has.
          • should work with highlighting... this way people don't have to store large text fields if they already have them in another system.
          • keep in mind that some people believe that derived fields (or meta fields) don't belong in the same place as other stored fields. I think it probably depends on the exact usecase though.
          • I'm not sure if SolrIndexSearcher is the right place for this or not though - perhaps its document() method should stick to just the stored fields?
          • Think about how to give these fields nicer names... perhaps this could even include the "select as" ability to rename fields.
            One thought: use an optional '=' or use the "AS" syntax
            fl=foo=bar,dist=gdist(10,20,loc) or
            fl=foo AS bar, gdist(10,20,loc) AS dist (more familiar to DB people?)
            Another option for providing names that would only work with queries/function queries would be local params:
            fl={!key=dist}gdist(10,20,loc)
            but that only works for queries so it's not as flexible

          • If we use the fl syntax for including function queries, then we should consider providing the ability to use multiple "fl" params. This will make it easier for clients who want to tack something on w/o modifying other params.
            If we provide multiple fl params, then an alternate way to specify aliases could be:
            fl.dist=gdist(10,20,loc)
          • fl=foo is ambiguous... do we mean a function query or the field?... perhaps if it's a bare field name, then we treat it as a field unless it has localparams?
            fl={!func}foo
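
The two alias spellings floated in the list above ("alias=expr" and "expr AS alias") can be sketched in Python as follows. The function parse_alias and both regular expressions are illustrative assumptions, not Solr code; in particular, the sketch assumes the left-hand side of '=' is the alias, as in dist=gdist(10,20,loc), and it deliberately leaves local-params entries such as {!key=dist}... untouched.

```python
import re

# "expr AS alias": the alias is the trailing word after AS.
AS_FORM = re.compile(r"^(?P<expr>.+?)\s+AS\s+(?P<alias>\w+)$")
# "alias=expr": the alias is a bare word before the first '='.
EQ_FORM = re.compile(r"^(?P<alias>\w+)=(?P<expr>.+)$")


def parse_alias(entry):
    """Return (alias, expression) for one fl entry; alias is None if absent."""
    entry = entry.strip()
    match = AS_FORM.match(entry) or EQ_FORM.match(entry)
    if match:
        return match.group("alias"), match.group("expr").strip()
    return None, entry
```

An entry that starts with "{" never matches EQ_FORM, so the '=' inside a local-params block is not mistaken for an alias separator.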

          Noble Paul added a comment -

          Hi Chris, Yonik,
          did you check out SOLR-1566? It is trying to achieve the same thing. It provides both streaming and pre-computed fields.

          Uri Boness added a comment -

          I like the idea of giving the providers a broader context (document, request, response). This will also allow them to operate on multiple documents in the response (whether it's the docset or the doclist).

          One thing to take into consideration here is that once you introduce dependencies between the fields, there must be a way to determine the ordering of the providers (as one provider might depend on fields generated by another provider).
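
One way to settle such an ordering is a topological sort over declared dependencies. The Python sketch below uses the standard-library graphlib module (Python 3.9+); the provider names, the PROVIDERS table, and apply_providers are made up for illustration, and the sketch assumes each dependency names another provider rather than a stored field.

```python
from graphlib import TopologicalSorter

# Hypothetical providers: name -> (providers it depends on, function computing the value)
PROVIDERS = {
    "sum_xy_squared": ({"sum_xy"}, lambda doc: doc["sum_xy"] ** 2),
    "sum_xy": (set(), lambda doc: doc["x"] + doc["y"]),
}


def apply_providers(doc, providers):
    """Run every provider after the providers it depends on."""
    graph = {name: deps for name, (deps, _) in providers.items()}
    result = dict(doc)  # copy so the stored fields stay untouched
    # static_order() yields dependencies before their dependents and
    # raises graphlib.CycleError if the dependencies are circular.
    for name in TopologicalSorter(graph).static_order():
        result[name] = providers[name][1](result)
    return result
```

Declaring dependencies explicitly also lets circular definitions be rejected up front instead of failing midway through a response.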

          As for the "<field> AS <alias>" syntax, I think this should be consistent with the work in SOLR-1351, which is currently based on localparams. Perhaps there should be a common approach to handling aliases in requests.

          I think that the proper approach is to separate the stored fields from other "fields"... perhaps even put it in a separate "meta-data" section under the document. But once you do that, again, for the sake of consistency, it would also be wise not to include these fields/functions in the "fl" parameter. So the "fl" parameter will refer to fields, and another parameter "meta" will refer to meta-data values.

          Quoting Yonik Seeley: fl={!func}foo

          +1 or even func:foo. Then you can have things like "url:<url>" or "file:<file path>" or even "db:<db alias + field>"

          Noble Paul added a comment -

          Quoting Uri Boness: perhaps even put it in a separate "meta-data" section under the document

          This has been discussed earlier. The meta section is not a clean idea. We should put them as normal fields.

          Instead of inventing a new syntax, let us use the local params syntax. We do not have to try to have any similarity with SQL.

          Grant Ingersoll added a comment -

          Quoting Yonik Seeley: fl=foo is ambiguous... do we mean a function query or the field?... perhaps if it's a bare field name, then we treat it as a field unless it has localparams?
          fl={!func}foo

          It is and isn't ambiguous, right? The result should be the same in that the value for that field is loaded (although I suppose it has implications that it would now be possible to load non-stored, single valued fields if we treat it as a function). FWIW, for the sort by function case, I checked to see if it is a field first, then a function, then puke.
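
The "field first, then function, then fail" order described for the sort-by-function case can be sketched in Python like this. The function resolve, the schema_fields set, and the FUNCTION_CALL pattern are illustrative assumptions rather than Solr's actual logic.

```python
import re

# Hypothetical shape of a function call: a name followed by parenthesized arguments.
FUNCTION_CALL = re.compile(r"^\w+\(.*\)$")


def resolve(entry, schema_fields):
    """Resolve one entry: known field first, then function, else fail."""
    if entry in schema_fields:
        return ("field", entry)
    if FUNCTION_CALL.match(entry):
        return ("function", entry)
    raise ValueError(f"cannot resolve {entry!r} as a field or a function")
```

With this order a bare name such as foo is always read as the field when the schema defines it, so fl=foo only becomes a function when it is written as one.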

          Grant Ingersoll added a comment -

          Quoting Uri Boness: I think that the proper approach is to separate the stored fields from other "fields"... perhaps even put it in a separate "meta-data" section under the document. But once you do that, again, for the sake of consistency, it would also be wise not to include these fields/functions in the "fl" parameter. So the "fl" parameter will refer to fields, and another parameter "meta" will refer to meta-data values.

          I think they should be inline, as they are just values associated with a document. I think putting it in some other list is sticking too literally to what Lucene calls a field, which I don't think Solr has to do. One could easily imagine a Solr component that brought in a database or other storage repository for supplementary fields and it should all be seamless to the client.

          If we step back and think about the use case for this functionality it is that one wants the output of the function closely associated with the document. I don't want to have to go look it up in some other list while I am iterating over my results when all the other values I'm displaying/using are right there associated with the document. That being said, it could be useful to add an attribute that indicates it is a generated name, but in reality, that is inferred by the field name anyway, as in:

          <doc>
          <field name="pow(foo,2)">64</field>
          <field name="foo">8</field>
          </doc>
          

          I'd even argue that highlighter results should be inline, too, but that is a different issue and a bigger can of worms since it has a well used API already.

          Chris Male added a comment -

          Just a couple of thoughts about the implementation of this:

          Quoting Yonik Seeley: should work with highlighting... this way people don't have to store large text fields if they already have them in another system. I'm not sure if SolrIndexSearcher is the right place for this or not though - perhaps its document() method should stick to just the stored fields?

          From what I can see, the Highlighter also pulls documents from the SolrIndexSearcher through document(), so the patch should already support highlighting. If we move the process away from the SolrIndexSearcher, which I understand the case for, then we need to move all components away from using document(); otherwise the same document could be represented in different ways depending on whether it's retrieved via document() or via whatever we build. Equally, custom components would need to do the same.

          I do like the idea of changing to a DocumentMutator which is given a context and is able to add/remove fields. This will then work seamlessly with having the values inline with the documents.
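
A DocumentMutator along these lines might look like the following Python sketch. Every name here (MutatorContext, DocumentMutator, AddSum, HideFields, build_document, and the "hidden" request parameter) is hypothetical and only illustrates the shape of the idea: each mutator receives the document plus a context and may add or remove fields.

```python
from dataclasses import dataclass, field


@dataclass
class MutatorContext:
    """Hypothetical context handed to each mutator."""
    request_params: dict = field(default_factory=dict)


class DocumentMutator:
    """Hypothetical interface: may add, change, or remove fields of a document."""

    def mutate(self, doc, context):
        raise NotImplementedError


class AddSum(DocumentMutator):
    """Adds a computed pseudo-field inline with the stored fields."""

    def mutate(self, doc, context):
        doc["sum(x,y)"] = doc["x"] + doc["y"]


class HideFields(DocumentMutator):
    """Removes fields named in the made-up 'hidden' request parameter."""

    def mutate(self, doc, context):
        for name in context.request_params.get("hidden", []):
            doc.pop(name, None)


def build_document(stored_fields, mutators, context):
    doc = dict(stored_fields)  # copy so the stored fields stay untouched
    for mutator in mutators:
        mutator.mutate(doc, context)
    return doc
```

The HideFields mutator hints at how the same mechanism could serve the per-field security case raised earlier in the thread, since removal uses the same hook as addition.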

          Should I go ahead and mock up a patch for something like this?

          Grant Ingersoll added a comment -

          Quoting Yonik Seeley: I'm not sure if SolrIndexSearcher is the right place for this or not though - perhaps its document() method should stick to just the stored fields?

          Both Chris' patch here and Noble's on SOLR-1566 take the approach of modifying SolrIndexSearcher.doc() for part of the solution. Not saying this is right or wrong, but I think it would be useful to document here the rationale about why not to do it. Is it just b/c that method is expected to do, more or less, what the Lucene IndexSearcher does?

          Uri Boness added a comment -

          Quoting Grant Ingersoll: I think they should be inline, as they are just values associated with a document. I think putting it in some other list is sticking too literally to what Lucene calls a field, which I don't think Solr has to do that. One could easily imagine a Solr component that brought in a database or other storage repository for supplementary fields and it should all be seamless to the client.

          I definitely agree that one shouldn't see a field in Solr as a field in Lucene. That said, I think we do have a tendency to see a field in Solr as somehow bound to the Solr schema.

          One thing to notice is that we eventually end up with the same discussion about this feature in the context of different issues, be it highlighting or field collapsing. In some cases it feels just "right" to return the data as a field in a document; in other places it feels "right" to have it as something else. It is true that when you interact with Solr directly (especially if you do it manually) you certainly know what queries you send, what functions you request, and what you should expect in the result. But from experience, a lot of the time you try to automate things a bit, and creating well-structured and descriptive protocols is the safe way to enable that.

          Quoting Grant Ingersoll: I don't want to have to go look it up in some other list while I am iterating over my results when all the other values I'm displaying/using are right there associated with the document.

          Having a sub-section under each document still associates it with the document. The way I see it, it's like OOP... you can have a Person class that holds all the information of the person as primitive fields, or you can group related data, like address info, in a separate Address class.

          Quoting Grant Ingersoll: That being said, it could be useful to add an attribute that indicates it is a generated name

          That's one way to group fields together, but if you're already doing that, then why not go all the way? If you need to distinguish between generated and non-generated names, why not make it simpler and just separate the two in a different list? (To continue the line of analogies I started above) it's like XML: you can have a single-level hierarchy where each element defines attributes to relate it to other elements, but a more suitable solution would just be to group all related elements under one parent element.

          Quoting Grant Ingersoll: I'd even argue that highlighter results should be inline, too, but that is a different issue and a bigger can of worms since it has a well used API already.

          In some cases it might be (well, it just is) more appropriate to have the highlighting inlined. In other cases it might not be possible, especially with some of the latest requests to have highlighting functionality available for arbitrary text loaded from anywhere (which I believe will lead to a highlighting component/requestHandler that will be independent of the query component).

          Quoting Grant Ingersoll: Not saying this is right or wrong, but I think it would be useful to document here the rationale about why not to do it. Is it just b/c that method is expected to do, more or less, what the Lucene IndexSearcher does?

          I guess so... I guess SolrIndexSearcher is in fact a Lucene IndexSearcher, which is the source of this association. In some ways I think it also relates a bit to the response structure (not directly, but conceptually)... if the IndexSearcher represents Lucene and the document contains fields coming from other sources as well, perhaps this functionality of gathering all these fields (/metadata) should be done at a higher level where SolrIndexSearcher just serves as one "field source". The main reason why Chris's patch puts this functionality in the doc() method of the SolrIndexSearcher is simply because it's the easiest and simplest solution right now... and I don't think there's anything wrong with that... simple is good! Even with this solution as it is, the "field sources" are still abstracted away in the form of a "FieldValues" or "DocumentMutator", so architecture-wise I don't see that leaving it as is will compromise anything.

          Noble Paul added a comment -

          Chris and I are both trying to achieve more or less the same thing; it's just that SOLR-1566 is a bit more ambitious in scope.

          @Uri, I would ask you to take a look at the patch in SOLR-1566 as well.

          I guess all of us agree that we need a generic way to add non-Lucene fields to a SolrDocument. The fields could be single-valued or multivalued. Say a function returns a List<Integer>: that should be valid too (I'd even say it could be a NamedList). This is useful for a lot of use cases.

          As long as we can achieve this functionality in a performant way, it is fine. Let us converge our efforts and bring this to a resolution ASAP.

          Hoss Man added a comment -

          Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

          http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

          The selection criterion was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. Email notifications were suppressed.

          A unique token for finding these 240 issues in the future: hossversioncleanup20100527

          Ryan McKinley added a comment -

          With SOLR-2443 and SOLR-1566, adding functions will be easy.

          Ryan McKinley added a comment -

          With SOLR-2443, we can now parse the function query, but it still needs to actually write the values:

           &fl=id,score,mul(popularity,popularity)
          

          gives:

          <doc>
              <str name="id">GB18030TEST</str>
              <float name="score">1.0</float>
              <str name="mul(popularity,popularity)">now what...</str>
          </doc>
          

          aliasing also works, so:

           &fl=id,score,pop2=mul(popularity,popularity)
          

          gives:

          <doc>
              <str name="id">GB18030TEST</str>
              <float name="score">1.0</float>
              <str name="pop2">now what...</str>
          </doc>
          

          If someone else wants to look into how to get the values filled up, that would be great!
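
For readers following along, here is a rough Java sketch of how an fl value like the ones above could be split into response keys and expressions, including the alias form. This is not the parser from SOLR-2443: FlParamSketch and its methods are hypothetical, and the sketch ignores any fl syntax other than comma-separated entries with an optional alias= prefix.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlParamSketch {

    /** Splits on commas that are not nested inside parentheses. */
    static List<String> splitTopLevel(String fl) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : fl.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            }
            if (c == ',' && depth == 0) {
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString().trim());
        }
        return parts;
    }

    /** Maps each response key to its expression; "alias=expr" uses the alias as the key. */
    static Map<String, String> parse(String fl) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String part : splitTopLevel(fl)) {
            int eq = part.indexOf('=');
            int paren = part.indexOf('(');
            // Treat '=' as an alias separator only when it comes before any '('.
            if (eq > 0 && (paren < 0 || eq < paren)) {
                entries.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
            } else {
                entries.put(part, part);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        System.out.println(parse("id,score,mul(popularity,popularity)"));
        System.out.println(parse("id,score,pop2=mul(popularity,popularity)"));
    }
}
```

The key of each entry is the name that would appear on the returned document (the raw expression, or the alias when one is given). Evaluating the expression per document is the part that is still missing at this point in the thread.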

          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Ryan McKinley added a comment -

          This has been in trunk for a while – any problems should get their own issue.

          Bill Bell added a comment -

          Does this actually compute the function results, or does it always return "now what..."?

          Koji Sekiguchi added a comment -

          Hi, I'm using the Solr example data on trunk.

          If I post q=ipod&fl=score,price, Solr returns score and price as expected.
          But if I post q=ipod&fl=score,log(price), Solr returns score, the value of log(price), and all the other fields as well.

          James Wilson added a comment -

          I just added https://issues.apache.org/jira/browse/SOLR-5423 but someone brought this story to my attention. Feel free to combine the stories.


            People

            • Assignee:
              Yonik Seeley
              Reporter:
              Grant Ingersoll
            • Votes:
              7
              Watchers:
              11
