Solr / SOLR-3825

Log document IDs when they are retrieved

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0
    • Labels:
      None

      Description

      During relevancy tuning it's important to know exactly which documents the client has seen. Right now the only way to get that list is to splice into the HTTP traffic. Preferably the IDs could be logged along with the query.

      1. SOLR-3825.1.patch
        4 kB
        Scott Stults
      2. SOLR-3825.patch
        11 kB
        Scott Stults

        Activity

        Grant Ingersoll added a comment -

        A few comments on the patch:

        1. SolrMBeanTest fails with this patch due to the description and source being null
        2. I don't think we want/need member variables for ids and idScores, as they won't be thread safe. I'd just loop over the DocIterator once, building a StringBuilder, and then call addToLog on that StringBuilder. This will also avoid the need for clone().
        3. For the scores, let's just do an output of id:score, id:score, ... Using a Map won't be reliable, as we will want to maintain order in the log.
        4. For the log key, let's just call it the same thing which should simplify parsing, regardless of whether there are scores present or not, so the format would be: responseLog: id1[:score1],id2[:score2],... where [ ] is used to indicate it is optional.
        5. We should follow the normal SearchComponent pattern of being able to turn on/off the component via a request parameter.
          if (!params.getBool(COMPONENT_NAME, false)) {
            return;
          }

          This component should be OFF by default.

        6. In the ResponseLogComponentTest, do we need the createCore() stuff? See some of the other tests and how they use initCore.
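
The format suggested in points 2 to 5 can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the code from the patch: the class name `ResponseLogSketch`, the methods `isEnabled` and `formatResponseLog`, the parameter value `"responseLog"`, and the plain `Map` standing in for Solr's request parameters are all hypothetical. It shows a single pass that builds the `id1[:score1],id2[:score2],...` string with a local StringBuilder (no member variables, order preserved) and a switch that is off by default.

```java
import java.util.HashMap;
import java.util.Map;

public class ResponseLogSketch {
    // Hypothetical parameter name; the real component's name is not given in this thread.
    static final String COMPONENT_NAME = "responseLog";

    /** Off unless the request explicitly sets the parameter to "true". */
    public static boolean isEnabled(Map<String, String> params) {
        return Boolean.parseBoolean(params.getOrDefault(COMPONENT_NAME, "false"));
    }

    /**
     * Builds "id1[:score1],id2[:score2],..." in one pass, keeping input order.
     * Pass scores == null when the response carries no scores.
     * All state is local, so concurrent requests cannot interfere.
     */
    public static String formatResponseLog(String[] ids, float[] scores) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ids.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(ids[i]);
            if (scores != null) {
                sb.append(':').append(scores[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put(COMPONENT_NAME, "true");
        if (!isEnabled(params)) {
            return; // component is off by default
        }
        String[] ids = {"doc7", "doc2"};
        float[] scores = {1.5f, 0.25f};
        // prints "responseLog: doc7:1.5,doc2:0.25"
        System.out.println("responseLog: " + formatResponseLog(ids, scores));
        // prints "responseLog: doc7,doc2"
        System.out.println("responseLog: " + formatResponseLog(ids, null));
    }
}
```

Using the same log key with and without scores means a log parser only has to split on `,` and then on an optional `:`.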
        Scott Stults added a comment -

        I updated the patch to incorporate these.

        Grant Ingersoll added a comment -

        Patch looks good, will likely commit soon.

        Grant Ingersoll added a comment -

        Shoot, this needs to be the unique ID, not the doc ID. Will change it.

        Scott Stults added a comment -

        This patch applies on top of the previous one and changes the raw docIDs to the unique ID defined in the schema.

        Grant Ingersoll added a comment -

        Thanks, Scott. I committed the updated patch with one minor change.

        Yonik Seeley added a comment -

        Grant committed to trunk earlier,
        I just backported to 4x: http://svn.apache.org/viewvc?rev=1388136&view=rev

        Commit Tag Bot added a comment -

        [branch_4x commit] Yonik Seeley
        http://svn.apache.org/viewvc?view=revision&revision=1388136

        SOLR-3825: Added optional capability to log what ids are in a response

        Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee:
            Grant Ingersoll
            Reporter:
            Scott Stults
          • Votes:
            0
            Watchers:
            5
