Solr / SOLR-3825

Log document IDs when they are retrieved

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0
    • Labels:
      None

      Description

      During relevancy tuning it's important to know exactly which documents the client has seen. Right now the only way to get that list is to splice into the HTTP traffic. Preferably the IDs could be logged along with the query.

      1. SOLR-3825.patch
        11 kB
        Scott Stults
      2. SOLR-3825.1.patch
        4 kB
        Scott Stults

        Activity

        gsingers Grant Ingersoll added a comment -

        A few comments on the patch:

        1. SolrMBeanTest fails with this patch due to the description and source being null
        2. I don't think we want/need member variables for ids and idScores, as it won't be thread safe. I'd just loop the DocIterator once, building a StringBuilder and then calling addToLog on that StringBuilder. This will also avoid the need for clone()
        3. For the scores, let's just do an output of id:score, id:score, ... Using a Map won't be reliable, as we will want to maintain order in the log.
        4. For the log key, let's just call it the same thing which should simplify parsing, regardless of whether there are scores present or not, so the format would be: responseLog: id1[:score1],id2[:score2],... where [ ] is used to indicate it is optional.
        5. We should follow the normal SearchComponent pattern of being able to turn on/off the component via a request parameter.
          if (!params.getBool(COMPONENT_NAME, false)) {
            return;
          }

          This component should be OFF by default.

        6. In the ResponseLogComponentTest, do we need the createCore() stuff? See some of the other tests and how they use initCore.
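
Points 2 through 5 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the committed patch: the class `ResponseLogSketch`, the method `buildLogValue`, and the plain arrays standing in for Solr's DocIterator are all hypothetical, and the request parameter check is modeled as a simple boolean.

```java
// Hypothetical, Solr-free sketch of the log formatting discussed above.
class ResponseLogSketch {

    /**
     * Builds "id1[:score1],id2[:score2],..." in iteration order.
     * Returns null when logging is disabled; callers default to disabled.
     * scores may be null when the result set carries no scores.
     */
    static String buildLogValue(boolean enabled, String[] ids, float[] scores) {
        if (!enabled) {
            return null; // mirrors the early return on the request parameter
        }
        // Local StringBuilder: no member variables, so no shared state and no clone().
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ids.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(ids[i]);
            if (scores != null) {
                sb.append(':').append(scores[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String value = buildLogValue(true,
                new String[] {"doc7", "doc3"},
                new float[] {1.5f, 0.25f});
        System.out.println("responseLog: " + value); // prints "responseLog: doc7:1.5,doc3:0.25"
    }
}
```

Appending to a single builder in iteration order keeps the ranking visible in the log, which a Map would not guarantee, and the same key is used whether or not scores are present.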
        sstults Scott Stults added a comment -

        I updated the patch to incorporate these.

        gsingers Grant Ingersoll added a comment -

        Patch looks good, will likely commit soon.

        gsingers Grant Ingersoll added a comment -

        Shoot, this needs to be the unique id, not the doc id. Will change it.

        sstults Scott Stults added a comment -

        This patch applies on top of the previous one and changes the raw docIDs to the unique ID defined in the schema.
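
The change from raw docIDs to schema unique IDs might look something like the sketch below. It is not the patch itself: `UniqueKeyLogSketch`, `toUniqueKeys`, and the `IntFunction` lookup are hypothetical stand-ins for whatever the component uses to read the unique key field of each document, and the sample IDs are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical sketch: translate internal doc IDs to unique-key values for logging.
class UniqueKeyLogSketch {

    /**
     * Resolves each internal doc ID through the supplied lookup (an assumption
     * standing in for a stored-field read) and joins the results in order.
     */
    static String toUniqueKeys(int[] docIds, IntFunction<String> uniqueKeyLookup) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < docIds.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(uniqueKeyLookup.apply(docIds[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up mapping from internal doc IDs to unique keys.
        Map<Integer, String> stored = new HashMap<>();
        stored.put(12, "SKU-A");
        stored.put(40, "SKU-B");
        System.out.println(toUniqueKeys(new int[] {12, 40}, id -> stored.get(id)));
    }
}
```

Logging the unique key rather than the internal ID means the logged values are the same identifiers the client sees in the response.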

        gsingers Grant Ingersoll added a comment -

        Thanks, Scott. I committed the updated patch with one minor change.

        yseeley@gmail.com Yonik Seeley added a comment -

        Grant committed to trunk earlier,
        I just backported to 4x: http://svn.apache.org/viewvc?rev=1388136&view=rev

        commit-tag-bot Commit Tag Bot added a comment -

        [branch_4x commit] Yonik Seeley
        http://svn.apache.org/viewvc?view=revision&revision=1388136

        SOLR-3825: Added optional capability to log what ids are in a response

        thetaphi Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee:
            gsingers Grant Ingersoll
            Reporter:
            sstults Scott Stults
          • Votes:
            0
            Watchers:
            5
