Solr
  1. Solr
  2. SOLR-2079

Expose HttpServletRequest object from SolrQueryRequest object

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.4, 5.0
    • Component/s: Response Writers, search
    • Labels:
      None

      Description

      This patch adds the HttpServletRequest object to the SolrQueryRequest object. The HttpServletRequest object is needed to obtain the client's IP address for geotargetting, and is part of the patches from W. Quach and C. Mattmann.

      1. SOLR-2079.Quach.Mattmann.082310.patch.txt
        4 kB
        Chris A. Mattmann
      2. SOLR-2079.patch
        6 kB
        Tomás Fernández Löbbe
      3. SOLR-2079.patch
        6 kB
        Tomás Fernández Löbbe
      4. SOLR-2079.patch
        6 kB
        Tomás Fernández Löbbe

        Issue Links

          Activity

          Hide
          Ryan McKinley added a comment -

          Ah yes – i remember wanting to do this a long time ago.

          Hoss convinced me that it really is not a good idea – SolrQueryRequest is intentionally abstracted from HttpServletRequest – adding it directly explodes the various ways things could be passed around.

          In my own code I subclass SolrDispatchFilter and attach whatever we need there:

          @Override
            protected final void execute( HttpServletRequest req, SolrRequestHandler handler, SolrQueryRequest sreq, SolrQueryResponse rsp)
            {    
              sreq.getContext().put( "IP", req.getRemoteAddr() );
              super.execute(req, handler, sreq, rsp);
            }
          
          Show
          Ryan McKinley added a comment - Ah yes – i remember wanting to do this a long time ago. Hoss convinced me that it really is not a good idea – SolrQueryRequest is intentionally abstracted from HttpServletRequest – adding it directly explodes the various ways things could be passed around. In my own code I subclass SolrDispatchFilter and attach whatever we need there: @Override protected final void execute( HttpServletRequest req, SolrRequestHandler handler, SolrQueryRequest sreq, SolrQueryResponse rsp) { sreq.getContext().put( "IP" , req.getRemoteAddr() ); super .execute(req, handler, sreq, rsp); }
          Hide
          Simon Willnauer added a comment -

          chris, wouldn't it be enough to expose the clients address which is less bound to a servlet than exposing the servlet request. Many users might use solr via their own protocols which are not necessarily exposed via Servlets. If you have code depending on the accessing the HttpServletRequest some people might run into problems. As far as I can see from your description you only need to expose the InetAddress right?

          Show
          Simon Willnauer added a comment - chris, wouldn't it be enough to expose the clients address which is less bound to a servlet than exposing the servlet request. Many users might use solr via their own protocols which are not necessarily exposed via Servlets. If you have code depending on the accessing the HttpServletRequest some people might run into problems. As far as I can see from your description you only need to expose the InetAddress right?
          Hide
          Erik Hatcher added a comment -

          yeah, this has been something I've desired too. SOLR-1354 is related to this, and I commented there that one way is to do this generically at the dispatch filter level like Ryan has done, taking all HTTP headers and metadata and stuff it into SolrQueryRequest params perhaps prefixed with "http" or something like that.

          We need to be sure that we're not tying embedded Solr to container/HTTP classes.

          Show
          Erik Hatcher added a comment - yeah, this has been something I've desired too. SOLR-1354 is related to this, and I commented there that one way is to do this generically at the dispatch filter level like Ryan has done, taking all HTTP headers and metadata and stuff it into SolrQueryRequest params perhaps prefixed with " http " or something like that. We need to be sure that we're not tying embedded Solr to container/HTTP classes.
          Hide
          Chris A. Mattmann added a comment -

          Hey Guys: yep, whatever is the easiest way is fine with me. We just need the IP address so that we can do Geotargetting based on the host IP. This is part of the slew of patches that William and I put together. This is the way we saw it and it works but not the only way to do it, for sure!

          Show
          Chris A. Mattmann added a comment - Hey Guys: yep, whatever is the easiest way is fine with me. We just need the IP address so that we can do Geotargetting based on the host IP. This is part of the slew of patches that William and I put together. This is the way we saw it and it works but not the only way to do it, for sure!
          Hide
          Andrzej Bialecki added a comment -

          The IP may be often that of a load balancer or proxy that sits in front of Solr...

          Show
          Andrzej Bialecki added a comment - The IP may be often that of a load balancer or proxy that sits in front of Solr...
          Hide
          Chris A. Mattmann added a comment -

          The IP may be often that of a load balancer or proxy that sits in front of Solr...

          Yep, that's certainly true, but not always the case. Additionally, this kind of sets the stage for more formalized IP detection (or more flexible), but it at least gets us started in that direction. Furthermore, it's likely that a load balancer would be "close" even geographically to the eventual destination server, so geotargetting would still be useful in that sense.

          But, I agree this isn't the only or best solution. I'm just opening the door here to be able to do interesting things...

          Show
          Chris A. Mattmann added a comment - The IP may be often that of a load balancer or proxy that sits in front of Solr... Yep, that's certainly true, but not always the case. Additionally, this kind of sets the stage for more formalized IP detection (or more flexible), but it at least gets us started in that direction. Furthermore, it's likely that a load balancer would be "close" even geographically to the eventual destination server, so geotargetting would still be useful in that sense. But, I agree this isn't the only or best solution. I'm just opening the door here to be able to do interesting things...
          Hide
          Hoss Man added a comment -

          I'm really torn on stuff like this – as Ryan mentioned, i think it's important to keep Solr abstract from HTTP so we don't wind up with functionality that requires the use of HTTP – but on the other hand things like the API allowing RequestHandlers to specify when/if to allow HTTP Caching are kind of important, so we made special hooks for it

          I'm not 100% opposed to SolrDispatchFilter adding the HttpServletRequest into the SolrQueryRequest.getContext() map so custom plugins can use it – but i am 100% opposed to having any requesthandlers/components ship with Solr that require it to work – because those would be completely non-functional for embedded Solr users.

          In a lot of cases, i think it probably makes more sense to ask that the client include the information directly as SolrParams – but that generally assumes that a client that knows about Solr conventions is constructing the request – which doesn't work as well now that more and more people want to let end user browser/clients talk directly to Solr and use Velocity or custom response writers to format the response.

          My preference would be to write components that might care about these sorts of things to utilize regular SolrParams from the SolrQueryRequest, and then add generic logic to the SolrDispatchFilter to map HTTP Headers to SolrParams. (thus people doing embedded logic can still use the functionality by specifying the appropriate SolrParams)

          the key questions with an approach like this would then be fairly simple:

          1. what type of convention to use for the http header => param naming?
          2. what should take precedence in the event of name collision: the HTTP Request Query Params, or the HTTP Request Headers?

          For people who really want direct access to the HttpServletRequest/Response in their custom plugins (like in SOLR-1944), and don't care about about embedded solr users, then we can also include them in the Context (no reason to make a bunch of people re-invent the same DIspatchFilter over and over) but i really feel that nothing that ships with Solr should assume it's there and utilize it directly.

          Show
          Hoss Man added a comment - I'm really torn on stuff like this – as Ryan mentioned, i think it's important to keep Solr abstract from HTTP so we don't wind up with functionality that requires the use of HTTP – but on the other hand things like the API allowing RequestHandlers to specify when/if to allow HTTP Caching are kind of important, so we made special hooks for it I'm not 100% opposed to SolrDispatchFilter adding the HttpServletRequest into the SolrQueryRequest.getContext() map so custom plugins can use it – but i am 100% opposed to having any requesthandlers/components ship with Solr that require it to work – because those would be completely non-functional for embedded Solr users. In a lot of cases, i think it probably makes more sense to ask that the client include the information directly as SolrParams – but that generally assumes that a client that knows about Solr conventions is constructing the request – which doesn't work as well now that more and more people want to let end user browser/clients talk directly to Solr and use Velocity or custom response writers to format the response. My preference would be to write components that might care about these sorts of things to utilize regular SolrParams from the SolrQueryRequest, and then add generic logic to the SolrDispatchFilter to map HTTP Headers to SolrParams. (thus people doing embedded logic can still use the functionality by specifying the appropriate SolrParams) the key questions with an approach like this would then be fairly simple: what type of convention to use for the http header => param naming? what should take precedence in the event of name collision: the HTTP Request Query Params, or the HTTP Request Headers? For people who really want direct access to the HttpServletRequest/Response in their custom plugins (like in SOLR-1944 ), and don't care about about embedded solr users, then we can also include them in the Context (no reason to make a bunch of people re-invent the same DIspatchFilter over and over) but i really feel that nothing that ships with Solr should assume it's there and utilize it directly.
          Hide
          Jan Høydahl added a comment -

          I have been using SolrParams to convey metadata from frontends to middleware layer, and I think it has worked really well. In addition, you get it included in the query logs!
          As for load balancers, most have an option to convey the client's IP in the "X-Forwarded-For" header.

          What if the dispatchFilter adds all HTTP headers to the SolrQueryRequest context. Then we could map explicitly in requestHandler config how to use them:

          <lst name="invariants">
              <str name="_http_remote-ip">$HTTP_HEADERS(X-Forwarded-For, Remote-Address)</str>
          </lst>
          

          This would mean that if HTTP header X-Forwarded-For exists in the context, it will be mapped to param _http_remote-ip, if not, it will use Remote-Address. In this way each application can choose whether to "pollute" the SolrParams with headers or not, choose naming as well as whether it should be invariant or default.

          Show
          Jan Høydahl added a comment - I have been using SolrParams to convey metadata from frontends to middleware layer, and I think it has worked really well. In addition, you get it included in the query logs! As for load balancers, most have an option to convey the client's IP in the "X-Forwarded-For" header. What if the dispatchFilter adds all HTTP headers to the SolrQueryRequest context. Then we could map explicitly in requestHandler config how to use them: <lst name= "invariants" > <str name= "_http_remote-ip" > $HTTP_HEADERS(X-Forwarded-For, Remote-Address) </str> </lst> This would mean that if HTTP header X-Forwarded-For exists in the context, it will be mapped to param _http_remote-ip, if not, it will use Remote-Address. In this way each application can choose whether to "pollute" the SolrParams with headers or not, choose naming as well as whether it should be invariant or default.
          Hide
          Lance Norskog added a comment -

          +1 on Chris' comments.

          Personally, I would keep it as it is. To keep the camel out of the tent, start with the camel's nose.

          Show
          Lance Norskog added a comment - +1 on Chris' comments. Personally, I would keep it as it is. To keep the camel out of the tent, start with the camel's nose.
          Hide
          Robert Muir added a comment -
          Show
          Robert Muir added a comment - Moving out all unassigned issues set to 3.1 per this email: http://www.lucidimagination.com/search/document/f026cc56081b5a51/unassigned_jira_issues_set_for_3_1
          Hide
          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Show
          Robert Muir added a comment - Bulk move 3.2 -> 3.3
          Hide
          Robert Muir added a comment -

          3.4 -> 3.5

          Show
          Robert Muir added a comment - 3.4 -> 3.5
          Hide
          Hoss Man added a comment -

          Bulk of fixVersion=3.6 -> fixVersion=4.0 for issues that have no assignee and have not been updated recently.

          email notification suppressed to prevent mass-spam
          psuedo-unique token identifying these issues: hoss20120321nofix36

          Show
          Hoss Man added a comment - Bulk of fixVersion=3.6 -> fixVersion=4.0 for issues that have no assignee and have not been updated recently. email notification suppressed to prevent mass-spam psuedo-unique token identifying these issues: hoss20120321nofix36
          Hide
          Tomás Fernández Löbbe added a comment -

          I would find this very useful, specially for custom components that may require the original request information like headers. Right now there are no good options to get this information, out of the box is not available and there is no easy way to extend the SolrRequestParsers or SolrDispatchFilter (without recompiling/redeploying) to customize this parsing.
          I like the proposition of adding the object to the SolrRequest context. I'm attaching a possible solution.

          Show
          Tomás Fernández Löbbe added a comment - I would find this very useful, specially for custom components that may require the original request information like headers. Right now there are no good options to get this information, out of the box is not available and there is no easy way to extend the SolrRequestParsers or SolrDispatchFilter (without recompiling/redeploying) to customize this parsing. I like the proposition of adding the object to the SolrRequest context. I'm attaching a possible solution.
          Hide
          Robert Muir added a comment -

          Patch looks good Tomas: I like this solution better than copying individual headers to the map.

          Show
          Robert Muir added a comment - Patch looks good Tomas: I like this solution better than copying individual headers to the map.
          Hide
          Tomás Fernández Löbbe added a comment -

          Yes, I first thought about adding just the headers but that would mean that the value of each header was going to be always an Enumeration<String>, which would require to iterate, even when only one value is expected for a header key, so I thought it was going to be better to just add the request object, and then the custom components could use getHeader(String) or getHeaders(String) depending on the needs, will make the component code cleaner.

          Show
          Tomás Fernández Löbbe added a comment - Yes, I first thought about adding just the headers but that would mean that the value of each header was going to be always an Enumeration<String>, which would require to iterate, even when only one value is expected for a header key, so I thought it was going to be better to just add the request object, and then the custom components could use getHeader(String) or getHeaders(String) depending on the needs, will make the component code cleaner.
          Hide
          Erik Hatcher added a comment -

          Tomás - maybe a little risky, or at least something that needs to be warned about here... what if a component asked for the session from the request object? Wouldn't that cause sessions to be created within Solr and potentially consume resources?

          But in general, +1 to getting HTTP headers (and the entire request object) into the Solr request. Very useful!

          Show
          Erik Hatcher added a comment - Tomás - maybe a little risky, or at least something that needs to be warned about here... what if a component asked for the session from the request object? Wouldn't that cause sessions to be created within Solr and potentially consume resources? But in general, +1 to getting HTTP headers (and the entire request object) into the Solr request. Very useful!
          Hide
          Robert Muir added a comment -

          maybe a little risky, or at least something that needs to be warned about here... what if a component asked for the session from the request object? Wouldn't that cause sessions to be created within Solr and potentially consume resources?

          But this is just a brokenness with the servlet api that any user should be aware of: in this case the user didn't read the javadocs, they should have used getSession(false). If they don't read those javadocs, why would they read ours?

          Show
          Robert Muir added a comment - maybe a little risky, or at least something that needs to be warned about here... what if a component asked for the session from the request object? Wouldn't that cause sessions to be created within Solr and potentially consume resources? But this is just a brokenness with the servlet api that any user should be aware of: in this case the user didn't read the javadocs, they should have used getSession(false). If they don't read those javadocs, why would they read ours?
          Hide
          Uwe Schindler added a comment -

          I agree with Robert. We should pass the HttpServletRequest down with the Solr params. The issue about calling getSession() is not our problem. For Lucene-internal packages we should simply prevent this by adding the APIs to the forbidden-apis signatures (we already have one for servlet-api.jar: lucene/tools/forbiddenApis/servlet-api.txt). If somebody in a 3rd party plugin calls getSession, its his problem. And there might be a valid use-case for this! We just don't want to create sessions in Solr's code, so let's forbid those APIs.

          Show
          Uwe Schindler added a comment - I agree with Robert. We should pass the HttpServletRequest down with the Solr params. The issue about calling getSession() is not our problem. For Lucene-internal packages we should simply prevent this by adding the APIs to the forbidden-apis signatures (we already have one for servlet-api.jar: lucene/tools/forbiddenApis/servlet-api.txt). If somebody in a 3rd party plugin calls getSession, its his problem. And there might be a valid use-case for this! We just don't want to create sessions in Solr's code, so let's forbid those APIs.
          Hide
          Commit Tag Bot added a comment -

          [trunk commit] rmuir
          http://svn.apache.org/viewvc?view=revision&revision=1470614

          SOLR-2079: Add option to pass HttpServletRequest in the SolrQueryRequest context map

          Show
          Commit Tag Bot added a comment - [trunk commit] rmuir http://svn.apache.org/viewvc?view=revision&revision=1470614 SOLR-2079 : Add option to pass HttpServletRequest in the SolrQueryRequest context map
          Hide
          Commit Tag Bot added a comment -

          [branch_4x commit] rmuir
          http://svn.apache.org/viewvc?view=revision&revision=1470617

          SOLR-2079: Add option to pass HttpServletRequest in the SolrQueryRequest context map

          Show
          Commit Tag Bot added a comment - [branch_4x commit] rmuir http://svn.apache.org/viewvc?view=revision&revision=1470617 SOLR-2079 : Add option to pass HttpServletRequest in the SolrQueryRequest context map
          Hide
          Robert Muir added a comment -

          Thanks Tomas!

          Show
          Robert Muir added a comment - Thanks Tomas!
          Hide
          Steve Rowe added a comment -

          Bulk close resolved 4.4 issues

          Show
          Steve Rowe added a comment - Bulk close resolved 4.4 issues

            People

            • Assignee:
              Unassigned
              Reporter:
              Chris A. Mattmann
            • Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development