Description
When I make a request to a URI like /solr/my_core/query?q=%C0, I get an HTTP 500 status code with a stack trace originating at
org.apache.solr.common.SolrException: URLDecoder: Invalid character encoding detected after position 2 of query string / form data (while parsing as UTF-8)
at org.apache.solr.servlet.SolrRequestParsers.decodeChars(SolrRequestParsers.java:421)
…
The obvious reason is that the q parameter value looks like the first byte of a multibyte UTF-8 sequence, but that sequence is incomplete/invalid. I have seen a few more instances of this in our monitoring, with the problem surfacing in different places. [Other issues are unrelated; I will file them separately.]
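The decoding failure can be reproduced outside Solr. The following is a minimal sketch using only the JDK; the class name `Utf8Check` and the method `isValidUtf8` are made up for illustration and are not part of Solr. It decodes the raw bytes strictly, so malformed input is reported instead of being silently replaced.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Solr code: strict UTF-8 validation with the JDK decoder.
class Utf8Check {

    // Returns true only if the bytes form a complete, well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // Thrown for malformed or truncated input such as a lone 0xC0 byte.
            return false;
        }
    }

    public static void main(String[] args) {
        // %C0 percent-decodes to the single byte 0xC0.
        System.out.println(isValidUtf8(new byte[] {(byte) 0xC0}));              // false
        // %C3%A9 percent-decodes to the two-byte encoding of U+00E9.
        System.out.println(isValidUtf8(new byte[] {(byte) 0xC3, (byte) 0xA9})); // true
    }
}
```

Either way, the input is malformed on the client side, which is why a 4xx status seems to fit better than a 5xx one.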
Instead of the HTTP 500 status code, something like HTTP 400 (Bad Request) would be more appropriate. It would also make processing much easier for downstream systems (which have to deal with Solr’s response) if this class of errors could be recognized.
Also, if I look at the place where the exception is being thrown (https://github.com/apache/solr/blob/releases/lucene-solr/7.7.3/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L419-L422), care was taken to use the `ErrorCode.BAD_REQUEST` status. This information, however, seems to be lost along the way.
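Where exactly the status is lost has not been established in this report. One general way such information can disappear is when the exception is wrapped in another one and only the outermost exception is inspected. The sketch below illustrates that pattern and a cause-chain lookup that would preserve the code. It is not Solr code; `StatusLookup`, `StatusException`, and `statusFor` are hypothetical names, and whether this is what actually happens in Solr is an assumption.

```java
// Hypothetical illustration, not Solr code: recover a status code from a wrapped exception.
class StatusLookup {

    // Stand-in for an exception type that carries an HTTP status code.
    static class StatusException extends RuntimeException {
        final int code;

        StatusException(int code, String message) {
            super(message);
            this.code = code;
        }
    }

    // Walks the cause chain and returns the first status code found, else 500.
    static int statusFor(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof StatusException) {
                return ((StatusException) cur).code;
            }
        }
        return 500;
    }

    public static void main(String[] args) {
        Throwable inner = new StatusException(400, "Invalid character encoding");
        Throwable wrapped = new RuntimeException("request failed", inner);

        // Looking only at the outer exception would yield 500; the chain walk finds 400.
        System.out.println(statusFor(wrapped));
        System.out.println(statusFor(new RuntimeException("no status attached")));
    }
}
```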