Solr
  1. Solr
  2. SOLR-3073

Distributed Grouping fails if the uniqueKey is a UUID

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 3.5, 4.0-ALPHA
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: None
    • Labels:
      None

      Description

      Attempting to use distributed grouping (using a StrField as the group.fieldname) with a UUID as the uniqueKey results in an error because the classname (java.util.UUID) is prepended to the field value during the second phase of the grouping.

      1. SOLR-3073-3x.patch
        1 kB
        Martijn van Groningen
      2. SOLR-3073-3x.patch
        2 kB
        Devon Krisman

        Activity

        Hide
        Devon Krisman added a comment -

        This is the error that happens when you try to run a distributed grouping search with a UUID as the uniqueKey:

        SEVERE: org.apache.solr.common.SolrException: Invalid UUID String: 'java.util.UUID:317db1e1-b778-ec66-ef68-ddd00b096632'
        at org.apache.solr.schema.UUIDField.toInternal(UUIDField.java:85)
        at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:217)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

        The request handlers append the field Object's classname to its string value if it is from an unrecognized class, the attached patch should add java.util.UUID to the recognized classtypes for Solr's response handlers.

        Show
        Devon Krisman added a comment - This is the error that happens when you try to run a distributed grouping search with a UUID as the uniqueKey: SEVERE: org.apache.solr.common.SolrException: Invalid UUID String: 'java.util.UUID:317db1e1-b778-ec66-ef68-ddd00b096632' at org.apache.solr.schema.UUIDField.toInternal(UUIDField.java:85) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:217) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) The request handlers append the field Object's classname to its string value if it is from an unrecognized class, the attached patch should add java.util.UUID to the recognized classtypes for Solr's response handlers.
        Hide
        Martijn van Groningen added a comment -

        Thanks for noticing this issue!

        I'm not sure if the response writers were ever meant to serialize UUID instances. So maybe this is a general response writing bug?

        I attached an alternative fix for this issue. Instead of invoking FieldType#toObject method it invokes FieldType#toExternal method in the TopGroupsResultTransformer class. I think this is a better approach for distributed grouping since there is no overhead of creating an UUID object (UUID#fromString). For distributed grouping we can just threat all ids as strings.

        Show
        Martijn van Groningen added a comment - Thanks for noticing this issue! I'm not sure if the response writers were ever meant to serialize UUID instances. So maybe this is a general response writing bug? I attached an alternative fix for this issue. Instead of invoking FieldType#toObject method it invokes FieldType#toExternal method in the TopGroupsResultTransformer class. I think this is a better approach for distributed grouping since there is no overhead of creating an UUID object (UUID#fromString). For distributed grouping we can just threat all ids as strings.
        Hide
        Yonik Seeley added a comment -

        I assume the results of TopGroupsResultTransformer is never exposed to the end user (i.e. the end user won't know that we use strings internally)? If so, this fix seems easiest. If not, then UUIDField itself should be fixed to not return a UUID from toObject.

        Show
        Yonik Seeley added a comment - I assume the results of TopGroupsResultTransformer is never exposed to the end user (i.e. the end user won't know that we use strings internally)? If so, this fix seems easiest. If not, then UUIDField itself should be fixed to not return a UUID from toObject.
        Hide
        Martijn van Groningen added a comment -

        The client reading the final response (end user) isn't aware of how ids are exposed between shards.

        A client might however use the internal api (group.distibuted.first / group.distibuted.second). I think since this is an internal api which isn't meant to be used by end users, the last patch can be committed.

        Show
        Martijn van Groningen added a comment - The client reading the final response (end user) isn't aware of how ids are exposed between shards. A client might however use the internal api (group.distibuted.first / group.distibuted.second). I think since this is an internal api which isn't meant to be used by end users, the last patch can be committed.
        Hide
        Hoss Man added a comment -

        If not, then UUIDField itself should be fixed to not return a UUID from toObject.

        Wouldn't that defeat the entire point of toObject? ... I thought the whole purpose was to give more robust response writers like JavaBin access to the best possible representation of hte values, even if they weren't "simple" (ie: strings, ints, etc...) so writers that can handle complex types can use complex types.

        if grouping just needs the string representation of hte unique key field then that's what toExternal is for.

        Show
        Hoss Man added a comment - If not, then UUIDField itself should be fixed to not return a UUID from toObject. Wouldn't that defeat the entire point of toObject? ... I thought the whole purpose was to give more robust response writers like JavaBin access to the best possible representation of hte values, even if they weren't "simple" (ie: strings, ints, etc...) so writers that can handle complex types can use complex types. if grouping just needs the string representation of hte unique key field then that's what toExternal is for.
        Hide
        Yonik Seeley added a comment -

        Wouldn't that defeat the entire point of toObject?

        There's been an implicit assumption that whatever toObject produces, it's generic enough that response writers can handle it.

        Show
        Yonik Seeley added a comment - Wouldn't that defeat the entire point of toObject? There's been an implicit assumption that whatever toObject produces, it's generic enough that response writers can handle it.
        Hide
        Uwe Schindler added a comment -

        if grouping just needs the string representation of hte unique key field then that's what toExternal is for.

        I agree!

        Show
        Uwe Schindler added a comment - if grouping just needs the string representation of hte unique key field then that's what toExternal is for. I agree!
        Hide
        Uwe Schindler added a comment -

        There's been an implicit assumption that whatever toObject produces, it's generic enough that response writers can handle it.

        ResponseWriters can handle j.u.UUID? because they fallback to toString()?

        Show
        Uwe Schindler added a comment - There's been an implicit assumption that whatever toObject produces, it's generic enough that response writers can handle it. ResponseWriters can handle j.u.UUID? because they fallback to toString()?
        Hide
        Hoss Man added a comment -

        ResponseWriters can handle j.u.UUID? because they fallback to toString()?

        right ... i specifically remember asking when toObject was introduced if we should call it "toSimpleObject" and give it javadocs listing the allowed interfaces, and someone (ryan? noble?) said no, FieldTpes should be allowed to return any Object and it's up to the client to decide how to deal with objects they can't handle (ie: toString)

        Incase of UUIDField, the JavaBinCodec can serialize it and you get a java.util.UUID object on the other side.

        Show
        Hoss Man added a comment - ResponseWriters can handle j.u.UUID? because they fallback to toString()? right ... i specifically remember asking when toObject was introduced if we should call it "toSimpleObject" and give it javadocs listing the allowed interfaces, and someone (ryan? noble?) said no, FieldTpes should be allowed to return any Object and it's up to the client to decide how to deal with objects they can't handle (ie: toString) Incase of UUIDField, the JavaBinCodec can serialize it and you get a java.util.UUID object on the other side.
        Hide
        Martijn van Groningen added a comment -

        I committed the patch that I attached to trunk and 3x branch.

        Show
        Martijn van Groningen added a comment - I committed the patch that I attached to trunk and 3x branch.
        Hide
        Martijn van Groningen added a comment -

        Actual error is fixed, so this issue is resolved.

        Show
        Martijn van Groningen added a comment - Actual error is fixed, so this issue is resolved.

          People

          • Assignee:
            Martijn van Groningen
            Reporter:
            Devon Krisman
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development