Solr
  1. Solr
  2. SOLR-5354

Distributed sort is broken with CUSTOM FieldType

    Details

      Description

      We added a custom field type to allow an indexed binary field type that supports search (exact match), prefix search, and sort as unsigned bytes lexicographical compare. For sort, BytesRef's UTF8SortedAsUnicodeComparator accomplishes what we want, and even though the name of the comparator mentions UTF8, it doesn't actually assume so and just does byte-level operation, so it's good. However, when we do this across different nodes, we run into an issue where in QueryComponent.doFieldSortValues:

      // Must do the same conversion when sorting by a
      // String field in Lucene, which returns the terms
      // data as BytesRef:
      if (val instanceof BytesRef)

      { UnicodeUtil.UTF8toUTF16((BytesRef)val, spare); field.setStringValue(spare.toString()); val = ft.toObject(field); }

      UnicodeUtil.UTF8toUTF16 is called on our byte array,which isn't actually UTF8. I did a hack where I specified our own field comparator to be ByteBuffer based to get around that instanceof check, but then the field value gets transformed into BYTEARR in JavaBinCodec, and when it's unmarshalled, it gets turned into byte[]. Then, in QueryComponent.mergeIds, a ShardFieldSortedHitQueue is constructed with ShardDoc.getCachedComparator, which decides to give me comparatorNatural in the else of the TODO for CUSTOM, which barfs because byte[] are not Comparable...

      From Chris Hostetter:

      I'm not very familiar with the distributed sorting code, but based on your
      comments, and a quick skim of the functions you pointed to, it definitely
      seems like there are two problems here for people trying to implement
      custom sorting in custom FieldTypes...

      1) QueryComponent.doFieldSortValues - this definitely seems like it should
      be based on the FieldType, not an "instanceof BytesRef" check (oddly: the
      comment event suggestsion that it should be using the FieldType's
      indexedToReadable() method – but it doesn't do that. If it did, then
      this part of hte logic should work for you as long as your custom
      FieldType implemented indexedToReadable in a sane way.

      2) QueryComponent.mergeIds - that TODO definitely looks like a gap that
      needs filled. I'm guessing the sanest thing to do in the CUSTOM case
      would be to ask the FieldComparatorSource (which should be coming from the
      SortField that the custom FieldType produced) to create a FieldComparator
      (via newComparator - the numHits & sortPos could be anything) and then
      wrap that up in a Comparator facade that delegates to
      FieldComparator.compareValues

      That way a custom FieldType could be in complete control of the sort
      comparisons (even when merging ids).

      ...But as i said: i may be missing something, i'm not super familia with
      that code. Please try it out and let us know if thta works – either way
      please open a Jira pointing out the problems trying to implement
      distributed sorting in a custom FieldType.

      1. SOLR-5354__fix_function_edge_case.patch
        28 kB
        Hoss Man
      2. SOLR-5354.patch
        73 kB
        Steve Rowe
      3. SOLR-5354.patch
        68 kB
        Steve Rowe
      4. SOLR-5354.patch
        45 kB
        Steve Rowe
      5. SOLR-5354.patch
        38 kB
        Steve Rowe

        Issue Links

          Activity

          Hide
          Jessica Cheng Mallet added a comment -

          I think calling indexedToReadable() in QueryComponent.doFieldSortValues won't actually help here because it's trying to get the actual Object of the fields, not the readable String representation. If we do go with sending "sort_values" as readable strings, on the other side when QueryComponent.mergeIds is called, it'll need to take care of translating all readable strings to the actual Objects, which I'm not sure if there's an easy way to do.

          The safest thing/ least change is probably to check the sort field type instead of using instanceof in QueryComponent.doFieldSortValues.

          Show
          Jessica Cheng Mallet added a comment - I think calling indexedToReadable() in QueryComponent.doFieldSortValues won't actually help here because it's trying to get the actual Object of the fields, not the readable String representation. If we do go with sending "sort_values" as readable strings, on the other side when QueryComponent.mergeIds is called, it'll need to take care of translating all readable strings to the actual Objects, which I'm not sure if there's an easy way to do. The safest thing/ least change is probably to check the sort field type instead of using instanceof in QueryComponent.doFieldSortValues.
          Hide
          Steve Rowe added a comment - - edited

          Patch fixing the problem.

          QueryComponent.doFieldSortValues() delegates to new method SortField.toExternal(), which serializes according to the sort field type, in the CUSTOM case via new method FieldComparatorSource.toExternal().

          ShardFieldSortedHitQueue uses ShardComparator.sortVal(), which uses new method SortField.toInternal() to convert the external value to the appropriate object, in the CUSTOM case via new method FieldComparatorSource.toInternal().

          Show
          Steve Rowe added a comment - - edited Patch fixing the problem. QueryComponent.doFieldSortValues() delegates to new method SortField.toExternal() , which serializes according to the sort field type, in the CUSTOM case via new method FieldComparatorSource.toExternal() . ShardFieldSortedHitQueue uses ShardComparator.sortVal() , which uses new method SortField.toInternal() to convert the external value to the appropriate object, in the CUSTOM case via new method FieldComparatorSource.toInternal() .
          Hide
          Steve Rowe added a comment -

          If there are no objections I'll commit this tomorrow.

          Show
          Steve Rowe added a comment - If there are no objections I'll commit this tomorrow.
          Hide
          Robert Muir added a comment -

          I think there are a couple issues to address first at least in the lucene part.

          Can we please not have the Object/Object stuff in FieldComparatorSource? This is wrong: FieldComparator already has a generic type so I don't understand the need to discard type safety.

          The unicode conversion for String/String_VAL is incorrect and should not exist: despite the name, these types can be any bytes.

          Show
          Robert Muir added a comment - I think there are a couple issues to address first at least in the lucene part. Can we please not have the Object/Object stuff in FieldComparatorSource? This is wrong: FieldComparator already has a generic type so I don't understand the need to discard type safety. The unicode conversion for String/String_VAL is incorrect and should not exist: despite the name, these types can be any bytes.
          Hide
          Robert Muir added a comment -

          As a concrete example the CollationField and ICUCollationField sort with String/String_VAL comparators but contain non-unicode bytes.

          These currently do not work distributed today either (which I would love to see fixed on this issue).

          Show
          Robert Muir added a comment - As a concrete example the CollationField and ICUCollationField sort with String/String_VAL comparators but contain non-unicode bytes. These currently do not work distributed today either (which I would love to see fixed on this issue).
          Hide
          Steve Rowe added a comment -

          Thanks for the review Robert.

          Can we please not have the Object/Object stuff in FieldComparatorSource? This is wrong: FieldComparator already has a generic type so I don't understand the need to discard type safety.

          I'm not sure what you have in mind - do you think FieldComparatorSource should be generified? In this case I think each extending class will need to provide an implementation for these methods, since there isn't a sensible way to provide a default implementation of conversion to/from the generic type.

          The unicode conversion for String/String_VAL is incorrect and should not exist: despite the name, these types can be any bytes

          This is the status quo right now - the patch just keeps that in place. But I agree. I think the issue is non-binary (XML) serialization, for which UTF-8 is safe, but arbitrary binary is not. Serializing all STRING/STRING_VAL as Base64 seems wasteful in the general case.

          Relatedly, looks like there's an orphaned SortField.Type.BYTES (orphaned in that it's not handled in lots of places) - I guess this should go away?

          As a concrete example the CollationField and ICUCollationField sort with String/String_VAL comparators but contain non-unicode bytes.

          These currently do not work distributed today either (which I would love to see fixed on this issue).

          I'm working on a distributed version of the Solr (icu) collation tests. Once I get that failing, I'll be able to test potential solutions.

          Show
          Steve Rowe added a comment - Thanks for the review Robert. Can we please not have the Object/Object stuff in FieldComparatorSource? This is wrong: FieldComparator already has a generic type so I don't understand the need to discard type safety. I'm not sure what you have in mind - do you think FieldComparatorSource should be generified? In this case I think each extending class will need to provide an implementation for these methods, since there isn't a sensible way to provide a default implementation of conversion to/from the generic type. The unicode conversion for String/String_VAL is incorrect and should not exist: despite the name, these types can be any bytes This is the status quo right now - the patch just keeps that in place. But I agree. I think the issue is non-binary (XML) serialization, for which UTF-8 is safe, but arbitrary binary is not. Serializing all STRING/STRING_VAL as Base64 seems wasteful in the general case. Relatedly, looks like there's an orphaned SortField.Type.BYTES (orphaned in that it's not handled in lots of places) - I guess this should go away? As a concrete example the CollationField and ICUCollationField sort with String/String_VAL comparators but contain non-unicode bytes. These currently do not work distributed today either (which I would love to see fixed on this issue). I'm working on a distributed version of the Solr (icu) collation tests. Once I get that failing, I'll be able to test potential solutions.
          Hide
          Robert Muir added a comment -

          This is the status quo right now - the patch just keeps that in place.

          No its not: its a bug in solr. This patch moves that bug into Lucene.

          Lucene's APIs here work correctly on any bytes today.

          Show
          Robert Muir added a comment - This is the status quo right now - the patch just keeps that in place. No its not: its a bug in solr. This patch moves that bug into Lucene. Lucene's APIs here work correctly on any bytes today.
          Hide
          Robert Muir added a comment -

          I think the issue is non-binary (XML) serialization, for which UTF-8 is safe, but arbitrary binary is not. Serializing all STRING/STRING_VAL as Base64 seems wasteful in the general case.

          This is all solr stuff. I don't think it makes sense to move that logic into lucene, let the user deal with this. They might not be using XML at all: maybe thrift or avro or something else.

          Why not just add serialize/deserialize methods to solr's FieldType.java? It seems like the obvious place.

          Show
          Robert Muir added a comment - I think the issue is non-binary (XML) serialization, for which UTF-8 is safe, but arbitrary binary is not. Serializing all STRING/STRING_VAL as Base64 seems wasteful in the general case. This is all solr stuff. I don't think it makes sense to move that logic into lucene, let the user deal with this. They might not be using XML at all: maybe thrift or avro or something else. Why not just add serialize/deserialize methods to solr's FieldType.java? It seems like the obvious place.
          Hide
          Jessica Cheng Mallet added a comment -

          Why not just add serialize/deserialize methods to solr's FieldType.java? It seems like the obvious place.

          When SortField's are deserialized on the receiving end, it's no longer clear which FieldType the field came from. If the deserialization method depends on FieldType, the node responsible for the merge must also have the schema loaded, which might not be the case in SolrCloud. Maybe solr needs its own SortField too then?

          Show
          Jessica Cheng Mallet added a comment - Why not just add serialize/deserialize methods to solr's FieldType.java? It seems like the obvious place. When SortField's are deserialized on the receiving end, it's no longer clear which FieldType the field came from. If the deserialization method depends on FieldType, the node responsible for the merge must also have the schema loaded, which might not be the case in SolrCloud. Maybe solr needs its own SortField too then?
          Hide
          Robert Muir added a comment -

          When SortField's are deserialized on the receiving end, it's no longer clear which FieldType the field came from. If the deserialization method depends on FieldType, the node responsible for the merge must also have the schema loaded, which might not be the case in SolrCloud.

          Then where is it getting a comparator from? I don't understand how changing a lucene API solves this problem.

          Show
          Robert Muir added a comment - When SortField's are deserialized on the receiving end, it's no longer clear which FieldType the field came from. If the deserialization method depends on FieldType, the node responsible for the merge must also have the schema loaded, which might not be the case in SolrCloud. Then where is it getting a comparator from? I don't understand how changing a lucene API solves this problem.
          Hide
          Robert Muir added a comment -

          Maybe solr needs its own SortField too then?

          OK I see it, I think solr should fix its own apis here? It could add FieldType[] to SortSpec or something like that.

          Show
          Robert Muir added a comment - Maybe solr needs its own SortField too then? OK I see it, I think solr should fix its own apis here? It could add FieldType[] to SortSpec or something like that.
          Hide
          Hoss Man added a comment -

          I've been looking into Solr's distributed sorting code more and more as part of my investigating into SOLR-5463 and i spoke breifly with sarowe off line about the overlap.

          I think the problems with CUSTOM distributed sorting is really just a subset of the larger weirdness with the assumptions Solr makes in general about how it can do distributed sorting and how it can de/serialize the sort values when merging hte results from the multiple shards.

          I think my earlier suggestion (in email that jessica quoted in the issue summary) about using methods on the FieldType (like indexedToReadable and toObject) to ensure we safely de/serialize the sort values are still the right way to go – we have to ensure that no matter what strange object an arbitrary objects are used by a FieldComparator, we can safely serialize it. But i'm not longer convinced re-using those existing methods makes sense – because the sort values used by a FieldType's FieldComparator may not map directly to the "end user" representation of the value (ie: TriedDateField sorts as "long", but toObject returns "Date"; String fields sort on BytesRefs; Custom classes sort on who-knows-what, etc...)

          I think the best solution would be something like:

          • move the toExternal/toInternal concept in the existing patch out of FieldComparatorSource and into Solr's FieldType as methods clearly ment to be very speciic to sorting (ie: "marshalSortValue" and "unmarshalSortValue")>
          • change the fsv=true logic on shards to use marshalSortValue for any SortField that is on a field (if it's score or a function it will be a sinple numeric and already safe to serialize over the wire)
          • change the mergeIds logic on the coordinator node to explicitly use unmarshalSortValue and then use the actual FieldComparator associated with each SortField instead of the hooky assumptions currently being made in ShardFieldSortedHitQueue.getCachedComparator about using things like "comparatorNatural"

          Other misc comments...

          If the deserialization method depends on FieldType, the node responsible for the merge must also have the schema loaded, which might not be the case in SolrCloud.

          That's already a requirement in SolrCloud - the coordnator node merging results and writing them back to the client already has to have the same schema. (If it didn't a custom FieldType with a custom FieldComparator could never work, because there would be now way at all to know what order things should go in)

          I think solr should fix its own apis here? It could add FieldType[] to SortSpec or something like that.

          I'm not sure why that would help? We can already ask each SortField for it's getField() and then look that up in the Schema. The crux of the problem really seems to be: naive assumptions in the distributed sorting code about how to safely send sort values over the wire; and what comparator to use when sorting those values.

          Show
          Hoss Man added a comment - I've been looking into Solr's distributed sorting code more and more as part of my investigating into SOLR-5463 and i spoke breifly with sarowe off line about the overlap. I think the problems with CUSTOM distributed sorting is really just a subset of the larger weirdness with the assumptions Solr makes in general about how it can do distributed sorting and how it can de/serialize the sort values when merging hte results from the multiple shards. I think my earlier suggestion (in email that jessica quoted in the issue summary) about using methods on the FieldType (like indexedToReadable and toObject) to ensure we safely de/serialize the sort values are still the right way to go – we have to ensure that no matter what strange object an arbitrary objects are used by a FieldComparator, we can safely serialize it. But i'm not longer convinced re-using those existing methods makes sense – because the sort values used by a FieldType's FieldComparator may not map directly to the "end user" representation of the value (ie: TriedDateField sorts as "long", but toObject returns "Date"; String fields sort on BytesRefs; Custom classes sort on who-knows-what, etc...) I think the best solution would be something like: move the toExternal/toInternal concept in the existing patch out of FieldComparatorSource and into Solr's FieldType as methods clearly ment to be very speciic to sorting (ie: "marshalSortValue" and "unmarshalSortValue")> change the fsv=true logic on shards to use marshalSortValue for any SortField that is on a field (if it's score or a function it will be a sinple numeric and already safe to serialize over the wire) change the mergeIds logic on the coordinator node to explicitly use unmarshalSortValue and then use the actual FieldComparator associated with each SortField instead of the hooky assumptions currently being made in ShardFieldSortedHitQueue.getCachedComparator about using things like "comparatorNatural" Other misc comments... If the deserialization method depends on FieldType, the node responsible for the merge must also have the schema loaded, which might not be the case in SolrCloud. That's already a requirement in SolrCloud - the coordnator node merging results and writing them back to the client already has to have the same schema. (If it didn't a custom FieldType with a custom FieldComparator could never work, because there would be now way at all to know what order things should go in) I think solr should fix its own apis here? It could add FieldType[] to SortSpec or something like that. I'm not sure why that would help? We can already ask each SortField for it's getField() and then look that up in the Schema. The crux of the problem really seems to be: naive assumptions in the distributed sorting code about how to safely send sort values over the wire; and what comparator to use when sorting those values.
          Hide
          Robert Muir added a comment -

          I'm not sure why that would help? We can already ask each SortField for it's getField() and then look that up in the Schema. The crux of the problem really seems to be: naive assumptions in the distributed sorting code about how to safely send sort values over the wire; and what comparator to use when sorting those values.

          My point is that the serialization/deserialization doesn't really belong in the lucene comparator API, thats all. I agree with your proposed solution...

          Show
          Robert Muir added a comment - I'm not sure why that would help? We can already ask each SortField for it's getField() and then look that up in the Schema. The crux of the problem really seems to be: naive assumptions in the distributed sorting code about how to safely send sort values over the wire; and what comparator to use when sorting those values. My point is that the serialization/deserialization doesn't really belong in the lucene comparator API, thats all. I agree with your proposed solution...
          Hide
          Steve Rowe added a comment -

          Patch moving de/serialization to Solr's FieldType according to Hoss's outline - no Lucene classes are modified in this version.

          All Solr tests pass.

          I think this should also fix distributed (ICU)CollationField - I'm working on a test.

          Show
          Steve Rowe added a comment - Patch moving de/serialization to Solr's FieldType according to Hoss's outline - no Lucene classes are modified in this version. All Solr tests pass. I think this should also fix distributed (ICU)CollationField - I'm working on a test.
          Hide
          Robert Muir added a comment -

          This looks great Steve, thanks.

          Show
          Robert Muir added a comment - This looks great Steve, thanks.
          Hide
          Hoss Man added a comment -

          Steve: looks great for the most part.

          a few comments / questions...

          • call me paranoid, but i really dislike distrib tests that only use the query() method to ensure that the distrib response is the same as the control response – could we please add some assertions that use queryServer() to prove the docs are coming back in the right order in the distrib test?
          • the test should really sanity check that multi-level sorts (eg: "payload asc, id desc") are working properly
          • we should be really clear & careful in the javadocs for FieldType.marshalSortValue and FieldType.unmarshalSortValue – in your patch they refer to "a value of this FieldType" but that's not actually what they operate on. They operate on the values used by the FieldComparator returned by the SortField for this FieldType (ie: SortableDoubleField's toObject returns a Double, but the marshal method operates on ByteRef)
          • I'm confused why we still need comparatorNatural() and it's use for REWRITEABLE. Why not actually rewrite() the SortField using the local IndexSearcher and then wrap the rewritten SortField's FieldComparator using comparatorFieldComparator() just like any other SortField? Since we're only ever going to compare the raw values on the coordinator it shouldn't matter if we rewrite in terms of the local IndexSearcher - it's the best we can do, and that seems safer then assuming REWRITABLE == function and trusting comparatorNatural. (ie: consider someone who writes a custom FieldType that uses REWRITABLE)
          • don't the marshal methods in StrField, TextField, and CollationField need null checks (for the possibilities of docs w/o a value in the sort field?)
          • do we even have any existing tests of distributed sorting on strings & numerics using sortMisstingLast / sortMissingFirst to be sure we don't break that?
          Show
          Hoss Man added a comment - Steve: looks great for the most part. a few comments / questions... call me paranoid, but i really dislike distrib tests that only use the query() method to ensure that the distrib response is the same as the control response – could we please add some assertions that use queryServer() to prove the docs are coming back in the right order in the distrib test? the test should really sanity check that multi-level sorts (eg: "payload asc, id desc") are working properly we should be really clear & careful in the javadocs for FieldType.marshalSortValue and FieldType.unmarshalSortValue – in your patch they refer to "a value of this FieldType" but that's not actually what they operate on. They operate on the values used by the FieldComparator returned by the SortField for this FieldType (ie: SortableDoubleField's toObject returns a Double, but the marshal method operates on ByteRef) I'm confused why we still need comparatorNatural() and it's use for REWRITEABLE. Why not actually rewrite() the SortField using the local IndexSearcher and then wrap the rewritten SortField's FieldComparator using comparatorFieldComparator() just like any other SortField? Since we're only ever going to compare the raw values on the coordinator it shouldn't matter if we rewrite in terms of the local IndexSearcher - it's the best we can do, and that seems safer then assuming REWRITABLE == function and trusting comparatorNatural. (ie: consider someone who writes a custom FieldType that uses REWRITABLE) don't the marshal methods in StrField, TextField, and CollationField need null checks (for the possibilities of docs w/o a value in the sort field?) do we even have any existing tests of distributed sorting on strings & numerics using sortMisstingLast / sortMissingFirst to be sure we don't break that?
          Hide
          Steve Rowe added a comment -

          Thanks Robert and Hoss for the reviews.

          New patch incorporating Hoss's suggestions, details below:

          call me paranoid, but i really dislike distrib tests that only use the query() method to ensure that the distrib response is the same as the control response – could we please add some assertions that use queryServer() to prove the docs are coming back in the right order in the distrib test?

          As Hoss suggested on #solr IRC, I changed BasicDistributedSearchTestCase#query() to return the QueryResponse (instead of void), so that queryServer() doesn't have to be called separately from the random server vs. control check that query() performs.

          I then added a new method to SolrTestCaseJ4: assertFieldValues(doclist,fieldName,values), and now check that id field values are in the expected order using this new method.

          the test should really sanity check that multi-level sorts (eg: "payload asc, id desc") are working properly

          Added.

          we should be really clear & careful in the javadocs for FieldType.marshalSortValue and FieldType.unmarshalSortValue – in your patch they refer to "a value of this FieldType" but that's not actually what they operate on. They operate on the values used by the FieldComparator returned by the SortField for this FieldType (ie: SortableDoubleField's toObject returns a Double, but the marshal method operates on ByteRef)

          Reworded.

          I'm confused why we still need comparatorNatural() and it's use for REWRITEABLE. Why not actually rewrite() the SortField using the local IndexSearcher and then wrap the rewritten SortField's FieldComparator using comparatorFieldComparator() just like any other SortField? Since we're only ever going to compare the raw values on the coordinator it shouldn't matter if we rewrite in terms of the local IndexSearcher - it's the best we can do, and that seems safer then assuming REWRITABLE == function and trusting comparatorNatural. (ie: consider someone who writes a custom FieldType that uses REWRITABLE)

          Fixed - comparatorNatural() is gone.

          don't the marshal methods in StrField, TextField, and CollationField need null checks (for the possibilities of docs w/o a value in the sort field?)

          Yes, they do, ICUCollatonField too - added.

          do we even have any existing tests of distributed sorting on strings & numerics using sortMisstingLast / sortMissingFirst to be sure we don't break that?

          No, I couldn't find any, so I added tests for sortMissingFirst and sortMissingLast on both SortableIntField and StringField on the existing TestDistributedSearch class. I'll add testing for a Trie field too before I commit, not in the patch yet.

          Show
          Steve Rowe added a comment - Thanks Robert and Hoss for the reviews. New patch incorporating Hoss's suggestions, details below: call me paranoid, but i really dislike distrib tests that only use the query() method to ensure that the distrib response is the same as the control response – could we please add some assertions that use queryServer() to prove the docs are coming back in the right order in the distrib test? As Hoss suggested on #solr IRC, I changed BasicDistributedSearchTestCase#query() to return the QueryResponse (instead of void), so that queryServer() doesn't have to be called separately from the random server vs. control check that query() performs. I then added a new method to SolrTestCaseJ4 : assertFieldValues(doclist,fieldName,values) , and now check that id field values are in the expected order using this new method. the test should really sanity check that multi-level sorts (eg: "payload asc, id desc") are working properly Added. we should be really clear & careful in the javadocs for FieldType.marshalSortValue and FieldType.unmarshalSortValue – in your patch they refer to "a value of this FieldType" but that's not actually what they operate on. They operate on the values used by the FieldComparator returned by the SortField for this FieldType (ie: SortableDoubleField's toObject returns a Double, but the marshal method operates on ByteRef) Reworded. I'm confused why we still need comparatorNatural() and it's use for REWRITEABLE. Why not actually rewrite() the SortField using the local IndexSearcher and then wrap the rewritten SortField's FieldComparator using comparatorFieldComparator() just like any other SortField? Since we're only ever going to compare the raw values on the coordinator it shouldn't matter if we rewrite in terms of the local IndexSearcher - it's the best we can do, and that seems safer then assuming REWRITABLE == function and trusting comparatorNatural. (ie: consider someone who writes a custom FieldType that uses REWRITABLE) Fixed - comparatorNatural() is gone. don't the marshal methods in StrField, TextField, and CollationField need null checks (for the possibilities of docs w/o a value in the sort field?) Yes, they do, ICUCollatonField too - added. do we even have any existing tests of distributed sorting on strings & numerics using sortMisstingLast / sortMissingFirst to be sure we don't break that? No, I couldn't find any, so I added tests for sortMissingFirst and sortMissingLast on both SortableIntField and StringField on the existing TestDistributedSearch class. I'll add testing for a Trie field too before I commit, not in the patch yet.
          Hide
          Steve Rowe added a comment -

          I added a Trie long field to the sort missing first/last tests, and moved them from the existing TestDistributedSearch class to a new class TestDistributedMissingSort with its own dedicated minimal schema file.

          All solr tests pass, and ant precommit passes.

          Committing shortly.

          I'll make a separate issue for testing distributed collation.

          Show
          Steve Rowe added a comment - I added a Trie long field to the sort missing first/last tests, and moved them from the existing TestDistributedSearch class to a new class TestDistributedMissingSort with its own dedicated minimal schema file. All solr tests pass, and ant precommit passes. Committing shortly. I'll make a separate issue for testing distributed collation.
          Hide
          ASF subversion and git services added a comment -

          Commit 1546457 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1546457 ]

          SOLR-5354: Distributed sort is broken with CUSTOM FieldType

          Show
          ASF subversion and git services added a comment - Commit 1546457 from Steve Rowe in branch 'dev/trunk' [ https://svn.apache.org/r1546457 ] SOLR-5354 : Distributed sort is broken with CUSTOM FieldType
          Hide
          ASF subversion and git services added a comment -

          Commit 1546461 from Steve Rowe in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1546461 ]

          SOLR-5354: Distributed sort is broken with CUSTOM FieldType (merged trunk r1546457)

          Show
          ASF subversion and git services added a comment - Commit 1546461 from Steve Rowe in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1546461 ] SOLR-5354 : Distributed sort is broken with CUSTOM FieldType (merged trunk r1546457)
          Hide
          Steve Rowe added a comment -

          Committed to trunk and branch_4x.

          Show
          Steve Rowe added a comment - Committed to trunk and branch_4x.
          Hide
          ASF subversion and git services added a comment -

          Commit 1546571 from Robert Muir in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1546571 ]

          SOLR-5354: don't try to write docvalues with 3.x codec in these tests

          Show
          ASF subversion and git services added a comment - Commit 1546571 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1546571 ] SOLR-5354 : don't try to write docvalues with 3.x codec in these tests
          Hide
          ASF subversion and git services added a comment -

          Commit 1546589 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1546589 ]

          SOLR-5354: fix attribution

          Show
          ASF subversion and git services added a comment - Commit 1546589 from Steve Rowe in branch 'dev/trunk' [ https://svn.apache.org/r1546589 ] SOLR-5354 : fix attribution
          Hide
          ASF subversion and git services added a comment -

          Commit 1546591 from Steve Rowe in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1546591 ]

          SOLR-5354: fix attribution (merged trunk r1546589)

          Show
          ASF subversion and git services added a comment - Commit 1546591 from Steve Rowe in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1546591 ] SOLR-5354 : fix attribution (merged trunk r1546589)
          Hide
          Hoss Man added a comment -

          Looking further at this as part of SOLR-5463 I realized that this change introduces a new bug in some edge cases of sorting on functions. The easiest way to reproduce this is to add a catchall dynamic field to the example schema.xml...

          <dynamicField name="*" type="ignored" multiValued="true" />

          ..and then attempt a distributed sort on a function...

          http://localhost:8983/solr/select?q=*:*&sort=sum%28popularity,price%29+desc&shards=localhost:8983/solr

          (But the problem can be reproduced in more sublte ways – notably any prefix based dynamicField that is also a prefix of a function – for example "s*" or {{"currency*"} as dynamicFields)

          The problem comes about because of a mistaken impression I had when i made this comment above...

          I think solr should fix its own apis here? It could add FieldType[] to SortSpec or something like that.

          I'm not sure why that would help? We can already ask each SortField for it's getField() and then look that up in the Schema.

          My mistake was in thinking that SortField.getField() would always be null unless the Sort was on a "real" field – but SortField's built arround functions have a getField method that returns a string representation of the function. (This is behavior is currently required for the distributed sorting code when serializing/deserializing the list of sort values from each shard in order to know which values belong to which SortField).

          In the past, using SortField.getField() to ask IndexSchema for a FieldType was a rare occurance driven by the runtime type of the value found – the FieldType found was never used unless the runtime sort value was a String, so it was never a problem if a dynamicField pattern matched a sort function string since they never returned Strings. But now wit hthis new code, where we use the FieldType's marshalling methods for any valid field, it can cause problems.

          I think rmuir's suggestion is spot on: When parsing the SortSpec we need to keep track of the FieldType at a minimum – but it's just as easy to keep track of the SchemaField itself.

          I've got a patch where I tweaked sarowe's new test to demonstrate the bug and then fix it by having SortSpec keep track of a List<SchemaField> that corrisponds one-to-one with the SortFields. It's a bit hairy, but it works.

          Show
          Hoss Man added a comment - Looking further at this as part of SOLR-5463 I realized that this change introduces a new bug in some edge cases of sorting on functions. The easiest way to reproduce this is to add a catchall dynamic field to the example schema.xml... <dynamicField name="*" type="ignored" multiValued="true" /> ..and then attempt a distributed sort on a function... http://localhost:8983/solr/select?q=*:*&sort=sum%28popularity,price%29+desc&shards=localhost:8983/solr (But the problem can be reproduced in more sublte ways – notably any prefix based dynamicField that is also a prefix of a function – for example "s*" or {{"currency*"} as dynamicFields) The problem comes about because of a mistaken impression I had when i made this comment above... I think solr should fix its own apis here? It could add FieldType[] to SortSpec or something like that. I'm not sure why that would help? We can already ask each SortField for it's getField() and then look that up in the Schema. My mistake was in thinking that SortField.getField() would always be null unless the Sort was on a "real" field – but SortField's built arround functions have a getField method that returns a string representation of the function. (This is behavior is currently required for the distributed sorting code when serializing/deserializing the list of sort values from each shard in order to know which values belong to which SortField). In the past, using SortField.getField() to ask IndexSchema for a FieldType was a rare occurance driven by the runtime type of the value found – the FieldType found was never used unless the runtime sort value was a String, so it was never a problem if a dynamicField pattern matched a sort function string since they never returned Strings. But now wit hthis new code, where we use the FieldType's marshalling methods for any valid field, it can cause problems. I think rmuir's suggestion is spot on: When parsing the SortSpec we need to keep track of the FieldType at a minimum – but it's just as easy to keep track of the SchemaField itself. I've got a patch where I tweaked sarowe's new test to demonstrate the bug and then fix it by having SortSpec keep track of a List<SchemaField> that corrisponds one-to-one with the SortFields. It's a bit hairy, but it works.
          Hide
          Hoss Man added a comment -

          Steve: could you take a look at this patch and let me know what you think?

          Feel free to go ahead and commit if you think this looks ok, but if you have concerns and think we should go a different direction for the fix you certainly won't hurt my feelings.

          Show
          Hoss Man added a comment - Steve: could you take a look at this patch and let me know what you think? Feel free to go ahead and commit if you think this looks ok, but if you have concerns and think we should go a different direction for the fix you certainly won't hurt my feelings.
          Hide
          ASF subversion and git services added a comment -

          Commit 1547430 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1547430 ]

          SOLR-5354: applying hoss's patch to fix function edge case in distributed sort

          Show
          ASF subversion and git services added a comment - Commit 1547430 from Steve Rowe in branch 'dev/trunk' [ https://svn.apache.org/r1547430 ] SOLR-5354 : applying hoss's patch to fix function edge case in distributed sort
          Hide
          ASF subversion and git services added a comment -

          Commit 1547473 from Steve Rowe in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1547473 ]

          SOLR-5354: applying hoss's patch to fix function edge case in distributed sort (merged trunk r1547430)

          Show
          ASF subversion and git services added a comment - Commit 1547473 from Steve Rowe in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1547473 ] SOLR-5354 : applying hoss's patch to fix function edge case in distributed sort (merged trunk r1547430)
          Hide
          Steve Rowe added a comment -

          committed Hoss's patch to trunk and branch_4x

          Show
          Steve Rowe added a comment - committed Hoss's patch to trunk and branch_4x
          Hide
          ASF subversion and git services added a comment -

          Commit 1558618 from Robert Muir in branch 'dev/branches/lucene539399'
          [ https://svn.apache.org/r1558618 ]

          LUCENE-5399, SOLR-5354: fix distributed grouping to marshal/unmarshal sort values properly

          Show
          ASF subversion and git services added a comment - Commit 1558618 from Robert Muir in branch 'dev/branches/lucene539399' [ https://svn.apache.org/r1558618 ] LUCENE-5399 , SOLR-5354 : fix distributed grouping to marshal/unmarshal sort values properly

            People

            • Assignee:
              Steve Rowe
              Reporter:
              Jessica Cheng Mallet
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development