Lucene - Core
LUCENE-6057

Clarify the Sort(SortField...) constructor

    Details

    • Lucene Fields:
      New

      Description

      I don't really know which version this affects, but I clarified the documentation of the Sort(SortField...) constructor to make it easier for new users to understand.

      Pull Request:

      https://github.com/apache/lucene-solr/pull/20

      1. LUCENE-6057.patch
        1 kB
        Michael McCandless

        Activity

        ASF GitHub Bot added a comment -

        Github user s4ke commented on the pull request:

        https://github.com/apache/lucene-solr/pull/20#issuecomment-62426181

        corresponding JIRA issue: https://issues.apache.org/jira/browse/LUCENE-6057

        Michael McCandless added a comment -

        Thanks Martin.

        I started from your PR and tried to simplify the wording a bit (see attached patch). Is this good? If so I'll commit ...

        Martin Braun added a comment - - edited

        Seems fine to me.

        ASF subversion and git services added a comment -

        Commit 1639581 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1639581 ]

        LUCENE-6057: improve Sort(SortField) docs

        ASF subversion and git services added a comment -

        Commit 1639582 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1639582 ]

        LUCENE-6057: improve Sort(SortField) docs

        ASF subversion and git services added a comment -

        Commit 1639583 from Michael McCandless in branch 'dev/branches/lucene_solr_4_10'
        [ https://svn.apache.org/r1639583 ]

        LUCENE-6057: improve Sort(SortField) docs

        Michael McCandless added a comment -

        Thanks Martin!

        ASF GitHub Bot added a comment -

        Github user s4ke commented on the pull request:

        https://github.com/apache/lucene-solr/pull/20#issuecomment-63032405

        Michael McCandless fixed this in his own commit.

        ASF GitHub Bot added a comment -

        Github user s4ke closed the pull request at:

        https://github.com/apache/lucene-solr/pull/20

        Ahmet Arslan added a comment -

        Thanks, Martin, for clarifying this! I had some cases where multiple documents were assigned the same score. The order of the documents changed from core/index to core/index, so I thought the sorting algorithm Lucene uses is not stable (like heap sort, selection sort, etc.). But it looks like that's not the case? When the default sort is used (sort by score/relevancy), internal Lucene ids are used to break ties, and those ids change during segment merges etc. Is it wrong to say that Lucene uses a non-stable sort?

        Michael McCandless added a comment -

        The sort is stable within the context of a single point-in-time reader.

        But across different readers with index changes, including just merges being completed, it's not "stable".

        Martin Braun added a comment - - edited

        To get stable behaviour you can always use your own ids (in your own field) and let them break the tie before the document id is used. I haven't noticed that unstable behaviour because so far I have only needed custom sorting in extremely complicated cases, and I have always sorted outside of Lucene because of some awkward constraints of legacy sorting code.

        Michael McCandless: Maybe that info should be added to the SortField Documentation as well (or at least a hint), because that's the main entry point for users that provide their own sorting.
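
        The tie-breaking idea above can be illustrated with a minimal plain-Java sketch. It does not use Lucene at all: the Hit class, the comparators, and the sample docIDs are hypothetical and only model "score, then docID" versus "score, then your own id, then docID". In Lucene itself this would presumably correspond to passing a second SortField on your own id field after the score SortField, but that setup is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TieBreakDemo {

    /** Hypothetical search hit: score, reader-internal docID, application-assigned id. */
    static final class Hit {
        final float score;
        final int docId;
        final long ownId;

        Hit(float score, int docId, long ownId) {
            this.score = score;
            this.docId = docId;
            this.ownId = ownId;
        }
    }

    /** Higher scores first. */
    static final Comparator<Hit> BY_SCORE_DESC =
        (a, b) -> Float.compare(b.score, a.score);

    /** Models the default ordering: score, then internal docID as the tie-breaker. */
    static final Comparator<Hit> SCORE_THEN_DOC_ID =
        BY_SCORE_DESC.thenComparingInt(h -> h.docId);

    /** Models the workaround: your own id breaks ties before the docID is consulted. */
    static final Comparator<Hit> SCORE_THEN_OWN_ID =
        BY_SCORE_DESC.thenComparingLong(h -> h.ownId).thenComparingInt(h -> h.docId);

    /** Returns the application ids of the hits in ranked order. */
    static List<Long> rank(List<Hit> hits, Comparator<Hit> order) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(order);
        List<Long> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.ownId);
        }
        return ids;
    }

    public static void main(String[] args) {
        // The same three documents, all tied on score, seen through two readers.
        // readerB models a reader opened after a merge renumbered the docIDs.
        List<Hit> readerA = List.of(
            new Hit(1.0f, 0, 30L), new Hit(1.0f, 1, 10L), new Hit(1.0f, 2, 20L));
        List<Hit> readerB = List.of(
            new Hit(1.0f, 2, 30L), new Hit(1.0f, 0, 10L), new Hit(1.0f, 1, 20L));

        System.out.println(rank(readerA, SCORE_THEN_DOC_ID));
        System.out.println(rank(readerB, SCORE_THEN_DOC_ID));
        System.out.println(rank(readerA, SCORE_THEN_OWN_ID));
        System.out.println(rank(readerB, SCORE_THEN_OWN_ID));
    }
}
```

        With the docID tie-breaker the two readers rank the tied documents differently; with the own-id tie-breaker both readers produce the same order, because that key does not change when docIDs are renumbered.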

        Ahmet Arslan added a comment -

        But across different readers with index changes, including just merges being completed, it's not "stable".

        OK, I understand that. What is the source of this instability?

        • internal Lucene ids are used as a last resort to break ties
        • the sorting algorithm is not stable (like heap sort, selection sort, etc.)
        Martin Braun added a comment - - edited

        The sorting algorithm itself is stable as far as I can tell. (see my comment above)

        You can get it completely stable by doing as I described in the comment.

        Ahmet Arslan added a comment -

        The sorting algorithm itself is stable as far as I can tell.

        Thanks, this is what I was wondering. I know how to make it completely stable.

        Anshum Gupta added a comment -

        Bulk close after 5.0 release.


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Martin Braun
          • Votes:
            0
            Watchers:
            6
