Lucene - Core: LUCENE-1478

Missing possibility to supply custom FieldParser when sorting search results

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4
    • Fix Version/s: 2.9
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      While implementing the new TrieRangeQuery for contrib (LUCENE-1470), I ran into the problem that the trie-encoded values (longs stored in a special encoding) cannot be sorted through Searcher.search() and SortField. With SortField.LONG you get NumberFormatExceptions. The trie-encoded values can be sorted with SortField.STRING, because the encoding keeps them sortable as Strings, but this is very memory-inefficient.
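To make the failure concrete, here is a minimal sketch. The `encode`/`decode` pair is a hypothetical stand-in (sign bit flipped, 16 hex digits), chosen only because it sorts correctly as a String; it is not the real trie encoding from LUCENE-1470. The sketch also assumes that the default long sort boils down to a plain decimal parse, which matches the NumberFormatExceptions reported above.

```java
import java.util.Arrays;

final class TrieSortProblemDemo {
    // Hypothetical stand-in encoding (NOT the real trie format): flip the sign
    // bit and print 16 lowercase hex digits, so String order equals numeric order.
    static String encode(long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    // Inverse of encode(): read the hex digits and flip the sign bit back.
    static long decode(String term) {
        return Long.parseUnsignedLong(term, 16) ^ Long.MIN_VALUE;
    }

    // Mimics a plain decimal parse (assumed behaviour of the default long sort).
    static boolean parsesAsDecimal(String term) {
        try {
            Long.parseLong(term);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Sorts the encoded terms as Strings, then decodes them again.
    static long[] sortViaStrings(long... values) {
        String[] terms = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            terms[i] = encode(values[i]);
        }
        Arrays.sort(terms);
        long[] sorted = new long[terms.length];
        for (int i = 0; i < terms.length; i++) {
            sorted[i] = decode(terms[i]);
        }
        return sorted;
    }

    public static void main(String[] args) {
        // The encoded term contains hex letters, so the decimal parse fails.
        System.out.println(parsesAsDecimal(encode(42L)));
        // Sorting the terms as Strings still yields the numeric order.
        System.out.println(Arrays.toString(sortViaStrings(42L, -7L, 0L, 1000000L)));
    }
}
```

String sorting gives the right order here, but it works on String terms instead of one primitive long per document, which is the memory cost the description complains about.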

      ExtendedFieldCache lets you specify a custom LongParser when retrieving the cached values. You cannot use this during searching, however, because there is no way to pass the custom LongParser to the SortField.

      I propose a change to the sort classes:
      Include a reference to the parser instance to be used in SortField (falling back to the default if none is given). My idea is to create the SortField through a new constructor:

      SortField(String field, int type, Object parser, boolean reverse)

      The parser is declared as Object because the current parsers share no super-interface. The ideal solution would be:

      SortField(String field, int type, FieldCache.Parser parser, boolean reverse)

      where FieldCache.Parser is a super-interface (empty, essentially a marker interface) of all other parsers (LongParser and so on). The sort implementation must then be changed to use the given parser if it is not null, and otherwise fall back to the default FieldCache.getXXXX call without a parser.
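The proposal above can be sketched roughly as follows. This is a self-contained illustration, not Lucene code: `Parser`, `LongParser`, `SortField`, the `LONG` constant, `loadLongs` and `sortDocs` are simplified, hypothetical stand-ins for the real classes, and the hex parser reads a made-up order-preserving encoding rather than the actual trie format.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ParserAwareSortDemo {
    /** Empty marker super-interface, as proposed for FieldCache.Parser. */
    interface Parser {}

    /** Stand-in for the long parser; extends the marker interface. */
    interface LongParser extends Parser {
        long parseLong(String term);
    }

    static final int LONG = 6; // placeholder type constant for this sketch

    // Default used when the SortField carries no parser: plain decimal parse.
    static final LongParser DEFAULT_LONG_PARSER = Long::parseLong;

    // Custom parser for a hypothetical encoding: 16 hex digits, sign bit flipped.
    static final LongParser HEX_PARSER =
            term -> Long.parseUnsignedLong(term, 16) ^ Long.MIN_VALUE;

    static final class SortField {
        final String field;
        final int type;
        final Parser parser; // may be null, meaning "use the default"
        final boolean reverse;

        SortField(String field, int type, Parser parser, boolean reverse) {
            // A marker interface is not type-safe on its own, so check here.
            if (type == LONG && parser != null && !(parser instanceof LongParser)) {
                throw new IllegalArgumentException("LONG sort needs a LongParser");
            }
            this.field = field;
            this.type = type;
            this.parser = parser;
            this.reverse = reverse;
        }
    }

    // Respect the given parser if it is not null, else use the default.
    static long[] loadLongs(String[] termsByDoc, SortField sf) {
        LongParser p = sf.parser != null ? (LongParser) sf.parser : DEFAULT_LONG_PARSER;
        long[] values = new long[termsByDoc.length];
        for (int doc = 0; doc < termsByDoc.length; doc++) {
            values[doc] = p.parseLong(termsByDoc[doc]);
        }
        return values;
    }

    // Returns document ids ordered by their parsed long value.
    static Integer[] sortDocs(String[] termsByDoc, SortField sf) {
        long[] values = loadLongs(termsByDoc, sf);
        Integer[] docs = new Integer[termsByDoc.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        Comparator<Integer> byValue = Comparator.comparingLong((Integer d) -> values[d]);
        Arrays.sort(docs, sf.reverse ? byValue.reversed() : byValue);
        return docs;
    }

    public static void main(String[] args) {
        // Docs 0, 1, 2 hold the encoded values 42, -7 and 0.
        String[] terms = {"800000000000002a", "7ffffffffffffff9", "8000000000000000"};
        SortField sf = new SortField("price", LONG, HEX_PARSER, false);
        System.out.println(Arrays.toString(sortDocs(terms, sf))); // [1, 2, 0]
    }
}
```

Typing the constructor argument as the marker interface rejects arbitrary objects at compile time, but the match between sort type and parser type can still only be checked at runtime, as the constructor in the sketch does.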

      1. LUCENE-1478.patch
        37 kB
        Michael McCandless
      2. LUCENE-1478.patch
        34 kB
        Michael McCandless
      3. LUCENE-1478.patch
        32 kB
        Uwe Schindler
      4. LUCENE-1478.patch
        31 kB
        Uwe Schindler
      5. LUCENE-1478.patch
        31 kB
        Uwe Schindler
      6. LUCENE-1478-cleanup.patch
        3 kB
        Uwe Schindler
      7. LUCENE-1478-no-superinterface.patch
        13 kB
        Uwe Schindler

          Activity

          Uwe Schindler created issue -
          Uwe Schindler made changes -
          Field Original Value New Value
          Description Wording corrected in one sentence ("bcause all parsers" changed to "because all current parsers"); the text is otherwise identical to the Description above.
          Uwe Schindler made changes -
          Attachment LUCENE-1478-no-superinterface.patch [ 12395332 ]
          Uwe Schindler made changes -
          Lucene Fields [New] [New, Patch Available]
          Michael McCandless made changes -
          Assignee Michael McCandless [ mikemccand ]
          Uwe Schindler made changes -
          Attachment LUCENE-1478.patch [ 12395484 ]
          Uwe Schindler made changes -
          Link This issue incorporates LUCENE-1481 [ LUCENE-1481 ]
          Uwe Schindler made changes -
          Link This issue is blocked by LUCENE-1481 [ LUCENE-1481 ]
          Uwe Schindler made changes -
          Link This issue incorporates LUCENE-1481 [ LUCENE-1481 ]
          Uwe Schindler made changes -
          Attachment LUCENE-1478.patch [ 12395507 ]
          Uwe Schindler made changes -
          Attachment LUCENE-1478.patch [ 12395564 ]
          Michael McCandless made changes -
          Attachment LUCENE-1478.patch [ 12395581 ]
          Michael McCandless made changes -
          Attachment LUCENE-1478.patch [ 12395589 ]
          Michael McCandless made changes -
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Resolved [ 5 ]
          Lucene Fields [Patch Available, New] [New, Patch Available]
          Michael McCandless made changes -
          Lucene Fields [Patch Available, New] [New, Patch Available]
          Fix Version/s 2.9 [ 12312682 ]
          Uwe Schindler made changes -
          Attachment LUCENE-1478-cleanup.patch [ 12395599 ]
          Mark Miller made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Mark Thomas made changes -
          Workflow jira [ 12447774 ] Default workflow, editable Closed status [ 12562397 ]
          Mark Thomas made changes -
          Workflow Default workflow, editable Closed status [ 12562397 ] jira [ 12584754 ]

            People

            • Assignee:
              Michael McCandless
              Reporter:
              Uwe Schindler
            • Votes:
              0
              Watchers:
              1
