  1. Lucene - Core
  2. LUCENE-5398

NormValueSource unable to read long field norm

Details

    • Type: Bug
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 4.6
    • Fix Version/s: 4.7, 6.0
    • Component/s: core/query/scoring
    • Labels: None
    • Environment: Ubuntu 12.04

    • New

    Description

      Earlier Lucene implementations stored the field norms of all documents in memory, so float values were encoded into a single byte to minimize memory consumption.
      Recent releases no longer have this constraint (see LUCENE-5078 and the discussion at http://lucene.markmail.org/message/jtwit3pwu5oiqr2h). Users are encouraged to implement their own encodeNormValue() to encode norms into, and decode them from, any type, including int, byte and long, to meet their precision requirements.
      However, the legacy NormValueSource still casts any long-encoded norm to byte, as seen at line 74 of the attached NormValueSource.java, which makes any TFIDFSimilarity that uses a more precise encoding useless.
      The cast should be removed.
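
      The effect of such a cast can be illustrated with a minimal, standalone sketch. It does not use Lucene classes: NormCastDemo and its two methods are hypothetical stand-ins for a similarity that keeps all 32 bits of the float norm in a long, and the (byte) cast in main() only mimics the narrowing described above.

```java
public class NormCastDemo {

    // Hypothetical high-precision encoding: keep every bit of the float in a long.
    public static long encodeNormValue(float norm) {
        return Float.floatToIntBits(norm);
    }

    // Matching decoder: reinterpret the low 32 bits as a float.
    public static float decodeNormValue(long encoded) {
        return Float.intBitsToFloat((int) encoded);
    }

    public static void main(String[] args) {
        float norm = 0.3f;
        long stored = encodeNormValue(norm);

        // Decoding the full long round-trips the original value.
        float direct = decodeNormValue(stored);

        // Narrowing to byte first keeps only the low 8 bits of the encoding,
        // so the decoder receives a different number altogether.
        float viaByte = decodeNormValue((byte) stored);

        System.out.println("stored long   = " + stored);
        System.out.println("decoded long  = " + direct);
        System.out.println("decoded byte  = " + viaByte);
    }
}
```

      With this particular encoding the value decoded after the byte cast is not merely less precise; it bears no relation to the original norm.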

      Attachments

        1. NormValueSource.java
          3 kB
          Peng Cheng
        2. TestValueSourcesWithNonByteNormEncoding.java
          9 kB
          Peng Cheng
        3. LUCENE-5398.patch
          10 kB
          Michael McCandless

        Activity

          People

            Assignee: Unassigned
            Reporter: Peng Cheng
            Votes:
            0
            Watchers:
            3

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                1h
                Remaining:
                1h
                Logged:
                Not Specified