Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Recent changes in Lucene have altered how the FieldCache is used and, as-is, could lead to previously working Solr installations blowing up when they upgrade to 1.4. We need to fix, or document the effects of, these changes.

      1. SOLR-1111_sort.patch
        46 kB
        Yonik Seeley
      2. SOLR-1111_sort.patch
        41 kB
        Yonik Seeley
      3. SOLR-1111_sort.patch
        33 kB
        Yonik Seeley
      4. SOLR-1111_sort.patch
        21 kB
        Yonik Seeley
      5. SOLR-1111-distrib.patch
        9 kB
        Yonik Seeley

        Issue Links

          Activity

          Yonik Seeley added a comment - edited

          The major issue is that Lucene now creates scorers per-segment, and if you use Lucene's searcher.search(...,sort) then the FieldCache populations will also be per-segment.

          The biggest issue: If FieldCache gets populated at both the top-level reader and per-segment, memory usage doubles (as does un-inversion time).
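
          A rough illustration of the doubling, assuming a toy cache keyed by reader. The class and method names below are hypothetical and are not Lucene's FieldCache API; the point is only that a top-level entry and the per-segment entries for the same documents are separate arrays:

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a per-reader cache; NOT Lucene's FieldCache implementation. */
class FieldCacheDoubling {
    /** Hypothetical stand-in for an index reader: it only knows its document count. */
    static final class Reader {
        final int maxDoc;
        Reader(int maxDoc) { this.maxDoc = maxDoc; }
    }

    // One int[] of per-document values per reader, keyed by reader identity.
    private final Map<Reader, int[]> cache = new IdentityHashMap<>();

    /** Allocates the value array for this reader on first use, then reuses it. */
    int[] getInts(Reader reader) {
        return cache.computeIfAbsent(reader, r -> new int[r.maxDoc]);
    }

    /** Total number of cached values across all entries. */
    long cachedValues() {
        long total = 0;
        for (int[] values : cache.values()) total += values.length;
        return total;
    }

    public static void main(String[] args) {
        List<Reader> segments = new ArrayList<>();
        segments.add(new Reader(600));
        segments.add(new Reader(400));
        Reader topLevel = new Reader(1000); // covers the same 1000 documents

        FieldCacheDoubling fieldCache = new FieldCacheDoubling();
        fieldCache.getInts(topLevel);                        // e.g. a top-level consumer
        System.out.println(fieldCache.cachedValues());       // 1000
        for (Reader seg : segments) fieldCache.getInts(seg); // e.g. a per-segment consumer
        System.out.println(fieldCache.cachedValues());       // 2000
    }
}
```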

          • Faceting on single-valued fields uses the FieldCache at the top-level (and would be
            • This is non-trivial to change... if we started counting per-segment, counts would somehow have to be merged across segments.
          • Sorting in Solr currently uses the FieldCache at the top level
            • This can't easily be changed to use Lucene's searcher.search(...,sort) since we are using a hit collector (which can be wrapped in a time limited collector).
          • Distributed search uses the top-level FieldCache to retrieve sort field values.
          • FunctionQuery now derives values at the segment level
            • This also applies to the function range query
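
          To make the facet-merging concern concrete, here is a minimal sketch of one possible approach: combining per-segment counts keyed by term value. It assumes each segment's counts are already available as a map; the class and method names are hypothetical, and this is not how Solr's faceting code is structured:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy illustration of merging per-segment facet counts by term value. */
class SegmentFacetMerge {
    /** Adds each segment's term -> count map into one combined, sorted map. */
    static Map<String, Integer> merge(List<Map<String, Integer>> perSegmentCounts) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> counts : perSegmentCounts) {
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                // Sum counts for terms that appear in more than one segment.
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> seg0 = Map.of("apple", 3, "pear", 1);
        Map<String, Integer> seg1 = Map.of("apple", 2, "kiwi", 4);
        System.out.println(merge(List.of(seg0, seg1))); // {apple=5, kiwi=4, pear=1}
    }
}
```

          The merge has to be keyed by the term itself because the same term can occupy a different position in each segment's term list.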

          Another issue for function query is the use of ord()... it won't be valid in multi-segment indexes if evaluated at the segment level.
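
          A small sketch of why segment-level ordinals are not comparable. It assumes a toy definition of ord (the 0-based rank of a value among one reader's distinct sorted terms); the helper is hypothetical and is not the function query implementation:

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Toy illustration of why term ordinals are only meaningful within one reader. */
class SegmentOrdDemo {
    /** Returns the 0-based rank of value among the distinct sorted terms (toy definition). */
    static int ord(String[] terms, String value) {
        String[] sorted = new TreeSet<>(Arrays.asList(terms)).toArray(new String[0]);
        return Arrays.binarySearch(sorted, value);
    }

    public static void main(String[] args) {
        String[] seg0 = {"apple", "zebra"};
        String[] seg1 = {"banana", "cherry", "mango"};
        String[] topLevel = {"apple", "zebra", "banana", "cherry", "mango"};

        // Per segment, "mango" (2) appears to rank above "zebra" (1).
        System.out.println(ord(seg0, "zebra"));     // 1
        System.out.println(ord(seg1, "mango"));     // 2
        // At the top level the order is the other way around.
        System.out.println(ord(topLevel, "mango")); // 3
        System.out.println(ord(topLevel, "zebra")); // 4
    }
}
```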

          Evaluate custom sorters (like query elevation, etc) to ensure that they still work at the segment level. Solr doesn't currently do segment-level sorting like Lucene now does, but perhaps we should switch for more near-real-time support.

          Jayson Minard added a comment -

          Is this Lucene version in the current 1.4 trunk, or is it a version not-yet integrated into Solr libs?

          And also, the description makes it sound like an upgrade issue, but really any 1.4 version could blow up due to this problem.

          Lastly, define "blow up"... Uses double memory, or some other side effect?

          Yonik Seeley added a comment -

          Is this Lucene version in the current 1.4 trunk

          Yes.

          define "blow up"... Uses double memory, or some other side effect?

          Yep - which can cause OOM errors in previously working systems.

          Yonik Seeley added a comment -

          Here's a patch for distributed search to retrieve sort field values from the lowest level index readers.
          I plan on committing shortly.
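
          Reading a value from the lowest-level reader means mapping a top-level document id to a segment and a segment-local id. A minimal sketch, assuming each segment's starting document id is known; the names are hypothetical and this is not the code in the attached patch:

```java
/** Toy mapping from a top-level document id to a segment and a local id. */
class SegmentDocResolver {
    /** Returns the index of the last segment whose starting doc id is <= docId. */
    static int segmentFor(int docId, int[] docBases) {
        int lo = 0, hi = docBases.length - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1; // round up so the loop always makes progress
            if (docBases[mid] <= docId) lo = mid; else hi = mid - 1;
        }
        return lo;
    }

    /** Looks up a sort value in the per-segment array that owns docId. */
    static int sortValue(int docId, int[] docBases, int[][] perSegmentValues) {
        int seg = segmentFor(docId, docBases);
        return perSegmentValues[seg][docId - docBases[seg]];
    }

    public static void main(String[] args) {
        int[] docBases = {0, 3, 5};                    // three segments of sizes 3, 2, 1
        int[][] values = {{10, 20, 30}, {40, 50}, {60}};
        System.out.println(sortValue(4, docBases, values)); // 50
        System.out.println(sortValue(5, docBases, values)); // 60
    }
}
```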

          Yonik Seeley added a comment -

          I just committed the distributed search part of this patch.

          Yonik Seeley added a comment -

          Attaching SOLR-1111_sort.patch to use new Lucene Collector classes, including sorting collectors that will use FieldCache entries at the segment level instead of the top level reader.

          Unfortunately, tests don't currently pass - NPE caused by sort=a_i asc.
          Looks like we'll need to port any custom comparators over to the new FieldComparatorSource (I hadn't thought about this before, but of course it makes sense that the old custom comparators wouldn't work since there isn't a method to compare docs from different segments).
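
          The cross-segment problem can be sketched as follows: a comparator that copies values out of the current segment into "slots" can compare hits from different segments, whereas one that only compares two doc ids against a single reader cannot. This is a simplified model with hypothetical method names, not Lucene's FieldComparator API:

```java
/** Toy slot-based comparator; a simplification, not Lucene's FieldComparator API. */
class SlotComparatorSketch {
    private final int[] slotValues;      // values copied out of whichever segment they came from
    private int[] currentSegmentValues;  // per-document values of the segment being collected

    SlotComparatorSketch(int numSlots) { slotValues = new int[numSlots]; }

    /** Switches to the next segment's values. */
    void setSegment(int[] segmentValues) { currentSegmentValues = segmentValues; }

    /** Copies the value of a segment-local doc into a slot so it outlives the segment. */
    void copy(int slot, int localDoc) { slotValues[slot] = currentSegmentValues[localDoc]; }

    /** Compares a stored slot against a doc in the current segment. */
    int compareSlotToDoc(int slot, int localDoc) {
        return Integer.compare(slotValues[slot], currentSegmentValues[localDoc]);
    }

    /** Finds the smallest value across all segments using a single slot. */
    static int smallest(int[][] segments) {
        SlotComparatorSketch cmp = new SlotComparatorSketch(1);
        boolean filled = false;
        for (int[] seg : segments) {
            cmp.setSegment(seg);
            for (int doc = 0; doc < seg.length; doc++) {
                if (!filled || cmp.compareSlotToDoc(0, doc) > 0) {
                    cmp.copy(0, doc);
                    filled = true;
                }
            }
        }
        if (!filled) throw new IllegalArgumentException("no documents");
        return cmp.slotValues[0];
    }

    public static void main(String[] args) {
        System.out.println(smallest(new int[][] {{7, 3}, {5, 1, 9}})); // 1
    }
}
```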

          Yonik Seeley added a comment -

          Attaching updated patch - multiple test cases still failing.

          • fixed the sort-last comparator sources
          • fixed SolrIndexReader.sortDocSet()

          Yonik Seeley added a comment -

          Latest patch - some tests still fail.

          • fixed/implemented sort-missing-last as a new FieldComparatorSource
          • fixed distributed search for sorting missing last
          • fixed function query when scores are NaN or -infinity... had to map to -max_val
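
          The NaN / -infinity mapping in the last bullet might look like the sketch below. It assumes that "-max_val" means -Float.MAX_VALUE; the helper name is hypothetical and the actual patch may differ. One relevant fact is that NaN does not order consistently under the < and > operators, since both comparisons return false:

```java
/** Toy helper that replaces unsortable scores with the most negative finite float. */
class ScoreSanitizer {
    /** Maps NaN and negative infinity to -Float.MAX_VALUE; other values pass through. */
    static float sanitize(float score) {
        if (Float.isNaN(score) || score == Float.NEGATIVE_INFINITY) {
            return -Float.MAX_VALUE; // assumption: this is the "-max_val" from the comment
        }
        return score;
    }

    public static void main(String[] args) {
        System.out.println(Float.NaN < 1.0f);                                  // false
        System.out.println(Float.NaN > 1.0f);                                  // false
        System.out.println(sanitize(Float.NaN) == -Float.MAX_VALUE);           // true
        System.out.println(sanitize(Float.NEGATIVE_INFINITY) < sanitize(0f));  // true
    }
}
```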

          This won't apply to trunk because it clashes with the reversion of SolrIndexSearcher to use delegation rather than inheritance. I fixed it by changing "super(r)" to "super(wrap(r))"

          Yonik Seeley added a comment -

          TODO reminder: FieldCache.DEFAULT and ExtendedFieldCache.EXT_DEFAULT are different instances... make sure that we are using the same instance everywhere to avoid more memory being used than necessary.

          Yonik Seeley added a comment -

          Attaching updated patch. All tests now pass.

          Yonik Seeley added a comment -

          Committed. Leaving this issue open for now - need to look at RandomSortField, FieldCache.DEFAULT, and perhaps some tests (something to show that FieldCache entries are being shared).

          Yonik Seeley added a comment -

          For FieldCache issues, I've opened LUCENE-1662

          Yonik Seeley added a comment -

          Moving the rest of this to 1.5

          Hoss Man added a comment -

          Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

          http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

          Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. Email notifications were suppressed.

          A unique token for finding these 240 issues in the future: hossversioncleanup20100527

          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Robert Muir added a comment -

          3.4 -> 3.5

          Erick Erickson added a comment -

          Is this relevant any more? I found this by accident while investigating a client question...

          Hoss Man added a comment -

          Issue is marked 3.6 and actively being discussed but has no assignee - assigning to most active committer contributing patches/discussion so far to triage whether this can/should be pushed to 4.0 or not.

          Hoss Man added a comment -

          Bulk fixing the version info for 4.0-ALPHA and 4.0; all affected issues have "hoss20120711-bulk-40-change" in comment.

          Robert Muir added a comment -

          rmuir20120906-bulk-40-change

          Robert Muir added a comment -

          moving all 4.0 issues not touched in a month to 4.1


            People

            • Assignee:
              Yonik Seeley
              Reporter:
              Yonik Seeley
            • Votes:
              1
              Watchers:
              6

              Dates

              • Created:
                Updated:

                Development