Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Currently FieldCache is not pluggable at all. It would be better if everything used the DocValues APIs.

      This would allow people to customize the implementation, extend the classes with custom subclasses that add functionality, and so on.

      FieldCache can still be accessed via the DocValues APIs, using the FilterReader API.
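
As a rough illustration of what "uninverting" means in this context: an inverted index maps each term to the documents that contain it, while a doc-values-style view maps each document to its value. The sketch below is a toy model in plain Java with hypothetical names (`UninvertSketch`, `uninvert`); it is not Lucene's FieldCache or UninvertingReader code, and it assumes at most one term per document.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: these names are hypothetical and are not Lucene classes.
final class UninvertSketch {

    // Input ("inverted" view): term -> ids of the documents that contain it.
    // Output ("uninverted" view): docId -> ordinal of its term in sorted term order,
    // or -1 if the document has no value.
    static int[] uninvert(Map<String, List<Integer>> postings, int maxDoc) {
        int[] ordByDoc = new int[maxDoc];
        Arrays.fill(ordByDoc, -1);
        // TreeMap iterates terms in sorted order, so ordinals compare like the terms do.
        Map<String, List<Integer>> sorted = new TreeMap<>(postings);
        int ord = 0;
        for (List<Integer> docs : sorted.values()) {
            for (int doc : docs) {
                ordByDoc[doc] = ord;
            }
            ord++;
        }
        return ordByDoc;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> postings = Map.of(
            "apple", List.of(2),
            "banana", List.of(0, 3));
        // prints [1, -1, 0, 1, -1]
        System.out.println(Arrays.toString(uninvert(postings, 5)));
    }
}
```

Sorting or grouping can then read one array slot per document instead of walking the term dictionary, which is the kind of access the DocValues APIs expose.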

      1. LUCENE-5666.patch
        1.29 MB
        Robert Muir

          Activity

          ASF subversion and git services added a comment -

          Commit 1593790 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593790 ]

          LUCENE-5666: current state to branch

          ASF subversion and git services added a comment -

          Commit 1593792 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593792 ]

          LUCENE-5666: fix testBasic to use dv always

          ASF subversion and git services added a comment -

          Commit 1593797 from Michael McCandless in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593797 ]

          LUCENE-5666: fix test bug

          ASF subversion and git services added a comment -

          Commit 1593802 from Michael McCandless in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593802 ]

          LUCENE-5666: fix testBasic

          ASF subversion and git services added a comment -

          Commit 1593806 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593806 ]

          LUCENE-5666: add docvalues for JDK collator, too

          ASF subversion and git services added a comment -

          Commit 1593807 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593807 ]

          LUCENE-5666: fix facet example

          ASF subversion and git services added a comment -

          Commit 1593808 from Michael McCandless in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593808 ]

          LUCENE-5666: fix AllGroupHeadsCollectorTest

          ASF subversion and git services added a comment -

          Commit 1593816 from Michael McCandless in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593816 ]

          LUCENE-5666: fix grouping tests

          ASF subversion and git services added a comment -

          Commit 1593819 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593819 ]

          LUCENE-5666: clean up test and fix testSimple

          ASF subversion and git services added a comment -

          Commit 1593822 from Michael McCandless in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593822 ]

          LUCENE-5666: more grouping tests

          ASF subversion and git services added a comment -

          Commit 1593825 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593825 ]

          LUCENE-5666: fix testRandom

          ASF subversion and git services added a comment -

          Commit 1593850 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593850 ]

          LUCENE-5666: move out more uninverting

          ASF subversion and git services added a comment -

          Commit 1593856 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1593856 ]

          LUCENE-5666: get solr compiling

          Adrien Grand added a comment -

          +1

          ASF subversion and git services added a comment -

          Commit 1594069 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594069 ]

          LUCENE-5666: javadocs, support SORTED_SET for multi-valued numerics

          ASF subversion and git services added a comment -

          Commit 1594095 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594095 ]

          LUCENE-5666: actually make a single-valued fc if the field is not multi-valued

          ASF subversion and git services added a comment -

          Commit 1594204 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594204 ]

          LUCENE-5666: move sortedset sortfield out of sandbox

          ASF subversion and git services added a comment -

          Commit 1594211 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594211 ]

          LUCENE-5666: add SortedSetFieldSource

          ASF subversion and git services added a comment -

          Commit 1594254 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594254 ]

          LUCENE-5666: get solr started

          ASF subversion and git services added a comment -

          Commit 1594258 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594258 ]

          LUCENE-5666: fix test to assert from raw reader

          ASF subversion and git services added a comment -

          Commit 1594311 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594311 ]

          LUCENE-5666: fix some tests

          ASF subversion and git services added a comment -

          Commit 1594316 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594316 ]

          LUCENE-5666: fix bug (null is no longer allowed)

          ASF subversion and git services added a comment -

          Commit 1594327 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594327 ]

          LUCENE-5666: fix test and bug

          ASF subversion and git services added a comment -

          Commit 1594417 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594417 ]

          LUCENE-5666: fix test failures

          ASF subversion and git services added a comment -

          Commit 1594418 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594418 ]

          LUCENE-5666: fix rewrite bug

          ASF subversion and git services added a comment -

          Commit 1594441 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594441 ]

          LUCENE-5666: fix StatsComponent insanity

          ASF subversion and git services added a comment -

          Commit 1594445 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594445 ]

          LUCENE-5666: still return missing count etc when there are no terms

          ASF subversion and git services added a comment -

          Commit 1594452 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594452 ]

          LUCENE-5666: remove insanity during distributed grouping

          ASF subversion and git services added a comment -

          Commit 1594492 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594492 ]

          LUCENE-5666: support the 2 crazy instances of insanity that are too hard for me to fix

          ASF subversion and git services added a comment -

          Commit 1594505 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594505 ]

          LUCENE-5666: clear nocommits and fix precommit

          ASF subversion and git services added a comment -

          Commit 1594507 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594507 ]

          LUCENE-5666: merge trunk

          Robert Muir added a comment -

          Patch (from diff-sources.py) showing the differences between trunk and branch.

          Unfortunately I could not remove all FieldCache insanity in Solr (I really tried), so there are two narrow cases where it's explicitly "enabled":

          • ord/rord on single-valued numeric fields
          • grouping with faceting (group.facet) on single-valued numeric fields.

          Otherwise no more insanity and things are a lot more flexible.
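
For readers unfamiliar with the term, "insanity" is taken here to mean the same field ending up cached under more than one incompatible value type; that reading is an assumption, not a definition given in this thread. The toy sketch below, with hypothetical names and no Lucene dependency, shows one way to rule it out: each field is declared with a single type, compatible requests reuse that entry, and anything else fails loudly.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical names, not Lucene's implementation.
final class SingleTypeCacheSketch {

    enum Type { SORTED, BINARY, SORTED_SET, NUMERIC }

    // One declared type per field.
    private final Map<String, Type> declared = new HashMap<>();

    void declare(String field, Type type) {
        declared.put(field, type);
    }

    // True if a field uninverted as `have` can also serve `want` from the same cache entry.
    static boolean compatible(Type have, Type want) {
        if (have == want) {
            return true;
        }
        // Assumed rule for this sketch: SORTED can also serve BINARY and SORTED_SET access.
        return have == Type.SORTED && (want == Type.BINARY || want == Type.SORTED_SET);
    }

    // Returns the single type whose cache entry serves the request, or fails loudly
    // instead of silently building a second, differently typed entry for the field.
    Type resolve(String field, Type want) {
        Type have = declared.get(field);
        if (have == null || !compatible(have, want)) {
            throw new IllegalStateException(
                "field '" + field + "' declared as " + have + ", requested as " + want);
        }
        return have;
    }

    public static void main(String[] args) {
        SingleTypeCacheSketch cache = new SingleTypeCacheSketch();
        cache.declare("title", Type.SORTED);
        System.out.println(cache.resolve("title", Type.BINARY)); // prints SORTED
    }
}
```

In this model, the two cases listed above would correspond to deliberately allowing a second entry for a numeric field.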

          ASF subversion and git services added a comment -

          Commit 1594612 from Michael McCandless in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594612 ]

          LUCENE-5666: fix test

          ASF subversion and git services added a comment -

          Commit 1594615 from Michael McCandless in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1594615 ]

          LUCENE-5666: fix javadocs

          ASF subversion and git services added a comment -

          Commit 1595228 from Robert Muir in branch 'dev/branches/lucene5666'
          [ https://svn.apache.org/r1595228 ]

          LUCENE-5666: merge trunk

          ASF subversion and git services added a comment -

          Commit 1595259 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1595259 ]

          LUCENE-5666: Add UninvertingReader

          David Smiley added a comment -

          Wow, this was clearly a lot of work and a welcome enhancement for sure! Can you please add some more comments on how this impacted the APIs, at least the more public/visible parts we're likely to encounter? There is a lot to review manually. I tried to use Fisheye, but it's unusable at the moment; possibly it can't handle a codebase this size.

          Robert Muir added a comment -

          Is the CHANGES.txt entry not good here? The docvalues apis did not change...

          David Smiley added a comment -

          Oh, right. I'll repost it here for everyone's benefit:

          * LUCENE-5666: Change uninverted access (sorting, faceting, grouping, etc)
            to use the DocValues API instead of FieldCache. For FieldCache functionality,
            use UninvertingReader in lucene/misc (or implement your own FilterReader).
            UninvertingReader is more efficient: supports multi-valued numeric fields,
            detects when a multi-valued field is single-valued, reuses caches
            of compatible types (e.g. SORTED also supports BINARY and SORTED_SET access
            without insanity).  "Insanity" is no longer possible unless you explicitly want it. 
            Rename FieldCache* and DocTermOrds* classes in the search package to DocValues*. 
            Move SortedSetSortField to core and add SortedSetFieldSource to queries/, which
            takes the same selectors. Add helper methods to DocValues.java that are better 
            suited for search code (never return null, etc).  (Mike McCandless, Robert Muir)
          

          I looked up DocValues, which is new to me, but the commit message references LUCENE-5573, which seems mis-attributed. I'm kind of surprised FieldCache isn't deprecated. It could be marked @lucene.internal. At the least, its name doesn't seem appropriate anymore; maybe UninvertedCache. But perhaps a rename like that would introduce too much change for now, even though it's trunk. It could use some javadocs stating that DocValues.java should generally be used instead.

          Robert Muir added a comment -

          I think you missed the point. It does not have any javadocs: it's package-private.

          David Smiley added a comment -

          Ok, I see that now; it's good.

          Adrien Grand added a comment -

          +1 I like the change a lot!

          ASF subversion and git services added a comment -

          Commit 1596429 from Anshum Gupta in branch 'dev/trunk'
          [ https://svn.apache.org/r1596429 ]

          LUCENE-5666: Fix idea project files

          Mikhail Khludnev added a comment -

          Robert Muir, I'm porting SOLR-6234 to trunk and have run into an annoying issue. Before, FieldCache was always available, but now, if doc values are not written (indexed), DocValues.get... yields emptyXxx, which breaks tests silently. I'd prefer to get an NPE or some other Illegal..Exception explicitly.
          What am I missing?
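
The trade-off raised here, an empty result versus an explicit failure when a field has no per-document values, can be sketched without Lucene. Everything below is hypothetical (`StrictLookupSketch`, `Column`, `getLenient`, `getStrict`); it only models the two behaviors and is not how Lucene's DocValues helpers are implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical names, not Lucene's DocValues helpers.
final class StrictLookupSketch {

    // Stand-in for a per-document column of values for one field.
    interface Column {
        long get(int docId);
    }

    // Shared "empty" column: every document reads as 0.
    static final Column EMPTY = docId -> 0L;

    private final Map<String, Column> columns = new HashMap<>();

    void put(String field, Column column) {
        columns.put(field, column);
    }

    // Lenient lookup: never returns null, so a missing field looks like all zeros.
    Column getLenient(String field) {
        return columns.getOrDefault(field, EMPTY);
    }

    // Strict lookup: a missing field is reported instead of silently reading as empty.
    Column getStrict(String field) {
        Column column = columns.get(field);
        if (column == null) {
            throw new IllegalStateException("no per-document values for field '" + field + "'");
        }
        return column;
    }

    public static void main(String[] args) {
        StrictLookupSketch lookup = new StrictLookupSketch();
        lookup.put("price", docId -> docId * 10L);
        System.out.println(lookup.getLenient("missing").get(3)); // prints 0
        System.out.println(lookup.getStrict("price").get(3));    // prints 30
    }
}
```

A test that wants a misconfigured field to fail fast could go through a strict wrapper of this kind, while search code keeps the null-free lenient path.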

          Anshum Gupta added a comment -

          Bulk close after 5.0 release.

          Mikhail Khludnev added a comment -

          Hello,
          I want to clarify something. DeleteByQueryWrapper refers to some caching:
          "Even though we wrap for each query, UninvertingReader's core cache key is the inner one, so it still reuses fieldcaches and so on."
          However, we see evidence that indexing gets stuck repeatedly building the uninversion map for every deleteByQuery:

            
          AtomicReader wrap(AtomicReader reader) {
            return new UninvertingReader(reader, schema.getUninversionMap(reader));
          }
          

          Is there any chance to make it faster?
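
One generic way to avoid rebuilding an expensive map on every call is to memoize it. The sketch below is a toy in plain Java with hypothetical names (`CachedMapSketch`, `buildCount`); it is not Solr or Lucene code, and it is only valid under the assumption that the map does not depend on the particular reader being wrapped, which the snippet above does not guarantee.

```java
import java.util.Map;
import java.util.function.Supplier;

// Toy model only: hypothetical names, not Solr's schema or Lucene's UninvertingReader API.
final class CachedMapSketch {

    private final Supplier<Map<String, String>> builder;
    private Map<String, String> cached; // built at most once per instance
    private int builds = 0;             // tracked only to show how often the builder runs

    CachedMapSketch(Supplier<Map<String, String>> builder) {
        this.builder = builder;
    }

    // Returns the same immutable map on every call; the builder runs only the first time.
    synchronized Map<String, String> get() {
        if (cached == null) {
            cached = Map.copyOf(builder.get());
            builds++;
        }
        return cached;
    }

    synchronized int buildCount() {
        return builds;
    }

    public static void main(String[] args) {
        CachedMapSketch cache = new CachedMapSketch(() -> Map.of("id", "SORTED"));
        cache.get();
        cache.get();
        System.out.println(cache.buildCount()); // prints 1
    }
}
```

If the map can change, for example when the schema is reloaded, the cached copy would have to be dropped at that point; this sketch does not model invalidation.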

          Mikhail Khludnev added a comment -

          Follow-up: LUCENE-7049


            People

            • Assignee:
              Unassigned
              Reporter:
              Robert Muir
            • Votes:
              1
              Watchers:
              7
