Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-2814

stop writing shared doc stores across segments

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1, 4.0-ALPHA
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Shared doc stores enables the files for stored fields and term vectors to be shared across multiple segments. We've had this optimization since 2.1 I think.

      It works best against a new index, where you open an IW, add lots of docs, and then close it. In that case all of the written segments will reference slices a single shared doc store segment.

      This was a good optimization because it means we never need to merge these files. But, when you open another IW on that index, it writes a new set of doc stores, and then whenever merges take place across doc stores, they must now be merged.

      However, since we switched to shared doc stores, there have been two optimizations for merging the stores. First, we now bulk-copy the bytes in these files if the field name/number assignment is "congruent". Second, we now force congruent field name/number mapping in IndexWriter. This means this optimization is much less potent than it used to be.

      Furthermore, the optimization adds a lot of hair to IndexWriter/DocumentsWriter; this has been the source of sneaky bugs over time, and causes odd behavior like a merge possibly forcing a flush when it starts. Finally, with DWPT (LUCENE-2324), which gets us truly concurrent flushing, we can no longer share doc stores.

      So, I think we should turn off the write-side of shared doc stores to pave the path for DWPT to land on trunk and simplify IW/DW. We still must support reading them (until 5.0), but the read side is far less hairy.

        Attachments

        1. LUCENE-2814.patch
          76 kB
          Earwin Burrfoot
        2. LUCENE-2814.patch
          75 kB
          Michael McCandless
        3. LUCENE-2814.patch
          99 kB
          Earwin Burrfoot
        4. LUCENE-2814.patch
          119 kB
          Earwin Burrfoot
        5. LUCENE-2814.patch
          119 kB
          Earwin Burrfoot

          Issue Links

            Activity

              People

              • Assignee:
                mikemccand Michael McCandless
                Reporter:
                mikemccand Michael McCandless
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: