Lucene - Core / LUCENE-704

Lucene should have a "write once" mode

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 2.1
    • Fix Version/s: None
    • Component/s: core/index
    • Labels: None
    • Lucene Fields: New

    Description

      This is a spinoff of LUCENE-701

      If your directory resides on a "write once" filesystem (e.g. Hadoop), Lucene needs a mode in which it never writes to the same file more than once and (I think?) never does things like rewind a file to overwrite parts of it.

      Lockless commits (LUCENE-701) get us closer to this goal because they always commit to a new segments_N+1 file (and new files for deletes/separate norms), but they still rewrite the "segments.gen" file. This file is often "optional" (it's only necessary if directory listings can be stale on the platform/filesystem).
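
      To make the "always commit to a new segments_N+1 file" idea concrete, here is a minimal sketch of picking the next commit file name from a directory listing. The class and method names are hypothetical, not Lucene API, and the base-36 encoding of the generation is an assumption for illustration. Note that "segments.gen" never matches the prefix, so a write-once reader could rely on the listing alone when it is not stale.

```java
import java.util.Collection;

// Hypothetical helper, not part of Lucene: derives the next commit file name
// from a directory listing so that no existing file is ever rewritten.
final class SegmentsFileNames {
    static final String PREFIX = "segments_";
    // Assumption for this sketch: the generation suffix is encoded in base 36.
    static final int RADIX = Character.MAX_RADIX;

    // Returns the highest generation found in the listing, or 0 if there is none.
    static long latestGeneration(Collection<String> listing) {
        long max = 0;
        for (String name : listing) {
            if (!name.startsWith(PREFIX)) {
                continue; // skips "segments.gen" and ordinary index files
            }
            try {
                long gen = Long.parseLong(name.substring(PREFIX.length()), RADIX);
                max = Math.max(max, gen);
            } catch (NumberFormatException e) {
                // Suffix is not a generation number; ignore this file.
            }
        }
        return max;
    }

    // Name of the file the next commit would create (never an existing one).
    static String nextSegmentsFileName(Collection<String> listing) {
        return PREFIX + Long.toString(latestGeneration(listing) + 1, RADIX);
    }
}
```

      Each commit then only ever creates a file whose name is not yet in the listing, which is the property a write-once filesystem needs.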

      The only other place I know of is CompoundFileWriter.close(). That method writes 0s into the header, then rewinds and overwrites those 0s with the actual offsets into the compound file. I think (on quick inspection) that pre-computing the offsets and writing everything in one pass should be simple.
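
      A rough sketch of the one-pass idea, under stated assumptions: the header layout here (an int entry count, then a long offset and a name per entry) is simplified and hypothetical, not the actual compound file format, and the class is not Lucene code. Because each offset is a fixed-width long, the header size does not depend on the offset values, so it can be measured in memory first and the real offsets computed before anything is written to the file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;

// Hypothetical one-pass writer: computes all data offsets up front, so the
// output is written strictly sequentially with no seek or rewrite.
final class OnePassCompoundWriter {

    // Simplified header: entry count, then (offset, name) for each entry.
    private static void writeHeader(DataOutputStream out, String[] names, long[] offsets)
            throws IOException {
        out.writeInt(names.length);
        for (int i = 0; i < names.length; i++) {
            out.writeLong(offsets[i]);
            out.writeUTF(names[i]);
        }
    }

    // Measures the header in memory (with zero offsets), then lays the
    // sub-files out back to back immediately after it.
    static long[] computeOffsets(String[] names, long[] lengths) throws IOException {
        ByteArrayOutputStream scratch = new ByteArrayOutputStream();
        writeHeader(new DataOutputStream(scratch), names, new long[names.length]);
        long[] offsets = new long[names.length];
        long position = scratch.size();
        for (int i = 0; i < names.length; i++) {
            offsets[i] = position;
            position += lengths[i];
        }
        return offsets;
    }

    // Writes the header with final offsets, then the data, in a single pass.
    static void write(OutputStream sink, Map<String, byte[]> files) throws IOException {
        String[] names = files.keySet().toArray(new String[0]);
        long[] lengths = new long[names.length];
        for (int i = 0; i < names.length; i++) {
            lengths[i] = files.get(names[i]).length;
        }
        long[] offsets = computeOffsets(names, lengths);
        DataOutputStream out = new DataOutputStream(sink);
        writeHeader(out, names, offsets);
        for (String name : names) {
            out.write(files.get(name));
        }
        out.flush();
    }
}
```

      The sketch takes the sub-files as in-memory byte arrays for brevity; a real version would only need each sub-file's length up front and could stream the contents.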

      Does anyone know of other places that re-use filenames or rewind/seek and rewrite bytes?

      We should create a "setWriteOnceMode()" or something like that.
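
      One way such a mode could be enforced, as a sketch only: the wrapper below is hypothetical (it is not Lucene's Directory API) and uses plain java.nio. It refuses to create a file name that already exists and hands back a sequential OutputStream, so callers cannot seek back and overwrite bytes.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical write-once guard around a directory on the local filesystem.
final class WriteOnceFiles {
    private final Path dir;

    WriteOnceFiles(Path dir) {
        this.dir = dir;
    }

    // CREATE_NEW makes the open fail with FileAlreadyExistsException if the
    // name is already taken, so no file is ever written twice. The returned
    // stream is append-only: it offers no way to rewind.
    OutputStream createOutput(String name) throws IOException {
        return Files.newOutputStream(dir.resolve(name),
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }
}
```

      Running the existing code paths against a guard like this would also be a cheap way to find any remaining places that re-use filenames.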

          People

            Assignee: Michael McCandless (mikemccand)
            Reporter: Michael McCandless (mikemccand)