Lucene - Core / LUCENE-3560

add extra safety to concrete codec implementations

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      In LUCENE-3490, we reorganized the codec model. A key part of this is that codecs are "safer"
      and don't rely on client-side configuration: IndexReader doesn't take a Codec or anything of that
      nature; only IndexWriter does.

      Instead, for reading, all codecs are instantiated from the classpath via a no-arg constructor,
      using Java's service provider mechanism.

      So, although codecs can still take constructor parameters, be subclassed, etc. (for passing
      to IndexWriter), this enforces that they write any configuration information they need into
      the index, so that we don't have a flimsy API.
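
      As a rough illustration of that rule, here is a minimal sketch that does not use Lucene's API.
      SelfDescribingFormat and its blockSize parameter are hypothetical names. The writer is
      configured through a constructor argument and stores that setting in the data it writes; the
      reader is created with the no-arg constructor and recovers the setting from the data alone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical format, not a Lucene class: its write-time setting travels with the data. */
public class SelfDescribingFormat {
  private final int blockSize; // hypothetical write-time parameter

  /** No-arg constructor: the only one the read side can rely on. */
  public SelfDescribingFormat() {
    this(16);
  }

  /** Parameterized constructor, meant for the writer only. */
  public SelfDescribingFormat(int blockSize) {
    this.blockSize = blockSize;
  }

  /** Writes the configuration first, then the payload. */
  public byte[] write(String payload) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(blockSize);
      out.writeUTF(payload);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Reads the configuration back from the data, ignoring this instance's own blockSize. */
  public String read(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int storedBlockSize = in.readInt();
      String payload = in.readUTF();
      return "blockSize=" + storedBlockSize + ", payload=" + payload;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] data = new SelfDescribingFormat(64).write("hello");
    // The reader was built with the default constructor, yet still sees blockSize=64.
    System.out.println(new SelfDescribingFormat().read(data));
  }
}
```

      The reader never needs to be told what the writer was configured with, which is the property
      the no-arg constructor requirement is meant to enforce.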

      I think we should go even further, for additional safety. Any method on our concrete codecs that
      is not intended to be overridden should be final, and we should add assertions to verify this.
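
      One possible shape for such a check is sketched below using plain reflection. This is only an
      assumption about how the assertion could be written, not an existing Lucene test;
      FinalMethodCheck and ExampleCodec are hypothetical names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class FinalMethodCheck {

  /** Hypothetical stand-in for a concrete codec; not a Lucene class. */
  public static class ExampleCodec {
    public final String postingsFormat() {
      return "per-field";
    }

    public String getPostingsFormatForField(String field) {
      return "default";
    }
  }

  /** Returns true if the named public method is declared final or the class itself is final. */
  public static boolean isFinal(Class<?> clazz, String name, Class<?>... parameterTypes) {
    try {
      Method method = clazz.getMethod(name, parameterTypes);
      return Modifier.isFinal(method.getModifiers()) || Modifier.isFinal(clazz.getModifiers());
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException("no public method " + name + " on " + clazz.getName(), e);
    }
  }

  public static void main(String[] args) {
    // The accessor is locked down; the hook stays overridable.
    System.out.println(isFinal(ExampleCodec.class, "postingsFormat"));
    System.out.println(isFinal(ExampleCodec.class, "getPostingsFormatForField", String.class));
  }
}
```

      A test could call isFinal for each method that must not be overridden and fail when it
      returns false.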

      For example, SimpleText's files() implementation should be final. If you want to make an extension
      of SimpleText that has additional files, then that is a different index format and should have a
      different name!

      Note: This doesn't stop extensibility, only stupid mistakes.
      For example, this means that Lucene40Codec's postingsFormat() implementation is final, even though
      it offers a configurable "hook" (getPostingsFormatForField) for you to specify per-field postings
      formats (which it writes to a .per file in the index, so that it knows how to read each field).

      private final PostingsFormat postingsFormat = new PerFieldPostingsFormat() {
        @Override
        public PostingsFormat getPostingsFormatForField(String field) {
          return Lucene40Codec.this.getPostingsFormatForField(field);
        }
      };
      
      ...
      
      @Override
      public final PostingsFormat postingsFormat() {
        return postingsFormat;
      }
      
      ...
      
      /** Returns the postings format that should be used for writing
       *  new segments of <code>field</code>.
       *
       *  The default implementation always returns "Lucene40".
       */
      public PostingsFormat getPostingsFormatForField(String field) {
        return defaultFormat;
      }
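
      To show how the hook stays usable while the accessor is final, here is a simplified,
      self-contained model of the same pattern. It is not Lucene code: BaseCodec, IdTunedCodec,
      postingsFormatFor and the "MyIdFormat" name are hypothetical, and format names are plain
      strings rather than PostingsFormat instances.

```java
public class HookDemo {

  /** Simplified stand-in for a codec: the accessor is final, the hook is overridable. */
  public static class BaseCodec {
    /** Final entry point; subclasses cannot replace the per-field dispatch itself. */
    public final String postingsFormatFor(String field) {
      return getPostingsFormatForField(field);
    }

    /** Overridable hook; the default always returns "Lucene40". */
    public String getPostingsFormatForField(String field) {
      return "Lucene40";
    }
  }

  /** Subclass that customizes one field through the hook only. */
  public static class IdTunedCodec extends BaseCodec {
    @Override
    public String getPostingsFormatForField(String field) {
      return "id".equals(field) ? "MyIdFormat" : super.getPostingsFormatForField(field);
    }
  }

  public static void main(String[] args) {
    BaseCodec codec = new IdTunedCodec();
    System.out.println(codec.postingsFormatFor("id"));   // prints "MyIdFormat"
    System.out.println(codec.postingsFormatFor("body")); // prints "Lucene40"
  }
}
```

      Overriding postingsFormatFor in IdTunedCodec would be a compile error, so a subclass can
      choose formats per field but cannot bypass the dispatch.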
      

            People

            • Assignee: Unassigned
            • Reporter: rcmuir Robert Muir