Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-1374

Merging of compressed string Fields may hit NPE

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.4
    • 2.4
    • core/index
    • None
    • New

    Description

      This bug was introduced with LUCENE-1219 (only present on 2.4).

      The bug happens when merging compressed string fields, but only if bulk-merging code does not apply because the FieldInfos for the segment being merged are not congruent. This test shows the bug:

        public void testMergeCompressedFields() throws IOException {
          File indexDir = new File(System.getProperty("tempDir"), "mergecompressedfields");
          Directory dir = FSDirectory.getDirectory(indexDir);
          try {
            for(int i=0;i<5;i++) {
              // Must make a new writer & doc each time, w/
              // different fields, so bulk merge of stored fields
              // cannot run:
              IndexWriter w = new IndexWriter(dir, new WhitespaceAnalyzer(), i==0, IndexWriter.MaxFieldLength.UNLIMITED);
              w.setMergeFactor(5);
              w.setMergeScheduler(new SerialMergeScheduler());
              Document doc = new Document();
              doc.add(new Field("test1", "this is some data that will be compressed this this this", Field.Store.COMPRESS, Field.Index.NO));
              doc.add(new Field("test2", new byte[20], Field.Store.COMPRESS));
              doc.add(new Field("field" + i, "random field", Field.Store.NO, Field.Index.TOKENIZED));
              w.addDocument(doc);
              w.close();
            }
      
            byte[] cmp = new byte[20];
      
            IndexReader r = IndexReader.open(dir);
            for(int i=0;i<5;i++) {
              Document doc = r.document(i);
              assertEquals("this is some data that will be compressed this this this", doc.getField("test1").stringValue());
              byte[] b = doc.getField("test2").binaryValue();
              assertTrue(Arrays.equals(b, cmp));
            }
          } finally {
            dir.close();
            _TestUtil.rmDir(indexDir);
          }
        }
      

      It's because in FieldsReader, when we load a field "for merge" we create a FieldForMerge instance which subsequently does not return the right values for getBinary

      {Value,Length,Offset}

      .

      Attachments

        1. LUCENE-1374.patch
          3 kB
          Michael McCandless

        Activity

          People

            mikemccand Michael McCandless
            mikemccand Michael McCandless
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: