  1. Lucene - Core
  2. LUCENE-1374

Merging of compressed string Fields may hit NPE

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4
    • Fix Version/s: 2.4
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This bug was introduced with LUCENE-1219 (only present on 2.4).

      The bug happens when merging compressed string fields, but only when the bulk-merging code path cannot be used because the FieldInfos of the segments being merged are not congruent. This test shows the bug:

        public void testMergeCompressedFields() throws IOException {
          File indexDir = new File(System.getProperty("tempDir"), "mergecompressedfields");
          Directory dir = FSDirectory.getDirectory(indexDir);
          try {
            for(int i=0;i<5;i++) {
              // Must make a new writer & doc each time, w/
              // different fields, so bulk merge of stored fields
              // cannot run:
              IndexWriter w = new IndexWriter(dir, new WhitespaceAnalyzer(), i==0, IndexWriter.MaxFieldLength.UNLIMITED);
              w.setMergeFactor(5);
              w.setMergeScheduler(new SerialMergeScheduler());
              Document doc = new Document();
              doc.add(new Field("test1", "this is some data that will be compressed this this this", Field.Store.COMPRESS, Field.Index.NO));
              doc.add(new Field("test2", new byte[20], Field.Store.COMPRESS));
              doc.add(new Field("field" + i, "random field", Field.Store.NO, Field.Index.TOKENIZED));
              w.addDocument(doc);
              w.close();
            }
      
            byte[] cmp = new byte[20];
      
            IndexReader r = IndexReader.open(dir);
            try {
              for(int i=0;i<5;i++) {
                Document doc = r.document(i);
                assertEquals("this is some data that will be compressed this this this", doc.getField("test1").stringValue());
                byte[] b = doc.getField("test2").binaryValue();
                assertTrue(Arrays.equals(b, cmp));
              }
            } finally {
              // Close the reader so its files are released before the directory is removed
              r.close();
            }
          } finally {
            dir.close();
            _TestUtil.rmDir(indexDir);
          }
        }
      

      It's because in FieldsReader, when we load a field "for merge" we create a FieldForMerge instance which subsequently does not return the right values for getBinary{Value,Length,Offset}.
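
      The failure mode described above can be illustrated with a small standalone model. This is not Lucene's code and it does not reproduce the actual patch: BaseField, BrokenMergeField, ConsistentMergeField and copyForMerge are hypothetical names invented for this sketch. It only assumes the general shape of the problem, namely a subclass that holds the raw bytes while the inherited binary accessors keep reading base-class state that was never populated.

```java
import java.nio.charset.StandardCharsets;

public class FieldForMergeSketch {

    /** Hypothetical stand-in for a stored field; accessors read base-class state. */
    static class BaseField {
        protected byte[] binaryValue;   // payload bytes, or null if never set
        protected int binaryOffset;
        protected int binaryLength;

        byte[] getBinaryValue() { return binaryValue; }
        int getBinaryOffset() { return binaryOffset; }
        int getBinaryLength() { return binaryLength; }
    }

    /** Buggy variant: keeps the bytes in its own member, so the inherited accessors see nothing. */
    static class BrokenMergeField extends BaseField {
        private final byte[] raw;
        BrokenMergeField(byte[] raw) { this.raw = raw; }
    }

    /** Consistent variant: populates the state that the accessors actually read. */
    static class ConsistentMergeField extends BaseField {
        ConsistentMergeField(byte[] raw) {
            this.binaryValue = raw;
            this.binaryOffset = 0;
            this.binaryLength = raw.length;
        }
    }

    /** Mimics a writer that copies the stored bytes of a field during a non-bulk merge. */
    static byte[] copyForMerge(BaseField f) {
        byte[] src = f.getBinaryValue();            // null for BrokenMergeField
        byte[] out = new byte[f.getBinaryLength()];
        // System.arraycopy throws NullPointerException when src is null
        System.arraycopy(src, f.getBinaryOffset(), out, 0, out.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "compressed bytes".getBytes(StandardCharsets.UTF_8);

        byte[] copied = copyForMerge(new ConsistentMergeField(data));
        System.out.println("consistent field copied " + copied.length + " bytes");

        try {
            copyForMerge(new BrokenMergeField(data));
        } catch (NullPointerException e) {
            System.out.println("broken field hit NPE during merge copy");
        }
    }
}
```

      In this model the fix is simply to make the merge-time field report its bytes, offset and length consistently, which matches the description above only in spirit; the attached patch remains the authoritative change.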

        Attachments

        1. LUCENE-1374.patch
          3 kB
          Michael McCandless

            People

            • Assignee:
              mikemccand Michael McCandless
              Reporter:
              mikemccand Michael McCandless
            • Votes:
              0
              Watchers:
              2