Lucene - Core / LUCENE-2863

Updating a document loses its fields that are only indexed; NumericField tries are also completely lost

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Not a Problem
    • Affects Version/s: 3.0.2, 3.0.3
    • Fix Version/s: None
    • Component/s: core/store
    • Labels:
      None
    • Environment:

      Windows XP, Java 1.6.20, using a RAMDirectory

    • Lucene Fields:
      New

      Description

      I have a code snippet (see below) that creates a new document with standard (stored and indexed) fields, indexed-only (not stored) fields, and some NumericFields. It then updates the document by adding a new string field. As a result, every field that is indexed but not stored is lost from the index after the update (or after a delete followed by an add); in particular, the trie tokens of the NumericFields are completely lost.

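      // Imports assumed for Lucene 3.0.x; LOG is the reporter's own logger.
      import org.apache.lucene.analysis.WhitespaceAnalyzer;
      import org.apache.lucene.document.Document;
      import org.apache.lucene.document.Field;
      import org.apache.lucene.document.Field.Index;
      import org.apache.lucene.document.Field.Store;
      import org.apache.lucene.document.NumericField;
      import org.apache.lucene.index.IndexWriter;
      import org.apache.lucene.index.IndexWriter.MaxFieldLength;
      import org.apache.lucene.index.Term;
      import org.apache.lucene.search.IndexSearcher;
      import org.apache.lucene.search.Query;
      import org.apache.lucene.search.TermQuery;
      import org.apache.lucene.search.TopDocs;
      import org.apache.lucene.store.Directory;
      import org.apache.lucene.store.RAMDirectory;

      Directory ramDir = new RAMDirectory();
      IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), MaxFieldLength.UNLIMITED);

      // First document: ID is stored and indexed, PATTERN is indexed only.
      Document doc = new Document();
      doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
      doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
      doc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(51.488266037777066d));
      doc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(-0.08913399651646614d));
      writer.addDocument(doc);

      // Second document with the same field layout.
      doc = new Document();
      doc.add(new Field("ID", "HO2222", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
      doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
      doc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(101.488266037777066d));
      doc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(-100.08913399651646614d));
      writer.addDocument(doc);

      // Look up the first document by ID, add a field to the retrieved copy, and write it back.
      Term t = new Term("ID", "HO1234");
      Query q = new TermQuery(t);
      IndexSearcher searcher = new IndexSearcher(writer.getReader());
      TopDocs hits = searcher.search(q, 1);
      if (hits.scoreDocs.length > 0) {
            // The retrieved document contains stored fields only.
            Document ndoc = searcher.doc(hits.scoreDocs[0].doc);
            ndoc.add(new Field("FINAL", "FINAL", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
            writer.updateDocument(t, ndoc);
      // Alternative with the same result:
      //      writer.deleteDocuments(q);
      //      writer.addDocument(ndoc);
      } else {
            LOG.info("Couldn't find the document via the query");
      }

      // Search on the indexed-only field after the update.
      searcher = new IndexSearcher(writer.getReader());
      hits = searcher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
      LOG.info("_____hits HELLO:" + hits.totalHits); // should be 1 but it's 0

      writer.close();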
      

      I also have a bounding-box query based on NumericRangeQuery. After the document update it no longer returns any hits.

        Activity

        Tamas Sandor created issue -
        Tamas Sandor made changes -
        Priority: Major [ 3 ] -> Blocker [ 1 ]
        Earwin Burrfoot added a comment -

        updateDocument() is an atomic version of deleteDocuments() + addDocument(), nothing more.

        It is not surprising that you lose your fields if you delete the doc and don't add them back afterwards.
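
The delete-then-add behaviour described above can be illustrated with a small toy model. This is a sketch only and not Lucene code: `ToyIndex`, `ToyField`, and their methods are hypothetical names invented for the illustration. The model assumes that searchable terms and stored values live in two separate structures, so a document read back from the index carries only its stored values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model (not Lucene code): searchable terms and stored values live in separate structures. */
class ToyIndex {
    /** One field; the flags play the role of Store.YES/NO and an indexed/not-indexed switch. */
    static final class ToyField {
        final String name;
        final String value;
        final boolean store;
        final boolean index;

        ToyField(String name, String value, boolean store, boolean index) {
            this.name = name;
            this.value = value;
            this.store = store;
            this.index = index;
        }
    }

    // "field:value" -> ids of the documents containing that term; this is all a search sees.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // doc id -> stored values; this is all doc(id) can return.
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();
    private int nextId = 0;

    int add(List<ToyField> fields) {
        int id = nextId++;
        Map<String, String> kept = new LinkedHashMap<>();
        for (ToyField f : fields) {
            if (f.index) {
                postings.computeIfAbsent(f.name + ":" + f.value, k -> new TreeSet<>()).add(id);
            }
            if (f.store) {
                kept.put(f.name, f.value);
            }
        }
        stored.put(id, kept);
        return id;
    }

    void delete(String name, String value) {
        // Copy first: the matching set is itself one of the sets modified below.
        Set<Integer> doomed = new HashSet<>(search(name, value));
        for (Set<Integer> ids : postings.values()) {
            ids.removeAll(doomed);
        }
        stored.keySet().removeAll(doomed);
    }

    /** Delete followed by add: nothing from the old document is carried over. */
    int update(String name, String value, List<ToyField> fields) {
        delete(name, value);
        return add(fields);
    }

    Set<Integer> search(String name, String value) {
        return postings.getOrDefault(name + ":" + value, Collections.emptySet());
    }

    /** Returns a copy of the stored values only; indexed-only fields cannot be recovered here. */
    Map<String, String> doc(int id) {
        Map<String, String> values = stored.get(id);
        return values == null ? null : new LinkedHashMap<>(values);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        List<ToyField> original = new ArrayList<>();
        original.add(new ToyField("ID", "HO1234", true, true));
        original.add(new ToyField("PATTERN", "HELLO", false, true)); // indexed only
        int id = idx.add(original);

        // Rebuild the document from what doc(id) returns, as the reported snippet does.
        List<ToyField> rebuilt = new ArrayList<>();
        for (Map.Entry<String, String> e : idx.doc(id).entrySet()) {
            rebuilt.add(new ToyField(e.getKey(), e.getValue(), true, true));
        }
        rebuilt.add(new ToyField("FINAL", "FINAL", true, true));
        idx.update("ID", "HO1234", rebuilt);

        System.out.println("hits HELLO: " + idx.search("PATTERN", "HELLO").size());
        System.out.println("hits HO1234: " + idx.search("ID", "HO1234").size());
    }
}
```

Running `main` prints `hits HELLO: 0` and `hits HO1234: 1`: the rebuilt document never contained PATTERN, so the update had nothing to re-index for it.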

        Tamas Sandor added a comment -

        Yes, but how can I add the indexed fields back (the tries of LAT and LNG, and the PATTERN field)?
        document.getFields() would give my old fields back in the form of a List<Fieldable>, but the Javadoc comment says:

        Note that fields which are not stored are not available in documents retrieved from the index, e.g. Searcher.doc(int) or IndexReader.document(int).

        So this won't work either:

        doc = searcher.doc(hits.scoreDocs[0].doc);
        Document ndoc = new Document();
        for (Fieldable field : doc.getFields()) {
            ndoc.add(field);
        }
        ndoc.add(new Field("FINAL", "FINAL", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
        writer.updateDocument(t, ndoc);
        
        Shai Erera added a comment -

        If you want to update documents, you should store them in their entirety somewhere: in a Lucene index (with all fields stored), in a DB, or someplace else. This is how updateDocument currently works.
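
One way to read this advice is sketched below, under stated assumptions: the complete record is kept outside the index, and every update re-indexes the whole record instead of the hit read back from the searcher. `SourceOfTruthUpdater`, its methods, and the `reindex` callback are hypothetical names for this illustration. In a real Lucene 3.0.x setup the callback would presumably build a fresh Document (re-creating the NumericField instances for LAT and LNG) and pass it to writer.updateDocument.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Sketch of "keep the full record elsewhere, rebuild on update". All names are hypothetical. */
class SourceOfTruthUpdater {
    // Stands in for a database or another complete copy: id -> every field of the record.
    private final Map<String, Map<String, String>> records = new HashMap<>();
    // Stands in for building a Document and calling writer.updateDocument(idTerm, document).
    private final BiConsumer<String, Map<String, String>> reindex;

    SourceOfTruthUpdater(BiConsumer<String, Map<String, String>> reindex) {
        this.reindex = reindex;
    }

    /** Saves the complete record and indexes it for the first time. */
    void create(String id, Map<String, String> fields) {
        Map<String, String> copy = new LinkedHashMap<>(fields);
        records.put(id, copy);
        reindex.accept(id, new LinkedHashMap<>(copy));
    }

    /**
     * Adds or overwrites one field, then re-indexes the complete record.
     * Returns false if no record with that id is known.
     */
    boolean setField(String id, String name, String value) {
        Map<String, String> record = records.get(id);
        if (record == null) {
            return false;
        }
        record.put(name, value);
        // Hand over a copy so the callback cannot modify the saved record.
        reindex.accept(id, new LinkedHashMap<>(record));
        return true;
    }
}
```

With this shape, adding FINAL to the record HO1234 hands the callback ID, PATTERN, LAT, LNG, and FINAL together, so the indexed-only fields are written again along with the new one.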

        Shai Erera added a comment -

        This is not the sort of discussion we should be having in JIRA; that's what the user list is for. Closing, as this is neither a bug nor a feature/enhancement proposal.

        Shai Erera made changes -
        Status: Open [ 1 ] -> Closed [ 6 ]
        Resolution: (none) -> Not A Problem [ 8 ]
        Mark Thomas made changes -
        Workflow: jira [ 12542332 ] -> Default workflow, editable Closed status [ 12564194 ]
        Mark Thomas made changes -
        Workflow: Default workflow, editable Closed status [ 12564194 ] -> jira [ 12585622 ]

          People

          • Assignee: Unassigned
          • Reporter: Tamas Sandor
          • Votes: 0
          • Watchers: 1
