SOLR-124

use NewIndexModifier, LUCENE-565

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 1.3
    • Component/s: update
    • Labels:
      None

      Description

      LUCENE-565 adds extension points to the IndexWriter, and adds delete-by-term functionality.
      We should probably take advantage of this (when available) in our UpdateHandler (a new one, or modify DU2?)
      and perhaps implement a more efficient deleteByQuery.
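
      As a rough illustration of the idea above, the following toy sketch shows an update handler that delegates overwrites and deletes to a writer offering delete-by-term, and builds a naive delete-by-query on top of it. Everything here is an assumption made for illustration: `ToyTermWriter` and `ToyUpdateHandler` are hypothetical names, documents are reduced to an id and a body string, and none of this is the actual Lucene or Solr API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a writer that supports delete-by-term.
// The names are illustrative only; they are not Lucene or Solr APIs.
class ToyTermWriter {
    final List<String[]> docs = new ArrayList<>(); // each entry is {id, body}

    void addDocument(String id, String body) {
        docs.add(new String[] {id, body});
    }

    // Delete every document whose id field equals the given value.
    void deleteByTerm(String id) {
        docs.removeIf(d -> d[0].equals(id));
    }
}

// Hypothetical update handler that delegates to the writer instead of
// keeping its own bookkeeping of pending deletes.
class ToyUpdateHandler {
    private final ToyTermWriter writer;

    ToyUpdateHandler(ToyTermWriter writer) {
        this.writer = writer;
    }

    // With overwrite set, any older document with the same id is removed first.
    void addDoc(String id, String body, boolean overwrite) {
        if (overwrite) {
            writer.deleteByTerm(id);
        }
        writer.addDocument(id, body);
    }

    void delete(String id) {
        writer.deleteByTerm(id);
    }

    // Naive delete-by-query: collect matching ids, then delete each by term.
    void deleteByQuery(Predicate<String> bodyMatches) {
        List<String> ids = new ArrayList<>();
        for (String[] d : writer.docs) {
            if (bodyMatches.test(d[1])) {
                ids.add(d[0]);
            }
        }
        for (String id : ids) {
            writer.deleteByTerm(id);
        }
    }

    int numDocs() {
        return writer.docs.size();
    }
}
```

      The delete-by-query here scans every document, which is exactly the inefficiency a writer-level implementation would aim to avoid.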

        Activity

        Yonik Seeley created issue -
        Yonik Seeley added a comment -

        LUCENE-565 has been committed, but Solr doesn't have that version yet.
        Lucene 2.1 is right around the corner though.

        Yonik Seeley made changes -
        Field: Summary
        Original Value: use LUCENE-565
        New Value: use NewIndexModifier, LUCENE-565
        Tim Patton added a comment -

        I see that Lucene 2.1 is out, and the latest download of Solr that I got included the jars. Will this patch be added soon, or is it available somewhere now?

        Hoss Man added a comment -

        No patch exists yet ... this issue was opened to track that it should be done at some point.

        I believe it will be a somewhat significant change, but I'm not much of an expert on the update internals.

        Tim Patton added a comment -

        Oh, I read "LUCENE-565 has been committed" as meaning Yonik had committed the code to some branch, or had at least checked that the code worked but was waiting to commit it to the trunk.

        If one were to work on this, would it be easier to work from DU2 or the original DU? It looks like DU2 duplicates a lot of features of the new IndexWriter and DU would be a more straightforward starting point.

        Mike Klaas added a comment -

        DUH may be simpler, but DUH2 has also been carefully modified to safely support multithreaded indexing and autocommitting.

        Does anyone have a feeling on whether LUCENE-565 will improve performance? Or is it likely to be mostly a code cleanliness improvement?

        Yonik Seeley added a comment -

        > Does anyone have a feeling on whether LUCENE-565 will improve performance?

        Probably not much, if at all. Deletions happen after a segment flush, which is slightly less efficient (more indexreaders that need to be opened), but does get deletions in the index faster (meaning they are more likely to be "squeezed" out by a subsequent segment merge).

        One small advantage to LUCENE-565 is that overwriting is atomic... you can't crash and see duplicates.

        The patch has been changed around multiple times, and it would now be necessary to include a lucene package in solr to get access to package-protected stuff that would allow efficient delete-by-query.
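
        To make the flush and atomicity points above concrete, here is a minimal toy model, not Lucene code. It assumes a writer that buffers adds and delete-by-id requests and applies both together at flush time, so a reader sees either the old version of a document or the new one, never both. The class `ToyBufferedWriter`, its methods, and the `crash()` helper are all hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model of a writer that buffers adds and deletes until flush.
// Hypothetical names throughout; this is not the Lucene IndexWriter API.
class ToyBufferedWriter {
    private final List<String[]> committed = new ArrayList<>();    // {id, body}
    private final List<String[]> bufferedAdds = new ArrayList<>(); // {id, body}
    private final Set<String> bufferedDeleteIds = new LinkedHashSet<>();

    void addDocument(String id, String body) {
        bufferedAdds.add(new String[] {id, body});
    }

    // Delete by id: earlier buffered copies are dropped right away,
    // committed copies are removed when the buffer is flushed.
    void deleteById(String id) {
        bufferedAdds.removeIf(d -> d[0].equals(id));
        bufferedDeleteIds.add(id);
    }

    // Overwrite: the delete and the add travel in the same buffer.
    void updateDocument(String id, String body) {
        deleteById(id);
        addDocument(id, body);
    }

    // Apply buffered deletes, then buffered adds, as one step.
    void flush() {
        committed.removeIf(d -> bufferedDeleteIds.contains(d[0]));
        committed.addAll(bufferedAdds);
        bufferedAdds.clear();
        bufferedDeleteIds.clear();
    }

    // Simulate a crash before flush: everything still buffered is lost.
    void crash() {
        bufferedAdds.clear();
        bufferedDeleteIds.clear();
    }

    // Bodies a freshly opened reader would see for the given id.
    List<String> visibleBodies(String id) {
        List<String> out = new ArrayList<>();
        for (String[] d : committed) {
            if (d[0].equals(id)) {
                out.add(d[1]);
            }
        }
        return out;
    }
}

class ToyBufferedWriterDemo {
    public static void main(String[] args) {
        ToyBufferedWriter w = new ToyBufferedWriter();
        w.addDocument("doc1", "v1");
        w.flush();
        w.updateDocument("doc1", "v2");
        System.out.println(w.visibleBodies("doc1")); // still the old version
        w.flush();
        System.out.println(w.visibleBodies("doc1")); // only the new version
    }
}
```

        In this model a crash between `updateDocument` and `flush` loses the update but leaves the old document in place, which is the "no duplicates" property described above.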

        Tim Patton added a comment -

        Which package-protected parts of Lucene would need to be accessed? Perhaps a patch to Lucene could be submitted to support delete by query (or at least allow easy support of delete by query).

        Koji Sekiguchi added a comment -

        I could be wrong, but is this a duplicate of SOLR-559?

        Yonik Seeley added a comment -

        Yes it is mostly a duplicate (I did a search before I opened 559, but didn't turn up this one).

        I had been planning on having to implement our own deleteByQuery efficiently w/ certain IndexWriter extension points, but Lucene now has deleteByQuery in trunk, so we should just wait for that.

        Yonik Seeley made changes -
        Status: Open [ 1 ] -> Closed [ 6 ]
        Resolution: Duplicate [ 3 ]
        Fix Version/s: 1.3 [ 12312486 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Yonik Seeley
          • Votes:
            1
            Watchers:
            1
