Lucene - Core
  1. Lucene - Core
  2. LUCENE-4060

DirectoryTaxonomyWriter.addTaxonomies does not work well in a multi-threaded environment

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA, 3.6.1
    • Component/s: modules/facet
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      DirTaxoWriter.addTaxonomies may lead to a corrupt taxonomy index if addCategory is called in parallel. While reviewing the code, the following changes seemed to be required:

      • Fix the bug
      • Simplify addTaxonomies code – it's very complicated, seems rather inefficient, and seems to take care of more than is needed.
      • The intention to support multiple taxonomies is unclear, especially since FacetsPayloadProcessor can handle only one Directory and one OrdinalMap at a time. So I'd like to change the method to addTaxonomy. I anyway don't see the reason to add multiple taxonomies at once, one can call addTaxonomy several times.

      I will post a patch soon.

        Activity

        Shai Erera created issue -
        Hide
        Shai Erera added a comment -

        Patch does:

        • Fix the bug
        • Renames addTaxonomies to addTaxonomy
        • Greatly simplifies addTaxonomy logic
        • Overhauls TestAddTaxonomies to be simpler and test what it needs to test - for some reason it tested that after addTaxonmy the new categories are sorted in a lexicographic order, which is a side effect of addTaxonomy, but does not need to be like that.

        NOTE: before applying the patch you should run

        svn mv lucene/contrib/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestAddTaxonomies.java lucene/contrib/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestAddTaxonomy.java
        

        One nice side effect of this improvement is that testBig, with the really huge taxonomy, takes 12 seconds faster to run after the change. While this is not a pure benchmark, it does show that the modified implementation is faster than the older.

        I will commit this shortly and port to trunk.

        Show
        Shai Erera added a comment - Patch does: Fix the bug Renames addTaxonomies to addTaxonomy Greatly simplifies addTaxonomy logic Overhauls TestAddTaxonomies to be simpler and test what it needs to test - for some reason it tested that after addTaxonmy the new categories are sorted in a lexicographic order, which is a side effect of addTaxonomy, but does not need to be like that. NOTE: before applying the patch you should run svn mv lucene/contrib/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestAddTaxonomies.java lucene/contrib/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestAddTaxonomy.java One nice side effect of this improvement is that testBig, with the really huge taxonomy, takes 12 seconds faster to run after the change. While this is not a pure benchmark, it does show that the modified implementation is faster than the older. I will commit this shortly and port to trunk.
        Shai Erera made changes -
        Field Original Value New Value
        Attachment LUCENE-4060.patch [ 12527574 ]
        Hide
        Shai Erera added a comment -

        Committed rev 1339032 (3.6.1) and 1339047 (trunk). Also add CHANGES entry and a concurrency test to TestAddTaxonomy.

        Thanks Gilad !

        Show
        Shai Erera added a comment - Committed rev 1339032 (3.6.1) and 1339047 (trunk). Also add CHANGES entry and a concurrency test to TestAddTaxonomy. Thanks Gilad !
        Shai Erera made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Lucene Fields New [ 10121 ] New,Patch Available [ 10121, 10120 ]
        Resolution Cannot Reproduce [ 5 ]
        Hide
        Uwe Schindler added a comment -

        Bulk close for 3.6.1

        Show
        Uwe Schindler added a comment - Bulk close for 3.6.1
        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Shai Erera
            Reporter:
            Shai Erera
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development