Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The lucene/join module expects a certain index structure but nothing validates it. Then at search time it either needs to validate the index structure in a slow way or be lenient and cope with what it is given.

      1. LUCENE-6589.patch
        20 kB
        Adrien Grand
      2. LUCENE-6589.patch
        16 kB
        Adrien Grand

        Activity

        Hide
        Adrien Grand added a comment -

        Here is a patch which adds a CheckJoinIndex utility class and runs it in the join tests.

        I fixed some tests to not index garbage documents, and had to remove a test that assumed it was ok to have a non-deleted parent document with deleted children.

        Show
        Adrien Grand added a comment - Here is a patch which adds a CheckJoinIndex utility class and runs it in the join tests. I fixed some tests to not index garbage documents, and had to remove a test that assumed it was ok to have a non-deleted parent document with deleted children.
        Hide
        Martijn van Groningen added a comment -

        +1! Maybe also add a test were this validation fails?

        Would be great if this validation could optionally run too in block join queries? Before a the Scorer goes do its job, if a validate option is enabled we would check the segment is sane. But this can be done in a followup issue (if it makes sense).

        Show
        Martijn van Groningen added a comment - +1! Maybe also add a test were this validation fails? Would be great if this validation could optionally run too in block join queries? Before a the Scorer goes do its job, if a validate option is enabled we would check the segment is sane. But this can be done in a followup issue (if it makes sense).
        Hide
        Adrien Grand added a comment -

        Good point about testing the tester, I added a new test to make sure that we throw an exception when the index has an illegal structure.

        However I disagree about running this check while running block join queries. I think it should considered a heavy operation given that it needs to read the whole index and should only be run to ensure that the index has a good structure, pretty much like CheckIndex?

        Show
        Adrien Grand added a comment - Good point about testing the tester, I added a new test to make sure that we throw an exception when the index has an illegal structure. However I disagree about running this check while running block join queries. I think it should considered a heavy operation given that it needs to read the whole index and should only be run to ensure that the index has a good structure, pretty much like CheckIndex?
        Hide
        Martijn van Groningen added a comment -

        Good point about testing the tester, I added a new test to make sure that we throw an exception when the index has an illegal structure.

        Thanks!

        However I disagree about running this check while running block join queries. I think it should considered a heavy operation given that it needs to read the whole index and should only be run to ensure that the index has a good structure, pretty much like CheckIndex?

        true, good point. I was looking for a way to have the ability to run this validation easily elsewhere too (via a low level debug option). But having the utility available when the join jar is on the class path is good too, systems relying on block join can just add this to their testing infrastructure.

        +1 to commit!

        Show
        Martijn van Groningen added a comment - Good point about testing the tester, I added a new test to make sure that we throw an exception when the index has an illegal structure. Thanks! However I disagree about running this check while running block join queries. I think it should considered a heavy operation given that it needs to read the whole index and should only be run to ensure that the index has a good structure, pretty much like CheckIndex? true, good point. I was looking for a way to have the ability to run this validation easily elsewhere too (via a low level debug option). But having the utility available when the join jar is on the class path is good too, systems relying on block join can just add this to their testing infrastructure. +1 to commit!
        Hide
        Uwe Schindler added a comment -

        Can we also add this to Solr indexing checks? Because the search problems come merely from broken Solr indexes. I already wrote aout that on another issue, so I would really like to ad this testing to Solr!

        Show
        Uwe Schindler added a comment - Can we also add this to Solr indexing checks? Because the search problems come merely from broken Solr indexes. I already wrote aout that on another issue, so I would really like to ad this testing to Solr!
        Hide
        Adrien Grand added a comment -

        I agree it would be nice to have it integrated into Solr but I'm not sure where it should be: this is too expensive to run on every query and should be treated like CheckIndex. Should I open an issue to integrate CheckIndex and this new CheckJoinIndex into Solr?

        Show
        Adrien Grand added a comment - I agree it would be nice to have it integrated into Solr but I'm not sure where it should be: this is too expensive to run on every query and should be treated like CheckIndex. Should I open an issue to integrate CheckIndex and this new CheckJoinIndex into Solr?
        Hide
        Adrien Grand added a comment -

        I opened SOLR-7744.

        Show
        Adrien Grand added a comment - I opened SOLR-7744 .
        Hide
        ASF subversion and git services added a comment -

        Commit 1689637 from Adrien Grand in branch 'dev/trunk'
        [ https://svn.apache.org/r1689637 ]

        LUCENE-6589: Add CheckJoinIndex to validate index structure for joins.

        Show
        ASF subversion and git services added a comment - Commit 1689637 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1689637 ] LUCENE-6589 : Add CheckJoinIndex to validate index structure for joins.
        Hide
        ASF subversion and git services added a comment -

        Commit 1689638 from Adrien Grand in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1689638 ]

        LUCENE-6589: Add CheckJoinIndex to validate index structure for joins.

        Show
        ASF subversion and git services added a comment - Commit 1689638 from Adrien Grand in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1689638 ] LUCENE-6589 : Add CheckJoinIndex to validate index structure for joins.
        Hide
        ASF subversion and git services added a comment -

        Commit 1689648 from Adrien Grand in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1689648 ]

        LUCENE-6589: Fix test bug.

        Show
        ASF subversion and git services added a comment - Commit 1689648 from Adrien Grand in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1689648 ] LUCENE-6589 : Fix test bug.
        Hide
        ASF subversion and git services added a comment -

        Commit 1689649 from Adrien Grand in branch 'dev/trunk'
        [ https://svn.apache.org/r1689649 ]

        LUCENE-6589: Fix test bug.

        Show
        ASF subversion and git services added a comment - Commit 1689649 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1689649 ] LUCENE-6589 : Fix test bug.
        Hide
        Shalin Shekhar Mangar added a comment -

        Bulk close for 5.3.0 release

        Show
        Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release
        Hide
        Mikhail Khludnev added a comment -

        Adrien Grand why don't allow to children to be deleted?

        Show
        Mikhail Khludnev added a comment - Adrien Grand why don't allow to children to be deleted?
        Hide
        Adrien Grand added a comment -

        Mikhail Khludnev We already require a block of documents to be entirely reindexed when adding or modifying a child document, so I think it's consistent to also have this requirement when deleting a child?

        In addition, this is required after the simplifications from LUCENE-6553: now that live docs are applied on top of the query, we would only check if the parent doc IDs are live if you run a ToParentBlockJoinQuery, never the children.

        Show
        Adrien Grand added a comment - Mikhail Khludnev We already require a block of documents to be entirely reindexed when adding or modifying a child document, so I think it's consistent to also have this requirement when deleting a child? In addition, this is required after the simplifications from LUCENE-6553 : now that live docs are applied on top of the query, we would only check if the parent doc IDs are live if you run a ToParentBlockJoinQuery, never the children.
        Hide
        Mikhail Khludnev added a comment -

        thus, we presumably could support deleting a child doc in block, but I agree - it's unnecessarily complicate the code. Let's keep it as-is until we've got a considerable justification from users.
        Thanks for the confirmation.

        Show
        Mikhail Khludnev added a comment - thus, we presumably could support deleting a child doc in block, but I agree - it's unnecessarily complicate the code. Let's keep it as-is until we've got a considerable justification from users. Thanks for the confirmation.

          People

          • Assignee:
            Unassigned
            Reporter:
            Adrien Grand
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development