  1. Jackrabbit Oak
  2. OAK-4804

Synonym analyzer with multiple words in synonym definition can give more results than expected


Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • lucene
    • None

    Description

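      With a synonym definition that contains a multi-word entry, such as "FTW, For the win", a full-text search for "FTW" returns not only documents containing "FTW" or the phrase "For the win", but also documents that merely contain all of the words "For", "the" and "win" anywhere in the text.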

      Test case:

          @Test
          public void fulltextSearchWithPhraseSynonymAnalyzer() throws Exception {
              Tree idx = createFulltextIndex(root.getTree("/"), "test");
              TestUtil.useV2(idx);

              // Default analyzer: Standard tokenizer followed by a Synonym filter
              Tree anl = idx.addChild(LuceneIndexConstants.ANALYZERS).addChild(LuceneIndexConstants.ANL_DEFAULT);
              anl.addChild(LuceneIndexConstants.ANL_TOKENIZER).setProperty(LuceneIndexConstants.ANL_NAME, "Standard");
              Tree synFilter = anl.addChild(LuceneIndexConstants.ANL_FILTERS).addChild("Synonym");
              synFilter.setProperty("synonyms", "syn.txt");
              // Synonym file with a single rule containing a multi-word entry
              synFilter.addChild("syn.txt").addChild(JCR_CONTENT).setProperty(JCR_DATA, "FTW, For the win");

              // /test/1 and /test/2 should match; /test/3 only contains the words scattered
              Tree test = root.getTree("/").addChild("test");
              test.addChild("1").setProperty("foo", "FTW");
              test.addChild("2").setProperty("foo", "For the win");
              test.addChild("3").setProperty("foo", "For gods sake, this is not the way to win it");
              root.commit();

              // Currently fails: the actual result is ["/test/1", "/test/2", "/test/3"]
              assertQuery("select * from [nt:base] where CONTAINS(*, 'FTW') AND ISDESCENDANTNODE('/test')",
                      asList("/test/1", "/test/2"));
          }
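
      To make the difference between the expected and the observed result concrete, here is a small stand-alone sketch. It does not use Oak or Lucene and is not how either of them is implemented; it only simulates two ways a multi-word synonym could be matched. The class SynonymMatchSketch and all of its methods are hypothetical names made up for this illustration, and the tokenizer is a rough approximation. Matching the synonym as "all words present" reproduces the three results reported above, while matching it as a phrase gives the two expected results.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration only: simulates the matching semantics, not Oak/Lucene internals.
public class SynonymMatchSketch {

    // Lowercase and split on non-alphanumeric characters (rough stand-in for a standard tokenizer).
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // Observed behaviour: every word of the synonym occurs somewhere in the document.
    static boolean containsAllTerms(String doc, String synonym) {
        return tokenize(doc).containsAll(tokenize(synonym));
    }

    // Expected behaviour: the words of the synonym occur adjacent and in order.
    static boolean containsPhrase(String doc, String synonym) {
        return Collections.indexOfSubList(tokenize(doc), tokenize(synonym)) >= 0;
    }

    // Returns the paths of documents matching the term itself or its multi-word synonym.
    static List<String> search(Map<String, String> docs, String term, String synonym, boolean asPhrase) {
        String needle = term.toLowerCase(Locale.ROOT);
        return docs.entrySet().stream()
                .filter(e -> tokenize(e.getValue()).contains(needle)
                        || (asPhrase ? containsPhrase(e.getValue(), synonym)
                                     : containsAllTerms(e.getValue(), synonym)))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // Same three documents as in the test case above.
    static Map<String, String> sampleDocs() {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("/test/1", "FTW");
        docs.put("/test/2", "For the win");
        docs.put("/test/3", "For gods sake, this is not the way to win it");
        return docs;
    }

    public static void main(String[] args) {
        // All-terms matching: includes /test/3, like the current failing result
        System.out.println(search(sampleDocs(), "FTW", "For the win", false));
        // Phrase matching: only /test/1 and /test/2, like the expected result
        System.out.println(search(sampleDocs(), "FTW", "For the win", true));
    }
}
```

      Whether the extra hit comes from the index-time or the query-time side of the analyzer is not established here; the sketch only shows that the reported result is consistent with the synonym words being required individually rather than as a phrase.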
      

            People

              catholicon Vikas Saurabh