Forrest / FOR-246

Lucene search complains about missing docs which definitely do not exist

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.7
    • Component/s: Core operations
    • Labels:
      None

      Description

      Lucene search fails with "SourceNotFoundException: file:/...src/documentation/content/xdocs/docs/howto-v13.xml doesn't exist". However that document should not exist, so the Lucene indexer is mistakenly being told to read bad URIs.

        Activity

        David Crossley added a comment -
        Thanks Florian, I applied your patches to both trunk and forrest_06_branch.

        Regarding your comment about not receiving notification of issue changes ... I was the original reporter of the bug, so I do get notified. Anyone else would need to either monitor the forrest-dev mailing list where all notifications go, or use Jira's "Watch it" facility in the bottom-left panel.

        Regarding your comment about the "dead-end links in site.xml" ... They are a hack to get us over the first hurdle of splitting the website documentation into "dev" and "release" docs. There must be a better way, and when we find it, this issue with the Lucene index should go away.
        Florian G. Haas added a comment -
        * Fixes book2cinclude-lucene.xsl to correctly handle filenames (relative URLs) with multiple dots.
        * Fixes sitemap.xmap so requests ending in ".lucene" are handled properly.
        * Fixes a trivial comment typo in search.xmap.

        Unresolved: handling dead links in site.xml, as described in my earlier comment and as applicable to the current "site-author" documentation.
        Florian G. Haas added a comment -
        This behavior is due to several factors:

        * First and foremost, an error in book2cinclude-lucene.xsl which butchered filenames with more than one dot (.) in them. This is why the relative URL howto-v13.dtdx.html is not interpreted correctly and the Lucene facility expects to find a file named howto-v13.xml, which of course does not exist.

        * Secondly, still pertaining to the docs in "docs-author", a mixup in priorities in the sitemap.xmap. Any file with "images" in the filename is thus missed by the Lucene indexer.

        * Thirdly, WRT the docs in "site-author", the fact that there are some entries in site.xml which don't have any corresponding filenames, namely, the menu items /docs/ and /docs-dev/. I have no idea how to have this handled properly by the Lucene indexer. Any help would be dearly appreciated.
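
The first factor above can be illustrated with a minimal sketch. It is written in Python rather than XSLT, and it assumes (the stylesheet itself is not shown here) that the faulty logic cut the relative URL at its first dot before appending ".xml". The function names are hypothetical. Whether the name produced by the second variant maps to a real source file depends on the sitemap and is not modeled here.

```python
import posixpath


def source_name_first_dot(url: str) -> str:
    """Assumed buggy behavior: keep only the text before the FIRST dot."""
    return url.split(".", 1)[0] + ".xml"


def source_name_last_dot(url: str) -> str:
    """Assumed corrected behavior: strip only the final extension."""
    stem, _ext = posixpath.splitext(url)
    return stem + ".xml"


url = "howto-v13.dtdx.html"
print(source_name_first_dot(url))  # howto-v13.xml (the name from the error message)
print(source_name_last_dot(url))   # howto-v13.dtdx.xml
```

For a URL with a single dot, such as index.html, both variants give the same result, which would explain why only multi-dot filenames triggered the error.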

        Sorry it took so long to tackle this issue. I seem not to have properly received the notification message back in August, and I just now spotted the entry on the issue tracker.

          People

          • Assignee:
            David Crossley
            Reporter:
            David Crossley
          • Votes:
            0
            Watchers:
            0
