Nutch / NUTCH-2182

Make reverseUrlDirs file dumper option hash the URL for consistency


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.11
    • Fix Version/s: 1.12
    • Component/s: tool
    • Labels: None

    Description

      At the moment the "reverseUrlDirs" option for FileDumper is brittle and fails on a fair number of edge cases. A more robust way to handle the reverse-URL approach to dumping a file is to reverse the server part of the URL and use a hash of the URL as the file name. This gives us a nice split of files across directories while avoiding a number of likely classes of errors that cause dumps to fail.
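
The approach described above could be sketched roughly as follows. This is a minimal illustration, not the code committed for this issue: the class name `ReverseUrlPathSketch`, the helpers `reverseHost`, `sha256Hex`, and `dumpPath`, the choice of SHA-256 as the hash, and the `/` separator are all assumptions made for the example.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical sketch: reversed host as the directory, URL hash as the file name.
final class ReverseUrlPathSketch {

    // Reverse the dot-separated host labels: "www.example.com" -> "com/example/www".
    public static String reverseHost(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            sb.append(labels[i]);
            if (i > 0) {
                sb.append('/');
            }
        }
        return sb.toString();
    }

    // Hex-encoded SHA-256 of the input (hash algorithm is an assumption).
    public static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Relative dump path: reversed host directories plus the hash of the full URL.
    public static String dumpPath(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("URL has no host: " + url);
        }
        return reverseHost(host) + "/" + sha256Hex(url);
    }

    public static void main(String[] args) {
        System.out.println(dumpPath("http://www.example.com/a/b?x=1&y=2"));
    }
}
```

Because the file name is a fixed-length hex string, characters in the URL path or query string never reach the file system, which is where path-based naming tends to break. Note that `URI.create` rejects URLs containing illegal characters such as unescaped spaces, so a real implementation would need to handle that case separately.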

          People

            Assignee: Michael Joyce (mjoyce)
            Reporter: Michael Joyce (mjoyce)
            Votes: 0
            Watchers: 3
