Nutch / NUTCH-1971

The crawldb.url.filters property is not present in any configuration file


    Details

      Description

      CrawlDbFilter.java defines the key of a boolean property that controls whether URL filters are applied:

      public static final String URL_FILTERING = "crawldb.url.filters";

      However, that property is not present in nutch-default.xml. Currently the only way to set this value is with the -filter parameter on the command line.

      The same applies to:
      public static final String URL_NORMALIZING = "crawldb.url.normalizers";
      public static final String URL_NORMALIZING_SCOPE = "crawldb.url.normalizers.scope";
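
      For illustration, entries along the following lines could be added to nutch-default.xml. This is only a sketch: the property names come from the constants above, but the values (false, false, crawldb) and the description texts are assumptions and have not been checked against the defaults used in CrawlDbFilter.java. Whether the tools would honor these entries also depends on whether they overwrite the keys from their command-line flags.

```xml
<!-- Hypothetical nutch-default.xml entries; values and descriptions are assumed. -->
<property>
  <name>crawldb.url.filters</name>
  <value>false</value>
  <description>If true, URL filters are applied to CrawlDb entries.</description>
</property>

<property>
  <name>crawldb.url.normalizers</name>
  <value>false</value>
  <description>If true, URL normalizers are applied to CrawlDb entries.</description>
</property>

<property>
  <name>crawldb.url.normalizers.scope</name>
  <value>crawldb</value>
  <description>Scope passed to the URL normalizers when they are enabled.</description>
</property>
```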


            People

            • Assignee: Unassigned
            • Reporter: Luis Lopez (betolink)
