  Nutch / NUTCH-1971

The crawldb.url.filters property is not present in any configuration file

Details

    Description

      In CrawlDbFilter.java, the following constant names the boolean property that controls whether the URL filters are applied:

      public static final String URL_FILTERING = "crawldb.url.filters";

      However, that property is not present in nutch-default.xml. Currently the only way to set this value is with the -filter parameter on the command line.

      The same applies to:
      public static final String URL_NORMALIZING = "crawldb.url.normalizers";
      public static final String URL_NORMALIZING_SCOPE = "crawldb.url.normalizers.scope";
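
      As a rough sketch of what the missing entries might look like if they were added to nutch-default.xml, the fragment below uses the standard Hadoop property layout. The property names come from the constants above; the values and descriptions are assumptions for illustration and are not taken from the Nutch sources. Tools that set these values themselves from command-line flags such as -filter may still override anything configured here.

      ```xml
      <!-- Hypothetical entries: names are from CrawlDbFilter.java,
           values and descriptions are assumed for illustration only. -->
      <property>
        <name>crawldb.url.filters</name>
        <value>false</value>
        <description>Assumed meaning: whether URL filters are applied
        to CrawlDb entries.</description>
      </property>

      <property>
        <name>crawldb.url.normalizers</name>
        <value>false</value>
        <description>Assumed meaning: whether URL normalizers are applied
        to CrawlDb entries.</description>
      </property>

      <property>
        <name>crawldb.url.normalizers.scope</name>
        <value>crawldb</value>
        <description>Assumed meaning: the normalizer scope used when
        normalization is enabled.</description>
      </property>
      ```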

          People

            Assignee: Unassigned
            Reporter: Luis Lopez (betolink)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: