Nutch / NUTCH-854

Define standard attributes with values and explanations in the configuration files in the conf directory

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: None
    • Fix Version/s: nutchgora
    • Component/s: None
    • Labels:
      None
    • Environment:

      Windows XP SP3, Cygwin, JDK 1.6.20, Ant 1.8.1

      Description

      It would make Nutch easier to use if every configuration file in the conf directory defined standard attributes with values and explanations. For example, nutch-site.xml.template currently contains no attributes and no explanations; we should define them.

      -------------
      <?xml version="1.0"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

      <!-- site-specific property overrides in this file. -->

      <configuration>

      <!-- Agent name -->
      <property>
        <name>http.agent.name</name>
        <value>nutch-solr-integration</value>
      </property>

      <!-- Maximum number of URLs per host in a single fetchlist -->
      <property>
        <name>generate.max.per.host</name>
        <value>100</value>
      </property>

      <!-- Plugins used by this site -->
      <property>
        <name>plugin.includes</name>
        <value>protocol-http|urlfilter-regex|parse-tika|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
      </property>
      </configuration>
      -------------

      Thanks,

        Activity

        Pham Tuan Minh added a comment -

        My idea is that we define standard attributes that Nutch needs in order to work. If users want to customize how their web site (data source) is crawled, they define their own attributes in nutch-site.xml to override the standard ones.
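
        A minimal sketch of the override described in this comment, assuming nutch-site.xml is loaded after nutch-default.xml so that a property with the same name replaces the shipped default. The property name and the value 100 are taken from the example in the description; the comment text is illustrative.

        ```xml
        <?xml version="1.0"?>
        <configuration>
          <!-- Site-specific override: replaces the default value of the
               property with the same name in nutch-default.xml -->
          <property>
            <name>generate.max.per.host</name>
            <value>100</value>
          </property>
        </configuration>
        ```

        Properties that are not listed in nutch-site.xml keep whatever value the defaults file gives them, so the site file only needs the handful of entries that differ.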

        Julien Nioche added a comment -

        nutch-default.xml already does that: all the parameters are listed and commented, along with their default values.
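
        For illustration, an entry in nutch-default.xml has roughly the following shape, with the explanation carried in a description element. This is a sketch: the default value and the description wording shown here are assumptions and may differ between Nutch versions.

        ```xml
        <property>
          <name>generate.max.per.host</name>
          <!-- Default value shown is an assumption; check the file
               shipped with your Nutch version -->
          <value>-1</value>
          <description>The maximum number of URLs per host in a single
          fetchlist. -1 if unlimited.</description>
        </property>
        ```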

        Markus Jelsma added a comment -

        Bulk close of resolved issues: http://www.lucidimagination.com/search/document/2738eeb014805854/clean_up_open_legacy_issues_in_jira

          People

          • Assignee:
            Unassigned
          • Reporter:
            Pham Tuan Minh
          • Votes:
            0
          • Watchers:
            0
