Nutch / NUTCH-1504

Pluggable URL partitioner

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.6
    • Fix Version/s: None
    • Component/s: generator
    • Labels: None
    • Patch Info: Patch Available

    Description

      At present, the URL partitioning logic is hard-wired inside Nutch core. It should be pluggable, like FetchSchedule, so that a custom implementation can be selected via nutch-site.xml.

      There are use cases where the URLs of a single domain need to be partitioned according to custom logic. The existing URLPartitioner cannot handle such cases.

      Hence the requirement.
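
      To illustrate the idea, here is a minimal sketch of what a pluggable partitioner could look like. It is not the attached patch and does not use the Nutch or Hadoop APIs: the interface UrlPartitionStrategy, both implementations, and the property name partition.url.strategy.class are hypothetical, and java.util.Properties stands in for the configuration that would normally come from nutch-site.xml.

```java
import java.net.URI;
import java.util.Properties;

class PluggablePartitionerSketch {

    /** Hypothetical property name; a real patch may use a different key. */
    public static final String STRATEGY_KEY = "partition.url.strategy.class";

    /** Hypothetical extension point: maps a URL to a bucket in [0, numPartitions). */
    public interface UrlPartitionStrategy {
        int getPartition(String url, int numPartitions);
    }

    /** Default in this sketch: every URL of a host lands in the same partition. */
    public static class ByHostStrategy implements UrlPartitionStrategy {
        @Override
        public int getPartition(String url, int numPartitions) {
            String host = URI.create(url).getHost();
            String key = (host == null) ? url : host.toLowerCase();
            // floorMod keeps the result non-negative for negative hash codes.
            return Math.floorMod(key.hashCode(), numPartitions);
        }
    }

    /** Custom logic: splits one host further by its first path segment. */
    public static class ByHostAndSectionStrategy implements UrlPartitionStrategy {
        @Override
        public int getPartition(String url, int numPartitions) {
            URI uri = URI.create(url);
            String host = (uri.getHost() == null) ? "" : uri.getHost().toLowerCase();
            String path = (uri.getPath() == null) ? "" : uri.getPath();
            String[] segments = path.split("/");
            // For "/sports/a.html" the segments are ["", "sports", "a.html"].
            String section = (segments.length > 1) ? segments[1] : "";
            return Math.floorMod((host + "/" + section).hashCode(), numPartitions);
        }
    }

    /** Instantiates the configured strategy, falling back to ByHostStrategy. */
    public static UrlPartitionStrategy load(Properties conf) {
        String className = conf.getProperty(STRATEGY_KEY, ByHostStrategy.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(UrlPartitionStrategy.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load partition strategy: " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(STRATEGY_KEY, ByHostAndSectionStrategy.class.getName());
        UrlPartitionStrategy strategy = load(conf);
        for (String url : new String[] {
                "http://example.com/sports/a.html",
                "http://example.com/news/b.html"}) {
            System.out.println(url + " -> partition " + strategy.getPartition(url, 8));
        }
    }
}
```

      With no property set, load falls back to the by-host behaviour, so existing setups would be unaffected; setting the property swaps in the custom logic without touching core code.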

      Attachments

        1. custom.partitioner.patch
          6 kB
          Sourajit Basak

        Activity

          People

            Assignee: Lewis John McGibbney (lewismc)
            Reporter: Sourajit Basak (sourajit)
            Votes: 0
            Watchers: 3
