  1. Nutch
  2. NUTCH-2479

urlmeta plugin port from 1.x to 2.x

Details

    • Patch Available
    • Patch, Important

    Description

      I have ported the urlmeta plugin, available in Nutch 1.x, to 2.x.

      It is designed to do two things:

      • Meta tags supplied with your crawl URLs during injection, either through seed.txt or through the REST API, are propagated to the outlinks of those crawl URLs.
      • When you index your URLs, the meta tags you specified with them are indexed alongside those URLs and can be queried directly, assuming everything else is configured correctly.
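
      The two behaviours above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the plugin's actual Java code. It assumes a seed line of the form URL followed by tab-separated key=value pairs; TRACKED_TAGS and the three helper functions are hypothetical names standing in for the plugin's configured tag list and its scoring and indexing hooks.

```python
# Illustrative sketch only; not the plugin's actual Java implementation.
# Assumed seed format: a URL followed by tab-separated key=value pairs.
# TRACKED_TAGS is a hypothetical stand-in for the plugin's configured tag list.

TRACKED_TAGS = {"category", "source"}


def parse_seed_line(line):
    """Split a seed line into (url, metadata dict)."""
    url, *pairs = line.rstrip("\n").split("\t")
    meta = dict(pair.split("=", 1) for pair in pairs if "=" in pair)
    return url, meta


def propagate(parent_meta, outlinks):
    """Copy the tracked tags from a parent page onto each of its outlinks."""
    inherited = {k: v for k, v in parent_meta.items() if k in TRACKED_TAGS}
    return {link: dict(inherited) for link in outlinks}


def to_index_doc(url, meta):
    """Build an index document whose fields include the tracked tags."""
    doc = {"url": url}
    doc.update({k: v for k, v in meta.items() if k in TRACKED_TAGS})
    return doc


# Example: one seed URL carrying a tracked tag and an untracked one.
url, meta = parse_seed_line("http://example.com/\tcategory=news\tpriority=high")
children = propagate(meta, ["http://example.com/a", "http://example.com/b"])
docs = [to_index_doc(u, m) for u, m in children.items()]
```

      In this sketch only tags listed in TRACKED_TAGS travel to outlinks and into the index, so the untracked priority tag stays on the seed URL alone. Whether the ported plugin filters tags in exactly this way should be checked against the attached patch.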

      I have also added support for this through the NutchServer REST API. The patch is attached to this issue.

      Attachments

        1. Ninaad.Joshi.plugin.urlmeta.patch
          118 kB
          Ninaad Joshi

        Activity

          People

            Assignee: Unassigned
            Reporter: Ninaad Joshi (ninaadj@gmail.com)
            Votes: 0
            Watchers: 4
