  1. Nutch
  2. NUTCH-2479

urlmeta plugin port from 1.x to 2.x

    Details

    • Patch Info:
      Patch Available
    • Flags:
      Patch, Important

      Description

      I have ported the urlmeta plugin from Nutch 1.x to 2.x.

      It is designed to do two things:

      • Meta tags supplied with your crawl URLs during injection, either through seed.txt or through the REST API, are propagated to the outlinks of those URLs.
      • When you index your URLs, the meta tags you specified are indexed alongside them and can be queried directly, assuming the rest of your indexing setup is correct.
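
      The two behaviours above can be illustrated with a small, self-contained sketch. This is not the code from the attached patch: it assumes seed lines use the tab-separated `key=value` form that the Nutch injector accepts after a URL, and the function names (`parse_seed_line`, `propagate`, `index_fields`) and the `tags` set are hypothetical names chosen for illustration.

```python
from typing import Dict, Iterable, Set, Tuple


def parse_seed_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split 'url<TAB>key=value<TAB>key=value' into a URL and its metadata.

    Assumption: metadata fields are tab-separated key=value pairs.
    """
    url, *pairs = line.rstrip("\n").split("\t")
    meta: Dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if sep and key:  # ignore fields that are not key=value
            meta[key.strip()] = value.strip()
    return url.strip(), meta


def propagate(parent_meta: Dict[str, str],
              outlinks: Iterable[str],
              tags: Set[str]) -> Dict[str, Dict[str, str]]:
    """Copy only the configured tags from a parent page to each outlink."""
    inherited = {k: v for k, v in parent_meta.items() if k in tags}
    return {link: dict(inherited) for link in outlinks}


def index_fields(url: str, meta: Dict[str, str], tags: Set[str]) -> Dict[str, str]:
    """Build the fields of an index document: the URL plus the configured tags."""
    doc = {"url": url}
    doc.update({k: v for k, v in meta.items() if k in tags})
    return doc


if __name__ == "__main__":
    tags = {"category"}  # hypothetical set of tags to propagate and index
    url, meta = parse_seed_line("http://example.com/\tcategory=news\tlang=en\n")
    children = propagate(meta, ["http://example.com/a"], tags)
    for child_url, child_meta in children.items():
        print(index_fields(child_url, child_meta, tags))
```

      In this sketch only tags named in `tags` travel to outlinks and into the index, so `lang` is dropped while `category` is kept. Whether the ported plugin filters tags the same way should be checked against the patch itself.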

      I have also added support for this through the NutchServer REST API. The patch is attached to this issue.

        Attachments

        1. Ninaad.Joshi.plugin.urlmeta.patch
          118 kB
          Ninaad Joshi

            People

            • Assignee:
              Unassigned
            • Reporter:
              ninaadj@gmail.com Ninaad Joshi