Nutch / NUTCH-2132

Publisher/Subscriber model for Nutch to emit events

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.13
    • Component/s: fetcher, REST_api
    • Labels:

      Description

      It would be nice to have a Pub/Sub model in Nutch to emit certain events (e.g. Fetcher events such as fetch-start and fetch-end, or a fetch report containing data such as the outlinks of the currently fetched URL, its score, etc.).

      A consumer of this functionality could use this data to generate real-time visualizations and crawl statistics without having to wait for the fetch round to finish.

      The REST API could expose an endpoint that responds with a URL to which a client can subscribe to receive the fetcher events.
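
      The idea above can be sketched as a small in-process event bus. This is only an illustration of the publish/subscribe pattern, not the code in the attached patches: the class names `FetcherEvent` and `FetcherEventBus`, the field names, and the event-type strings used as keys are all assumptions made for this example.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical event payload; the fields mirror the data named in the description. */
final class FetcherEvent {
    final String type;            // e.g. "fetch-start", "fetch-end", "fetch-report"
    final String url;             // URL being fetched
    final List<String> outlinks;  // outlinks found in the fetched page
    final float score;            // score of the fetched URL

    FetcherEvent(String type, String url, List<String> outlinks, float score) {
        this.type = type;
        this.url = url;
        this.outlinks = outlinks;
        this.score = score;
    }
}

/** Hypothetical in-process bus: subscribers register per event type. */
final class FetcherEventBus {
    private final Map<String, List<Consumer<FetcherEvent>>> subscribers =
            new ConcurrentHashMap<>();

    /** Registers a handler for one event type. */
    void subscribe(String type, Consumer<FetcherEvent> handler) {
        subscribers
                .computeIfAbsent(type, k -> new CopyOnWriteArrayList<>())
                .add(handler);
    }

    /** Delivers the event to every handler of its type; returns the handler count. */
    int publish(FetcherEvent event) {
        List<Consumer<FetcherEvent>> handlers =
                subscribers.getOrDefault(event.type, Collections.emptyList());
        for (Consumer<FetcherEvent> handler : handlers) {
            handler.accept(event);
        }
        return handlers.size();
    }
}
```

      In this sketch a consumer would call `subscribe("fetch-end", handler)` once, and the fetcher would call `publish(...)` as each URL completes, so statistics can be updated while the fetch round is still running. A real deployment would more likely hand events to an external message broker so that clients outside the crawler process can subscribe; that part is deliberately left out here.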

        Attachments

        1. NUTCH-2132.v2.patch
          18 kB
          Sujen Shah
        2. PubSub_routingkey.patch
          17 kB
          Sujen Shah
        3. NUTCH-2132.patch
          16 kB
          Sujen Shah

              People

              • Assignee: Chris A. Mattmann (chrismattmann)
              • Reporter: Sujen Shah (sujenshah)
              • Votes: 0
              • Watchers: 9

                Dates

                • Created:
                  Updated:
                  Resolved: