[NUTCH-932] Bulk REST API to retrieve crawl results as JSON

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: nutchgora
    • Fix Version/s: nutchgora
    • Component/s: REST_api
    • Labels: None

    Description

      It would be useful to be able to retrieve results of a crawl as JSON. There are a few things that need to be discussed:

      • how to return bulk results using Restlet (WritableRepresentation subclass?)
      • what should be the format of results?

      I think it would make sense to provide retrieval of a single record (by primary key), of all records, and of records within a range. Incidentally, this matches the capabilities of the Gora Query class well.
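
The three retrieval modes above can be sketched as follows. This is only an illustration, not the code from the attached patches: `CrawlResultSketch`, its methods, and the single `title` field are hypothetical, a `TreeMap` stands in for the Gora-backed store, and both ends of the range are treated as inclusive by assumption. Records are written to a `Writer` one at a time, so a bulk response never needs to be held in memory as a single string, which is roughly what a streaming Restlet representation would need.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a Gora-backed store: a sorted map from
// primary key to a single illustrative field ("title").
class CrawlResultSketch {
    private final NavigableMap<String, String> rows = new TreeMap<>();

    void put(String key, String title) {
        rows.put(key, title);
    }

    // Mode 1: a single record by primary key (empty array if absent).
    void writeOne(String key, Writer out) throws IOException {
        NavigableMap<String, String> hit = new TreeMap<>();
        String title = rows.get(key);
        if (title != null) {
            hit.put(key, title);
        }
        writeJson(hit, out);
    }

    // Mode 2: all records, in key order.
    void writeAll(Writer out) throws IOException {
        writeJson(rows, out);
    }

    // Mode 3: records whose key lies in [startKey, endKey].
    // Inclusive bounds are an assumption of this sketch.
    void writeRange(String startKey, String endKey, Writer out) throws IOException {
        writeJson(rows.subMap(startKey, true, endKey, true), out);
    }

    // Streams a JSON array record by record instead of building it up front.
    private static void writeJson(Map<String, String> records, Writer out) throws IOException {
        out.write('[');
        boolean first = true;
        for (Map.Entry<String, String> e : records.entrySet()) {
            if (!first) {
                out.write(',');
            }
            first = false;
            out.write("{\"key\":" + quote(e.getKey())
                    + ",\"title\":" + quote(e.getValue()) + "}");
        }
        out.write(']');
    }

    // Minimal JSON string escaping: quotes, backslashes, control characters.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

With a real Gora store, the three methods would presumably map onto setting a single key, no key constraint, or a start and end key on a query before executing it; the JSON layout shown here is just one possible answer to the open question about the result format.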

      Attachments

        1. NUTCH-932-4.patch
          88 kB
          Andrzej Bialecki
        2. NUTCH-932-3.patch
          80 kB
          Andrzej Bialecki
        3. NUTCH-932-2.patch
          66 kB
          Andrzej Bialecki
        4. NUTCH-932.patch
          54 kB
          Andrzej Bialecki
        5. NUTCH-932.patch
          40 kB
          Andrzej Bialecki
        6. db.formatted.gz
          155 kB
          Andrzej Bialecki
        7. NUTCH-932.patch
          37 kB
          Andrzej Bialecki

          People

            Assignee: Andrzej Bialecki
            Reporter: Andrzej Bialecki
