Nutch / NUTCH-932

Bulk REST API to retrieve crawl results as JSON

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: nutchgora
    • Fix Version/s: nutchgora
    • Component/s: REST_api
    • Labels:
      None

      Description

      It would be useful to be able to retrieve the results of a crawl as JSON. A few things need to be discussed:

      • how to return bulk results using Restlet (WritableRepresentation subclass?)
      • what should be the format of results?
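
      To make the two open questions above more concrete, here is a minimal sketch of writing bulk results as a JSON array one record at a time, so the whole result set never has to be held in memory. It is written against a plain java.io.Writer rather than Restlet, on the assumption that a representation subclass would delegate its write method to something similar. The class name JsonStreamSketch, the flat string-to-string record shape, and the field names "url" and "status" are all hypothetical; none of them come from the attached patches.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Nutch or Restlet.
class JsonStreamSketch {

  // Quote a string and escape the characters JSON does not allow raw.
  static String quote(String s) {
    StringBuilder sb = new StringBuilder("\"");
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      switch (c) {
        case '"':  sb.append("\\\""); break;
        case '\\': sb.append("\\\\"); break;
        case '\n': sb.append("\\n"); break;
        case '\r': sb.append("\\r"); break;
        case '\t': sb.append("\\t"); break;
        default:
          if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
          else sb.append(c);
      }
    }
    return sb.append('"').toString();
  }

  // Stream records to the writer as a JSON array of flat objects.
  // Only one record is held at a time; nothing is buffered here.
  static void writeAll(Iterator<Map<String, String>> records, Writer out)
      throws IOException {
    out.write('[');
    boolean firstRecord = true;
    while (records.hasNext()) {
      if (!firstRecord) out.write(',');
      firstRecord = false;
      out.write('{');
      boolean firstField = true;
      for (Map.Entry<String, String> e : records.next().entrySet()) {
        if (!firstField) out.write(',');
        firstField = false;
        out.write(quote(e.getKey()));
        out.write(':');
        out.write(quote(e.getValue()));
      }
      out.write('}');
    }
    out.write(']');
    out.flush();
  }

  // Convenience wrapper for small result sets and for trying the sketch out.
  static String toJson(Iterable<Map<String, String>> records) {
    StringWriter sw = new StringWriter();
    try {
      writeAll(records.iterator(), sw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sw.toString();
  }

  public static void main(String[] args) {
    Map<String, String> record = new LinkedHashMap<>();
    record.put("url", "http://example.com/");   // illustrative field
    record.put("status", "2");                  // illustrative field
    System.out.println(toJson(List.of(record)));
  }
}
```

      A flat array of objects is only one possible answer to the format question; nested fields or one JSON object per line would need a different writer but could keep the same record-at-a-time loop.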

      I think it would make sense to support three kinds of retrieval: a single record (by primary key), all records, and records within a key range. This incidentally matches the capabilities of the Gora Query class well.
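
      The three retrieval modes proposed here can be sketched over any key-ordered store. The example below uses an in-memory TreeMap as a stand-in and deliberately does not call Gora, so the class RetrievalModesSketch, its method names, and the inclusive handling of both range ends are assumptions made for illustration only.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for a key-ordered crawl store (key -> serialized record).
class RetrievalModesSketch {
  private final NavigableMap<String, String> store = new TreeMap<>();

  void put(String key, String record) {
    store.put(key, record);
  }

  // Mode 1: a single record by primary key; returns null if the key is absent.
  String get(String key) {
    return store.get(key);
  }

  // Mode 2: all records, in key order (read-only view).
  SortedMap<String, String> all() {
    return Collections.unmodifiableSortedMap(store);
  }

  // Mode 3: records with startKey <= key <= endKey.
  // Treating both ends as inclusive is an assumption of this sketch.
  SortedMap<String, String> range(String startKey, String endKey) {
    return Collections.unmodifiableSortedMap(
        store.subMap(startKey, true, endKey, true));
  }

  public static void main(String[] args) {
    RetrievalModesSketch db = new RetrievalModesSketch();
    db.put("a", "record-a");   // keys are illustrative only
    db.put("b", "record-b");
    db.put("c", "record-c");
    System.out.println(db.get("b"));
    System.out.println(db.all().keySet());
    System.out.println(db.range("b", "c").keySet());
  }
}
```

      In a REST mapping, these modes might correspond to a key path segment, a bare collection request, and start/end query parameters respectively, though the ticket itself does not specify the URL layout.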

        Attachments

        1. NUTCH-932-4.patch
          88 kB
          Andrzej Bialecki
        2. NUTCH-932-3.patch
          80 kB
          Andrzej Bialecki
        3. NUTCH-932-2.patch
          66 kB
          Andrzej Bialecki
        4. NUTCH-932.patch
          54 kB
          Andrzej Bialecki
        5. NUTCH-932.patch
          40 kB
          Andrzej Bialecki
        6. db.formatted.gz
          155 kB
          Andrzej Bialecki
        7. NUTCH-932.patch
          37 kB
          Andrzej Bialecki

            People

            • Assignee: ab Andrzej Bialecki
            • Reporter: ab Andrzej Bialecki
            • Votes: 0
            • Watchers: 0
