Nutch / NUTCH-932

Bulk REST API to retrieve crawl results as JSON

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: nutchgora
    • Fix Version/s: nutchgora
    • Component/s: REST_api
    • Labels:
      None

      Description

      It would be useful to be able to retrieve results of a crawl as JSON. There are a few things that need to be discussed:

      • how to return bulk results using Restlet (WritableRepresentation subclass?)
      • what should be the format of results?
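
The first question above is about returning bulk results without building the whole response in memory. A minimal sketch of that idea, assuming a hypothetical record shape with `key` and `status` fields (the issue leaves the result format open) and using a plain `java.io.Writer` in place of whatever output channel the REST layer provides. No Restlet classes are used here.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: stream records as a JSON array, one record at a time.
class StreamingJsonSketch {

    // Escape the characters that must be escaped inside a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Only the current record is held in memory; "key" and "status" are assumed field names.
    static void writeAll(Iterator<Map.Entry<String, String>> records, Writer out) {
        try {
            out.write('[');
            boolean first = true;
            while (records.hasNext()) {
                Map.Entry<String, String> r = records.next();
                if (!first) out.write(',');
                first = false;
                out.write("{\"key\":\"" + escape(r.getKey())
                        + "\",\"status\":\"" + escape(r.getValue()) + "\"}");
            }
            out.write(']');
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("http://example.com/", "fetched");
        rows.put("http://example.com/a", "unfetched");
        StringWriter sw = new StringWriter();
        writeAll(rows.entrySet().iterator(), sw);
        System.out.println(sw);
    }
}
```

Because the writer consumes an `Iterator`, the same method would work over a store scan of any size; the in-memory map in `main` is only there to make the sketch runnable.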

      I think it would make sense to provide single-record retrieval (by primary key), retrieval of all records, and retrieval of records within a range. This incidentally matches the capabilities of the Gora Query class well.
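
The three retrieval modes can all be expressed as one request with an optional start key and end key. A minimal sketch of that mapping, using a `TreeMap` as a stand-in for the crawl store; it does not call Gora, and the class name, method names, and inclusive bounds are assumptions made for illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: one (startKey, endKey) selection covers all three modes.
class CrawlQuerySketch {

    // A null bound means "unbounded"; both bounds are inclusive (an assumption).
    static NavigableMap<String, String> select(NavigableMap<String, String> store,
                                               String startKey, String endKey) {
        if (startKey == null && endKey == null) return store;
        if (startKey == null) return store.headMap(endKey, true);
        if (endKey == null) return store.tailMap(startKey, true);
        return store.subMap(startKey, true, endKey, true);
    }

    // Single record by primary key: a range whose start and end are the same key.
    static NavigableMap<String, String> byKey(NavigableMap<String, String> store, String key) {
        return select(store, key, key);
    }

    // All records: no bounds at all.
    static NavigableMap<String, String> all(NavigableMap<String, String> store) {
        return select(store, null, null);
    }

    // Records within a key range.
    static NavigableMap<String, String> range(NavigableMap<String, String> store,
                                              String startKey, String endKey) {
        return select(store, startKey, endKey);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> store = new TreeMap<>();
        store.put("com.example/a", "fetched");
        store.put("com.example/b", "unfetched");
        store.put("org.example/", "fetched");
        System.out.println(byKey(store, "com.example/b").keySet());
        System.out.println(range(store, "com.example/a", "com.example/b").keySet());
        System.out.println(all(store).keySet());
    }
}
```

Treating single-key lookup as a degenerate range keeps the REST layer to one code path; a real implementation might instead use a dedicated key lookup for efficiency.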

      Attachments

      1. NUTCH-932-4.patch (88 kB, Andrzej Bialecki)
      2. NUTCH-932-3.patch (80 kB, Andrzej Bialecki)
      3. NUTCH-932-2.patch (66 kB, Andrzej Bialecki)
      4. NUTCH-932.patch (37 kB, Andrzej Bialecki)
      5. NUTCH-932.patch (40 kB, Andrzej Bialecki)
      6. NUTCH-932.patch (54 kB, Andrzej Bialecki)
      7. db.formatted.gz (155 kB, Andrzej Bialecki)

        Activity

        Each change is listed as Field: Original Value -> New Value.

        Andrzej Bialecki created issue
        Andrzej Bialecki made changes
          Attachment: (none) -> NUTCH-932.patch [ 12458834 ]
        Andrzej Bialecki made changes
          Attachment: (none) -> db.formatted.gz [ 12458836 ]
        Andrzej Bialecki made changes
          Attachment: (none) -> NUTCH-932.patch [ 12458844 ]
        Andrzej Bialecki made changes
          Attachment: (none) -> NUTCH-932.patch [ 12458983 ]
        Andrzej Bialecki made changes
          Attachment: (none) -> NUTCH-932-2.patch [ 12459450 ]
        Andrzej Bialecki made changes
          Attachment: (none) -> NUTCH-932-3.patch [ 12459460 ]
        Andrzej Bialecki made changes
          Attachment: (none) -> NUTCH-932-4.patch [ 12460443 ]
        Andrzej Bialecki made changes
          Status: Open [ 1 ] -> Resolved [ 5 ]
          Fix Version/s: (none) -> 2.0 [ 12314893 ]
          Resolution: (none) -> Fixed [ 1 ]
        Lewis John McGibbney made changes
          Status: Resolved [ 5 ] -> Closed [ 6 ]

          People

          • Assignee: Andrzej Bialecki
          • Reporter: Andrzej Bialecki
          • Votes: 0
          • Watchers: 0
