CouchDB / COUCHDB-973

Return 410 when GETing a previously deleted document (rather than 404)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      When you GET a nonexistent doc you get (as you should) a 404 Not Found error. However, if you GET a document that previously existed and has since been deleted, you also get a 404 response. It would be more informative (IMO) to use the 410 Gone response code. 410 Gone is intended for exactly this use case, and it could have some value to CouchDB developers who need to know the document did exist.

      CouchDB is already halfway there: the body of the 404 response states that the document did exist (at least prior to compaction), so outputting a 410 (again, prior to compaction) would hopefully be a trivial patch.

        Activity

        Jan Lehnardt added a comment -

        Feel free to reopen with more/newer arguments.

        Jan Lehnardt added a comment -

        410 seems oddly/poorly specified. “What would Roy do?”

        I'm siding with Paul on this one.

        Benoit Chesneau added a comment -

        I dunno. If we go for 410, would that mean the following?

        • Get a 410, then eventually get a new body
        • Then the server sends the deleted revision

        And if compacted, with no revision left, a 404?

        Paul Joseph Davis added a comment -

        The fact that 410s can be cached makes me even more hesitant to switch over to them: unless clients are sending the ETags from possibly unknown previous entries, a cached 410 could end up masking a re-created document. That seems like the reason the spec says to use 404 if we know that a resource can be recreated.

        We don't keep anything similar at the db level so I don't think that's really affected by this.

        Benjamin Young added a comment -

        From RFC 2616:
        "A response received with a status code of 200, 203, 206, 300, 301 or
        410 MAY be stored by a cache and used in reply to a subsequent
        request, subject to the expiration mechanism, unless a cache-control
        directive prohibits caching."

        Cache-Control is currently set to must-revalidate, so...
        "When the must-revalidate
        directive is present in a response received by a cache, that cache
        MUST NOT use the entry after it becomes stale to respond to a
        subsequent request without first revalidating it with the origin
        server. (I.e., the cache MUST do an end-to-end revalidation every
        time, if, based solely on the origin server's Expires or max-age
        value, the cached response is stale.)"

        So, in the spirit of future and unexpected innovation, I'd like to see this included...but I'm happy to discuss it further.

        Also, there are additional uses for 410 if it were available on /db URLs as well, but I'm not sure if CouchDB keeps that information around.
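
The caching rules quoted from RFC 2616 in this comment can be sketched as a small Python model. This is only an illustration: the `CachedResponse` class, its field names, and the `tolerate_stale` flag are hypothetical, and treating a response that carries no Expires or max-age as having a zero freshness lifetime is an assumption, not something the thread states.

```python
from dataclasses import dataclass

# Status codes the quoted RFC 2616 passage lists as storable by default.
CACHEABLE_BY_DEFAULT = {200, 203, 206, 300, 301, 410}


@dataclass
class CachedResponse:
    """Hypothetical, simplified view of a stored response."""
    status: int
    age_seconds: float       # time elapsed since the response was stored
    max_age_seconds: float   # freshness lifetime; 0 if none was given (assumption)
    must_revalidate: bool    # Cache-Control: must-revalidate was present


def may_store(status: int, no_store: bool = False) -> bool:
    """A cache MAY store the response unless a directive prohibits it."""
    return status in CACHEABLE_BY_DEFAULT and not no_store


def needs_revalidation(entry: CachedResponse, tolerate_stale: bool = False) -> bool:
    """Decide whether the cache has to check with the origin server first."""
    fresh = entry.age_seconds < entry.max_age_seconds
    if fresh:
        return False
    if entry.must_revalidate:
        # Once stale, the entry MUST NOT be reused without revalidation.
        return True
    # Without must-revalidate, this model lets a cache opt into serving stale.
    return not tolerate_stale
```

Under this model, a stored 410 that carries must-revalidate and no freshness lifetime is rechecked with the origin server on every request, so a re-created document would show up on the next GET.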

        Benoit Chesneau added a comment -

        I would say that we don't have to guess what a developer could or could not do with it. We are always surprised. The real question is: is 410 the appropriate response? If yes, then we should go for it. Reading the spec, it seems we should indeed send a 410.

        Also, a side note: we aren't a web server but a database using HTTP as a medium. Focusing on that makes me think that 410 may be OK too.

        Robert Newson added a comment -

        I accept the argument that the ability to create a new resource at the same location at any time makes it difficult to call the deletion 'permanent', but consider the normal case of a web server. 410 is returned to tell the caller to clean up whatever lingering references they might have to the resource (it's not coming back). That a new resource can appear at the same location, yielding a 200 response, just means that software makes a new link.

        However, this is all hand-waving. What software, if any, would be enhanced by a 410 response over a 404 response? What would be the advantage? Is it wrong in principle to expose the difference between deleted and non existent? By exposing the information, are we encouraging clients to change their behavior accordingly? And what would that behavior be?

        In summary, I think 410 could be justified as a response code but it's not clear what value there is in exposing the difference and it's not clear if it's 'right' to expose an internal detail this way.

        Benoit Chesneau added a comment -

        Indeed. That's something I've introduced recently in couchbeam: I'm checking the expected header and returning the appropriate error. Using 410 would allow the user to ask for more eventually. Also, 410 should only happen when no compaction has occurred.

        Randall Leeds added a comment -

        Since the document may be recreated at any time the server "does not know...whether or not the condition is permanent". Seems pretty clear cut to me.
        Also, since this is available in the response body clients that don't care can just look at the response code while clients that do care can inspect the body.
        This sounds like a good argument to make to a client library author when asking them to expose the response body of a 404 (if they hide the HTTP details from you generally).

        Paul Joseph Davis added a comment -

        I almost went to say +1 but after reading the description of 410 again, I noticed this blurb:

        > If the server does not know, or has no facility to determine, whether
        > or not the condition is permanent, the status code 404 (Not Found)
        > SHOULD be used instead.

        http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.11

        I don't think it's super clear cut for either one. As mentioned, the

        {"not_found": "deleted"}

        can be used to check the same condition, so it's not that we're missing any capability other than which layer we may want to deal with this particular case.

        If I were to make a snap decision I think I would vote in favor of keeping 404, but I'm not averse to hearing arguments in favor of 410.
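
As a rough illustration of checking the body rather than the status code, a client could tell "deleted" apart from "never existed" along these lines. This is a sketch under assumptions: `classify_not_found` is a hypothetical helper, and it accepts both the shorthand `{"not_found": "deleted"}` quoted in this comment and a body shaped like `{"error": "not_found", "reason": "deleted"}`, which is assumed here to be what the server actually sends.

```python
import json


def classify_not_found(status: int, body: str) -> str:
    """Return 'ok', 'deleted', 'missing', or 'other' for a document GET.

    Hypothetical helper: it only looks at the status code and the JSON body.
    """
    if status == 200:
        return "ok"
    if status != 404:
        return "other"
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # An unparseable 404 body tells us nothing beyond "not found".
        return "missing"
    if not isinstance(payload, dict):
        return "missing"
    # Accept the assumed full shape first, then the shorthand from the thread.
    reason = payload.get("reason", payload.get("not_found"))
    return "deleted" if reason == "deleted" else "missing"
```

Clients that don't care about the distinction can keep branching on the 404 alone, while clients that do care only need access to the response body.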

        Robert Newson added a comment -

        I'll re-read the relevant parts of RFC 2616 but this sounds good to me so far.


          People

          • Assignee:
            Unassigned
            Reporter:
            Benjamin Young