CouchDB / COUCHDB-1039

"High ASCII" characters in a PUT URL cause the db to misbehave

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.0.1, 1.2
    • Fix Version/s: 1.2
    • Component/s: None
    • Labels: None
    • Skill Level: Dont Know

      Description

      I've tried to PUT a doc with an id containing "high ASCII" characters, and CouchDB (correctly, I imagine) refuses to save it and responds with an error. But any subsequent query to the db's special _all_docs document returns a double response (200 and 500), much like COUCHDB-48, with a bad_utf8_character_code error.

      Tested on both 1.0.1 (from the Ubuntu Maverick repos) and svn (1.2.0a106148).
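
      To illustrate why such an id is a problem, here is a minimal Python sketch. It assumes "high ASCII" means a single byte above 0x7F, such as a Latin-1 "é" sent as %E9. The database name, the path, and the helper names are hypothetical, and this is not the script from badtext.tar.gz. It only shows that the id taken from the URL is not valid UTF-8, which is consistent with the bad_utf8_character_code error reported above.

```python
from urllib.parse import unquote_to_bytes


def doc_id_bytes(path: str) -> bytes:
    """Percent-decode the last path segment into raw bytes (hypothetical helper)."""
    return unquote_to_bytes(path.rsplit("/", 1)[-1])


def is_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "é" as a lone Latin-1 byte (0xE9): the "high ASCII" case, not valid UTF-8.
bad_id = doc_id_bytes("/testdb/caf%E9")

# "é" as the two-byte UTF-8 sequence 0xC3 0xA9: a well-formed id.
good_id = doc_id_bytes("/testdb/caf%C3%A9")

print(bad_id, is_utf8(bad_id))
print(good_id, is_utf8(good_id))
```

      A client that percent-encodes the UTF-8 form of the id avoids the problem, whatever the server does with malformed input.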

      1. badtext.tar.gz
        1 kB
        Thiago Arrais
      2. validate_utf8_docid.patch
        2 kB
        Paul Joseph Davis

        Activity

        Thiago Arrais added a comment -

        I've modified this script from COUCHDB-345 that demonstrates the issue.

        Paul Joseph Davis added a comment -

        Bug verified, and I think I've got the fix. Will update shortly.

        Paul Joseph Davis added a comment -

        We weren't validating that document IDs pulled from URLs are valid UTF-8. This patch adds a check to couch_doc:validate_id that walks the ID, using code similar to what Adam wrote for mochijson2:tokenize_string_fast/2.

        The only thing that worries me is that this is in the write path for new docs, but AFAICT it's an unavoidable check. Someone may want to re-examine whether it belongs in couch_doc:validate_id or in the actual PUT request handler.
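
        The kind of check described above can be sketched in Python as a single pass over the id bytes. This is an illustration only, not the Erlang code in validate_utf8_docid.patch: the function names is_valid_utf8 and validate_doc_id and the error message are hypothetical, and the byte ranges follow the standard definition of well-formed UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Walk the bytes once and check that every sequence is well-formed UTF-8."""
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        if lead <= 0x7F:  # plain ASCII, one byte
            i += 1
            continue
        # For each lead byte: number of continuation bytes, and the
        # allowed range of the first continuation byte.
        if 0xC2 <= lead <= 0xDF:
            extra, lo, hi = 1, 0x80, 0xBF
        elif lead == 0xE0:  # reject overlong 3-byte forms
            extra, lo, hi = 2, 0xA0, 0xBF
        elif lead == 0xED:  # reject UTF-16 surrogates
            extra, lo, hi = 2, 0x80, 0x9F
        elif 0xE1 <= lead <= 0xEF:
            extra, lo, hi = 2, 0x80, 0xBF
        elif lead == 0xF0:  # reject overlong 4-byte forms
            extra, lo, hi = 3, 0x90, 0xBF
        elif 0xF1 <= lead <= 0xF3:
            extra, lo, hi = 3, 0x80, 0xBF
        elif lead == 0xF4:  # nothing above U+10FFFF
            extra, lo, hi = 3, 0x80, 0x8F
        else:  # stray continuation byte or invalid lead byte
            return False
        if i + extra >= n:  # sequence is truncated
            return False
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, extra + 1):
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += extra + 1
    return True


def validate_doc_id(raw_id: bytes) -> str:
    """Hypothetical write-path check: reject ids that are not valid UTF-8."""
    if not is_valid_utf8(raw_id):
        raise ValueError("document id must be valid UTF-8")
    return raw_id.decode("utf-8")
```

        The pass is linear in the length of the id, so the per-write cost of a check like this grows only with the id size.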

        Thiago Arrais added a comment -

        Patch verified and works. Thanks, Paul!

        Paul Joseph Davis added a comment -

        Applied in 1064417


          People

          • Assignee: Unassigned
          • Reporter: Thiago Arrais
          • Votes: 0
          • Watchers: 1
