Thrift / THRIFT-2948

Python TJSONProtocol doesn't handle structs with binary fields containing invalid unicode.


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.9.2
    • Fix Version/s: 0.10.0
    • Component/s: Python - Library
    • Labels:
      None
    • Environment:

      Python 2.7.6, Mac OS X Yosemite

      Description

      Serializing a struct to JSON using TJSONProtocol can fail with a UnicodeDecodeError if the struct contains a binary field whose bytes are not valid UTF-8 (for example '\xff').
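
The underlying condition can be illustrated without Thrift: the example bytes are not a valid UTF-8 sequence, so any step that tries to decode them as text raises an error. A minimal sketch follows; the helper name is_valid_utf8 is hypothetical and not part of the Thrift library.

```python
def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


# Same payload as in the report; 0xff can never appear in valid UTF-8.
blob = b'\xff\xff\x00\xaa'

print(is_valid_utf8(b'plain ascii'))
print(is_valid_utf8(blob))
```

A binary field that passes this check would serialize without error, which is probably why the problem only shows up for certain payloads.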

      To recreate:
      Assume you have a TestStruct defined as:

      {1: optional binary blob}


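      from thrift.protocol import TJSONProtocol
      from thrift.transport import TTransport
      from teststruct.ttypes import TestStruct  # generated code (gen-py must be on sys.path)

      def test_json_serialization():
        # The blob is not valid UTF-8, so writing it as a JSON string fails.
        thrift_obj = TestStruct(b'\xff\xff\x00\xaa')
        transport = TTransport.TMemoryBuffer()
        protocol = TJSONProtocol.TJSONProtocol(transport)
        thrift_obj.write(protocol)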
      

      Running this will give the following exception:

      Traceback (most recent call last):
        File "/Users/shaunlindsay/sona/simplethrift/test_suite.py", line 32, in test_json_serialize_deserialize
          serialized = simplethrift.serialize_json(original)
        File "/Users/shaunlindsay/sona/simplethrift/simplethrift.py", line 71, in serialize_json
          thrift_obj.write(protocol)
        File "testfiles/gen-py/teststruct/ttypes.py", line 84, in write
          oprot.writeString(self.blob)
        File "/Library/Python/2.7/site-packages/thrift/protocol/TJSONProtocol.py", line 473, in writeString
          self.writeJSONString(string)
        File "/Library/Python/2.7/site-packages/thrift/protocol/TJSONProtocol.py", line 177, in writeJSONString
          self.trans.write(json.dumps(string))
        File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 243, in dumps
          return _default_encoder.encode(obj)
        File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 201, in encode
          return encode_basestring_ascii(o)
      UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte
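
One common way to carry arbitrary bytes in JSON is to base64-encode them first, so that the JSON encoder only ever sees ASCII text. The sketch below shows that idea in isolation. It is an assumption about one possible approach, not the patch applied for this issue, and the helper names encode_binary_for_json and decode_binary_from_json are hypothetical.

```python
import base64
import json


def encode_binary_for_json(blob):
    """Base64-encode raw bytes and return them as a JSON string literal."""
    # b64encode yields ASCII-only bytes, so decoding as ASCII cannot fail.
    return json.dumps(base64.b64encode(blob).decode('ascii'))


def decode_binary_from_json(text):
    """Reverse encode_binary_for_json and return the original bytes."""
    return base64.b64decode(json.loads(text))


blob = b'\xff\xff\x00\xaa'
encoded = encode_binary_for_json(blob)

# The round trip preserves the bytes, including those that are not valid UTF-8.
assert decode_binary_from_json(encoded) == blob
```

The trade-off is that the encoded field is no longer human-readable in the JSON output and grows by roughly a third.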
      


              People

              • Assignee:
                Nobuaki Sukegawa (nsuke)
              • Reporter:
                Shaun Lindsay (srlindsay)
              • Votes:
                2
              • Watchers:
                4
