According to the specs:
a string is encoded as a long followed by that many bytes of UTF-8 encoded character data.
However, the code does not currently adhere to this:
The first thing the code does here is read an int value, not a long. Because the size uses a variable-length encoding, lengths that fit in an int decode the same way either way, so this mostly works. However, there may be edge cases where a serializer writes a large length value, either erroneously or maliciously. Let us detect such cases gracefully and adhere more closely to the spec.
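As a rough illustration of the intended behaviour, here is a minimal, self-contained sketch. It is not the actual decoder code. It assumes the length is a zig-zag, variable-length encoded long, and the names `StringLengthCheck`, `readLong`, `readString` and `MAX_STRING_BYTES` are hypothetical, chosen only for this example. The idea is to decode the full 64-bit length first and then reject values that are negative or too large, instead of reading a 32-bit value up front.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class StringLengthCheck {
    // Hypothetical upper bound for this sketch; a real decoder may want it configurable.
    static final long MAX_STRING_BYTES = Integer.MAX_VALUE - 8;

    /** Decodes one zig-zag, variable-length encoded 64-bit value (assumed wire format). */
    static long readLong(ByteBuffer in) {
        long raw = 0;
        int shift = 0;
        while (true) {
            if (shift > 63) {
                throw new IllegalArgumentException("varint is longer than 10 bytes");
            }
            int b = in.get() & 0xFF;            // throws BufferUnderflowException at end of input
            raw |= (long) (b & 0x7F) << shift;  // low 7 bits carry data
            if ((b & 0x80) == 0) {              // high bit clear marks the last byte
                break;
            }
            shift += 7;
        }
        return (raw >>> 1) ^ -(raw & 1);        // undo zig-zag
    }

    /** Reads the length as a long, validates it, then reads that many UTF-8 bytes. */
    static String readString(ByteBuffer in) {
        long length = readLong(in);
        if (length < 0) {
            throw new IllegalArgumentException("negative string length: " + length);
        }
        if (length > MAX_STRING_BYTES) {
            throw new IllegalArgumentException("string length too large: " + length);
        }
        if (length > in.remaining()) {
            throw new IllegalArgumentException(
                "string length " + length + " exceeds remaining input " + in.remaining());
        }
        byte[] bytes = new byte[(int) length];  // safe cast: bounds checked above
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Length 3 is zig-zag encoded as 6, followed by the UTF-8 bytes of "foo".
        ByteBuffer ok = ByteBuffer.wrap(new byte[] {0x06, 'f', 'o', 'o'});
        System.out.println(readString(ok));

        // Zig-zag varint for 2^31, a length that does not fit in an int.
        ByteBuffer tooLarge = ByteBuffer.wrap(
            new byte[] {(byte) 0x80, (byte) 0x80, (byte) 0x80, (byte) 0x80, 0x10});
        try {
            readString(tooLarge);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking the length against the remaining input only works when the input size is known, as with the `ByteBuffer` used here. For a streaming source, the fixed cap is the only check available before allocating.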