THRIFT-1841

NodeJS Thrift incorrectly parses non-UTF8-string types

    Description

      *Edit*: See my comment below.

      When a double or float is used in a map (as key or value), list, or set, the field is decoded as a UTF-8 string. The decoding mangles bytes that are not valid UTF-8 and can add extra bytes.

      For example:

      The bytes of a map<double, double>, as returned by the Thrift call:

      00 01 00 08 3f f4 00 00 00 00 00 00 00 08 40 02 00 00 00 00 00 00
      

      After the field has been parsed as a UTF-8 string:

      00 01 00 08 3f 3f 00 00 00 00 00 00 00 08 40 02 00 00 00 00 00 00
      

      As you can see, there is an incorrect byte (a 3f where the f4 should be) and an extra 00. For reference, this value was map<double, double> = {1.25: 2.25}. The same behavior occurs for floats. The f4 byte is decimal 244, which is outside the ASCII range and, I believe, is not valid UTF-8 in this position.

      The actual value of the field becomes:

        value: '\u0000\u0002\u0000\b??\u0000\u0000\u0000\u0000\u0000\u0000\u0000\b@\u0002\u0000\u0000\u0000\u0000\u0000\u0000''
      

      Here \b is the byte 08, the first ? is the original 3f byte (ASCII '?'), and the second ? is the unknown character that replaced the f4 byte.
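A minimal Node.js sketch of the substitution described above. It assumes only the standard `Buffer` API and does not reproduce the actual node-thrift code path. It takes the 8 bytes of the double 1.25 from the first dump, reads them directly, and then decodes them as UTF-8 for comparison.

```javascript
// Raw big-endian IEEE 754 bytes of the double 1.25 (taken from the dump above).
const raw = Buffer.from([0x3f, 0xf4, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]);

// Reading the bytes directly as a double gives the expected value.
const direct = raw.readDoubleBE(0);

// Decoding as UTF-8 replaces the undecodable f4 byte with U+FFFD.
const asText = raw.toString('utf8');

// Re-encoding the string no longer yields the original 8 bytes.
const reencoded = Buffer.from(asText, 'utf8');

console.log(direct);                             // 1.25
console.log(asText.charCodeAt(1).toString(16));  // fffd
console.log(reencoded.length);                   // 10
```

Once the value has passed through a string, the original f4 byte cannot be recovered, so the field has to be kept as a `Buffer` for binary data.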

      I have seen cases where there are extra bytes added in, which breaks the parsing based on byte size:

      00 01 00 08 40 24 48 72 c2 b0 20 c3 84 c2 9c 00 08 40 34 c3 bc c3 93 5a c2 85 c2 87 c2 94
      

      Here the map value was {10.1415: 20.9876}. On a list or set, using either value also yields extra bytes.
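The extra bytes in that dump are consistent with each byte above 7f being treated as a single character and then re-encoded as UTF-8. The sketch below illustrates this under two assumptions: the "raw" bytes are reconstructed by hand from the inflated dump, and the latin1-then-UTF-8 round trip is a hypothetical failure path, not confirmed node-thrift behavior.

```javascript
// Assumed original collection bytes for {10.1415: 20.9876}:
// count (00 01), then length-prefixed (00 08) key and value doubles.
const raw = Buffer.from(
  '00010008' + '40244872b020c49c' + '0008' + '4034fcd35a858794',
  'hex'
);

// Hypothetical failure path: one character per byte, then UTF-8 encoding.
// Every byte >= 0x80 becomes a two-byte UTF-8 sequence.
const inflated = Buffer.from(raw.toString('latin1'), 'utf8');

console.log(raw.length, inflated.length); // 22 30
console.log(inflated.toString('hex').match(/../g).join(' '));
```

The second line of output matches the 30-byte dump above byte for byte. This suggests the payload was converted to a string and back at some point, although it does not show where that happens.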

      This breaks any parsing that relies on the byte length of the field, because a variable number of extra bytes is added to the key or value of a map and to any element of a list. I believe this could also happen with large integer values.
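To show why length-based parsing fails, here is a sketch of a decoder for the layout that the dumps suggest: a 2-byte element count followed by 2-byte length prefixes, all big-endian. The layout is inferred from the hex dumps only, and `decodeDoubleMap` is a hypothetical helper, not part of node-thrift or any Cassandra driver.

```javascript
// Hypothetical helper: decode a map<double, double> from raw collection bytes.
// Assumed layout: uint16 count, then uint16 length + payload per key and value.
function decodeDoubleMap(buf) {
  let offset = 0;
  const readElement = () => {
    const len = buf.readUInt16BE(offset);
    if (len !== 8) throw new RangeError(`expected 8-byte double, got ${len}`);
    const value = buf.readDoubleBE(offset + 2);
    offset += 2 + len;
    return value;
  };
  const count = buf.readUInt16BE(offset);
  offset += 2;
  const result = new Map();
  for (let i = 0; i < count; i++) {
    const key = readElement();
    result.set(key, readElement());
  }
  if (offset !== buf.length) throw new RangeError('unexpected trailing bytes');
  return result;
}

// Untouched bytes for {1.25: 2.25} decode cleanly.
const clean = Buffer.from(
  '00010008' + '3ff4000000000000' + '0008' + '4002000000000000', 'hex');
const decoded = decodeDoubleMap(clean);
console.log(decoded.get(1.25));

// The 30-byte inflated dump fails: the second length prefix is misaligned.
const inflated = Buffer.from(
  '0001000840244872c2b020c384c29c00084034c3bcc3935ac285c287c294', 'hex');
let failure = null;
try {
  decodeDoubleMap(inflated);
} catch (err) {
  failure = err;
}
console.log(failure instanceof RangeError);
```

With the inflated bytes, the key still reads as 8 bytes, but the next length prefix lands in the middle of the expanded key, so the decoder rejects the input.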

      It seems to me that when the "ftype" is parsed (int16) before the actual field, it returns a TYPE value of 11 (string) instead of the proper value for a map, set, or list.

      For reference, the table definition and an example insert:

      CREATE TABLE sample_map (
          id text PRIMARY KEY, 
          map_col_text map < text, text >, 
          map_col_int map < int, text >, 
          map_col_float map < float, float >,
          map_col_double map < double, double >
      );
      
      INSERT INTO sample_map (id, map_col_double) VALUES('DOUBLE_ROW_SINGLE', {10.1415: 20.9876});
      

      Not sure if it matters, but this was using CQL3. Also, we are not seeing this on the C++ generated Thrift interface.

      Versions:

      cqlsh:orion> show version;
      [cqlsh 2.3.0 | Cassandra 1.2.0 | CQL spec 3.0.0 | Thrift protocol 19.35.0]
      
      $ thrift --version
      Thrift version 0.9.0
      
        "name": "node-thrift",
        "description": "node.js bindings for the Apache Thrift RPC system",
        "homepage": "http://thrift.apache.org/",
        "repository": {
          "type": "svn",
          "url": "http://svn.apache.org/repos/asf/thrift/trunk/"
        },
        "version": "1.0.0-dev",
      

      The issue also appears in the 0.9.0 version of the thrift library.

            People

              henrique Henrique Mendonca
              nabeel Mohamed Yoosuf Mohamed Nabeel