HBase / HBASE-2475

[stargate] Required ordering of JSON name/value pairs when performing Insert/Update

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None

      Description

      From Tyler Coffin up on hbase-user@

      I am using the Stargate REST interface to HBase for inserting data. When using JSON to transmit the query content, I have found that a specific ordering of name/value pairs within the JSON string is required for the query to succeed. Otherwise the server responds with an 'HTTP/1.1 500 Row key is invalid' error, for example when "key" and "Cell" are reversed.

      Example:
      This string receives the above error:

      {"Row":[{"Cell":[{"column":"bWVzc2FnZTptc2c=","$":"Zm9vYmFy"}],"key":"MTIzNAo="}]}
      

      This is the valid equivalent string:

      {"Row":[{"key":"MTIzNAo=","Cell":[{"column":"bWVzc2FnZTptc2c=","$":"Zm9vYmFy"}]}]}
      

      As you can see the only difference between these two instances is that the "key" and "Cell" name/value pairs have their order reversed.
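To make the comparison concrete, here is a minimal Python sketch (not part of the original report) that parses both strings from the example above, checks that they describe the same JSON document, and Base64-decodes the fields. The variable names and the `decode` helper are illustrative only.

```python
import base64
import json

# The two request bodies from the report, copied verbatim.
failing = '{"Row":[{"Cell":[{"column":"bWVzc2FnZTptc2c=","$":"Zm9vYmFy"}],"key":"MTIzNAo="}]}'
working = '{"Row":[{"key":"MTIzNAo=","Cell":[{"column":"bWVzc2FnZTptc2c=","$":"Zm9vYmFy"}]}]}'

# Parsed as JSON, the two documents compare equal: member order inside
# an object does not affect equality of the parsed structures.
same_document = json.loads(failing) == json.loads(working)


def decode(b64_text):
    """Decode one Base64 field of the payload into raw bytes."""
    return base64.b64decode(b64_text)


row = json.loads(working)["Row"][0]
row_key = decode(row["key"])               # row key bytes
column = decode(row["Cell"][0]["column"])  # family:qualifier bytes
value = decode(row["Cell"][0]["$"])        # cell value bytes

print(same_document, row_key, column, value)
```

The strings differ only as text; once parsed they are the same document, which is why a client library that does not guarantee member order can produce either form.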

      In the equivalent XML notation, the ordering is specifically required per the schema. However, with JSON Objects (i.e. name/value pairs) order is not required (JSON Arrays are ordered, but Objects are not). Some JSON libraries preserve the ordering of Objects, but not all do, which is how I discovered this problem in the first place: I was using the Perl JSON library, which does not guarantee order.
      I'm unsure if this is a bug in the REST implementation or an inconvenient ambiguity in the JSON specification. Regardless, I thought I'd share this discovery with the community for feedback (or at the very least to document this for users' future reference).
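Until the server side is fixed, one possible client-side workaround is to build the request body with an explicit member order. The sketch below is an assumption-laden illustration, not an official client: `build_row_payload` and `b64` are hypothetical helper names, and the sketch assumes the server accepts "key" before "Cell" and "column" before "$", as in the working example above. It relies on Python dicts keeping insertion order (Python 3.7+) and on `json.dumps` emitting members in that order when `sort_keys` is not set.

```python
import base64
import json


def b64(raw: bytes) -> str:
    """Base64-encode raw bytes into the ASCII text form used in the payload."""
    return base64.b64encode(raw).decode("ascii")


def build_row_payload(row_key: bytes, column: bytes, value: bytes) -> str:
    """Serialize a single-cell row with "key" first and "$" last.

    Hypothetical helper: the ordering mirrors the request body that the
    report shows to be accepted.
    """
    cell = {"column": b64(column), "$": b64(value)}  # "$" inserted last
    row = {"key": b64(row_key), "Cell": [cell]}      # "key" inserted first
    # Compact separators reproduce the whitespace-free form used in the report.
    return json.dumps({"Row": [row]}, separators=(",", ":"))


payload = build_row_payload(b"1234\n", b"message:msg", b"foobar")
print(payload)
```

Building the dictionaries in the required order keeps the workaround in one place, so callers never depend on whatever order their own data structures happen to produce.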

      For reference this is the table schema for the above query:

      {NAME => 'reftrack', FAMILIES => [{NAME => 'message', COMPRESSION =>
      'NONE', VERSIONS => '1', TTL => '2147483647', BLOCKSIZE => '65536',
      IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
      
      1. HBASE-2475.patch (7 kB, Andrew Purtell)

          Activity

          Andrew Purtell added a comment -

          Thanks for the report, very helpful.

          In the equivalent XML notation, the ordering is specifically required per the schema.

          ... and Jersey adds a marshaller and unmarshaller to the JAXB framework to produce JSON. This is an artifact of jersey-json or something dumb we did when hooking up JAXB. I'll write up some unit tests and look at this soon.

          stack added a comment -

          Bulk move of 0.20.5 issues into 0.21.0 after vote that we merge branch into TRUNK up on list.

          Andrew Purtell added a comment -

          Removed ordering directives on some bindings as part of HBASE-2542 and HBASE-2567.

          Andrew Purtell added a comment -

          Some JSON field ordering issues remain according to reports. These may be an artifact of how JSON is hooked up to JAXB. The weird '$' special field for specifying a value is due to how the bindings work, for example. Reopening for more investigation and at least a documentation update.

          Andrew Purtell added a comment - - edited

          If the "$" value field does not come last in a CellModel, Jersey won't deserialize the model correctly. We can

          1. Document this clearly and also add a troubleshooting section entry.

          2. As an alternative, elsewhere we use Jackson as a JSON serializer/deserializer. Jackson seems widely considered to be better behaved. JacksonJsonProvider in jackson-jaxrs is a drop-in replacement for Jersey's JSON binding for JAX-RS. However, the JSON representation of HBase REST may change as a consequence. This would depend on how configurable JacksonJsonProvider is, and whether all backwards-compatible behaviors can be specified. Digging around the Jackson javadoc for a while was inconclusive.
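For clients that receive or generate payloads in arbitrary member order, a third, purely client-side possibility is to normalize the body before sending it. The following Python sketch is only an illustration under the constraints described in this issue ("key" ahead of "Cell" in each row, "$" last in each cell); `reorder_for_stargate` is a hypothetical name, and the function only handles the top-level "Row" array.

```python
import json


def reorder_for_stargate(payload_text: str) -> str:
    """Re-serialize a row payload with "key" first and "$" last.

    Hypothetical helper: it rebuilds each object so that member order
    matches what the Jersey-based deserializer is reported to need.
    """
    doc = json.loads(payload_text)
    rows = []
    for row in doc.get("Row", []):
        ordered_row = {}
        if "key" in row:
            ordered_row["key"] = row["key"]  # row key goes first
        # Any other row members keep their relative order.
        for name, val in row.items():
            if name not in ("key", "Cell"):
                ordered_row[name] = val
        if "Cell" in row:
            cells = []
            for cell in row["Cell"]:
                ordered_cell = {k: v for k, v in cell.items() if k != "$"}
                if "$" in cell:
                    ordered_cell["$"] = cell["$"]  # value field goes last
                cells.append(ordered_cell)
            ordered_row["Cell"] = cells
        rows.append(ordered_row)
    return json.dumps({"Row": rows}, separators=(",", ":"))


scrambled = '{"Row":[{"Cell":[{"$":"Zm9vYmFy","column":"bWVzc2FnZTptc2c="}],"key":"MTIzNAo="}]}'
normalized = reorder_for_stargate(scrambled)
print(normalized)
```

This keeps the wire format unchanged, at the cost of every client having to know about the ordering rule, so it complements rather than replaces the documentation update in option 1.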

          Andrew Purtell added a comment -

          For option 2 I tried the quick hack attached and:

          2012-06-03 23:43:16,085 WARN  [1224001988@qtp-1382067420-0] log.Slf4jLog(76): /users/TheRealMT/info:password: org.codehaus.jackson.map.exc.UnrecognizedPropertyException: Unrecognized field "Row" (Class org.apache.hadoop.hbase.rest.model.CellSetModel), not marked as ignorable
           at [Source: org.mortbay.jetty.HttpParser$Input@264d107e; line: 1, column: 9] (through reference chain: org.apache.hadoop.hbase.rest.model.CellSetModel["Row"])
          2012-06-03 23:43:16,097 DEBUG [main] client.Client(148): PUT http://localhost:46871/users/TheRealMT/info:password 500 Unrecognized field "Row" (Class org.apache.hadoop.hbase.rest.model.CellSetModel), not marked as ignorable  at [Source: org.mortbay.jetty.HttpParser$Input@264d107e; line: 1, column: 9] (through reference chain: org.apache.hadoop.hbase.rest.model.CellSetModel["Row"]) in 1520 ms
          

          So we can certainly pursue an alternate implementation with Jackson but the JSON representation and all related documentation will change.

          I will note that the error messages provided by Jackson are much better than Jersey's. Jersey, in contrast, silently accepts input it thinks is broken.

          Andrew Purtell added a comment -

          Resolved by new JSON serializer in newer versions


            People

            • Assignee: Unassigned
            • Reporter: Andrew Purtell
            • Votes: 0
            • Watchers: 3
