HBase / HBASE-2475

[stargate] Required ordering of JSON name/value pairs when performing Insert/Update

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None

      Description

      From Tyler Coffin up on hbase-user@

      I am using the Stargate REST interface to HBase for inserting data. When using JSON to transmit the query content, I have found that a specific ordering of key/value pairs within the JSON string is required for the query to succeed (otherwise, if "key" and "Cell" are reversed, the server responds with an 'HTTP/1.1 500 Row key is invalid' error).

      Example:
      This string receives the above error:

      {"Row":[{"Cell":[{"column":"bWVzc2FnZTptc2c=","$":"Zm9vYmFy"}],"key":"MTIzNAo="}]}
      

      This is the valid equivalent string:

      {"Row":[{"key":"MTIzNAo=","Cell":[{"column":"bWVzc2FnZTptc2c=","$":"Zm9vYmFy"}]}]}
      

      As you can see the only difference between these two instances is that the "key" and "Cell" name/value pairs have their order reversed.

      In the equivalent XML notation, the ordering is specifically required per the schema. However, with JSON Objects (i.e. name/value pairs) order is not required (JSON Arrays are ordered, but Objects are not). Some JSON libraries preserve the ordering of Objects, but not all do, which is how I discovered this problem in the first place (I was using the Perl JSON library, which does not guarantee order).
      I'm unsure if this is a bug in the REST implementation or an inconvenient ambiguity in the JSON specification. Regardless, I thought I'd share this discovery with the community for feedback (or at the very least to document this for users' future reference).
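
      To illustrate the point about Objects being unordered, here is a minimal Python sketch (not part of the original report) that parses the two strings above and compares them as data. It also base64-decodes the row key, column, and value. The variable names and the decode helper are only illustrative.

      ```python
      import base64
      import json

      # The two payloads from the report: same members, different member order.
      failing = '{"Row":[{"Cell":[{"column":"bWVzc2FnZTptc2c=","$":"Zm9vYmFy"}],"key":"MTIzNAo="}]}'
      working = '{"Row":[{"key":"MTIzNAo=","Cell":[{"column":"bWVzc2FnZTptc2c=","$":"Zm9vYmFy"}]}]}'

      # Parsed into dicts, the payloads compare equal: dict equality ignores order.
      same_data = json.loads(failing) == json.loads(working)


      def decode(value):
          """Decode one base64-encoded field (illustrative helper)."""
          return base64.b64decode(value).decode("utf-8")


      row = json.loads(working)["Row"][0]
      row_key = decode(row["key"])
      column = decode(row["Cell"][0]["column"])
      cell_value = decode(row["Cell"][0]["$"])

      print(same_data, repr(row_key), column, cell_value)
      ```

      Any JSON parser that follows the specification should treat both strings as the same data, which is why the order dependence is surprising. The decoded row key in this example ends with a newline character.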

      For reference this is the table schema for the above query:

      {NAME => 'reftrack', FAMILIES => [{NAME => 'message', COMPRESSION =>
      'NONE', VERSIONS => '1', TTL => '2147483647', BLOCKSIZE => '65536',
      IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
      
      1. HBASE-2475.patch
        7 kB
        Andrew Purtell

          Activity

          Transition             Time In Source Status   Execution Times   Last Executer    Last Execution Date
          Open -> Resolved       28d 9h                  1                 Andrew Purtell   19/May/10 08:08
          Resolved -> Reopened   746d 3h 21m             1                 Andrew Purtell   03/Jun/12 11:29
          Reopened -> Resolved   707d 35m                1                 Andrew Purtell   11/May/14 12:05
          Andrew Purtell made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Assignee Andrew Purtell [ apurtell ]
          Fix Version/s 0.90.0 [ 12313607 ]
          Fix Version/s 0.20.5 [ 12314800 ]
          Resolution Fixed [ 1 ]
          Andrew Purtell added a comment -

          Resolved by new JSON serializer in newer versions

          Andrew Purtell made changes -
          Attachment HBASE-2475.patch [ 12530709 ]
          Andrew Purtell added a comment -

          For option 2 I tried the quick hack attached and:

          2012-06-03 23:43:16,085 WARN  [1224001988@qtp-1382067420-0] log.Slf4jLog(76): /users/TheRealMT/info:password: org.codehaus.jackson.map.exc.UnrecognizedPropertyException: Unrecognized field "Row" (Class org.apache.hadoop.hbase.rest.model.CellSetModel), not marked as ignorable
           at [Source: org.mortbay.jetty.HttpParser$Input@264d107e; line: 1, column: 9] (through reference chain: org.apache.hadoop.hbase.rest.model.CellSetModel["Row"])
          2012-06-03 23:43:16,097 DEBUG [main] client.Client(148): PUT http://localhost:46871/users/TheRealMT/info:password 500 Unrecognized field "Row" (Class org.apache.hadoop.hbase.rest.model.CellSetModel), not marked as ignorable  at [Source: org.mortbay.jetty.HttpParser$Input@264d107e; line: 1, column: 9] (through reference chain: org.apache.hadoop.hbase.rest.model.CellSetModel["Row"]) in 1520 ms
          

          So we can certainly pursue an alternate implementation with Jackson, but the JSON representation and all related documentation will change.

          I will note that the error messages provided by Jackson are much better. Jersey, in contrast, silently accepts input it thinks is broken.

          Andrew Purtell added a comment - - edited

          If the "$" value field does not come last in a CellModel, Jersey won't deserialize the model correctly. We can:

          1. Document this clearly and also add a troubleshooting section entry.

          2. As an alternative, use Jackson, which we already use elsewhere as a JSON serializer/deserializer. Jackson seems widely considered to be better behaved. JacksonJsonProvider in jackson-jaxrs is a drop-in replacement for Jersey's JSON binding for JAX-RS. However, the JSON representation of HBase REST may change as a consequence. This would depend on how configurable JacksonJsonProvider is, and whether all backwards-compatible behaviors can be specified. Digging around the Jackson javadoc for a while was inconclusive.
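
          Until the server side changes, a client can work around the ordering requirements itself. The following is a rough Python sketch, not taken from the patch: it re-serializes a payload so that "key" precedes "Cell" in each row and "$" comes last in each cell. The function names are hypothetical, and it assumes Python 3.7+, where dicts keep insertion order and json.dumps emits members in that order.

          ```python
          import json


          def order_cell(cell):
              """Copy a cell dict so that the "$" value member comes last."""
              ordered = {name: value for name, value in cell.items() if name != "$"}
              if "$" in cell:
                  ordered["$"] = cell["$"]
              return ordered


          def order_row(row):
              """Copy a row dict so that "key" comes first and "Cell" comes last."""
              ordered = {}
              if "key" in row:
                  ordered["key"] = row["key"]
              for name, value in row.items():
                  if name not in ("key", "Cell"):
                      ordered[name] = value
              if "Cell" in row:
                  ordered["Cell"] = [order_cell(cell) for cell in row["Cell"]]
              return ordered


          def to_ordered_json(payload):
              """Serialize a {"Row": [...]} payload with the member order described above."""
              rows = [order_row(row) for row in payload.get("Row", [])]
              return json.dumps({"Row": rows}, separators=(",", ":"))


          # Input with "Cell" before "key" and "$" before "column".
          unordered = {
              "Row": [
                  {
                      "Cell": [{"$": "Zm9vYmFy", "column": "bWVzc2FnZTptc2c="}],
                      "key": "MTIzNAo=",
                  }
              ]
          }
          body = to_ordered_json(unordered)
          print(body)
          ```

          This only helps clients whose JSON library lets them control member order on output. Whether this exact order is sufficient for every affected version is an assumption based on the reports in this issue.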

          Andrew Purtell made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Andrew Purtell added a comment -

          Some JSON field ordering issues remain according to reports. These may be an artifact of how JSON is hooked up to JAXB. The weird '$' special field for specifying a value is due to how the bindings work, for example. Reopening for more investigation and at least a documentation update.

          Andrew Purtell made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 0.20.5 [ 12314800 ]
          Resolution Fixed [ 1 ]
          Andrew Purtell added a comment -

          Removed ordering directives on some bindings as part of HBASE-2542 and HBASE-2567.

          Andrew Purtell made changes -
          Link This issue is part of HBASE-2567 [ HBASE-2567 ]
          stack made changes -
          Field Original Value New Value
          Fix Version/s 0.20.5 [ 12314800 ]
          Labels moved_from_0_20_5
          stack added a comment -

          Bulk move of 0.20.5 issues into 0.21.0 after vote that we merge branch into TRUNK up on list.

          Andrew Purtell added a comment -

          Thanks for the report, very helpful.

          In the equivalent XML notation, the ordering is specifically required per the schema.

          ... and Jersey adds a marshaller and unmarshaller to the JAXB framework to produce JSON. This is an artifact of jersey-json or something dumb we did when hooking up JAXB. I'll write up some unit tests and look at this soon.

          Andrew Purtell created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Andrew Purtell
            • Votes:
              0
              Watchers:
              3