  1. HBase
  2. HBASE-21852

Cannot get rows from hbase-rest when the rowkey contains any bytes above 0x7f

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: REST
    • Labels:
      None

      Description

      I have a table that stores its records with big-endian long (8-byte integer) rowkeys. I'd like to access this data via the hbase-rest API, but I have run into an issue where I can't retrieve every row that exists. For example:

      $ curl -v -H "Accept: application/json" "http://hbase-rest:8080/emps/%00%00%00%00%00%00%04%00/"

      This returns the expected row without issue. However,

      $ curl -v -H "Accept: application/json" "http://hbase-rest:8080/emps/%00%00%00%00%00%00%03%FF/"

      returns a 404 Not Found, though I'm certain the record exists. The failing query also produces a log message on the REST server like this:

      WARN [qtp1473981203-37561] util.URIUtil: /emps/%00%00%00%00%00%00%03%FF/ org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8! byte Ff in state 0

      Some troubleshooting and testing suggest that the error happens whenever the rowkey in the query contains a percent-encoded byte above 0x7f.
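
      To illustrate why bytes above 0x7f are the likely trigger, here is a minimal sketch. It does not reproduce the hbase-rest or Jetty code path; it only assumes that somewhere the percent-escaped path is decoded as UTF-8 text and then turned back into bytes. The class and method names (RowKeyUtf8Demo, toRowKey, percentEncode, decodeViaUtf8, survivesUtf8) are hypothetical and exist only for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of HBase.
class RowKeyUtf8Demo {

    // Encode a long as 8 big-endian bytes (ByteBuffer is big-endian by default).
    static byte[] toRowKey(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    // Percent-escape every byte of the key, e.g. 0xFF -> "%FF".
    static String percentEncode(byte[] key) {
        StringBuilder sb = new StringBuilder(key.length * 3);
        for (byte b : key) {
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Assumed lossy step: decode the escapes as UTF-8 text, then convert
    // the resulting String back to bytes.
    static byte[] decodeViaUtf8(String escaped) {
        try {
            return URLDecoder.decode(escaped, "UTF-8").getBytes(StandardCharsets.UTF_8);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True if the key comes back byte-for-byte after the UTF-8 round trip.
    static boolean survivesUtf8(long id) {
        byte[] key = toRowKey(id);
        return Arrays.equals(key, decodeViaUtf8(percentEncode(key)));
    }

    public static void main(String[] args) {
        for (long id : new long[] {1024L, 1023L}) {
            System.out.println(percentEncode(toRowKey(id))
                + " survives UTF-8 decoding: " + survivesUtf8(id));
        }
    }
}
```

      The key for 1024 ends in %04%00 and round-trips unchanged, while the key for 1023 ends in %03%FF and does not: 0xFF is never valid in UTF-8, so java.net.URLDecoder substitutes the replacement character U+FFFD. Jetty's decoder reports the same condition as the NotUtf8Exception in the log above.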

      I've read that hbase-rest supports the hex-escaped representation, as the shell does, but that has not worked for me. Looking through RowSpec.java, I don't see any indication that the parseRowKeys() method attempts to parse the hex-escaped representation. Am I missing something here? Is the REST server supposed to support the hex-escaped representation, and am I simply querying it incorrectly?

      I've looked at version 0.98 and the current master branch, and the RowSpec.java source looks largely the same in both, so I don't believe this is a regression.

      I believe the error is caused by java.net.URLDecoder. I can only speculate, but would it be more appropriate to have a generic function that converts %XX sequences directly to bytes, without relying on a specific Charset? Alternatively, some logic could be added to the parser to truly support the hex-escaped representation, perhaps with a URL parameter to request that parsing, much like the shell requires double quotes to indicate byte parsing.
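
      As a rough sketch of the first suggestion, the following decodes %XX sequences straight to bytes without ever going through a Charset. It is only an illustration under my own assumptions, not existing HBase code: the class PercentBytes and its methods are hypothetical, and rejecting unescaped non-ASCII characters is a design choice made here for simplicity.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical helper; not part of HBase.
class PercentBytes {

    // Value of a single ASCII hex digit, or -1 if the char is not one.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    }

    // Decode %XX escapes directly to raw bytes; plain ASCII characters are
    // copied through as their own byte value. No Charset is involved.
    static byte[] decode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = hexValue(s.charAt(i + 1));
                int lo = hexValue(s.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid escape at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 3;
            } else if (c < 0x80) {
                out.write(c);
                i++;
            } else {
                // Assumption of this sketch: non-ASCII input must be escaped.
                throw new IllegalArgumentException("Unescaped non-ASCII char at index " + i);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] key = decode("%00%00%00%00%00%00%03%FF");
        // Interpret the 8 bytes as a big-endian long again.
        System.out.println(ByteBuffer.wrap(key).getLong());
    }
}
```

      With this approach the failing key from above decodes back to the long 1023. For the second suggestion, the shell-style \xNN form is what org.apache.hadoop.hbase.util.Bytes.toBytesBinary() parses, so a parser option could plausibly delegate to it.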

            People

            • Assignee:
              Unassigned
              Reporter:
              travis.hegner Travis Hegner
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated: