IMPALA-10145

UnicodeDecodeError in Thrift 0.11.0 generated files

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 4.0.0
    • Fix Version/s: Impala 4.0.0
    • Component/s: None
    • Labels: None

    Description

      If the query results contain a string with bytes that cannot be decoded as UTF-8, fetching the results fails with a UnicodeDecodeError when the Python files generated by Thrift 0.11.0 are in use.
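
The failure can be reproduced outside impala-shell. The sketch below assumes only that `unhex('aa')` yields the single raw byte 0xaa and that the generated code decodes it with the default strict UTF-8 handling, as the `.decode('utf-8')` call in the first traceback suggests. The `strict_decode` helper is a hypothetical name used for illustration, not part of Thrift or Impala.

```python
import binascii

# Assumption: unhex('aa') in the query produces this one raw byte.
raw = binascii.unhexlify("aa")


def strict_decode(value):
    """Hypothetical helper: decode bytes as UTF-8 with strict error handling."""
    return value.decode("utf-8")


try:
    strict_decode(raw)
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    bad_offset = exc.start  # offset of the first undecodable byte

print("decode failed:", failed)
```

The byte 0xaa is a UTF-8 continuation byte, so it cannot start a character, which matches the "invalid start byte" message in the tracebacks.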

      Depending on which protocol impala-shell uses, the error occurs in different places.
      Examples for the hs2-http and hs2 protocols, in that order:

      [localhost:28000] default> select unhex('aa');
      Query: select unhex('aa')
      Query submitted at: 2020-09-04 12:41:14 (Coordinator: http://tadam-OptiPlex-7070:25000)
      Query progress can be monitored at: http://tadam-OptiPlex-7070:25000/query_plan?query_id=d041ab999f597fec:46a8b51800000000
      Caught exception 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults. 
      Unknown Exception : 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
      Traceback (most recent call last):
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala_shell.py", line 1183, in _execute_stmt
          for rows in rows_fetched:
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 781, in fetch
          resp = self._do_hs2_rpc(FetchResults)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 942, in _do_hs2_rpc
          return rpc()
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 778, in FetchResults
          return self.imp_service.FetchResults(req)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 717, in FetchResults
          return self.recv_FetchResults()
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 736, in recv_FetchResults
          result.read(iprot)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 3593, in read
          self.success.read(iprot)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 5888, in read
          self.results.read(iprot)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2670, in read
          _elem115.read(iprot)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2556, in read
          self.stringVal.read(iprot)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2352, in read
          _elem95 = iprot.readString().decode('utf-8') if sys.version_info[0] == 2 else iprot.readString()
        File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
          return codecs.utf_8_decode(input, errors, True)
      UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
      [Not connected] > 
      
      [localhost:21050] default> select unhex('aa');
      Query: select unhex('aa')
      Query submitted at: 2020-09-04 12:42:22 (Coordinator: http://tadam-OptiPlex-7070:25000)
      Query progress can be monitored at: http://tadam-OptiPlex-7070:25000/query_plan?query_id=3a481e2a0581ea7c:a6e1901800000000
      Caught exception 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults. 
      Unknown Exception : 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
      Traceback (most recent call last):
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala_shell.py", line 1183, in _execute_stmt
          for rows in rows_fetched:
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 781, in fetch
          resp = self._do_hs2_rpc(FetchResults)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 942, in _do_hs2_rpc
          return rpc()
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 778, in FetchResults
          return self.imp_service.FetchResults(req)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 717, in FetchResults
          return self.recv_FetchResults()
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 736, in recv_FetchResults
          result.read(iprot)
        File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 3583, in read
          iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
      UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
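
One way a client could tolerate such values is to decode with a non-strict error handler. The sketch below is only an illustration of that idea, not the change that resolved this issue (the resolution is not described here). `decode_lenient` is a hypothetical helper name.

```python
def decode_lenient(value):
    """Hypothetical helper: decode UTF-8, replacing invalid bytes with U+FFFD."""
    if isinstance(value, bytes):
        # errors="replace" substitutes U+FFFD instead of raising.
        return value.decode("utf-8", errors="replace")
    return value  # already text, leave untouched


# The byte from the report no longer raises; it becomes a replacement character.
lenient = decode_lenient(b"\xaa")
mixed = decode_lenient(b"ab\xaacd")
print(repr(lenient), repr(mixed))
```

The trade-off is that the original bytes are lost after replacement, so a caller that needs the exact binary value would have to keep the result as bytes instead.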
      

            People

              Assignee: Quanlong Huang (stigahuang)
              Reporter: Adam Tamas (tadam)