Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 4.0.0
Description
If the query results contain a string with bytes that cannot be decoded as UTF-8, fetching the results fails with a UnicodeDecodeError when the Python files generated by Thrift 0.11.0 are in use.
Depending on which protocol impala-shell uses, the error is raised in a different place.
Examples for the hs2-http and hs2 protocols:
[localhost:28000] default> select unhex('aa');
Query: select unhex('aa')
Query submitted at: 2020-09-04 12:41:14 (Coordinator: http://tadam-OptiPlex-7070:25000)
Query progress can be monitored at: http://tadam-OptiPlex-7070:25000/query_plan?query_id=d041ab999f597fec:46a8b51800000000
Caught exception 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults.
Unknown Exception : 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
Traceback (most recent call last):
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala_shell.py", line 1183, in _execute_stmt
    for rows in rows_fetched:
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 781, in fetch
    resp = self._do_hs2_rpc(FetchResults)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 942, in _do_hs2_rpc
    return rpc()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 778, in FetchResults
    return self.imp_service.FetchResults(req)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 717, in FetchResults
    return self.recv_FetchResults()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 736, in recv_FetchResults
    result.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 3593, in read
    self.success.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 5888, in read
    self.results.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2670, in read
    _elem115.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2556, in read
    self.stringVal.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/ttypes.py", line 2352, in read
    _elem95 = iprot.readString().decode('utf-8') if sys.version_info[0] == 2 else iprot.readString()
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
[Not connected] >
[localhost:21050] default> select unhex('aa');
Query: select unhex('aa')
Query submitted at: 2020-09-04 12:42:22 (Coordinator: http://tadam-OptiPlex-7070:25000)
Query progress can be monitored at: http://tadam-OptiPlex-7070:25000/query_plan?query_id=3a481e2a0581ea7c:a6e1901800000000
Caught exception 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults.
Unknown Exception : 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
Traceback (most recent call last):
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala_shell.py", line 1183, in _execute_stmt
    for rows in rows_fetched:
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 781, in fetch
    resp = self._do_hs2_rpc(FetchResults)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 942, in _do_hs2_rpc
    return rpc()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/lib/impala_client.py", line 778, in FetchResults
    return self.imp_service.FetchResults(req)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 717, in FetchResults
    return self.recv_FetchResults()
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 736, in recv_FetchResults
    result.read(iprot)
  File "/home/tadam/imp/impala/shell/build/impala-shell-4.0.0-SNAPSHOT/gen-py/TCLIService/TCLIService.py", line 3583, in read
    iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 0: invalid start byte
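The failure in both tracebacks comes down to strictly decoding the single byte 0xaa as UTF-8. The following minimal sketch reproduces that outside impala-shell and shows one possible way to tolerate such bytes using Python's built-in `errors="replace"` handler. The helper names `decode_strict` and `decode_lenient` are hypothetical and exist only for illustration; this is not necessarily how the issue was fixed in Impala or Thrift.

```python
import binascii


def decode_strict(raw):
    """Decode bytes as UTF-8; raises UnicodeDecodeError on invalid input."""
    return raw.decode("utf-8")


def decode_lenient(raw):
    """Hypothetical workaround: substitute U+FFFD for undecodable bytes."""
    return raw.decode("utf-8", errors="replace")


# The same single byte that select unhex('aa') returns.
raw = binascii.unhexlify("aa")

try:
    decode_strict(raw)
    strict_failed = False
except UnicodeDecodeError as exc:
    # 0xaa is a continuation byte, so it cannot start a UTF-8 sequence.
    strict_failed = True
    print("strict decode failed: %s" % exc)

lenient = decode_lenient(raw)
print(repr(lenient))
```

The lenient variant trades fidelity for robustness: the original byte is lost and replaced by the Unicode replacement character, which may or may not be acceptable for a shell that displays binary column values.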
Issue Links
- is broken by: IMPALA-7825 Upgrade Thrift version to 0.11.0 (Resolved)
- is caused by: THRIFT-5303 Unicode decode errors in _fast_decode (Closed)
- is related to: IMPALA-10299 Impala-shell hangs in printing partial UTF-8 characters (Resolved)
- relates to: IMPALA-11313 impala-shell's PyPi form factor still suffers from IMPALA-10299 (Resolved)