Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 4.0.0
-
None
Description
When running with GCC 7.5.0 on CentOS 7, dataload fails with the following error:
23:51:41 Caching test tables (logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/data_loading/cache-test-tables.log)...
23:51:42 FAILED (Took: 0 min 1 sec)
23:51:42 'cache-test-tables' failed. Tail of log:
23:51:42 Log for command 'cache-test-tables'
23:51:42 CACHING tpch.nation AND functional.alltypestiny
23:51:42 Traceback (most recent call last):
23:51:42   File "/data/jenkins/workspace/impala-private-parameterized/repos/Impala/shell/impala_shell.py", line 42, in <module>
23:51:42     from impala_client import ImpalaHS2Client, ImpalaBeeswaxClient, QueryOptionLevels
23:51:42   File "/data0/jenkins/workspace/impala-private-parameterized/repos/Impala/shell/impala_client.py", line 27, in <module>
23:51:42     import sasl
23:51:42   File "/data/jenkins/workspace/impala-private-parameterized/repos/Impala/infra/python/env/lib/python2.7/site-packages/sasl/__init__.py", line 15, in <module>
23:51:42     from sasl.saslwrapper import *
23:51:42 ImportError: /data/jenkins/workspace/impala-private-parameterized/repos/Impala/infra/python/env/lib/python2.7/site-packages/sasl/saslwrapper.so: undefined symbol: _ZTVNSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEEE
This command uses bin/impala-shell.sh, which runs the shell under impala-python. The Python packages in the infra/python/env virtualenv are compiled with the toolchain GCC. At least one of them (sasl) compiles C++ code, so the compiled Python packages need to be able to find a matching libstdc++. The missing symbol is the vtable for std::__cxx11::basic_stringstream<char>, which belongs to the C++11 ABI that libstdc++ introduced in GCC 5. CentOS 7 ships an older libstdc++, so the system libstdc++ can't satisfy these symbols.
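As a rough illustration of why the system libstdc++ cannot help here, the following sketch flags mangled symbols that depend on the newer libstdc++ ABI. The function name needs_cxx11_abi and the marker constant are hypothetical helpers for this note, not part of Impala. The check relies only on the fact that names in the std::__cxx11 namespace are mangled with the substring "St7__cxx11".

```python
# Hypothetical helper for illustration; not part of the Impala code base.
# Symbols in the std::__cxx11 namespace are mangled with this substring:
# "St" is the abbreviation for std, "7__cxx11" is the length-prefixed name.
CXX11_ABI_MARKER = "St7__cxx11"


def needs_cxx11_abi(mangled_symbol):
    """Return True if a mangled C++ symbol refers to the std::__cxx11 namespace."""
    return mangled_symbol.startswith("_Z") and CXX11_ABI_MARKER in mangled_symbol


if __name__ == "__main__":
    # The symbol reported in the ImportError above.
    symbol = "_ZTVNSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEEE"
    print(needs_cxx11_abi(symbol))
```

A symbol flagged this way can only be resolved by a libstdc++ that provides the std::__cxx11 types, which is why the toolchain copy has to be on the library path.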
This appears to be limited to impala-python, so it may be enough to fix only bin/impala-shell.sh. One fix is to add the toolchain GCC library paths to the paths returned by "infra/python/bootstrap_virtualenv.py --print-ld-library-path".
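The proposed fix could look roughly like the sketch below. It is not the actual change to bootstrap_virtualenv.py. The function names, the toolchain directory layout (gcc-<version>/lib64), and the example paths are all assumptions made for illustration.

```python
import posixpath


def gcc_lib_dir(toolchain_home, gcc_version):
    """Build the toolchain GCC library directory.

    Assumption: libraries live under <toolchain_home>/gcc-<version>/lib64.
    """
    return posixpath.join(toolchain_home, "gcc-" + gcc_version, "lib64")


def extend_ld_library_path(existing, extra_dirs):
    """Append extra_dirs to a colon-separated library path, skipping duplicates."""
    parts = [p for p in existing.split(":") if p]
    for d in extra_dirs:
        if d not in parts:
            parts.append(d)
    return ":".join(parts)


if __name__ == "__main__":
    # Hypothetical values; the real ones would come from the build environment.
    current = "/opt/toolchain/python-2.7/lib"
    gcc_dir = gcc_lib_dir("/opt/toolchain", "7.5.0")
    print(extend_ld_library_path(current, [gcc_dir]))
```

Appending rather than prepending keeps the existing entries first in the search order. Whether the GCC directory should take precedence is a design choice the ticket does not settle.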