IMPALA-11113

single_node_perf_run.py throws UnicodeDecodeError for TPCDS dataset

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: Impala 4.1.0

    Description

      Possible fix:

      https://stackoverflow.com/questions/19872773/unicodedecodeerror-while-using-json-dumps

      Exception:

      Traceback (most recent call last):
        File "/home/gfurnstahl/Impala/bin/run-workload.py", line 280, in <module>
          json.dump(result_map, f, cls=CustomJSONEncoder)
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/json/__init__.py", line 189, in dump
          for chunk in iterable:
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/json/encoder.py", line 434, in _iterencode
          for chunk in _iterencode_dict(o, _current_indent_level):
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/json/encoder.py", line 408, in _iterencode_dict
          for chunk in chunks:
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/json/encoder.py", line 332, in _iterencode_list
          for chunk in chunks:
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/json/encoder.py", line 443, in _iterencode
          for chunk in _iterencode(o, _current_indent_level):
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/json/encoder.py", line 434, in _iterencode
          for chunk in _iterencode_dict(o, _current_indent_level):
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/json/encoder.py", line 408, in _iterencode_dict
          for chunk in chunks:
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/json/encoder.py", line 313, in _iterencode_list
          yield buf + _encoder(value)
      UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 47: invalid continuation byte
      Traceback (most recent call last):
        File "./bin/single_node_perf_run.py", line 359, in <module>
          main()
        File "./bin/single_node_perf_run.py", line 349, in main
          perf_ab_test(options, args)
        File "./bin/single_node_perf_run.py", line 256, in perf_ab_test
          run_workload(temp_dir, workloads, options)
        File "./bin/single_node_perf_run.py", line 154, in run_workload
          configured_call(run_workload)
        File "./bin/single_node_perf_run.py", line 94, in configured_call
          return subprocess.check_call(["bash", "-c", cmd])
        File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/subprocess.py", line 190, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['bash', '-c', 'source /home/gfurnstahl/Impala/bin/impala-config.sh && /home/gfurnstahl/Impala/bin/run-workload.py --workloads=tpcds:10 --impalads=localhost:21000 --results_json_file=/home/gfurnstahl/Impala/perf_results/perf_run_l1WHcn/27a1b4c1203fd1fc7929d23659eed0861703e9e1.json --query_iterations=3 --table_formats=parquet/none --plan_first']' returned non-zero exit status 1
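
      The traceback suggests that result_map contains a byte string that is not valid UTF-8 (byte 0xc9), which the JSON encoder then fails to decode. A minimal sketch of one possible workaround follows: convert every byte string to text before calling json.dump, replacing undecodable bytes. This is not necessarily the fix that was committed; to_text and sample_result_map are hypothetical names used only for illustration.

```python
import json


def to_text(value, encoding="utf-8"):
    """Recursively turn byte strings into text so json can serialise them.

    Bytes that are not valid in `encoding` become U+FFFD instead of raising.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, errors="replace")
    if isinstance(value, dict):
        return {to_text(k, encoding): to_text(v, encoding) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_text(item, encoding) for item in value]
    return value


# Hypothetical stand-in for result_map: one value holds a stray 0xc9 byte.
sample_result_map = {"tpcds-q1": [{"row": b"\xc9tat", "time_ms": 12}]}

cleaned = to_text(sample_result_map)
serialized = json.dumps(cleaned)
```

      With this approach the call in run-workload.py would become json.dump(to_text(result_map), f, cls=CustomJSONEncoder). Replacing bad bytes loses the original character, so if the data is known to be in another encoding (for example Latin-1), passing that encoding to to_text may be preferable. Under Python 2.7, json.dump also accepts an encoding keyword, which is the route the linked Stack Overflow question discusses.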

          People

            Assignee: Gergely Fürnstáhl
            Reporter: Gergely Fürnstáhl