Apache Arrow / ARROW-12241

[Python] Parallel csv reader cancellation test kills pytest

Details

    Description

      The CI job passes, but the test fails when I run it locally. I tested on an x86 Skylake server with 32 cores and on an Apple M1 with 8 cores.

      Am I missing something?

      Test steps:

      $ cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DARROW_COMPUTE=ON -DARROW_PARQUET=ON -DARROW_BUILD_TESTS=ON -DCMAKE_INSTALL_PREFIX=$(pwd)/_install -DCMAKE_INSTALL_LIBDIR=lib -DARROW_PYTHON=ON -DCMAKE_CXX_COMPILER=/usr/bin/clang++-9 -DCMAKE_C_COMPILER=/usr/bin/clang-9 ..
      $ ninja install
      $ cd ~/arrow/python
      # set LD_LIBRARY_PATH, ARROW_HOME to newly built binaries
      $ python setup.py build_ext --inplace
      $ pytest pyarrow
      

      Error log:

      ======================================================================================= test session starts ========================================================================================
      platform linux -- Python 3.6.9, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
      rootdir: /home/cyb/arrow/python, configfile: setup.cfg
      plugins: lazy-fixture-0.6.3, hypothesis-5.41.2
      collected 3802 items / 3 skipped / 3799 selected                                                                                                                                                   
      
      pyarrow/tests/test_adhoc_memory_leak.py s                                                                                                                                                    [  0%]
      pyarrow/tests/test_array.py ......................s...........................................................................................s............................................. [  4%]
      ............................ss.........                                                                                                                                                      [  5%]
      pyarrow/tests/test_builder.py ....                                                                                                                                                           [  5%]
      pyarrow/tests/test_cffi.py ......                                                                                                                                                            [  5%]
      pyarrow/tests/test_compute.py ....................................................................................................................................................           [  9%]
      pyarrow/tests/test_convert_builtin.py ...................................................................................................................................................... [ 13%]
      ................................................................................................................................x..................................................x........ [ 18%]
      ...............ssssss.....................................................................................................sssssss                                                            [ 21%]
      pyarrow/tests/test_csv.py ....................................F.......................
      
      ============================================================================================= FAILURES =============================================================================================
      ______________________________________________________________________________ TestParallelCSVRead.test_cancellation _______________________________________________________________________________
      
      self = <pyarrow.tests.test_csv.TestParallelCSVRead testMethod=test_cancellation>
      
          def test_cancellation(self):
              if (threading.current_thread().ident !=
                      threading.main_thread().ident):
                  pytest.skip("test only works from main Python thread")
      
              if sys.version_info >= (3, 8):
                  raise_signal = signal.raise_signal
              elif os.name == 'nt':
                  # On Windows, os.kill() doesn't actually send a signal,
                  # it just terminates the process with the given exit code.
                  pytest.skip("test requires Python 3.8+ on Windows")
              else:
                  # On Unix, emulate raise_signal() with os.kill().
                  def raise_signal(signum):
                      os.kill(os.getpid(), signum)
      
              large_csv = b"a,b,c\n" + b"1,2,3\n" * 30000000
      
              def signal_from_thread():
                  time.sleep(0.2)
                  raise_signal(signal.SIGINT)
      
              t1 = time.time()
              with pytest.raises(KeyboardInterrupt) as exc_info:
                  threading.Thread(target=signal_from_thread).start()
      >           self.read_bytes(large_csv)
      E           Failed: DID NOT RAISE <class 'KeyboardInterrupt'>
      
      pyarrow/tests/test_csv.py:927: Failed
      ========================================================================================= warnings summary =========================================================================================
      ../../archery/lib/python3.6/distutils/__init__.py:4
        /home/cyb/archery/lib/python3.6/distutils/__init__.py:4: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
          import imp
      
      -- Docs: https://docs.pytest.org/en/stable/warnings.html
      ===================================================================================== short test summary info ======================================================================================
      FAILED pyarrow/tests/test_csv.py::TestParallelCSVRead::test_cancellation - Failed: DID NOT RAISE <class 'KeyboardInterrupt'>
      !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      pyarrow/error.pxi:221: KeyboardInterrupt
      (to show a full traceback on KeyboardInterrupt use --full-trace)
      ================================================================= 1 failed, 864 passed, 21 skipped, 2 xfailed, 1 warning in 4.19s ==================================================================
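
The log shows that KeyboardInterrupt was not raised inside the pytest.raises block and that a KeyboardInterrupt later ended the session. One possible reading, which is an assumption and not something the log confirms, is a timing race: the work finishes before the helper thread's 0.2 s delay elapses, so the interrupt arrives after the block has exited. The sketch below models only that timing. It uses a threading.Event as a stand-in for SIGINT and a sleep loop as a stand-in for the CSV read; the function name and the durations are hypothetical and nothing here calls pyarrow.

```python
import threading
import time


def cancelled_during_work(work_seconds, cancel_delay):
    """Return True if the cancel request arrived while the work was running.

    Hypothetical model: the Event stands in for SIGINT and the polling
    loop stands in for the CSV read in the real test.
    """
    cancel_requested = threading.Event()

    def cancel_from_thread():
        # Mirrors signal_from_thread() in the test: wait, then request cancel.
        time.sleep(cancel_delay)
        cancel_requested.set()

    helper = threading.Thread(target=cancel_from_thread)
    helper.start()

    observed = False
    deadline = time.monotonic() + work_seconds
    while time.monotonic() < deadline:
        if cancel_requested.is_set():
            observed = True
            break
        time.sleep(0.005)

    # The request still fires even if the work has already finished.
    helper.join()
    return observed


if __name__ == "__main__":
    # Work outlasts the delay: the cancel request is seen during the work.
    print(cancelled_during_work(work_seconds=0.5, cancel_delay=0.1))
    # Work finishes first: the cancel request arrives too late to be seen.
    print(cancelled_during_work(work_seconds=0.05, cancel_delay=0.3))
```

If this reading applies, the outcome would depend on how long the read takes relative to the fixed delay, which could differ between CI and a faster machine.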
      

            People

              apitrou Antoine Pitrou
              yibocai#1 yibocai#1

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h