Apache Arrow / ARROW-16597

[Python][FlightRPC] Active server may segfault if Python interpreter shuts down

Details

    Description

      On Linux, the script below reliably segfaults for me with "FATAL: exception not rethrown". Adding a call to server.shutdown() at the end of the script fixes it.

      The reason is that the Python interpreter exits after running the script, and the remaining Python threads then call PyThread_exit_thread. One of those threads, however, is still in the middle of executing the RPC handler. PyThread_exit_thread boils down to pthread_exit, which unwinds the thread's stack by throwing an exception that it expects nothing will catch. gRPC wraps RPC handlers in a catch(...), which catches that exception without rethrowing it, and pthreads then aborts the process because the exception was not rethrown.

      We should force servers to shut down at interpreter exit to avoid this.

      import traceback
      import pyarrow as pa
      import pyarrow.flight as flight


      class Server(flight.FlightServerBase):
          def do_put(self, context, descriptor, reader, writer):
              # Fail every DoPut call so the client sees a FlightError.
              raise flight.FlightCancelledError("foo", extra_info=b"bar")


      print("PyArrow version:", pa.__version__)
      # Port 0 lets the OS pick a free port.
      server = Server("grpc://localhost:0")
      client = flight.connect(f"grpc://localhost:{server.port}")

      schema = pa.schema([])
      writer, reader = client.do_put(flight.FlightDescriptor.for_command(b""), schema)
      try:
          writer.done_writing()
      except flight.FlightError as e:
          traceback.print_exc()
          print(e.extra_info)
      except Exception:
          traceback.print_exc()
      # No server.shutdown() here: the interpreter exits while the server is still active.
      

            People

              lidavidm David Li
                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h