  1. Apache Arrow
  2. ARROW-2029

[Python] Program crash on `HdfsFile.tell` if file is closed

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None

      Description

      Of all the `NativeFile` methods, `tell` is the only one that doesn't check whether the file is still open before running. With HDFS, calling it on a closed file can crash the whole process:

       

      >>> import pyarrow as pa
      >>> h = pa.hdfs.connect()
      18/01/24 22:31:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      18/01/24 22:31:36 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
      >>> with h.open("/tmp/test.txt", mode='wb') as f:
      ...     pass
      ...
      >>> f.tell()
      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      #  SIGSEGV (0xb) at pc=0x00007f52ccb6733d, pid=14868, tid=0x00007f52de2b9700
      #
      # JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
      # Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
      # Problematic frame:
      # V  [libjvm.so+0x67c33d]
      #
      # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
      #
      # An error report file with more information is saved as:
      # /working/python/hs_err_pid14868.log
      #
      # If you would like to submit a bug report, please visit:
      #   http://bugreport.java.com/bugreport/crash.jsp
      #
      Aborted
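
The missing check can be illustrated with a small pure-Python sketch. This is not pyarrow's implementation: `GuardedFile`, `_assert_open`, and `_is_open` are hypothetical names used only to show the pattern, in which every method, `tell` included, verifies the handle is still open before touching the underlying resource.

```python
import io


class GuardedFile:
    """Hypothetical stand-in for a native file wrapper (not pyarrow code)."""

    def __init__(self, raw):
        self._raw = raw
        self._is_open = True

    def _assert_open(self):
        # Fail in Python before the call can reach the native layer.
        if not self._is_open:
            raise ValueError("I/O operation on closed file")

    def write(self, data):
        self._assert_open()
        return self._raw.write(data)

    def tell(self):
        # The guard this issue reports as missing from `tell`.
        self._assert_open()
        return self._raw.tell()

    def close(self):
        if self._is_open:
            self._raw.close()
            self._is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


with GuardedFile(io.BytesIO()) as f:
    f.write(b"abc")
    position_while_open = f.tell()

try:
    f.tell()
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

With a guard like this, the call after the `with` block raises a `ValueError` in Python instead of reaching code that assumes a live handle.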
      

      In Python, most file-like objects raise a `ValueError` when a method such as `tell` is called on a closed file:

      >>> f = open("test.py", mode='wb')
      >>> f.close()
      >>> f.tell()
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ValueError: I/O operation on closed file
      >>> import io
      >>> buf = io.BytesIO()
      >>> buf.close()
      >>> buf.tell()
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ValueError: I/O operation on closed file.
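
The expected behaviour shown above could be captured as a small regression-style check. This is a sketch using only the standard library; `raises_value_error_when_closed` is a hypothetical helper name, and the same helper could in principle be pointed at any file-like object, assuming it exposes `close` and `tell`.

```python
import io
import tempfile


def raises_value_error_when_closed(f):
    """Close `f`, then report whether `tell` raises ValueError."""
    f.close()
    try:
        f.tell()
    except ValueError:
        return True
    return False


# Standard-library file objects all follow the convention.
results = {
    "BytesIO": raises_value_error_when_closed(io.BytesIO()),
    "StringIO": raises_value_error_when_closed(io.StringIO()),
    "TemporaryFile": raises_value_error_when_closed(
        tempfile.TemporaryFile(mode="w+b")
    ),
}
```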

              People

              • Assignee: Jim Crist (jim.crist)
              • Reporter: Jim Crist (jim.crist)