Apache Arrow / ARROW-10617

[Python] RecordBatchStreamReader's iterator doesn't work with Python 3.8

    Description

      The following example code fails on Python 3.8:

      import pyarrow as pa
      data = [
          pa.array([1, 2, 3, 4]),
          pa.array(['foo', 'bar', 'baz', None]),
          pa.array([True, None, False, True])
      ]
      batch = pa.record_batch(data, names=['f0', 'f1', 'f2'])
      
      sink = pa.BufferOutputStream()
      writer = pa.ipc.new_stream(sink, batch.schema)
      
      for i in range(5):
          writer.write_batch(batch)
      writer.close()
      
      buf = sink.getvalue()
      
      reader = pa.ipc.open_stream(buf)
      [i for i in reader]
      

      It will raise the following runtime error:

      StopIteration                             Traceback (most recent call last)
      /usr/local/lib/python3.8/dist-packages/pyarrow/ipc.pxi in pyarrow.lib._CRecordBatchReader.read_next_batch()

      StopIteration:

      During handling of the above exception, another exception occurred:

      RuntimeError                              Traceback (most recent call last)
      <ipython-input-7-225bed213dc7> in <module>
           18 reader = pa.ipc.open_stream(buf)
           19
      ---> 20 [i for i in reader]

      <ipython-input-7-225bed213dc7> in <listcomp>(.0)
           18 reader = pa.ipc.open_stream(buf)
           19
      ---> 20 [i for i in reader]

      /usr/local/lib/python3.8/dist-packages/pyarrow/ipc.pxi in __iter__()

      RuntimeError: generator raised StopIteration
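
The traceback is consistent with PEP 479 semantics (the default since Python 3.7): a StopIteration that escapes from inside a generator body is converted into a RuntimeError. The sketch below reproduces that mechanism in pure Python. FakeReader and iter_batches are hypothetical names used only for illustration, and the generator-based __iter__ is an assumption about how the reader iterates, inferred from the traceback rather than taken from pyarrow's source.

```python
class FakeReader:
    """Hypothetical stand-in for a stream reader; not pyarrow's actual class."""

    def __init__(self, batches):
        self._batches = list(batches)
        self._pos = 0

    def read_next_batch(self):
        # Signal end of stream by raising StopIteration, as in the traceback.
        if self._pos >= len(self._batches):
            raise StopIteration
        batch = self._batches[self._pos]
        self._pos += 1
        return batch

    def __iter__(self):
        # Generator-based __iter__ (assumed): the StopIteration escaping from
        # read_next_batch() is converted to RuntimeError under PEP 479.
        while True:
            yield self.read_next_batch()


def iter_batches(reader):
    """Workaround sketch: catch StopIteration and end the generator explicitly."""
    while True:
        try:
            batch = reader.read_next_batch()
        except StopIteration:
            return
        yield batch


# Iterating the reader directly fails on Python 3.7+.
try:
    list(FakeReader(["b0", "b1"]))
    failure = None
except RuntimeError as exc:
    failure = str(exc)

# Driving read_next_batch() through the helper terminates cleanly.
recovered = list(iter_batches(FakeReader(["b0", "b1"])))
```

Under the same assumption, passing the reader from the example above to a helper like iter_batches, instead of iterating it directly, may serve as a stopgap on affected versions, since it relies only on read_next_batch() raising StopIteration at the end of the stream.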

            People

              sighingnow Tao He