Beam / BEAM-13454

Dataframe read_fwf fails reading incrementally.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.36.0
    • Component/s: sdk-py-core
    • Labels: None

    Description

      Trying to use beam.dataframe.io.read_fwf fails with the following error:

        File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 1206, in process_with_sized_restriction
          return self.do_fn_invoker.invoke_process(
        File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 698, in invoke_process
          residual = self._invoke_process_per_window(
        File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 836, in _invoke_process_per_window
          self.output_processor.process_outputs(
        File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 1334, in process_outputs
          for result in results:
        File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/dataframe/io.py", line 545, in process
          frames = reader(handle, *self.args, **self.kwargs)
        File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 848, in read_fwf
          return _read(filepath_or_buffer, kwds)
        File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 454, in _read
          parser = TextFileReader(fp_or_buf, **kwds)
        File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 942, in __init__
          self.engine = self._check_file_or_buffer(f, engine)
        File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 1003, in _check_file_or_buffer
          raise ValueError(msg)
      ValueError: The 'python' engine cannot iterate through this file buffer.
      

      It looks like pandas expects the file handle to be iterable (line by line) in addition to supporting read().
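
      To illustrate the mismatch described above, here is a minimal sketch of a wrapper that adds line iteration on top of a handle that only exposes read(). ReadOnlyHandle and LineIterableHandle are hypothetical names invented for this example; they are not Beam or pandas classes, and this is not necessarily how the issue was fixed. Whether a given pandas version accepts such a wrapper has not been verified here.

```python
import io


class ReadOnlyHandle:
    """Hypothetical stand-in for a handle that supports read() but not iteration."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


class LineIterableHandle:
    """Hypothetical wrapper that adds readline() and iteration using only read()."""

    def __init__(self, handle, chunk_size=8192):
        self._handle = handle
        self._chunk_size = chunk_size
        self._pending = b""  # bytes fetched from the handle but not yet returned

    def read(self, size=-1):
        if size is None or size < 0:
            # Return everything: buffered bytes plus the rest of the stream.
            data, self._pending = self._pending + self._handle.read(), b""
            return data
        # Fill the buffer until it holds `size` bytes or the stream ends.
        while len(self._pending) < size:
            chunk = self._handle.read(self._chunk_size)
            if not chunk:
                break
            self._pending += chunk
        data, self._pending = self._pending[:size], self._pending[size:]
        return data

    def readline(self):
        # Fetch chunks until the buffer contains a newline or the stream ends.
        while b"\n" not in self._pending:
            chunk = self._handle.read(self._chunk_size)
            if not chunk:
                break
            self._pending += chunk
        line, sep, self._pending = self._pending.partition(b"\n")
        return line + sep

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line


if __name__ == "__main__":
    raw = ReadOnlyHandle(b"aaa  1\nbbb 22\nccc333")
    for line in LineIterableHandle(raw, chunk_size=4):
        print(line)
```

      The wrapper keeps a small byte buffer so that read() and line iteration can be mixed on the same handle without losing data.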

            People

              Assignee: Robert Bradshaw (robertwb)
              Reporter: Robert Bradshaw (robertwb)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 10m