Details
Type: Bug
Status: Resolved
Priority: P3
Resolution: Fixed
Description
User issue: http://stackoverflow.com/questions/38811152/google-dataflow-python-pipeline-write-failure
Reproduction: use a TextFileSink and set the output location to gs://mybucket (bucket name only, no trailing slash); the write fails. Change it to gs://mybucket/ and it works.
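Until the SDK handles bucket-only locations, a caller can normalize the location before passing it to the sink. The following is a minimal sketch of that workaround in plain Python; `ensure_object_prefix` is a hypothetical helper name and is not part of apache_beam.

```python
def ensure_object_prefix(output_location):
    """Append '/' when a gs:// location names only a bucket.

    Hypothetical helper for illustration; not an apache_beam API.
    """
    scheme = 'gs://'
    if not output_location.startswith(scheme):
        # Leave local paths and other schemes untouched.
        return output_location
    remainder = output_location[len(scheme):]
    if '/' not in remainder:
        # Only a bucket name was given, e.g. gs://mybucket
        return output_location + '/'
    return output_location


fixed = ensure_object_prefix('gs://mybucket')
```

Locations that already contain an object prefix, such as gs://mybucket/ or gs://mybucket/out, pass through unchanged.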
The final output path is generated here:
https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L495
The same bucket-only output location seemingly works in the Java SDK.
Stack:
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/iobase.py", line 1058, in finish_bundle
    yield window.TimestampedValue(self.writer.close(), window.MAX_TIMESTAMP)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/fileio.py", line 601, in close
    self.sink.close(self.temp_handle)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/fileio.py", line 687, in close
    file_handle.close()
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcsio.py", line 617, in close
    self._flush_write_buffer()
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcsio.py", line 647, in _flush_write_buffer
    raise self.upload_thread.last_error # pylint: disable=raising-bad-type
HttpError: HttpError accessing <https://www.googleapis.com/resumable/upload/storage/v1/b/mybucket-temp-2016-08-08_21-29-39/o?uploadType=resumable&alt=json&name=f1cd7fe2-cf96-4d1d-bb5b-a6252cbcd342>: response: <
>, content <{
"error": {
"errors": [
],
"code": 404,
"message": "Not Found"
}
}
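The upload URL in the error targets a bucket named mybucket-temp-2016-08-08_21-29-39 rather than mybucket. One reading consistent with that URL is that a "-temp-<timestamp>" suffix is appended directly to the output location string, so a bucket-only location turns into the name of a bucket that does not exist. The sketch below illustrates that reading only; it is an assumption, not code taken from fileio.py, and both `split_gcs_path` and `naive_temp_path` are hypothetical helpers.

```python
import re


def split_gcs_path(path):
    """Split 'gs://bucket/object' into (bucket, object_name).

    Hypothetical helper for illustration; not an apache_beam API.
    """
    match = re.match(r'^gs://([^/]+)(?:/(.*))?$', path)
    if match is None:
        raise ValueError('not a GCS path: %r' % path)
    return match.group(1), match.group(2) or ''


def naive_temp_path(output_location, timestamp):
    # Assumed behavior: the suffix is appended to the raw string,
    # with no check that the location contains an object prefix.
    return output_location + '-temp-' + timestamp


ts = '2016-08-08_21-29-39'
bucket_only = split_gcs_path(naive_temp_path('gs://mybucket', ts))
with_slash = split_gcs_path(naive_temp_path('gs://mybucket/', ts))
```

Under this assumption, `bucket_only` names the bucket mybucket-temp-2016-08-08_21-29-39 with an empty object name, which matches the 404 above, while `with_slash` keeps the bucket mybucket and places the suffix in the object name.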