Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: 0.12.1
Fix Version/s: None
Kernel: 4.4.95.x86_64
Python: 2.7.5
Description
Running the pytorch_hello_world.py script from https://github.com/uber/petastorm.git fails with the TypeError shown below.
It seems that pyarrow.lib.HadoopFileSystem._connect requires a unicode argument, but on Python 2 the argument passed in is always a str. Adding a unicode() conversion would make sure that the argument is of unicode type.
Traceback (most recent call last):
  File "pytorch_hello_world.py", line 31, in <module>
    pytorch_hello_world()
  File "pytorch_hello_world.py", line 25, in pytorch_hello_world
    with DataLoader(make_reader(dataset_url)) as train_loader:
  File "/usr/lib/python2.7/site-packages/petastorm/reader.py", line 132, in make_reader
    resolver = FilesystemResolver(dataset_url, hdfs_driver=hdfs_driver)
  File "/usr/lib/python2.7/site-packages/petastorm/fs_utils.py", line 83, in __init__
    self._filesystem = connector.connect_to_either_namenode(namenodes)
  File "/usr/lib/python2.7/site-packages/petastorm/hdfs/namenode.py", line 266, in connect_to_either_namenode
    return HAHdfsClient(cls, list_of_namenodes)
  File "/usr/lib/python2.7/site-packages/petastorm/hdfs/namenode.py", line 224, in __init__
    self._do_connect()
  File "/usr/lib/python2.7/site-packages/petastorm/hdfs/namenode.py", line 233, in _do_connect
    self._connector_cls._try_next_namenode(self._index_of_nn, self._list_of_namenodes)
  File "/usr/lib/python2.7/site-packages/petastorm/hdfs/namenode.py", line 289, in _try_next_namenode
    cls.hdfs_connect_namenode(urlparse('hdfs://' + str(host or 'default')))
  File "/usr/lib/python2.7/site-packages/petastorm/hdfs/namenode.py", line 250, in hdfs_connect_namenode
    return pyarrow.hdfs.connect(url.hostname or 'default', url.port or 8020, driver=driver)
  File "/usr/lib64/python2.7/site-packages/pyarrow/hdfs.py", line 209, in connect
    extra_conf=extra_conf)
  File "/usr/lib64/python2.7/site-packages/pyarrow/hdfs.py", line 39, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow/io-hdfs.pxi", line 97, in pyarrow.lib.HadoopFileSystem._connect
TypeError: Expected unicode, got str
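A minimal sketch of the unicode() conversion proposed in the description, assuming the fix is applied on the caller side before the host reaches pyarrow. The helper names to_text and namenode_host_port are hypothetical and are not part of petastorm or pyarrow; the URL parsing only mirrors the hdfs_connect_namenode frame in the traceback, and this is not necessarily how the issue was resolved upstream.

```python
import sys

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (only evaluated on Python 2)


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return `value` as a text (unicode) string."""
    if isinstance(value, bytes):
        # On Python 2, `str` is `bytes`, so plain str hosts are decoded here.
        return value.decode(encoding)
    return text_type(value)


def namenode_host_port(namenode, default_port=8020):
    """Hypothetical helper: parse 'host:port' into (text host, port).

    Mirrors the traceback: 'default' and 8020 are used as fallbacks.
    """
    url = urlparse("hdfs://" + str(namenode or "default"))
    return to_text(url.hostname or "default"), url.port or default_port
```

With such a helper, the host passed on to pyarrow.hdfs.connect would be a unicode object on Python 2 and an ordinary str on Python 3, which is the type the _connect frame in the traceback expects.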
Issue Links
- duplicates ARROW-4413 [Python] pyarrow.hdfs.connect() failing (Resolved)
- links to