AVRO-959

Python implementation calls seek on input and cannot read Avro files from a stream

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.6.1
    • Fix Version/s: None
    • Component/s: python
    • Labels:
      None

      Description

      The Python implementation of Avro calls seek on the input file handle, which prevents the input from being a stream (stdin, Hadoop streaming, etc.).

      
      ack -a -i seek
      src/avro/datafile.py
      109:      # seek to the end of the file and prepare for writing
      110:      writer.seek(0, 2)
      190:    DataFileReader.seek(long). Forces the end of the current block,
      261:    self.reader.seek(0, 2)
      263:    self.reader.seek(remember_pos)
      270:    # seek to the beginning of the file to get magic block
      271:    self.reader.seek(0, 0) 
      316:    return True. Otherwise, seek back to where we started and return False.
      320:      self.reader.seek(-SYNC_SIZE, 1)
      
      src/avro/io.py
      148:    reader is a Python object on which we can call read, seek, and tell.
      267:    self.reader.seek(self.reader.tell() + n)
      
      src/avro/txipc.py
      217:    request.content.seek(0, 0)
      
      test/test_io.py
      137:    writer.seek(0)
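
One of the calls listed above (io.py line 267) uses seek only to skip forward by n bytes. As a rough illustration of why that particular call does not strictly need a seekable input, here is a minimal sketch that skips forward using read() alone. The name `skip_bytes` and the `chunk_size` parameter are hypothetical and not part of Avro; this is not a proposed patch, and it does not address the backward seeks in datafile.py.

```python
import io


def skip_bytes(reader, n, chunk_size=8192):
    """Advance `reader` by up to n bytes using read() only.

    Hypothetical helper (not part of Avro). Returns the number of
    bytes actually skipped, which is less than n if EOF is reached.
    """
    remaining = n
    while remaining > 0:
        # Read and discard at most chunk_size bytes at a time.
        chunk = reader.read(min(chunk_size, remaining))
        if not chunk:
            break  # end of stream before n bytes were skipped
        remaining -= len(chunk)
    return n - remaining


if __name__ == "__main__":
    stream = io.BytesIO(b"abcdefghij")
    skip_bytes(stream, 4)
    print(stream.read(2))
```

Reading and discarding works on pipes and sockets, at the cost of actually transferring the skipped bytes.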
      

          Activity

          Harsh J added a comment -

          It is possible to seek on standard input if (and only if) a file is used as its source:

          ➜  ~  cat foo.py
          import sys
          
          inp = sys.stdin
          inp.seek(0)
          ➜  ~  cat foo.py | python foo.py
          Traceback (most recent call last):
            File "foo.py", line 4, in <module>
              inp.seek(0)
          IOError: [Errno 29] Illegal seek
          ➜  ~  python foo.py < foo.py
          ➜  ~  python foo.py <(cat foo.py) 
          ➜  ~  
          

          We could probably document and leverage this?
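
A caller could check for this condition before handing a stream to a reader that needs seek. The following is a minimal sketch, assuming Python 3, where file objects expose `seekable()`; the helper names `can_seek` and `demo` are hypothetical and not part of Avro.

```python
import os
import tempfile


def can_seek(stream):
    """Return True if seek() is expected to work on this stream."""
    try:
        return stream.seekable()
    except AttributeError:
        # File-like object without seekable(): assume it cannot seek.
        return False


def demo():
    """Compare a pipe (like `cat foo.py | python foo.py`) with a real file."""
    r_fd, w_fd = os.pipe()
    with os.fdopen(r_fd, "rb") as pipe_reader, os.fdopen(w_fd, "wb"):
        pipe_ok = can_seek(pipe_reader)
    with tempfile.TemporaryFile() as real_file:
        file_ok = can_seek(real_file)
    return pipe_ok, file_ok


if __name__ == "__main__":
    print(demo())
```

In a script, `can_seek(sys.stdin.buffer)` would distinguish the piped case from the redirected-file case shown in the transcript above.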

          Scott Nottingham added a comment -

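          What you are trying to do can be accomplished by buffering standard input into a seekable in-memory object:

          import sys
          import cStringIO  # Python 2; on Python 3 use io.BytesIO and sys.stdin.buffer

          # Read all of stdin into memory, then rewind so reads start at the beginning.
          file_like_obj = cStringIO.StringIO()
          file_like_obj.write(sys.stdin.read())
          file_like_obj.seek(0)

          Now you can pass file_like_obj to Avro's reader in place of sys.stdin.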
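
For reference, the same buffering idea can be sketched for Python 3, where `cStringIO` no longer exists. This assumes `io.BytesIO` as the buffer and a binary input stream such as `sys.stdin.buffer`; the helper name `buffer_stream` and its `chunk_size` parameter are hypothetical and not part of Avro.

```python
import io
import os


def buffer_stream(stream, chunk_size=64 * 1024):
    """Copy a non-seekable binary stream into a seekable in-memory buffer.

    Hypothetical helper (not part of Avro). The whole input is held in
    memory, so this only suits inputs that fit in RAM.
    """
    buf = io.BytesIO()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        buf.write(chunk)
    buf.seek(0)  # rewind so the consumer starts at the first byte
    return buf


if __name__ == "__main__":
    # Simulate piped input with an OS pipe, which does not support seek.
    r_fd, w_fd = os.pipe()
    with os.fdopen(w_fd, "wb") as writer:
        writer.write(b"example bytes")
    with os.fdopen(r_fd, "rb") as reader:
        seekable_copy = buffer_stream(reader)
    seekable_copy.seek(0, 2)  # seeking to the end now works
    print(seekable_copy.tell())
```

With real input, the call would be `buffer_stream(sys.stdin.buffer)`, and the returned object could then be passed wherever a seekable file handle is required. The trade-off is memory: a large input is read completely before any record is decoded.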


            People

            • Assignee: Unassigned
            • Reporter: Sean Jensen-Grey
            • Votes: 2
            • Watchers: 3
