Apache Arrow / ARROW-18423

[Python] Expose reading a schema from an IPC message

Details

    Description

      PyArrow currently does not support reading an Arrow schema from an IPC message:

      https://github.com/apache/arrow/blob/80b389efe902af376a85a8b3740e0dbdc5f80900/python/pyarrow/ipc.pxi#L1094

       

      We'd like to consume an Arrow IPC stream like the following:

       

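      import pyarrow

      # The first message in the stream carries the schema.
      schema_msg = pyarrow.ipc.read_message(next(result_iter).data)
      schema = pyarrow.ipc.read_schema(schema_msg)
      # Every remaining message is a record batch, decoded against that schema.
      for batch_data in result_iter:
          batch_msg = pyarrow.ipc.read_message(batch_data.data)
          batch = pyarrow.ipc.read_record_batch(batch_msg, schema)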

       

      The associated (tiny) PR on GitHub implements this reading by binding the existing C++ function.

            People

              akohn Andre Kohn
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h