Apache Arrow / ARROW-2787

[Python] Memory issue passing table from Python to C++ via Cython

    Details

      Description

      I wanted to create a simple example of reading a table in Python and passing it to C++, but either I'm doing something wrong or there is a memory issue. When the table reaches C++ and I print the column names, the output also contains a lot of junk, including what looks like pydoc text. Let me know if you need any more info. Thanks!

      demo.py

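      import pandas as pd
      import pyarrow as pa
      from absl import app
      from psy.automl import cyth

      def main(argv):
        # Build a small two-column DataFrame and convert it to an Arrow table.
        sup = pd.DataFrame({
          'int': [1, 2],
          'str': ['a', 'b']
        })
        table = pa.Table.from_pandas(sup)
        # Pass the table to the C++ function through the Cython wrapper.
        cyth.c_t(table)

      if __name__ == '__main__':
        app.run(main)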
      

      cyth.pyx

      import pandas as pd
      import pyarrow as pa
      from pyarrow.lib cimport *

      cdef extern from "cyth.h" namespace "psy":
        void t(shared_ptr[CTable])

      def c_t(obj):
        # These prints work:
        # for i in range(obj.num_columns):
        #   print(obj.column(i).name)
        # Unwrap the Python-level table into the underlying C++ shared_ptr.
        cdef shared_ptr[CTable] tbl = pyarrow_unwrap_table(obj)
        t(tbl)
      

      cyth.h

      #include <iostream>
      #include <memory>
      #include <string>
      #include "arrow/api.h"
      #include "arrow/python/api.h"
      #include "Python.h"

      namespace psy {

      void t(std::shared_ptr<arrow::Table> pytable) {
        // This works
        std::cout << "NUM" << pytable->num_columns();

        // This prints a lot of garbage
        for (int i = 0; i < pytable->num_columns(); i++) {
          std::cout << pytable->column(i)->name();
        }
      }

      }  // namespace psy
      

       

              People

              • Assignee: Antoine Pitrou (apitrou)
              • Reporter: Joseph Toth (weazelb0y)