We're having a problem with a memory leak in a Ruby script that processes many CSV files. I have written some short scripts to demonstrate the problem: https://gist.github.com/stenlarsson/60b1e4e99416738b41ee30e7ba294214
The first script, arrow_test_csv.rb, creates a 184 MB CSV file for testing.
The second script, arrow_memory_leak.rb, then loads the CSV file 10 times using Arrow. It uses the get_process_mem gem to print the memory usage before and after each iteration. It also invokes the garbage collector on each iteration to rule out Ruby holding on to objects as the cause. This is what it prints on my MacBook Pro using Arrow 6.0.0:
The third script, arrow_memory_leak.py, is a Python implementation of the same script. This shows that the problem is not in the Ruby bindings:
I have also tested Arrow 5.0.0 and it has the same problem.
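For anyone who does not want to open the gist, the measurement loop described above looks roughly like the sketch below. This is not the gist's actual code: `rss_mb` and `measure_memory` are hypothetical helper names, `rss_mb` reads the resident set size from `/proc` or `ps` as a dependency-free stand-in for the get_process_mem gem, and the CSV file name in the commented usage is assumed.

```ruby
# Resident set size of this process in MB.
# Stand-in for the get_process_mem gem so the sketch needs no gems:
# reads /proc on Linux and falls back to `ps` elsewhere (e.g. macOS).
def rss_mb
  if File.readable?("/proc/self/status")
    File.read("/proc/self/status")[/^VmRSS:\s+(\d+)/, 1].to_i / 1024.0
  else
    `ps -o rss= -p #{Process.pid}`.to_i / 1024.0
  end
end

# Runs the given block `iterations` times. After each run the garbage
# collector is invoked, and the memory usage before and after is printed.
# Returns an array of [before_mb, after_mb] pairs.
def measure_memory(iterations: 10)
  Array.new(iterations) do |i|
    before = rss_mb
    yield      # the result is discarded, so no Ruby reference keeps it alive
    GC.start
    after = rss_mb
    puts format("iteration %2d: %.1f MB -> %.1f MB", i + 1, before, after)
    [before, after]
  end
end

# Assumed usage with the red-arrow gem (file name is hypothetical):
#   require "arrow"
#   measure_memory { Arrow::Table.load("test.csv") }
```

If the "after" value keeps climbing across iterations even though the garbage collector runs each time, the memory is most likely being held outside Ruby's heap.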