Apache Arrow / ARROW-12506

[Python] Improve modularity of pyarrow codebase to speed up compile time

Details

    • Improvement
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • None
    • 4.0.0
    • Python

    Description

      Some pyarrow modules end up being fairly expensive to compile because they are built mostly by including other .pxi / .pxd files.

      As a result, whenever one of those files changes, a large module has to be recompiled, which slows down development when experimenting. It seems not uncommon that, after a change, recompiling libarrow takes less time than recompiling pyarrow.

      It would be convenient to split those into separate modules that compile to separate object files, so that the compiler can rebuild smaller chunks at a time. When a change is made, we would not have to recompile the whole `lib.pyx`, only the module the change is isolated to.

      The goal is to allow faster iteration on pyarrow by reducing the time spent waiting for Cython compilation after each change.
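
      To illustrate the rebuild problem described above, here is a minimal sketch in plain Python. It does not reflect the actual pyarrow build system: the file layouts and the `_io.pyx` module are hypothetical, and the helper names (`included_files`, `modules_to_rebuild`) are invented for this example. It only models Cython's textual `include` directive; in a real split, modules would also share declarations through .pxd files, and a change to a shared .pxd could still trigger rebuilds of its dependents.

```python
import re
from typing import Dict, List, Set

# Matches Cython textual-include directives such as: include "array.pxi"
INCLUDE_RE = re.compile(r'^\s*include\s+["\']([^"\']+)["\']', re.MULTILINE)


def included_files(name: str, sources: Dict[str, str]) -> Set[str]:
    """Return every file pulled into `name` via `include`, transitively."""
    seen: Set[str] = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for inc in INCLUDE_RE.findall(sources.get(current, "")):
            if inc not in seen:
                seen.add(inc)
                stack.append(inc)
    return seen


def modules_to_rebuild(changed: str, modules: List[str],
                       sources: Dict[str, str]) -> List[str]:
    """List the extension modules (.pyx) whose compiled unit contains `changed`."""
    return sorted(
        m for m in modules
        if changed == m or changed in included_files(m, sources)
    )


# Hypothetical "before" layout: one big module that includes everything.
monolithic = {
    "lib.pyx": 'include "array.pxi"\ninclude "table.pxi"\ninclude "io.pxi"\n',
    "array.pxi": "",
    "table.pxi": "",
    "io.pxi": "",
}

# Hypothetical "after" layout: the I/O code lives in its own module.
split = {
    "lib.pyx": 'include "array.pxi"\ninclude "table.pxi"\n',
    "_io.pyx": 'include "io.pxi"\n',
    "array.pxi": "",
    "table.pxi": "",
    "io.pxi": "",
}

# Editing io.pxi forces the whole of lib.pyx to rebuild in the first layout,
# but only the smaller _io.pyx in the second.
before = modules_to_rebuild("io.pxi", ["lib.pyx"], monolithic)
after = modules_to_rebuild("io.pxi", ["lib.pyx", "_io.pyx"], split)
```

      In this toy model the split layout confines a change in `io.pxi` to the smaller module, which is the effect the proposal is after.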

          People

            Assignee: amol- Alessandro Molina
            Reporter: amol- Alessandro Molina
            Votes: 0
            Watchers: 4

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 6h 10m