Apache Arrow / ARROW-4637

[Python] Avoid importing Pandas unless necessary

Details

    Description

      Importing PyArrow takes more than twice as long when Pandas is installed:

      $ time python -c "import pyarrow"
      
      real	0m0,360s
      user	0m0,305s
      sys	0m0,037s
      
      $ time python -c "import sys; sys.modules['pandas'] = None; import pyarrow"
      
      real	0m0,144s
      user	0m0,124s
      sys	0m0,020s
      

      We should only import Pandas when necessary, e.g. when asked to ingest or create Pandas data.

          People

            wesm Wes McKinney
            apitrou Antoine Pitrou

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 6h 20m