  1. Apache Arrow
  2. ARROW-4637

[Python] Avoid importing Pandas unless necessary

    Details

      Description

      Importing PyArrow takes more than twice as long when Pandas is installed:

      $ time python -c "import pyarrow"
      
      real	0m0,360s
      user	0m0,305s
      sys	0m0,037s
      
      $ time python -c "import sys; sys.modules['pandas'] = None; import pyarrow"
      
      real	0m0,144s
      user	0m0,124s
      sys	0m0,020s
      

      We should only import Pandas when necessary, e.g. when asked to ingest or create Pandas data.
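
One way to defer the import is a small proxy object that loads the module on first attribute access. This is a minimal sketch, not how PyArrow resolves the issue: the `LazyModule` class and its attribute names are hypothetical, and the standard-library module `colorsys` stands in for Pandas so the example runs without Pandas installed.

```python
import importlib


class LazyModule:
    """Hypothetical proxy that imports a module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None  # nothing imported yet

    def _load(self):
        # Import once, then reuse the cached module object.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return self._module

    def __getattr__(self, attr):
        # Only called for attributes not found on the proxy itself,
        # so _name, _module and _load never reach this method.
        return getattr(self._load(), attr)


# Stand-in for something like "pd = LazyModule('pandas')".
lazy = LazyModule("colorsys")
loaded_before = lazy._module is not None
hsv = lazy.rgb_to_hsv(1.0, 0.0, 0.0)  # first use triggers the import
loaded_after = lazy._module is not None
```

With a proxy like this, creating the object costs nothing at import time, and the import cost is paid only by code paths that actually touch the wrapped module.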

              People

              • Assignee:
                wesm Wes McKinney
                Reporter:
                apitrou Antoine Pitrou
              • Votes:
                0
                Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  4h