Apache Arrow / ARROW-10255

[JS] Reorganize imports and exports to be more friendly to ESM tree-shaking

Details

    Description

      Presently, most of our public classes can't easily be tree-shaken by library consumers. This is a problem for libraries that only need parts of Arrow, because their bundles still include code they never call.

      For example, the vis.gl projects have an integration test that imports three of our simpler classes and tests the resulting bundle size:

      import {Schema, Field, Float32} from 'apache-arrow';
      
      // | Bundle Size          | Compressed
      // | 202KB (207112 bytes) | 45KB (46618 bytes)
      

      We can help solve this with the following changes:

      • Add "sideEffects": false to our ESM package.json
      • Reorganize our imports to only include what's needed
      • Eliminate or move some static/member methods to standalone exported functions
      • Wrap the utf8 util's Node Buffer detection in eval so Webpack doesn't compile in its own Buffer shim
      • Remove flatbuffers namespaces from the generated TS, because these defeat Webpack's tree-shaking
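
To illustrate the Buffer item in the list above, here is a minimal sketch of hiding Node's Buffer lookup behind an indirect eval, so that a bundler scanning for the bare `Buffer` identifier finds nothing to shim. The names `BufferLike`, `detectNodeBuffer`, and `encodeUtf8` are hypothetical and are not Arrow's actual utf8 utility. The sketch also assumes the runtime provides a global `TextEncoder`.

```typescript
// Hypothetical sketch; not Arrow's real utf8 utility.
type BufferLike = { from(input: string, encoding: string): Uint8Array };

function detectNodeBuffer(): BufferLike | null {
    try {
        // Calling eval through an alias makes it an indirect (global-scope) eval.
        // The identifier `Buffer` only appears inside a string, so static
        // analysis in a bundler does not see a reference to it.
        const indirectEval = eval;
        const candidate = indirectEval("typeof Buffer === 'function' ? Buffer : null");
        return candidate && typeof candidate.from === 'function' ? candidate : null;
    } catch {
        // eval can be blocked, for example by a Content-Security-Policy.
        return null;
    }
}

const NodeBuffer = detectNodeBuffer();
const fallbackEncoder = new TextEncoder();

function encodeUtf8(value: string): Uint8Array {
    if (NodeBuffer) {
        const buf = NodeBuffer.from(value, 'utf8');
        // Node may hand back a view into a shared pool, so respect the offset.
        return new Uint8Array(buf.buffer, buf.byteOffset, buf.byteLength);
    }
    return fallbackEncoder.encode(value);
}
```

The trade-off is that eval may be unavailable under a strict Content-Security-Policy, which is why the sketch falls back to `TextEncoder` whenever the lookup throws or finds nothing.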

      Candidate functions for removal/moving to standalone functions:

      • Schema.new, Schema.from, Schema.prototype.compareTo
      • Field.prototype.compareTo
      • Type.prototype.compareTo
      • Table.new, Table.from
      • Column.new
      • Vector.new, Vector.from
      • RecordBatchReader.from
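
The move from member methods to standalone functions could look roughly like the sketch below. `Field` and `Schema` here are simplified stand-ins, not Arrow's real classes, and `compareFields` / `compareSchemas` are hypothetical names chosen for illustration. The idea is that bundlers generally keep every method of a class that is used at all, whereas an unused top-level export can be dropped.

```typescript
// Simplified, hypothetical stand-ins for the real classes.
class Field {
    constructor(
        public readonly name: string,
        public readonly typeName: string,
        public readonly nullable: boolean = false,
    ) {}
}

class Schema {
    constructor(public readonly fields: readonly Field[]) {}
}

// Standalone replacement for a Field.prototype.compareTo-style method.
function compareFields(a: Field, b: Field): boolean {
    return a.name === b.name && a.typeName === b.typeName && a.nullable === b.nullable;
}

// Standalone replacement for a Schema.prototype.compareTo-style method.
// A consumer that never imports this function does not pay for it.
function compareSchemas(a: Schema, b: Schema): boolean {
    return a.fields.length === b.fields.length &&
        a.fields.every((field, i) => compareFields(field, b.fields[i]));
}
```

A consumer would then call `compareSchemas(a, b)` instead of `a.compareTo(b)`, importing the function only where it is needed.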

      After applying a few of the above changes to the Schema and flatbuffers files, I was able to reduce the bundle size of the vis.gl test by roughly 90%:

      // | Bundle Size        | Compressed
      // | 24KB (24942 bytes) | 6KB (6154 bytes)
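
As a quick check of the size reduction, the small sketch below recomputes it from the figures reported in this issue, assuming the parenthesized numbers are byte counts. `percentReduction` is a hypothetical helper written for this check.

```typescript
// Percentage by which `after` is smaller than `before`.
function percentReduction(before: number, after: number): number {
    return (1 - after / before) * 100;
}

// Figures taken from the two measurements quoted in this issue.
const rawReduction = percentReduction(207112, 24942);
const compressedReduction = percentReduction(46618, 6154);

console.log(`bundle: ${rawReduction.toFixed(1)}%`);
console.log(`compressed: ${compressedReduction.toFixed(1)}%`);
```

Both values come out in the high 80s (about 88% uncompressed and about 87% compressed), which is consistent with the rounded figure quoted above.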
      

            People

              paul.e.taylor Paul Taylor

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 50m