Beam / BEAM-7890

Rework the dependency stack to ensure Beam stays lightweight and embeddable




      Currently, the entry cost of Beam is more than 30M of jars:


      -rw-r--r-- 1 rmannibucau rmannibucau  13M Feb 17 11:45 beam-vendor-grpc-1_13_1-0.2.jar
      -rw-r--r-- 1 rmannibucau rmannibucau 8.7M Aug  5 10:22 beam-sdks-java-core-2.14.0.jar
      -rw-r--r-- 1 rmannibucau rmannibucau 2.6M Aug  5 10:25 beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar
      -rw-r--r-- 1 rmannibucau rmannibucau 2.6M Feb 17 11:45 beam-vendor-guava-20_0-0.1.jar
      -rw-r--r-- 1 rmannibucau rmannibucau 1.4M Aug  5 10:21 beam-model-pipeline-2.14.0.jar
      -rw-r--r-- 1 rmannibucau rmannibucau 825K Aug  5 10:25 beam-model-fn-execution-2.14.0.jar
      -rw-r--r-- 1 rmannibucau rmannibucau 470K Aug  5 10:21 beam-model-job-management-2.14.0.jar
      -rw-r--r-- 1 rmannibucau rmannibucau 446K Aug  5 10:25 beam-runners-core-construction-java-2.14.0.jar
      -rw-r--r-- 1 rmannibucau rmannibucau 378K Aug  5 10:24 beam-runners-core-java-2.14.0.jar
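
      As a quick sanity check of the "more than 30M" figure, the sizes in the listing can be added up. The sketch below is only illustrative: the class and method names are made up for this example, the sizes are copied by hand from the listing, and it assumes `M` means 1024K as in `ls -lh` output.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class BeamEntryCost {
    // Jar sizes in K, copied by hand from the listing above (assumption: 1M = 1024K).
    static Map<String, Double> jarSizesKb() {
        Map<String, Double> sizes = new LinkedHashMap<>();
        sizes.put("beam-vendor-grpc-1_13_1-0.2.jar", 13 * 1024.0);
        sizes.put("beam-sdks-java-core-2.14.0.jar", 8.7 * 1024.0);
        sizes.put("beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar", 2.6 * 1024.0);
        sizes.put("beam-vendor-guava-20_0-0.1.jar", 2.6 * 1024.0);
        sizes.put("beam-model-pipeline-2.14.0.jar", 1.4 * 1024.0);
        sizes.put("beam-model-fn-execution-2.14.0.jar", 825.0);
        sizes.put("beam-model-job-management-2.14.0.jar", 470.0);
        sizes.put("beam-runners-core-construction-java-2.14.0.jar", 446.0);
        sizes.put("beam-runners-core-java-2.14.0.jar", 378.0);
        return sizes;
    }

    // Sums all sizes and converts the total from K to M.
    static double totalMb(Map<String, Double> sizesKb) {
        double totalKb = 0;
        for (double kb : sizesKb.values()) {
            totalKb += kb;
        }
        return totalKb / 1024.0;
    }

    public static void main(String[] args) {
        // Locale.ROOT keeps the decimal separator stable across machines.
        System.out.println(String.format(Locale.ROOT, "total: %.1fM", totalMb(jarSizesKb())));
    }
}
```

      The total comes out slightly above 30M, and the vendored gRPC jar plus the Java SDK core account for roughly two thirds of it.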

      Because Beam is embedded by nature (it is generally shipped with the job), it should stay as lightweight as possible. I see a few actions that could help make Beam easy to integrate again:


      1. Make the whole polyglot layer optional and excludable. Many jobs never need it, and the additional weight is a clear regression on the packaging side of Beam.
      2. Vendored and SDK dependencies are generally a luxury (who needs a library to do a new ArrayList<>() in 2019?), so most of the dependencies can be dropped, and vendoring can be made very lightweight, if not optional, for the Java SDK core.
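
      To illustrate the second point, here is a minimal sketch of JDK-only stand-ins for a few common Guava helpers (`Lists.newArrayList`, `Preconditions.checkNotNull`, `Joiner`). This is an assumption about the kind of usage that could be replaced, not an audit of what the Beam SDK actually calls; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class JdkOnlyHelpers {
    // Stand-in for Guava's Lists.newArrayList(...): a mutable copy of the items.
    static List<String> mutableListOf(String... items) {
        return new ArrayList<>(Arrays.asList(items));
    }

    // Stand-in for Guava's Preconditions.checkNotNull(value, message).
    static String requireName(String name) {
        return Objects.requireNonNull(name, "name must not be null");
    }

    // Stand-in for Guava's Joiner.on(",").join(parts), available since Java 8.
    static String joinWithComma(List<String> parts) {
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        List<String> runners = mutableListOf("spark", "flink");
        runners.add(requireName("direct"));
        System.out.println(joinWithComma(runners));
    }
}
```

      None of this needs a third-party jar, which is the point being made: helpers of this kind do not justify shipping a 2.6M vendored Guava with every job.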

      In the end, a reasonable limit for a runner like Spark (not the direct runner, which reimplements all the logic by design) would be around 5M of dependencies, IMHO.
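
      A budget like this could be enforced with a small check in the build. The sketch below is hypothetical: the `DependencyBudget` class, the default `lib` directory, and the idea of failing the build are assumptions for illustration, and the 5M figure is the one suggested above, read as 5 * 1024 * 1024 bytes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class DependencyBudget {
    // Suggested budget from the text above: about 5M of dependencies.
    static final long BUDGET_BYTES = 5L * 1024 * 1024;

    // Sums the size of every *.jar file directly inside libDir (no recursion).
    static long totalJarBytes(Path libDir) {
        try (Stream<Path> files = Files.list(libDir)) {
            return files
                .filter(p -> p.getFileName().toString().endsWith(".jar"))
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                })
                .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static boolean withinBudget(long totalBytes, long budgetBytes) {
        return totalBytes <= budgetBytes;
    }

    public static void main(String[] args) {
        // Hypothetical default: a "lib" directory holding the runner's dependencies.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        long total = totalJarBytes(libDir);
        System.out.println(total + " bytes of jars in " + libDir);
        if (!withinBudget(total, BUDGET_BYTES)) {
            System.err.println("dependency budget exceeded");
            System.exit(1);
        }
    }
}
```

      Run against a directory containing the jars listed earlier, this check would fail, since they total about six times the proposed budget.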




            • Assignee:
              romain.manni-bucau Romain Manni-Bucau


              • Created: