Beam / BEAM-913

Create the skeleton for a Dataset API Spark runner

    Details

    • Type: Wish
    • Status: Resolved
    • Priority: P2
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: Not applicable
    • Component/s: runner-spark
    • Labels: None

      Description

      As discussed on the Beam dev list, we should have a second Spark runner based on the Dataset API.
      As part of this, the Spark runner will have three modules: runner-spark-core, runner-spark-rdd (Spark 1.6.x), and runner-spark-dataset (Spark 2.x).

      This work should go in a feature branch (runner-spark2 already exists).

      This ticket covers creating a skeleton for the module structure described above and porting everything that can easily be ported from the current runner.

      Some of the work is already in the current feature branch, but a lot has changed since it was last updated.

              People

              • Assignee: Unassigned
              • Reporter: Amit Sela (amitsela)
              • Votes: 0
              • Watchers: 1
