Spark / SPARK-22658

SPIP: TensorFlowOnSpark as a Scalable Deep Learning Library for Apache Spark

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: ML
    • Labels: None
    • Flags: Important

      Description

      TensorFlowOnSpark (TFoS) was released on GitHub to enable distributed TensorFlow training and inference on Apache Spark clusters. TFoS is designed to:

      • Easily migrate all existing TensorFlow programs with minimal code changes;
      • Support all TensorFlow functionality: synchronous/asynchronous training, model/data parallelism, inference, and TensorBoard;
      • Integrate easily with existing data processing pipelines (e.g., Spark SQL) and machine learning algorithms (e.g., MLlib);
      • Deploy easily in the cloud or on-premises: CPU and GPU, Ethernet and InfiniBand.

      We propose to merge TFoS into Apache Spark as a scalable deep learning library to:

      • Make deep learning easy for the Apache Spark community: provide a familiar pipeline API for training and inference; enable TensorFlow training/inference on existing Spark clusters.
      • Further simplify the data scientist experience: ensure compatibility between Apache Spark and TFoS; reduce the number of installation steps.
      • Help Apache Spark evolve for deep learning: establish a design pattern for additional frameworks (e.g., Caffe, CNTK); support Structured Streaming for DL training/inference.

            People

            • Assignee: Unassigned
            • Reporter: Andy Feng (afeng)
            • Shepherd: Andy Feng
            • Votes: 0
            • Watchers: 9

              Dates

              • Created:
                Updated:

                Time Tracking

                • Estimated: 336h
                • Remaining: 336h
                • Logged: Not Specified