  1. Spark
  2. SPARK-22658

SPIP: TensorFlowOnSpark as a Scalable Deep Learning Lib of Apache Spark

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: ML
    • Important

    Description

      TensorFlowOnSpark (TFoS) was released on GitHub for distributed TensorFlow training and inference on Apache Spark clusters. TFoS is designed to:

      • Migrate existing TensorFlow programs easily, with minimal code changes;
      • Support all TensorFlow functionality: synchronous/asynchronous training, model/data parallelism, inference, and TensorBoard;
      • Integrate easily with existing data processing pipelines (e.g. Spark SQL) and machine learning algorithms (e.g. MLlib);
      • Deploy easily in the cloud or on-premises: CPU and GPU, Ethernet and InfiniBand.

      We propose to merge TFoS into Apache Spark as a scalable deep learning library to:

      • Make deep learning easy for the Apache Spark community: provide a familiar pipeline API for training and inference; enable TensorFlow training/inference on existing Spark clusters.
      • Further simplify the data scientist experience: ensure compatibility between Apache Spark and TFoS; reduce installation steps.
      • Help Apache Spark evolve for deep learning: establish a design pattern for additional frameworks (e.g. Caffe, CNTK); support structured streaming for DL training/inference.

      Attachments

        1. SPIP_ TensorFlowOnSpark.pdf
          180 kB
          Andy Feng

          People

            Unassigned
            Andy Feng (afeng)
            Andy Feng
            Votes: 0
            Watchers: 10

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 336h
                Remaining: 336h
                Logged: Not Specified