  1. SystemDS
  2. SYSTEMDS-1160

Enable Prefetching of Mini-Batches

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • SystemML 1.0.0
    • None
    • None
    • None

    Description

      For efficient training of large deep learning models, a mini-batch training approach is preferred. On SystemML with the Spark backend, this currently means extracting each mini-batch from an RDD (via a PartitionPruningRDD; see SYSTEMML-951) and then running entirely single-node instructions on that mini-batch. Although fetching partitions has been made efficient, the training loop still has to pause after every training step to fetch the next partition. For large models, training time is already an issue even on GPUs with saturated input pipelines, so this pause adds directly to an already long training time. We therefore need to prefetch mini-batches in parallel with the training loop. One possibility is an input queue that is filled by a prefetch thread and that in turn feeds the training loop.
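
The queue-based idea in the description could look roughly like the following. This is a minimal, language-neutral sketch in Python and not SystemML code: `prefetch`, `fetch_batches`, `train`, and the `capacity` parameter are hypothetical names chosen for illustration, and the list returned by `fetch_batches` merely stands in for a partition fetched from the RDD.

```python
import queue
import threading


def prefetch(batches, capacity=2):
    """Yield items from `batches` while a background thread fetches ahead.

    At most `capacity` batches are buffered, so memory use stays bounded.
    """
    buf = queue.Queue(maxsize=capacity)

    def producer():
        # Runs in the prefetch thread: pull batches and hand them over.
        try:
            for batch in batches:
                buf.put(("batch", batch))  # blocks while the queue is full
            buf.put(("done", None))
        except Exception as exc:
            # Forward fetch errors to the consumer instead of losing them.
            buf.put(("error", exc))

    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, payload = buf.get()  # blocks only if no batch is ready yet
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


def fetch_batches(n):
    """Hypothetical stand-in for fetching n mini-batches from storage."""
    for i in range(n):
        yield [i, i + 1]


def train(n):
    """Hypothetical training loop: consumes batches as they become ready."""
    total = 0
    for batch in prefetch(fetch_batches(n)):
        total += sum(batch)  # placeholder for one training step
    return total
```

With this structure the training step for batch i overlaps with the fetch of batch i+1, and the bounded queue keeps the prefetch thread from running arbitrarily far ahead. One caveat of the sketch: if the consumer stops early, the producer stays blocked on a full queue until the process exits, so a real implementation would need an explicit shutdown signal.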

            People

              Assignee: Unassigned
              Reporter: Mike Dusenberry (dusenberrymw)