Details
Type: Epic
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: SystemML 0.10, SystemML 0.11, SystemML 0.12, SystemML 0.13, SystemML 1.0.0
Fix Version/s: None
Component/s: Deep Learning
Description
This epic covers the addition of deep learning to SystemML, including:
- Core DML layer abstractions for deep (convolutional, recurrent) neural nets, with a simple forward/backward API: affine, convolution (starting with 2D), max-pooling, non-linearities (relu, sigmoid, softmax), dropout, and loss functions (see the sketch after this list).
- Modularized DML optimizers: (mini-batch, stochastic) gradient descent (w/ momentum, etc.).
- Additional DML language support as necessary (tensors, built-in functions such as convolution, function pointers, list structures, etc.).
- Integration with other deep learning frameworks (Caffe, Torch, Theano, TensorFlow, etc.) via automatic DML code generation.
- etc.
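As referenced in the first bullet above, the following is a minimal, illustrative DML sketch of the intended forward/backward layer API together with a modularized optimizer. The module paths (nn/layers/affine.dml, nn/layers/relu.dml, nn/optim/sgd.dml) and the exact init/forward/backward/update signatures are assumptions in the spirit of SYSTEMDS-618, not the final API.
{code}
# Illustrative sketch only: module paths and signatures are assumptions.
source("nn/layers/affine.dml") as affine
source("nn/layers/relu.dml") as relu
source("nn/optim/sgd.dml") as sgd

X = rand(rows=64, cols=784)   # mini-batch of 64 flattened 28x28 inputs
y = rand(rows=64, cols=10)    # placeholder one-hot labels

# Layer initialization returns the layer's parameters.
[W, b] = affine::init(784, 10)

# Forward pass: each layer exposes a simple forward(...) function.
scores = affine::forward(X, W, b)
out = relu::forward(scores)

# Backward pass: each layer exposes a backward(...) function that maps the
# upstream gradient to gradients w.r.t. its inputs and parameters.
dout = out - y                            # stand-in for a loss-layer gradient
dscores = relu::backward(dout, scores)
[dX, dW, db] = affine::backward(dscores, X, W, b)

# Modularized optimizer: a plain SGD update of each parameter.
lr = 0.01
W = sgd::update(W, dW, lr)
b = sgd::update(b, db, lr)
{code}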
—
Plan:
[DONE] Phase 1: MVPs
- Create a mathematically correct DML deep learning library for running basic feed-forward and convolutional neural nets in singlenode operation.
- Create mathematically correct built-in operators for convolution and max pooling for singlenode operation (see the operator sketch after this plan).
[CURRENT] Phase 2: Singlenode
- Improve performance of DML deep learning library in singlenode operation.
- Expand DML deep learning library to include additional commonly-used layers, such as RNNs and LSTMs, as well as additional optimizers.
- Improve built-in operators for convolution and max pooling to be highly performant in singlenode operation.
- Implement performant GPU acceleration for built-in operators (and end-to-end deep learning algorithms) in singlenode operation.
- Add general engine improvements to address bottlenecks, such as left-indexing within DML-bodied functions.
- Add end-to-end deep learning algorithm examples, such as a "LeNet" convolutional neural net.
Phase 3: Distributed
- Expand deep learning support to include distributed operations with large models. This includes improvements to the DML deep learning library, the built-in operators, the GPU acceleration, and general engine improvements.
Phase 4: APIs/Wrappers
- Explore integration with Caffe, creating a SystemML interpreter for Caffe model definitions.
- Explore integration with Keras, creating a SystemML backend for Keras.
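For the built-in convolution and max-pooling operators referenced in Phases 1 and 2, the sketch below shows how they might be invoked as the first layer of a LeNet-style network in singlenode operation. The argument names (input_shape, filter_shape, pool_size, stride, padding) and the bias_add built-in follow the current proposal and should be treated as assumptions rather than a fixed interface.
{code}
# Illustrative sketch only: operator argument names are assumptions.
N = 64                               # mini-batch size
C = 1                                # input channels
Hin = 28                             # input height
Win = 28                             # input width
F = 32                               # number of filters
Hf = 5                               # filter height
Wf = 5                               # filter width

X = rand(rows=N, cols=C*Hin*Win)     # images row-flattened into a matrix
W = rand(rows=F, cols=C*Hf*Wf)       # filters row-flattened as well
b = matrix(0, rows=F, cols=1)

# Built-in 2D convolution with 'same' padding, then bias, relu, and pooling.
out = conv2d(X, W, input_shape=[N,C,Hin,Win], filter_shape=[F,C,Hf,Wf],
             stride=[1,1], padding=[2,2])
out = bias_add(out, b)
out = max(0, out)                    # relu via an element-wise built-in
out = max_pool(out, input_shape=[N,F,Hin,Win], pool_size=[2,2],
               stride=[2,2], padding=[0,0])
{code}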
Attachments
Issue Links
incorporates
- SYSTEMDS-618 Deep Learning DML Library (In Progress)
- SYSTEMDS-762 Fix the bug that causes local MR-Jobs when running in non-singlenode mode on MNIST data for Lenet script (Closed)
is contained by
- SYSTEMDS-1783 Application of SystemML in Science and Engineering (Open)
is depended upon by
- SYSTEMDS-1185 SystemML Breast Cancer Project (Resolved)
is related to
- SYSTEMDS-1566 Possible regression from 0.13 -> 0.14 for MNIST LeNet script (Closed)
- SYSTEMDS-1595 Missing Block Sizes For PersistentWrites & TransientWrites (Closed)
- SYSTEMDS-1621 `max(0, X)` fails with type mismatch (Closed)
- SYSTEMDS-1686 Transpose Conv2d has incorrect filter shape and incorrect input size argument (Closed)
- SYSTEMDS-1554 IPA Scalar Transient Read Replacement (Closed)
- SYSTEMDS-1561 Improve constant folding during compilation (Closed)
- SYSTEMDS-587 Improvements Triggered By Deep Learning Work (In Progress)
relates to
- SYSTEMDS-409 Extended update in-place support (Open)
- SYSTEMDS-445 GPU support / low-level optimizations (Open)
- SYSTEMDS-716 Consumability of SystemML for Deep Learning (Open)
- SYSTEMDS-914 MLContext Performance Improvements (Open)
- SYSTEMDS-1129 Enable parfor to run on remote Spark workers (Open)
- SYSTEMDS-1142 ParFor execution crashes bc of singlethreaded warning in LibMatrixDNN (Resolved)
- SYSTEMDS-633 Improve Left-Indexing Performance with (Nested) Parfor Loops in UDFs (Closed)
- SYSTEMDS-951 Efficient spark right indexing via lookup (In Progress)
- SYSTEMDS-448 Source code generation for automatic operator fusion (Closed)