[MXNET-1210] Gluon Audio - ASF JIRA

Details

Type: New Feature
Status: To Do
Priority: Minor
Resolution: Unresolved
Component/s: Gluon
Labels: None

Description

As a user, I would like out-of-the-box audio data loading and audio transforms in MXNet that allow me to:

load audio files (only .wav files are currently supported) into a Gluon AudioDataset (as NDArrays),

apply popular audio transforms to the audio data (for example scaling, MEL, and MFCC),

load the dataset using Gluon's DataLoader and train a neural network (for example an MLP) on the transformed audio dataset,

perform a simple audio task such as sound classification, with one label per audio clip (a multiclass sound classification problem).
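
To make the requested dataset-plus-transform shape concrete, here is a minimal sketch using only the Python standard library. It is not the proposed Gluon API: `read_wav_mono16`, `scale`, and `WavDataset` are hypothetical names chosen for illustration, samples are returned as plain Python lists rather than NDArrays, and only 16-bit mono PCM is assumed. An actual implementation would presumably subclass `mxnet.gluon.data.Dataset` instead.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM .wav file; return (samples, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        rate = wf.getframerate()
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        # WAV data is little-endian; swap on big-endian hosts.
        samples.byteswap()
    return samples.tolist(), rate


def scale(samples, factor=1.0 / 32768.0):
    """Hypothetical 'scaling' transform: map int16 samples into [-1.0, 1.0)."""
    return [s * factor for s in samples]


class WavDataset:
    """Stand-in for an audio dataset: indexable, returns (samples, label)."""

    def __init__(self, items, transform=None):
        self._items = list(items)  # sequence of (path, label) pairs
        self._transform = transform

    def __len__(self):
        return len(self._items)

    def __getitem__(self, idx):
        path, label = self._items[idx]
        samples, _rate = read_wav_mono16(path)
        if self._transform is not None:
            samples = self._transform(samples)
        return samples, label
```

Because the class only exposes `__len__` and `__getitem__`, it follows the same indexing contract that Gluon's DataLoader expects from a dataset, which is the reason for shaping the sketch this way.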

Provide an end-to-end example for a task (Urban Sounds Classification), including:

reading audio files from a folder location (can be extended to an S3 bucket later) and loading them into the AudioDataset

applying audio transforms

training a model (a neural network) with the AudioDataset or DataLoader

performing the multiclass classification (running inference)
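
As an illustration of the first step only (reading files from a folder), the sketch below builds a list of (path, label) pairs. It assumes a hypothetical layout with one subfolder per class, which is not necessarily how the Urban Sounds data is organized; `list_labeled_wavs` is an illustrative name, not part of MXNet.

```python
import os


def list_labeled_wavs(root):
    """Return (items, classes) for a root folder with one subfolder per class.

    items   -- list of (wav_path, class_index) pairs
    classes -- sorted list of class (subfolder) names; index = label
    """
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_to_idx = {name: i for i, name in enumerate(classes)}

    items = []
    for name in classes:
        folder = os.path.join(root, name)
        for fname in sorted(os.listdir(folder)):
            # Only .wav files are considered, matching the scope above.
            if fname.lower().endswith(".wav"):
                items.append((os.path.join(folder, fname), class_to_idx[name]))
    return items, classes
```

The resulting pairs could then be handed to a dataset object, with transforms applied per item before batching and training.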

Design here: https://cwiki.apache.org/confluence/display/MXNET/Gluon+-+Audio

Issue Links

links to

GitHub Pull Request #13241

GitHub Pull Request #13325

People

Assignee: Unassigned

Reporter: Gaurav Gireesh

Votes: 0

Watchers: 1

Dates

Created: 12/Nov/18 23:59

Updated: 01/Dec/18 18:00

Time Tracking

Estimated: Not Specified

Remaining: 0h

Logged: 22h