
Details

    • Type: New Feature
    • Status: To Do
    • Priority: Minor
    • Resolution: Unresolved
    • Component: Gluon
    • None

    Description

      As a user, I would like out-of-the-box audio data loading and audio transforms in MXNet that would allow me to:

      • load audio files (only .wav files are supported currently) into a Gluon AudioDataset (NDArrays),
      • apply popular audio transforms to the audio data (for example scaling, MEL, MFCC),
      • load the dataset with Gluon's DataLoader and train a neural network (e.g. an MLP) on the transformed audio data,
      • perform a simple audio task such as sound classification, with one label per audio clip (a multiclass sound classification problem).

      The feature should also provide an end-to-end example for one task (Urban Sounds Classification), including:

      • reading audio files from a folder location (can be extended to an S3 bucket later) and loading them into the AudioDataset
      • applying audio transforms
      • training a model (a neural network) with the AudioDataset or DataLoader
      • performing multiclass classification and running inference
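
      To make the requested shape concrete, here is a minimal sketch of the folder-to-dataset step with a scaling transform. It uses only the Python standard library and is not the proposed MXNet implementation. The names WavFolderDataset, read_wav_int16 and scale are hypothetical, as is the layout of one sub-folder per class label. It also assumes 16-bit PCM .wav files.

```python
import array
import os
import sys
import wave


def read_wav_int16(path):
    """Read a 16-bit PCM .wav file into a list of integer samples.

    Multi-channel audio is returned interleaved, exactly as stored.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # .wav sample data is little-endian
    return list(samples)


def scale(samples, factor=1.0 / 32768.0):
    """Hypothetical scaling transform: map int16 samples to [-1.0, 1.0)."""
    return [s * factor for s in samples]


class WavFolderDataset:
    """Hypothetical dataset: root/<class_name>/<clip>.wav, one label per clip.

    It exposes __len__ and __getitem__, the same protocol that
    mxnet.gluon.data.Dataset defines.
    """

    def __init__(self, root, transform=None):
        self._transform = transform
        # Sorted class folders give a stable class-name -> integer label map.
        self.classes = sorted(
            d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
        )
        self._items = []
        for label, name in enumerate(self.classes):
            folder = os.path.join(root, name)
            for fname in sorted(os.listdir(folder)):
                if fname.lower().endswith(".wav"):
                    self._items.append((os.path.join(folder, fname), label))

    def __len__(self):
        return len(self._items)

    def __getitem__(self, idx):
        path, label = self._items[idx]
        samples = read_wav_int16(path)
        if self._transform is not None:
            samples = self._transform(samples)
        return samples, label
```

      A real AudioDataset would presumably subclass mxnet.gluon.data.Dataset and return NDArrays instead of Python lists. To batch clips with Gluon's DataLoader, the transform would likely also need to pad or crop each clip to a fixed length.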

          People

            Assignee: Unassigned
            Reporter: Gaurav Gireesh (gaurav.gireesh)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 22h