SINGA-146

Implement speech recognition examples

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: None
    • Labels:

      Description

      Speech recognition is one of the major applications of deep learning models, alongside computer vision and NLP.
      SINGA already provides examples for image classification and language modelling. We are going to provide a few examples for speech recognition, e.g., CTC and DBN-DNN.

        Activity

        yvtheja Vishnu Teja added a comment -

        Hi,

        I am interested in this project for GSoC 2016.

        Could I please know where I can talk to you about the project?

        Thank you

        wangsh Sheng Wang added a comment -

        Hi Vishnu Teja,

        You can send mails to our developer list: dev@singa.incubator.apache.org
        or simply leave messages under this jira ticket.

        Please let us know if you have any ideas about the gsoc project.

        wangwei.cs wangwei added a comment -

        Hi Vishnu,

        In case you didn't receive the comment from Sheng Wang (shown below), please join our dev@ mailing list (dev-subscribe@singa.incubator.apache.org) and discuss there (we want to know your background in DL/speech recognition).
        Thanks.

        yvtheja Vishnu Teja added a comment -

        Hi Sheng,

        I have no experience in speech recognition, but I have a good grip on machine learning and can learn things at an appropriate speed. Please let me know if I could be considered.

        Thank you

        wangwei.cs wangwei added a comment -

        Hi Vishnu,

        Please read some papers on speech recognition and learn the programming model of SINGA.
        If you have any idea of implementing these algorithms on SINGA, then just let us know.

        speech recognition papers: thesis of http://www.cs.toronto.edu/~gdahl/ (including CTC), deep speech from baidu http://arxiv.org/abs/1512.02595v1
        SINGA: http://singa.apache.org/docs/programming-guide.html.

        wangwei.cs wangwei added a comment -

        FYI.
        (Copied from mentors@community.apache.org)

        > https://developers.google.com/open-source/gsoc/timeline
        > 14 March 19:00 UTC Student application period opens.
        > 25 March 19:00 UTC Student application deadline.

        http://write.flossmanuals.net/gsocstudentguide/ has a good description
        of the process.

        http://write.flossmanuals.net/gsocstudentguide/writing-a-proposal/
        has some suggestions for your proposal, and Apache has a template here:

        http://community.staging.apache.org/gsoc#application-template

        I would suggest that your students write their proposal as a
        Google Docs page or similar, and share it with you
        early so your developer community can help them get a good proposal.

        dpatnigere David Patnigere added a comment -

        Hello Sir,
        I am interested in this project. I am interested in Speech Recognition and have an understanding of neural networks and the basics of deep learning.
        I have also been reading the papers you recommended to Vishnu. May I know what you expect from this project? Also, is there an IRC channel for this group?

        Thank you.

        wangwei.cs wangwei added a comment -

        Hi David,

        We expect one or two speech recognition examples, similar to the AlexNet example for image classification.
        In particular, we need detailed instructions on preparing the (training/test) dataset and on conducting the training and the test. We expect to see performance comparable to that reported by other systems or in the papers where the models were published.

        We haven't set up an IRC channel. If you have any ideas, pls just reply here or send emails to dev@singa.incubator.apache.org. We would be happy to give suggestions on the proposal.

        dpatnigere David Patnigere added a comment -

        Hello sir,
        I was trying to start Singa using Zookeeper. However, using the command
        "./zk-service.sh start" is giving an error regarding the file singa-env.sh at line 54 saying that "singatool" is not a directory/file. Is this a bug?
        I know that "singatool" is defined in tools.cc.
        Any ideas on how to proceed?
        Thank you

        wangsh Sheng Wang added a comment - - edited

        Please first check if singatool has been successfully compiled under singa home folder.
        You can manually execute ./singatool to see if it is runnable now.
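
        As an illustration of the check described above (a sketch, not part of the original comment and not a SINGA utility), the two steps can be scripted in Python. The helper name check_singatool and its status strings are hypothetical; the only assumption taken from the thread is that the singatool binary sits directly under the SINGA home folder.

```python
import os


def check_singatool(singa_home):
    """Report whether <singa_home>/singatool exists and is executable.

    Returns one of the (hypothetical) status strings:
    "missing", "not runnable", or "ok".
    """
    tool = os.path.join(singa_home, "singatool")
    if not os.path.exists(tool):
        # Nothing was produced by the build at this location.
        return "missing"
    if not (os.path.isfile(tool) and os.access(tool, os.X_OK)):
        # Something is there, but it is not an executable regular file.
        return "not runnable"
    return "ok"


if __name__ == "__main__":
    # Assumes the script is started from the SINGA home folder.
    print(check_singatool(os.getcwd()))
```

        A "missing" result would suggest the build did not produce the binary, which would be consistent with the error reported from singa-env.sh.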

        raunaq.abhyankar Raunaq Abhyankar added a comment -

        Hi!
        I'm Raunaq Abhyankar and I would love to work on this project idea in the summer. I've sent a mail on the developer list which elaborates my experience and skill set.
        I've started going through the papers recommended to Vishnu Teja.
        I have doubts regarding installation of singa on Fedora. Should I post them here or in the list?

        Looking forward!
        Thanks

        wangwei.cs wangwei added a comment -

        Hi Raunaq,

        Pls report the issues by creating another JIRA ticket.
        Once you have a proposal draft, please let us know.
        Thanks.

        raunaq.abhyankar Raunaq Abhyankar added a comment -

        Dear Sir,
        Thanks.. I'll do that
        Can u please tell me about the language of implementation for this module?
        Thanks

        wangwei.cs wangwei added a comment -

        The core models are implemented in C++.
        We are working on providing a Python API for the major functions (models).
        Hence, C++ and Python are preferred.

        dpatnigere David Patnigere added a comment -

        Hello Sir,
        Does Singa currently have the capability of creating acoustic models or is it fine to use software such as HTK toolkit?
        Thank you.

        wangwei.cs wangwei added a comment -

        SINGA supports RNN, CNN and RBM models, which are used in acoustic models.
        We do not have a full acoustic modelling example; providing one is expected in this project.
        The license compatibility between HTK and SINGA (Apache V2) is not clear.
        You can use another toolkit as long as it is compatible with Apache V2.

        raunaq.abhyankar Raunaq Abhyankar added a comment -

        Dear Sir,
        Could you please elaborate upon the project goals for the summer? That would be of great help in writing my proposal and for further reading & research.
        Thanks

        wangwei.cs wangwei added a comment -

        Raunaq,

        The expectations (or goals) for this project are explained below.
        You can implement two speech recognition examples, e.g., one for the first phase (until the mid-term) and the other one for the second phase.
        -------------
        We expect one or two speech recognition examples, similar to the AlexNet example for image classification.
        In particular, we need detailed instructions on preparing the (training/test) dataset and on conducting the training and the test. We expect to see performance comparable to that reported by other systems or in the papers where the models were published.

        dpatnigere David Patnigere added a comment -

        Hello Sir,
        I read the HTK license and it is a bit vague as to its compatibility with Apache v2. I assume that it will be better not to use it.
        Will the Kaldi toolkit be fine? It is licensed under Apache v2 (though not a part of ASF) and is designed for parallel computing.
        Thank you

        dpatnigere David Patnigere added a comment -

        Hello Sir,
        Based on your last reply to Raunaq, I have prepared a draft and saved a copy for now on the GSOC site.
        Please tell me how to proceed further.
        Your inputs on how to improve the proposal draft would be greatly appreciated.
        Thank you.

        wangwei.cs wangwei added a comment -

        Since Kaldi is released under Apache V2, it is fine to use it for this project for data pre-processing.

        wangwei.cs wangwei added a comment -

        Great!
        I will add comments soon.

        wangwei.cs wangwei added a comment -

        I haven't seen your proposal on the GSoC website.
        What is the proposal name?
        You can also share a Google Doc with us.

        raunaq.abhyankar Raunaq Abhyankar added a comment -

        Hello Sir!
        I've uploaded my draft proposal. It would be great if you could review it and provide your inputs!
        Thanks

        dpatnigere David Patnigere added a comment -

        Hello sir,
        Regarding the speech recognition model:
        Singa currently has a 'loss layer'. Is this the same as the CTC loss function (as mentioned in the research paper http://arxiv.org/abs/1512.02595v1 ) ?
        Thank you.

        wangwei.cs wangwei added a comment -

        SINGA has two loss layers:
        1. softmax+cross-entropy
        2. squared euclidean distance

        pls check the loss function in the paper.
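
        For readers comparing these two losses with CTC, the following is a minimal Python sketch (an illustration only, not SINGA code and not part of the comment above). All function names are hypothetical. The CTC part uses the standard forward recursion over the blank-extended label sequence and assumes that symbol 0 is the blank and that per-frame probabilities are already normalized. It works in probability space for readability, so it can underflow on long inputs; practical implementations typically work in log space.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, label):
    """Loss for ONE prediction with ONE target class."""
    return -math.log(softmax(logits)[label])


def squared_euclidean(pred, target):
    """Sum of squared differences between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def ctc_neg_log_likelihood(probs, labels, blank=0):
    """CTC loss for one utterance.

    probs[t][k] is the probability of symbol k at frame t;
    labels is the target sequence without blanks.
    Sums over every frame-level alignment that collapses to `labels`.
    """
    # Blank-extended target: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for symbol in labels:
        ext += [symbol, blank]
    size = len(ext)

    # alpha[s]: total probability of all partial alignments ending at ext[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]             # skip the blank in between
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)
```

        With two frames that are uniform over {blank, a} and the target "a", three alignments (a a, a blank, blank a) collapse to the target, so ctc_neg_log_likelihood([[0.5, 0.5], [0.5, 0.5]], [1]) equals -log(0.75). Unlike softmax+cross-entropy, which needs one label per prediction, CTC does not require a frame-level alignment between the audio and the transcript.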

        raunaq.abhyankar Raunaq Abhyankar added a comment -

        Hi Mr. Wang!
        Thanks a lot for reviewing my proposal! I've made changes wherever necessary and also replied to your comments to make myself clearer.
        Also I have one question that I have added as a comment. If you could please clear that it would be great!
        Thanks again..
        Looking forward!

        dpatnigere David Patnigere added a comment -

        Hello sir,
        Thank you for the comments.
        I have made the required changes where needed.
        Your inputs on how to improve on this further would be of great help!
        Thank you!

        dpatnigere David Patnigere added a comment -

        Hello Sir,
        I have expanded on the model layers a little and added my github profile as well.
        Do tell me if anything still needs to be improved.
        Thank you!

        raunaq.abhyankar Raunaq Abhyankar added a comment -

        Hello Mr Wang!
        I have updated my draft proposal (Finally)
        Would love to improve it based on your feedback!
        Thanks

        wangwei.cs wangwei added a comment -

        It seems you all have submitted the final version, which is not visible to me until the submission deadline. Good luck.

        dpatnigere David Patnigere added a comment -

        Thank you Sir!

        raunaq.abhyankar Raunaq Abhyankar added a comment -

        Cool thanks!
        Looking forward..!

        dpatnigere David Patnigere added a comment -

        Sir,
        just curious, but how many submissions have you received in all?


          People

          • Assignee: Unassigned
          • Reporter: wangwei.cs wangwei
          • Votes: 0
          • Watchers: 3
