Spoken-Language-Identification-RNN

Objective

Spoken Language Identification (LID) is broadly defined as recognizing the language of a given speech utterance. It has numerous applications in automated language and speech recognition, multilingual machine translation, speech-to-speech translation, and emergency call routing. In this homework, we classify three languages (English, Hindi, and Mandarin) from spoken utterances that were crowd-sourced from the class.

Method

mapping = {'english': 0, 'hindi': 1, 'mandarin': 2}

Extract MFCC features from the audio files, then build and train a Recurrent Neural Network (GRU/LSTM) that outputs a 3-class probability distribution.
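To make the data flow concrete, here is a minimal NumPy sketch of a single GRU layer followed by a softmax output, run over a sequence of MFCC frames. It is an illustration under assumptions, not the code in model.py: the 13 MFCC coefficients, the hidden size of 32, the per-frame output, the random stand-in features, and all function and parameter names are hypothetical, and the weights are untrained.

```python
import numpy as np

NUM_CLASSES = 3  # english, hindi, mandarin (see the mapping above)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def init_params(n_features, n_hidden, rng):
    """Random (untrained) weights for one GRU layer plus a linear output layer."""
    def w(*shape):
        return rng.normal(scale=0.1, size=shape)

    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = w(n_features, n_hidden)  # input weights
        params["U" + gate] = w(n_hidden, n_hidden)    # recurrent weights
        params["b" + gate] = np.zeros(n_hidden)
    params["Wo"] = w(n_hidden, NUM_CLASSES)
    params["bo"] = np.zeros(NUM_CLASSES)
    return params


def gru_forward(mfcc, p):
    """mfcc: (n_frames, n_features). Returns (n_frames, NUM_CLASSES) probabilities."""
    h = np.zeros(p["Uz"].shape[0])
    probs = []
    for x in mfcc:
        z = sigmoid(x @ p["Wz"] + h @ p["Uz"] + p["bz"])              # update gate
        r = sigmoid(x @ p["Wr"] + h @ p["Ur"] + p["br"])              # reset gate
        h_cand = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"] + p["bh"])   # candidate state
        h = (1.0 - z) * h + z * h_cand   # blend previous state and candidate
        probs.append(softmax(h @ p["Wo"] + p["bo"]))
    return np.stack(probs)


rng = np.random.default_rng(0)
mfcc = rng.normal(size=(100, 13))  # stand-in for real MFCC features (assumption)
params = init_params(n_features=13, n_hidden=32, rng=rng)
probs = gru_forward(mfcc, params)  # one 3-class distribution per frame
```

In practice the features could come from a library call such as librosa.feature.mfcc, which returns an array of shape (n_mfcc, n_frames) that would need transposing before being fed to a loop like this one, and the recurrent layer would normally be a trained GRU or LSTM from a deep learning framework rather than hand-written NumPy.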

Deal with silence

Mark silent audio with label -1 and exclude it from both the loss and the accuracy computation.
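The masking idea can be sketched in plain Python as follows. This is a hedged illustration, not the repository's training code: the function name masked_loss_and_accuracy, the per-frame layout of the probabilities, and the example numbers are all assumptions made for the example.

```python
import math

IGNORE_LABEL = -1  # label assigned to silence


def masked_loss_and_accuracy(probs, labels, ignore_label=IGNORE_LABEL):
    """Mean cross-entropy and accuracy over the non-silent items only.

    probs:  list of [p_english, p_hindi, p_mandarin] rows
    labels: list of integer labels, with ignore_label marking silence
    """
    total_loss = 0.0
    correct = 0
    counted = 0
    for p, y in zip(probs, labels):
        if y == ignore_label:
            continue  # silence contributes to neither loss nor accuracy
        total_loss += -math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        predicted = max(range(len(p)), key=p.__getitem__)  # argmax
        correct += int(predicted == y)
        counted += 1
    if counted == 0:
        return 0.0, 0.0  # nothing but silence
    return total_loss / counted, correct / counted


probs = [
    [0.7, 0.2, 0.1],     # english, predicted correctly
    [0.3, 0.3, 0.4],     # silence, ignored
    [0.1, 0.6, 0.3],     # hindi, predicted correctly
    [0.5, 0.25, 0.25],   # mandarin, predicted as english
]
labels = [0, -1, 1, 2]
loss, acc = masked_loss_and_accuracy(probs, labels)
```

Dividing by the number of counted items rather than the total keeps long silent stretches from diluting either metric. Deep learning frameworks often provide the loss half of this directly; in PyTorch, for example, torch.nn.CrossEntropyLoss accepts an ignore_index argument that could be set to -1.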

Performance

After training from scratch and dealing with overfitting, my model reached ~90% validation accuracy.


techping/Spoken-Language-Identification-RNN
