END-TO-END SPEECH RECOGNITION FOR GURAGIGNA LANGUAGE USING DEEP LEARNING TECHNIQUES

dc.contributor.author: ABDO NESRU EBRAHIM
dc.date.accessioned: 2025-05-27T08:31:58Z
dc.date.issued: 2025-08
dc.description.abstract: Speech recognition converts long sequences of acoustic features into shorter sequences of discrete symbols, such as words or phonemes. The task is complicated by varying sequence lengths and by uncertainty about where each output symbol falls in the input, which makes traditional frame-level classifiers impractical. Current automated systems struggle with speaker-independent continuous speech, particularly in low-resource languages like Guragigna, where the Cheha dialect poses additional challenges due to its purely spoken nature and lack of a rigid grammatical structure. To address these issues, this research develops an end-to-end speech recognition model using deep learning techniques, specifically a hybrid CNN-BiGRU architecture combined with CTC and attention mechanisms. This approach aims to improve alignment and robustness in noisy environments. To train and test the model, a text and speech corpus was compiled from several sources, including Wolkite FM and the Old and New Testaments. Experimental results indicate that the CNN-BiGRU model achieves a Word Error Rate (WER) of 2.5%, indicating improved generalization. Additionally, four recurrent neural network models (LSTM, BiLSTM, GRU, and BiGRU) were evaluated, each configured with 1024 hidden units and trained with the Adam optimizer for 50 epochs. The BiGRU model outperformed the others, achieving an accuracy of 97.50%, while the LSTM, BiLSTM, and GRU models achieved maximum accuracies of 95.99%, 96.92%, and 96.25%, respectively. The resulting end-to-end system advances speech technology for low-resource languages and improves accessibility for their speaker communities. The findings underscore the effectiveness of deep learning methods for speech recognition in challenging linguistic contexts.
dc.description.sponsorship: Wolkite University
dc.identifier.uri: https://rps.wku.edu.et/handle/123456789/46052
dc.language.iso: en
dc.publisher: Wolkite University
dc.subject: Automatic Speech Recognition
dc.subject: NLP
dc.subject: Deep learning
dc.subject: LSTM
dc.subject: BILSTM
dc.subject: GRU
dc.subject: BIGRU
dc.subject: RNN
dc.subject: CNN
dc.title: END-TO-END SPEECH RECOGNITION FOR GURAGIGNA LANGUAGE USING DEEP LEARNING TECHNIQUES
dc.type: Thesis
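
The abstract reports results as a Word Error Rate and as accuracy. The sketch below shows how WER is conventionally computed: the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. It is only an illustration, not the thesis's evaluation code. The function names `edit_distance` and `wer` and the placeholder tokens are hypothetical. The abstract does not say how accuracy was derived; reading it as 1 - WER is an assumption, although it is consistent with the reported pair of 2.5% WER and 97.50% accuracy.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (hypothetical helper)."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


# Placeholder tokens stand in for transcribed words.
error_rate = wer("a b c d", "a x c d")  # one substitution in four words
word_accuracy = 1.0 - error_rate        # assumption: accuracy = 1 - WER
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, so a word accuracy defined this way can be negative.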

Files

Original bundle

Name: ABDO NESRU EBRAHIM .pdf
Size: 2.41 MB
Format: Adobe Portable Document Format
