Browsing by Author "ABDO NESRU EBRAHIM"

    END-TO-END SPEECH RECOGNITION FOR GURAGIGNA LANGUAGE USING DEEP LEARNING TECHNIQUES
    (Wolkite University, 2025-08) ABDO NESRU EBRAHIM
    Speech recognition entails converting long sequences of acoustic features into shorter sequences of discrete symbols, such as words or phonemes. This process is complicated by varying sequence lengths and uncertainty in output symbol locations, making traditional classifiers impractical. Current automated systems struggle with speaker-independent continuous speech, particularly in low-resource languages like Guragigna, where the Cheha dialect poses additional challenges due to its purely spoken nature and lack of a rigid grammatical structure. To address these issues, this research develops an end-to-end speech recognition model utilizing deep learning techniques, specifically a hybrid CNN-BiGRU architecture combined with Connectionist Temporal Classification (CTC) and attention mechanisms. This approach aims to enhance alignment and robustness in noisy environments. To train and test the model, a text and speech corpus was created by compiling data from sources including Wolkite FM and the Old and New Testaments. Experimental results indicate that the CNN-BiGRU model achieves a Word Error Rate (WER) of 2.5%, showcasing improved generalization capabilities. Additionally, four recurrent neural network models (LSTM, BiLSTM, GRU, and BiGRU) were evaluated, each configured with 1024 hidden units and optimized using the Adam optimizer over 50 epochs. The BiGRU model outperformed the others, achieving an accuracy of 97.50%, while the LSTM, BiLSTM, and GRU models achieved maximum accuracies of 95.99%, 96.92%, and 96.25%, respectively. The successful implementation of this end-to-end speech recognition system significantly advances communication technologies for low-resource languages, enhancing accessibility for diverse linguistic communities. The findings underscore the effectiveness of deep learning methods in improving speech recognition performance in challenging linguistic contexts.
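
    The abstract refers to two standard ideas that a small example may make more concrete: the CTC rule for turning a long per-frame label path into a shorter symbol sequence, and the Word Error Rate used to score the output. The following is a minimal sketch and not the thesis implementation; the function names `ctc_collapse` and `word_error_rate`, the blank index 0, and the placeholder tokens are all assumptions made for illustration.

```python
from typing import List, Sequence


def ctc_collapse(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Apply the CTC collapsing rule to a per-frame label path:
    merge consecutive repeated labels, then drop the blank symbol.
    The blank index (0) is an assumption for this sketch."""
    out: List[int] = []
    prev = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed with word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev_row[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev_row = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur_row = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur_row.append(min(
                prev_row[j] + 1,         # deletion
                cur_row[j - 1] + 1,      # insertion
                prev_row[j - 1] + cost,  # substitution or match
            ))
        prev_row = cur_row
    return prev_row[-1] / len(ref)


if __name__ == "__main__":
    # Eight frames collapse to three symbols; the blank between the
    # two 1s keeps them as separate outputs.
    print(ctc_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
    # Placeholder tokens, not Guragigna data: one substitution in four words.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3 w4"))
```

    The reported figures (WER of 2.5% and accuracy of 97.50%) would be consistent with accuracy being computed as 100% minus WER, although the abstract does not state how accuracy was defined.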

WKU Repository © 2025 Wolkite University
