Department of Computer Science
URI for this collection: https://rps.wku.edu.et/handle/123456789/45765
Search Results
2 results
Item SCHOOL OF GRADUATE STUDIES: DEVELOPING AUTOMATIC CONSTITUENCY PARSER FOR SILTIGNA LANGUAGE USING DEEP LEARNING APPROACH (WOLKITE UNIVERSITY, 2024-04) TEKA MOHAMMED
In our study, we focused on developing automatic constituency parsing for the Siltigna language using deep learning approaches. The Siltigna language has a growing number of speakers, and our goal was to address the language's lack of computational resources and enhance its digital presence globally. To achieve this, we employed a transition-based deep learning technique, and our main architecture was a sequence-to-sequence encoder-decoder model, which has been widely used in natural language processing tasks. We conducted experiments with several deep learning models: LSTM, BiLSTM, LSTM with attention, GRU, and Transformer. To train and evaluate these models, we collected a dataset of approximately 2,000 sentences and labeled them with corresponding parse trees. Before converting the sentences into sequences, we applied preprocessing techniques such as data cleaning and tokenization, and we split the dataset into training and testing sets using an 80-20 split. We then trained and tested the LSTM, BiLSTM, LSTM with attention, GRU, and Transformer models on the labeled parse tree data. Among these, the Transformer achieved the best performance, with 84.38% accuracy, a loss of 0.137, and a labeled attachment score (LAS) of 0.83, indicating that it was the most effective at parsing the Siltigna language. Our study highlights the importance of natural language processing in an interconnected global community.
By developing automatic constituency parsing for the Siltigna language, we aimed to bridge language barriers and enable effective communication across borders.
Item DEVELOPING SEMANTIC TEXTUAL SIMILARITY FOR GURAGIGNA LANGUAGE USING DEEP LEARNING APPROACH (WOLKITE UNIVERSITY, 2024-05-28) GETNET DEGEMU
Natural language processing (NLP) is one measure of how far the world has come technologically. It is the process of teaching human language to machines and spans everything from morphological analysis to pragmatic analysis. Semantic similarity is one of the highest levels of NLP. Previous semantic textual similarity (STS) studies have employed methods ranging from string-based similarity to deep learning. These studies have their limitations, and no deep learning research on STS has been conducted for this local language. STS offers significant advantages in NLP applications such as information retrieval, information extraction, text summarization, data mining, machine translation, and other tasks. This thesis presents a deep learning approach for capturing semantic textual similarity in the Guragigna language. The methodology involves collecting a Guragigna language corpus and preprocessing the text data. Text representation is done with the Universal Sentence Encoder (USE) along with word embedding techniques including Word2Vec and GloVe, and mean squared error (MSE) is used to measure performance. In the experimentation phase, models including LSTM, Bidirectional RNN, GRU, and Stacked RNN are trained and evaluated with the different embedding techniques. The results demonstrate the efficacy of the developed models in capturing semantic textual similarity in the Guragigna language. Across the embedding techniques (Word2Vec, GloVe, and USE), the Bidirectional RNN model with USE embeddings achieves the lowest MSE of 0.0950 and the highest accuracy of 0.9244.
GloVe and Word2Vec embeddings also show competitive performance, with slightly higher MSE and lower accuracy. The Universal Sentence Encoder consistently emerges as the top-performing embedding across all RNN architectures. The research results demonstrate the effectiveness of the LSTM, GRU, Bidirectional RNN, and Stacked RNN models in measuring semantic textual similarity in the Guragigna language.
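The scoring and evaluation pipeline described in the Guragigna STS abstract can be sketched as follows. This is a minimal illustration, not the thesis implementation: the toy vectors below stand in for real USE/Word2Vec/GloVe sentence embeddings (hypothetical values chosen for the example), and the cosine-similarity and MSE functions show how a predicted similarity score would be compared against a human-annotated gold score.

```python
import math

def cosine_similarity(u, v):
    # Cosine similarity between two sentence-embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mse(predicted, gold):
    # Mean squared error between predicted similarity scores and
    # gold (human-annotated) scores, the metric used in the thesis.
    return sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(predicted)

# Toy 4-dimensional "embeddings" standing in for real sentence vectors
# (e.g. USE produces 512-dimensional embeddings); values are illustrative.
emb_a = [0.1, 0.3, 0.5, 0.1]
emb_b = [0.2, 0.3, 0.4, 0.1]

score = cosine_similarity(emb_a, emb_b)  # predicted similarity of the pair
print(round(score, 4))

# Evaluating a small batch of predictions against gold annotations:
print(round(mse([score, 0.9, 0.2], [1.0, 0.8, 0.1]), 4))
```

In the thesis itself the predicted scores would come from a trained RNN (e.g. the Bidirectional RNN with USE embeddings) rather than from raw cosine similarity; the evaluation step with MSE is the same either way.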