DEVELOPING SEMANTIC TEXTUAL SIMILARITY FOR GURAGIGNA LANGUAGE USING DEEP LEARNING APPROACH

GETNET DEGEMU

dc.contributor.author	GETNET DEGEMU
dc.date.accessioned	2024-06-19T06:16:53Z
dc.date.available	2024-06-19T06:16:53Z
dc.date.issued	2024-05-28
dc.description.abstract	Natural language processing (NLP) is one marker of how far technology has advanced. It is the process of teaching human language to machines and spans every level of analysis, from morphological to pragmatic. Semantic similarity is one of the highest levels of NLP. Previous semantic textual similarity (STS) studies have used methods ranging from string-based similarity to deep learning. These studies have limitations, and no research has applied deep learning to STS in the local language. STS benefits NLP applications such as information retrieval, information extraction, text summarization, data mining, machine translation, and other tasks. This thesis presents a deep learning approach for capturing STS in the Guragigna language. The methodology involves collecting a Guragigna language corpus and preprocessing the text data. Text is represented using the Universal Sentence Encoder (USE) along with the word embedding techniques Word2Vec and GloVe, and mean squared error (MSE) is used to measure performance. In the experimentation phase, LSTM, Bidirectional RNN, GRU, and Stacked RNN models are trained and evaluated with each embedding technique. The results demonstrate the efficacy of the developed models in capturing semantic textual similarity in the Guragigna language. Across the three embedding techniques (Word2Vec, GloVe, and USE), the Bidirectional RNN model with USE embeddings achieves the lowest MSE of 0.0950 and the highest accuracy of 0.9244. GloVe and Word2Vec embeddings also show competitive performance, with slightly higher MSE and lower accuracy. USE consistently emerges as the top-performing embedding across all RNN architectures.
The research results demonstrate the effectiveness of LSTM, GRU, Bidirectional RNN, and Stacked RNN models in measuring semantic textual similarity in the Guragigna language.	en_US
dc.description.sponsorship	Wolkite University	en_US
dc.language.iso	en	en_US
dc.publisher	WOLKITE UNIVERSITY	en_US
dc.subject	Semantic textual similarity	en_US
dc.subject	Guragigna language	en_US
dc.subject	deep learning	en_US
dc.subject	corpus-based approaches	en_US
dc.subject	LSTM	en_US
dc.subject	GRU	en_US
dc.subject	Bidirectional RNN	en_US
dc.subject	Stacked RNN	en_US
dc.subject	Word embedding	en_US
dc.title	DEVELOPING SEMANTIC TEXTUAL SIMILARITY FOR GURAGIGNA LANGUAGE USING DEEP LEARNING APPROACH	en_US
dc.type	Thesis	en_US
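
The abstract names mean squared error (MSE) as the performance measure. As a minimal sketch of that metric only, the Python below averages the squared differences between gold and predicted similarity scores. The function name and the toy scores are hypothetical illustrations, not code or data from the thesis, and the record does not state how the reported accuracy figure is computed, so it is not reproduced here.

```python
from typing import Sequence


def mean_squared_error(gold: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the squared differences between gold and predicted scores."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one score pair is required")
    return sum((g - p) ** 2 for g, p in zip(gold, predicted)) / len(gold)


# Hypothetical toy similarity scores in [0, 1]; NOT results from the thesis.
gold_scores = [1.0, 0.0, 0.5]
predicted_scores = [0.9, 0.2, 0.5]

mse = mean_squared_error(gold_scores, predicted_scores)
print(f"MSE = {mse:.4f}")
```

A lower value means the predicted similarity scores sit closer to the gold scores, which is the sense in which the abstract reports the Bidirectional RNN with USE embeddings as the best configuration.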

Files

Original bundle

Name: GETNET DEGEMU.pdf
Size: 3.33 MB
Format: Adobe Portable Document Format

Collections

Department of Computer Science