College of Computing and Informatics
URI for this community: https://rps.wku.edu.et/handle/987654321/2333
Search Results (10 results)
Item SENTIMENT ANALYSIS FOR AMHARIC-ENGLISH CODE-MIXED SOCIO-POLITICAL POSTS USING DEEP LEARNING (WOLKITE UNIVERSITY, 2024-05) YITAYEW EBABU

Sentiment analysis is crucial in natural language processing for identifying emotional nuances in text. The task becomes especially intricate when dealing with code-mixed texts, such as Amharic-English, which exhibit language diversity and frequent code-switching, particularly in social media exchanges. In this investigation, we propose CNN, LSTM, BiLSTM, and CNN-BiLSTM models to tackle sentiment classification in such code-mixed texts. Our approach leverages deep learning techniques and various preprocessing methods, including language detection and code-switching integration. We conducted four experiments using Count Vectorizer and TF-IDF features. Our assessment reveals that incorporating language detection and code-switching handling significantly boosts model accuracy: the average accuracy of the CNN model increased from 82.004% to 84.458%, the LSTM model from 79.716% to 81.234%, the BiLSTM model from 81.586% to 83.402%, and the CNN-BiLSTM model from 82.128% to 84.765%. These results underscore the efficacy of tailored preprocessing strategies and language detection in enhancing sentiment classification accuracy for code-mixed texts. Our study emphasizes the importance of addressing language diversity and code-switching to achieve dependable sentiment analysis in multilingual environments.
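The experiments above contrast Count Vectorizer and TF-IDF features. As a minimal sketch of the TF-IDF weighting involved (the toy code-mixed corpus below is hypothetical, and real preprocessing would include the language detection described in the thesis):

```python
import math
from collections import Counter

def tf_idf(corpus):
    """corpus: list of token lists; returns one {term: weight} dict per document."""
    n = len(corpus)
    df = Counter()                      # document frequency of each term
    for doc in corpus:
        df.update(set(doc))
    weights = []
    for doc in corpus:
        tf = Counter(doc)
        # term frequency (normalized) times inverse document frequency
        weights.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return weights

# Hypothetical toy corpus of code-mixed tokens (illustration only)
corpus = [["selam", "good", "news"],
          ["selam", "bad", "news"],
          ["great", "movie"]]
w = tf_idf(corpus)
```

Terms shared across documents (e.g. "selam") receive lower weight than terms distinctive to one document, which is the behavior the thesis's feature extraction relies on.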
Furthermore, it provides valuable insights for future research, highlighting the importance of language-specific preprocessing techniques in optimizing model performance across diverse linguistic contexts.

Item DEVELOPING CLASSIFICATION MODEL WITH KNOWLEDGE BASE SYSTEM FOR DIAGNOSIS AND TREATMENT RECOMMENDATION OF HOSPITAL ACQUIRED PNEUMONIA (WOLKITE UNIVERSITY, 2024-04) WONDIMU KIBATU GIRMA

Pneumonia is an illness, usually caused by infection, in which the lungs become inflamed and congested, reducing oxygen exchange and leading to cough and breathlessness. It affects individuals of all ages but occurs most frequently in children and the elderly. Among the categories of pneumonia, Hospital Acquired Pneumonia (HAP) ranks first in mortality and morbidity, and a lack of health facilities, a shortage of professionals in hospitals, and the complexity of the diagnosis process compound the problem. Knowledge base systems (KBS) play a great role in the health care sector, so this study aims to combine data mining results with expert knowledge and establish a knowledge base system for the diagnosis and treatment recommendation of HAP. Design science research methodology with a hybrid data mining process model was employed. The researcher gathered a dataset of 3244 cases of HAP from Werabe Referral Hospital. The random forest, J48, JRip, and PART algorithms were used in four tests with two distinct scenarios to create the classifier model. The PART classifier, run on selected attributes with the percentage-split test option, achieved an accuracy of 99.3%. The gained knowledge was modeled with the decision tree technique and represented with rule-based knowledge representation. Semi-structured interviews were used to acquire knowledge from domain experts; this knowledge was likewise modeled with decision trees and represented as production rules.
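Production rules of the kind described can be sketched in a few lines. The thesis implements its rules in SWI-Prolog; the Python matcher below only illustrates the forward-matching idea, and the fact names and conclusions are hypothetical placeholders, not clinical knowledge taken from the study:

```python
# Hypothetical production rules: each maps a set of facts to a conclusion.
RULES = [
    {"if": {"fever": "yes", "cough": "yes", "onset_after_48h_admission": "yes"},
     "then": "suspected_HAP"},
    {"if": {"fever": "no"}, "then": "unlikely_HAP"},
]

def classify(facts):
    """Return the conclusion of the first rule whose conditions all match."""
    for rule in RULES:
        if all(facts.get(k) == v for k, v in rule["if"].items()):
            return rule["then"]
    return "unknown"

result = classify({"fever": "yes", "cough": "yes",
                   "onset_after_48h_admission": "yes"})
```

Checking combined rule sets for redundancy, as the study does, amounts to verifying that no rule's condition set subsumes another's with a conflicting conclusion.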
The two sets of extracted knowledge were combined and checked for rule redundancy to develop the knowledge base system. Finally, the researcher used SWI-Prolog to develop the KBS and NetBeans to build the user interface. To evaluate the developed system, the researcher used system performance testing and user acceptance evaluation, achieving 90.3% accuracy for system performance and 91.3% for user acceptance. The results show that the developed system performs well, meets the objectives of the study, and can give proper treatment recommendations. This suggests that the developed system could help in identifying severity levels and in the diagnosis and treatment recommendation of Hospital Acquired Pneumonia.

Item DEVELOPING AUTOMATIC CONSTITUENCY PARSER FOR SILTIGNA LANGUAGE USING DEEP LEARNING APPROACH (WOLKITE UNIVERSITY, 2024-04) TEKA MOHAMMED

In our study, we focused on developing automatic constituency parsing for the Siltigna language using deep learning approaches. The Siltigna language has a growing number of speakers, and our goal was to address the language's resource gaps and enhance its content globally. To achieve this, we employed a transition-based deep learning technique, with a sequence-to-sequence encoder-decoder model as the main architecture; this model has been widely used in natural language processing tasks. We conducted experiments with various deep learning models, including LSTM, BiLSTM, LSTM with attention, GRU, and Transformer models. To train and evaluate these models, we collected a dataset of approximately 2000 sentences and labeled them with corresponding parse trees. Before converting the sentences into sequences, we applied preprocessing techniques such as data cleaning and tokenization, and split the dataset into training and testing sets using an 80-20 split.
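The 80-20 split on roughly 2000 labeled sentences can be sketched as follows; the sentence/parse-tree placeholders are hypothetical stand-ins for the thesis's actual data:

```python
import random

def train_test_split(pairs, test_ratio=0.2, seed=42):
    """pairs: list of (sentence, parse_tree) examples; returns (train, test)."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical placeholder examples standing in for labeled parse trees
data = [(f"sentence {i}", f"(S ... {i})") for i in range(2000)]
train, test = train_test_split(data)
```

Shuffling before cutting keeps both sets representative of the corpus, which matters when the collected sentences are grouped by source or topic.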
Subsequently, we trained and tested the LSTM, BiLSTM, LSTM with attention, GRU, and Transformer models on the labeled parse-tree data. Among these models, the Transformer achieved the best performance, with 84.38% accuracy, a loss of 0.137, and an LAS of 0.83, indicating that it was the most effective at accurately parsing the Siltigna language. Our study highlights the importance of natural language processing in an interconnected global community. By developing automatic constituency parsing for the Siltigna language, we aim to bridge language barriers and enable effective communication across borders.

Item DESIGN HYBRID BASED INTRUSION DETECTION SYSTEM USING MACHINE LEARNING ALGORITHMS AND SAFE MACHINE LEARNING (WOLKITE, ETHIOPIA, 2024-04) SOLOMON NEGASA JARA

In the field of computer network security, network attacks have gained international attention. This thesis therefore evaluates different machine learning classification algorithms, together with SafeML, for classifying network events in intrusion detection systems using both supervised and unsupervised machine learning methods. For the supervised approach we used KNN, Decision Tree, Random Forest, and Extra Trees. We present a hybrid machine learning approach to detect attacks: in the misuse (signature) detection module, the four classifiers KNN, Decision Tree, Random Forest, and Extra Trees detect known attacks based on the signature database, while the unsupervised detection module handles unknown attacks using the k-Means clustering algorithm. During the pre-processing phase we applied data normalization and a label encoder, which translates string features into numerical ones, since many ML techniques cannot handle them directly.
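The label-encoding step just described maps categorical strings to integers. A minimal sketch, using hypothetical protocol values rather than actual CICIDS2017 fields:

```python
def fit_label_encoder(values):
    """Map each distinct string to a stable integer id (sorted for determinism)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}

# Hypothetical categorical feature column (illustration only)
protocols = ["tcp", "udp", "tcp", "icmp"]
enc = fit_label_encoder(protocols)
encoded = [enc[v] for v in protocols]
```

The same fitted mapping must be reused on the test set so that identical strings receive identical codes during evaluation.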
During the ML model construction process, we employed a comprehensive feature engineering technique using information gain (IG) and kernel PCA (KPCA) to eliminate irrelevant, redundant, and noisy features while retaining the essential ones. We also used SPSS to examine associations and compute descriptive statistics such as the mean and standard deviation. To demonstrate the proposed evaluation method, we conducted experiments on the CICIDS2017 dataset. The results showed that the classification model integrated with the transformation and feature selection methods yields superior accuracy and error rate with fewer false alarms, and that the Extra Trees and Random Forest models attain the highest accuracy while reducing the false alarm rate. By utilizing ECDF-based statistical distance measures, the second technique safely and accurately predicted model performance. This part comprises experiments performed with Safe Machine Learning using the empirical cumulative distribution function (ECDF) and statistical distance measures including the Kolmogorov-Smirnov, Kuiper, Anderson-Darling, Wasserstein, and mixed Wasserstein-Anderson-Darling measures. We then compared the accuracy estimates from all SafeML statistical distance measures against the accuracy of the machine learning algorithms to find the model with the highest confidence.

Item DEVELOPING SEMANTIC TEXTUAL SIMILARITY FOR GURAGIGNA LANGUAGE USING DEEP LEARNING APPROACH (WOLKITE UNIVERSITY, 2024-05-28) GETNET DEGEMU

Natural language processing (NLP) is one part of how far the world has come in terms of technology. It is the process of teaching human language to machines and covers everything from morphological analysis to pragmatic analysis. Semantic similarity is one of the highest levels of NLP. Previous semantic textual similarity (STS) studies have used methods ranging from string-based similarity to deep learning.
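The contrast between string-based similarity and embedding-based methods can be made concrete with two small functions: Jaccard overlap on token sets (a classic string-based measure) versus cosine similarity on dense vectors (as applied to embeddings such as Word2Vec or USE). The example inputs are hypothetical, not drawn from the Guragigna corpus:

```python
import math

def jaccard(a, b):
    """String-based similarity: overlap of the two sentences' token sets."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def cosine(u, v):
    """Embedding-based similarity: cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv)
```

String-based measures see only surface overlap, while cosine similarity over learned embeddings can score paraphrases as similar even with no shared tokens, which is why the deep learning approach is pursued here.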
These studies have their limitations, and no research has been done on STS for this local language using deep learning. STS offers significant advantages in NLP applications such as information retrieval, information extraction, text summarization, data mining, machine translation, and other tasks. This thesis presents a deep learning approach for capturing semantic textual similarity in the Guragigna language. The methodology involves collecting a Guragigna language corpus and preprocessing the text data; text representation is done with the Universal Sentence Encoder (USE) along with word embedding techniques including Word2Vec and GloVe, and mean squared error (MSE) is used to measure performance. In the experimentation phase, models such as LSTM, Bidirectional RNN, GRU, and Stacked RNN are trained and evaluated with the different embedding techniques. The results demonstrate the efficacy of the developed models in capturing semantic textual similarity in the Guragigna language. Across the embedding techniques, Word2Vec, GloVe, and USE, the Bidirectional RNN model with USE embedding achieves the lowest MSE of 0.0950 and the highest accuracy of 0.9244; GloVe and Word2Vec embeddings also show competitive performance with slightly higher MSE and lower accuracy. The Universal Sentence Encoder consistently emerges as the top-performing embedding across all RNN architectures. Overall, the results demonstrate the effectiveness of LSTM, GRU, BiRNN, and Stacked RNN models in measuring semantic textual similarity in the Guragigna language.

Item YEMSA TO AMHARIC MACHINE TRANSLATION USING DEEP LEARNING TECHNIQUES (WOLKITE UNIVERSITY, 2023-12) TEMESGEN HABTAMU ESHETU

In today's globalized world, the barriers of distance and language have been greatly diminished, transforming our world into a closely interconnected global community.
As a consequence, human languages have taken on an international character, enabling effective communication across borders. Human translation is traditionally costly and inconvenient, and much research is currently being conducted to resolve this problem with machine translation techniques, which automatically translate one language into another using a computer software system. In this study, Yemsa-to-Amharic machine translation, and vice versa, is developed using deep learning techniques, motivated by the need to address the endangered status of the Yemsa language and to enhance its content on the World Wide Web. A number of indigenous medicinal knowledge terms, called Samo Heta, and other traditional and religious names are found in the Yemsa language. We utilized current state-of-the-art deep learning methods, executed with a sequence-to-sequence encoder-decoder architecture. The proposed study conducted experiments on LSTM, BiLSTM, LSTM with attention, GRU, and Transformer models. We collected a dataset of about 6,000 parallel sentences with vocabularies of 11,690 and 12,491 words. To convert text into sentence sequences, we applied preprocessing techniques and used Morfessor tools. The dataset was divided into training and testing sets with an 80/20 split, and the models were trained and tested on the corresponding sets. Among these models, the Transformer outperformed the others with 99.4% accuracy, a loss of 0.0113, and BLEU scores of 9.7 and 9.8 for Yemsa-to-Amharic and Amharic-to-Yemsa respectively. The primary limitation of the investigation is the insufficient availability of a substantial dataset for comprehensive experimentation; consequently, parallel corpora need to be generated to conduct comparable research.
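The BLEU scores reported above are built from clipped (modified) n-gram precision. A minimal sketch of that core quantity, omitting BLEU's geometric mean over n-gram orders and its brevity penalty:

```python
from collections import Counter

def modified_ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision: candidate n-gram counts are capped at the
    number of times each n-gram appears in the reference."""
    def ngrams(tokens, k):
        return Counter(tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1))
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    clipped = sum(min(c, ref[g]) for g, c in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

# Classic clipping example: repeating a reference word does not inflate the score
p = modified_ngram_precision(["the", "the", "the"], ["the", "cat"], n=1)
```

Clipping is what prevents a degenerate translation that repeats common words from scoring well, which matters for low-resource pairs like Yemsa-Amharic where such degenerate outputs are common early in training.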
Finally, the findings of the study show that utilizing deep learning techniques, particularly the Transformer model, can significantly improve Yemsa-to-Amharic machine translation accuracy and BLEU scores.