College of Computing and Informatics
URI for this community: https://rps.wku.edu.et/handle/987654321/2333
Search Results
14 results
Item DESIGN SOCIAL EVENT EXTRACTION MODEL FROM AMHARIC TEXTS USING DEEP LEARNING APPROACHES (Wolkite University, 2025-05-01) MIMI ADIMASU GODINE
Social events play a crucial role in capturing societal trends, public opinions, and cultural activities. Extracting and analyzing social events from Amharic text can provide valuable insights into various domains, but the extraction poses significant challenges due to the complexity of the language and the unstructured nature of user-generated content. This study aims to develop an effective social event extraction model for Amharic text using deep learning approaches. The study used Yem Zone social event datasets totaling 4,738 event records. By evaluating various feature extraction techniques, including FastText, Bi-grams, and Tri-grams, we identify the most suitable methods for enhancing event extraction accuracy. We implement several deep learning models, including LSTM, Bi-LSTM, GRU, Bi-GRU, and Simple-RNN, and assess their performance in extracting event trigger words. The results indicate that the GRU and Bi-GRU models consistently outperform their LSTM and Bi-LSTM counterparts, particularly when utilizing Tri-gram features. Notably, the Bi-GRU model achieves the highest accuracy of 1.00, underscoring the benefits of a bidirectional approach in capturing contextual information. This research contributes to the advancement of Amharic language processing, offering insights that can support applications such as cultural studies, disaster management, and crisis response. Additionally, we introduce a social event extraction corpus for the Amharic language, paving the way for future research in this area.

Item TIME SERIES CRIME PREDICTION ANALYSIS USING RNN: A CASE OF WOLKITE CITY POLICE DEPARTMENT (Wolkite University, 2024-01-01) SOLOMON KASSAYE ESHETU
Crime is an undesirable phenomenon and a global concern that impacts both society and individuals.
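The Bi-gram and Tri-gram features credited with the best results in the event-extraction study above can be sketched in plain Python. This is an illustrative fragment, not the study's pipeline; the whitespace tokenization and the English sample tokens are assumptions for illustration (the study works on Amharic text).

```python
# Sliding-window n-gram extraction over a token sequence.
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical tokenized sentence (stand-in for an Amharic sentence).
tokens = ["the", "wedding", "was", "held", "yesterday"]
bigrams = ngrams(tokens, 2)   # overlapping word pairs
trigrams = ngrams(tokens, 3)  # overlapping word triples
```

Each n-gram would then be mapped to an index or embedding before being fed to the recurrent models.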
Annually, we observe an increase in the number of criminal incidents, posing a threat to both public safety and the well-being of the community. Uneven demand for police services at different times is one problem observed in police workforce assignment. Our study aims to determine and examine the relationship between crime date-time and the number of crime incidents, as well as their types and locations. We collected nine thousand eight hundred twenty (9,820) criminal offenses handled by Wolkite City from 2008 to 2014 E.C., covering the seven (7) most frequently occurring crime types and fifty-two (52) crime locations. Preprocessing techniques such as label encoding and Min-Max scaling were applied. We employed RNN models, including Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Bidirectional LSTM (Bi-LSTM), and Bidirectional GRU (Bi-GRU), trained them on the training dataset to predict crime type and location, and evaluated them on the testing dataset using metrics such as MSE and R2. For crime type prediction, the hourly, daily, and monthly MSE values per model are: LSTM 0.0125, 0.0126, and 0.0468; Bi-LSTM 0.0126, 0.0125, and 0.0466; GRU 0.0127, 0.0128, and 0.0501; Bi-GRU 0.0126, 0.026, and 0.0468. For crime location prediction: LSTM 0.0108, 0.0109, and 0.0617; Bi-LSTM 0.0108, 0.0110, and 0.0506; GRU 0.0106, 0.0105, and 0.0582; Bi-GRU 0.0105, 0.0106, and 0.0513. For crime type prediction, Bi-GRU, Bi-LSTM, LSTM, and GRU achieve R2 of 0.9995, 0.9994, 0.9899, and 0.9811, respectively. For crime location prediction, Bi-LSTM, LSTM, Bi-GRU, and GRU achieve R2 of 0.9938, 0.9937, 0.9937, and 0.9934, respectively. For hourly crime type prediction LSTM is slightly better, and for daily and monthly prediction Bi-LSTM is better.
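The preprocessing and evaluation steps named above can be sketched by hand: Min-Max scaling maps values to [0, 1], MSE is the mean squared residual, and R2 compares residual error against variance around the mean. A minimal illustration (the study used library implementations; these tiny functions are assumptions for clarity):

```python
def minmax_scale(values):
    """Min-Max scaling: map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

scaled = minmax_scale([10, 20, 30])  # [0.0, 0.5, 1.0]
```

An MSE near 0 and an R2 near 1, as reported above, indicate predictions close to the ground truth.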
For hourly and monthly crime location prediction Bi-GRU is slightly better, and for daily prediction GRU is slightly better. In terms of R2, Bi-GRU scores slightly higher than the others for crime type, and Bi-LSTM has slightly higher R2 values for crime location. In general, Bi-LSTM and Bi-GRU achieved better scores for crime prediction with low error on our dataset.

Item END-TO-END SPEECH RECOGNITION FOR GURAGIGNA LANGUAGE USING DEEP LEARNING TECHNIQUES (Wolkite University, 2025-10-05) ABDO NESRU EBRAHIM
Speech recognition entails converting long sequences of acoustic features into shorter sequences of discrete symbols, such as words or phonemes. This process is complicated by varying sequence lengths and uncertainty in output symbol locations, making traditional classifiers impractical. Current automated systems struggle with speaker-independent continuous speech, particularly in low-resource languages like Guragigna, where the Cheha dialect poses additional challenges due to its purely spoken nature and lack of a rigid grammatical structure. To address these issues, this research develops an end-to-end speech recognition model utilizing deep learning techniques, specifically a hybrid CNN-BiGRU architecture combined with CTC and attention mechanisms. This approach aims to enhance alignment and robustness in noisy environments. To train and test the model, a text and speech corpus was created by compiling data from sources such as Wolkite FM and the Old and New Testaments. Experimental results indicate that the CNN-BiGRU model achieves a Word Error Rate (WER) of 2.5%, showcasing improved generalization capabilities. Additionally, four recurrent neural network models (LSTM, Bi-LSTM, GRU, and Bi-GRU) were evaluated, each configured with 1024 hidden units and optimized using the Adam optimizer over 50 epochs. The Bi-GRU model outperformed the others, achieving an accuracy of 97.50%, while the LSTM, Bi-LSTM, and GRU models achieved maximum accuracies of 95.99%, 96.92%, and 96.25%, respectively.
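Word Error Rate (WER), the metric reported for the speech-recognition results above, is the word-level edit distance between the reference transcript and the hypothesis, divided by the reference length. A minimal dynamic-programming sketch (illustrative only; evaluation toolkits implement the same idea):

```python
def wer(reference, hypothesis):
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)
```

A WER of 2.5%, as reported above, means roughly one word error per forty reference words.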
The successful implementation of this end-to-end speech recognition system significantly advances communication technologies for low-resource languages, enhancing accessibility for diverse linguistic communities. The findings underscore the effectiveness of deep learning methods in improving speech recognition performance in challenging linguistic contexts.

Item YEMSA TO AMHARIC MACHINE TRANSLATION USING DEEP LEARNING TECHNIQUES (Wolkite University, 2023-04-01) TEMESGEN HABTAMU ESHETU
In today's globalized world, the barriers of distance and language have been greatly diminished, transforming our world into a closely interconnected global community. As a consequence, human languages have taken on an international character, enabling effective communication across borders. Traditional human translation is costly and inconvenient; much research is currently conducted to resolve this problem with machine translation, which automatically translates one language into another using a computer software system. This study applies deep learning techniques to Yemsa-to-Amharic machine translation and vice versa, in order to address the endangered status of the Yemsa language and enhance its content on the World Wide Web. A number of indigenous medicinal knowledge terms, such as Samo Heta, and other traditional and religious names are found in the Yemsa language. We utilized current state-of-the-art deep learning methods, executed with a seq-to-seq encoder-decoder architecture. The proposed study conducted experiments on LSTM, Bi-LSTM, LSTM with attention, GRU, and transformer models. We collected a dataset of about 6,000 parallel sentences with vocabulary sizes of 11,690 and 12,491. To convert text into sentence sequences, we applied preprocessing techniques and used Morfessor tools. The proposed study utilizes an 80/20 split for dividing the dataset into training and testing sets.
The models were then trained and tested on the corresponding training and testing datasets. The experiment was conducted on LSTM, Bi-LSTM, LSTM with attention, GRU, and transformer models. Among these, the transformer model outperforms the others with 99.4% accuracy, a loss of 0.0113, and BLEU scores of 9.7 and 9.8 for Yemsa-to-Amharic and Amharic-to-Yemsa, respectively. The primary limitation of the investigation is the insufficient availability of a substantial dataset for comprehensive experimentation; consequently, parallel corpora need to be generated to enable comparable research. Finally, the findings of the study show that deep learning techniques, particularly the transformer model, can significantly improve Yemsa-to-Amharic machine translation accuracy and BLEU scores.

Item SENTENCE-LEVEL GRAMMAR CHECKER FOR KAMBAATISSA LANGUAGE USING DEEP LEARNING APPROACH (Wolkite University, 2023) TIHUN SEIFU
In the modern world, the most basic and culturally accepted means of communication is language, and the use of grammar is crucial to language fluency. Finding grammatical errors in natural language processing applications involves checking whether the words in a sentence conform to predefined grammar rules for number, gender, and tense, and convey the necessary information in written language. Incorrect input sentences can have agreement problems, such as subject-verb, adjective-noun, and adverb-verb agreement problems. This study developed a sentence-level grammar checker for the Kambaatissa language using a deep learning approach. In particular, we focus on implementations of gating methods such as the LSTM class and the more recently proposed GRU. For the development of the proposed model, the Python programming language and packages were used; among the packages, TensorFlow and Keras perform the grammar error checking of the proposed model.
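BLEU, reported for the translation experiments above, combines modified n-gram precisions (orders 1 through 4) with a brevity penalty. The clipping idea at its core can be sketched for the unigram case; this is an illustrative fragment of one component, not the full metric or the study's evaluation code:

```python
from collections import Counter

def modified_precision(reference, hypothesis):
    """Modified unigram precision: hypothesis word counts are clipped
    by their counts in the reference, so repeating a word cannot
    inflate the score."""
    ref_counts = Counter(reference.split())
    hyp_counts = Counter(hypothesis.split())
    clipped = sum(min(c, ref_counts[w]) for w, c in hyp_counts.items())
    return clipped / sum(hyp_counts.values())
```

Without clipping, the degenerate hypothesis "the the the" would score perfect precision against any reference containing "the"; clipping caps its credit at one occurrence.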
GRU and LSTM test cases were used for evaluation. The test results show that the LSTM achieved 83% accuracy, 83% recall, 83% precision, an F1-score of 83%, and a kappa score of 78%; the GRU achieved 83% accuracy, 83% recall, 83% precision, an F1-score of 83%, and a kappa score of 77%. The main challenges of this study were the rich and complex morphology of the Kambaatissa language and finding a sufficient number of Kambaatissa sentences. The results of this study help to advance the Kambaatissa language's grammar-checking technology. For writers, students, and language learners seeking to ensure that written text is grammatically correct and consistent, advanced grammar proofreading is an invaluable resource. Future research directions include expanding the coverage of the grammar checker to handle more complex grammatical constructions and integrating it with text-support models for wider usability.

Item ENHANCING THE SECURITY OF USER DATA STORED ON CLOUD (Wolkite University, 2024-01-01) HANA MOTI
Data protection is now more important than ever to protect against hacking, due to the Internet's explosive growth in text transfer. Many encryption and decryption algorithms are used to offer a high level of security, including DH (Diffie-Hellman), RSA (Rivest-Shamir-Adleman), and AES (Advanced Encryption Standard). However, these algorithms frequently call for large key sizes, which can make implementation difficult. In this article, a hybrid technique for data encryption with RSA and AES, combined with the LZW (Lempel-Ziv-Welch) compression technique, is proposed. This thesis mainly concentrates on evaluating the effectiveness of text data encryption and decryption methods utilizing the AES and RSA algorithms, LZW compression, MEGA cloud storage, and CyberGhost VPN for safe storage and internet access. The paper highlights how these methods increase algorithm strength, key generation, and decryption speed to guarantee the security and privacy of sensitive user data.
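The LZW compression step named above can be sketched as the classic dictionary-based encoder: the table starts with all single bytes, and each newly seen sequence is assigned the next code. This is an illustrative pure-Python fragment, not the thesis implementation:

```python
def lzw_compress(data: bytes) -> list:
    """Return a list of integer codes for the input bytes (LZW encoding)."""
    table = {bytes([i]): i for i in range(256)}  # single-byte seed dictionary
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate          # extend the current match
        else:
            codes.append(table[current]) # emit code for the longest match
            table[candidate] = next_code # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

codes = lzw_compress(b"ABABABAB")  # 8 input bytes -> 5 output codes
```

Repetitive text compresses well because ever-longer repeated runs collapse into single dictionary codes, which is why shrinking the plaintext before AES encryption, as in the hybrid scheme above, reduces encryption time.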
The goal of this research is to provide a secure data transport solution that overcomes the shortcomings of existing encryption techniques. The problem this work attempts to solve is the difficulty of key size in encryption methods, which can create implementation issues. The suggested method seeks to offer a quick and reliable way to encrypt data by combining the strengths of RSA and AES cryptography. The study's findings show that the suggested method is quick and secure. Implementation and performance analysis using Python 3.11 (64-bit) shows the efficacy of the suggested strategy. For implementation and performance analysis we used the "Design of Technology Integrated Shredder Machine" project dataset of 4,616,263 bytes before compression; after compression, the file size was reduced to 4,475,411 bytes. Based on the results, with the proposed algorithm the original RSA key generation time is 3.336312 s (100.00%), the encryption time of the AES key by RSA is 0.015691 s (0.01%), the decryption time by RSA is 0.015624 s (0.01%), the encryption time of compressed data by AES is 0.000011 s (0.00%), and the decryption time of compressed data by AES is 0.0000012 s (0.01%). When encrypting with AES alone, the encryption time is 0.07812 s (47.75%) and the decryption time is 0.08546 s (52.25%). As a result of this performance investigation, the proposed AES-RSA hybrid algorithm is faster and more secure than the individual algorithms.

Item NETWORK TRAFFIC CLASSIFICATION OF SOFTWARE DEFINED NETWORK USING DEEP LEARNING (WOLKITE UNIVERSITY, 2024-08) BETEMICHAEL KASSAYE MIHRETU
The field of networking is continuously progressing to accommodate the monumental growth in network traffic.
A centralized control mechanism is provided by architectures such as Software Defined Networking (SDN) for the measurement, control, and prediction of network traffic, but the amount of information the SDN controller receives is enormous. Recently, machine learning (ML) has been suggested for processing that data. Network traffic classification is crucial to a variety of network activities, including fine-grained network management, resource utilization, and network security. To classify and analyze network traffic flows, the port-based approach, deep packet inspection (DPI), and ML are among the most widely used methods. Nevertheless, over the past several years, the number of Internet users has exploded, leading to an explosive increase in Internet traffic. The exponential growth of Internet applications, which incur high computational costs, has made port-based, DPI, and ML approaches inefficient. Software-defined networking is redefining the network architecture by separating the control plane from the data plane, resulting in a centralized network controller that maintains a global view of the entire network. The aim of this paper is to propose a new deep learning model for software-defined networks able to accurately predict a wide range of traffic applications in a short time frame to improve efficiency. The proposed model achieves better results than traditional ML approaches in terms of accuracy, precision, recall, and F1-score: 90.7% accuracy, a 91% F1-score, precision consistently above 92%, 88% recall, and 92% testing accuracy.
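The precision, recall, and F1 figures quoted above follow the standard definitions over confusion counts (true positives, false positives, false negatives). A minimal sketch of those definitions, for reference (not the study's evaluation code):

```python
def precision(tp, fp):
    """Fraction of predicted positives that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of actual positives that are recovered."""
    return tp / (tp + fn)

def f1(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)
```

Because F1 is a harmonic mean, a model with high precision but low recall (or vice versa) still receives a low F1, which is why the paper reports all three.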
Based on the results obtained, some further directions are suggested to achieve future advances in this field.

Item SENTIMENT ANALYSIS FOR AMHARIC-ENGLISH CODE-MIXED SOCIO-POLITICAL POSTS USING DEEP LEARNING (WOLKITE UNIVERSITY, 2024-05) YITAYEW EBABU
Sentiment analysis is crucial in natural language processing for identifying emotional nuances in text. This task becomes especially intricate when dealing with code-mixed texts, like Amharic-English, which exhibit language diversity and frequent code-switching, particularly in social media exchanges. In this investigation, we propose employing CNN, LSTM, BiLSTM, and CNN-BiLSTM models to tackle sentiment classification in such code-mixed texts. Our approach involves leveraging deep learning techniques and various preprocessing methods, including language detection and code-switching integration. We conducted four experiments utilizing Count Vectorizer and TF-IDF. Our assessment reveals that incorporating language detection and code-switching significantly boosts model accuracy. Specifically, the average accuracy of the CNN model increased from 82.004% to 84.458%, the LSTM model from 79.716% to 81.234%, the BiLSTM model from 81.586% to 83.402%, and the CNN-BiLSTM model from 82.128% to 84.765%. These results underscore the efficacy of tailored preprocessing strategies and language detection in enhancing sentiment classification accuracy for code-mixed texts. Our study emphasizes the imperative of addressing language diversity and code-switching to achieve dependable sentiment analysis in multilingual environments.
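TF-IDF, one of the featurizers used in the sentiment experiments above, weights a term by its frequency within a document against its spread across the corpus. A minimal sketch of one common variant (libraries such as scikit-learn differ in smoothing details; the whitespace tokenization here is an assumption):

```python
import math

def tf_idf(term, doc, corpus):
    """One common TF-IDF variant: term frequency x log inverse
    document frequency. `doc` is a string, `corpus` a list of strings."""
    words = doc.split()
    tf = words.count(term) / len(words)
    df = sum(1 for d in corpus if term in d.split())
    idf = math.log(len(corpus) / df)
    return tf * idf

corpus = ["good movie", "good book"]
score = tf_idf("movie", "good movie", corpus)  # distinctive term, positive weight
```

A term appearing in every document gets idf = log(1) = 0, so ubiquitous words contribute nothing, which is exactly what makes TF-IDF sharper than raw counts for classification.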
Furthermore, it provides valuable insights for future research, highlighting the importance of language-specific preprocessing techniques to optimize model performance across diverse linguistic contexts.

Item DEVELOPING CLASSIFICATION MODEL WITH KNOWLEDGE BASE SYSTEM FOR DIAGNOSIS AND TREATMENT RECOMMENDATION OF HOSPITAL ACQUIRED PNEUMONIA (WOLKITE UNIVERSITY, 2024-04) WONDIMU KIBATU GIRMA
Pneumonia is an illness, usually caused by infection, in which the lungs become inflamed and congested, reducing oxygen exchange and leading to cough and breathlessness. It affects individuals of all ages but occurs most frequently in children and the elderly. Pneumonia has different categories, of which Hospital Acquired Pneumonia (HAP) is first in mortality and morbidity. Lack of health facilities, a shortage of professionals in hospitals, and the complexity of the diagnosis process compound the problem. Knowledge-based systems (KBS) play a great role in the health care sector, so this study aims to combine data mining results with expert knowledge and establish a KBS for the diagnosis and treatment recommendation of HAP. Design science research methodology with a hybrid data mining process model was employed. The researcher gathered a dataset of 3,244 cases of HAP from Werabe Referral Hospital. The random forest, J48, JRip, and PART algorithms were used in four tests with two distinct scenarios to create the classifier model. The PART classifier, run on selected attributes with the percentage-split test option, achieved an accuracy of 99.3%. Semi-structured interviews were used to acquire knowledge from experts; the gained knowledge was then modeled with the decision tree modeling technique and represented in production rules.
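The production rules mentioned above, implemented in SWI-Prolog in the study, have the form IF conditions THEN conclusion, and an inference engine applies them repeatedly until no new facts emerge. A minimal forward-chaining sketch in Python; the rule contents and fact names here are hypothetical stand-ins, not the study's clinical rules:

```python
# Hypothetical production rules: (set of condition facts, conclusion fact).
RULES = [
    ({"fever", "productive_cough"}, "suspect_pneumonia"),
    ({"suspect_pneumonia", "hospitalized_48h"}, "suspect_HAP"),
]

def forward_chain(facts):
    """Fire rules until a fixed point: add each conclusion whose
    conditions are all present in the fact base."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts
```

Note the chaining: the second rule can only fire after the first has added its conclusion, which is the behavior a Prolog-style rule base gives for free.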
The two sets of extracted knowledge were combined and checked for rule redundancy to develop the knowledge-based system. Finally, the researcher used SWI-Prolog to develop the KBS and NetBeans to build the user interface. To evaluate the developed system, the researcher used system performance testing and user acceptance evaluation, achieving 90.3% accuracy for system performance and 91.3% for user acceptance testing. The results show that the developed system achieves good performance, meets the objectives of the study, and could give proper treatment recommendations. This suggests that the developed system could help in identifying the severity level and in the diagnosis and treatment recommendation of Hospital Acquired Pneumonia.

Item AUTOMATIC CONSTITUENCY PARSER FOR SILTIGNA LANGUAGE USING DEEP LEARNING APPROACH (WOLKITE UNIVERSITY, 2024-04) TEKA MOHAMMED
In our study, we focused on developing automatic constituency parsing for the Siltigna language using deep learning approaches. The Siltigna language has a growing number of speakers, and our goal was to address the language's issues and enhance its content globally. To achieve this, we employed a deep learning technique known as the transition method, and the main architecture we used was a seq-to-seq encoder-decoder model, which has been widely used in natural language processing tasks. We conducted experiments using various deep learning models, including LSTM, Bi-LSTM, LSTM with attention, GRU, and transformer models. To train and evaluate these models, we collected a dataset of approximately 2,000 sentences and labeled them with corresponding parse trees. Before converting the sentences into sequences, we applied preprocessing techniques such as data cleaning and tokenization, and split the dataset into training and testing sets using an 80/20 split. Subsequently, we trained and tested the LSTM, Bi-LSTM, LSTM with attention, GRU, and transformer models on the labeled parse tree data.
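For seq-to-seq constituency parsing as described above, a parse tree is commonly linearized into a bracketed token sequence that the decoder emits token by token. An illustrative sketch; the nested-tuple tree shape and the English labels are assumptions standing in for the Siltigna annotations:

```python
def linearize(tree):
    """Flatten a nested (label, children-or-word) tree into a
    bracketed token sequence suitable as a seq-to-seq target."""
    label, children = tree
    if isinstance(children, str):            # leaf: POS tag + word
        return ["(", label, children, ")"]
    tokens = ["(", label]
    for child in children:
        tokens += linearize(child)
    return tokens + [")"]

# Hypothetical tree for "dog barks": S -> NP VP.
tree = ("S", [("NP", [("N", "dog")]), ("VP", [("V", "barks")])])
```

A well-formed output sequence has balanced brackets, which is one simple validity check applied to decoder output before scoring metrics such as LAS.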
Among these models, the transformer model achieved the best performance with 84.38% accuracy, a loss of 0.137, and a LAS of 0.83. This indicates that the transformer model was the most effective at accurately parsing the Siltigna language. Our study highlights the importance of natural language processing in an interconnected global community. By developing automatic constituency parsing for the Siltigna language, we aimed to bridge language barriers and enable effective communication across borders.