
Solomon Atnafu
- PhD
- Professor (Associate) at Addis Ababa University
About
Publications: 68
Reads: 74,285
Citations: 706
Introduction
Dr. Solomon Atnafu is currently an Associate Professor in the Department of Computer Science, Addis Ababa University. His research interests are in information retrieval, localization, e-governance, multimedia systems, and mobile information systems. He has co-advised four PhD candidates who completed their studies successfully. He has also advised more than sixty-five graduate students on their MSc thesis work on different subjects. During his career so far, he has published and co-published more than fifty research papers in reputable journals and peer-reviewed international and national conference proceedings. He has served as head of the Department of Computer Science and as Associate Director of the Center for IT Research and Innovation (CITRI) at the IT Doctoral Program of AAU.
Current institution
Additional affiliations
September 1995 - present
Education
September 1999 - July 2003
Publications (68)
The rise of transliterated script usage on social media has presented significant challenges to hate speech detection models, as such scripts often bypass models trained exclusively on formal language datasets. Existing Amharic hate speech detection studies predominantly focus on datasets written in formal Amharic scripts using machine learning app...
In this article, exploratory research is conducted to analyze statistical overlap across Amharic and Tigrigna at different levels of abstraction, namely the word level, the CV syllable level, and the phoneme level. Amharic and Tigrigna are among the most widely spoken Ethiosemitic languages in Ethiopia, yet they are too under-resourced to be fully integrated into TTS ap...
Deepfakes have raised significant concerns due to their potential to spread false information and compromise digital media integrity. In this work, we propose a Generative Convolutional Vision Transformer (GenConViT) for deepfake video detection. Our model combines ConvNeXt and Swin Transformer models for feature extraction, and it utilizes Autoenc...
As social media platforms become increasingly accessible, individuals’ usage of new forms of textual communication (posts, comments, chats, etc.) on social media using local language scripts such as Amharic has increased tremendously. However, many users prefer to post comments in Latin scripts instead of local ones due to the availability of more...
As the number of social media comments available online grows, the spread of hate speech has grown gradually. When someone uses hate speech as a weapon to injure, degrade, and humiliate others, their freedom, dignity, and personhood can be jeopardized. Deep neural network-based hate speech detection models, such as the conventional single channel c...
In this study, an experiment is conducted to explore and exploit shared Amharic and Tigrigna syllables in the development of an Amharic-Tigrigna bilingual text-to-speech synthesizer. Both Amharic and Tigrigna are under-resourced languages, yet they share the Geez writing system along with a large portion of their phone sets and syllables. This study...
As online social media content continues to grow, so does the spread of hate speech. Hate speech has devastating consequences unless it is detected and monitored early. Recently, deep neural network-based hate speech detection models, particularly conventional single-channel Convolutional Neural Network (CNN), have achieved remarkable performance....
To properly apply supervised machine learning, a sufficiently large labeled corpus is required. In this study, we proposed a semi-supervised approach to Amharic sentiment classification. As Amharic is a less-resourced language, there is no sufficiently large labeled corpus for applying supervised machine learning. Nowadays, Amharic texts are widely used in social media, wh...
Topic Modeling is a statistical process, which derives the latent themes from extensive collections of text. Three approaches to topic modeling exist, namely, unsupervised, semi-supervised and supervised. In this work, we develop a supervised topic model for an Amharic corpus. We also investigate the effect of stemming on topic detection on Term Fr...
The emergence of the World Wide Web facilitates the growth of user-generated texts in less-resourced languages. Sentiment analysis of these texts may serve as a key performance indicator of the quality of services delivered by companies and government institutions. The presence of user-generated texts is an opportunity for assisting managers and po...
The rapid advancement of deep learning models that can generate and synthesize hyper-realistic videos, known as Deepfakes, and their easy accessibility to the general public have raised concern among all stakeholders about their possible malicious use. Deep learning techniques can now generate faces, swap faces between two subjects in a video, alte...
Transfer learning is receiving great attention for advancing research on the downstream tasks of Natural Language Processing (NLP) in a cost-effective and rapid way. This is because of the rapid development of context-based pre-trained language models. In this research, we develop the first Bidirectional Encoder Representations from Transformers (BER...
Automatic Speech Recognition (ASR) is one of the most important technologies to support spoken communication in modern life. However, its development requires a large speech corpus. The development of such a corpus is expensive, and most human languages, including the Ethiopian languages, do not have such resources. To address this problem...
Introduction: Due to the advancement of World Wide Web technology, users commonly express their feelings, emotions, and opinions as comments in response to posted news, photos, audio, and video. Currently, opinionated sources are increasing in languages other than English. However, research on Amharic sentiment analysis is scarce, as it has no suf...
Introduction: For carrying out Amharic sentiment classification, the availability of sentiment lexicons is crucial. To date, two Amharic sentiment lexicons have been generated: a manually generated lexicon (1000) [2] and the dictionary-based Amharic SWN and SOCAL lexicons [3]. However, dictionary-based lexicons have shortcomings in tha...
In this paper, we describe an attempt towards the development of parallel corpora for English and Ethiopian languages, such as Amharic, Tigrigna, Afan-Oromo, Wolaytta and Ge’ez. The corpora are used for conducting bi-directional statistical machine translation experiments. The BLEU scores of the bi-directional Statistical Machine Translation (SMT...
Sentiment analysis is an active research area with several applications, including analysis of political opinions and classification of comments, movie reviews, news reviews, and product reviews. To employ rule-based sentiment analysis, a sentiment lexicon is required. However, manual construction of a sentiment lexicon is time-consuming and costly for resource-lim...
In this paper, we describe an attempt towards the development of parallel corpora for English and Ethiopian languages, such as Amharic, Tigrigna, Afan-Oromo, Wolaytta and Ge'ez. The corpora are used for conducting bi-directional statistical machine translation experiments. The BLEU scores of the bi-directional Statistical Machine Translation (SMT...
In this paper, we describe the development of parallel corpora for Ethiopian Languages: Amharic, Tigrigna, Afan-Oromo, Wolaytta and Ge'ez. To check the usability of all the corpora we conducted baseline bi-directional statistical machine translation (SMT) experiments for seven language pairs. The performance of the bi-directional SMT systems shows...
Drought monitoring, and its impact management planning, has been a challenge for decision makers mainly because of lack of reliable information and decision support tools. The main objective of the study was to develop a remote sensing-based vegetation condition drought-monitoring approach for pastoralist areas using multi-temporal and spatial reso...
The objective of this study was to develop information mining methodology for drought modeling and predictions using historical records of climate, satellite, environmental, and oceanic data. The classification and regression tree (CART) approach was used for extracting drought episodes at different time-lag prediction intervals. Using the CART app...
Claims and patterns are used as structures for knowledge capture, design, and sharing. Both structures have been researched and used independently for different application areas and in different contexts. This paper examines how the two structures can be combined to leverage interaction design for low-literacy in low-resource settings, taking into...
With the proliferation of mobile phone and services in African rural communities, there is an emerging need for appropriate interaction design for low-literacy. This workshop will bring together people interested in low-literacy mobile interaction design to share design philosophies of low-literacy, explore design knowledge creation and sharing tec...
This paper presents a method to leverage mobile interaction design knowledge for low-literacy, moving from falsifiable hypotheses (claims) to actionable solutions (patterns). In prior work, claims and patterns have been used separately for different application areas and in different contexts. This research asserts that the transition from claims t...
Although the syntactic and structural heterogeneities among inter-language linked open data (LOD) sources bring many challenges, entity co-reference resolution in a multilingual linked open data (MLOD) setting is not well studied. In this research, a three-phase approach is proposed. First, statistical relational learning (SRL) with factori...
This paper identifies factors important in low-literacy mobile user interaction design and development. It explains the limitations and recurrent design problems from developing countries, focusing on Ethiopia as a primary case study, with special consideration for the designer perspective. This exploratory research effort examines the match and mi...
With the development of ubiquitous technologies that support the digitization of money, research is needed on how individuals’ private life practices are affected by new technological financial systems and how cash-based practices can inform their design. In this paper, we report the cash-based monetary practices of one Ethiopian rural community an...
The main objective of this research was to identify co-referent entities located in several linked open data (LOD) sources that are described in various natural languages. The problem is approached from two perspectives. First, we do a multi-scale analysis of the RDF graph to discover structural similarities of entities. This was implemented as a t...
This study focused on land suitability analysis to identify permissible areas suitable for rice crop production in the west central highlands of Amhara Region, Ethiopia. The research applied a GIS-based Multi-Criteria Decision Approach. Soil, climatic conditions, and topography were the criteria identified as necessary for the intended applicati...
Internet access has a significant impact on a country's economic development. What makes Internet access more relevant and useful to a country is its local content. Local content development is a critical resource for the adoption and development of the Internet in an economy. This work identifies the problems and analyzes the status of local Internet...
Due to climate changes and the uncertainties in future weather conditions, research on drought monitoring information received more attention from politicians and scientists. The objective of this paper is to develop a new intelligent system concept for drought information extraction and predictions from satellite images. For the modeling experimen...
Current mobile money systems provide users with a hierarchical user interface and represent money as positive rational numbers of the form 1, 3, 4.87...N. However, research indicates that rural communities that cannot read and write have difficulty entering such numbers into a mobile money system. Navigating through a hierarchical text menu is also d...
In developing countries, although money is becoming digital in the form of mobile money, it is not easily used by millions of illiterate users in their everyday transactions. Digitization of material money thus poses a challenge to many users. Existing mobile money systems and platforms represent money in terms of simple numbers, like 13, 50, 0.78,...
Existing mobile money architectures overlooked value storage and everyday money practices of individuals. They mainly deal with payment related issues and procedures based on bank accounts. They also targeted urban people, who have banking and technology know-how. Based on this knowledge gap, this research intends to explore, analyze, and identify...
Previous studies concerned with mobile financial services for the poor have been narrowly conceived, mainly depending on secondary data and focusing on technical design issues without having fully understood individuals' complex relationships and money practices. In order to contribute to this knowledge gap and inform mobile money system design, an...
As the use of non-English languages grows exponentially on the Web, the number of online non-English speakers who realize the importance of finding information in different languages is growing enormously. However, the major general purpose search engines such as Google, Yahoo, etc., have been lagging behind in providing indexes and search features to ha...
Due to climate change and uncertainties in the future weather conditions, drought information mining research has got the attention of both politicians and scientists. In the past, there were limited tools available for extracting and converting the huge data available from different sources to actionable information. The objective of this article...
The purpose of this paper is to develop a new concept and approach for extracting knowledge from satellite images for near real-time drought monitoring in areas experiencing food insecurity in order to mitigate climate change. The near real-time data downloaded from the Atlantic Bird satellite was used to produce the drought spatial distribution in...
The main objective of this research was to develop a new concept and approach to extract knowledge from satellite imagery for near real-time drought monitoring. The near real-time data downloaded from the Atlantic Bird satellite were used to produce the drought spatial distribution. Our results showed that approximately 40% of the observed areas...
As XML documents are distributed across the web, they can be regarded as a distributed repository of XML documents and are subject to distribution design. However, there is no adequate work on XML document distribution design. To address the shortcomings in XML document fragmentation design, in this work, we have focused on the vertical fragmenta...
Legal documents play a basic role in discharging the law to the public, besides constituting learning material for students, researchers and legal practitioners. Legal documents contain text rich contents that can be structured and marked with description languages such as XML. This basic feature can be exploited by XML based retrieval models to re...
Owing to climatic change and the uncertainty of weather conditions, drought has become a recurrent phenomenon. It is manifested by erratic and uncertain rainfall distribution in rainfall-dependent farming areas. Drought monitoring has hitherto employed conventional methods that rely on the availability of meteorological data. The objectives o...
On the Web, the use of languages other than English (e.g., Amharic language) has been growing exponentially. The number of Web documents in Amharic language as well as Internet users in Ethiopia is growing dramatically. However, the major search engines have been lagging behind in providing indexes, stemming and search features to handle this langu...
The Web is a huge repository of information in the form of text, image, audio, and video. People use search engines, such as Google, Yahoo!, etc., to discover resources from this huge repository. These general purpose search engines are designed and optimized for the English language. They fall short when they are used for locating web resources of othe...
The Web is a huge repository of information in the form of text, image, audio, and video. People use search engines, such as Google, Yahoo!, Bing, etc., to discover resources from this huge repository. Originally, these general purpose search engines were designed and optimized for the English language. They fall short when they are used for locating web...
The law changes for several reasons, and this change is manifested in the content of legal documents, which in turn makes legal document retrieval systems less efficient in their retrieval quality. In this work, we propose an approach for measuring a user query tendency that can augment the efficiency of XML based legal document retrieval systems....
The amount of available audio data is increasing rapidly in consequence of advancements in media creation, storage and compression technologies. This rapid increase imposes new demands in audio data management and retrieval. In this work, we proposed an audio data model and repository model to fulfill user requirements in retrieving audio data from...
Nowadays, search engines provide the easiest way to reach information resources that are available on the Web. The use of languages other than English has been growing exponentially on the Web. Amharic is one of these languages. Web documents in Amharic are increasing very fast. The number of Internet users in Ethiopia is g...
Content-Based Image Retrieval (CBIR) has been a very active research area for several years. Since exact matching is neither possible nor desirable with images, the most widely used approach is to compute a similarity score between images by comparing their...
Database fragmentation allows reducing irrelevant data accesses by grouping data frequently accessed together in dedicated segments. In this paper, we address multimedia database fragmentation to take into account the rich characteristics of multimedia objects. We particularly discuss multimedia primary horizontal fragmentation and focus on semanti...
Partitioning techniques are traditionally used in distributed system design to reduce accesses to irrelevant information by grouping data frequently accessed together in specific fragments. Here, we address the primary horizontal fragmentation of textually annotated multimedia data. In this study, we discuss the issue of identifying semantic implic...
Until recently, issues in image retrieval have been handled in DBMSs and in computer vision as separate research works. Nowadays, the trend is towards integrating the two approaches (content- and metadata-based) for multi-criteria image retrieval. However, most existing works and proposals in this domain lack a formal framework to deal with a multi...
As the wireless revolution continues to make rapid inroads into telecommunications in developing countries such as Ethiopia, the availability of wireless local content and application programs in local writing systems becomes a necessity. However, a few fundamental technical barriers have to be removed before wireless content in such writing systems s...
Content-based image retrieval is of growing importance in many application domains. For this reason, there is a large body of research in two fields: DBMS and pattern recognition. However, most existing systems have, in particular, the drawback of offering no formal framework for so-called hybr...
This paper presents a multimedia join operator that is carried out through the method of the nearest neighbor search. In contrast to related approaches that utilize a similarity function to perform a join between two instances of the input tables, we adopt the more flexible and widely used nearest neighbor method. First, we introduce a simple near...
Multimedia databases; Processing of a multimedia join; Nearest neighbor search.
The need for systems that can store, represent and provide efficient retrieval facilities for images of particular interest is becoming very high in medicine. In this respect, a lot of work has been done to integrate image data in standard data processing environments. The two different approaches that are used for the representation of images are...
For the last two decades, image database management has been practiced using different image representation methods. In the literature, images are represented using two paradigms: the metadata-based and the content-based representations. Image retrieval using the metadata is done using the traditional database operations. However, image retrieval...
Managing image data in a database system using metadata has been practiced for the last two decades. However, describing an image fully and adequately with metadata is not possible in practice. The alternative is to describe image content by its low-level features such as color, texture, shape, etc. and to use these for similarity-based im...
The integration of similarity-based data retrieval techniques into database management systems, in order to efficiently support multimedia data, is currently an active research issue. In this paper, we first demonstrate the necessity of introducing novel similarity-based operations in image databases, with example queries. Then, we introduce our im...
In database management systems, the need to integrate content-based image retrieval facilities has become one of the key issues. We first illustrate the importance of such facilities with example queries and give an overview of the work done in similarity-based data retrieval. Then, we propose an image repository model that supports similarity-base...
The many successful research results in the domain of computer vision have made similarity based data retrieval techniques a promising approach. As a result, the integration of similarity based retrieval techniques of multimedia data into DBMSs is currently an active research issue. We first illustrate the importance of similarity based operations....
Content-based image retrieval is one of the current active and promising research topics in image analysis and image management. The work of transferring the results of this research to support images in DBMS has also been initiated by many researchers and commercial systems. We first illustrate the importance of using content-based operations in c...
The role of ICT in increasing efficiency in different sectors such as governance, the economy, and social affairs is becoming a necessity, to the extent that in the new millennium it will be practically impossible for many activities to keep up with development standards without it. African countries thus have no choice but to do their...
For a developing country such as Ethiopia with a difficult mountainous terrain and limited transportation infrastructure coupled with one of the lowest patient-doctor ratios anywhere in the world (almost 30,000 to 1), telemedicine offers a cost-effective health-care system. This awareness is slowly gaining traction in the country with a pilot progr...