
Behrouz Minaei · Iran University of Science and Technology · School of Computer Engineering
Associate Prof.
About
403 publications · 125,853 reads
5,993 citations
Introduction
Behrouz Minaei-Bidgoli obtained his Ph.D. from the Department of Computer Science and Engineering at Michigan State University, Michigan, USA. He is an associate professor in the School of Computer Engineering at the Iran University of Science and Technology, Tehran, Iran. He leads one research group in data mining and another in video game technologies. His research interests include text mining, natural language processing, and machine learning.
Additional affiliations
July 2005 - present
Publications (403)
Knowledge graphs facilitate the extraction of knowledge from data and provide a comprehensive overview of all knowledge within departments, across departments, and global organizations. To enrich the extracted knowledge, several solutions have been proposed to complete the knowledge graph. This study investigates the effectiveness of using the sele...
In recent years, various studies have been conducted on SVMs and their applications in different areas, and SVMs have been developed significantly in many of them. SVM is one of the most robust classification and regression algorithms and plays a significant role in pattern recognition. However, SVM has not been developed significantly in some areas like...
A recommender system is a model that automatically recommends some meaningful cases (such as clips/films/goods/items) to the clients/people/consumers/users according to their (previous) interests. These systems are expected to recommend the items according to the users’ interests. There are two traditional general recommender system models, i.e., C...
Today, it is particularly important to recognize the semantic similarity between texts in different languages due to the emergence of new natural language processing models like ChatGPT and Bard. These models can provide more accurate and comprehensive answers to users' questions by identifying semantic similarity between two texts in different lan...
Despite paradigmatic research advancements and movements in applied linguistics, the issue of rhetoric, which serves as one of the fundamental pillars of each paradigm, remains largely unaccounted for. Considering the commensurability of argumentation and meta-analysis, coupled with the increasing rate of meta-analytic studies in the field of appli...
Recommender systems are widely used in many applications. They can be viewed as predictor systems that suggest accurate and highly preferred items to consumers or clients. These systems can be considered information filtering systems. They face some important challenges such as cold start (the absence of enough data for...
Despite all the developments in recommender systems and utilizations of successful application models in the industry, it can be said that there is still a need to improve various parts of these systems in order to enhance their effectiveness and scope of application. Recommender Systems (RS) are well-known in the field of e-commerce and are expect...
In recent years, researchers from academic and industrial fields have become increasingly interested in social network data to extract meaningful information. This information is used in applications such as link prediction between people groups, community detection, protein module identification, etc. Therefore, the clustering technique has emerge...
There are many complex and rich morphological subtleties in the Arabic language, which are very useful when analyzing traditional Arabic texts, especially in the historical and religious contexts, and help in understanding the meaning of the texts. Vocabulary separation means separating the word into different parts such as root and affix. In the m...
Semantic role labeling (SRL) is the process of detecting the predicate-argument structure of each predicate in a sentence. SRL plays a crucial role as a pre-processing step in many NLP applications such as topic and concept extraction, question answering, summarization, machine translation, sentiment analysis, and text mining. Recently, in many lan...
One of the components of natural language processing that has received a lot of investigation recently is semantic textual similarity. In computational linguistics and natural language processing, assessing the semantic similarity of words, phrases, paragraphs, and texts is crucial. Calculating the degree of semantic resemblance between two textual...
Objective
The segmentation of consumers based on their behavior and needs is the most crucial action of the health insurance organization. This study's objective is to cluster Iranian health insureds according to their demographics and data on outpatient prescriptions.
Setting
The population in this study corresponded to the research sample. The H...
Link prediction with knowledge graph embedding (KGE) is a popular method for knowledge graph completion. Furthermore, training KGEs on non-English knowledge graphs promotes knowledge extraction and knowledge graph reasoning in the context of these languages. However, non-English KGEs pose many challenges to learning a low-dimensional representatio...
Predicting the number of patients helps managers allocate hospital resources efficiently. In this research, the relationship between the number of patients and temperature, relative humidity, wind speed, air pressure, and air pollution in instantaneous, short-, medium- and long-term indices was investigated. Genetic algorithm and ID3 de...
A vast amount of unstructured data is being generated in the age of big data. Relation extraction (RE), which has evolved greatly in recent years, is a critical way to improve the utility of this data by extracting structured data from it. This paper first introduces five paradigms of RE, namely the rule-based paradigm, the machine learning paradigm...
An efficient and reliable method is essential to interpret a user’s brain waves and deliver an accurate response in biomedical signal processing. However, EEG patterns exhibit high variability across time and uncertainty due to noise, which is a significant problem to address in mental tasks such as motor imagery. Therefore, fuzzy component...
At present, jurisprudential inferences are made by qualified people using their knowledge of jurisprudence. Today, big data is available electronically, and advances in various artificial intelligence and machine learning techniques allow us to analyze a large amount of jurisprudential data. In this article, while expressing the challenges of juris...
One of the main concerns of researchers in writing scientific texts such as articles and theses is their correctness in terms of spelling since the presence of spelling errors in these texts is unacceptable. This problem, like many natural language processing problems, is highly dependent on the structure and grammar of the language. Persian langua...
Recommender Systems (RSs) have played an important role in online retailing portals and customers’ decision-making processes. Recommender systems that are based on the conventional Collaborative Filtering (CF) approach rely on single customers’ ratings on retailing websites. Multi-criteria CF (MCCF) approaches that rely on multi-aspects of the prod...
Programmers strive to design programs that are flexible, updateable, and maintainable. However, several factors such as lack of time, high costs, and workload lead to the creation of software with inadequacies known as anti-patterns. To identify and refactor software anti-patterns, many research studies have been conducted using machine learning. E...
Since EEG signals encode an individual’s intent to execute an action, scientists have focused extensively on this topic. Researchers have used Motor Imagery (MI) signals to assist disabled persons, for autonomous driving, and even to control devices such as wheelchairs. Therefore, accurate decoding of these signals is essential to develop...
Video games have significant and diverse effects on stress and cognitive systems based on the game style. The effect of this media on the central nervous system is significant because of its repetition. Nowadays, video games have become an important part of human life at different ages, and therefore, assessing their effects (good and bad) on stres...
Massive Open Online Courses (MOOCs) provide learners with high-quality and flexible online courses with no limitations regarding time and location. Detecting users’ behavior in MOOCs is an important task for course recommendations. Collaborative Filtering (CF) is considered a widely used approach in recommender systems to provide an online learner cour...
In wireless sensor networks (WSNs), clustering has proved to be one of the most efficient ways to hierarchically organize the network topology for the purpose of load-balancing and elongating the network lifetime. However, achieving optimal clustering in WSNs is an NP-hard problem, and consequently, heuristics and metaheuristics have been widely ad...
Entity resolution (ER) is a process that identifies duplicate records referring to a real-world entity and links them together in one or more datasets. As a first step toward reducing the number of required record comparisons, blocking methods attempt to group records that are likely to match. A proper evaluation of blocking methods for selecting t...
As one of the significant issues in social networks analysis, the influence maximization problem aims to fetch a minimal set of the most influential individuals in the network to maximize the number of influenced nodes under a diffusion model. Several approaches have been proposed to tackle this NP-hard problem. The traditional approaches failed to...
Access to one of the richest data sources in the world, the web, is not possible without cost. Often, this cost is not taken into account in data acquisition processes. In this paper, we introduce the Learning Agents (LA) method for automatic topical data acquisition from the web with minimum bandwidth usage and the lowest cost. The proposed LA met...
Introduction: Video games affect the stress system and cognitive abilities in different ways. Here, we evaluated electrophysiological and biochemical indicators of stress and assessed their effects on cognition and behavioral indexes after playing a scary video game. Methods: Thirty volunteers were recruited into two groups as control and experimen...
Measuring brain activity through Electroencephalogram (EEG) analysis for eye state prediction has attracted attention from machine learning researchers. There have been many methods for EEG analysis using supervised and unsupervised machine learning techniques. The tradeoff between the accuracy and computation time of these methods in performing th...
Tokenization plays a significant role in the process of lexical analysis. Tokens become the input for other natural language processing tasks, like semantic parsing and language modeling. Natural Language Processing in Persian is challenging due to Persian's exceptional cases, such as half-spaces. Thus, it is crucial to have a precise tokenizer for...
Parkinson’s disease (PD) is a complex neurodegenerative disease. Accurate diagnosis of this disease in the early stages is crucial for its initial treatment. This paper aims to present a comparative study on the methods developed by machine learning techniques in PD diagnosis. We rely on clustering and prediction learning approaches to perform the...
Background and objective
Recent advances in understanding the genetic causes of ALS reveal that about 10% of ALS cases have a genetic origin and that more than 30 genes are likely to contribute to this disease. However, four genes are more frequently associated with ALS: C9ORF72, TARDBP, SOD1, and FUS. The relationship between genetic factors and ALS progres...
Cluster-based routing is the most common routing approach to achieve energy efficiency in wireless sensor networks. However, optimal determination of cluster heads is NP-hard, which calls for heuristics or metaheuristics for obtaining a near-optimal solution. Although metaheuristics achieve better performance, they suffer from high computational ti...
User-Generated-Content (UGC) has gained increasing attention as an important indicator of business success in the tourism and hospitality sectors. Previous literature has analyzed travelers’ satisfaction through quantitative approaches using questionnaire surveys. Another direction of research has explored the dimensions of satisfaction based on on...
Headline generation is a challenging subtask of abstractive text summarization whose output should be a summary shorter than one sentence. It would be valuable to develop a dataset for evaluating abstractive summarization methods on this task in the Persian language. There are several datasets for headline generation in Persian, most o...
With the increasing use of social media and its ability to let users share comments immediately, a system to identify offensive content has become a necessity in all languages. Due to the lack of publicly available resources on offensive language identification for Farsi, which has more than 110 million speakers, we present Pars-OFF,...
The COVID-19 pandemic has caused major global changes both in the areas of healthcare and economics. This pandemic has led, mainly due to conditions related to confinement, to major changes in consumer habits and behaviors. Although there have been several studies on the analysis of customers’ satisfaction through survey-based and online customers’...
To avoid the spread of the COVID-19 crisis, many countries worldwide have temporarily shut down their academic organizations. National and international closures affect over 91% of the education community of the world. E-learning is the only effective manner for educational institutions to coordinate the learning process during the global lockdown...
Dimensionality reduction is an important preprocessing technique in the clustering domain. Feature selection is one dimensionality reduction method; it selects a subset of the most relevant features. This paper proposes a feature selection method based on the Invasive Weed Optimization (IWO) algorithm. The IWO uses clustering strategy on data...
Introduction: Computer games as an interactive media play a significant role in the cognitive and behavioral health of the players. Computer games have either positive or negative effects on cognitive indices among players. They also directly influence the lifestyle and quality of life of children, adolescents, and young adults. The present study a...
Sustainable tourism is an emerging trend around the world. Eco-friendly (green) hotels are environmentally friendly properties that are becoming more popular among green travellers. Electronic Word-of-Mouth (e-WOM) is a method of communicating with customers to share their experiences and is a powerful marketing tool for hotel marketing. This paper...
One of the key challenges in classifying multiple cancer types is the complexity of Tumor Protein p53 (TP53) mutation patterns and their individual effects on tumors. However, far too little attention has been paid to deep reinforcement learning on TP53 mutation patterns because its results are extremely difficult to interpret. We introduce a critic netwo...
Link prediction is the task of predicting missing relations between entities of the knowledge graph by inferring from the facts contained in it. Recent work in link prediction has attempted to provide a model for increasing link prediction accuracy by using more layers in neural network architecture or methods that add to the computational complexi...
Classification is one of the most important and widely used tasks in machine learning; its purpose is to create a rule, based on a training set, for assigning data to pre-existing categories. Employed successfully in many scientific and engineering areas, the Support Vector Machine (SVM) is among the most promising methods...
Clustering ensembles have become increasingly popular in recent years by combining several base clustering methods into a likely better and more robust one. Nonetheless, fuzzy clustering dependability (durability) has gone unnoticed in the majority of proposed clustering ensemble approaches. This makes them weak against low-qua...
Most of the data on the web is in the form of natural language, but natural language is highly ambiguous, especially when it comes to the frequent occurrence of entities. The goal of entity linking is to find entity mentions and link them to their corresponding entities in an external knowledge base. Recently, FarsBase was introduced as the first P...
BACKGROUND
Video games have attracted noticeable attention, especially in recent years. In 2018, the game industry achieved a larger market than video and music combined for the first time ever. As a result, a group of experts decided to take advantage of the attraction of video games in other computer software. Their efforts formed a concept calle...
Training and evaluation of automatic fact extraction and verification techniques require large amounts of annotated data which might not be available for low-resource languages. This paper presents ParsFEVER: the first publicly available Farsi dataset for fact extraction and verification. We adopt the construction procedure of the standard English...
Multiple instance learning (MIL) has become the standard learning paradigm for distantly supervised relation extraction (DSRE). However, due to relation extraction being performed at bag level, MIL has significant hardware requirements for training when coupled with large sentence encoders such as deep transformer neural networks. In this paper, we...
The WordNet lexical database groups English words into sets of synonyms called "synsets." Synsets are used in several text-mining applications. However, they have also been open to criticism because although, in reality, not all the members of a synset represent the meaning of that synset to the same degree, in practice, they are cons...
Digital social media has played a key role in the tourism and hospitality industry. The use of machine and deep learning has been effective in market segmentation and customers' preference prediction through social big data analysis. This paper develops a new method to analyze a large set of open data in social networking sites for travellers segmentatio...
Relation extraction is the task of extracting semantic relations between entities in a sentence. It is an essential part of some natural language processing tasks such as information extraction, knowledge extraction, question answering, and knowledge base population. The main motivations of this research stem from a lack of a dataset for relation e...
While most of the knowledge bases already support the English language, there is only one knowledge base for the Persian language, known as FarsBase, which is automatically created via semi-structured web information. Unlike English knowledge bases such as Wikidata, which have tremendous community support, the population of a knowledge base like Fa...
Frequent closed itemsets provide a lossless and concise collection of all frequent itemsets to reduce the runtime and memory requirement of frequent itemsets mining tasks. This study presents an algorithm named NEclatClosed for fast mining of frequent closed itemsets. We introduce concepts and techniques based on the vertical database format and em...
Entity Linking is one of the essential tasks of information extraction and natural language understanding. Entity linking mainly consists of two tasks: recognition and disambiguation of named entities. Most studies address these two tasks separately or focus only on one of them. Moreover, most of the state-of-the-art entity linking algorithms are...
Discriminative patterns are sets of characteristics that differentiate multiple groups from each other, for example, successful and unsuccessful medical treatments. The objective of the discriminative pattern mining task is to discover a set of significant patterns that occur with disproportionate frequencies in different class-labeled datasets, ge...
The number of patients should be predicted to meet the demand for physicians in hospitals. In this study, a new multi-objective physician assignment model was designed based on the number of patients estimated from climatic factors. The number of patients was predicted through multiple linear regression (MLR) and fuzzy inference system (FIS). In...
Introduction: Nowadays, computer games play an important role in the cognitive and behavioral health of the community. The purpose of this study is to investigate the short-term effects of Flow Free® on the neurologic characteristics of its players. Materials and Methods: A total of 40 healthy male students aged 20 years and above w...
This paper studies the cluster ensemble selection problem for unsupervised learning. Given a large ensemble of clustering solutions, our goal is to select a subset of solutions to form a smaller yet better performing cluster ensemble than using all available solutions. The common way of aggregating the chosen solutions is accumulating the informati...
Anomaly detection in time-evolving networks has many applications, for instance, traffic analysis in transportation networks and intrusion detection in computer networks. One group of popular methods for anomaly detection in evolving networks is robust online subspace trackers. However, these methods suffer from the problem of insensitivity to drast...
Plain Language Summary: The expanded use of video games, with either positive or negative cognitive effects on the audience, has highlighted the need for research in this area. In particular, the positive cognitive effects of video games include improved cognitive indices and increased mental flexibility of the players. The present study was c...
The most critical concern in machine learning is how to make an algorithm that performs well on both training data and new data. The no free lunch theorem implies that each specific task needs its own tailored machine learning algorithm. A set of strategies and preferences are built into learning machines to tune them for the problem at...