
Alan Demétrius Baria Valejo
- PhD in Computer Science
- Professor (Assistant) at Federal University of São Carlos
41 Publications
8,314 Reads
342 Citations
Introduction
I'm an Assistant Professor at the Department of Computer Science of the Federal University of São Carlos (UFSCar). My research interests lie broadly in machine learning, with a focus on social network analysis and text mining.
Publications (41)
Graph Neural Networks (GNNs) have recently received extensive attention due to their applicability in a wide range of tasks, including drug discovery, text classification, traffic forecasting, hardware design, and recommendation. However, GNNs face significant challenges regarding scalability and the ability to handle large-scale graphs. Several st...
Growing data sizes pose challenges for storage and computational processing time in semi-supervised models, making their practical application difficult. Researchers have explored the use of reduced network versions as a potential solution. Real-world networks contain diverse types of vertices and edges, leading to the use of k-partite network repre...
Text classification is a fundamental task in Text Mining (TM) with applications ranging from spam detection to sentiment analysis. One of the current approaches to this task is Graph Neural Network (GNN), primarily used to deal with complex and unstructured data. However, the scalability of GNNs is a significant challenge when dealing with large-sc...
Exploring label correlations is one of the main challenges in multi-label classification. The literature shows that prediction performances can be improved when classifiers learn these correlations. On the other hand, some works also argue that the multi-label classification methods cannot explore label correlations. The traditional multi-label loc...
Complex machine learning tasks for knowledge discovery in networked data, such as community detection, node categorization, network visualization, and dimension reduction, have been successfully addressed by coarsening algorithms. Coarsening iteratively reduces the original network into a hierarchy of smaller networks, resulting in informative simplificati...
Bipartite networks are pervasive in modeling real-world phenomena and play a fundamental role in graph theory. Interactive exploratory visualization of such networks is an important problem, and particularly challenging when handling large networks. In this paper we present results from an investigation on using a general multilevel method for this...
Stock markets play an essential role in the economy, offering companies opportunities to grow and insightful investors opportunities to make profits. Many tools and techniques have been proposed and applied to analyze the overall market behavior to seize such opportunities. However, understanding the stock exchange’s intrinsic rules and taking opportunities are...
Several coarsening algorithms have been developed as a powerful strategy to deal with difficult machine learning problems represented by large-scale networks, including network visualization, trajectory mining, community detection, and dimension reduction. Coarsening iteratively reduces the original network into a hierarchy of gradually smaller informative...
Coarsening algorithms have been successfully used as a powerful strategy to deal with data-intensive machine learning problems defined in bipartite networks, such as clustering, dimensionality reduction, and visualization. Their main goal is to build informative simplifications of the original network at different levels of detail. Despite its wid...
An inherent characteristic of road accident databases is the imbalance between the number of observations associated with fatal and non-fatal injury accidents and the number associated with accidents without victims. This particularity leads to the need to apply balancing techniques, which allow...
This work analyses the performance of grouping methods based on complex networks and clusters, in order to identify main road accident groups and risk factors. The research included a balancing step of data classes, used in the classification and extraction process of decision rules applied in each grouping. Then, it was possible to assess and visua...
One of the main challenges in urban development faced by large cities is traffic congestion. Despite increasing efforts to maximize vehicle flow in large cities, to estimate traffic congestion more accurately, and to maximize the flow of vehicles in the transport infrastructure without increasing the overhead of information on the...
Graph-based algorithms have attracted considerable interest in recent years by facilitating pattern recognition and learning via information propagation through the graph. Here, we propose an unsupervised learning algorithm based on propagation on a bipartite graph, referred to as the Propagation in Bipartite Graph (PBG) algorithm. The contributio...
Multilevel optimization aims at reducing the cost of executing a target network-based algorithm by exploiting coarsened, i.e., reduced or simplified, versions of the network. There is a growing interest in multilevel algorithms in networked systems, mostly motivated by the urge for solutions capable of handling large-scale networks. Notwithstanding...
Many real-world networks display hidden community structures with important potential implications in their dynamics. Many algorithms highly relevant to network analysis have been introduced to unveil community structures. Accurate assessment and comparison of alternative solutions are typically approached by benchmarking the target algorithm(s) on...
Multilevel methods refer to a general framework for solving optimization problems in large graphs considering a hierarchy of contracted representations of the target graph. A recent extension to bipartite graphs has been introduced and successfully employed in diverse applications, but experience suggests the method is highly susceptible to the cho...
A multilevel method is a scalable strategy to solve optimization problems in large bipartite networks, which operates in three stages. Initially, the input network is iteratively coarsened into a hierarchy of gradually smaller networks. Coarsening implies collapsing vertices into so-called super-vertices which inherit properties of their originat...
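As a rough illustration of the coarsening stage only, the sketch below collapses vertices of one layer of a small bipartite network into super-vertices that inherit summed edge weights. The function name `coarsen_layer`, the shared-neighbour matching heuristic, and the toy edge list are all assumptions made for this example; the truncated abstract does not state which matching rule the paper uses.

```python
from collections import defaultdict
from itertools import combinations


def coarsen_layer(edges, layer):
    """Collapse pairs of vertices in one layer of a bipartite network.

    edges: dict mapping (u, v) -> weight, with u in layer 0 and v in layer 1.
    layer: 0 or 1, the layer whose vertices are merged.
    Returns (coarse_edges, membership); membership maps each original vertex
    of the chosen layer to its super-vertex id.
    """
    # Neighbourhood of every vertex in the chosen layer.
    neighbours = defaultdict(set)
    for u, v in edges:
        if layer == 0:
            neighbours[u].add(v)
        else:
            neighbours[v].add(u)

    # Score candidate pairs by shared neighbours (an assumed heuristic).
    scored = []
    for a, b in combinations(sorted(neighbours), 2):
        shared = len(neighbours[a] & neighbours[b])
        if shared > 0:
            scored.append((shared, a, b))
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))

    # Greedy matching: each vertex joins at most one pair.
    membership = {}
    next_id = 0
    for _, a, b in scored:
        if a not in membership and b not in membership:
            membership[a] = membership[b] = next_id
            next_id += 1
    # Unmatched vertices become singleton super-vertices.
    for a in sorted(neighbours):
        if a not in membership:
            membership[a] = next_id
            next_id += 1

    # Super-vertices inherit the summed edge weights of their originators.
    coarse = defaultdict(float)
    for (u, v), w in edges.items():
        key = (membership[u], v) if layer == 0 else (u, membership[v])
        coarse[key] += w
    return dict(coarse), membership


# Toy network: "a" and "b" share both neighbours, "c" is isolated from them.
edges = {
    ("a", "x"): 1.0, ("a", "y"): 1.0,
    ("b", "x"): 1.0, ("b", "y"): 1.0,
    ("c", "z"): 1.0,
}
coarse, membership = coarsen_layer(edges, layer=0)
```

Merging only within a layer keeps the coarsened network bipartite; repeating the call on the result would yield the hierarchy of gradually smaller networks mentioned above.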
The objective of this study was to discuss the main limitations encountered in the process of classifying the severity of traffic accidents, based on decision tree models (CART). To achieve this objective, CART was used to mine an unbalanced database of road accidents, considering the dependent variable...
The rapid urban expansion of the world's major cities has directly impacted people's lives. During urbanization, shops commonly open to meet the different needs and demands of the growing number of citizens. This fact represents a business issue encouraging potential investments that could be harnessed to improve...
The objective of this study is to discuss the main constraints in classifying the severity of road accidents using Artificial Neural Networks (ANN). To achieve this, ANN modelling with a Multilayer Perceptron (MLP) was used. This method is recommended for treating non-linear problems, whose distributions are not normal, which is the case for roa...
The construction of networks from databases of road accidents is considered a challenging issue in road safety. The accident databases are naturally unbalanced, which requires the use of models that allow the simultaneous analysis of multiple variables without requiring a priori prerequisites. Bayesian Networks (BNs) have presented promisi...
The investigation of road accidents in space and time is essential for the development of research in Road Safety since it allows identifying the degree and variation of accidents on a highway. Based on this motivation, this work analyses four years of data collected in a stretch of 20 km of highway, through the application of homogeneous comp...
The popularization of GPS has generated a massive amount of geographic data organized in trajectories. Trajectories are ordered sequences of geographic points that represent the path of a moving object and provide information on the mobility behavior of these moving objects. To improve the understanding of trajectories, places of greater importa...
In recent years, the fog computing paradigm has become increasingly present in studies and services offered by a smart city, with smart residential systems being one of the services worth highlighting. However, this paradigm brings two major challenges within the context of smart homes: how to efficiently extract the da...
Interest in algorithms for community detection in networked systems has increased over the last decade, mostly motivated by a search for scalable solutions capable of handling large-scale networks. Multilevel approaches provide a potential solution to scalability, as they reduce the cost of a community detection algorithm by applying it to a coarse...
Multilevel approaches aim at reducing the cost of a target algorithm over a given network by applying it to a coarsened (or reduced) version of the original network. They have been successfully employed in a variety of problems, most notably community detection. However, current solutions are not directly applicable to bipartite networks and the li...
This article proposes ResiDI, an intelligent decision-making system for a residential distributed automation infrastructure based on wireless sensors and actuators. ResiDI transmits events using wireless technologies embedded in WSANs to reduce the wire load capacity of traditional systems. In addition, the nodes are equipped with batteries, as a b...
This work proposes STORm, a solution for decision-making in a residential environment that combines fog computing and computational intelligence. In this scenario, STORm is able to collect, treat, disseminate, detect and control information generated from the sensor nodes for the decision-making process. With this in mind, STORm is based on the dev...
Computer systems are a part of everyday life, since they influence human behavior and stimulate changes in the emotional states of the users. The assessment of users’ emotions during their interaction with computer systems can help to provide tailorable website interfaces and better recommendation systems. However, emotions are complex and difficu...
Graph-based Semi-Supervised Learning (SSL) provides a powerful framework for the modeling of manifold structures in high-dimensional spaces. Additionally, graph representation is effective for the propagation of the few initial labels existing in training data. Graph-based SSL requires robust graphs as input for an accurate data mining task, such a...
Link prediction in online social networks is useful in numerous applications, mainly for recommendation. Recently, different approaches have considered friendship group information for increasing the link prediction accuracy. Nevertheless, these approaches do not consider the different roles that common neighbors may play in the different overlap...
Manually annotated data is the basis for a large number of tasks in natural language processing, as either evaluation or training data. The annotation of large amounts of data by dedicated full-time annotators can be an expensive task, which may be beyond the budgets of many research projects. An alternative is crowd-sourcing where annotations are...
Many real-world complex networks have an overlapping community structure, in which a vertex belongs to one or more communities. Numerous approaches for crisp overlapping community detection have been proposed in the literature. Most of them have good accuracy, but their computational costs are considerably high and infeasible for large-scale networks...
The grouping of related verbs is a mature problem in linguistics and natural language processing. A number of resources have grouped together English verbs, for example VerbNet. In comparison, Portuguese has fewer resources, some of which have been based upon English verb studies. The manual grouping of Portuguese verbs would b...
The multilevel graph partitioning strategy aims to reduce the computational cost of the partitioning algorithm by applying it to a coarsened version of the original graph. This strategy is very useful when large-scale networks are analyzed. To improve the multilevel solution, refinement algorithms have been used in the uncoarsening phase. Typical refi...