About
Publications: 25
Reads: 15,384
Citations: 443
Introduction
Ángel Panizo LLedot is a teaching assistant in the School of Computer Systems Engineering of Universidad Politécnica de Madrid (UPM). He has a B.Sc. in Computer Science from Universidad Complutense de Madrid, an M.Sc. in Artificial Intelligence from Universidad Politécnica de Madrid, and a PhD in Computer Science from Universidad Autónoma de Madrid. He is currently involved with the AIDA research group at ETSISI-UPM.
Additional affiliations
March 2020 - July 2020
Education
September 2015 - July 2016
October 2008 - September 2013
Publications (25)
The Educational Innovation Workshops (Jornadas de Innovación Educativa) organized at the Escuela Técnica Superior de Ingeniería de Sistemas Informáticos (ETSISI) of the Universidad Politécnica de Madrid (UPM) have been a key space for reflection and the exchange of experiences among the teaching staff of our school. Through these workshops, we not only share initiatives...
The Manosphere movement and its subgroups have garnered the attention of researchers seeking deeper insights into their dynamics. This study specifically focuses on a particular subgroup within the Manosphere called Pick Up Artists (PUAs). PUAs concentrate on teaching heterosexual men sexual seduction techniques to attract women, often promoting a...
The Internet and social media have revolutionised the way news is distributed and consumed. However, the constant flow of massive amounts of content has made it difficult to discern between truth and falsehood, especially on online platforms plagued with malicious actors who create and spread harmful stories. Debunking disinformation is costly, which h...
Multi-Objective Genetic Algorithms (MOGAs) have been successfully used to address dynamic problems in a wide variety of domains. In these domains, data changes over time, so a non-static analysis is required to obtain feasible solutions. In this type of environment, MOGAs are often time-consuming and require special adaptation to work properly. A...
Extremist ideologies are proliferating nowadays at both the political and social levels. Considering that youngsters are in a development stage where they are still forming their own social identity, they become especially vulnerable to these ideologies’ influence. Therefore, it becomes critical to provide them with the psychological skills to ratio...
Nowadays, Twitter is used by several political extremist groups to establish close communities in which opinions are amplified following an echo-chamber effect. However, little of the literature analyses the effect of using an extremist discourse in relation to the relevance of these users in their online network. With the aim of analyzing this effe...
Interest in network analysis has not stopped increasing over the last decade. The Community Detection Problem (CDP) has been a hot topic in network analysis, so many different approaches have been proposed. Among them, optimization methods have proven to be highly effective for this task. Traditionally, the CDP has been tackled as a single-objectiv...
Social network based applications have experienced exponential growth in recent years. One of the reasons for this rise is that this application domain offers a particularly fertile place to test and develop the most advanced computational techniques to extract valuable information from the Web. The main contribution of this work is three-fold: (1)...
Radicalization, as a violent form of extremism, is a growing problem for Europe. Currently, it is possible to find extreme ideologies regarding almost every topic, such as religion, politics or sports. This problem, which ranges from personal identity conflicts to complex societal issues, has an impact on several people every day, especially on young...
The alt-right is a far-right movement that has uniquely developed on social media, before becoming prominent in the 2016 United States presidential elections. However, very little research exists about their discourse and organization online. This study aimed to analyze how a sample of alt-right supporters organized themselves in the week before an...
Interest in Community Detection Problems on networks that evolve over time has increased over the last years. Multi-Objective Genetic Algorithms and other bio-inspired methods have been successfully applied to tackle the community finding problem in static networks. Although there are a large number of evolutionary a...
In the past decades the field of Artificial Intelligence, and especially the Machine Learning (ML) research area, has undergone a great expansion. This has been enabled by the greater availability of data, a trend to which the field of medicine has been no stranger. This data can be used to train supervised Machine Learning algorithms. Taking into account...
Finding communities of interrelated nodes is a learning task that often arises in problems that can be modeled as a graph. In any case, detecting an optimal partition in a graph is highly time-consuming and complex. For this reason, the implementation of search-based metaheuristics arises as an alternative for addressing these problems. This manuscr...
Twitter is one of the most commonly used Online Social Networks in the world and it has consequently attracted considerable attention from different political groups attempting to gain influence. Among these groups is the alt-right, a modern far-right extremist movement that gained notoriety in the 2016 US presidential election and the infamous Cha...
The sensor network design problem (SNDP) consists of the selection of the type, number and location of the sensors to measure a set of variables, optimizing a specified criterion, and simultaneously satisfying the information requirements. This problem is multimodal and involves several binary variables, therefore it is a complex combinatorial optim...
RiskTrack is a project supported by the European Union, with the aim of helping security forces, intelligence services and prosecutors to assess the risk of Jihadi radicalization of an individual (or a group of people). To determine the risk of radicalization of an individual, it uses information extracted from their Twitter account. Specifically, th...
Interest in community detection problems on networks that evolve over time has increased over the last years. Genetic Algorithms, and other bio-inspired methods, have been successfully applied to tackle the community finding problem in static networks. However, little research has been done related to the improve...
The analysis and detection of communities in complex networks is currently a booming area of study, since many systems can be represented as networks of interconnected nodes. Traditionally, effort has focused on studying methods for analyzing static networks, that is, networks that do not change over time. In the real world, ...
Due to the temporal nature of real-world networks, interest in community detection problems on dynamic networks has increased over the last years. Genetic Algorithms, and other bio-inspired methods, have been successfully applied to tackle the community finding problem in static networks. However, few research works h...
Questions (2)
I have a dataset composed of several subjects. Each subject has a series of binary indicators, where 1 means that the subject presents the indicator and 0 means that the indicator is not present. These indicators are grouped into 5 categories, each composed of a different number of indicators. The more indicators a subject presents in a category, the stronger that category is in that subject.
I want to aggregate all the binary indicators in each category into a single real value for each subject. The method that came to mind is to divide the number of indicators with a value of 1 by the total number of indicators in the category. That way, a subject with all the indicators equal to 1 in one category gets the maximum value of 1.0 in that category; likewise, a subject with half the indicators equal to 1 and the other half equal to 0 gets an aggregate value of 0.5 in that category. I am new to data science and I am not sure if this is the best approach. What do you think? Does this aggregation make sense to you? Do you know of any other possible aggregations?
Below I attach a sample toy dataset with 3 subjects, 2 categories and 2 indicators per category to further explain my problem:
| | Indicator1.1 | Indicator1.2 | Indicator2.1 | Indicator2.2 |
|----------|--------------|--------------|--------------|--------------|
| Subject1 | 1 | 0 | 1 | 1 |
| Subject2 | 1 | 1 | 0 | 0 |
| Subject3 | 0 | 1 | 1 | 1 |
In the example above, Indicator1.1 and Indicator1.2 belong to the same category; likewise, Indicator2.1 and Indicator2.2 belong to the same category. With the ratio-based aggregation described above, the real values of the categories will be:
| | Category 1 | Category 2 |
|----------|------------|------------|
| Subject1 | 0.5 | 1.0 |
| Subject2 | 1.0 | 0.0 |
| Subject3 | 0.5 | 1.0 |
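For concreteness, here is a minimal sketch of the ratio aggregation described above, applied to the toy dataset. The `data` and `categories` dictionaries and the `aggregate` function are illustrative names I made up for this example; the real dataset would have 5 categories of different sizes.

```python
# Toy data from the question: one mapping of binary indicators per subject.
data = {
    "Subject1": {"Indicator1.1": 1, "Indicator1.2": 0, "Indicator2.1": 1, "Indicator2.2": 1},
    "Subject2": {"Indicator1.1": 1, "Indicator1.2": 1, "Indicator2.1": 0, "Indicator2.2": 0},
    "Subject3": {"Indicator1.1": 0, "Indicator1.2": 1, "Indicator2.1": 1, "Indicator2.2": 1},
}

# Assumed mapping from each category to its indicators (names are illustrative).
categories = {
    "Category 1": ["Indicator1.1", "Indicator1.2"],
    "Category 2": ["Indicator2.1", "Indicator2.2"],
}


def aggregate(indicators, categories):
    """Return, per category, the fraction of its indicators that equal 1."""
    return {
        name: sum(indicators[col] for col in cols) / len(cols)
        for name, cols in categories.items()
    }


# One aggregated row per subject.
scores = {subject: aggregate(row, categories) for subject, row in data.items()}

for subject, row in scores.items():
    print(subject, row)
```

Because each category is divided by its own number of indicators, categories of different sizes end up on the same 0 to 1 scale.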
I have previously used measures such as the Jaccard Index or the Normalized Mutual Information to check the performance of a clustering algorithm on benchmarks that have ground truth.
Now I am working with time-dependent environments split into several time slices. For every time slice I have a ground truth to evaluate the performance of my clustering algorithm. I am looking for measures like the ones mentioned above that compress the performance over all the time slices into one number.
Thanks.
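To make the setting concrete, here is a minimal sketch of one naive baseline: compute a pair-counting Jaccard index for each time slice and take the unweighted mean. The choice of an unweighted mean is only an assumption for illustration, not an established answer to the question, and the function names and toy labels are made up.

```python
from itertools import combinations


def jaccard_index(truth, predicted):
    """Pair-counting Jaccard index between two flat clusterings.

    Both arguments are lists of cluster labels, one per node, in the same order.
    """
    both = only_one = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = predicted[i] == predicted[j]
        if same_truth and same_pred:
            both += 1          # pair grouped together in both clusterings
        elif same_truth or same_pred:
            only_one += 1      # pair grouped together in exactly one of them
    total = both + only_one
    # Convention assumed here: no co-clustered pairs at all counts as agreement.
    return both / total if total else 1.0


def mean_over_slices(truths, predictions, score=jaccard_index):
    """Score every time slice and return (unweighted mean, per-slice scores)."""
    per_slice = [score(t, p) for t, p in zip(truths, predictions)]
    return sum(per_slice) / len(per_slice), per_slice


# Two hypothetical time slices with four nodes each.
truths = [[0, 0, 1, 1], [0, 0, 0, 1]]
predictions = [[0, 0, 1, 1], [0, 0, 1, 1]]

mean_score, per_slice = mean_over_slices(truths, predictions)
print(per_slice, mean_score)
```

A plain mean hides how the quality varies across slices, so keeping the per-slice scores alongside the single number may still be useful.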