Anastasiia Filatova’s research while affiliated with ITMO University and other places


Publications (10)


Scheme of events detection pipeline of the ConvTree algorithm.
Pipeline of the event detection system SemConvTree.
Scheme of the post-ranking module.
Filtering step results for posts by time of day: morning (M), afternoon (A), and evening (E), and for posts by month: February (Feb), June (Jun), and October (Oct).
Categories and posts number.


SemConvTree: Semantic Convolutional Quadtrees for Multi-Scale Event Detection in Smart City
  • Article

September 2024 · 32 Reads

Mikhail Andeevich Kovalchuk · Anastasiia Filatova · Aleksei Korneev · [...]

Highlights

What are the main findings?
  • Enhanced Event Detection Accuracy: The introduction of the SemConvTree model, which integrates improved versions of BERTopic, TSB-ARTM, and SBert-Zero-Shot, enables a significant enhancement in the detection accuracy of urban events. The model’s ability to incorporate semantic analysis along with statistical evaluations allows for discerning and categorizing events from social media data more precisely. This results in approximately a 40% increase in the F1-score for event detection compared to previous methods.
  • Semantic Analysis for Event Identification: The SemConvTree model leverages semi-supervised learning techniques to analyze the semantic content of social media posts. This approach helps in understanding the nuanced contexts of urban events, improving the identification process. The model not only recognizes the occurrence of events but also categorizes them into meaningful groups based on their semantic characteristics, which is crucial for effective urban management and planning.

What are the implications of the main findings?
  • The increased accuracy in event detection ensures that urban planners and emergency services can respond more effectively to both planned and unplanned urban events. More accurate data leads to better resource allocation, ensuring that services are deployed where they are most needed. This could lead to enhanced safety, improved traffic management, and better crowd control during events, ultimately enhancing urban living conditions.
  • By effectively categorizing urban events based on their semantic characteristics, city administrators can gain insights into the types of events that are prevalent in different areas of the city. This can inform more targeted community engagement strategies, help in the planning of public services and facilities, and ensure that urban policies are closely aligned with the actual dynamics of the city. Additionally, this can aid in long-term urban development strategies by identifying evolving trends and shifts in urban activity patterns.

Abstract

The digital world is increasingly permeating our reality, creating a significant reflection of the processes and activities occurring in smart cities. Such activities include well-known urban events, celebrations, and those with a very local character. These widespread events have a significant influence on shaping the spirit and atmosphere of urban environments. This work presents SemConvTree, an enhanced semantic version of the ConvTree algorithm. It incorporates the semantic component of data through semi-supervised learning of a topic modeling ensemble, which consists of improved models: BERTopic, TSB-ARTM, and SBert-Zero-Shot. We also present an improved event search algorithm based on both statistical evaluations and semantic analysis of posts. This algorithm allows for fine-tuning the mechanism of discovering the required entities with the specified particularity (such as a particular topic). Experimental studies were conducted within the area of New York City. They showed an improvement in the detection of posts devoted to events (about 40% higher F1-score) due to the accurate handling of events of different scales. These results suggest the long-term potential for creating a semantic platform for the analysis and monitoring of urban events in the future.


SemConvTree: Semantic Convolutional Quadtrees for Multi-scale Event Detection in Smart City

July 2024 · 26 Reads

The digital world is increasingly invading our reality, creating a significant reflection of the processes and activities taking place in the smart city. Such activities include well-known urban events, celebrations, and those with a very local character. Because such events are so widespread, they have a significant influence on the spirit and atmosphere of the city. This work presents SemConvTree, an enhanced semantic version of the ConvTree algorithm. It takes into account the semantic component of the data through semi-supervised learning of a topic modeling ensemble consisting of improved models (BERTopic, TSB-ARTM, and SBert-Zero-Shot). We also present an improved event search algorithm based on both statistical evaluations and semantic analysis of posts. This algorithm allows fine-tuning the mechanism of discovering the required entities with the specified particularity (such as a particular topic). Experimental studies were conducted within the area of New York City. They showed an improvement in the detection of posts devoted to events (about 40% higher F1-score) due to the accurate handling of events of different scales. These results point to the long-term potential for creating a semantic platform for the analysis and monitoring of urban events.






MuCAAT: Multilingual Contextualized Authorship Anonymization of Texts from social networks

November 2022 · 24 Reads · 1 Citation

Procedia Computer Science

Social networks are a source of data that can be useful to researchers in a variety of fields. However, an important limitation to the applicability of this kind of data is the presence of private and sensitive information in it. Therefore, it is important to anonymize the data before using it in research, so that the research is ethical and does not leak private information. Most anonymization methods involve removing named entities and identifiers from text. However, in most cases, this level of anonymization is not enough, since the authorship of texts can be determined by the style of writing. We propose a new text generation method that takes into account the features of social networks and allows effective anonymization of text style. We evaluate the effectiveness of the proposed model on social media texts, including the Sentiment140 dataset. The results show that our model outperforms the state-of-the-art model for authorship obfuscation and semantics preservation.


SemAGR: semantic method for accurate geolocations reconstruction within extensive urban sites

November 2022 · 13 Reads · 4 Citations

Procedia Computer Science

Accurate geolocation of users’ posts in social networks plays a vital role in a wide range of research devoted to analysing the urban environment based on social media data. It allows us to effectively correlate information about a real urban object with its description in social networks. Analysing information about extensive urban sites is challenging in such studies because publications correlate extremely unevenly with the geotags assigned to such objects. In addition, users of social networks often intentionally indicate incorrect geolocations to increase their publications’ popularity. This article proposes a solution to the problem of accurate geolocation reconstruction for extensive urban sites. We propose a combined method that first enriches the list of initial geolocations of the social network with the help of external sources. A transformer model is then applied to recognise named entities that help to detect mentions of events or locations in user posts. After detection, posts are reassigned to new locations from the extended list according to the semantic similarity calculated between publications and locations.
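The last step of this abstract, matching a post to the most semantically similar location, can be illustrated with a minimal sketch. It is not the SemAGR implementation: the abstract does not specify the similarity model, so bag-of-words cosine similarity is swapped in here in place of learned embeddings, and the function names, location names, and texts are all hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (an assumed stand-in for learned embeddings)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reassign(post, locations):
    """Return the name of the location whose description is most similar to the post."""
    post_vec = bow(post)
    return max(locations, key=lambda name: cosine(post_vec, bow(locations[name])))


# Hypothetical extended list of locations inside one large urban site.
locations = {
    "park_fountain": "fountain in the central park alley",
    "park_stage": "open air stage concert in the park",
}
best = reassign("great concert on the open air stage tonight", locations)
print(best)
```

In this toy case the post shares far more terms with the stage description than with the fountain description, so it is assigned to the stage. A real system would replace `bow` with sentence embeddings and restrict candidates to locations near the original geotag.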


Towards comparable event detection approaches development in social media

November 2022 · 15 Reads · 3 Citations

Procedia Computer Science

The rapid growth in the popularity of social media allows researchers in various fields to extract useful knowledge for diverse applications. One of the popular areas of such research is event detection based on social media data analysis. In this area, one of the major challenges is to evaluate the quality of the results and to provide a fair comparison of an event detection approach with existing solutions. In this work, we evaluate recently published articles on event detection from the comparability perspective and analyze the ways in which the authors evaluate their algorithms. Next, we review publicly available event detection evaluation datasets and highlight the limitations of their use. We conclude that there is a strong need for a public universal dataset that would allow authors to compare algorithms with each other. Lastly, we describe the characteristics of such a dataset and suggest potential ways to create it.


SeSAM: semi-automated semantic analysis method of urban areas’ events with extreme levels of popularity based on public open data

November 2021 · 21 Reads · 1 Citation

Procedia Computer Science

This study describes a semi-automated pipeline for the comprehensive analysis of urban areas with extremely low and extremely high popularity levels. The pipeline performs a geo-frequency analysis of Russian-language Instagram publications for the St. Petersburg area and selects the areas with extreme popularity values according to the number of publications in them. The semantic analysis of urban areas with an extremely low number of publications compares algorithms for extracting and classifying descriptions of these areas and reports the results obtained with TF-IDF vectorization and extraction of the most valuable words. The semantic analysis of areas with an extremely high number of publications covers the structure of such areas, a comparison of algorithms for extracting advertisement publications, the results of advertisement extraction using the BigARTM model, and the development and implementation of an algorithm for extracting events related to the points of attraction in extremely popular urban areas. This algorithm is based on the strong time binding hypothesis and on the idea of similarity queries, combining LDA models that reveal semantic structure with an algorithm based on frequency analysis. The developed algorithm was tested on the urban area of St. Petersburg where the Ice Palace is located; it produced interpretable results and correctly extracted 89 of the 102 events that occurred in this area in 2019. Finally, the described algorithms were combined into the SeSAM pipeline for comprehensive urban analysis.
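The abstract mentions TF-IDF vectorization with extraction of the most valuable words. The sketch below shows one common way to do this; it is only an illustration, not the SeSAM code. The exact weighting variant is an assumption (relative term frequency times the natural log of inverse document frequency), and the function name and the English sample posts are hypothetical.

```python
import math
from collections import Counter


def top_tfidf_words(docs, k=3):
    """Return the k highest-scoring TF-IDF words for each document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    result = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result.append(ranked[:k])
    return result


# Hypothetical posts standing in for publications from different urban areas.
area_posts = [
    "hockey match at the ice palace",
    "concert at the ice palace tonight",
    "quiet walk along the river",
]
top_words = top_tfidf_words(area_posts, k=2)
print(top_words)
```

Words that occur in every post, such as "the", get a score of zero and never rank as valuable, while words unique to one post rise to the top. This is the property that makes TF-IDF useful for describing areas with few publications.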

Citations (4)


... Voß and Witt [50] utilize the multi-mode RCPSP with an objective function that integrates makespan, weighted tardiness, and setup costs, facilitating activity batching. Słowinski [45] is credited as the first to outline the framework of the multi-objective resource-constrained project scheduling problem (MORCPSP) and list various objectives. ...

Reference:

A Simulated Annealing Algorithm for the Preemptive Multi-Objective Multi-Mode Resource-Constrained Project Scheduling Problem
Hybrid Algorithm for Multi-Contractor, Multi-Resource Project Scheduling in the Industrial Field
  • Citing Article
  • January 2023

Procedia Computer Science

... Finally, minimizing the carbon footprint and energy consumption in training AA models is vital for aligning AI development with sustainability goals (Bannour et al., 2021). Developers can take proactive steps by prioritizing efficient algorithms and training strategies that demand fewer computational resources, optimizing model architectures via knowledge distillation (Beyer et al., 2022; Panov et al., 2022), and exploring data-efficient techniques like transfer learning (Gupta et al., 2020; Barlas & Stamatatos, 2021; Hessenthaler et al., 2022; Silva et al., 2023; Saxena et al., 2023b). Monitoring and quantifying carbon emissions during AA model training with carbon tracking tools (Lacoste et al., 2019; Anthony et al., 2020) can provide insights for optimization and offsetting strategies. ...

MuCAAT: Multilingual Contextualized Authorship Anonymization of Texts from social networks
  • Citing Article
  • November 2022

Procedia Computer Science

... One of the popular areas of research in social media is event detection, where one of the main challenges is evaluating the quality of the results and providing a fair comparison of event detection with existing methods [6]. According to [7], in the context of topic detection and tracking, "An event is something that happens at some specific time and place along with all necessary preconditions with unavoidable consequences", e.g., specific elections, accidents, crimes, or natural disasters. ...

Towards comparable event detection approaches development in social media
  • Citing Article
  • November 2022

Procedia Computer Science

... Several groups have leveraged the County Tweet Lexical Bank and datasets assembled by similar methods for opioid-related [14,15,52,76,77] and non-opioid-related [78,79] work. Others have used named entity recognition to extract unambiguous place names from social media text [80][81][82][83]. ...

SemAGR: semantic method for accurate geolocations reconstruction within extensive urban sites
  • Citing Article
  • November 2022

Procedia Computer Science