João Magalhães

Universidade NOVA de Lisboa | NOVA · Department of Informatics (DI)

Professor

About

Publications: 149
Reads: 28,770
Citations: 994


Publications (149)
Preprint
Conversational systems must be robust to user interactions that naturally exhibit diverse conversational traits. Capturing and simulating these diverse traits coherently and efficiently presents a complex challenge. This paper introduces Multi-Trait Adaptive Decoding (mTAD), a method that generates diverse user profiles at decoding-time by sampling...
Preprint
Full-text available
Guiding users through complex procedural plans is an inherently multimodal task in which having visually illustrated plan steps is crucial to deliver an effective plan guidance. However, existing works on plan-following language models (LMs) often are not capable of multimodal input and output. In this work, we present MM-PlanLLM, the first multimo...
Conference Paper
Full-text available
Training Large Language Models (LLMs) to follow user instructions has been shown to supply the LLM with ample capacity to converse fluently while being aligned with humans. Yet, it is not completely clear how an LLM can lead a plan-grounded conversation in mixed-initiative settings where instructions flow in both directions of the conversation, i.e...
Chapter
Dialogue systems need to deal with the unpredictability of user intents to track dialogue state and the heterogeneity of slots to understand user preferences. In this paper we investigate the hypothesis that solving these challenges as one unified model will allow the transfer of parameter support data across the different tasks. The proposed princ...
Conference Paper
Full-text available
With the growing interest in recreating live and realistic outside experiences within the confines of our homes, the online shopping industry has also been impacted. However, traditional modes of interaction with online storefronts have remained mainly unchanged. This paper studies the factors influencing user experience and interaction in 3D virtu...
Preprint
Full-text available
Dialogue systems need to deal with the unpredictability of user intents to track dialogue state and the heterogeneity of slots to understand user preferences. In this paper we investigate the hypothesis that solving these challenges as one unified model will allow the transfer of parameter support data across the different tasks. The proposed princ...
Conference Paper
Full-text available
For task-oriented dialog agents, the tone of voice mediates user-agent interactions, playing a central role in the flow of a conversation. Distinct from domain-agnostic politeness constructs, in specific domains such as online stores, booking platforms, and others, agents need to be capable of adopting highly specific vocabulary, with significant i...
Technical Report
Full-text available
This paper describes the vision, scientific contributions, and technical details of the Task Wizard (TWIZ) team's participation in the Alexa TaskBot Challenge 2021. Our bot design envisions the support of an engaging experience, where users are guided through multimodal conversations, towards the successful completion of the selected task. This is...
Chapter
Face authentication and biometrics are becoming a commodity in many situations of our society. As their application becomes widespread, vulnerability to attacks becomes a challenge that needs to be tackled. In this paper, we propose a non-intrusive, on-the-fly liveness detection system, based on 1D convolutional neural networks, that given pulse signa...
Article
Full-text available
On the quest of providing a more natural interaction between users and search systems, open-domain conversational search assistants have emerged, by assisting users in answering questions about open topics in a conversational manner. In this work, we show how the Transformer architecture achieves state-of-the-art results in key IR tasks, leveraging...
Preprint
Creating a cohesive, high-quality, relevant, media story is a challenge that news media editors face on a daily basis. This challenge is aggravated by the flood of highly relevant information that is constantly pouring onto the newsroom. To assist news media editors in this daunting task, this paper proposes a framework to organize news content int...
Conference Paper
Queuing at airport border controls is one of the bottlenecks in the flow of passengers, which results in a poor travel experience and in serious health risks, such as COVID-19, due to the concentration of people and contact surfaces [4]. To address this problem, biometrics-on-the-move removes physical barriers for passengers, while preserving security...
Preprint
Full-text available
The conversational search paradigm introduces a step change over the traditional search paradigm by allowing users to interact with search agents in a multi-turn and natural fashion. The conversation flows naturally and is usually centered around a target field of knowledge. In this work, we propose a knowledge-driven answer generation approach for...
Preprint
Full-text available
The use of conversational assistants to search for information is becoming increasingly popular among the general public, pushing research towards more advanced and sophisticated techniques. In the last few years, in particular, the interest in conversational search has been increasing, not only because of the generalization of conversational as...
Chapter
Open-domain conversational search assistants aim at answering user questions about open topics in a conversational manner. In this paper we show how the Transformer architecture [30] achieves state-of-the-art results in key IR tasks, leveraging the creation of conversational assistants that engage in open-domain conversational search with single, y...
Preprint
Full-text available
Open-domain conversational search assistants aim at answering user questions about open topics in a conversational manner. In this paper we show how the Transformer architecture achieves state-of-the-art results in key IR tasks, leveraging the creation of conversational assistants that engage in open-domain conversational search with single, yet in...
Article
Full-text available
In order to develop computer tools for speech therapy that reliably classify speech productions, there is a need for speech production corpora that characterize the target population in terms of age, gender, and native language. Apart from including correct speech productions, in order to characterize the target population, the corpora should also...
Article
Full-text available
Many children with speech sound disorders cannot pronounce the sibilant consonants correctly. We have developed a serious game, which is controlled by the children's voices in real time, with the purpose of helping children practice the production of European Portuguese (EP) sibilant consonants. For this, the game uses a sibilant consonant cla...
Chapter
Full-text available
The development of reliable speech therapy computer tools that automatically classify speech productions depends on the quality of the speech data set used to train the classification algorithms. The data set should characterize the population in terms of age, gender and native language, but it should also have other important properties that chara...
Book
This two-volume set LNCS 12035 and 12036 constitutes the refereed proceedings of the 42nd European Conference on IR Research, ECIR 2020, held in Lisbon, Portugal, in April 2020. The 55 full papers presented together with 8 reproducibility papers, 46 short papers, 10 demonstration papers, 12 invited CLEF papers, 7 doctoral consortium papers, 4 works...
Book
This two-volume set LNCS 12035 and 12036 constitutes the refereed proceedings of the 42nd European Conference on IR Research, ECIR 2020, held in Lisbon, Portugal, in April 2020. The 55 full papers presented together with 8 reproducibility papers, 46 short papers, 10 demonstration papers, 12 invited CLEF papers, 7 doctoral consortium papers, 4 works...
Conference Paper
Cross-modal embeddings, between textual and visual modalities, aim to organise multimodal instances by their semantic correlations. State-of-the-art approaches use maximum-margin methods, based on the hinge-loss, to enforce a constant margin m, to separate projections of multimodal instances from different categories. In this paper, we propose a no...
Conference Paper
Understanding the semantic shifts of multimodal information is only possible with models that capture cross-modal interactions over time. Under this paradigm, a new embedding is needed that structures visual-textual interactions according to the temporal dimension, thus, preserving data's original temporal organisation. This paper introduces a nove...
Preprint
Full-text available
Understanding the semantic shifts of multimodal information is only possible with models that capture cross-modal interactions over time. Under this paradigm, a new embedding is needed that structures visual-textual interactions according to the temporal dimension, thus, preserving data's original temporal organisation. This paper introduces a nove...
Preprint
Cross-modal embeddings, between textual and visual modalities, aim to organise multimodal instances by their semantic correlations. State-of-the-art approaches use maximum-margin methods, based on the hinge-loss, to enforce a constant margin m, to separate projections of multimodal instances from different categories. In this paper, we propose a no...
Chapter
Many children suffering from speech sound disorders cannot pronounce the sibilant consonants correctly. We have developed a serious game that is controlled by the children’s voices in real time and that allows children to practice the European Portuguese sibilant consonants. For this, the game uses a sibilant consonant classifier. Since the game do...
Preprint
Full-text available
Media editors in the newsroom are constantly pressed to provide a "like-being there" coverage of live events. Social media provides a disorganised collection of images and videos that media professionals need to grasp before publishing their latest news updates. Automated news visual storyline editing with social media content can be very challengi...
Conference Paper
Full-text available
Media editors in the newsroom are constantly pressed to provide a "like-being there" coverage of live events. Social media provides a disorganised collection of images and videos that media professionals need to grasp before publishing their latest news updates. Automated news visual storyline editing with social media content can be very challengin...
Conference Paper
Distributing multimedia indexes to multiple nodes enables search over very large datasets (i.e., over one billion images and videos), but comes with a set of challenges: how to distribute documents and queries effectively across nodes to support concurrent querying? and how to deal with the increased potential for lack of response from nodes...
Chapter
Traditional keyword extraction methods assume that corpora are static. However, in social media, information is highly dynamic, with individual words showing dynamic behaviour. In this paper we propose an unsupervised approach that jointly models words’ temporal behaviour and keywords’ semantic affinity, to address the task of dynamic...
Conference Paper
Full-text available
The abundance and ever-growing expansion of user-generated content defines a paradigm in multimedia consumption. While user immersion through audio has gained relevance in recent years due to the growing interest in virtual and augmented reality immersion technologies, the existing user-generated content visualization techniques are still not ma...
Preprint
Full-text available
Newsworthy events are broadcast through multiple mediums and prompt the crowds to produce comments on social media. In this paper, we propose to leverage these behavioral dynamics to estimate the most relevant time periods for an event (i.e., query). Recent advances have shown how to improve the estimation of the temporal relevance of such topics...
Conference Paper
Full-text available
The distortion of sibilant sounds is a common type of speech sound disorder in European Portuguese speaking children. Speech and language pathologists (SLP) use different types of speech production tasks to assess these distortions. One of these tasks consists of the sustained production of isolated sibilants. Using these sound productions, SLPs us...
Conference Paper
Combining multiple retrieval functions can lead to notable gains in retrieval performance. Learning to Rank (LETOR) techniques achieve outstanding retrieval results, by learning models with no bounds on model complexity. Often, minor retrieval gains are attained at a significant cost in model complexity. This paper focuses on the research question:...
Preprint
In this paper we address the task of gender classification on picture-sharing social media networks such as Instagram and Flickr. We aim to infer the gender of a user given only a small set of the images shared on their profile. We assume that a user's images contain a collection of visual elements that implicitly encode discriminative pa...
Preprint
Full-text available
Multimedia information has strong temporal correlations that shape the way modalities co-occur over time. In this paper we study the dynamic nature of multimedia and social-media information, where the temporal dimension emerges as a strong source of evidence for learning the temporal correlations across visual and textual modalities. So far, cros...
Preprint
News editors need to find the photos that best illustrate a news piece and fulfill news-media quality standards, while being pressed to also find the most recent photos of live events. Recently, it became common to use social-media content in the context of news media for its unique value in terms of immediacy and quality. Consequently, the amount...
Preprint
In microblog retrieval, query expansion can be essential to obtain good search results due to the short size of queries and posts. Since information in microblogs is highly dynamic, an up-to-date index coupled with pseudo-relevance feedback (PRF) with an external corpus has a higher chance of retrieving more relevant documents and improving ranking...
Conference Paper
In microblog retrieval, query expansion can be essential to obtain good search results due to the short size of queries and posts. Since information in microblogs is highly dynamic, an up-to-date index coupled with pseudo-relevance feedback (PRF) with an external corpus has a higher chance of retrieving more relevant documents and improving ranking...
Conference Paper
News editors need to find the photos that best illustrate a news piece and fulfill news-media quality standards, while being pressed to also find the most recent photos of live events. Recently, it became common to use social-media content in the context of news media for its unique value in terms of immediacy and quality. Consequently, the amount...
Conference Paper
The rise of large data streams introduces new challenges regarding the delivery of relevant content towards an information need. This need can be seen as a broad topic of information. By identifying sub-streams within a broader data stream, we can retrieve relevant content that matches the multiple facets of the topic; thus summarizing information,...
Article
Full-text available
Effective partitioning of multimedia indexes is key for efficient kNN search, but existing algorithms are based on document similarity, without partition size or redundancy constraints. Our goal is to create an index partitioning algorithm that addresses the specific properties of a distributed system: load balancing across nodes, redundancy in node f...
Article
Full-text available
In this paper we propose a large-scale high-dimensional indexing algorithm based on sparse approximation and inverted indexing. Our goal was to devise a method that smoothly scales to handle databases with over 100 million descriptors on a single machine. To meet this goal, we implemented an inverted index based on a sparsifying dictionary with l...
Conference Paper
Full-text available
In recommender systems, the cold-start problem is a common challenge. When a new item has no ratings, it becomes difficult to relate it to other items or users. In this paper, we address the cold-start problem and propose to leverage social-media trends and reputations to improve the recommendation of new items. The proposed framework models the...
Conference Paper
Using solely the information retrieved by audio fingerprinting techniques, we propose methods to treat a possibly large dataset of user-generated audio content, that (1) enable the grouping of several audio files that contain a common audio excerpt (i.e., are relative to the same event), and (2) give information about how those files are correlated...
Conference Paper
The increase of the quantity of user-generated content experienced in social media has boosted the importance of analysing and organising the content by its quality. Here, we propose a method that uses audio fingerprinting to organise and infer the quality of user-generated audio content. The proposed method detects the overlapping segments between...
Preprint
Using solely the information retrieved by audio fingerprinting techniques, we propose methods to treat a possibly large dataset of user-generated audio content, that (1) enable the grouping of several audio files that contain a common audio excerpt (i.e., are relative to the same event), and (2) give information about how those files are correlated...
Preprint
The increase of the quantity of user-generated content experienced in social media has boosted the importance of analysing and organising the content by its quality. Here, we propose a method that uses audio fingerprinting to organise and infer the quality of user-generated audio content. The proposed method detects the overlapping segments between...
Conference Paper
In this paper, we propose a collaborative system to let users share their own videos and interact among themselves to collaboratively do a video coverage of live events. Our intention is to motivate users to make positive contributions to the comprehensiveness of available videos about that event. To achieve this we propose a collaborative video fr...
Conference Paper
This paper addresses the problem of balanced, redundant indexing of media information. Our goal is to partition and distribute the search index, taking advantage of the distributed systems properties: balanced load across nodes, redundancy on node down and efficient node usage under concurrent querying. We follow an information compression approach...
Conference Paper
Full-text available
3D video is introducing great changes in many health related areas. The realism of such information provides health professionals with strong evidence analysis tools to facilitate clinical decision processes. Speech and language therapy aims to help subjects in correcting several disorders. The assessment of the patient by the speech and language t...
Conference Paper
Full-text available
Speech is the main form of human communication. Thus, it is important to detect and treat speech sound disorders as early as possible during childhood. When children need to attend speech therapy it is critical to keep them motivated to do the therapy exercises. Software systems for speech therapy can be a useful tool to keep the child interested...
Article
Full-text available
Affective interaction in computer games is a novel area with several new challenges, such as detecting players' facial expressions robustly. Many of the existing facial expression datasets are composed of a set of posed face images not captured in a realistic affective-interaction setting. The contribution of this paper is an affective-interaction d...
Conference Paper
Full-text available
Traditional speech therapy approaches for speech sound disorders have a lot of advantages to gain from computer-based therapy systems. With speech recognition techniques the motivation elements of these systems can be automated in order to get an interactive environment that motivates the therapy attendee towards better performances. Here we propos...
Conference Paper
In this demo we show how we can enhance real-time microblog search by monitoring news sources on Twitter. We improve retrieval through query expansion using pseudo-relevance feedback. However, instead of doing feedback on the original corpus we use a separate Twitter news index. This allows the system to find additional terms associated with the or...
Preprint
In Twitter, and other microblogging services, the generation of new content by the crowd is often biased towards immediacy: what is happening now. Prompted by the propagation of commentary and information through multiple mediums, users on the Web interact with and produce new posts about newsworthy topics and give rise to trending topics. This pap...
Conference Paper
Full-text available
In Twitter, and other microblogging services, the generation of new content by the crowd is often biased towards immediacy: what is happening now. Prompted by the propagation of commentary and information through multiple mediums, users on the Web interact with and produce new posts about newsworthy topics and give rise to trending topics. This pap...