
About
117
Publications
25,004
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
1,008
Citations
Citations since 2017
Introduction
Additional affiliations
October 2018 - September 2019
July 2017 - September 2017
Microsoft Research
Position
- Research Intern
May 2017 - July 2017
Education
January 2014 - September 2018
September 2010 - September 2012
September 2007 - July 2010
Publications
Publications (117)
Prompt tuning (PT), where a small amount of trainable soft (continuous) prompt vectors is affixed to the input of language models (LM), has shown promising results across various tasks and models for parameter-efficient fine-tuning (PEFT). PT stands out from other PEFT approaches because it maintains competitive performance with fewer trainable par...
Despite the high economic relevance of Foundation Industries, certain components like Reheating furnaces within their manufacturing chain are energy-intensive. Notable energy consumption reduction could be obtained by reducing the overall heating time in furnaces. Computer-integrated Machine Learning (ML) and Artificial Intelligence (AI) powered co...
In order to accurately simulate users in conversational systems, it is essential to comprehend the factors that influence their behaviour. This is a critical challenge for the Information Retrieval (IR) field, as conventional methods are not well-suited for the interactive and unique sequential structure of conversational contexts. In this study, w...
In recent years, language models (LMs) have made remarkable progress in advancing the field of natural language processing (NLP). However, the impact of data augmentation (DA) techniques on the fine-tuning (FT) performance of these LMs has been a topic of ongoing debate. In this study, we evaluate the effectiveness of three different FT methods in...
Dialogue systems and large language models (LLMs) have gained considerable attention. However, the direct utilization of LLMs as task-oriented dialogue (TOD) models has been found to underperform compared to smaller task-specific models. Nonetheless, it is crucial to acknowledge the significant potential of LLMs and explore improved approaches for...
In recent years, language models (LMs) have made remarkable progress in advancing the field of natural language processing (NLP). However, the impact of data augmentation (DA) techniques on the fine-tuning (FT) performance of these LMs has been a topic of ongoing debate. In this study, we evaluate the effectiveness of three different FT methods in...
Background: Grain commodities are important to people's daily lives and their fluctuations can cause instability for households. Accurate prediction of grain prices can improve food and social security.
Methods & Materials: This study proposes a hybrid Long Short‐Term Memory (LSTM)‐Convolutional Neural Network (CNN) model to forecast weekly oat, c...
Session-based recommendation, which aims to predict the next item of users' interest as per an existing sequence interaction of items, has attracted growing applications of Contrastive Learning (CL) with improved user and item representations. However, these contrastive objectives: (1) serve a similar role as the cross-entropy loss while ignoring t...
In recent years, groundbreaking transformer-based language models (LMs) have made tremendous advances in natural language processing (NLP) tasks. However, the measurement of their fairness with respect to different social groups still remains unsolved. In this paper, we propose and thoroughly validate an evaluation technique to assess the quality a...
The ability to understand a user's underlying needs is critical for conversational systems, especially with limited input from users in a conversation. Thus, in such a domain, Asking Clarification Questions (ACQs) to reveal users' true intent from their queries or utterances arise as an essential task. However, it is noticeable that a key limitatio...
Task-oriented dialogue systems aim at providing users with task-specific services. Users of such systems often do not know all the information about the task they are trying to accomplish, requiring them to seek information about the task. To provide accurate and personalized task-oriented information seeking results, task-oriented dialogue systems...
In collaborative tasks, effective communication is crucial for achieving joint goals. One such task is collaborative building where builders must communicate with each other to construct desired structures in a simulated environment such as Minecraft. We aim to develop an intelligent builder agent to build structures based on user input through dia...
Language models (LMs) trained on vast quantities of unlabelled data have greatly advanced the field of natural language processing (NLP). In this study, we re-visit the widely accepted notion in NLP that continued pre-training LMs on task-related texts improves the performance of fine-tuning (FT) in downstream tasks. Through experiments on eight si...
Citation screening is an essential and time-consuming step of the systematic literature review process in medicine. Multiple previous studies have proposed various automation techniques to assist manual annotators in this tedious task. The most widely used measure for the evaluation of automated citation screening techniques is Work Saved over Samp...
Early and late blight are two diseases which pose a huge risk to both potato (Solanum tuberosum L.) and tomato (Solanum lycopersicum, L. 1753) crops and make farmers run at a loss. The early and automatic detection of these diseases would save time as well as enable farmers to act quickly on crops which have been affected. Machine learning and deep...
Accurately monitoring the variation of snow cover from remote sensing is vital since it assists in various fields including prediction of floods, control of runoff values, and the ice regime of rivers. Spectral indices methods are traditional ways to realize snow segmentation, including the most common one – the Normalized Difference Snow Index (ND...
Given their strong performance on a variety of graph learning tasks, Graph Neural Networks (GNNs) are increasingly used to model financial networks. Traditional GNNs, however, are not able to capture higher-order topological information, and their performance is known to degrade with the presence of negative edges that may arise in many common fina...
The study presented here builds on previous synthetic aperture radar (SAR) burnt area estimation models and presents the first U-Net (a convolutional network architecture for fast and precise segmentation of images) combined with ResNet50 (Residual Networks used as a backbone for many computer vision tasks) encoder architecture used with SAR, Digit...
As virtual personal assistants have now penetrated the consumer market, with products such as Siri and Alexa, the research community has produced several works on task-oriented dialogue tasks such as hotel booking, restaurant booking, and movie recommendation. Assisting users to cook is one of these tasks that are expected to be solved by intellige...
In the midst of digital transformation, automatically understanding the structure and composition of scanned documents is important in order to allow correct indexing, archiving, and processing. In many organizations, different types of documents are usually scanned together in folders, so it is essential to automate the task of segmenting the fold...
The recording of this presentation is available on YouTube here: https://www.youtube.com/watch?v=_fuejyaCheU
Due to the sequential and interactive nature of conversations, the application of traditional Information Retrieval (IR) methods like the Cranfield paradigm require stronger assumptions. When building a test collection for Ad Hoc search, it is fair to assume that the relevance judgments provided by an annotator correlate well with the relevance jud...
As virtual personal assistants have now penetrated the consumer market, with products such as Siri and Alexa, the research community has produced several works on task-oriented dialogue tasks such as hotel booking, restaurant booking, and movie recommendation. Assisting users to cook is one of these tasks that are expected to be solved by intellige...
Collaborative tasks are ubiquitous activities where a form of communication is required in order to reach a joint goal. Collaborative building is one of such tasks. We wish to develop an intelligent builder agent in a simulated building environment (Minecraft) that can build whatever users wish to build by just talking to the agent. In order to ach...
Dialogue State Tracking (DST) aims to keep track of users’ intentions during the course of a conversation. In DST, modelling the relations among domains and slots is still an under-studied problem. Existing approaches that have considered such relations generally fall short in: (1) fusing prior slot-domain membership relations and dialogue-aware dy...
A human-like user simulator that anticipates users' satisfaction scores, actions, and utterances can help goal-oriented dialogue systems in evaluating the conversation and refining their dialogue strategies. However, little work has experimented with user simulators which can generate users' utterances. In this paper, we propose a deep learning-bas...
Collaborative tasks are ubiquitous activities where a form of communication is required in order to reach a joint goal. Collaborative building is one of such tasks. We wish to develop an intelligent builder agent in a simulated building environment (Minecraft) that can build whatever users wish to build by just talking to the agent. In order to ach...
Collaborative tasks are ubiquitous activities where a form of communication is required in order to reach a joint goal. Collaborative building is one of such tasks. We wish to develop an intelligent builder agent in a simulated building environment (Minecraft) that can build whatever users wish to build by just talking to the agent. In order to ach...
Inferring spatial relations in natural language is a crucial ability an intelligent system should possess. The bAbI dataset tries to capture tasks relevant to this domain (task 17 and 19). However, these tasks have several limitations. Most importantly, they are limited to fixed expressions, they are limited in the number of reasoning steps require...
Dialogue State Tracking (DST) aims to keep track of users' intentions during the course of a conversation. In DST, modelling the relations among domains and slots is still an under-studied problem. Existing approaches that have considered such relations generally fall short in: (1) fusing prior slot-domain membership relations and dialogue-aware dy...
For more than three decades, the part of the geoscientific community studying landslides through data-driven models has focused on estimating where landslides may occur across a given landscape. This concept is widely known as landslide susceptibility. And, it has seen a vast improvement from old bivariate statistical techniques to modern deep lear...
Flood incidents can massively damage and disrupt a city economic or governing core. However, flood risk can be mitigated through event planning and city-wide preparation to reduce damage. For, governments, firms, and civilians to make such preparations, flood susceptibility predictions are required. To predict flood susceptibility nine environmenta...
Deep Learning (DL) semantic image segmentation is a technique used in several fields of research. The present paper analyses semantic crack segmentation as a case study to review the up to date research on semantic segmentation in the presence of fine structures and the effectiveness of established approaches to address the inherent class imbalance...
Following its identification in late 2019, COVID-19 has spread around the globe, and been declared a pandemic. With this in mind, modelling the spread of COVID-19 remains important for responding effectively. To date research has focused primarily on modelling the spread of COVID-19 on national and regional scales with just a few studies doing so o...
Off-grid technologies, such as solar home systems (SHS), offer the opportunity to alleviate global energy poverty, providing a cost-effective alternative to an electricity grid connection. However, there is a paucity of high-quality SHS electricity usage data and thus a limited understanding of consumers’ past and future usage patterns. This study...
For more than three decades, the scientific community that studies landslides through data-driven models has focused on estimating where landslides occur across a given landscape. This concept is widely known as landslide susceptibility. And, it has seen a vast improvement from old bivariate statistical techniques to modern deep learning routines....
Inferring spatial relations in natural language is a crucial ability an intelligent system should possess. The bAbI dataset tries to capture tasks relevant to this domain (tasks 17 and 19). However, these tasks have several limitations. Most importantly, they are limited to fixed expressions, they are limited in the number of reasoning steps requir...
Information availability affects people's behavior and perception of the world. Notably, people rely on search engines to satisfy their need for information. Search engines deliver results relevant to user requests usually without being or making themselves accountable for the information they deliver, which may harm people's lives and, in turn, so...
Deep neural networks have gained momentum based on their accuracy, but their interpretability is often criticised. As a result, they are labelled as black boxes. In response, several methods have been proposed in the literature to explain their predictions. Among the explanatory methods, Shapley values is a feature attribution method favoured for i...
For the estimation of the soil organic carbon stocks, bulk density (BD) is a fundamental parameter but measured data are usually not available especially when dealing with legacy soil data. It is possible to estimate BD by applying pedotransfer function (PTF). We applied different estimation methods with the aim to define a suitable PTF for BD of a...
The on-going process of softwarization of IT networks promises to reduce the operational and management costs of network infrastructures by replacing hardware middleboxes with equivalent pieces of code executed on general-purpose servers. Alongside the benefits from the operator’s perspective, new strategies to provide the network’s resources to us...
Background: Legacy data are unique occasions for estimating soil organic carbon (SOC) concentration changes and spatial variability, but their use showed limitations due to the sampling schemes adopted and improvements may be needed in the analysis methodologies. When SOC changes is estimated with legacy data, the use of soil samples collected in d...
Active Learning (AL) has been successfully applied to Deep Learning in order to drastically reduce the amount of data required to achieve high performance. Previous works have shown that lightweight architectures for Named Entity Recognition (NER) can achieve optimal performance with only 25% of the original training data. However, these methods do...
As conversational agents like Siri and Alexa gain in popularity and use, conversation is becoming a more and more important mode of interaction for search. Conversational search shares some features with traditional search, but differs in some important respects: conversational search systems are less likely to return ranked lists of results (a SER...
Near-real time water segmentation with medium resolution satellite imagery plays a critical role in water management. Automated water segmentation of satellite imagery has traditionally been achieved using spectral indices. Spectral water segmentation is limited by environmental factors and requires human expertise to be applied effectively. In rec...
Legacy data are frequently unique sources of data for the estimation of past soil properties. With the rising concerns about greenhouse gases (GHG) emission and soil degradation due to intensive agriculture and climate change effects, soil organic carbon (SOC) concentration might change heavily over time. When SOC changes is estimated with legacy d...
Search engines decide what we see for a given search query. Since many people are exposed to information through search engines, it is fair to expect that search engines are neutral. However, search engine results do not necessarily cover all the viewpoints of a search query topic, and they can be biased towards a specific view since search engine...
Supervised machine learning models and their evaluation strongly depends on the quality of the underlying dataset. When we search for a relevant piece of information it may appear anywhere in a given passage. However, we observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-r...
Supervised machine learning models and their evaluation strongly depends on the quality of the underlying dataset. When we search for a relevant piece of information it may appear anywhere in a given passage. However, we observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-r...
For the estimation of the soil organic carbon stocks, bulk density (BD) is a fundamental parameter but measured data are usually not available especially when dealing with legacy soil data. It is possible to estimate BD by applying pedotransfer function (PTF). We applied different estimation methods with the aim to define a suitable PTF for BD of a...
Neural point processes (NPPs) employ neural networks to capture complicated dynamics of asynchronous event sequences. Existing NPPs feed all history events into neural networks, assuming that all event types contribute to the prediction of the target type. However, this assumption can be problematic because in reality some event types do not contri...
Modelling the spread of coronavirus globally while learning trends at global and country levels remains crucial for tackling the pandemic. We introduce a novel variational-LSTM Autoencoder model to predict the spread of coronavirus for each country across the globe. This deep Spatio-temporal model does not only rely on historical data of the virus...
Building fire risk prediction is crucial for allocation of building inspection resources and prevention of fire incidents. Existing research of building fire prediction makes use of data relating to local demography, crime, building use and physical building characteristics, yet few studies have analysed the relative importance of predictive featur...
Background
Legacy data are unique occasions for estimating soil organic carbon (SOC) concentration changes and spatial variability, but their use can pose limitations due to the sampling schemes adopted and improvements may be needed in the analysis methodologies. When SOC changes is estimated with legacy data, the use of soil samples collected in...
Supervised machine learning models and their evaluation strongly depends on the quality of the underlying dataset. When we search for a relevant piece of information it may appear anywhere in a given passage. However, we observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-r...
The explosion of Open Educational Resources (OERs) in the recent years creates the demand for scalable, automatic approaches to process and evaluate OERs, with the end goal of identifying and recommending the most suitable educational materials for learners. We focus on building models to find the characteristics and features involved in context-ag...
Capturing the occurrence dynamics is crucial to predicting which type of events will happen next and when. A common method to do this is through Hawkes processes. To enhance their capacity, recurrent neural networks (RNNs) have been incorporated due to RNNs’ successes in processing sequential data such as languages. Recent evidence suggests that se...
Shapley values are great analytical tools in game theory to measure the importance of a player in a game. Due to their axiomatic and desirable properties such as efficiency, they have become popular for feature importance analysis in data science and machine learning. However, the time complexity to compute Shapley values based on the original form...
Shapley values are great analytical tools in game theory to measure the importance of a player in a game. Due to their axiomatic and desirable properties such as efficiency, they have become popular for feature importance analysis in data science and machine learning. However, the time complexity to compute Shapley values based on the original form...
The use of stopwords has been thoroughly studied in traditional Information Retrieval systems, but remains unexplored in the context of neural models. Neural re-ranking models take the full text of both the query and document into account. Naturally, removing tokens that do not carry relevance information provides us with an opportunity to improve...
Being able to explain a prediction as well as having a model that performs well are paramount in many machine learning applications. Deep neural networks have gained momentum recently on the basis of their accuracy, however these are often criticised to be black-boxes. Many authors have focused on proposing methods to explain their predictions. Amo...
The explosion of Open Educational Resources (OERs) in the recent years creates the demand for scalable, automatic approaches to process and evaluate OERs, with the end goal of identifying and recommending the most suitable educational materials for learners. We focus on building models to find the characteristics and features involved in context-ag...
Modelling the spread of coronavirus globally while learning trends at global and country levels remains crucial for tackling the pandemic. We introduce a novel variational LSTM-Autoencoder model to predict the spread of coronavirus for each country across the globe. This deep spatio-temporal model does not only rely on historical data of the virus...
While ranking approaches have made rapid advances in the Web search, systems that cater to the complex information needs in professional search tasks are not widely developed, common issues and solutions typically rely on dedicated search strategies backed by ad-hoc retrieval models. In this paper we present a legal search problem where professiona...
While ranking approaches have made rapid advances in the Web search, systems that cater to the complex information needs in professional search tasks are not widely developed, common issues and solutions typically rely on dedicated search strategies backed by ad-hoc retrieval models. In this paper we present a legal search problem where professiona...
In the last decade the ICT community observed a growing popularity of software networking paradigms. This trend consists in moving network applications from static, expensive, hardware equipment (e.g. router, switches, firewalls) towards flexible, cheap pieces of software that are executed on a commodity server. In this context, a server owner may...
The purpose of the SIGIR 2019 workshop on Fairness, Accountability, Confidentiality, Transparency , and Safety (FACTS-IR) was to explore challenges in responsible information retrieval system development and deployment. To this end, the workshop aimed to crowd-source from the larger SIGIR community and draft an actionable research agenda on five ke...