Ebrahim Bagheri's research while affiliated with University of Toronto and other places

Publications (244)

Preprint
Peer review is an integral component of scientific research. The quality of peer review, and consequently the published research, depends to a large extent on the ability to recruit adequate reviewers for submitted papers. However, finding such reviewers is an increasingly difficult task due to several factors, such as the continuous increase both...
Article
Team formation is concerned with the identification of a group of experts who have a high likelihood of effectively collaborating with each other in order to satisfy a collection of input skills. Solutions to this task have mainly adopted graph operations and have at least the following limitations: (1) they are computationally demanding as they re...
Chapter
The objective of this paper is to show that it is possible to significantly reduce stereotypical gender biases in neural rankers without modifying the ranking loss function, which is the current approach in the literature. We systematically de-bias gold standard relevance judgement datasets with a set of balanced and well-matched query pairs. Such...
Chapter
This paper presents the idea of systematically integrating relation triples derived from Open Information Extraction (OpenIE) with neural rankers in order to improve the performance of the ad-hoc retrieval task. This is motivated by two reasons: (1) to capture longer-range semantic associations between keywords in documents, which would not otherwi...
Chapter
The Information Retrieval community has made strides in developing neural rankers, which have shown strong retrieval effectiveness on large-scale gold standard datasets. The focus of existing neural rankers has primarily been on measuring the relevance of a document or passage to the user query. However, other considerations such as the convincingne...
Chapter
Recent studies have shown that significant performance improvements reported by neural rankers do not necessarily extend to a diverse range of queries. There is a large set of queries that cannot be effectively addressed by neural rankers primarily because relevant documents to these queries are not identified by first-stage retrievers. In this pap...
Chapter
Recent studies have shown that information retrieval systems may exhibit stereotypical gender biases in outcomes which may lead to discrimination against minority groups, such as different genders, and impact users’ decision making and judgements. In this tutorial, we inform the audience of studies that have systematically reported the presence of...
Preprint
Background: Systematic reviews (SRs) are being published at an accelerated rate. Decision makers are often faced with the challenge of comparing and choosing between multiple SRs on the same topic. We surveyed individuals in the healthcare field to understand what criteria they use to compare and select one or more SRs from multiple on the same topi...
Preprint
Many real-world networks, such as social networks, contain structural heterogeneity and experience temporal evolution. However, while there has been growing literature on network representation learning, only a few have addressed the need to learn representations for dynamic heterogeneous networks. The objective of our work in this paper is to introd...
Preprint
The team discovery task is concerned with finding a group of experts from a collaboration network who would collectively cover a desirable set of skills. Most prior work on team discovery adopts either graph-based or neural mapping approaches. Graph-based approaches are computationally intractable, often leading to sub-optimal team selection. Neural...
Article
Rapid technological innovations, especially in the information technology space, demand the workforce to be vigilant by acquiring new skills to remain relevant and employable. The workforce needs to be engaged in a continuous lifelong learning process by educating themselves about skills that will be in demand in the future. To do so, it is essenti...
Article
Finding a qualified individual who can independently answer a question on a community question answering platform is becoming more challenging due to the increasing multidisciplinary nature of posted questions. As such, finding a group of experts to collaboratively answer the questions is of paramount importance. To this end, we propose a novel app...
Data
This is my improved version of NSL-KDD. It no longer contains metadata features or other contaminating features. It contains no duplicate samples, no samples with N/A values and it has been optimized with the most appropriate data types for optimal storage and loading. This upload mirrors my version of this dataset on Kaggle https://www.kaggle.com/...
Article
Community question answering (CQA) platforms are receiving increased attention and are becoming an indispensable source of information in different domains ranging from board games to physics. The success of these platforms depends on how efficiently new questions are assigned to community experts, a task known as question routing. In this paper,...
Article
Bitcoin mining is the process of generating new blocks in the Bitcoin blockchain. This process is vulnerable to different types of attacks. One of the most famous attacks in this category is selfish mining. This attack is essentially a strategy that a sufficiently powerful mining pool can follow to obtain more revenue than its fair share. The reaso...
Article
The identification of knowledge graph entity mentions in textual content has already attracted much attention. The major assumption of existing work is that entities are explicitly mentioned in text and would only need to be disambiguated and linked. However, this assumption does not necessarily hold for social content where a significant portion o...
Article
The focus of our work is the ad hoc table retrieval task, which aims to rank a list of structured tabular objects in response to a user query. Given the importance of this task, various methods have already been proposed in the literature that focus on syntactic, semantic and neural representations of tables for determining table relevance. However...
Chapter
In light of recent studies that show neural retrieval methods may intensify gender biases during retrieval, the objective of this paper is to propose a simple yet effective sampling strategy for training neural rankers that would allow the rankers to maintain their retrieval effectiveness while reducing gender biases. Our work proposes to consider...
Article
Researchers have already observed social contagion effects in both in-person and online interactions. However, such studies have primarily focused on users’ beliefs, mental states, and interests. In this article, we expand the state of the art by exploring the impact of social contagion on social alignment, i.e., whether the decision to socially al...
Article
The speed of digital transformation has resulted in new challenges for job seekers to become lifelong learners and to develop new skills faster than before. In this paper, our main objective is to examine how online content can serve as indicators for changes to the Information Technology (IT) industry and its in-demand skills. To study this relati...
Article
Existing work in the literature has shown that the number and quality of product ratings and reviews have a direct correlation with the product purchase rates in online e-commerce portals. However, the majority of the products on e-commerce portals do not have any ratings or reviews and are known as cold products (∼90% of products on Amazon are co...
Article
Background: Current text mining tools supporting abstract screening in systematic reviews are not widely used, in part because they lack sensitivity and precision. We set out to develop an accessible, semi-automated “workflow” to conduct abstract screening for systematic reviews and other knowledge synthesis methods. Methods: We adopt widely recomme...
Article
Linking textual content to entities from the knowledge graph has received increasing attention in the context of which surface form representations of entities, e.g., terms or phrases, are disambiguated and linked to appropriate entities. This allows textual content, e.g., social user-generated content, to be interpreted and reasoned on at a higher...
Chapter
Recent studies in information retrieval have shown that gender biases have found their way into representational and algorithmic aspects of computational models. In this paper, we focus specifically on gender biases in information retrieval gold standard datasets, often referred to as relevance judgements. While not explored in the past, we submit...
Chapter
Post-retrieval Query Performance Prediction (QPP) methods benefit from the characteristics of the retrieved set of documents to determine query difficulty. While existing works have investigated the relation between query and retrieved document spaces, as well as retrieved document scores, the association between the retrieved documents themselves,...
Chapter
We present an open-source extensible Python-based toolkit that provides access to (1) a range of built-in unsupervised query expansion methods, and (2) a pipeline for generating gold standard datasets for building and evaluating supervised query refinement methods. While the information literature offers abundant work on query expansion techniques, t...
Article
Predicting the performance of a retrieval method for a given query is a highly important and challenging problem in information retrieval. Accurate Query Performance Prediction (QPP) plays an important role in real time handling of queries with varying levels of difficulty. While there have been several successful query performance predictors, no p...
Article
Social interactions through online social media have become a daily routine of many, and the number of those whose real world (offline) and online lives have become intertwined is continuously growing. As such, the interplay of individuals' online and offline activities has been the subject of numerous research studies, the majority of which explor...
Article
Event relations specify how different event flows expressed within the context of a textual passage relate to each other in terms of temporal and causal sequences. There has already been impactful work in the area of temporal and causal event relation extraction; however, the challenge with these approaches is that (1) they are mostly supervised m...
Conference Paper
Social media users readily share their preferences, life events, sentiment and opinions, and implicitly signal their thoughts, feelings, and psychological behavior. This makes social media a viable source of information to accurately and effectively mine users' interests with the hopes of enabling more effective user engagement, better quality deli...
Chapter
Performing observational studies based on social network content has recently gained traction, where the impact of various types of interruptions on users’ behavior has been studied. There has been recent work that has focused on how online social network behavior and activity can impact users’ offline behavior. In this paper, we study the invers...
Article
In information retrieval, the task of query performance prediction (QPP) is concerned with determining in advance the performance of a given query within the context of a retrieval model. QPP has an important role in ensuring proper handling of queries with varying levels of difficulty. Based on the extant literature, query specificity is an import...
Chapter
Implicit entity linking is the task of identifying an appropriate entity whose surface form is not explicitly mentioned in the text. Unlike explicit entity linking where an entity is linked to an observed phrase within the input text, implicit entity linking is concerned with determining specific yet implied entities. Existing work in the literatur...
Chapter
We propose a temporal latent space model for user community prediction in social networks, whose goal is to predict future emerging user communities based on past history of users’ topics of interest. Our model assumes that each user lies within an unobserved latent space, and similar users in the latent space representation are more likely to be m...
Chapter
Query Performance Prediction (QPP) is concerned with estimating the effectiveness of a query within the context of a retrieval model. It allows for operations such as query routing and segmentation, leading to improved retrieval performance. Pre-retrieval QPP methods are oblivious to the performance of the retrieval model as they predict query diff...
Chapter
The ad hoc table retrieval task is concerned with satisfying a query with a ranked list of tables. While there are strong baselines in the literature that exploit learning to rank and semantic matching techniques, there are still a set of hard queries that are difficult for these baseline methods to address. We find that such hard queries are those...
Preprint
The vision of the Linked Open Data (LOD) initiative is to provide a distributed model for publishing and meaningfully interlinking open data. The realization of this goal depends strongly on the quality of the data that is published as a part of the LOD. This paper focuses on the systematic quality assessment of datasets prior to publication on the...
Article
Recent advances in microblog content summarization have primarily viewed this task in the context of traditional multi-document summarization techniques where a microblog post or a collection of posts forms one document. While these techniques already facilitate information aggregation, categorization and visualization of microblog posts, they fall short...
Article
Social network users publicly share a wide variety of information with their followers and the general public, ranging from their opinions and sentiments to personal life activities. There have already been significant advances in analyzing the shared information from both micro (individual user) and macro (community level) perspectives, giving access t...
Book
Mining user interests from user behavioral data is critical for many applications. Based on user interests, service providers like advertisers can significantly reduce service delivery costs by offering the most relevant products to their customers. The challenge of accurately and efficiently identifying user interests has been the subject of incre...
Article
Product reviews written by the crowd on e-commerce shopping websites have become a critical information source for making purchasing decisions. An important challenge, however, is that the vast majority of products (e.g., 90% of products on amazon.com) do not receive enough attention and lack sufficient reviews by the users; hence, they constitute...
Conference Paper
Specificity is the level of detail at which a given term is represented. Existing approaches to estimating term specificity are primarily dependent on corpus-level frequency statistics. In this work, we explore how neural embeddings can be used to define corpus-independent specificity metrics. Particularly, we propose to measure term specificity ba...
Article
The ever increasing presence of online social networks in users’ daily lives has led to the interplay between users’ online and offline activities. There have already been several works that have studied the impact of users’ online activities on their offline behavior, e.g., the impact of interaction with friends on an exercise social network on th...
Conference Paper
A lot of research in social network mining is concerned with theories and methodologies for community discovery, pattern detection and network evolution, as well as behavioural analysis and anomaly (misbehaviour) detection. The MAISoN workshop focuses on the use of social network data and methods for building predictive models that can be used to u...
Article
In this paper, we propose an evolutionary computing approach based on Genetic Algorithms for composing an efficient trace given a desirable utility function based on the observations made in the event logs of several peer-organizations. Our proposed approach works with a set of event logs from different peer-organizations and generates an efficient...
Conference Paper
The abundance of user generated content on social networks provides the opportunity to build models that are able to accurately and effectively extract, mine and predict users' interests with the hopes of enabling more effective user engagement, better quality delivery of appropriate services and higher user satisfaction. While traditional methods...
Conference Paper
The abundance of user generated content on social networks provides the opportunity to build models that are able to accurately and effectively extract, mine and predict users' interests with the hopes of enabling more effective user engagement, better quality delivery of appropriate services and higher user satisfaction. While traditional methods...
Article
Identifying and extracting user communities is an important step towards understanding social network dynamics from a macro perspective. For this reason, the work in this paper explores various aspects related to the identification of user communities. To date, user community detection methods employ either explicit links between users (link analys...
Article
Relation extraction aims at finding meaningful relationships between two named entities from within unstructured textual content. In this paper, we define the problem of information extraction as a matrix completion problem where we employ the notion of universal schemas formed as a collection of patterns derived from open information extraction sy...
Article
Traditional information retrieval techniques that primarily rely on keyword-based linking of the query and document spaces face challenges such as the vocabulary mismatch problem where relevant documents to a given query might not be retrieved simply due to the use of different terminology for describing the same concepts. As such, semantic search...
Chapter
Most real-world information networks, such as social networks, are heterogeneous and as such, relationships in these networks can be of different types and hence carry differing semantics. Therefore techniques for link prediction in homogeneous networks cannot be directly applied on heterogeneous ones. On the other hand, works that investigate link...
Chapter
Temporal event relations specify how different events expressed within the context of a textual passage relate to each other in terms of time sequence. There has already been impactful work in the area of temporal event relation extraction; however, existing approaches are mostly supervised methods that rely on sentence-level textual, syntactic and grammatical s...
Article
The accurate prediction of users’ future interests on social networks allows one to perform future planning by studying how users will react if certain topics emerge in the future. It can improve areas such as targeted advertising and the efficient delivery of services. Despite the importance of predicting user future interests on social networks,...
Conference Paper
Neural embeddings have been effectively integrated into information retrieval tasks including ad hoc retrieval. One of the benefits of neural embeddings is they allow for the calculation of the similarity between queries and documents through vector similarity calculation methods. While such methods have been effective for document matching, they h...
Conference Paper
The accurate prediction of users' future topics of interests on social networks can facilitate content recommendation and platform engagement. However, researchers have found that future interest prediction, especially on social networks such as Twitter, is quite challenging due to the rapid changes in community topics and evolution of user interac...
Conference Paper
Researchers have shown that it is possible to identify reported instances of personal life events from users' social content, e.g., tweets. This is known as personal life event detection. In this paper, we take a step forward and explore the possibility of predicting users' next personal life event based solely on their historically reported pe...
Article
The growth in the number of publicly available services on the Web has encouraged developers to rely more heavily on such services to deliver products in a faster, cheaper and more reliable fashion. Many developers are now using a collection of these services in tandem to build their applications. While there has been much attention to the area of...
Article
Learning low dimensional dense representations of the vocabularies of a corpus, known as neural embeddings, has gained much attention in the information retrieval community. While there have been several successful attempts at integrating embeddings within the ad hoc document retrieval task, no systematic study has been reported that explores...
Article
Objectives: To illustrate the use of process mining concepts, techniques, and tools to improve the systematic review process. Study design and setting: We simulated review activities and step-specific methods in the process for systematic reviews conducted by one research team over 1 year to generate an event log of activities, with start/end da...
Article
Entity linking, also known as semantic annotation, of textual content has received increasing attention. Recent works in this area have focused on entity linking on text with special characteristics such as search queries and tweets. The semantic annotation of tweets has proven especially challenging given the informal nature of the writing and...
Article
One of the major challenges in Web search pertains to the correct interpretation of users’ intent. Query Expansion is one of the well-known approaches for determining the intent of the user by addressing the vocabulary mismatch problem. A limitation of the current query expansion approaches is that the relations between the query terms and the expa...
Article
The large number of published services has motivated the development of tools for creating customized composite services known as service compositions. While service compositions provide high agility and development flexibility, they can also pose challenges when it comes to delivering guaranteed functional and non-functional requirements. This is...
Article
Objective: The goal of this work is to map Unified Medical Language System (UMLS) concepts to DBpedia resources using widely accepted ontology relations from the Simple Knowledge Organization System (skos:exactMatch, skos:closeMatch) and from the Resource Description Framework Schema (rdfs:seeAlso), as a result of which a complete mapping from UML...
Article
Background: About 8% of U.S. women are prescribed antidepressant medications around the time of pregnancy. Decisions about medication use in pregnancy can be swayed by the opinion of family, friends and online media, sometimes beyond the advice offered by healthcare providers. Exploration of the online social network response to research on antidepre...