Mohammed Korayem
PhD
Indiana University Bloomington | IUB · Department of Computer Science
About
55 Publications
26,168 Reads
1,499 Citations
Introduction
Mohammed’s PhD research studied large-scale textual and visual mining in social media under the supervision of Prof. David Crandall. His other research interests include Machine Learning, Recommendation Systems, Computer Vision, Text Mining, Web Mining, and Soft Computing. He received his PhD in June 2015 and is now a Data Scientist at CareerBuilder, Inc.
Additional affiliations
January 2010 - June 2015
Education
January 2010 - June 2015
January 2010 - May 2013
May 2003 - May 2006
Publications (55)
Many people use multiple online and social computing platforms, and choose to share varying amounts of personal information about themselves depending on the context and type of site. For example, people may be willing to share personally-identifiable details (including their real name and date of birth) on a site like Facebook, but may withhold th...
As recruitment and talent acquisition have become more and more competitive, recruitment firms have become more sophisticated in using machine learning (ML) methodologies to optimize their day-to-day activities. But most of the published ML-based methodologies in this area have been limited to tasks like candidate matching, job-to-skill matchin...
An online marketplace is a digital platform that connects buyers (demand) and sellers (supply) and provides exposure opportunities that individual participants would not otherwise have access to. The KDD-22 Workshop on Decision Intelligence and Analytics for Online Marketplaces: Jobs, Ridesharing, Retail, and Beyond brought together academics and prac...
The online recruitment matching system has been the core technology and service platform at CareerBuilder. One of the major challenges in an online recruitment scenario is to provide good matches between job posts and candidates using a recommender system at scale. In this paper, we discuss the techniques for applying an embedding-based recom...
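As a rough illustration of embedding-based matching, the minimal sketch below ranks job postings for each candidate by cosine similarity in a shared vector space. The function name, random vectors, and the choice of cosine similarity are assumptions for illustration, not the paper's actual pipeline.

```python
import numpy as np

def top_k_matches(candidate_vecs, job_vecs, k=5):
    """Rank job postings for each candidate by cosine similarity
    in a shared embedding space (illustrative only)."""
    # L2-normalize so dot products become cosine similarities.
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    j = job_vecs / np.linalg.norm(job_vecs, axis=1, keepdims=True)
    scores = c @ j.T                          # (num_candidates, num_jobs)
    # Sort descending and keep the k best job indices per candidate.
    return np.argsort(-scores, axis=1)[:, :k]

# Toy usage: 3 candidates, 10 jobs, 64-dimensional embeddings.
rng = np.random.default_rng(0)
print(top_k_matches(rng.normal(size=(3, 64)), rng.normal(size=(10, 64))))
```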
Nowadays, more and more companies are becoming employee-centric enterprises with a data-driven culture. Employees, as a unique form of capital, are linked to the company’s profitability. Hiring the appropriate employee and decreasing hiring liability and workplace violence are two main tasks in the employment process. Most companies are outsourcing backgrou...
Job transitions and upskilling are common actions taken by many industry professionals throughout their careers. In today’s rapidly changing job landscape, where requirements constantly shift and new industry sectors emerge, it is especially difficult to plan and navigate a predetermined career path. In this work, we implemente...
Job recommendation is a crucial part of the online job recruitment business. To match the right person with the right job, a good representation of job postings is required. Such representations should ideally recommend jobs with fitting titles, aligned skill set, and reasonable commute. To address these aspects, we utilize three information graphs...
Job recommendation is an important task for the modern recruitment industry. An excellent job recommender system not only recommends a higher-paying job that is maximally aligned with the skill set of the current job, but also suggests acquiring the few additional skills required to assume the new position. In this work, we create...
In this paper, we introduce a fully automated quality-assurance (QA) system for search and recommendation engines that does not require the participation of end users in the process of evaluating any changes to the existing relevancy algorithms. Specifically, the proposed system doesn’t require any manual effort to assign relevancy scores to query/docum...
Online job boards are one of the central components of the modern recruitment industry. With millions of candidates browsing through job postings every day, the need for accurate, effective, meaningful, and transparent job recommendations is more apparent than ever. While recommendation systems are successfully advancing in a variety of online domains by...
Recommendation systems usually involve exploiting the relations among known features and content that describe items (content-based filtering) or the overlap of similar users who interacted with or rated the target item (collaborative filtering). To combine these two filtering approaches, current model-based hybrid recommendation systems typically...
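To make the two filtering signals contrasted above concrete, here is a minimal sketch that blends a content-based score (feature similarity to items the user has interacted with) with a collaborative score (ratings of the target item by similar users). The data layout, neighborhood size, and linear blend are illustrative assumptions, not the hybrid model proposed in the paper.

```python
import numpy as np

def hybrid_score(user_id, item_id, ratings, item_features, alpha=0.5):
    """Blend a content-based score with a collaborative score for one (user, item) pair.
    ratings: (n_users, n_items) array, 0 meaning 'no interaction'.
    item_features: (n_items, n_features) content descriptors."""
    liked = np.where(ratings[user_id] > 0)[0]          # items this user already rated
    # Content-based signal: similarity of the target item's features to
    # the features of items the user has interacted with.
    f = item_features / (np.linalg.norm(item_features, axis=1, keepdims=True) + 1e-9)
    content = float(np.mean(f[liked] @ f[item_id])) if len(liked) else 0.0
    # Collaborative signal: how the most similar users rated the target item.
    r = ratings / (np.linalg.norm(ratings, axis=1, keepdims=True) + 1e-9)
    neighbors = np.argsort(-(r @ r[user_id]))[1:6]     # top-5 most similar users
    collab = float(ratings[neighbors, item_id].mean())
    return alpha * content + (1 - alpha) * collab

# Toy usage with random data (illustration only).
rng = np.random.default_rng(1)
ratings = rng.integers(0, 6, size=(20, 15))
features = rng.random((15, 8))
print(hybrid_score(user_id=0, item_id=3, ratings=ratings, item_features=features))
```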
Most work in building semantic knowledge bases has thus far focused upon either manually building language-specific taxonomies/ontologies or upon automatic techniques such as clustering or dimensionality reduction to discover latent semantic links within the content of a given corpus. The former is very labor intensive and is hard to maintain, whil...
Collaborative Filtering (CF) is widely used in large-scale recommendation engines because of its efficiency, accuracy and scalability. However, in practice, the fact that recommendation engines based on CF require interactions between users and items before making recommendations makes them inappropriate for new items which haven't been exposed to th...
Accurate, efficient, global observation of natural events is important for ecologists, meteorologists, governments, and the public. Satellites are effective but limited by their perspective and by atmospheric conditions. Public images on photo-sharing websites could provide crowd-sourced ground data to complement satellites, since photos contain ev...
Recommendation emails are among the best ways to re-engage with customers after they have left a website. While on-site recommendation systems focus on finding the most relevant items for a user at the moment (right item), email recommendations add two critical additional dimensions: who to send recommendations to (right person) and when to send th...
Subjective and sentiment analysis has recently gained considerable attention. Most of the resources and systems built so far are for English, and the need for designing systems for other languages is increasing. This paper surveys different approaches used for building subjective and sentiment analysis systems for languages other than English. There...
This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple term...
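To make the inverted-index idea concrete, the toy sketch below maps each term to the set of documents containing it (its postings list) and scores the "edge" between two terms from the size of the intersection relative to what independence would predict. The corpus and the particular relatedness heuristic are assumptions for illustration; the paper defines its own scoring.

```python
import math
from collections import defaultdict

docs = {
    0: "java developer spring hibernate",
    1: "java software engineer spring",
    2: "registered nurse icu hospital",
    3: "nurse practitioner hospital java",   # tiny toy corpus
}

# Build the inverted index: term -> set of doc ids (postings list).
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def edge_strength(a, b, n_docs=len(docs)):
    """Score the term-to-term 'edge' from the overlap of postings lists,
    normalized by the overlap expected if the terms were independent."""
    inter = len(index[a] & index[b])
    expected = len(index[a]) * len(index[b]) / n_docs
    return 0.0 if inter == 0 else math.log(inter / expected + 1.0)

print(edge_strength("java", "spring"))    # related terms -> larger score
print(edge_strength("java", "nurse"))     # weak relationship -> smaller score
```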
Low-cost, lightweight wearable cameras let us record (or 'lifelog') our lives from a 'first-person' perspective for purposes ranging from fun to therapy. But they also capture private information that people may not want to be recorded, especially if images are stored in the cloud or visible to other people. For example, recent studies suggest that...
We present an ensemble approach for categorizing search query entities in the recruitment domain. Understanding the types of entities expressed in a search query (Company, Skill, Job Title, etc.) enables more intelligent information retrieval based upon those entities compared to a traditional keyword-based search. Because search queries are typica...
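The sketch below shows one generic way an ensemble could vote on the type of a query entity (Company, Skill, Job Title). The individual voters here are stand-in heuristics and the dictionaries are made up; they only illustrate majority voting, not the classifiers the paper actually combines.

```python
from collections import Counter

SKILLS = {"python", "java", "sql"}
COMPANIES = {"google", "ibm", "careerbuilder"}

def dictionary_vote(query):
    q = query.lower()
    if q in COMPANIES:
        return "Company"
    if q in SKILLS:
        return "Skill"
    return "Job Title"

def suffix_vote(query):
    # Crude heuristic: corporate suffixes suggest a company name.
    return "Company" if query.lower().endswith((" inc", " llc", " corp")) else "Job Title"

def length_vote(query):
    # Single tokens lean toward skills in this toy setup.
    return "Skill" if len(query.split()) == 1 else "Job Title"

def ensemble_classify(query):
    """Majority vote across the individual heuristic classifiers."""
    votes = Counter(vote(query) for vote in (dictionary_vote, suffix_vote, length_vote))
    return votes.most_common(1)[0][0]

print(ensemble_classify("python"))                 # -> Skill
print(ensemble_classify("senior java developer"))  # -> Job Title
```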
Several algorithms and tools have been developed to (semi) automate the process of glycan identification by interpreting Mass Spectrometric data. However, each has limitations when annotating MSn data with thousands of MS spectra using uncurated public databases. Moreover, the existing tools are not designed to manage MSn data where n > 2. We propo...
Probabilistic Graphical Models (PGM) are very useful in the fields of machine learning and data mining. The crucial limitation of those models, however, is scalability. The Bayesian Network, which is one of the most common PGMs used in machine learning and data mining, demonstrates this limitation when the training data consists of random variab...
As the ability to store and process massive amounts of user behavioral data increases, new approaches continue to arise for leveraging the wisdom of the crowds to gain insights that were previously very challenging to discover by text mining alone. For example, through collaborative filtering, we can learn previously hidden relationships between it...
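As a minimal illustration of mining item relationships from behavioral logs, the sketch below counts how often two items appear in the same users' interaction histories. This kind of co-occurrence counting is a standard collaborative-filtering building block assumed here for illustration; the log and item names are invented.

```python
from collections import defaultdict
from itertools import combinations

# Toy behavioral log: user -> items they interacted with.
interactions = {
    "u1": ["resume_tips", "java_jobs", "interview_prep"],
    "u2": ["java_jobs", "interview_prep"],
    "u3": ["resume_tips", "nursing_jobs"],
}

co_counts = defaultdict(int)
for items in interactions.values():
    for a, b in combinations(sorted(set(items)), 2):
        co_counts[(a, b)] += 1     # items seen together by the same user

# Items that co-occur across many users are likely related,
# even if their text descriptions never mention each other.
for pair, count in sorted(co_counts.items(), key=lambda kv: -kv[1]):
    print(pair, count)
```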
Term ambiguity - the challenge of having multiple potential meanings for a keyword or phrase - can be a major problem for search engines. Contextual information is essential for word sense disambiguation, but search queries are often limited to very few keywords, making the available textual context needed for disambiguation minimal or non-existent...
Image classification is a fundamental computer vision problem with decades of related work. It is a complex task and is a crucial part of many applications. The vision community has created many standard data sets for object recognition and image classification. While these benchmarks are created with the goal of being a realistic, representative s...
We live and work in environments that are inundated with cameras embedded in devices such as phones, tablets, laptops, and monitors. Newer wearable devices like Google Glass, Narrative Clip, and Autographer offer the ability to quietly log our lives with cameras from a 'first person' perspective. While capturing several meaningful and interesting m...
Most work in semantic search has thus far focused upon either manually building language-specific taxonomies/ontologies or upon automatic techniques such as clustering or dimensionality reduction to discover latent semantic links within the content that is being searched. The former is very labor intensive and is hard to maintain, while the latte...
Common difficulties like the cold-start problem and a lack of sufficient information about users due to their limited interactions have been major challenges for most recommender systems (RS). To overcome these challenges and many similar ones that result in low accuracy (precision and recall) recommendations, we propose a novel system that extract...
In the big data era, scalability has become a crucial requirement for any useful computational model. Probabilistic graphical models are very useful for mining and discovering data insights, but they are not scalable enough to be suitable for big data problems. Bayesian Networks particularly demonstrate this limitation when their data is represente...
The billions of public photos on online social media sites contain a vast amount of latent visual information about the world. In this paper, we study the feasibility of observing the state of the natural world by recognizing specific types of scenes and objects in large-scale social image collections. More specifically, we study whether we can rec...
Captchas are challenge-response tests used in many online systems to prevent attacks by automated bots. Avatar Captchas are a recently-proposed variant in which users are asked to classify between human faces and computer- generated avatar faces, and have been shown to be secure if bots employ random guessing. We test a variety of modern object rec...
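A minimal sketch of the kind of experiment described: train an off-the-shelf classifier to separate human-face images from avatar images and check whether it beats random guessing. Random arrays stand in for real Captcha crops, and logistic regression over raw pixels stands in for the modern object-recognition pipelines actually evaluated.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Stand-ins for 32x32 grayscale crops: label 0 = human face, 1 = avatar.
# (A real experiment would load actual Captcha images here.)
humans = rng.normal(loc=0.4, scale=0.1, size=(200, 32 * 32))
avatars = rng.normal(loc=0.6, scale=0.1, size=(200, 32 * 32))
X = np.vstack([humans, avatars])
y = np.array([0] * 200 + [1] * 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# If even a simple classifier scores well above chance, the
# random-guessing security assumption for the Captcha no longer holds.
print("held-out accuracy:", clf.score(X_te, y_te))
```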
Subjectivity and sentiment analysis (SSA) has recently gained considerable attention, but most of the resources and systems built so far are tailored to English and other Indo-European languages. The need for designing systems for other languages is increasing, especially as blogging and micro-blogging websites become popular throughout the world. This...
Captchas are frequently used on the modern world wide web to differentiate human users from automated bots by giving tests that are easy for humans to answer but difficult or impossible for algorithms. As artificial intelligence algorithms have improved, new types of Captchas have had to be developed. Recent work has proposed a new system called Av...
The popularity of social media websites like Flickr and Twitter has created enormous collections of user-generated content online. Latent in these content collections are observations of the world: each photo is a visual snapshot of what the world looked like at a particular point in time and space, for example, while each tweet is a textual expres...
Studying relationships between keyword tags on social sharing websites has become a popular topic of research, both to improve tag suggestion systems and to discover connections between the concepts that the tags represent. Existing approaches have largely relied on tag co-occurrences. In this paper, we show how to find connections between tags by...
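The paper's own method is cut off in this preview, but the co-occurrence baseline it mentions is easy to sketch: score two tags by the Jaccard overlap of the photo sets they label. The toy data and the choice of Jaccard similarity are illustrative assumptions, not the approach proposed in the paper.

```python
# Toy photo -> tags data; the baseline relates tags that label the same photos.
photo_tags = {
    "p1": {"beach", "sunset", "ocean"},
    "p2": {"beach", "ocean", "surf"},
    "p3": {"mountain", "snow", "sunset"},
}

def jaccard(tag_a, tag_b):
    """Co-occurrence similarity: overlap of the photo sets carrying each tag."""
    a = {p for p, tags in photo_tags.items() if tag_a in tags}
    b = {p for p, tags in photo_tags.items() if tag_b in tags}
    return len(a & b) / len(a | b) if a | b else 0.0

print(jaccard("beach", "ocean"))    # 1.0: always appear together here
print(jaccard("beach", "sunset"))   # ~0.33: weaker co-occurrence
```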
Although Subjectivity and Sentiment Analysis (SSA) has been witnessing a flurry of novel research, there are few attempts to build SSA systems for Morphologically-Rich Languages (MRL). In the current study, we report efforts to partially fill this gap. We present a newly developed manually annotated corpus of Modern Standard Arabic (MSA) together w...
The area of Subjectivity and Sentiment Analysis (SSA) has been witnessing a flurry of novel research. However, only a few attempts have been made to build SSA systems for the health domain. In the current study, we report efforts to partially bridge this gap. We present a new labeled corpus of professional articles collected from major Websites focus...
Advances in DNA microarray technology have motivated the research community to introduce sophisticated techniques for analyzing the resulting large-scale datasets. Biclustering techniques have been widely adapted for analyzing microarray gene expression data due to their ability to extract local patterns with a subset of genes that are similarly expres...
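A minimal sketch of biclustering on a synthetic expression matrix using scikit-learn's SpectralCoclustering; the synthetic data, cluster count, and library choice are assumptions for illustration rather than the algorithms surveyed or proposed in the paper.

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters

# Synthetic "expression matrix": 300 genes x 50 conditions with 4 planted biclusters.
data, rows, cols = make_biclusters(shape=(300, 50), n_clusters=4, noise=5, random_state=0)

model = SpectralCoclustering(n_clusters=4, random_state=0)
model.fit(data)

# Each gene (row) and condition (column) is assigned to one bicluster;
# genes sharing a label behave similarly under that subset of conditions.
print("genes in bicluster 0:", np.sum(model.row_labels_ == 0))
print("conditions in bicluster 0:", np.sum(model.column_labels_ == 0))
```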
Hidden Markov Models are widely used in speech recognition and bioinformatics systems. Conventional methods are usually used in the parameter estimation process of Hidden Markov Models (HMM). These methods are based on iterative procedures, like the Baum-Welch method, or on gradient-based methods. However, these methods can converge to local optimum parameter...
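To illustrate the local-optimum issue the abstract points to, the sketch below fits a Gaussian HMM with hmmlearn's EM (Baum-Welch) training several times from different random initializations and keeps the model with the highest log-likelihood. The restart strategy, toy data, and hmmlearn usage are illustrative assumptions, not the estimation method proposed in the paper.

```python
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)
# Toy 1-D observation sequence drawn from two regimes.
X = np.concatenate([rng.normal(0, 1, 200), rng.normal(5, 1, 200)]).reshape(-1, 1)

best_model, best_ll = None, -np.inf
for seed in range(5):
    # EM (Baum-Welch) only finds a local optimum, so restart from
    # several random initializations and keep the best fit.
    model = hmm.GaussianHMM(n_components=2, covariance_type="diag",
                            n_iter=100, random_state=seed)
    model.fit(X)
    ll = model.score(X)
    if ll > best_ll:
        best_model, best_ll = model, ll

print("best log-likelihood:", best_ll)
print("state means:", best_model.means_.ravel())
```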