Pinar KARAGOZ

Middle East Technical University | METU · Department of Computer Engineering

Professor

165 Publications
19,491 Reads
1,813 Citations

Publications (165)
Article
Full-text available
Event detection from textual content by using text mining concepts is a well-researched field in the literature. On the other hand, graph modeling and graph embedding techniques in recent years provide an opportunity to represent textual contents as graphs. Text can be enriched with additional attributes in graphs, and the complex relationships can...
Article
According to the psychology literature, there is a strong correlation between people's personality traits and their linguistic behavior. Due to the increase in computer-based communication, individuals express their personalities in written form on social media. Hence, social media has become a convenient resource to analyze the relationship between th...
Article
Irony, which is a way of expression through the use of the opposite, commonly occurs in daily social media posts. Hence, automatic detection of irony is essential to understand the semantics of informal texts more accurately. The literature has several sentiment analysis studies on Turkish texts, but those focusing on irony detection are very few....
Article
Full-text available
The popularity of Location-based Social Networks (LBSNs) provides an opportunity to collect massive multimodal datasets that contain geographical information, as well as time and social interactions. Such data is a useful resource for generating personalized location recommendations. Such heterogeneous data can be further extended with notions of trust be...
Chapter
Irony detection is a text analysis problem aiming to detect ironic content. The methods in the literature are mostly for English text. In this paper, we focus on irony detection in Turkish and we analyze the explainability of neural models using Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME). The ana...
Article
Trajectory analysis and extraction of trajectory patterns are crucial to enhance marine safety and marine status awareness. The major data source for such analysis is Automatic Identification System (AIS), which publishes data related to movement of the ship while cruising. AIS broadcasts information including type of ship, identity number, state,...
Article
Process mining is an active research area that provides a wide range of automated process discovery, conformance checking and process enhancement solutions by extracting process information from event logs. With the emergence of new shared economy models and system architectures, the monolithic perspective of process mining, i.e. a single process within a...
Article
Wind energy, with its high potential, has an important place among renewable energy sources. Therefore, the number of investments in wind energy is increasing with new turbine technologies and solutions. For the investors of these technologies, how to determine the location of wind turbines for such investments is a challenging and critical problem...
Chapter
Intelligent data analysis techniques such as data mining or statistical/machine learning algorithms are applied to diverse domains, including energy informatics. These techniques have been successfully employed in order to solve different problems within the energy domain, particularly forecasting problems such as renewable energy and energy consum...
Article
Learning to rank is a supervised learning problem that aims to construct a ranking model for the given data. The most common application of learning to rank is to rank a set of documents against a query. In this work, we focus on point‐wise learning to rank, where the model learns the ranking values. Multivariate adaptive regression splines (MARS)...
Article
With the automated teller machine (ATM) cash replenishment problem, banks aim to reduce the number of out-of-cash ATMs and duration of out-of-cash status. On the other hand, they want to reduce the cost of cash replenishment, as well. The problem conventionally involves forecasting ATM cash withdrawals, and then cash replenishment optimization base...
Article
Full-text available
Advertisement recommendation on the Web is a popular research problem. For microblog platforms, different requirements arise due to the differences in the context of social media and social network. In this work, we propose an advertisement recommendation technique for microblogs. The proposed solution uses all contents of the messages (texts, capt...
Chapter
In this work, we focus on social interactions in communities in order to detect events. There are several previous efforts for the event detection problem based on analyzing the change in the network structure in terms of the overall network features. However, in this work, event detection is considered as a problem of change detection in community...
Chapter
Graphs are powerful data structures that allow us to represent varying relationships within data. In the past, due to the difficulties related to the time complexities of processing graph models, graphs were rarely involved in machine learning tasks. In recent years, especially with the new advances in deep learning techniques, an increasing number of graph m...
Chapter
The resources of time and memory space are limited in the data stream classification process. Hence, one should read the data only once and it is not possible to store the history as a whole. Therefore, when dealing with data streams, classification approaches in traditional data mining fall short and several enhancements are needed. In the literature,...
Article
Full-text available
The web provides a suitable medium for users to share opinions on various topics, including consumer products, events or news. In most of such content, authors express different opinions on different features (i.e., aspects) of the topic. It is a common practice to express a positive opinion on one aspect and a negative opinion on another aspect wit...
Article
Full-text available
With the increasing need for energy, the importance of renewable energy sources has also been increasing. In order to include the power produced by the wind into the electricity grid in a controlled manner, power prediction has an important role. To produce a reliable wind power forecast, obtaining Wind Power Plants’ (WPP) power generation data in...
Chapter
Intelligent data analysis techniques such as data mining or statistical/machine learning algorithms are applied to diverse domains, including energy informatics. These techniques have been successfully employed in order to solve different problems within the energy domain, particularly forecasting problems such as renewable energy and energy consum...
Conference Paper
Full-text available
Graph databases are gaining wide use as they provide flexible mechanisms to model real world entities and the relationships among them. In the literature, there exist several studies that evaluate the performance of graph databases and graph database query languages. However, there is limited work on comparing performance for graph database querying u...
Conference Paper
In this paper, we study the problem of topic adoption prediction for an author within a social academic network. The previous efforts on the problem use topic similarity and topic adoption of co-authors. We model the problem from an influence detection point of view, and propose that the influence on the author is an important factor. Hence, we def...
Conference Paper
Recommendation diversification, which is the action of suggesting dissimilar products, is an emerging and important issue in recommendation systems. Recent studies show that recommending diverse products increases customer satisfaction. This is valid for advertisement recommendation as well. In this study, we investigate the diversification perform...
Chapter
In this work, we model the problem of online event detection in microblogs as a stateful stream processing problem and offer a novel solution that balances result accuracy and performance. Our new approach builds on two state-of-the-art algorithms. The first algorithm is based on identifying bursty keywords inside blocks of blog messages. The secon...
Preprint
Full-text available
This work investigates segmentation approaches for sentiment analysis on informal short texts in Turkish. The two building blocks of the proposed work are segmentation and a deep neural network model. Segmentation focuses on preprocessing of text with different methods. These methods are grouped into four: morphological, sub-word, tokenization, and hyb...
Chapter
As a subproblem of sentiment analysis, aspect based sentiment analysis aims to extract distinct opinions for different aspects of a case in a given text. When the case is a product review, it is possible to understand the reviewer’s opinion on features of the product, rather than the product in general. Then, a product feature can be associated wit...
Chapter
Automated Teller Machine (ATM) replenishment is a well-known problem in the banking industry. Banks aim to improve customer satisfaction by reducing the number of out-of-cash ATMs and duration of out-of-cash status. On the other hand, they want to reduce the cost of cash replenishment as well. The problem conventionally has two components: forecasting A...
Chapter
Multi-relational data mining (MRDM) is concerned with discovering hidden patterns from multiple tables in a relational database. One of the most commonly addressed tasks in MRDM is concept discovery in which the problem is inducing logical definitions of a specific relation, called target relation, in terms of other relations, called background kno...
Book
Many approaches have sprouted from artificial intelligence (AI) and produced major breakthroughs in the computer science and engineering industries. Deep learning is a method that is transforming the world of data and analytics. Optimization of this new approach is still unclear, however, and there’s a need for research on the various applications...
Chapter
Cities offer a large variety of Points of Interest (POI) for leisure, tourism, culture, and entertainment. This offering is exciting and challenging, as it requires people to search for POIs that satisfy their preferences and needs. Finding such places gets tricky as people gather in groups to visit the POIs (e.g., friends, family). Moreover, a gro...
Conference Paper
It is widely known that the generation and consumption of electricity should be balanced for secure operation and maintenance of the electricity grid. In order to help achieve this balance in the grid, the renewable energy resources such as wind and stream-flow should be forecast at high accuracies on the generation side, and similarly, electricity...
Article
Full-text available
A recent “third wave” of neural network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, work in this area is often referred to as deep learning. Recent year...
Conference Paper
Full-text available
Movie plot summaries are expected to reflect the genre of movies since many spectators read the plot summaries before deciding to watch a movie. In this study, we perform movie genre classification from plot summaries of movies using bidirectional LSTM (Bi-LSTM). We first divide each plot summary of a movie into sentences and assign the genre of...
Conference Paper
Full-text available
Performing first-aid skills correctly and without losing time is life-sustaining and it requires the use of current techniques, practical applications, realistic scenarios and experienced staff. Furthermore, it is important that the training can be repeated frequently so that the learned techniques are not forgotten and they can be applied in a cal...
Conference Paper
Effective use of renewable energy sources, and in particular wind energy, is of paramount importance. Compared to other renewable energy sources, wind is so fluctuating that it must be integrated into the electricity grid in a planned way. Wind power forecast methods have an important role in this integration. These methods can be broadly classified...
Conference Paper
The web provides a suitable medium for users to post comments on different topics. In most of such content, authors express different opinions on different features or aspects of the topic. Aspect based sentiment analysis determines which opinion is expressed for which aspect. Once aspects are available, the next important step is to m...
Article
Full-text available
Identifying similarities in microblog posts for event detection poses challenges due to short texts with idiosyncratic spellings, irregular writing styles, abbreviations and synonyms. In order to overcome these challenges, we present an enhancement to the incremental clustering techniques by detecting similar terms in microblog posts in a temporal...
Article
Full-text available
Detection of events using voluntarily generated content in microblogs has been the objective of numerous recent studies. One essential challenge tackled in these studies is estimating the locations of events. In this paper, we review the state-of-the-art location estimation techniques used in the localization of events detected in microblogs, parti...
Article
Nowadays, many businesses, such as banks, use direct marketing methods to reach customers to minimize the campaigning cost and maximize the return rate. To achieve this, huge customer data should be analyzed to determine the most appropriate product offer for each customer and the most effective channel to reach her/him. However, since only a very...
Conference Paper
Event detection from social media messages is conventionally based on clustering the message contents. The most basic approach is representing messages in terms of term vectors that are constructed through traditional natural language processing (NLP) methods and then assigning weights to terms generally based on frequency. In this study, we use ne...
Conference Paper
Full-text available
Detecting irony in texts attracts computer scientists' attention as a recent research problem. Automatic detection of irony on microblog texts, i.e., microposts, poses additional challenges. Microposts have a limited number of characters and generally include typing errors; therefore, traditional methods of text mining cannot be applied easily. This...
Article
Full-text available
The vocabulary mismatch problem is a long-standing problem in information retrieval. Semantic matching holds the promise of solving the problem. Recent advances in language technology have given rise to unsupervised neural models for learning representations of words as well as bigger textual units. Such representations enable powerful semantic mat...
Conference Paper
Full-text available
In this work, we focus on improving the time efficiency of Inductive Logic Programming (ILP)-based concept discovery systems. Such systems have scalability issues mainly due to the evaluation of large search spaces. Evaluation of the search space consists of translating candidate concept descriptors into SQL queries, which involve a number of equijoins o...
Conference Paper
In this article, a new unsupervised feature extraction method for aspect-based sentiment analysis is proposed. This method improves the performance of frequency based feature extraction by using an online search engine. Although frequency based feature extraction methods produce good precision and recall values on formal texts, they are not very su...
Article
Full-text available
There are many parameters that may affect the navigation behaviour of web users. Prediction of the potential next page that may be visited by the web user is important, since this information can be used for prefetching or personalization of the page for that user. One of the successful methods for the determination of the next web page is to const...
Article
Detecting real-world events by following posts in microblogs has been the motivation of numerous recent studies. In this work, we focus on the spatio-temporal characteristics of events detected in microblogs, and propose a method to estimate their locations using the Dempster–Shafer theory. We utilize three basic location-related features of the po...
Conference Paper
Location-based social networks (LBSN) enable users to check in their current location and share it with other users. The accumulated check-in data can be employed for the benefit of users by providing personalized recommendations. In this paper, we propose a random walk based context-aware friend recommendation algorithm (RWCFR). RWCFR cons...
Conference Paper
Location-based social networks, as one of the platforms in this field, have been providing services and facilities that enhance the user experience of exploring surroundings and new places. Among current services, point of interest (POI) recommendation and activity recommendation draw significant attention from users, which makes it a potential field of t...
Conference Paper
Full-text available
The performance of result diversification for tweet search suffers from the well-known vocabulary mismatch problem, as tweets are too short and usually informal. As a remedy, we propose to adopt a query and tweet expansion strategy that utilizes automatically-generated word embeddings. Our experiments using state-of-the-art diversification methods...
Conference Paper
The pervasiveness of location-acquisition technologies has made location-based social networks (LBSN) increasingly popular in recent years. Users are able to check in their current location and share information with other users through these networks. LBSN check-in data can be used for the benefit of users by providing personalized recomme...
Article
Full-text available
High utility sequential pattern mining has been considered an important research problem and a number of relevant algorithms have been proposed for this topic. The main challenge of high utility sequential pattern mining is that the search space is large and the efficiency of the solutions is directly affected by the degree at which they can el...
Article
Full-text available
In recent years, using cell phone log data to model human mobility patterns has become an active research area. This is a challenging data mining problem due to the huge size and non-uniformity of the log data, which introduces several granularity levels for the specification of temporal and spatial dimensions. This paper focuses on the prediction...
Conference Paper
Efficient integration of renewable energy sources into the electricity grid has become one of the challenging problems in recent years. This issue is more critical especially for unstable energy sources such as wind. The focus of this work is the performance analysis of several alternative wind forecast combination models in comparison to the curre...
Conference Paper
Full-text available
Concept discovery is a multi-relational data mining task for inducing definitions of a specific relation in terms of other relations in the data set. Such learning tasks usually have to deal with large search spaces and hence have efficiency and scalability issues. In this paper, we present a hybrid approach that combines association rule mining me...
Article
The location-based social networks (LBSN) enable users to check in their current location and share it with other users. The accumulated check-in data can be employed for the benefit of users by providing personalized recommendations. In this paper, we propose a context-aware location recommendation system for LBSNs using a random walk approach. Ou...
Article
Extracting patterns from web usage data helps to facilitate better web personalization and web structure readjustment. The classical frequency-based sequence mining techniques consider only the binary occurrences of web pages in sessions that result in the extraction of many patterns that are not informative for users. To handle this problem, utili...
Conference Paper
In multi-relational data mining, concept discovery is the problem of inducing definitions of a relation in terms of other relations provided. In this paper, we present a method that combines graph-based and association rule mining-based methods for concept discovery in graphs. The proposed method is related to graphs as the data, which is initi...
Conference Paper
Even though process-aware information systems are intensively utilized in organizations, traditional process management paradigms mainly concentrate on the design and configuration phases. Instead of starting with a process design, process mining attempts to discover interesting patterns from process enactment, namely event logs, and extract bus...
Conference Paper
Due to the increasing use of mobile phones and their increasing capabilities, a huge amount of usage and location data can be collected. Location prediction is an important task for mobile phone operators and smart city administrations to provide better services and recommendations. In this work, we propose a sequence mining based approach for locati...
Article
As a result of increasing population and growing technological activities, nonrenewable energy sources, which are the main energy providers, are diminishing day by day. Due to this factor, efforts on efficient utilization of renewable energy sources have increased all over the world. Wind is one of the most significant alternative energy resource...
Conference Paper
Full-text available
Provenance traces captured by scientific workflows can be useful for designing, debugging and maintenance. However, our experience suggests that they are of limited use for reporting results, in part because traces do not comprise domain-specific annotations needed for explaining results, and the black-box nature of some workflow activities. We sho...
Article
Inductive Logic Programming (ILP)-based concept discovery systems aim to find patterns that describe a target relation in terms of other relations provided as background knowledge. Such systems usually work within a first-order logic framework, build large search spaces, and have long running times. Memoization has widely been incorporated in concept...
Article
Given the wide use of Twitter as a source of information, finding an interesting tweet for a user among a large number of tweets is challenging. In this work, we propose a Named Entity Recognition (NER) based user profile modeling for Twitter users and employ this model to generate personalized tweet recommendations. Effectiveness of the proposed meth...