
Ali Cakmak
Istanbul Technical University · Department of Computer Engineering
Ph.D.
Looking for ambitious and hardworking graduate research students in bioinformatics and applied machine learning.
About
48 Publications
13,664 Reads
348 Citations
Introduction
My primary research interests focus on mining, computational analysis, and management of data in different fields with a special focus on biological and health data.
Additional affiliations
October 2008 - October 2009
Publications (48)
This study explores the machine learning-based assessment of predisposition to colorectal cancer based on single nucleotide polymorphisms (SNPs). Such a computational approach may be used as a risk indicator and an auxiliary diagnosis method that complements traditional methods such as biopsy and CT scans. Moreover, it may be used to develop a lo...
Most of the popular Big Data analytics tools evolved to adapt their working environment to extract valuable information from a vast amount of unstructured data. The ability of data mining techniques to filter this helpful information from Big Data led to the term ‘Big Data Mining’. Shifting the scope of data from small-size, structured, and stable...
Sequence patterns are frequently employed in many expert system applications in a wide range of domains from bioinformatics to smart homes and stock market analysis. Regular sequence patterns fail to express whether two consecutive items in a pattern are occurring right after each other in all pattern occurrences in an item database or not. Such a...
The metabolic wiring of patient cells is altered drastically in many diseases, including cancer. Understanding the nature of such changes may pave the way for new therapeutic opportunities as well as the development of personalized treatment strategies for patients. In this paper, we propose an algorithm called Metabolitics, which allows systems-le...
Accurately identifying organisms based on their partially available genetic material is an important task to explore the phylogenetic diversity in an environment. Specific fragments in the DNA sequence of a living organism have been defined as DNA barcodes and can be used as markers to identify species efficiently and effectively. The existing DNA...
Twitter is an online social networking website where people can post short messages on any subject, and these messages become visible to other users. Users intentionally express their opinions about companies or products via micro-blogging texts. Analyzing such messages might help explore what customers think about company products, or what the bro...
Spatiotemporal soccer data enables in-depth analysis of a soccer game. However, the amount and the nature of the data make it challenging for analysts to easily uncover insights from the data. In this article, we introduce an interactive visualization tool that uses novel data mining and machine learning methods to enable coaches and analysts to w...
Accurate cost and time estimation of a query is one of the major success indicators for database management systems. SQL allows flexible queries to be expressed on text-formatted data. The LIKE operator is used to search for a specified pattern (e.g., LIKE “luck%”) in a string database. It is vital to estimate the selectivity of such flexible predicates...
The emerging data explosion in the sports field has created new opportunities to practice data science and analytics for deeper and larger-scale analysis of games. With 22 players collaborating and competing on the field, soccer is often considered a complex system. More specifically, each game is usually modeled as a network with players as nodes,...
Based on their skills and interests, students' success in courses may differ greatly. Predicting student success in courses before they take them may be important. For instance, students may choose elective courses that they are likely to pass with good grades. Besides, instructors may have an idea about the expected success of students in a class,...
An analysis and training system comprising a processor unit that can access a recorded-positions database containing movement data, such as the type of each movement (e.g., pass or shot) performed by the players in at least one sports competition, the dominant factors that affect the realization of the movements, and the result of each movement,...
Predicting promising academic papers is useful for a variety of parties, including researchers, universities, scientific councils, and policymakers. Researchers may benefit from such data to narrow down their reading list and focus on what will be important, and policymakers may use predictions to infer rising fields for a more strategic distributi...
Metabolic networks have become one of the centers of attention in life sciences research with the advancements in the metabolomics field. A vast array of studies analyzes metabolites and their interrelations to seek explanations for various biological questions, and numerous genome-scale metabolic networks have been assembled to serve for this purp...
Integration of metabolic pathways resources and metabolic network models, and deploying new tools on the integrated platform is useful for systems biology research on understanding the regulation of metabolic networks. PathCase-SB is such an integrative web-based application, providing a database-enabled framework and tools towards effective and...
Integration of metabolic pathways resources and metabolic network models, and deploying new tools on the integrated platform can help perform more effective and more efficient systems biology research on understanding the regulation of metabolic networks. Therefore, the tasks of (a) integrating under a single database environment regulatory metabol...
With the recent advances in experimental technologies, such as gas chromatography and mass spectrometry, the number of metabolites that can be measured in biofluids of individuals has markedly increased. Given a set of such measurements, a very common task encountered by biologists is to identify the metabolic mechanisms that lead to changes in the...
Integration of metabolic pathways resources and regulatory metabolic network models, and deploying new tools on the integrated platform can help perform more effective and more efficient systems biology research on understanding the regulation in metabolic networks. Therefore, the tasks of (a) integrating under a single database environment regulat...
With recent advances in experimental technologies, the number of metabolites measured in bio-fluids of organisms has markedly increased. Given a set of measurements, a common metabolomics task is to identify the metabolic mechanisms that lead to changes in the concentrations of given metabolites, and interpret the metabolic consequences of the obse...
Metabolism is a representation of the biochemical principles that govern the production, consumption, degradation, and biosynthesis of metabolites in living cells. Organisms respond to changes in their physiological conditions or environmental perturbations (i.e. constraints) via cooperative implementation of such principles. Querying inner working...
Querying biochemical networks in flexible ways over the web is important to facilitate ongoing biological research. In this paper, we present a querying interface for biological networks, more specifically, metabolic networks. The interface allows for the specification of a large class of containment, path, and neighborhood queries with ease from a...
The emerging field of metabolomics enables researchers to measure concentrations of large numbers of metabolites in biofluids, and to interpret them in connection with the underlying metabolic network, which poses a significant challenge for manual analysis. Given a set of observations on metabolite concentration changes, our goal in this study is...
This paper focuses on the use of labeling schemes for evaluating queries on DAG structured data, such as pedigrees and ontologies that are stored in a relational database. We compare using Dewey+ labeling, NodeCodes and its variants for the evaluation of ancestor/descendant queries on ontologies and inbreeding coefficient calculation on pedigrees....
With the development of improved and cost-effective technologies, it is now possible to detect thousands of metabolites in biofluids or specific organs, and reliably quantify their amounts. Metabolomics focuses on studying the concentrations of metabolites in a cell or a tissue. In this paper, we describe a prototype Web-based metabolomics data ana...
Information about individuals on publicly available Web sites stands as a valuable, yet unorganized, data source. Turning such an enormous data source into a “database” is highly desirable as it has the potential to lead to novel ways of using the available information to the largest extent. In this paper, we present PopulusLog, a novel Web...
We identify two issues with searching literature digital collections within digital libraries: (a) there are no effective paper-scoring and ranking mechanisms. Without a scoring and ranking system, users are often forced to scan a large and diverse set of publications listed as search results and potentially miss the important ones. (b) Topic diffu...
As the blueprints of cellular actions, biological pathways characterize the roles of genomic entities in various cellular mechanisms, and as such, their availability, manipulation and queriability over the web is important to facilitate ongoing biological research.
In this article, we present the new features of PathCase, a system to store, query,...
Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the gene...
New graph structures where node labels are members of hierarchically organized ontologies or taxonomies have become commonplace in different domains, e.g., life sciences. It is a challenging task to mine for frequent patterns in this new graph model which we call taxonomy-superimposed graphs, as there may be many patterns that are implied by the ge...
Biological pathways provide significant insights into the interaction mechanisms of molecules. Presently, many essential pathways still remain unknown or incomplete for newly sequenced organisms. Moreover, experimental validation of enormous numbers of possible pathway candidates in a wet-lab environment is time- and effort-intensive. Thus, there is...
Functional characterizations of pathways provide new opportunities in defining, understanding, and comparing existing biological pathways, and in helping discover new ones in different organisms. In this paper, we present and evaluate computational techniques for categorizing pathways, based upon the Gene Ontology (GO) annotations of enzymes within...
Context-based literature digital library search is a new search paradigm that creates an effective ranking of query outputs by controlling query output topic diversity. We define contexts as pre-specified ontology-based terms and locate the paper set of a context based on semantic properties of the context (ontology) term. In order to provide a com...
Annotating genes with Gene Ontology (GO) terms is crucial for biologists to characterize the traits of genes in a standardized way. However, manual curation of textual data, the most reliable form of gene annotation by GO terms, requires significant amounts of human effort, is very costly, and cannot catch up with the rate of increase in biomedical...
Biological Web data sources have now become essential information sources for researchers. However, their use is tedious, labor-intensive, and repetitive, and may involve the integration of data from multiple Web data sources. In this paper, as a first step towards the full integration of Web data sources, we propose a framework that allows an int...
Signaling pathways are chains of interacting proteins, through which the cell converts a (usually) extracellular signal into a biological response. The number of known signaling pathways in the biological literature and on the Web has been increasing at a very high rate, thus creating a need for efficient ways of storing, visualizing, querying, an...
Digital libraries do not assign importance/relevance scores to their publications, authors, or publication venues, even though scores are potentially useful for (a) providing comparative assessment, or “importances”, of publications, authors, publication venues, (b) ranking publications returned in search outputs, and (c) using scores in locating s...
Publication searching based on keywords provided by users is traditional in digital libraries. While useful in many circumstances, the success of locating related publications via the keyword-based searching paradigm is influenced by how users choose their keywords. Example-based searching, where the user provides an example publication to locate similar p...
A metabolic network describes how different cellular processes (i.e., pathways) are connected and work together as part of a metabolism. The complexity of metabolisms and their working principles increase dramatically in higher order organisms (e.g., mammals) in comparison to relatively lower level organisms (e.g., prokaryotes). As the complexit...