Kamran Karimi
Ph.D., P.Eng.
University of Calgary
About
76 Publications
34,688 Reads
4,674 Citations
Additional affiliations
November 2012 - present
Publications (76)
As a model organism database, Xenbase has been providing informatics and genomic data on Xenopus (Silurana) tropicalis and Xenopus laevis frogs for more than a decade. The Xenbase database contains curated, as well as community-contributed and automatically harvested literature, gene and genomic data. A GBrowse genome browser, a BLAST+ server and s...
In this paper we discuss ways to reduce the execution time of a software Global Navigation Satellite System (GNSS) receiver that is meant for offline operation in a cloud environment. Client devices record satellite signals they receive, and send them to the cloud, to be processed by this software. The goal of this project is for each client reques...
We present a solution to the problem of understanding a system that produces a sequence of temporally ordered observations. Our solution is based on generating and interpreting a set of temporal decision rules. A temporal decision rule is a decision rule that can be used to predict or retrodict the value of a decision attribute, using condition att...
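To illustrate the rule form described above (the representation and names below are my own sketch, not TimeSleuth's actual format): a temporal decision rule predicts the value of a decision attribute at time t from condition attributes observed at earlier time steps.

```python
# Minimal sketch of applying a temporal decision rule to a record sequence.
# A rule maps (attribute, lag) conditions to a predicted decision value.

def apply_rule(rule, records, t):
    """Return the rule's prediction at time t, or None if it does not fire."""
    for (attr, lag), required in rule["conditions"].items():
        if t - lag < 0 or records[t - lag][attr] != required:
            return None
    return rule["predicts"]

# Hypothetical example: "if the heater was on one step ago, temperature rises".
rule = {"conditions": {("heater", 1): "on"}, "predicts": "rising"}
records = [{"heater": "on"}, {"heater": "off"}, {"heater": "on"}]
print(apply_rule(rule, records, 1))  # fires: heater was "on" at t=0
print(apply_rule(rule, records, 2))  # does not fire: heater was "off" at t=1
```

Retrodiction works the same way with negative lags, referencing later observations instead of earlier ones.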
It is desirable to automatically learn the effects of actions in an unknown environment. C4.5 has been used to discover associations, and it can also be used to find causal rules. Its output consists of rules that predict the value of a decision attribute using some condition attributes. Integrating C4.5's results in other applications usually requ...
We describe TimeSleuth, a hybrid tool based on the C4.5 classification software, which is intended for the discovery of temporal/causal rules. Temporally ordered data are gathered from observable attributes of a system, and used to discover relations among the attributes. In general, such rules could be atemporal or temporal. We evaluate TimeSleuth...
The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively-studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. Model organis...
Echinobase (www.echinobase.org) is a model organism knowledgebase serving as a resource for the community that studies echinoderms, a phylum of marine invertebrates that includes sea urchins and sea stars. Echinoderms have been important experimental models for over 100 years and continue to make important contributions to environmental, evolutiona...
Xenbase (https://www.xenbase.org/), the Xenopus model organism knowledgebase, is a web accessible resource that integrates the diverse genomic and biological data from research on the laboratory frogs Xenopus laevis and Xenopus tropicalis. The goal of Xenbase is to accelerate discovery and empower Xenopus research, enhance the impact of Xenopus res...
Background
Ontologies of precisely defined, controlled vocabularies are essential to curate the results of biological experiments such that the data are machine searchable, can be computationally analyzed, and are interoperable across the biomedical research continuum. There is also an increasing need for methods to interrelate phenotypic data easi...
Echinobase (www.echinobase.org) is a third generation web resource supporting genomic research on echinoderms. The new version was built by cloning the mature Xenopus model organism knowledgebase, Xenbase, refactoring data ingestion pipelines and modifying the user interface to adapt to multispecies echinoderm content. This approach leveraged over...
A keyword-based search of comprehensive databases such as PubMed may return irrelevant papers, especially if the keywords are used in multiple fields of study. In such cases, domain experts (curators) need to verify the results and remove the irrelevant articles. Automating this filtering process will save time, but it has to be done well enough to...
Echinobase (https://echinobase.org) is a central online platform that generates, manages and hosts genomic data relevant to echinoderm research. While the resource primarily serves the echinoderm research community, the recent release of an excellent quality genome for the frequently studied purple sea urchin (Strongylocentrotus purpuratus genome,...
Xenbase (www.xenbase.org) is a knowledge base for researchers and biomedical scientists that employ the amphibian Xenopus as a model organism in biomedical research to gain a deeper understanding of developmental and disease processes. Through expert curation and automated data provisioning from various sources Xenbase strives to integrate the body...
Rhythms of various periodicities drive cyclical processes in organisms ranging from single cells to the largest mammals on earth, and on scales from cellular physiology to global migrations. Molecular mechanisms that generate circadian behaviours in model organisms are well studied, but longer phase cycles and interactions between cycles with diffe...
We present an unprecedentedly comprehensive characterization of protein dynamics across early development in Xenopus laevis, available immediately via a convenient Web portal. This resource allows interrogation of the protein expression data in conjunction with other data modalities such as genome wide mRNA expression. This study provides detailed...
At a fundamental level most genes, signaling pathways, biological functions and organ systems are highly conserved between man and all vertebrate species. Leveraging this conservation, researchers are increasingly using the experimental advantages of the amphibian Xenopus to model human disease. The online Xenopus resource, Xenbase, enables human d...
With the advent of whole transcriptome and genome analysis methods, classifying samples containing multiple origins has become a significant task. Nucleotide sequences can be allocated to a genome or transcriptome by aligning sequences to multiple target sequence sets, but this approach requires extensive computational resources and also depends on...
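The abstract above is truncated before the method is named; as a hedged illustration of the alignment-free alternative it alludes to, one common approach is assignment by shared k-mer counts (all names and sequences below are hypothetical, not from the paper):

```python
# Alignment-free assignment of a read to the reference sharing the most k-mers.

def kmers(seq, k=4):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def assign(read, references, k=4):
    """Assign a read to the reference with the largest k-mer overlap."""
    read_kmers = kmers(read, k)
    return max(references,
               key=lambda name: len(read_kmers & kmers(references[name], k)))

references = {"genomeA": "ACGTACGTAA", "genomeB": "TTTTGGGGCC"}
print(assign("ACGTAC", references))  # "genomeA"
```

Unlike alignment, this only requires set intersections, which is why such methods scale to large read sets.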
Xenbase is the Xenopus model organism database (www.xenbase.org), a web-accessible resource that integrates the diverse genomic and biological data for Xenopus research. It hosts a variety of content including current and archived genomes for both X. laevis and X. tropicalis, bioinformatic tools for comparative genetic analyses including BLAST and...
Xenbase (www.xenbase.org) is an online resource for researchers utilizing Xenopus laevis and Xenopus tropicalis, and for biomedical scientists seeking access to data generated with these model systems. Content is aggregated from a variety of external resources and also generated by in-house curation of scientific literature and bioinformatic analys...
To explore the origins and consequences of tetraploidy in the African clawed frog, we sequenced the Xenopus laevis genome and compared it to the related diploid X. tropicalis genome. We characterize the allotetraploid origin of X. laevis by partitioning its genome into two homoeologous subgenomes, marked by distinct families of 'fossil' transposabl...
Xenbase, the Xenopus model organism database (www.xenbase.org), is a cloud-based, web accessible resource that integrates the diverse genomic and biological data from Xenopus research. Xenopus frogs are one of the major vertebrate animal models used for biomedical research, and Xenbase is the central repository for the enormous amount of data gener...
At the heart of databases is a data model referred to as a schema. Relational databases store information in tables, and the schema defines the tables and provides a map of relationships that show how the different table/data types relate to one another. In Xenbase, we were tasked to represent genomic, molecular, and biological data of both a diplo...
OpenCL, along with CUDA, is one of the main tools used to program GPGPUs. However, it allows running the same code on multi-core CPUs too, making it a rival for the long-established OpenMP. In this paper we compare OpenCL and OpenMP when developing and running compute-heavy code on a CPU. Both ease of programming and performance aspects are conside...
Xenbase (http://www.xenbase.org), the Xenopus frog model organism database, integrates a wide variety of data from this biomedical model genus. Two closely related species are represented: the allotetraploid Xenopus laevis that is widely used for microinjection and tissue explant-based protocols, and the diploid Xenopus tropicalis which is used for...
Iterative approaches to quantum computation are described. Incongruities in the behavior of the various individual elements in a quantum processor may be managed by establishing a set of equivalent configurations for the elements of the quantum processor. The quantum processor is programmed and operated using each equivalent configuration to determ...
Efforts to develop useful quantum computers have been blocked primarily by environmental noise. Quantum annealing is a scheme of quantum computation that is predicted to be more robust against noise, because despite the thermal environment mixing the system's state in the energy basis, the system partially retains coherence in the computational bas...
A virtual appliance contains a target application, and the running environment necessary for running that application. Users run an appliance using a virtualization engine, freeing them from the need to make sure that the target application has access to all its dependencies. However, creating and managing a virtual appliance, versus a stand-alone...
Many interesting but practically intractable problems can be reduced to that of finding the ground state of a system of interacting spins; however, finding such a ground state remains computationally difficult. It is believed that the ground state of some naturally occurring spin systems can be effectively attained through a process called quantum...
A superconducting chip containing a regular array of flux qubits, tunable interqubit inductive couplers, an XY-addressable readout system, on-chip programmable magnetic memory, and a sparse network of analog control lines has been studied. The architecture of the chip and the infrastructure used to control it were designed to facilitate the impleme...
Causality is a non-obvious concept that is often considered to be related to temporality. In this paper we present a number of past and present approaches to the definition of temporality and causality from philosophical, physical, and computational points of view. We note that time is an important ingredient in many relationships and phenomena. Th...
Adiabatic quantum optimization offers a new method for solving hard optimization problems. In this paper we calculate median adiabatic times (in seconds) determined by the minimum gap during the adiabatic quantum optimization for an NP-hard Ising spin glass instance class with up to 128 binary variables. Using parameters obtained from a realistic s...
CUDA and OpenCL are two different frameworks for GPU programming. OpenCL is an open standard that can be used to program CPUs, GPUs, and other devices from different vendors, while CUDA is specific to NVIDIA GPUs. Although OpenCL promises a portable language for GPU programming, its generality may entail a performance penalty. In this paper, we use...
This paper describes an algorithm for selecting parameter values (e.g. temperature values) at which to measure equilibrium properties with Parallel Tempering Monte Carlo simulation. Simple approaches to choosing parameter values can lead to poor equilibration of the simulation, especially for Ising spin systems that undergo first-order phase trans...
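For context, one of the simple approaches such selection algorithms improve upon is geometric spacing of temperatures between fixed endpoints (a sketch of the baseline, not the paper's adaptive algorithm):

```python
# Geometric temperature schedule: a common baseline for Parallel Tempering,
# known to equilibrate poorly near first-order phase transitions.

def geometric_schedule(t_min, t_max, n):
    """Return n temperatures geometrically spaced between t_min and t_max."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio**i for i in range(n)]

temps = geometric_schedule(0.5, 4.0, 8)  # 8 temperatures from 0.5 to 4.0
```

Adaptive methods instead place temperatures so that quantities such as replica-exchange rates stay roughly uniform across the schedule.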
Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed additionally, are less frequently discussed. In this paper, we present an analysis of several optimizations done on...
This paper presents two conceptually simple methods for parallelizing a Parallel Tempering Monte Carlo simulation in a distributed volunteer computing context, where computers belonging to the general public are used. The first method uses conventional multi-threading. The second method uses CUDA, a graphics card computing system. Parallel Temperin...
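Whichever parallelization is used, the replicas must periodically attempt exchanges; a minimal sketch of the standard Metropolis swap criterion (my own illustration, not code from the paper):

```python
import math

def swap_probability(e1, e2, t1, t2):
    """Metropolis acceptance probability for exchanging two replicas
    with energies e1, e2 held at temperatures t1, t2."""
    delta = (1.0 / t1 - 1.0 / t2) * (e1 - e2)
    return min(1.0, math.exp(delta))

# A cold replica (t1) stuck at high energy and a hot replica (t2) at low
# energy: the exchange handing the cold replica the lower energy is
# always accepted.
p = swap_probability(e1=-2.0, e2=-10.0, t1=0.5, t2=2.0)  # p == 1.0
```

Exchanges let configurations migrate between temperatures, which is what allows the cold replicas to escape local minima.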
The computation of data cubes is one of the most expensive operations in on-line analytical processing (OLAP). To improve efficiency, an iceberg cube represents only the cells whose aggregate values are above a given threshold (minimum support). Top-down and bottom-up approaches are used to compute the iceberg cube for a data set, but both have per...
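A minimal sketch of the iceberg-cube definition itself (ignoring the top-down/bottom-up pruning strategies the paper discusses): enumerate cells over every subset of dimensions and keep only those meeting minimum support. Attribute names are illustrative.

```python
from itertools import combinations
from collections import Counter

def iceberg_cube(rows, dims, min_support):
    """Count each cell over every non-empty subset of dimensions; keep
    only the cells whose count meets the minimum support threshold."""
    cells = Counter()
    for row in rows:
        for r in range(1, len(dims) + 1):
            for subset in combinations(dims, r):
                cells[tuple((d, row[d]) for d in subset)] += 1
    return {cell: n for cell, n in cells.items() if n >= min_support}

rows = [{"city": "Calgary", "year": 2004},
        {"city": "Calgary", "year": 2005},
        {"city": "Regina",  "year": 2005}]
cube = iceberg_cube(rows, ["city", "year"], min_support=2)
# Only (city=Calgary) and (year=2005) survive; all other cells are pruned.
```

The practical algorithms avoid this brute-force enumeration by pruning cells whose ancestors or descendants already fall below the threshold.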
Generating decision rule sets from observational data is an established branch of machine learning. Although such rules may be well-suited to machine execution, a human being may have problems interpreting them. Making inferences about the dependencies of a number of attributes on each other by looking at the rules is hard, hence the need to summar...
Developing parallel and distributed programs is usually considered a hard task. One has to have a good understanding of the problem domain, as well as the target hardware, and map the problem to the available hardware resources. The resulting program is often hard to port to another system. The development and maintenance process may thus be costly...
Evolutionary algorithms are an effective way of solving search problems. They usually operate in a forward temporal direction, where, as new members of the population are created, information about the previous members is lost. With reversible computing, no information is lost, and one can undo the effects of a computation, thus making it...
In this paper we propose a way to perform robotic self-recognition in static and quasi-static environments. Self-recognition is a process during which the robot discovers the effects of its own actions on the environment and itself. For example, how much would the robot move when its wheels turn once? Such information can be hard-coded into the rob...
In this thesis, we present a solution to the problem of discovering rules from sequential data. As part of the solution, the Temporal Investigation Method for Enregistered Record Sequences (TIMERS) and its implementation, the TimeSleuth software, are introduced. TIMERS uses the passage of time between attribute observations as justification for jud...
We present the Temporal Investigation Method for Enregistered Record Sequences II (TIMERS II), which can be used to classify the relationship between a decision attribute and a number of condition attributes as instantaneous, causal, or acausal. In this paper we consider it possible to refer to both previous and next values of attributes in tempora...
In this paper we present TIMERS II (Temporal Investigation Method for Enregistered Record Sequences II). Assuming that the effects take time to manifest, TIMERS II merges the input records and brings the causes and effects together. The output is in the form of a set of decision rules. The condition attributes' values could have been observed in th...
Deriving algorithms to solve problems is a main activity in computer science. Elaborate techniques are often devised to efficiently handle specific problems. This paper proposes the use of a general method of solving problems, called the Iterative Multi-Agent (IMA) Method, which assumes little about the specific problem at hand. In IMA the problem...
In this paper we propose a new algorithm, called 1DIMERS (One Dimensional Investigation Method for Enregistered Record Sequences), to mine rules in any data of sequential nature, temporal or spatial. We assume that each record in the sequence is at the same temporal or spatial distance from others, and we do not constrain the rules to follow any mo...
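The record-merging step such sequential miners depend on can be sketched as a sliding window that flattens neighbouring records into a single row, so an ordinary classifier can reference attribute values at several positions at once (attribute names below are illustrative, not 1DIMERS output):

```python
def windowed(records, window):
    """Flatten each run of `window` consecutive records into one row,
    suffixing attribute names with their offset back in the sequence."""
    rows = []
    for t in range(window - 1, len(records)):
        row = {}
        for lag in range(window):
            for attr, value in records[t - lag].items():
                row[f"{attr}_minus_{lag}"] = value
        rows.append(row)
    return rows

records = [{"x": 1}, {"x": 2}, {"x": 3}]
print(windowed(records, 2))
# [{'x_minus_0': 2, 'x_minus_1': 1}, {'x_minus_0': 3, 'x_minus_1': 2}]
```

The same flattening works for spatial sequences, with the lag interpreted as distance rather than time.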
In this paper we present the idea of using the direction of time to discover causality in temporal data. The Temporal Investigation Method for Enregistered Record Sequences (TIMERS), creates temporal classification rules from the input, and then measures the accuracy of the rules. It does so two times, each time assuming a different direction for t...
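The direction-of-time idea can be illustrated with a toy majority-vote predictor evaluated on a sequence and on its reversal (a sketch of the principle, not the TIMERS implementation):

```python
from collections import Counter, defaultdict

def transition_accuracy(seq):
    """Accuracy of predicting seq[t+1] from seq[t] using the
    majority-vote transition rules learned from the same sequence."""
    votes = defaultdict(Counter)
    for a, b in zip(seq, seq[1:]):
        votes[a][b] += 1
    rules = {a: c.most_common(1)[0][0] for a, c in votes.items()}
    hits = sum(rules[a] == b for a, b in zip(seq, seq[1:]))
    return hits / (len(seq) - 1)

# A toy sequence that is deterministic forward but ambiguous backward:
seq = ["b", "a", "a", "a", "a"]
forward = transition_accuracy(seq)        # 1.0: every transition predicted
backward = transition_accuracy(seq[::-1]) # 0.75: 'a' has two predecessors
```

The higher forward accuracy suggests the generating mechanism runs in that direction, which is the kind of asymmetry TIMERS measures.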
In this paper we propose a solution to the problem of distinguishing between causal and acausal temporal sets of rules. The method, called the Temporal Investigation Method for Enregistered Record Sequences (TIMERS), is explained and introduced formally. The input to TIMERS consists of a sequence of records, where each record is observed at regular...
The problem of determining whether the value of an attribute is caused by other observed attributes, or merely happens to co-occur with them, has been attacked from different angles. In this paper we propose a solution to the problem of distinguishing between causal and acausal temporal rules, and the system that generated the rules. The pro...
We introduce a method for finding temporal and atemporal relations in nominal, causal data. This method searches for relations among variables that characterize the behavior of a single system. Data are gathered from variables of the system, and used to discover relations among the variables. In general, such rules could be causal or acausal. We fo...
Discovering causal relations in a system is essential to understanding how it works and to learning how to control the behaviour of the system. RFCT is a causality miner that uses association relations as the basis for the discovery of causal relations. It does so by making explicit the temporal relationships among the data. RFCT uses C4.5 as its a...
Discovering causal and temporal relations in a system is essential to understanding how it works, and to learning to control the behaviour of the system. TimeSleuth is a causality miner that uses association relations as the basis for the discovery of causal and temporal relations. It does so by introducing time into the observed data. TimeSleuth u...
Finding the cause of things has always been a main focus of human curiosity. As part of a project at the Department of Computer Science at the University of Regina, we are using existing tools to extract causal (temporal) and association (non-temporal) rules from observational data. We have successfully used C4.5 to extract such rules. By a causa...
Decision trees are useful tools for classification and prediction purposes, and have been applied mostly to data that is void of any explicit notion of time. This covers many application areas where the data is about populations of the same entities, but is not suitable for cases where there is a temporal relation among the data. One case is when w...
It is desirable to automatically learn the effects of actions in an unknown environment. Using situation calculus in a causal domain is a very natural way of recording the actions and their effects. These could later be used for Automatic Programming purposes. A brute force approach to representing the situations involves recording the value of all...
The Iterative Multi-Agent (IMA) Method works by breaking down a search problem into many sub-problems, each to be solved separately. Independent problem solving agents work on the sub-problems and use iteration to handle any interaction between the sub-problems. Each agent knows about a subset of the whole problem and cannot solve it all by itself....
The similarity assessment process often involves measuring the similarity of objects X and Y in terms of the similarity of corresponding constituents of X and Y, possibly in a recursive manner. This approach is not useful when the verbatim value of the data is of less interest than what they can potentially "do," or where the objects of interest ha...
Observing the world and finding trends and relations among the variables of interest is an important and common learning activity. In this paper we apply TETRAD, a program that uses Bayesian networks to discover causal rules, and C4.5, which creates decision trees, to the problem of discovering relations among a set of variables in the controlled e...
This paper introduces a problem solving method involving independent agents and a set of partial solutions. In the Iterative Multi-Agent (IMA) method, each agent knows about a subset of the whole problem and can not solve it all by itself. An agent picks a partial solution from the set and then applies its knowledge of the problem to bring that par...
Linux is an easily available and powerful operating system, but it is based on a 70s design, making the need for the addition of more modern concepts apparent. This article lists the main characteristics of Distributed Inter-Process Communication (DIPC), a relatively simple system software that provides users of the Linux operating system with both...
Distributed Inter-Process Communication (DIPC) is a software-only solution to enable people to build and program Multi-Computers. Among other things, DIPC provides the programmers with Transparent Distributed Shared Memory. DIPC is not concerned with the incompatible executable code problem. It also does not change user's data, but DIPC has to unde...
Distributed Inter-Process Communication (DIPC) provides the programmers of the Linux operating system with distributed programming facilities, including Distributed Shared Memory (DSM). It works by making UNIX System V IPC mechanisms (shared memory, message queues and semaphores) network transparent, thus integrating neatly with the rest of the sys...