# Fatemeh Zare-Mirakabad

Amirkabir University of Technology, Department of Mathematics and Computer Science

Research Items (30)

Background:
With high-quality genome sequences now widely available, accurate and efficient reference-based assembly methods are strongly required. Many algorithms have been designed for mapping reads onto a genome sequence, each aiming to improve the accuracy of the reconstructed genome. One of the challenges in this problem arises when reads align to multiple locations because of repetitive regions in the genome.
Results:
In this paper, our goal is to decrease the error rate of rebuilt genomes by resolving multi-mapping reads. To this end, we reduce the search space for reads that align against the genome with mismatches, insertions or deletions, which lowers the probability of incorrect read mapping. We propose a pipeline divided into three steps: ExactMapping, InExactMapping, and MergingContigs, in which exactly and inexactly matching reads are aligned in two separate phases. We test our pipeline on simulated and real data sets using several read mappers. The results show that two-step mapping of reads onto the contigs generated by a mapper such as Bowtie2, BWA or Yara is effective in reducing the error rate of the contigs.
Conclusions:
The assessment of our pipeline suggests that reducing the error rate of read mapping not only improves the genomes reconstructed by reference-based assembly in a reasonable running time, but can also improve the genomes generated by de novo assembly. By decreasing the number of multi-mapping reads, our pipeline produces genomes comparable to those of MMR, a tool for resolving multi-mapping reads. Consequently, we introduce EIM as a post-processing step for genomes reconstructed by mappers.

- Oct 2017

Finding an effective measure for predicting a more accurate RNA secondary structure is a challenging problem. In the last decade, an experimental method known as selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) was proposed to measure the tendency of almost every nucleotide in an RNA sequence to form a base pair. These SHAPE reactivities are then used to improve the accuracy of RNA structure prediction. Given the significant impact of SHAPE reactivity, and in order to reduce experimental costs, we propose a new model called HL-k-mer. This model simulates the SHAPE reactivity of each nucleotide in an RNA sequence by fetching the SHAPE reactivities of all sub-sequences of length k (k-mers) appearing in helix and loop regions. To evaluate the quality of the simulated SHAPE data, the ESD-Fold method is run on the SHAPE data simulated by the HL-k-mer model ([Formula: see text], 3, 4). Three further methods are employed for additional evaluation of the simulated SHAPE data. We also extend this model to simulate SHAPE data for pseudoknotted RNA structures. The results indicate that the average prediction accuracies obtained with the SHAPE data simulated by our models (for [Formula: see text], 3) are higher than those obtained with the experimental SHAPE data.

- Oct 2017

DNA double-strand breaks (DSBs) are the most lethal DNA lesions induced by ionizing radiation, industrial chemicals and a wide variety of drugs used in chemotherapy. In modelling the DNA damage response system, uncertainty may arise in several ways, such as the number of induced DSBs, the kinetic rates and the measurement error in observable quantities. Stochastic approaches are therefore needed to gain further insight into the dynamic behaviour of the DSB repair process. In this article, a continuous-time Markov chain (CTMC) model of the non-homologous end joining (NHEJ) mechanism is formulated according to DSB complexity. Additionally, a Metropolis Monte Carlo method is used to perform maximum likelihood estimation of the kinetic rate constants. The effects of fluctuating kinetic rates and of the DSB induction rate on the NHEJ mechanism are investigated. Stochastic realizations of the total yield of ligated simple and complex DSBs are simulated to compare their asymptotic dynamics. Furthermore, it is proved that the total yield of DSBs is normally distributed for a sufficiently large number of DSBs. To estimate the expected duration of DSB repair, the probability distribution of the DSB lifetime is calculated from the CTMC NHEJ model. Moreover, the variability of the total yield of DSBs during constant low-dose radiation is evaluated in the presented model. The findings indicate that, in the stochastic NHEJ model, all DSBs are eventually repaired when no new DSBs are induced during the repair process. However, when DSBs are induced by constant low-dose radiation, a number of DSBs remain unrepaired.
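The contrast between the two regimes described above can be illustrated with a much simpler chain than the NHEJ model itself. The following is a minimal sketch, assuming a single-species birth-death process in which each DSB is repaired independently at a constant rate and new DSBs arrive at a constant rate; the function name, the parameters and the one-step repair are illustrative assumptions, not the model from the article.

```python
import random


def simulate_dsb_count(n0, repair_rate, induction_rate, t_end, seed=0):
    """Gillespie simulation of a hypothetical birth-death chain for DSBs.

    Each DSB is repaired at `repair_rate`; new DSBs are induced at the
    constant `induction_rate`. Returns the DSB count at time `t_end`.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    while True:
        total_rate = repair_rate * n + induction_rate
        if total_rate == 0.0:
            break  # absorbing state: nothing left to repair, no induction
        dt = rng.expovariate(total_rate)  # waiting time to the next event
        if t + dt > t_end:
            break
        t += dt
        # Choose the event in proportion to its rate.
        if rng.random() * total_rate < repair_rate * n:
            n -= 1  # one DSB repaired
        else:
            n += 1  # one new DSB induced
    return n


no_induction = simulate_dsb_count(50, 1.0, 0.0, t_end=1e6)
with_induction = simulate_dsb_count(50, 1.0, 5.0, t_end=50.0)
```

Without induction the chain is a pure death process and is absorbed at zero; with constant induction the count keeps fluctuating around a non-zero level.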

It has long been established that, in addition to being involved in protein translation, RNA plays essential roles in numerous other cellular processes, including gene regulation and DNA replication. Such roles are known to be dictated by the higher-order structures of RNA molecules. It is therefore of prime importance to find an RNA sequence that folds so as to acquire a particular function that is desirable for use in pharmaceuticals and basic research. The challenge of finding an RNA sequence for a given structure is known as the RNA design problem. Although there are several algorithms for this problem, they mainly consider hard constraints, such as minimum free energy, to evaluate the predicted sequences. Recently, SHAPE data has emerged as a new soft constraint for RNA secondary structure prediction. To take advantage of this experimental constraint, we report a new method for the accurate design of RNA sequences from their secondary structures, using SHAPE data as pseudo-free energy. We then compare our algorithm with four others: INFO-RNA, ERD, MODENA and RNAiFold 2.0. Our algorithm precisely predicts 26 out of 29 new sequences for the structures extracted from the Rfam dataset, while the other four algorithms predict no more than 22 out of 29. The proposed algorithm is comparable to these algorithms on the RNA-SSD datasets, where they predict up to 33 appropriate sequences for 34 RNA secondary structures.

- May 2017
- 25th Iranian Conference on Electrical Engineering (ICEE)

Transcription factor binding sites on human DNA are the target locations of specific proteins called transcription factors. Gene expression begins when a transcription factor binds to its target location in the genome. Expensive experimental methods can identify only a limited number of these binding sites, hence there is an essential need for computational algorithms. In this paper, we train a backpropagation neural network to identify SP1 factor binding sites on human chromosome 1. The biological data were extracted from the NCBI database, which includes a wide variety of genetic information on human and other species. To compare the performance of our trained neural network with other classification algorithms, we use Support Vector Machine, Discriminant Analysis and K-Nearest Neighbor classifiers on the same data. The results show that our trained neural network outperforms the other classification algorithms.

Background
Because protein function depends on structure, two challenging problems are studied: Protein Structure Prediction (PSP) and Inverse Protein Folding (IPF). Despite its essential applications, IPF has not been investigated as much as PSP. The ultimate goal of IPF, or protein design, is to create proteins with enhanced properties or even novel functions. One of the major computational challenges in protein design is its large sequence space: searching through all plausible sequences is impossible. Since protein secondary structure provides an appropriate primary scaffold for the protein conformation, studying the Protein Secondary Structure Inverse Folding (PSSIF) problem is an important step forward in protein design, as it can reduce the search space. In this paper, a novel genetic algorithm that uses native secondary sub-structures is proposed to solve the PSSIF problem. In essence, evolutionary information can lead the algorithm to design amino acid sequences appropriate to the target secondary structures. Furthermore, the designed sequences can be folded into tertiary structures that closely resemble their reference 3D structures.
Results
The proposed algorithm, called GAPSSIF, benefits from evolutionary information obtained from solved proteins in the PDB. We therefore construct a repository of protein secondary sub-structures to accelerate convergence of the algorithm. The secondary structures of the sequences designed by GAPSSIF are comparable with those obtained by Evolver and EvoDesign. Although we do not explicitly consider tertiary structure features in the algorithm, the structural similarity between native and designed sequences is acceptable.
Conclusions
Using the evolutionary information of native structures can significantly improve the quality of designed sequences. In fact, combining this information with effective features such as solvent accessibility and torsion angles leads to an efficient solution of the IPF problem. GAPSSIF can be downloaded at http://bioinformatics.aut.ac.ir/GAPSSIF/.

Background
Non-coding RNAs perform a wide range of functions inside living cells, and these functions are related to their structures. Several algorithms have been proposed to predict RNA secondary structure based on minimum free energy. The low prediction accuracy of these algorithms indicates that free energy alone is not sufficient to predict the functional secondary structure. Recently, information obtained from SHAPE experiments has greatly improved the accuracy of RNA secondary structure prediction when it is added to the thermodynamic free energy as a pseudo-free energy term.
Method
In this paper, a new method is proposed to predict RNA secondary structure based on both free energy and SHAPE pseudo-free energy. For each RNA sequence, a population of secondary structures is constructed and their SHAPE data are simulated. Then, an evolutionary algorithm is used to improve each structure based on both free and pseudo-free energies. Finally, the structure with the minimum sum of free and pseudo-free energies is taken as the predicted RNA secondary structure.
Results and Conclusions
Computationally simulating the SHAPE data for a given RNA sequence requires its secondary structure. Here, we overcome this limitation by employing a population of secondary structures. This allows us to simulate the SHAPE data for any RNA sequence and consequently improves the accuracy of RNA secondary structure prediction, as confirmed by our experiments. The source code and web server of our proposed method are freely available at http://mostafa.ut.ac.ir/ESD-Fold/.
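To make the pseudo-free energy idea concrete, the sketch below converts a per-nucleotide SHAPE reactivity into an energy term using the log-linear form widely used in SHAPE-directed folding, ΔG = m·ln(reactivity + 1) + b. The abstract does not state which form or parameters ESD-Fold uses, so the formula, the default values m = 2.6 and b = -0.8 kcal/mol, and the function names are assumptions for illustration only.

```python
import math


def shape_pseudo_energy(reactivity, m=2.6, b=-0.8):
    """Pseudo-free energy (kcal/mol) for one nucleotide in a base pair.

    Assumed log-linear form: m * ln(reactivity + 1) + b.
    Negative reactivities are treated as missing data and contribute 0.
    """
    if reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def total_energy(free_energy, paired_reactivities):
    """Sum of thermodynamic free energy and SHAPE pseudo-free energies.

    `paired_reactivities` holds the reactivities of the paired
    nucleotides in a candidate structure (hypothetical input).
    """
    return free_energy + sum(shape_pseudo_energy(r) for r in paired_reactivities)


# A weakly reactive nucleotide is rewarded, a highly reactive one penalised.
low, high = shape_pseudo_energy(0.0), shape_pseudo_energy(1.0)
```

Under this form, pairing a nucleotide with low reactivity lowers the combined energy, while pairing a highly reactive nucleotide raises it.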

ESD-Fold: RNA Folding Based on Simulated SHAPE Data.

Advances in DNA sequencing technology have generated a vast amount of new sequence data. It is essential to understand the functions, features, and structures of every newly sequenced genome, and analyzing sequence data by different methods can provide important information about it. One of the essential tasks in genome annotation is gene prediction, which helps to characterize genes and determine their functions. A key step towards correct gene structure prediction is accurate splice site detection. There are many splice site prediction methods, but only a few of them can be incorporated into gene prediction modules because of their complexity. In this paper, a novel model is presented to recognize unknown splice sites in a new genome without using any prior knowledge. Our model integrates the Jensen-Shannon divergence with a polynomial equation of order 2. The proposed model is evaluated on the yeast genome to predict splice sites. The experimental results suggest that the proposed method is an effective approach for splice site prediction.
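For readers unfamiliar with the Jensen-Shannon divergence, the following minimal sketch computes it between two nucleotide frequency distributions. How the paper combines the divergence with the order-2 polynomial is not described in the abstract, so only the divergence itself is shown; the function names and example distributions are illustrative.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between distributions p and q.

    JSD(p, q) = H((p + q) / 2) - (H(p) + H(q)) / 2, bounded in [0, 1].
    """
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# Frequencies over (A, C, G, T) on the two sides of a candidate boundary.
upstream = [0.25, 0.25, 0.25, 0.25]
downstream = [0.10, 0.10, 0.10, 0.70]
score = jensen_shannon_divergence(upstream, downstream)
```

A value near 0 means the two regions have similar composition; larger values indicate a compositional change of the kind a splice site boundary might produce.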

“J-STAGE Advance published date: 15 January 2015” on p. 317 should be changed to “J-STAGE Advance published date: 15 January 2016”.

Protein complexes are aggregates of protein molecules that play important roles in biological processes. Detecting protein complexes from protein-protein interaction (PPI) networks is one of the most challenging problems in computational biology, and many computational methods have been developed to solve it. Generally, these methods yield high false positive rates. In this article, a semantic similarity measure between proteins, based on the Gene Ontology (GO) structure, is applied to weight PPI networks. One of the well-known methods, COACH, is then adapted to work with weighted PPI networks for protein complex detection. The new method, WCOACH, is compared to the COACH, ClusterOne, IPCA, CORE, OH-PIN, HC-PIN and MCODE methods on several PPI networks such as DIP, Krogan, Gavin 2002 and MIPS. WCOACH can be applied as a fast and high-performance algorithm to predict protein complexes in weighted PPI networks. All data and programs are freely available at http://bioinformatics.aut.ac.ir/wcoach.

- Sep 2014

Motivation:
The interaction of two RNA molecules is an important factor in regulating gene expression at the post-transcriptional level. Most ncRNAs prevent the translation of their target mRNA(s) by forming stable bindings with them. Although several computational methods have been proposed to predict the interactions between two RNAs, none of them produces reliable and accurate results.
Results:
In this paper, a new approach called tempRNAs is presented to accurately predict the interaction structure between two RNAs based on a gradual temperature decrease. For each specified temperature, our algorithm performs two main steps. First, the secondary structure of each RNA is determined, using the previously found base pairs as constraints. Second, the two RNAs are concatenated and the interaction between them is calculated, again subject to the previous base pairs. The secondary structures are determined using the minimum free energy model. The proposed algorithm is evaluated on a set of known interacting RNA pairs. The results show that the proposed method is more accurate than other state-of-the-art algorithms, namely inRNAs and RactIP.

Background
RNA-RNA interaction plays an important role in the regulation of gene expression and cell development. In this process, an RNA molecule prohibits the translation of another RNA molecule by establishing stable interactions with it. In the RNA-RNA interaction prediction problem, two RNA sequences are given as inputs and the goal is to find the optimal joint secondary structure, both within each RNA and between the two. Several algorithms have been proposed to predict the RNA-RNA interaction structure. However, most of them suffer from high computational time.
Results
In this paper, we introduce a novel genetic algorithm called GRNAs to predict RNA-RNA interaction. On some standard datasets, the proposed algorithm achieves appropriate accuracy with lower time complexity than other state-of-the-art algorithms. In the proposed algorithm, each individual is a secondary structure of two interacting RNAs, and the minimum free energy is used as the fitness function for each individual. Over successive generations, crossover and mutation operations drive the algorithm towards the optimal (minimum free energy) secondary structure of the two interacting RNAs.
Conclusions
The algorithm is suitable for joint secondary structure prediction. The results achieved on a set of known interacting RNA pairs are compared with those of related algorithms, demonstrating the effectiveness and validity of the proposed algorithm. The time complexity of each iteration of the algorithm is shown to be comparable to that of the other approaches.

In computational methods, position weight matrices (PWMs) are commonly applied for transcription factor binding site (TFBS) prediction. Although these matrices predict actual binding sites more accurately than simple consensus sequences, they usually produce a large number of false positive (FP) predictions and are therefore poor sources of information on their own. Several studies have employed additional sources of information, such as sequence conservation or the vicinity to transcription start sites, to distinguish true binding regions from random ones. Recently, the spatial distribution of modified nucleosomes has been shown to be associated with different promoter architectures. These aligned patterns can facilitate DNA accessibility for transcription factors. We hypothesize that using data from these aligned and periodic patterns can improve the performance of binding region prediction. In this study, we propose two effective features, "modified nucleosomes neighboring" and "modified nucleosomes occupancy", to decrease FPs in binding site discovery. Based on these features, we designed a logistic regression classifier that estimates the probability that a region is a TFBS. Our model learned each feature from Sp1 binding sites on chromosome 1 and was tested on the other chromosomes in human CD4+ T cells. We investigated 21 histone modifications and found that only 8 of the 21 marks are strongly correlated with transcription factor binding regions. To show that these features are not specific to Sp1, we combined the logistic regression classifier with the PWM to create a new model for searching TFBSs on the genome. We tested the model using the transcription factors MAZ, PU.1 and ELF1 and compared the results to those obtained with the PWM alone. The results show that our model predicts transcription factor binding regions more successfully.
The relative simplicity of the model and its capability to integrate other features make it a superior method for TFBS prediction.
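As background for the PWM baseline mentioned above, here is a minimal sketch of building a log-odds PWM from aligned sites and scanning a sequence with it. The example sites, the uniform background, the pseudocount and the function names are illustrative assumptions; the nucleosome features and the logistic regression classifier from the study are not reproduced.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_window(pwm, sequence):
    """Return (score, start) of the highest-scoring window in `sequence`."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical GC-rich sites, used only to exercise the functions.
pwm = build_pwm(["GGGCGG", "GGGCGG", "GGGTGG"])
score, start = best_window(pwm, "ATATGGGCGGATAT")
```

Because every window receives a score, a threshold on the PWM score alone tends to admit many spurious hits, which is the false positive problem the additional features are meant to reduce.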

- Jun 2013

In living systems, RNAs perform important biological functions. The functional form of an RNA frequently requires a specific tertiary structure. The scaffold for this structure is provided by secondary structural elements, which are formed by hydrogen bonds within the molecule. Here, we concentrate on the inverse RNA folding problem. In this problem, an RNA secondary structure is given as a target structure and the goal is to design an RNA sequence whose structure is the same as (or very similar to) the given target structure. Different heuristic search methods have been proposed for this problem. One common feature of these methods is the use of a folding algorithm to evaluate the accuracy of the designed RNA sequence during the generation process. The well-known folding algorithms take O(n³) time, where n is the length of the RNA sequence. In this paper, we introduce a new algorithm called GGI-Fold, based on a multi-objective genetic algorithm and the Gibbs sampling method, for the inverse RNA folding problem. Our algorithm generates a sequence whose structure is the same as or very similar to the given target structure. The key feature of our method is that it never uses a folding algorithm to improve the quality of the generated sequences. We compare our algorithm with RNA-SSD on some biological test samples. In all test samples, our algorithm outperforms the RNA-SSD method, generating a sequence whose structure is more stable.

Background: RNA molecules play many important regulatory, catalytic and structural roles in the cell, and RNA secondary structure prediction with pseudoknots is one of the most important problems in biology. An RNA pseudoknot is an element of the RNA secondary structure in which bases of a single-stranded loop pair with complementary bases outside the loop. Modeling these crossing structures (pseudoknots) causes numerous computational difficulties, so they have generally been neglected in RNA structure prediction algorithms. Objectives: In this study, we present a new heuristic algorithm for the Prediction of RNA Knotted structures using Tree Adjoining Grammars (named PreRKTAG). Materials and Methods: For a given RNA sequence, PreRKTAG uses a genetic algorithm on tree adjoining grammars to propose a structure with minimum thermodynamic energy. The genetic algorithm employs as individuals a subclass of tree adjoining grammars by which the secondary structures of RNAs are modeled. New crossover and mutation operations were designed for the tree adjoining grammars. The fitness function is defined according to the RNA thermodynamic energy function, which causes the algorithm to converge to a stable structure. Results: The applicability of our algorithm is demonstrated by comparing its results with those of three well-known RNA secondary structure prediction algorithms that support crossed structures. Conclusions: We performed our comparison on a set of RNA sequences from the RNase P database, where the outcomes show the efficiency and practicality of the proposed algorithm. © 2013, National Institute of Genetic Engineering and Biotechnology; Published by Kowsar Corp.

- Jan 2013

In living systems, RNAs perform important biological functions. The functional form of an RNA frequently requires a specific tertiary structure. The scaffold for this structure is provided by secondary structural elements, which are formed by hydrogen bonds within the molecule. Here, we concentrate on the inverse RNA folding problem. In this problem, an RNA secondary structure is given as a target structure and the goal is to design an RNA sequence whose structure is the same as (or very similar to) the given target structure. Different heuristic search methods have been proposed for this problem. One common feature of these methods is the use of a folding algorithm to evaluate the accuracy of the designed RNA sequence during the generation process. The well-known folding algorithms take O(n³) time, where n is the length of the RNA sequence. In this paper, we introduce a new algorithm called GGI-Fold, based on a multi-objective genetic algorithm and the Gibbs sampling method, for the inverse RNA folding problem. Our algorithm generates a sequence whose structure is the same as or very similar to the given target structure. The key feature of our method is that it never uses a folding algorithm to improve the quality of the generated sequences. We compare our algorithm with RNA-SSD on some biological test samples. In all test samples, our algorithm outperforms the RNA-SSD method, generating a sequence whose structure is more stable.

- Apr 2012

We present a new molecular algorithm for adding two binary numbers with n bits. Without considering the generation of the input, this algorithm can be performed in O(1) time in a test tube using O(n) different types of DNA strands, and the output can be detected in O(n) time. The output strands, prior to the read-out operation, can serve as the input strands for another round of addition. The algorithm can easily be extended to any other logical operation, and even to the addition of two decimal numbers.

- Jan 2012

RNA-RNA interaction is involved in many biological processes, such as the regulation of gene expression. In this process, an RNA molecule prohibits the translation of another RNA molecule by establishing stable interactions with it. Several algorithms have been developed to predict the structure of the interaction between two RNA molecules. One common pitfall of most of these algorithms is their high computational time. In this paper, we introduce a novel algorithm called TIRNA to accurately predict the secondary structure between two RNA molecules based on minimum free energy (MFE). The algorithm is based on a heuristic approach that employs dot matrices to find the secondary structure of each RNA and the structure between the two RNAs. The proposed algorithm has been run on some standard datasets such as CopA-CopT, R1inv-R2inv, Tar-Tar*, DIS-DIS and IncRNA₅₄-RepZ in the bacterium Escherichia coli. The time and space complexity of the algorithm are O(k² log k²) and O(k²), respectively, where k is the sum of the lengths of the two RNAs. The experimental results show the high validity and efficiency of TIRNA.

Motif discovery is one of the fundamental problems in signal detection and gene regulation. Motif discovery in biology corresponds to the motif search and de novo motif finding problems in computer science. The challenging problem in both cases is Motif Representation (MR). A Common Position Weight Matrix (CPWM) is a simple MR model in which each position is independent of the other positions. However, the CPWM model is not an appropriate MR model: biological experiments show that structural information is extremely important in motif discovery. Structural information is included in an MR model by considering dependent and conserved positions. Recently, some MR models have been introduced based on this assumption, but they are used only for the motif search problem. In this paper, we design a new MR model based on information theory that can be used for both the de novo motif finding and the motif search problems. We extract known motifs from the JASPAR and TRANSFAC databases to search for common features among them. Based on these features, a new MR model called EPWM is constructed. A jackknife test is used to show that the EPWM model is more successful than the other MR models for the motif search problem. The jackknife test is implemented for each MR model (the EPWM model and the other MR models) and performed on the JASPAR and TRANSFAC databases. To verify the efficiency of the EPWM model for the de novo motif finding problem, we implement the EPWM model and the other MR models within the Gibbs sampling method, which is then run on the JASPAR and TRANSFAC databases. The results show that the EPWM model gives more accurate predictions than the other MR models for both the motif search and the de novo motif finding problems.

Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology, with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search for (or predict) known binding sites in a new DNA sequence. To do so, all subsequences of the given DNA sequence are scored with a scoring function and the prediction is made by selecting the best score. Most of the available tools for known binding site prediction assume no dependency between binding site base positions. Recently, Tomovic and Oakeley investigated the statistical basis for a claim of either dependence or independence, to determine whether such a claim is generally true, and presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions that can be used in known binding site prediction under the assumption of dependency or independency between binding site base positions.
We propose a new scoring function based on the dependency between all base positions in a binding site. This scoring function uses joint information content and mutual information as measures of dependency between positions in a transcription factor binding site. Our method for modeling dependencies is a simple extension of position-independent methods. We evaluate our new scoring function on real data sets extracted from the JASPAR and TRANSFAC databases, and compare the results with two other well-known scoring functions.
The results demonstrate that the new approach improves known binding site discovery and show that joint information content and mutual information provide a better and more general criterion for investigating the relationships between positions in the TFBS. Our scoring function is formulated with simple mathematical calculations. From applying our method to several biological data sets, it can be inferred that it performs better than methods that do not consider dependencies.
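The dependency measure named above can be illustrated with a minimal sketch that estimates the mutual information between two columns of a set of aligned binding sites. It uses plain frequency estimates without pseudocounts or small-sample correction, and the example sites and function name are hypothetical; the full scoring function of the paper is not reproduced.

```python
import math
from collections import Counter


def mutual_information(sites, i, j):
    """Mutual information in bits between positions i and j of aligned sites."""
    n = len(sites)
    count_i = Counter(site[i] for site in sites)
    count_j = Counter(site[j] for site in sites)
    count_ij = Counter((site[i], site[j]) for site in sites)
    return sum(
        (c / n) * math.log2((c / n) / ((count_i[a] / n) * (count_j[b] / n)))
        for (a, b), c in count_ij.items()
    )


# Position 1 is fully determined by position 0 in the first set,
# and independent of it in the second.
dependent = mutual_information(["AT", "AT", "CG", "CG"], 0, 1)
independent = mutual_information(["AA", "AC", "CA", "CC"], 0, 1)
```

A position-independent model effectively treats every such pairwise value as zero, so non-zero mutual information between columns is the signal a dependency-aware scoring function can exploit.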

- Mar 2009

In this paper, a novel genetic algorithm is presented for the dyad motif finding problem. The genetic algorithm uses a multi-objective fitness function based on the sum of pairs, the number of matches, and the information content. The individuals in the population pool of the genetic algorithm are optimized by the Gibbs sampling method. New crossover and mutation operators are also designed. The algorithm is implemented and tested on different types of real datasets. The results are compared with those of other well-known algorithms, and the effectiveness of our algorithm is shown.

Pattern discovery, or motif finding, is one of the most challenging problems in both molecular biology and computer science. In this paper, we present an exact exhaustive method for finding motifs of length ℓ in a set of t sequences of length n with a limited number of mutations d. The algorithm is based on depth-first search on a suffix trie with at most O(tn) nodes and runs in O(t²n²ℓ²) time. The proposed algorithm is tested on yeast and human transcription factor binding site data sets, and the results are compared with those of other well-known algorithms. The experimental results demonstrate that the proposed method performs comparably to those algorithms.
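The problem statement above can be made concrete with a brute-force sketch that enumerates all 4^ℓ candidate patterns and keeps those occurring in every sequence with at most d mismatches. This is a naive baseline shown only to define the problem; it is not the suffix-trie depth-first search of the paper and does not have its complexity. Function names and example sequences are illustrative.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(pattern, sequence, d):
    """True if some window of `sequence` is within d mismatches of `pattern`."""
    width = len(pattern)
    return any(
        hamming(pattern, sequence[i:i + width]) <= d
        for i in range(len(sequence) - width + 1)
    )


def find_motifs(sequences, length, d):
    """All patterns of the given length found in every sequence within d mismatches."""
    candidates = ("".join(p) for p in product("ACGT", repeat=length))
    return [c for c in candidates if all(occurs_within(c, s, d) for s in sequences)]


exact = find_motifs(["TTACGTT", "GGACGAA", "CACGCCC"], length=3, d=0)
relaxed = find_motifs(["TTACGTT", "GGACGAA", "CACGCCC"], length=3, d=1)
```

The enumeration grows exponentially with ℓ, which is why exact methods index the input sequences instead of the pattern space.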

- Jan 2009

A new algorithm for the alignment of two RNA secondary structures without pseudoknots is presented. The algorithm is based on finding the longest common sub-structures between two RNA structures, and special effort is devoted to aligning the beginning and end parts of the existing stems (base pairs) in the secondary structures of the two RNAs. Structure alignments of different types of RNA obtained by this algorithm show more consistency with models of evolution than those of other existing structure alignment algorithms.

- Jan 2007

Several biological features are represented by different types of trees. Two such types are considered in this paper. The first type consists of trees with n external nodes in which each internal node has at least two children; these are used in neuroscience and are called neuronal dendritic trees. The second type consists of trees with n internal nodes and m external nodes; these represent the secondary structure of RNA sequences and are called RNA trees. In this paper, we present two new parallel algorithms for generating these two kinds of biological trees. Both algorithms are adaptive and cost-optimal and generate the trees in B-order. Computations run on an SM SIMD model.

The task of discovering transcription factor binding sites in the upstream region of a gene, without any prior knowledge of what they look like, is very challenging. In this paper, we propose an algorithm based on Particle Swarm Optimization (PSO) to identify motif instances in multiple biological sequences. The experimental results on yeast (Saccharomyces cerevisiae) transcription factor binding sites demonstrate that the proposed method performs comparably to the YMF, MEME and AlignACE algorithms.
