Matti Kankainen

University of Helsinki, Helsinki, Province of Southern Finland, Finland

Publications (21) · 106.37 Total impact

  • Article: Nonpathogenic Lactobacillus rhamnosus activates the inflammasome and antiviral responses in human macrophages.
    ABSTRACT: In this study, we have utilized global gene expression profiling to compare the responses of human primary macrophages to two closely related, well-characterized Lactobacillus rhamnosus strains GG and LC705, since our understanding of the responses elicited by nonpathogenic bacteria in human innate immune system is limited. Macrophages are phagocytic cells of the innate immune system that perform sentinel functions to initiate appropriate responses to surrounding stimuli. Macrophages that reside on gut mucosa encounter ingested and intestinal bacteria. Bacteria of Lactobacillus genus are nonpathogenic and used in food and as supplements with health-promoting probiotic potential. Our results demonstrate that live GG and LC705 induced quantitatively different gene expression profiles in macrophages. A gene ontology analysis revealed functional similarities and differences in responses to GG and LC705 that were reflected in host defense responses. Both GG and LC705 induced interleukin-1β production in macrophages that required caspase-1 activity. LC705, but not GG, induced type I interferon-dependent gene activation that correlated with its ability to prevent influenza A virus replication and production of viral proteins in macrophages. Our results indicate that nonpathogenic bacteria are able to activate the inflammasome. In addition, our results suggest that L. rhamnosus may prime the antiviral potential of human macrophages.
    Gut Microbes 11/2012; 3(6).
  • Article: Systems-level analysis of clinically different phenotypes of juvenile nasopharyngeal angiofibromas.
    ABSTRACT: OBJECTIVES/HYPOTHESIS: To explore the molecular genetic background of juvenile nasopharyngeal angiofibromas and to identify biological processes and putative factors determining the different growth patterns of these tumors. STUDY DESIGN: By comparing copy number and gene expression level changes of two clinically different phenotypes of juvenile nasopharyngeal angiofibromas, we aimed to find processes essential in the growth and development of these tumors. Based on the results and prior knowledge of the protein's significance for growth, we studied the expression of tyrosine kinase SYK in 27 tumor samples. METHODS: Comparative genomic hybridization and gene expression analyses were performed for the two tumor samples, and protein expression of SYK was studied in 27 samples by immunohistochemical staining. RESULTS: Between low- and high-stage juvenile nasopharyngeal angiofibromas, 1,245 genes showed at least a two-fold change in expression. The corresponding proteins of these transcripts were enriched in different biological processes. Protein kinase SYK was expressed in all 27 samples, and its intensity significantly correlated with tumor stage. CONCLUSIONS: Because the molecular genetic background of juvenile nasopharyngeal angiofibroma is unknown, our aim was to investigate genomic alterations that could be associated with low- and high-stage tumors. We were able to identify gene expression changes that relate to particular biological processes, but assessing clinically relevant molecular profiles still requires further characterization. Due to the low incidence of juvenile angiofibroma, in the future a combination of molecular profiling data from several studies would be useful in understanding the molecular background of the disease.
    The Laryngoscope 09/2012; · 1.75 Impact Factor
  • Article: BLANNOTATOR: enhanced homology-based function prediction of bacterial proteins.
    Matti Kankainen, Teija Ojala, Liisa Holm
    ABSTRACT: Automated function prediction has played a central role in determining the biological functions of bacterial proteins. Typically, protein function annotation relies on homology, and function is inferred from other proteins with similar sequences. This approach has become popular in bacterial genomics because it is one of the few methods that is practical for large datasets and because it does not require additional functional genomics experiments. However, the existing solutions produce erroneous predictions in many cases, especially when query sequences have low levels of identity with the annotated source protein. This problem has created a pressing need for improvements in homology-based annotation. We present an automated method for the functional annotation of bacterial protein sequences. Based on sequence similarity searches, BLANNOTATOR accurately annotates query sequences with one-line summary descriptions of protein function. It groups sequences identified by BLAST into subsets according to their annotation and bases its prediction on a set of sequences with consistent functional information. We show the results of BLANNOTATOR's performance in sets of bacterial proteins with known functions. We simulated the annotation process for 3090 SWISS-PROT proteins using a database in its state preceding the functional characterisation of the query protein. For this dataset, our method outperformed the five others that we tested, and the improved performance was maintained even in the absence of highly related sequence hits. We further demonstrate the value of our tool by analysing the putative proteome of Lactobacillus crispatus strain ST1. BLANNOTATOR is an accurate method for bacterial protein function prediction. It is practical for genome-scale data and does not require pre-existing sequence clustering; thus, this method suits the needs of bacterial genome and metagenome researchers. 
    The method and a web-server are available at http://ekhidna.biocenter.helsinki.fi/poxo/blannotator/.
    BMC Bioinformatics 02/2012; 13:33. · 2.75 Impact Factor
  • Article: Effect of acid stress on protein expression and phosphorylation in Lactobacillus rhamnosus GG.
    ABSTRACT: Acidic environments encountered in food products and during gastrointestinal tract passage affect the survival of bacteria that are marketed as probiotics. In this study, the global proteome responses of the probiotic lactic acid bacterium Lactobacillus rhamnosus GG to two physiologically relevant pH conditions (pH 4.8 and pH 5.8) were studied by 2-D DIGE. The proteomics data were complemented with transcriptome analyses by whole-genome DNA microarrays. The cells were cultured in industrial-type whey medium under strictly defined bioreactor conditions. In total, 2-D DIGE revealed the pH-dependent formation of 92 protein spots. In response to lower pH conditions, the strongest up-regulation of all proteins was detected for a predicted surface antigen, LGG_02016. In addition, the acid pH was found to up-regulate the expression of F(0)F(1)-ATP synthase genes whereas the abundance of proteins participating in nucleotide biosynthesis and protein synthesis was significantly diminished. Moreover, the results suggest that L. rhamnosus GG modulates its pyruvate metabolism depending on the growth pH. Furthermore, a growth pH-dependent protein phosphorylation phenomenon was detected in several L. rhamnosus GG proteins with ProQ Diamond 2-DE gel staining. Proteins participating in central cellular pathways were shown to be phosphorylated, and the phosphorylation of glycolytic enzymes was found to be especially extensive.
    Journal of proteomics 11/2011; 75(4):1357-74. · 5.07 Impact Factor
  • Article: Growth phase-associated changes in the proteome and transcriptome of Lactobacillus rhamnosus GG in industrial-type whey medium.
    ABSTRACT: The growth phase during which probiotic bacteria are harvested and consumed can strongly influence their performance as health-promoting agents. In this study, global transcriptomic and proteomic changes were studied in the widely used probiotic Lactobacillus rhamnosus GG during growth in industrial-type whey medium under strictly defined bioreactor conditions. The expression of 636 genes (P ≤ 0.01) and 116 proteins (P < 0.05) changed significantly over time. Of the significantly differentially produced proteins, 61 were associated with alterations at the transcript level. The most remarkable growth phase-dependent changes occurred during the transition from the exponential to the stationary growth phase and were associated with the shift from glucose fermentation to galactose utilization and the transition from homolactic to mixed acid fermentation. Furthermore, several genes encoding proteins proposed to promote the survival and persistence of L. rhamnosus GG in the host and proteins that directly contribute to human health showed temporal changes in expression. Our results suggest that L. rhamnosus GG has a highly flexible and adaptable metabolism and that the growth stage during which bacterial cells are harvested and consumed should be taken into consideration to gain the maximal benefit from probiotic bacteria.
    Microbial Biotechnology 09/2011; 4(6):746-66. · 2.53 Impact Factor
  • Article: MPEA--metabolite pathway enrichment analysis.
    ABSTRACT: We present metabolite pathway enrichment analysis (MPEA) for the visualization and biological interpretation of metabolite data at the system level. Our tool follows the concept of gene set enrichment analysis (GSEA) and tests whether metabolites involved in some predefined pathway occur towards the top (or bottom) of a ranked query compound list. In particular, MPEA is designed to handle many-to-many relationships that may occur between the query compounds and metabolite annotations. For a demonstration, we analysed metabolite profiles of 14 twin pairs with differing body weights. MPEA found significant pathways from data that had no significant individual query compounds, its results were congruent with those discovered from transcriptomics data and it detected more pathways than the competing metabolic pathway method did. AVAILABILITY: The web server and source code of MPEA are available at http://ekhidna.biocenter.helsinki.fi/poxo/mpea/.
    Bioinformatics 07/2011; 27(13):1878-9. · 5.47 Impact Factor
  • Article: Comparative proteome cataloging of Lactobacillus rhamnosus strains GG and Lc705.
    ABSTRACT: The present study reports an in-depth proteome analysis of two Lactobacillus rhamnosus strains, the well-known probiotic strain GG and the dairy strain Lc705. We used GeLC-MS/MS, in which proteins are separated using 1-DE and identified using nanoLC-MS/MS, to generate high-quality protein catalogs. To maximize the number of identifications, all data sets were searched against the target databases using two search engines, Mascot and Paragon. As a result, over 1600 high-confidence protein identifications, covering nearly 60% of the predicted proteomes, were obtained from each strain. This approach enabled identification of more than 40% of all predicted surfome proteins, including a high number of lipoproteins, integral membrane proteins, peptidoglycan associated proteins, and proteins predicted to be released into the extracellular environment. A comparison of both data sets revealed the expression of more than 90 proteins in GG and 150 in Lc705, which lack evolutionary counterparts in the other strain. Differences were noted in proteins with a likely role in biofilm formation, phage-related functions, reshaping the bacterial cell wall, and immunomodulation. The present study provides the most comprehensive catalog of the Lactobacillus proteins to date and holds great promise for the discovery of novel probiotic effector molecules.
    Journal of Proteome Research 06/2011; 10(8):3460-73. · 5.11 Impact Factor
  • Article: Adhesive polypeptides of Staphylococcus aureus identified using a novel secretion library technique in Escherichia coli.
    ABSTRACT: Bacterial adhesive proteins, called adhesins, are frequently the decisive factor in initiation of a bacterial infection. Characterization of such molecules is crucial for the understanding of bacterial pathogenesis, design of vaccines and development of antibacterial drugs. Because adhesins are frequently difficult to express, their characterization has often been hampered. Alternative expression methods developed for the analysis of adhesins, e.g. surface display techniques, suffer from various drawbacks and reports on high-level extracellular secretion of heterologous proteins in Gram-negative bacteria are scarce. These expression techniques are currently a field of active research. The purpose of the current study was to construct a convenient, new technique for identification of unknown bacterial adhesive polypeptides directly from the growth medium of the Escherichia coli host and to identify novel proteinaceous adhesins of the model organism Staphylococcus aureus. Randomly fragmented chromosomal DNA of S. aureus was cloned into a unique restriction site of our expression vector, which facilitates secretion of foreign FLAG-tagged polypeptides into the growth medium of E. coli ΔfliCΔfliD, to generate a library of 1663 clones expressing FLAG-tagged polypeptides. Sequence and bioinformatics analyses showed that in our example, the library covered approximately 32% of the S. aureus proteome. Polypeptides from the growth medium of the library clones were screened for binding to a selection of S. aureus target molecules and adhesive fragments of known staphylococcal adhesins (e.g. coagulase and fibronectin-binding protein A) as well as polypeptides of novel function (e.g. a universal stress protein and phosphoribosylamino-imidazole carboxylase ATPase subunit) were detected. The results were further validated using purified His-tagged recombinant proteins of the corresponding fragments in enzyme-linked immunoassay and surface plasmon resonance analysis.
    A new technique for identification of unknown bacterial adhesive polypeptides was constructed. Application of the method on S. aureus allowed us to identify three known adhesins and in addition, five new polypeptides binding to human plasma and extracellular matrix proteins. The method, here used on S. aureus, is convenient due to the use of soluble proteins from the growth medium and can in principle be applied to any bacterial species of interest.
    BMC Microbiology 05/2011; 11:117. · 3.04 Impact Factor
  • Article: Probiotic Lactobacillus rhamnosus downregulates FCER1 and HRH4 expression in human mast cells.
    ABSTRACT: To investigate the effects of four probiotic bacteria and their combination on human mast cell gene expression using microarray analysis. Human peripheral-blood-derived mast cells were stimulated with Lactobacillus rhamnosus (L. rhamnosus) GG (LGG®), L. rhamnosus Lc705 (Lc705), Propionibacterium freudenreichii ssp. shermanii JS (PJS) and Bifidobacterium animalis ssp. lactis Bb12 (Bb12) and their combination for 3 or 24 h, and were subjected to global microarray analysis using an Affymetrix GeneChip® Human Genome U133 Plus 2.0 Array. The gene expression differences between unstimulated and bacteria-stimulated samples were further analyzed with GOrilla Gene Enrichment Analysis and Visualization Tool and MeV Multiexperiment Viewer-tool. LGG and Lc705 were observed to suppress genes that encoded allergy-related high-affinity IgE receptor subunits α and γ (FCER1A and FCER1G, respectively) and histamine H4 receptor. LGG, Lc705 and the combination of four probiotics had the strongest effect on the expression of genes involved in mast cell immune system regulation, and on several genes that encoded proteins with a pro-inflammatory impact, such as interleukin (IL)-8 and tumour necrosis factor alpha. Also genes that encoded proteins with anti-inflammatory functions, such as IL-10, were upregulated. Certain probiotic bacteria might diminish mast cell allergy-related activation by downregulation of the expression of high-affinity IgE and histamine receptor genes, and by inducing a pro-inflammatory response.
    World Journal of Gastroenterology 02/2011; 17(6):750-9. · 2.47 Impact Factor
  • Article: Proteomics and transcriptomics characterization of bile stress response in probiotic Lactobacillus rhamnosus GG.
    ABSTRACT: Lactobacillus rhamnosus GG (GG) is a widely used and intensively studied probiotic bacterium. Although the health benefits of strain GG are well documented, the systematic exploration of mechanisms by which this strain exerts probiotic effects in the host has only recently been initiated. The ability to survive the harsh conditions of the gastrointestinal tract, including gastric juice containing bile salts, is one of the vital characteristics that enables a probiotic bacterium to transiently colonize the host. Here we used gene expression profiling at the transcriptome and proteome levels to investigate the cellular response of strain GG toward bile under defined bioreactor conditions. The analyses revealed that in response to growth of strain GG in the presence of 0.2% ox gall the transcript levels of 316 genes changed significantly (p < 0.01, t test), and 42 proteins, including both intracellular and surface-exposed proteins (i.e. surfome), were differentially abundant (p < 0.01, t test in total proteome analysis; p < 0.05, t test in surfome analysis). Protein abundance changes correlated with transcriptome level changes for 14 of these proteins. The identified proteins suggest diverse and specific changes in general stress responses as well as in cell envelope-related functions, including in pathways affecting fatty acid composition, cell surface charge, and thickness of the exopolysaccharide layer. These changes are likely to strengthen the cell envelope against bile-induced stress and signal the GG cells of gut entrance. Notably, the surfome analyses demonstrated significant reduction in the abundance of a protein catalyzing the synthesis of exopolysaccharides, whereas a protein dedicated for active removal of bile compounds from the cells was up-regulated. These findings suggest a role for these proteins in facilitating the well founded interaction of strain GG with the host mucus in the presence of sublethal doses of bile. 
    The significance of these findings in terms of the functionality of a probiotic bacterium is discussed.
    Molecular & Cellular Proteomics 11/2010; 10(2):M110.002741. · 7.40 Impact Factor
  • Article: Genome sequence of Lactobacillus crispatus ST1.
    ABSTRACT: Lactobacillus crispatus is a common member of the beneficial microbiota present in the vertebrate gastrointestinal and human genitourinary tracts. Here, we report the genome sequence of L. crispatus ST1, a chicken isolate displaying strong adherence to vaginal epithelial cells.
    Journal of bacteriology 07/2010; 192(13):3547-8. · 3.94 Impact Factor
  • Article: Mucosal adhesion properties of the probiotic Lactobacillus rhamnosus GG SpaCBA and SpaFED pilin subunits.
    ABSTRACT: Lactobacillus rhamnosus GG is a well-established Gram-positive probiotic strain, whose health-benefiting properties are dependent in part on prolonged residence in the gastrointestinal tract and are likely dictated by adherence to the intestinal mucosa. Previously, we identified two pilus gene clusters (spaCBA and spaFED) in the genome of this probiotic bacterium, each of which contained the predicted genes for three pilin subunits and a single sortase. We also confirmed the presence of SpaCBA pili on the cell surface and attributed an intestinal mucus-binding capacity to one of the pilin subunits (SpaC). Here, we report cloning of the remaining pilin genes (spaA, spaB, spaD, spaE, and spaF) in Escherichia coli, production and purification of the recombinant proteins, and assessment of the adherence of these proteins to human intestinal mucus. Our findings indicate that the SpaB and SpaF pilin subunits also exhibit substantial binding to mucus, which can be inhibited competitively in a dose-related manner. Moreover, the binding between the SpaB pilin subunit and the mucosal substrate appears to operate through electrostatic contacts and is not related to a recognized mucus-binding domain. We conclude from these results that it is conceivable that two pilin subunits (SpaB and SpaC) in the SpaCBA pilus fiber play a role in binding to intestinal mucus, but for the uncharacterized and putative SpaFED pilus fiber only a single pilin subunit (SpaF) is potentially responsible for adhesion to mucus.
    Applied and environmental microbiology 04/2010; 76(7):2049-57. · 3.69 Impact Factor
  • Article: Comparative genomic analysis of Lactobacillus rhamnosus GG reveals pili containing a human-mucus binding protein.
    ABSTRACT: To unravel the biological function of the widely used probiotic bacterium Lactobacillus rhamnosus GG, we compared its 3.0-Mbp genome sequence with the similarly sized genome of L. rhamnosus LC705, an adjunct starter culture exhibiting reduced binding to mucus. Both genomes demonstrated high sequence identity and synteny. However, for both strains, genomic islands, 5 in GG and 4 in LC705, punctuated the colinearity. A significant number of strain-specific genes were predicted in these islands (80 in GG and 72 in LC705). The GG-specific islands included genes coding for bacteriophage components, sugar metabolism and transport, and exopolysaccharide biosynthesis. One island only found in L. rhamnosus GG contained genes for 3 secreted LPXTG-like pilins (spaCBA) and a pilin-dedicated sortase. Using anti-SpaC antibodies, the physical presence of cell wall-bound pili was confirmed by immunoblotting. Immunogold electron microscopy showed that the SpaC pilin is located at the pilus tip but also sporadically throughout the structure. Moreover, the adherence of strain GG to human intestinal mucus was blocked by SpaC antiserum and abolished in a mutant carrying an inactivated spaC gene. Similarly, binding to mucus was demonstrated for the purified SpaC protein. We conclude that the presence of SpaC is essential for the mucus interaction of L. rhamnosus GG and likely explains its ability to persist in the human intestinal tract longer than LC705 during an intervention trial. The presence of mucus-binding pili on the surface of a nonpathogenic Gram-positive bacterial strain reveals a previously undescribed mechanism for the interaction of selected probiotic lactobacilli with host tissues.
    Proceedings of the National Academy of Sciences 10/2009; 106(40):17193-8. · 9.68 Impact Factor
  • Article: Proteome analysis of Lactobacillus rhamnosus GG using 2-D DIGE and mass spectrometry shows differential protein production in laboratory and industrial-type growth media.
    ABSTRACT: Lactobacillus rhamnosus GG (LGG) is one of the most extensively studied and widely used probiotic bacteria. While the benefits of LGG treatment in gastrointestinal disorders and immunomodulation are well-documented, functional genomics research of this bacterium has only recently been initiated. In the present study, a 2-D DIGE approach was used for the quantitative analysis of growth media-dependent changes in LGG protein abundance. Proteins were isolated from cells grown in industrial-type whey-based medium or in rich laboratory medium for subsequent 2-D DIGE. The analysis revealed patterns of protein abundance unique to each growth condition. In total, 196 quantitatively altered protein spots (at least 1.5-fold change in relative abundance, p < 0.05) representing approximately 13% of all protein spots in the gel were detected. From these protein spots, 157 were identified by mass spectrometry and were found to represent 100 distinct gene products. Collectively, these data show that growth of LGG in whey medium increased the relative abundance of proteins involved in purine biosynthesis, galactose metabolism, and fatty acid biosynthesis. In comparison, growth of LGG in laboratory medium resulted in an increase in the amount of proteins involved in translation and the general stress response, as well as pyrimidine and exopolysaccharide biosynthesis. Moreover, several enzymes of the proteolytic system of LGG demonstrated growth medium-dependent production. The present study demonstrates the fundamental effects of culture conditions on the proteome of LGG, which are likely to affect the functionality and characteristics of its use as a probiotic.
    Journal of Proteome Research 09/2009; 8(11):4993-5007. · 5.11 Impact Factor
  • Article: LOCP--locating pilus operons in gram-positive bacteria.
    Ilya Plyusnin, Liisa Holm, Matti Kankainen
    ABSTRACT: Pilus operons encode a pivotal host-microbe interaction structure that is vital for pathogenicity, colonization and adhesion. LOCP is a computational tool to quickly test whether or not the genome of a gram-positive bacterium of interest or some DNA-contig in a metagenomic sample hold pilus operons. Predictions are made based on distinctive motifs of pilus-related protein sequences and on the tendency of these protein sequences to occur in dense clusters. The tool showed a phenomenal accuracy and revealed that various novel, and even unexpected, gram-positive bacteria do possess pilus operons. Thus, the tool helps us to focus the laboratory research on genes behind this important and indicative feature, and to screen for strains containing them. AVAILABILITY: Software is available at http://ekhidna.biocenter.helsinki.fi/locp/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    Bioinformatics 04/2009; 25(9):1187-8. · 5.47 Impact Factor
  • Article: MATLIGN: a motif clustering, comparison and matching tool.
    Matti Kankainen, Ari Löytynoja
    ABSTRACT: Sequence motifs representing transcription factor binding sites (TFBS) are commonly encoded as position frequency matrices (PFM) or degenerate consensus sequences (CS). These formats are used to represent the characterised TFBS profiles stored in transcription factor databases, as well as to represent the potential motifs predicted using computational methods. To fill the gap between the known and predicted motifs, methods are needed for the post-processing of prediction results, i.e. for matching, comparison and clustering of pre-selected motifs. The computational identification of over-represented motifs in sets of DNA sequences is, in particular, a task where post-processing can dramatically simplify the analysis. Efficient post-processing, for example, reduces the redundancy of the motifs predicted and enables them to be annotated. In order to facilitate the post-processing of motifs, in both PFM and CS formats, we have developed a tool called Matlign. The tool aligns and evaluates the similarity of motifs using a combination of scoring functions, and visualises the results using hierarchical clustering. By limiting the number of distinct gaps created (though, not their length), the alignment algorithm also correctly aligns motifs with an internal spacer. The method selects the best non-redundant motif set, with repetitive motifs merged together, by cutting the hierarchical tree using silhouette values. Our analyses show that Matlign can reliably discover the most similar analogue from a collection of characterised regulatory elements such that the method is also useful for the annotation of motif predictions by PFM library searches. Matlign is a user-friendly tool for post-processing large collections of DNA sequence motifs. Starting from a large number of potential regulatory motifs, Matlign provides a researcher with a non-redundant set of motifs, which can then be further associated to known regulatory elements. 
    A web-server is available at http://ekhidna.biocenter.helsinki.fi/poxo/matlign.
    BMC Bioinformatics 02/2007; 8:189. · 2.75 Impact Factor
  • Article: POXO: a web-enabled tool series to discover transcription factor binding sites.
    ABSTRACT: We present POXO, a comprehensive tool series to discover transcription factor binding sites from co-expressed genes (www.bioinfo.biocenter.helsinki.fi/poxo). POXO manages tasks such as functional evaluation and grouping of genes, sequence retrieval, pattern discovery and pattern verification. It also allows users to tailor analytical pipelines from these tools, with single mouse clicks. One typical pipeline of POXO begins by examining the biological functions that a set of co-expressed genes are involved in. In this examination, the functional coherence of the gene set is evaluated and representative functions are associated with the gene set. This examination can also be used to group genes into functionally similar subsets, if several biological processes are affected in the experiment. The next step in the pipeline is then to discover over-represented nucleotide patterns from the upstream sequences of the selected gene sets. This allows investigation of the possibility that the genes are co-regulated by common cis-elements. If over-represented patterns are found, similar ones can then be clustered together and be verified. The performance of POXO is demonstrated by analysing expression data from pathogen treated Arabidopsis thaliana. In this example, POXO detected activated gene sets and suggested transcription factors responsible for their regulation.
    Nucleic Acids Research 08/2006; 34(Web Server issue):W534-40. · 8.03 Impact Factor
  • Article: Identifying functional gene sets from hierarchically clustered expression data: map of abiotic stress regulated genes in Arabidopsis thaliana.
    ABSTRACT: We present MultiGO, a web-enabled tool for the identification of biologically relevant gene sets from hierarchically clustered gene expression trees (http://ekhidna.biocenter.helsinki.fi/poxo/multigo). High-throughput gene expression measuring techniques, such as microarrays, are nowadays often used to monitor the expression of thousands of genes. Since these experiments can produce overwhelming amounts of data, computational methods that assist the data analysis and interpretation are essential. MultiGO is a tool that automatically extracts the biological information for multiple clusters and determines their biological relevance, and hence facilitates the interpretation of the data. Since the entire expression tree is analysed, MultiGO is guaranteed to report all clusters that share a common enriched biological function, as defined by Gene Ontology annotations. The tool also identifies a plausible cluster set, which represents the key biological functions affected by the experiment. The performance is demonstrated by analysing drought-, cold- and abscisic acid-related expression data sets from Arabidopsis thaliana. The analysis not only identified known biological functions, but also brought into focus the less established connections to defense-related gene clusters. Thus, in comparison to analyses of manually selected gene lists, the systematic analysis of every cluster can reveal unexpected biological phenomena and produce much more comprehensive biological insights to the experiment of interest.
    Nucleic Acids Research 02/2006; 34(18):e124. · 8.03 Impact Factor
  • Article: POCO: discovery of regulatory patterns from promoters of oppositely expressed gene sets.
    Matti Kankainen, Liisa Holm
    ABSTRACT: Functionally associated genes tend to be co-expressed, which indicates that they could also be co-regulated. Since co-regulation is usually governed by transcription factors via their specific binding elements, putative regulators can be identified from promoter sets of (co-expressed) genes by screening for over-represented nucleotide patterns. Here, we present a program, POCO, which discovers such over-represented patterns from either one or two promoter sets. Typical microarray experiments yield up- and down-regulated gene sets that may represent, for example, distinct defense pathways. Assuming that a functional transcription factor cannot simultaneously both up- and down-regulate the gene sets, its binding element should respectively be over- and under-represented in the corresponding promoter sets. This idea is implemented in POCO, which tests the hypothesis that the distributions of a pattern differ among three sets of promoters: up-regulated, down-regulated and randomly-chosen. In the program, pattern discovery is based on explicit enumeration of all possible patterns on the alphabet (A, C, G, T and N). The mean occurrences and SDs of the patterns are estimated using bootstrapping and their significance is assessed using ANOVA F-statistics, Tukey's honestly significant difference test and P-values. The program is freely available at http://ekhidna.biocenter.helsinki.fi/poco.
    Nucleic Acids Research 08/2005; 33(Web Server issue):W427-31. · 8.03 Impact Factor
  • Article: POBO, transcription factor binding site verification with bootstrapping.
    Matti Kankainen, Liisa Holm
    ABSTRACT: Transcription factors can either activate or repress target genes by binding onto short nucleotide sequence motifs in the promoter regions of these genes. Here, we present POBO, a promoter bootstrapping program, for gene expression data. POBO can be used to detect, compare and verify predetermined transcription factor binding site motifs in the promoters of one or two clusters of co-regulated genes. The program calculates the frequencies of the motif in the input promoter sets. A bootstrap analysis detects significantly over- or underrepresented motifs. The output of the program presents bootstrapped results in picture and text formats. The program was tested with published data from transgenic WRKY70 microarray experiments. Intriguingly, motifs recognized by the WRKY transcription factors of plant defense pathways are similarly enriched in both up- and downregulated clusters. POBO analysis suggests slightly modified hypothetical motifs that discriminate between up- and downregulated clusters. In conclusion, POBO allows easy, fast and accurate verification of putative regulatory motifs. The statistical tests implemented in POBO can be useful in eliminating false positives from the results of pattern discovery programs and increasing the reliability of true positives. POBO is freely available from http://ekhidna.biocenter.helsinki.fi:9801/pobo.
    Nucleic Acids Research 08/2004; 32(Web Server issue):W222-9. · 8.03 Impact Factor
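
The promoter bootstrapping idea described in the POBO abstract above can be illustrated with a short sketch. This is not POBO's implementation: the function names, the treatment of `N` as a wildcard base, the toy sequences and the empirical p-value are all assumptions chosen for illustration. The sketch counts a motif in each promoter, repeatedly resamples promoters with replacement, and asks how often a background set reaches the motif total seen in the cluster of interest.

```python
import random
import re
from statistics import mean


def count_motif(sequence, motif):
    """Count overlapping motif occurrences; 'N' in the motif matches any base."""
    # A zero-width lookahead lets overlapping matches be counted.
    pattern = "(?=" + motif.upper().replace("N", "[ACGT]") + ")"
    return len(re.findall(pattern, sequence.upper()))


def bootstrap_counts(promoters, motif, n_pick, n_rounds, rng):
    """Resample promoters with replacement; return the motif total of each round."""
    per_promoter = [count_motif(p, motif) for p in promoters]
    return [
        sum(rng.choice(per_promoter) for _ in range(n_pick))
        for _ in range(n_rounds)
    ]


def empirical_p_over(cluster_counts, background_counts):
    """Share of background rounds whose total reaches the cluster mean.

    Hypothetical statistic for illustration; the +1 terms avoid a p-value of 0.
    """
    threshold = mean(cluster_counts)
    hits = sum(1 for c in background_counts if c >= threshold)
    return (hits + 1) / (len(background_counts) + 1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    cluster = ["TTGACTTTGACC", "GGTTGACAAA", "TTGACTTGAC"]      # toy promoters
    background = ["ACGTACGTAC", "GGGGCCCCAA", "CATCATCATC"]     # toy promoters
    fg = bootstrap_counts(cluster, "TTGAC", n_pick=3, n_rounds=99, rng=rng)
    bg = bootstrap_counts(background, "TTGAC", n_pick=3, n_rounds=99, rng=rng)
    print("empirical p (over-representation):", empirical_p_over(fg, bg))
```

Resampling the per-promoter counts rather than the sequences themselves keeps each round cheap; a real analysis would use many more rounds and genome-derived background promoters.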

Institutions

  • 2004–2012
    • University of Helsinki
      • Institute of Biotechnology
      • Department of Biosciences
      Helsinki, Province of Southern Finland, Finland
  • 2011
    • VTT Technical Research Centre of Finland
      Espoo, Province of Southern Finland, Finland
  • 2003
    • University of Kuopio
      Kuopio, Province of Eastern Finland, Finland