Diogo Borges Lima
EXACT Sciences Corporation · Proteomics
PhD
About
55
Publications
4,834
Reads
780
Citations
Introduction
I'm a Research Scientist interested in Computational Proteomics, Cross-linking MS and Artificial Intelligence.
Education
March 2013 - January 2016
March 2011 - February 2013
August 2006 - February 2011
Publications (55)
Protein identification by mass spectrometry is commonly accomplished using a peptide sequence matching search algorithm, whose sensitivity varies inversely with the size of the sequence database and the number of post-translational modifications considered. We present the Spectrum Identification Machine, a peptide sequence matching tool that capita...
Advancing data analysis tools for proteome-wide cross-linking mass spectrometry (XL-MS) requires ground-truth standards that mimic biological complexity. Here we develop well-controlled XL-MS standards comprising hundreds of recombinant proteins that are systematically mixed for cross-linking. We use one standard dataset to guide the development of...
The functions of cellular organelles and sub-compartments depend on their protein content, which can be characterized by spatial proteomics approaches. However, many spatial proteomics methods are limited in their ability to resolve organellar sub-compartments, profile multiple sub-compartments in parallel, and/or characterize membrane-associated p...
We present RawVegetable 2.0, a software tailored for assessing mass spectrometry data quality and fine-tuned for cross-linking mass spectrometry (XL-MS) applications. Building upon the capabilities of its predecessor, RawVegetable 2.0 introduces four main modules, each providing distinct and new functionalities: 1) Pair Finder, which identifies ion...
Abstract
Advancing data analysis tools for proteome-wide cross-linking mass spectrometry (XL-MS) requires ground-truth standards that mimic biological complexity. Here, we develop well-controlled XL-MS standards comprising hundreds of recombinant proteins that are systematically mixed for cross-linking. We use one standard dataset to guide the deve...
Upon activation, vinculin reinforces cytoskeletal anchorage during cell adhesion. Activating ligands classically disrupt intramolecular interactions between the vinculin head and tail domains that bind to actin filaments. Here, we show that Shigella IpaA triggers major allosteric changes in the head domain, leading to vinculin homo-oligomerization....
Complex protein mixtures typically generate many tandem mass spectra produced by different peptides coisolated in the gas phase. Widely adopted proteomic data analysis environments usually fail to identify most of these spectra, succeeding at best in identifying only one of the multiple cofragmenting peptides. We present PatternLab V (PLV), an upda...
Cross-linking mass spectrometry (XL-MS) is a universal tool for probing structural dynamics and protein-protein interactions in vitro and in vivo. Although cross-linked peptides are naturally less abundant than their unlinked counterparts, recent experimental advances improved cross-link identification by enriching the cross-linker-modified peptide...
Motivation:
There are several well-established paradigms for identifying and pinpointing discriminative peptides/proteins using shotgun proteomic data; examples are peptide-spectrum matching, de novo sequencing, open searches, and even hybrid approaches. Such an arsenal of complementary paradigms can provide deep data coverage, albeit some unident...
Cross-linking mass spectrometry (XL-MS) is a universal tool for probing structural dynamics and protein-protein interactions in vitro and in vivo. Although cross-linked peptides are naturally less abundant than their unlinked counterparts, recent experimental advances improved cross-link identification by enriching the cross-linker modified peptide...
Upon activation, vinculin reinforces cytoskeletal anchorage during cell adhesion. Activating ligands classically disrupt intramolecular interactions between the vinculin head and tail domain that binds to actin filaments. Here, we show that Shigella IpaA triggers major allosteric changes in the head domain leading to vinculin homo-oligomerization....
Motivation:
Confident deconvolution of proteomic spectra is critical for several applications such as de novo sequencing, cross-linking mass spectrometry, and handling chimeric mass spectra.
Results:
In general, all deconvolution algorithms may eventually report mass peaks that are not compatible with the chemical formula of any peptide. We show...
The specific functions of cellular organelles and sub-compartments depend on their protein content, which can be characterized by spatial proteomics approaches. However, many spatial proteomics methods are limited in their ability to resolve organellar sub-compartments, profile multiple sub-compartments in parallel, and/or characterize membrane-ass...
Shotgun proteomics aims to identify and quantify the thousands of proteins in complex mixtures such as cell and tissue lysates and biological fluids. This approach uses liquid chromatography coupled with tandem mass spectrometry and typically generates hundreds of thousands of mass spectra that require specialized computational environments for dat...
DM64 is a toxin-neutralizing serum glycoprotein isolated from Didelphis aurita, an ophiophagous marsupial naturally resistant to snake envenomation. This 64 kDa antitoxin targets myotoxic phospholipases A2, which account for most local tissue damage of viperid snakebites. We investigated the noncovalent complex formed between native DM64 and myo...
Motivation:
We present a new software-tool allowing an easy visualization of fragment ions and thus a rapid evaluation of key experimental parameters on the sequence coverage obtained for the MS/MS analysis of intact proteins. Our tool can process data obtained from various deconvolution and fragment assignment software.
Results:
We demonstrate...
Motivation
We present a new software-tool allowing an easy visualization of fragment ions and thus a rapid evaluation of key experimental parameters on the sequence coverage obtained for the MS/MS analysis of intact proteins. Our tool can deal with multiple fragmentation methods.
Results
We demonstrate that TDFragMapper can rapidly highlight the e...
In proteomics, the identification of peptides from mass spectral data can be mathematically described as the partitioning of mass spectra into clusters (i.e., groups of spectra derived from the same peptide). The way partitions are validated is just as important, having evolved side by side with the clustering algorithms themselves and given rise t...
Pathogen identification is crucial to confirm bacterial infections and guide antimicrobial therapy. Although MALDI-TOF mass spectrometry (MS) serves as a foundation for tools that enable rapid microbial identification, some bacteria remain challenging to identify. We recently showed that top-down proteomics (TDP) could be used to discriminate closely...
Software tools that allow the visualization and analysis of protein interaction networks are essential for studies in systems biology. One of the most popular network visualization tools in biology is Cytoscape, which offers a great selection of plug-ins for the interpretation of network data. Chemical cross-linking coupled to mass spectrometry (XL...
Motivation
Chemical cross-linking coupled to mass spectrometry (XLMS) emerged as a powerful technique for studying protein structures and large-scale protein-protein interactions. Nonetheless, XLMS lacks software tailored toward dealing with multiple conformers; this scenario can lead to high-quality identifications that are mutually exclusive. Thi...
Software tools that allow visualization and analysis of protein interaction networks are essential for studies in systems biology. One of the most popular network visualization tools in biology is Cytoscape, which offers a large selection of plugins for interpretation of protein interaction data. Chemical cross-linking coupled to mass spectrometry...
We present a high-performance app for Cytoscape to visualize cross-linking mass-spectrometry (XL-MS) data. XlinkCyNET is an open-source Java plugin that generates residue-to-residue connections provided by XL-MS in protein interaction networks. Importantly, it provides an interactive interface for the exploration of cross-links and offers various o...
Motivation
We present a high-performance software integrating shotgun with top-down proteomic data. The tool can deal with multiple experiments and search engines. It enables rapid and easy visualization, manual validation, and comparison of the identified proteoform sequences, including the characterization of post-translational modifications.
Results
We d...
The current technique used for microbial identification in hospitals is MALDI-TOF MS. However, it suffers from important limitations, in particular for closely related species or when the database used for the identification lacks the appropriate reference. In this work, we set up an LC-MS/MS top-down proteomics platform, which aims at discriminatin...
Here we present a new software-tool for visualizing fragment ions and sequence coverage of intact proteins in top-down mass spectrometry. TDFragMapper combines the data arising from multiple and diverse tandem mass spectrometry experiments of intact proteins. Our tool maps fragment ions onto the protein backbone sequence and allows for a rapid comp...
We present RawVegetable, a software for mass spectrometry data assessment and quality control tailored toward shotgun proteomics and cross-linking experiments. RawVegetable provides four main modules with distinct features: (A) The charge state chromatogram that independently displays the ion current for each charge state; useful for optimizing the...
Strigomonas culicis is a kinetoplastid parasite of insects that maintains a mutualistic association with an intracellular symbiotic bacterium, which is highly integrated into the protist metabolism: it furnishes essential compounds and divides in synchrony with the eukaryotic nucleus. The protist, conversely, can be cured of the endosymbiont, produ...
The Shigella effector IpaA co-opts the focal adhesion protein vinculin to promote bacterial invasion. Here, we show that IpaA triggers an unreported mode of vinculin activation through the cooperative binding of its three vinculin-binding sites (VBSs) leading to vinculin oligomerization via its D1 and D2 head subdomains and highly stable adhesions...
Motivation:
We present the first tool for unbiased quality control of top-down proteomics datasets. Our tool can select high-quality top-down proteomics spectra, serve as a gateway for building top-down spectral libraries and, ultimately, improve identification rates.
Results:
We demonstrate that a twofold rate increase for two E. coli top-down...
Here we present a high-performance software for proteome analysis that combines different mass spectrometric approaches, such as top-down for intact protein analyses and bottom-up for proteolytic fragment characterization. ProteoCombiner capitalizes on the data arising from different experiments and proteomics search engines and presents the resu...
We present a new module integrated into the widely adopted PatternLab for proteomics to enable analysis of isotope-labeled peptides produced using dimethyl or SILAC. The accurate quantitation of proteins lies within the heart of proteomics; dimethylation has been shown to be reliable, inexpensive, and applicable to any sample type. We validate our algor...
Disulfide bonds (SS) are post-translational modifications important for the proper folding and stabilization of many cellular proteins with therapeutic uses, including antibodies and other biologics. With budding advances of biologics and biosimilars, there is a mounting need for a robust method for accurate identification of SS. Even though severa...
The data presented herein is related to the article entitled “Trypanosoma cruzi immunoproteome: calpain-like CAP5.5 differentially detected throughout distinct stages of human Chagas disease cardiomyopathy” [1]. Electrophoretic analyses under denaturing and reducing conditions indicate that covalent immobilization of human IgG to Protein G magnetic...
Chagas disease, caused by the protozoan Trypanosoma cruzi, affects millions of people worldwide, especially in Latin America. Approximately 30% of the cases evolve to the chronic symptomatic stage due to cardiac and/or digestive damage, generally accompanied by nervous system impairment. Given the higher frequency and severity of clinical manifesta...
Cross-linking/Mass spectrometry (XLMS) is a consolidated technique for structural characterization of proteins and protein complexes. Despite its success, the cross-linking chemistry currently used is mostly based on N-hydroxysuccinimide (NHS) esters, which react primarily with lysine residues. One way to expand the current applicability of XLMS int...
Cross-linking coupled with mass spectrometry (XL-MS) has emerged as a powerful strategy for the identification of protein–protein interactions, characterization of interaction regions, and obtainment of structural information on proteins and protein complexes. In XL-MS, proteins or complexes are covalently stabilized with cross-linkers and digested...
A system, method, computer readable medium and device for identifying discriminant spectrum clusters including receiving known input data set comprising spectra generated from biological samples known to either have or not have a biological condition where each spectrum may be either known to have been generated from the biological samples known to...
Top-down proteomics is an emerging technology based on the analysis of intact proteins using high-resolution mass spectrometry (MS). This chapter gives a brief example, showing that top-down proteomics is capable of going beyond matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) for the accurate characterization of bacterial patho...
Motivation: Around 75% of all mass spectra remain unidentified by widely adopted proteomic strategies. We present DiagnoProt, an integrated computational environment that can efficiently cluster millions of spectra and use machine learning to shortlist high-quality unidentified mass spectra that are discriminative of different biological conditions...
Apolipoprotein (apo)A-I mediates many of the anti-atherogenic functions attributed to high density lipoprotein (HDL). Unfortunately, efforts toward a high-resolution structure of full-length apoA-I have not been fruitful, though there have been successes with deletion mutants. Recently, a C-terminal truncation (apoA-IΔ185-243) was crystallized as a...
PatternLab for proteomics is an integrated computational environment that unifies several previously published modules for the analysis of shotgun proteomic data. The contained modules allow for formatting of sequence databases, peptide spectrum matching, statistical filtering and data organization, extracting quantitative information from label-fr...
Background:
Chromobacterium violaceum is a bacterium commonly found in tropical and subtropical regions and is associated with important pharmacological and industrial attributes such as producing substances with therapeutic properties and synthesizing biodegradable polymers. Its genome was sequenced; however, approximately 40% of its genes still...
Peptide Spectrum Matching (PSM) is the current gold standard for protein identification by mass spectrometry-based proteomics. PSM compares experimental mass spectra against theoretical spectra generated from a protein sequence database to perform identification, but protein sequences not present in a database cannot be identified unless their seq...
Accessing localized proteomic profiles has emerged as a fundamental strategy to understand the biology of diseases, as recently demonstrated, for example, in the context of determining cancer resection margins with improved precision. Here, we analyze a gastric cancer biopsy sectioned into 10 parts, each one subjected to MudPIT analysis. We introdu...
Sarcopenia describes an age-related decline in skeletal muscle mass, strength, and function that ultimately impairs metabolism, leads to poor balance, frequent falling, limited mobility, and a reduction in quality of life. Here we investigate the pathogenesis of sarcopenia through a proteomic shotgun approach. Briefly, we employed tandem mass tags...
Peptide sequence matching algorithms used for peptide identification by tandem mass spectrometry (MS/MS) enumerate theoretical peptides from the database, predict their fragment ions, and match them to the experimental MS/MS spectra. Here, we present an approach for scoring MS/MS identifications based on the high mass accuracy matching of precursor...
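The generic workflow named in the abstract above (predict fragment ions for a candidate peptide, then match them to an experimental MS/MS spectrum) can be illustrated with a minimal sketch. This is not the scoring approach of the publication, whose details are truncated here. It is only a textbook-style example that assumes singly charged b and y ions, unmodified residues, standard monoisotopic masses, and a hypothetical 20 ppm fragment tolerance. The function names `fragment_ions` and `count_matches` are illustrative.

```python
from bisect import bisect_left

# Monoisotopic residue masses in Da (standard values, unmodified residues).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_ions(peptide):
    """Return singly charged b ions followed by y ions (m/z) for a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    prefix = 0.0
    for m in masses[:-1]:            # b1 .. b(n-1): N-terminal prefixes
        prefix += m
        ions.append(prefix + PROTON)
    suffix = 0.0
    for m in reversed(masses[1:]):   # y1 .. y(n-1): C-terminal suffixes
        suffix += m
        ions.append(suffix + WATER + PROTON)
    return ions


def count_matches(theoretical, observed_mz, ppm_tol=20.0):
    """Count theoretical ions with an observed peak within ppm_tol (assumed value)."""
    observed = sorted(observed_mz)
    matched = 0
    for mz in theoretical:
        tol = mz * ppm_tol * 1e-6
        i = bisect_left(observed, mz - tol)
        if i < len(observed) and observed[i] <= mz + tol:
            matched += 1
    return matched


if __name__ == "__main__":
    ions = fragment_ions("PEPTIDE")
    print(len(ions), count_matches(ions, [98.0601, 148.0604, 500.0]))
```

A real search engine would additionally handle multiple charge states, modifications, peak intensities, and a statistical score; the matched-ion count here is only the simplest possible starting point.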