Diogo Borges Lima

EXACT Sciences Corporation · Proteomics

PhD

About

55
Publications
4,834
Reads
780
Citations
Introduction
I'm a Research Scientist interested in Computational Proteomics, Cross-linking MS and Artificial Intelligence.
Additional affiliations
August 2020 - July 2024
Leibniz-ForschungsInstitut für Molekulare Pharmakologie
Position
  • Post-doc in Computational XL-MS
Description
  • I've developed an algorithm for identifying and quantifying cross-linked peptides using cleavable cross-linkers.
January 2020 - August 2020
CeMM Research Center for Molecular Medicine
Position
  • Analyst
Description
  • I worked on the development of new strategies to target undruggable proteins for the treatment of cancer.
February 2016 - November 2019
Institut Pasteur
Position
  • PostDoc Position
Description
  • I worked on the development of algorithms for analyzing top-down proteomics data.
Education
March 2013 - January 2016
Oswaldo Cruz Foundation
Field of study
  • Bioinformatics in Proteomics
March 2011 - February 2013
Federal University of Rio de Janeiro
Field of study
  • Artificial Intelligence
August 2006 - February 2011
Federal University of Rio de Janeiro
Field of study
  • Computer Science

Publications

Article
Full-text available
Protein identification by mass spectrometry is commonly accomplished using a peptide sequence matching search algorithm, whose sensitivity varies inversely with the size of the sequence database and the number of post-translational modifications considered. We present the Spectrum Identification Machine, a peptide sequence matching tool that capita...
Article
Full-text available
Advancing data analysis tools for proteome-wide cross-linking mass spectrometry (XL-MS) requires ground-truth standards that mimic biological complexity. Here we develop well-controlled XL-MS standards comprising hundreds of recombinant proteins that are systematically mixed for cross-linking. We use one standard dataset to guide the development of...
Article
Full-text available
The functions of cellular organelles and sub-compartments depend on their protein content, which can be characterized by spatial proteomics approaches. However, many spatial proteomics methods are limited in their ability to resolve organellar sub-compartments, profile multiple sub-compartments in parallel, and/or characterize membrane-associated p...
Article
We present RawVegetable 2.0, a software tailored for assessing mass spectrometry data quality and fine-tuned for cross-linking mass spectrometry (XL-MS) applications. Building upon the capabilities of its predecessor, RawVegetable 2.0 introduces four main modules, each providing distinct and new functionalities: 1) Pair Finder, which identifies ion...
Preprint
Abstract: Advancing data analysis tools for proteome-wide cross-linking mass spectrometry (XL-MS) requires ground-truth standards that mimic biological complexity. Here, we develop well-controlled XL-MS standards comprising hundreds of recombinant proteins that are systematically mixed for cross-linking. We use one standard dataset to guide the deve...
Article
Upon activation, vinculin reinforces cytoskeletal anchorage during cell adhesion. Activating ligands classically disrupt intramolecular interactions between the vinculin head and tail domains that bind to actin filaments. Here, we show that Shigella IpaA triggers major allosteric changes in the head domain, leading to vinculin homo-oligomerization....
Article
Complex protein mixtures typically generate many tandem mass spectra produced by different peptides coisolated in the gas phase. Widely adopted proteomic data analysis environments usually fail to identify most of these spectra, succeeding at best in identifying only one of the multiple cofragmenting peptides. We present PatternLab V (PLV), an upda...
Article
Full-text available
Cross-linking mass spectrometry (XL-MS) is a universal tool for probing structural dynamics and protein-protein interactions in vitro and in vivo. Although cross-linked peptides are naturally less abundant than their unlinked counterparts, recent experimental advances improved cross-link identification by enriching the cross-linker-modified peptide...
Article
Motivation: There are several well-established paradigms for identifying and pinpointing discriminative peptides/proteins using shotgun proteomic data; examples are peptide-spectrum matching, de novo sequencing, open searches, and even hybrid approaches. Such an arsenal of complementary paradigms can provide deep data coverage, albeit some unident...
Preprint
Cross-linking mass spectrometry (XL-MS) is a universal tool for probing structural dynamics and protein-protein interactions in vitro and in vivo. Although cross-linked peptides are naturally less abundant than their unlinked counterparts, recent experimental advances improved cross-link identification by enriching the cross-linker modified peptide...
Preprint
Upon activation, vinculin reinforces cytoskeletal anchorage during cell adhesion. Activating ligands classically disrupt intramolecular interactions between the vinculin head and tail domain that binds to actin filaments. Here, we show that Shigella IpaA triggers major allosteric changes in the head domain leading to vinculin homo-oligomerization....
Article
Motivation: Confident deconvolution of proteomic spectra is critical for several applications such as de novo sequencing, cross-linking mass spectrometry, and handling chimeric mass spectra. Results: In general, all deconvolution algorithms may eventually report mass peaks that are not compatible with the chemical formula of any peptide. We show...
Preprint
The specific functions of cellular organelles and sub-compartments depend on their protein content, which can be characterized by spatial proteomics approaches. However, many spatial proteomics methods are limited in their ability to resolve organellar sub-compartments, profile multiple sub-compartments in parallel, and/or characterize membrane-ass...
Article
Shotgun proteomics aims to identify and quantify the thousands of proteins in complex mixtures such as cell and tissue lysates and biological fluids. This approach uses liquid chromatography coupled with tandem mass spectrometry and typically generates hundreds of thousands of mass spectra that require specialized computational environments for dat...
Article
Full-text available
DM64 is a toxin-neutralizing serum glycoprotein isolated from Didelphis aurita, an ophiophagous marsupial naturally resistant to snake envenomation. This 64 kDa antitoxin targets myotoxic phospholipases A2, which account for most local tissue damage of viperid snakebites. We investigated the noncovalent complex formed between native DM64 and myo...
Article
Full-text available
Motivation: We present a new software-tool allowing an easy visualization of fragment ions and thus a rapid evaluation of key experimental parameters on the sequence coverage obtained for the MS/MS analysis of intact proteins. Our tool can process data obtained from various deconvolution and fragment assignment software. Results: We demonstrate...
Preprint
Full-text available
Motivation: We present a new software-tool allowing an easy visualization of fragment ions and thus a rapid evaluation of key experimental parameters on the sequence coverage obtained for the MS/MS analysis of intact proteins. Our tool can deal with multiple fragmentation methods. Results: We demonstrate that TDFragMapper can rapidly highlight the e...
Article
In proteomics, the identification of peptides from mass spectral data can be mathematically described as the partitioning of mass spectra into clusters (i.e., groups of spectra derived from the same peptide). The way partitions are validated is just as important, having evolved side by side with the clustering algorithms themselves and given rise t...
Article
Full-text available
Pathogen identification is crucial to confirm bacterial infections and guide antimicrobial therapy. Although MALDI-TOF mass spectrometry (MS) serves as a foundation for tools that enable rapid microbial identification, some bacteria remain challenging to identify. We recently showed that top-down proteomics (TDP) could be used to discriminate closely...
Article
Software tools that allow the visualization and analysis of protein interaction networks are essential for studies in systems biology. One of the most popular network visualization tools in biology is Cytoscape, which offers a great selection of plug-ins for the interpretation of network data. Chemical cross-linking coupled to mass spectrometry (XL...
Article
Full-text available
Motivation: Chemical cross-linking coupled to mass spectrometry (XLMS) emerged as a powerful technique for studying protein structures and large-scale protein-protein interactions. Nonetheless, XLMS lacks software tailored toward dealing with multiple conformers; this scenario can lead to high-quality identifications that are mutually exclusive. Thi...
Preprint
Software tools that allow visualization and analysis of protein interaction networks are essential for studies in systems biology. One of the most popular network visualization tools in biology is Cytoscape, which offers a large selection of plugins for interpretation of protein interaction data. Chemical cross-linking coupled to mass spectrometry...
Article
Full-text available
We present a high-performance app for Cytoscape to visualize cross-linking mass-spectrometry (XL-MS) data. XlinkCyNET is an open-source Java plugin that generates residue-to-residue connections provided by XL-MS in protein interaction networks. Importantly, it provides an interactive interface for the exploration of cross-links and offers various o...
Article
Full-text available
Motivation: We present a high-performance software integrating shotgun with top-down proteomic data. The tool can deal with multiple experiments and search engines. It enables rapid and easy visualization, manual validation, and comparison of the identified proteoform sequences, including the post-translational modification characterization. Results: We d...
Article
Full-text available
The current technique used for microbial identification in hospitals is MALDI-TOF MS. However, it suffers from important limitations, in particular for closely-related species or when the database used for the identification lacks the appropriate reference. In this work, we set up a LC-MS/MS top-down proteomics platform, which aims at discriminatin...
Preprint
Full-text available
Here we present a new software-tool for visualizing fragment ions and sequence coverage of intact proteins in top-down mass spectrometry. TDFragMapper combines the data arising from multiple and diverse tandem mass spectrometry experiments of intact proteins. Our tool maps fragment ions onto the protein backbone sequence and allows for a rapid comp...
Article
We present RawVegetable, a software for mass spectrometry data assessment and quality control tailored toward shotgun proteomics and cross-linking experiments. RawVegetable provides four main modules with distinct features: (A) The charge state chromatogram that independently displays the ion current for each charge state; useful for optimizing the...
Article
Strigomonas culicis is a kinetoplastid parasite of insects that maintains a mutualistic association with an intracellular symbiotic bacterium, which is highly integrated into the protist metabolism: it furnishes essential compounds and divides in synchrony with the eukaryotic nucleus. The protist, conversely, can be cured of the endosymbiont, produ...
Preprint
Full-text available
The Shigella effector IpaA co-opts the focal adhesion protein vinculin to promote bacterial invasion. Here, we show that IpaA triggers an unreported mode of vinculin activation through the cooperative binding of its three vinculin-binding sites (VBSs) leading to vinculin oligomerization via its D1 and D2 head subdomains and highly stable adhesions...
Article
Motivation: We present the first tool for unbiased quality control of top-down proteomics datasets. Our tool can select high-quality top-down proteomics spectra, serve as a gateway for building top-down spectral libraries and, ultimately, improve identification rates. Results: We demonstrate that a twofold rate increase for two E. coli top-down...
Preprint
Full-text available
Here we present a high-performance software for proteome analysis that combines different mass spectrometric approaches, such as, top-down for intact protein analyses and bottom-up, for proteolytic fragment characterization. ProteoCombiner capitalizes on the data arising from different experiments and proteomics search engines and presents the resu...
Article
We present a new module integrated into the widely adopted PatternLab for proteomics to enable analysis of isotope-labeled peptides produced using dimethyl or SILAC. The accurate quantitation of proteins lies within the heart of proteomics; dimethylation has been shown to be reliable, inexpensive, and applicable to any sample type. We validate our algor...
Article
Disulfide bonds (SS) are post-translational modifications important for the proper folding and stabilization of many cellular proteins with therapeutic uses, including antibodies and other biologics. With budding advances of biologics and biosimilars, there is a mounting need for a robust method for accurate identification of SS. Even though severa...
Article
Full-text available
The data presented herein is related to the article entitled “Trypanosoma cruzi immunoproteome: calpain-like CAP5.5 differentially detected throughout distinct stages of human Chagas disease cardiomyopathy” [1]. Electrophoretic analyses under denaturing and reducing conditions indicate that covalent immobilization of human IgG to Protein G magnetic...
Article
Chagas disease, caused by the protozoan Trypanosoma cruzi, affects millions of people worldwide, especially in Latin America. Approximately 30% of the cases evolve to the chronic symptomatic stage due to cardiac and/or digestive damage, generally accompanied by nervous system impairment. Given the higher frequency and severity of clinical manifesta...
Article
Cross-linking/Mass spectrometry (XLMS) is a consolidated technique for structural characterization of proteins and protein complexes. Despite its success, the cross-linking chemistry currently used is mostly based on N-hydroxysuccinimide (NHS) esters, which react primarily with lysine residues. One way to expand the current applicability of XLMS int...
Article
Full-text available
Cross-linking coupled with mass spectrometry (XL-MS) has emerged as a powerful strategy for the identification of protein–protein interactions, characterization of interaction regions, and obtainment of structural information on proteins and protein complexes. In XL-MS, proteins or complexes are covalently stabilized with cross-linkers and digested...
Patent
A system, method, computer readable medium and device for identifying discriminant spectrum clusters including receiving known input data set comprising spectra generated from biological samples known to either have or not have a biological condition where each spectrum may be either known to have been generated from the biological samples known to...
Chapter
Top-down proteomics is an emerging technology based on the analysis of intact proteins using high-resolution mass spectrometry (MS). This chapter gives a brief example, showing that top-down proteomics is capable of going beyond matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) for the accurate characterization of bacterial patho...
Article
Motivation: Around 75% of all mass spectra remain unidentified by widely adopted proteomic strategies. We present DiagnoProt, an integrated computational environment that can efficiently cluster millions of spectra and use machine learning to shortlist high-quality unidentified mass spectra that are discriminative of different biological conditions...
Article
Full-text available
Apolipoprotein (apo)A-I mediates many of the anti-atherogenic functions attributed to high density lipoprotein (HDL). Unfortunately, efforts toward a high-resolution structure of full-length apoA-I have not been fruitful, though there have been successes with deletion mutants. Recently, a C-terminal truncation (apoA-IΔ185-243) was crystallized as a...
Article
PatternLab for proteomics is an integrated computational environment that unifies several previously published modules for the analysis of shotgun proteomic data. The contained modules allow for formatting of sequence databases, peptide spectrum matching, statistical filtering and data organization, extracting quantitative information from label-fr...
Article
Full-text available
Background: Chromobacterium violaceum is a bacterium commonly found in tropical and subtropical regions and is associated with important pharmacological and industrial attributes such as producing substances with therapeutic properties and synthesizing biodegradable polymers. Its genome was sequenced; however, approximately 40% of its genes still...
Article
Peptide Spectrum Matching (PSM) is the current gold standard for protein identification by mass spectrometry-based proteomics. PSM compares experimental mass spectra against theoretical spectra generated from a protein sequence database to perform identification, but protein sequences not present in a database cannot be identified unless their seq...
Article
Accessing localized proteomic profiles has emerged as a fundamental strategy to understand the biology of diseases, as recently demonstrated, for example, in the context of determining cancer resection margins with improved precision. Here, we analyze a gastric cancer biopsy sectioned into 10 parts, each one subjected to MudPIT analysis. We introdu...
Article
Sarcopenia describes an age-related decline in skeletal muscle mass, strength, and function that ultimately impairs metabolism, leads to poor balance, frequent falling, limited mobility, and a reduction in quality of life. Here we investigate the pathogenesis of sarcopenia through a proteomic shotgun approach. Briefly, we employed tandem mass tags...
Article
Peptide sequence matching algorithms used for peptide identification by tandem mass spectrometry (MS/MS) enumerate theoretical peptides from the database, predict their fragment ions, and match them to the experimental MS/MS spectra. Here, we present an approach for scoring MS/MS identifications based on the high mass accuracy matching of precursor...
