84
Publications
9,689
Reads
3,917
Citations
Publications (84)
Recent advances in molecular modeling using deep learning can revolutionize our understanding of dynamic protein structures. NMR is particularly well-suited for determining dynamic features of biomolecular structures. The conventional process for determining biomolecular structures from experimental NMR data involves its representation as conformat...
Recent advances in molecular modeling of protein structures are changing the field of structural biology. AlphaFold-2 (AF2), an AI system developed by DeepMind, Inc., utilizes attention-based deep learning to predict models of protein structures with high accuracy relative to structures determined by X-ray crystallography and cryo-electron microsco...
NMR is a valuable experimental tool in the structural biologist’s toolkit to elucidate the structures, functions, and motions of biomolecules. The progress of machine learning, particularly in structural biology, reveals the critical importance of large, diverse, and reliable datasets in developing new methods and understanding in structural biolog...
Recent advances in molecular modeling using deep learning have the potential to revolutionize the field of structural biology. In particular, AlphaFold has been observed to provide models of protein structures with accuracies rivaling medium-resolution X-ray crystal structures, and with excellent atomic coordinate matches to experimental protein NM...
Recent advances in molecular modeling using deep learning have the potential to revolutionize the field of structural biology. In particular, AlphaFold has been observed to provide models of protein structures with accuracy rivaling medium-resolution X-ray crystal structures, and with excellent atomic coordinate matches to experimental protein NMR...
NMR studies can provide unique information about protein conformations in solution. In CASP14, three reference structures provided by solution NMR methods were available (T1027, T1029, and T1055), as well as a fourth data set of NMR-derived contacts for an integral membrane protein (T1088). For the three targets with NMR-based structures, the best...
NMR studies can provide unique information about protein conformations in solution. In CASP14, three reference structures provided by solution NMR methods were available (T1027, T1029, and T1055), as well as a fourth data set of NMR-derived contacts for an integral membrane protein (T1088). For the three targets with NMR-based structures, the best p...
CASP13 has investigated the impact of sparse NMR data on the accuracy of protein structure prediction. NOESY and 15N‐1H residual dipolar coupling data, typical of that obtained for 15N,13C‐enriched, perdeuterated proteins up to about 40 kDa, were simulated for 11 CASP13 targets ranging in size from 80 to 326 residues. For several targets, two predi...
Accurate protein structure determination by solution-state NMR is challenging for proteins greater than about 20 kDa, for which extensive perdeuteration is generally required, providing experimental data that are incomplete (sparse) and ambiguous. However, the massive increase in evolutionary sequence information coupled with advances in methods fo...
Cell surface molecules are important for development and function of multicellular organisms. Although several methods are available to identify ligand–receptor pairs, ELISA-based methods are particularly amenable to high-throughput screens. ELISA-based methods have high sensitivity and low false-positive rates for detecting protein–protein interac...
While 3D structure determination of small (<15 kDa) proteins by solution NMR is largely automated and routine, structural analysis of larger proteins is more challenging. An emerging hybrid strategy for modeling protein structures combines sparse NMR data that can be obtained for larger proteins with sequence co-variation data, called evolutionary...
RAS binding is a critical step in the activation of BRAF protein serine/threonine kinase and stimulation of the mitogen-activated protein kinase signaling pathway. Mutations in both RAS and BRAF are associated with many human cancers. Here, we report the solution nuclear magnetic resonance (NMR) and X-ray crystal structures of the RAS-binding domai...
Accurate determination of protein structure by NMR spectroscopy is challenging for larger proteins, for which experimental data are often incomplete and ambiguous. Evolutionary sequence information together with advances in maximum entropy statistical methods provide a rich complementary source of structural constraints. We have developed a hybrid...
ASDP is an automated NMR NOE assignment program. It uses a distinct bottom-up topology-constrained network anchoring approach for NOE interpretation, with 2D, 3D and/or 4D NOESY peak lists and resonance assignments as input, and generates unambiguous NOE constraints for iterative structure calculations. ASDP is designed to function interactively wi...
Intrinsically disordered or unstructured regions in proteins are both common and biologically important, particularly in regulation, signaling, and modulating intermolecular recognition processes. From a practical point of view, however, such disordered regions often can pose significant challenges for crystallization. Disordered regions are also d...
High-quality solution NMR structures of immunoglobulin-like domains 7 and 12 from human obscurin-like protein 1 were solved. The two domains share 30% sequence identity and their structures are, as expected, rather similar. The new structures contribute to structural coverage of human cancer associated proteins. Mutations of Arg 812 in domain 7 ca...
High-quality solution NMR structures of three homeodomains from human proteins ALX4, ZHX1 and CASP8AP2 were solved. These domains were chosen as targets of a biomedical theme project pursued by the Northeast Structural Genomics Consortium. This project focuses on increasing the structural coverage of human proteins associated with cancer.
The 500 kDa protein plectin is essential for the cytoskeletal organization of most mammalian cells and it is up-regulated in some types of cancer. Here, we report nearly complete sequence-specific polypeptide backbone, 13Cβ and methyl group resonance assignments for 24 kDa human plectin(4403-4606) containing the C-terminal plectin repeat domain...
For the 10th experiment on Critical Assessment of the techniques of protein Structure Prediction (CASP) the prediction target proteins were broken into independent evaluation units (EUs), which were then classified into template-based modeling (TBM) or free modeling (FM) categories. We describe here how the EUs were defined and classified, what iss...
Template Based Modeling (TBM) is a major component of the Critical Assessment of Protein Structure Prediction (CASP). In CASP10, some 41,740 predicted models submitted by 150 predictor groups were assessed as TBM predictions. The accuracy of protein structure prediction was assessed by geometric comparison with experimental X-ray crystal and NMR st...
Maximizing the scientific impact of NMR-based structure determination requires robust and statistically sound methods for assessing the precision of NMR-derived structures. In particular, a method to define a core atom set for calculating superimpositions and validating structure predictions is critical to the use of NMR-derived structures as targe...
The bacteriophage λ Q protein is a transcription antitermination factor that controls expression of the phage late genes as a stable component of the transcription elongation complex. To join the elongation complex, λQ binds a specific DNA sequence element and interacts with RNA polymerase that is paused during early elongation. λQ binds to the pau...
SecA is an intensively studied mechanoenzyme that uses ATP hydrolysis to drive processive extrusion of secreted proteins through a protein-conducting channel in the cytoplasmic membrane of eubacteria. The ATPase motor of SecA is strongly homologous to that in DEAD-box RNA helicases. It remains unclear how local chemical events in its ATPase active...
The ribosome consists of small and large subunits each composed of dozens of proteins and RNA molecules. However, the functions of many of the individual protomers within the ribosome are still unknown. In this article, we describe the solution NMR structure of the ribosomal protein RP-L35Ae from the archaeon Pyrococcus furiosus. RP-L35Ae is buried...
Despite the passage of ∼30 years since the complete primary sequence of the intermediate filament (IF) protein vimentin was reported, the structure remains unknown for both an individual protomer and the assembled filament. In this report, we present data describing the structure of vimentin linker 1 (L1) and rod 1B. Electron paramagnetic resonance...
We describe the RPF web server, a quality assessment tool for protein NMR structures. The RPF server measures the ‘goodness-of-fit’ of the 3D structure with NMR chemical shift and unassigned NOESY data, and calculates a discrimination power (DP) score, which estimates the differences between the fits of the query structures and random coil structur...
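The abstract above is truncated, so the following is only a minimal sketch of the general idea, not the RPF server's implementation. It assumes that "goodness-of-fit" is expressed as recall, precision, and their F-measure between proton-pair contacts implied by NOESY peaks and contacts implied by a structure, and that the DP score rescales the query F-measure against a random-coil baseline. The function names, the contact representation, and the normalization by `1 - f_random` are illustrative assumptions.

```python
def rpf_scores(observed, predicted):
    """Return (recall, precision, F-measure) for two collections of contacts.

    `observed`  : contacts consistent with the NOESY data (hypothetical input)
    `predicted` : short proton-proton contacts found in a 3D model (hypothetical input)
    """
    observed, predicted = set(observed), set(predicted)
    shared = len(observed & predicted)
    recall = shared / len(observed) if observed else 0.0
    precision = shared / len(predicted) if predicted else 0.0
    total = recall + precision
    f_measure = 2.0 * recall * precision / total if total else 0.0
    return recall, precision, f_measure


def discrimination_power(f_query, f_random):
    """Rescale the query fit so a random-coil fit maps to 0 and a perfect fit to 1.

    Assumed normalization; the published score may use a different upper bound.
    """
    if f_random >= 1.0:
        raise ValueError("random-coil F-measure must be below 1")
    return (f_query - f_random) / (1.0 - f_random)


# Toy data: contacts are labelled pairs of proton names.
noesy_contacts = {("A1", "B2"), ("A1", "C3"), ("B2", "D4"), ("C3", "D4")}
query_contacts = {("A1", "B2"), ("A1", "C3"), ("B2", "D4"), ("C3", "E5")}
coil_contacts = {("A1", "B2"), ("A1", "E5"), ("B2", "E5"), ("D4", "E5")}

_, _, f_query = rpf_scores(noesy_contacts, query_contacts)
_, _, f_coil = rpf_scores(noesy_contacts, coil_contacts)
dp = discrimination_power(f_query, f_coil)
```

Under these assumptions, a DP near zero means the query model explains the data no better than a random coil, while values approaching one indicate a model that accounts for most of the observed contacts.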
The protocols currently used for protein structure determination by nuclear magnetic resonance (NMR) depend on the determination of a large number of upper distance limits for proton-proton pairs. Typically, this task is performed manually by an experienced researcher rather than automatically by using a specific computer program. To assess whether...
Large-scale initiatives for obtaining spatial protein structures by experimental or computational means have accentuated the need for the critical assessment of protein structure determination and prediction methods. These include blind test projects such as the critical assessment of protein structure prediction (CASP) and the critical assessment...
In this chapter, we concentrate on the production of high-quality protein samples for nuclear magnetic resonance (NMR) studies. In particular, we provide an in-depth description of recent advances in the production of NMR samples and their synergistic use with recent advancements in NMR hardware. We describe the protein production platform of the N...
Human retinoblastoma binding protein 9 (RBBP9) is an interacting partner of the retinoblastoma susceptibility protein (Rb). RBBP9 is a tumor-associated protein required for pancreatic neoplasia, affects cell cycle control, and is involved in the TGF-β signalling pathway. Sequence analysis suggests that RBBP9 belongs to the α/β hydrolase superfamily...
UNC119 is widely expressed among vertebrates and other phyla. We found that UNC119 recognized the acylated N terminus of the rod photoreceptor transducin α (Tα) subunit and Caenorhabditis elegans G proteins ODR-3 and GPA-13. The crystal structure of human UNC119 at 1.95-Å resolution revealed an immunoglobulin-like β-sandwich fold. Pulldowns and iso...
We describe the core Protein Production Platform of the Northeast Structural Genomics Consortium (NESG) and outline the strategies used for producing high-quality protein samples. The platform is centered on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems. The 6X-His tag allows for similar...
The AT-rich interactive domain (ARID) of human AT-rich interactive domain-containing protein 3A (ARID3A) has been selected for structural characterization by Northeast Structural Genomics Consortium (residues 218-351 NESG ID HR4394C) as part of our Human Cancer Protein Interaction Network (HCPIN) project. Protein ARID3A belongs to the ARID family D...
Conventional NMR structure determination requires nearly complete assignment of the cross peaks of a refined NOESY peak list. Depending on the size of the protein and quality of the spectral data, this can be a time-consuming manual process requiring several rounds of peak list refinement and structure determination. Programs such as Aria, CYANA, a...
As part of efforts to develop improved methods for NMR protein sample preparation and structure determination, the Northeast Structural Genomics Consortium (NESG) has implemented an NMR screening pipeline for protein target selection, construct optimization, and buffer optimization, incorporating efficient microscale NMR screening of proteins using...
NMR spectroscopy is currently the only technique for determining the solution structure of biological macromolecules (doi:10.1038/nmeth0909-625). This typically requires both the assignment of resonances and a labor-intensive analysis of multidimensional nuclear Overhauser effect spectroscopy (NOESY) spectra, in which peaks are matched to assigned...
Disordered or unstructured regions of proteins, while often very important biologically, can pose significant challenges for resonance assignment and three-dimensional structure determination of the ordered regions of proteins by NMR methods. In this article, we demonstrate the application of 1H/2H exchange mass spectrometry (DXMS) for the rapi...
For cell regulation, E2-like ubiquitin-fold modifier conjugating enzyme 1 (Ufc1) is involved in the transfer of ubiquitin-fold modifier 1 (Ufm1), a ubiquitin-like protein which is activated by E1-like enzyme Uba5, to various target proteins. Thereby, Ufc1 participates in the very recently discovered Ufm1-Uba5-Ufc1 ubiquitination pathway which is foun...
We describe the proceedings and conclusions from the "Workshop on Applications of Protein Models in Biomedical Research" (the Workshop) that was held at the University of California, San Francisco on 11 and 12 July, 2008. At the Workshop, international scientists involved with structure modeling explored (i) how models are currently used in biomedi...
As a step towards better integrating protein three-dimensional (3D) structural information in cancer systems biology, the Northeast Structural Genomics Consortium (NESG) (www.nesg.org) has constructed a Human Cancer Pathway Protein Interaction Network (HCPIN) by analysis of several classical cancer-associated signaling pathways and their physical p...
The solution structure of protein AF2095 from the thermophilic archaeon Archaeoglobus fulgidus, a 123-residue (13.6-kDa) protein, has been determined by NMR methods. The structure of AF2095 comprises four α-helices and a mixed β-sheet consisting of four parallel and anti-parallel β-strands, where the α-helices sandwich the β-sheet. Sequence and...
Escherichia coli Spr is a membrane-anchored cell wall hydrolase. The solution NMR structure of the C-terminal NlpC/P60 domain of E. coli Spr described here reveals that the protein adopts a papain-like α+β fold and identifies a substrate-binding cleft featuring several highly conserved residues. The active site features a novel Cys-His-His c...
Structural genomics provides an important approach for characterizing and understanding systems biology. As a step toward better integrating protein three-dimensional (3D) structural information in cancer systems biology, we have constructed a Human Cancer Pathway Protein Interaction Network (HCPIN) by analysis of several classical cancer-associate...
The ribosomal protein S17E from the archaeon Methanobacterium thermoautotrophicum is a component of the 30S ribosomal subunit. S17E is a 62-residue protein conserved in archaea and eukaryotes and has no counterparts in bacteria. Mammalian S17E is a phosphoprotein component of eukaryotic ribosomes. Archaeal S17E proteins range from 59 to 79 amino ac...
Pathways are integral to systems biology. Their classical representation has proven useful but is inconsistent in the meaning assigned to each arrow (or edge) and inadvertently implies the isolation of one pathway from another. Conversely, modern high-throughput (HTP) experiments offer standardized networks that facilitate topological calculations....
Tropomyosin is a coiled-coil protein that binds head-to-tail along the length of actin filaments in eukaryotic cells, stabilizing them and providing protection from severing proteins. Tropomyosin cooperatively regulates actin's interaction with myosin and mediates the Ca2+-dependent regulation of contraction by troponin in striated muscles. The N-...
Protein ytfP from Escherichia coli (Swiss-Prot ID: YTFP-ECOLI; NESG target ID: ER111; Wunderlich et al., 2004) is a 113-residue member of the UPF0131 protein family (Pfam ID: PF03674) of unknown function. This domain family is found in organisms from all three kingdoms, archaea, eubacteria and eukaryotes. Using triple resonance NMR techniques, we h...
Static pictures of protein structures are so prevalent that it is easy to forget they are dynamic molecular machines. Characterizing their intrinsic motions may be necessary to understand how they work.
One of the most important challenges in modern protein NMR is the development of fast and sensitive structure quality assessment measures that can be used to evaluate the "goodness-of-fit" of the 3D structure with NOESY data, to indicate the correctness of the fold and accuracy of the resulting structure. Quality assessment is especially critical f...
This article formulates the multidimensional nuclear Overhauser effect spectroscopy (NOESY) interpretation problem using graph theory and presents a novel, bottom-up, topology-constrained distance network analysis algorithm for NOESY cross peak interpretation using assigned resonances. AutoStructure is a software suite that implements this topology...
Recent developments provide automated analysis of NMR assignments and three-dimensional (3D) structures of proteins. These approaches are generally applicable to proteins ranging from about 50 to 150 amino acids. In this chapter, we summarize progress by the Northeast Structural Genomics Consortium in standardizing the NMR data collection process f...
Recent developments provide automated analysis of NMR assignments and 3D structures. These approaches are generally applicable to proteins ranging from about 50 to 150 amino acids. To date, little work has focused on the specific problems associated with nucleic acid structures. As a result, there has been no community-wide consensus reached on how...
The structure of Drosophila LC8 pH-induced monomer has been determined by NMR spectroscopy using the program AutoStructure. The structure at pH 3 and 30 °C is similar to the individual subunits of mammalian LC8 dimer with the exception that a β-strand, which crosses between monomers to form an intersubunit β-sheet in the dimer, is a fl...
We report NMR assignments and solution structure of the 71-residue 30S ribosomal protein S28E from the archaeon Pyrococcus horikoshii, target JR19 of the Northeast Structural Genomics Consortium. The structure, determined rapidly with the aid of automated backbone resonance assignment (AutoAssign) and automated structure determination (AutoStructur...
The antibacterial peptide microcin J25 (MccJ25) inhibits bacterial transcription by binding within, and obstructing, the nucleotide-uptake channel of bacterial RNA polymerase. Published covalent and three-dimensional structures indicate that MccJ25 is a 21-residue cycle. Here, we show that the published covalent and three-dimensional structures are...
TOUCHSTONEX, a new method for folding proteins that uses a small number of long-range contact restraints derived from NMR experimental NOE (nuclear Overhauser enhancement) data, is described. The method employs a new lattice-based, reduced model of proteins that explicitly represents Cα, Cβ, and the sidechain centers of mass. The force f...
Determination of precise and accurate protein structures by NMR generally requires weeks or even months to acquire and interpret all the necessary NMR data. However, even medium-accuracy fold information can often provide key clues about protein evolution and biochemical function(s). In this article we describe a largely automatic strategy for rapi...
Ribosome-binding factor A (RbfA) from Escherichia coli is a cold-shock adaptation protein. It is essential for efficient processing of 16S rRNA and is suspected to interact with the 5'-terminal helix (helix I) of 16S rRNA. RbfA is a member of a large family of small proteins found in most bacterial organisms, making it an important target for struc...
Coiled coils are well-known as oligomerization domains, but they are also important sites of protein-protein interactions. We determined the NMR solution structure and backbone 15N relaxation rates of a disulfide cross-linked, two-chain, 37-residue polypeptide containing the 34 C-terminal residues of striated muscle α-tropomyosin, TM9a(251-28...
Genome sequencing projects have already determined nearly complete genome sequences of several organisms, including human. The products of these genes are widely recognized as the next generation of therapeutics and targets for the development of pharmaceuticals. While identification of these genes is proceeding quickly, elucidation of their three-...
Tropomyosin is an α-helical coiled-coil protein that aligns head-to-tail along the length of the actin filament and regulates its function. The solution structure of the functionally important N terminus of a short 247-residue non-muscle tropomyosin was determined in an engineered chimeric protein, GlyTM1bZip, consisting of the first 19 residue...
Protein NMR spectroscopy provides an important complement to X-ray crystallography for structural genomics, both for determining three-dimensional protein structures and in characterizing their biochemical and biophysical functions.
The solution NMR structure of the RNA-binding domain from influenza virus non-structural protein 1 exhibits a novel dimeric six-helical protein fold. Distributions of basic residues and conserved salt bridges of dimeric NS1(1-73) suggest that the face containing antiparallel helices 2 and 2' forms a novel arginine-rich nucleic acid binding motif.
An expert system for determining resonance assignments from NMR spectra of proteins is described. Given the amino acid sequence, a two-dimensional 15N-1H heteronuclear correlation spectrum and seven to eight three-dimensional triple-resonance NMR spectra for seven proteins, AUTOASSIGN obtained an average of 98% of sequence-specific spin-system assi...