Automated NMR data treatment of simple and complex experimental datasets.

Poster · October 2017
DOI: 10.13140/RG.2.2.12673.15209
CECAM Meeting: Disordered protein segments, DOI:10.13140/RG.2.2.12673.15209
Abstract
Intrinsically disordered proteins (IDPs) are hub proteins that respond to a multitude of stimuli. Analysing NMR data from multi-variable experimental sets is still awkward and slow. We present Farseer-NMR, a software package that automatically treats and analyses large amounts of simple and complex biomolecular NMR data. Data from experimental series investigating different sets of variables (temperature, pH, ligand concentration, …) can be treated jointly and correlated along the possible combinations. Publication-quality plots are generated on the fly. Currently, biomolecular NMR projects (macromolecular interactions and dynamics) generate considerable amounts of data (several peaklists, each with the many entries necessary to represent all of the protein’s residues) that have to be treated and analysed manually by the user in a repetitive, tedious and error-prone way, while many data correlations are simply left unanalysed unless strictly required. This old-fashioned workflow makes the analysis of the information contained in peaklists a major gap in the much-desired automated NMR pipeline. We have developed Farseer-NMR to automate 1) the analysis of complex NMR data (single- to multi-variable dependence) and the consequent calculation of restraints (CSPs, intensity ratios, PRE, affinity constants, …) and 2) the data representation process: it generates visually appealing and organized publication-quality plots and tables containing parsed results. Farseer-NMR is written completely in Python, is open source (upon publication) and is hosted on GitHub. It can read the most common NMR peaklist formats (Ansig, NmrDraw, NmrView, CYANA, XEASY, Sparky and CcpNmr Analysis 2.4) via simple drag-and-drop import. The graphical interface is written in PyQt v5.8, the most up-to-date version of PyQt, and its modular code base enables easy extension.
Our recent publications already feature NMR data analysed with Farseer-NMR [1, 2, 3], and we are now at the stage of delivering it to the user community as a well-organized and documented suite.
João M.C. Teixeira*(a), Simon P. Skinner(b), Miguel Arbesú(a), Alexander L. Breeze(b), Miquel Pons(a)
The NMR experimental pipeline
Consider the following example:
Given a protein system P, the binding profile of the ligand L1 was measured at five concentrations (C). The same protein P was screened against four related ligands (L1, L2, L3, L4) and each experiment was repeated at three different temperatures (T1, T2, T3).
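The size of this example can be checked with a small sketch. It is only an illustration: the peaklist file names follow those shown on the poster (pkl_ref.csv plus pkl_01.csv to pkl_04.csv for the five concentrations), while the function name `build_experiment_tree` and the folder layout are hypothetical and not part of Farseer-NMR.

```python
from pathlib import PurePosixPath

# Labels taken from the example: 3 temperatures, 4 ligands and
# 5 concentration points (the reference plus four titration points).
TEMPERATURES = ["T1", "T2", "T3"]
LIGANDS = ["L1", "L2", "L3", "L4"]
PEAKLISTS = ["pkl_ref.csv"] + [f"pkl_{i:02d}.csv" for i in range(1, 5)]


def build_experiment_tree(root="spectra"):
    """Return {temperature: {ligand: [peaklist paths]}} (assumed layout)."""
    return {
        temp: {
            lig: [str(PurePosixPath(root, temp, lig, name)) for name in PEAKLISTS]
            for lig in LIGANDS
        }
        for temp in TEMPERATURES
    }


tree = build_experiment_tree()
n_peaklists = sum(len(files) for series in tree.values() for files in series.values())
print(n_peaklists)  # 3 temperatures x 4 ligands x 5 concentrations = 60
```

Even this modest design yields 60 peaklists, which is the scale at which manual treatment becomes repetitive and error-prone.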
[Figure: Increasing experimental complexity. (i) A series of 2D NMR experiments: Protein P vs. L1, with peaklists pkl_ref.csv and pkl_01.csv to pkl_04.csv. (ii) Several series analysed in parallel: Protein P vs. L1, L2, L3 and L4, each with the same five peaklists. (iii) A 3D combination of related experimental series: the four ligand series repeated at 298 K, 288 K and 278 K.]
1 Input data: NMR peaklists
2 Data Treatment and Preparation
Chemical shift differences (1H, 13C, 15N, ...)
Combined CSPs
Height ratio
Volume ratio
Data fitting
ΔPRE Analysis (Arbesú et al., Structure, 2017)
… write your calculation here ...
$$\mathrm{CSPs}\,(\mathrm{ppm}) = \sqrt{\frac{1}{2}\left[\delta_{H}^{2} + \left(\alpha \cdot \delta_{N}\right)^{2}\right]}$$
α can be assigned independently for each residue type
Williamson, M. P. Prog. Nucl. Magn. Reson. Spectrosc. 73, 1–16 (2013).
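The combined CSP formula can be sketched in a few lines of Python. This is a minimal illustration, not the Farseer-NMR implementation: the function name `combined_csp` and the lookup table are hypothetical, and the α values (about 0.14 for most residues, 0.2 for glycine, as discussed by Williamson) are used here only as example defaults.

```python
import math

# Illustrative scaling factors only; alpha can be set per residue type.
ALPHA_DEFAULT = 0.14
ALPHA_BY_RESIDUE = {"Gly": 0.2}


def combined_csp(delta_h, delta_n, residue="Ala"):
    """Combined chemical shift perturbation in ppm.

    delta_h and delta_n are the 1H and 15N shift differences (ppm)
    between the target peak and the reference peak.
    """
    alpha = ALPHA_BY_RESIDUE.get(residue, ALPHA_DEFAULT)
    return math.sqrt(0.5 * (delta_h ** 2 + (alpha * delta_n) ** 2))


# 1H shifts of Met1 as they appear in the example peaklists on the poster;
# the 15N difference (0.10 ppm) is a made-up value for illustration.
d_h = 8.13693 - 8.13009
print(round(combined_csp(d_h, 0.10, "Met"), 5))
```

Keeping α in a per-residue table mirrors the note that α can be assigned independently for each residue type.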
3 Restraint Calculation
4 Restraint Plotting
4.1 Bar Plots
4.2
Publication-quality plots
8 preconfigured templates
Fully configurable plots
Invite us to implement your
favourite template
4.3 DeltaPRE Analysis and representation (Arbesú et al., Structure, 2017)
5 Parsed data tables
Parsed peaklists with ‘missing’ and ‘unassigned’ residue entries
Dedicated columns for all calculated restraints
Dedicated tables for each plot
Parsed Chimera Attribute Files
Restraint evolution per residue, with fitting!
Fully Featured User Interface
The Peaklist Selection tab allows the user to organize the experimental peaklists logically according to the experimentally measured variables. Peaklists can be dragged and dropped from the file browser or sidebar onto the experimental tree. Up to three variables can be taken into consideration (z, y and x). Each variable can receive any number of data points, where a data point is a fixed value of that variable. Variables can be continuous, for example “ligand concentration” or “temperature range”, or discontinuous, such as “protein construct” or “ligand nature”.
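One way to picture the three-variable tree is to enumerate the series obtained by varying one variable while the other two stay fixed. The sketch below is an assumption about how such a tree could be traversed, not the Farseer-NMR code; the `AXES` table and the function `series_along` are hypothetical names.

```python
from itertools import product

# Hypothetical experiment: each axis maps to (variable name, data points).
AXES = {
    "x": ("ligand concentration", ["ref", "01", "02", "03", "04"]),  # continuous
    "y": ("ligand nature", ["L1", "L2", "L3", "L4"]),  # discontinuous
    "z": ("temperature", ["T1", "T2", "T3"]),  # continuous
}


def series_along(axis):
    """List every series that varies `axis` while the other two are fixed.

    Each series is a list of (x, y, z) data-point labels.
    """
    if axis not in AXES:
        raise ValueError("only the three variables x, y and z are supported")
    fixed = [name for name in AXES if name != axis]
    all_series = []
    for combo in product(*(AXES[name][1] for name in fixed)):
        point = dict(zip(fixed, combo))
        series = []
        for value in AXES[axis][1]:
            point[axis] = value
            series.append((point["x"], point["y"], point["z"]))
        all_series.append(series)
    return all_series


for axis in "xyz":
    print(axis, len(series_along(axis)))
```

With five, four and three data points on x, y and z, this gives 12 series along x, 15 along y and 20 along z, each of which can be analysed as an experimental series in its own right.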
The Settings tab gives the user full control over the calculation workflow and data representation. All features can be activated or disabled according to the user's preference. Plots are fully configurable and each routine has its own settings submenu. All settings can be saved and reloaded, so calculations can be easily repeated, which facilitates the iterative process of peaklist correction and calculation rerun.
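Saving and reloading a run configuration can be illustrated with a plain JSON round trip. This is a sketch under stated assumptions: the poster does not say which file format Farseer-NMR uses, and every setting name below is hypothetical.

```python
import json
import os
import tempfile

# Hypothetical settings; the real Farseer-NMR option names may differ.
settings = {
    "calculate_csp": True,
    "csp_alpha": 0.14,
    "plot_template": "bar_compact",
    "plot_dpi": 300,
}


def save_settings(path, values):
    """Write the settings to a human-readable JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(values, handle, indent=2, sort_keys=True)


def load_settings(path):
    """Read the settings back so that a calculation can be rerun."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "run_settings.json")
    save_settings(path, settings)
    restored = load_settings(path)

print(restored == settings)  # True
```

Storing the settings next to the corrected peaklists makes each rerun reproducible with exactly the same options.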
(a) BioNMR group, Inorganic and Organic Chemistry Department, University of Barcelona, Barcelona, Spain.
(b) Astbury Centre for Structural Molecular Biology, Faculty of Biological Sciences, University of Leeds, United Kingdom.
Corresponding Author
correia.teixeira@ub.edu
joaomcteixeira@gmail.com
miguelarbesu@ub.edu, s.p.skinner@leeds.ac.uk, A.L.Breeze@leeds.ac.uk, mpons@ub.edu
From data set to publication-quality plots in minutes!
User friendly!
Get more from your data!
Extracts assignment information

Assign F1   Res#   1-letter   3-letter
0GlyH       0      G          Gly
1MetH       1      M          Met
2AspH       2      D          Asp
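The extraction step can be sketched with a regular expression. This assumes that every label has the layout shown above (residue number, three-letter code, atom name); the pattern, the partial residue map and the function name `split_assignment` are illustrative and not taken from Farseer-NMR.

```python
import re

# Partial three-letter to one-letter map; extend to all 20 residue types.
THREE_TO_ONE = {"Gly": "G", "Met": "M", "Asp": "D", "Glu": "E", "Tyr": "Y", "Ser": "S"}

# Assumed label layout: residue number, three-letter code, atom name.
LABEL = re.compile(r"^(\d+)([A-Z][a-z]{2})(\w+)$")


def split_assignment(label):
    """Split a label such as '1MetH' into (number, 1-letter, 3-letter, atom)."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised assignment label: {label!r}")
    number, three, atom = match.groups()
    return int(number), THREE_TO_ONE[three], three, atom


for label in ["0GlyH", "1MetH", "2AspH"]:
    print(split_assignment(label))
```

Labels that do not match the assumed layout raise an error instead of being silently skipped, which helps to catch assignment typos early.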
Adds entries for the missing peaks/residues

Reference peaklist:
0,1,M,Met,measured,1.0,8.13009,...
1,2,D,Asp,measured,1.0,8.51961,...
2,3,E,Glu,measured,1.0,8.74699,...
3,4,Y,Tyr,measured,1.0,8.19169,...
4,5,S,Ser,measured,1.0,7.90295,...

Target peaklist:
0,1,M,Met,measured,1.0,8.13693,...
3,4,Y,Tyr,measured,1.0,8.20178,...
4,5,S,Ser,measured,1.0,7.89991,...

Target peaklist with added entries:
0,1,M,Met,measured,1.0,8.13693,...
1,2,D,Asp,lost,0.0,NaN,...
2,3,E,Glu,lost,0.0,NaN,...
3,4,Y,Tyr,measured,1.0,8.20178,...
4,5,S,Ser,measured,1.0,7.89991,...
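The idea of adding placeholder rows can be sketched as follows. The data are the residues and 1H shifts from the example peaklists; the dictionary keys and the functions `row` and `add_lost_entries` are hypothetical and only illustrate the logic, not the Farseer-NMR implementation.

```python
def row(res, one, three, h1, status="measured"):
    """Build one peaklist row; the key names are hypothetical."""
    return {"res": res, "one": one, "three": three, "status": status, "h1": h1}


def add_lost_entries(reference, target):
    """Copy the target rows and add a 'lost' row for each residue it lacks."""
    found = {r["res"]: r for r in target}
    completed = []
    for ref in reference:
        if ref["res"] in found:
            completed.append(found[ref["res"]])
        else:
            # Keep the residue identity, drop the measured values.
            completed.append(row(ref["res"], ref["one"], ref["three"], float("nan"), "lost"))
    return completed


reference = [row(1, "M", "Met", 8.13009), row(2, "D", "Asp", 8.51961),
             row(3, "E", "Glu", 8.74699), row(4, "Y", "Tyr", 8.19169),
             row(5, "S", "Ser", 7.90295)]
target = [row(1, "M", "Met", 8.13693), row(4, "Y", "Tyr", 8.20178),
          row(5, "S", "Ser", 7.89991)]

completed = add_lost_entries(reference, target)
for r in completed:
    print(r["res"], r["three"], r["status"], r["h1"])
```

After this step every peaklist in a series has the same number of rows in the same order, so restraints can be computed row by row.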
Adds entries of unassigned residues based on a FASTA sequence file

FASTA sequence:
MDEYSPKRHDVAQLKFLCESLYDEGIATLGDSHHGWV
NDPTSAVNLQLNDLIEHIASFVMSFKIKYPDDGDLSELV
EEYLDDTYTLFSSYGINDPELQRWQKTKERLFRLFSGE
YISTLMKT

Peaklist before:
5,S,Ser,measured,1.0,7.90295,...
9,H,His,measured,1.0,7.47479,...

Peaklist after:
5,S,Ser,measured,1.0,7.90295,118.6998,...
6,P,Pro,not_assigned,0.0,NaN,NaN,...
7,K,Lys,not_assigned,0.0,NaN,NaN,...
8,R,Arg,not_assigned,0.0,NaN,NaN,...
9,H,His,measured,1.0,7.47479,118.26708,...
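Filling in unassigned residues from the sequence can be sketched in the same spirit. The example uses residues 5 to 9 of the sequence above; the function `add_unassigned`, the tuple layout and the partial residue map are assumptions made for illustration.

```python
# Partial one-letter to three-letter map; extend to all 20 residue types.
THREE = {"M": "Met", "D": "Asp", "E": "Glu", "Y": "Tyr", "S": "Ser",
         "P": "Pro", "K": "Lys", "R": "Arg", "H": "His"}


def add_unassigned(rows, sequence, start=1):
    """Insert a 'not_assigned' row for every residue absent from `rows`.

    `rows` maps residue number -> (one_letter, three_letter, status);
    `sequence` is the stretch of protein sequence numbered from `start`.
    """
    completed = {}
    for number, one in enumerate(sequence, start=start):
        completed[number] = rows.get(number, (one, THREE[one], "not_assigned"))
    return completed


peaklist = {5: ("S", "Ser", "measured"), 9: ("H", "His", "measured")}
full = add_unassigned(peaklist, "SPKRH", start=5)  # residues 5 to 9
for number, entry in full.items():
    print(number, *entry)
```

Because the result follows the sequence order, plots built from it show a gap at unassigned positions instead of shifting the residue axis.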
@farseer_nmr