Article

Advanced Statistical and Numerical Methods for Spectroscopic Characterization of Protein Structural Evolution

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Researchers conducted investigations to develop advanced statistical and numerical methods for the spectroscopic characterization of protein structural evolution. They provided a classification of methods, followed by a short qualitative description of each algorithm. Each algorithm required a properly constructed set of spectral data, called the data set, that consisted of one or more spectra. Whenever possible, the experiment was designed so that a sufficient number of spectra was recorded for the analysis. Multivariate curve resolution (MCR) was described as a broad class of methods that extracted the spectra of individual components and the contributions of those components to each spectrum of a data set. The contribution of each component to a spectrum was proportional to its concentration in the experimental sample.
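
The bilinear model that the abstract describes in words can be written compactly. The notation below (D for the data set, C for the contributions, S for the component spectra, E for the unexplained residual, K for the number of components) is an assumption borrowed from common chemometrics usage and does not appear in the abstract itself:

```latex
D = C\,S + E,
\qquad
d_{ij} = \sum_{k=1}^{K} c_{ik}\, s_{kj} + e_{ij}
```

Here d_ij would be the intensity of spectrum i at spectral channel j, c_ik the contribution of component k to spectrum i (proportional to its concentration in that sample), and s_kj the spectrum of pure component k at channel j.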

... Therefore, sufficient prior knowledge about these parameters is required to accomplish the curve fitting analysis. More recently, multivariate curve resolution-alternating least squares (MCR-ALS) and two-dimensional correlation spectroscopy (2DCOS) have been introduced to analyze the ATR-FTIR spectra and investigate the protein-involved interactions (Schmidt and Martinez, 2016; Shashilov and Lednev, 2010). MCR-ALS allows for the mathematical resolution of concentration and spectral profiles of pure components from the raw dataset without a priori knowledge about the studied system. ...
... Hence, it has been proven to be very useful to interpret the reaction processes (Alcaraz et al., 2017; del Rio et al., 2009; Li et al., 2006). For instance, MCR-ALS has been successfully applied to resolve the intermediates in protein folding and the kinetic concentration profiles from the reaction-based chemical sensors (Alcaraz et al., 2017; Shashilov and Lednev, 2010). 2DCOS, which is based on covariance and/or correlation analysis of the external perturbation-induced variations, is able to improve the spectral resolution (Chen et al., 2019; Schmidt and Martinez, 2016). ...
... MCR-ALS analysis assumes that each spectrum is a linear combination of the independent contributions of various components and noise. Therefore, it can decompose the spectral dataset into the product of two smaller matrices containing the concentration profiles and the spectra of the corresponding components (Shashilov and Lednev, 2010). In the calculation, singular value decomposition (SVD) and evolving factor analysis (EFA) were first performed to confirm the number of pure components. ...
Article
Proteins are one of the major contributors to membrane fouling. The interaction between proteins and the polymer membrane at the molecular level is essential for the alleviation/prevention of membrane fouling, but remains unclear. In this work, time-dependent in-situ attenuated total reflectance Fourier transform infrared spectroscopy is applied to investigate the interaction process between two model proteins, bovine serum albumin and lysozyme, and the poly(vinylidene fluoride) (PVDF) membrane. Multivariate curve resolution-alternating least squares is integrated with two-dimensional correlation spectroscopy analysis to resolve the membrane-induced conformational changes of proteins. The multivariate curve resolution-alternating least squares analysis reveals a two-step process in the protein-membrane interaction and provides the kinetics of the conformational transition, which aids the segmentation of the spectral dataset. By applying two-dimensional correlation spectroscopy analysis to different groups of the time-dependent spectra, the sequential order of the secondary structural changes of proteins is determined. The proteins initially undergo unfolding transition to a more open, less structured state, which appears to be triggered by the hydrophobic membrane surface. Afterwards, the proteins become aggregated with the high anti-parallel β-sheet content, aggravating the membrane fouling. The conformational transition process of proteins was also confirmed by the atomic force microscopic images and quartz crystal microbalance measurement. Overall, this work provides an in-depth understanding of the interaction between proteins and the membrane surface, which is helpful for the development of membrane anti-fouling strategies.
... It generally includes aspects such as spectral variance analysis, data pretreatment, sample variance analysis, classification, spectral deconvolution and quantitative regression analysis [20,41]. Several other methods are available for these processes, including pre-processing methods (baseline correction, cosmic ray removal, smoothing, de-noising and normalization); feature extraction methods such as linear discriminant analysis (LDA) and classification methods including linear discriminant classification (LDC), and hierarchical clustering analysis (HCA) [31,42]. For more extensive reviews on chemometric approaches as well as available Raman instrumentation aimed at biological samples, the reader is directed to the listed references [30,31,37,41,42]. ...
... However, two well-known limitations of the Raman effect are its intrinsic weakness and the interference of fluorescence. ...
Article
Full-text available
In the last decade, a number of studies have successfully demonstrated Raman spectroscopy as an emerging analytical technique for monitoring antibody aggregation, especially in the context of drug development and formulation. Raman spectroscopy is a robust method for investigating protein conformational changes, even in highly concentrated antibody solutions. It is non-destructive, reproducible and can probe samples in an aqueous environment. In this review, we focus on the application and challenges associated with using Raman spectroscopy as a tool to study antibody aggregates.
... Coupled to statistics, RS can detect gunshot residues [20,26], distinguish between blood from different species [27], and detect the presence of dyes in hair [21]. Partial least-squares discriminant analysis (PLS-DA) is a type of multivariate statistical method [28] that determines the number of significant components and, in the case of RS, the wavenumbers in the spectra which best explain the differences between different treatments, or classes [29]. ...
... We show that using a hand-held Raman spectrometer we can detect and identify urine in liquid samples, both on cotton and synthetic fabric, as well as on sweat-contaminated clothing. We also utilize multivariate data analysis to determine whether RS can be used for the quantitative detection and identification of urine [28]. Our results indicate that using PLS-DA coupled to RS allows for high accuracy prediction of the presence of urine on all studied types of fabric. ...
... Next, we used multivariate data analysis to determine whether RS can be used for the quantitative detection and identification of urine on cotton fabric [28,29]. A data set of 10 Raman spectra collected from cotton contaminated by urine and 10 reference spectra of cotton fabric were imported into PLS Toolbox 8.5.2 for statistical analysis. ...
Article
On-site detection and identification of body fluid samples at crime scenes and in prisons is critical for law enforcement. Current forensic tests for body fluids are highly specific, destructive to potential DNA evidence and time-consuming. Raman spectroscopy (RS) is a label-free, non-invasive and non-destructive analytical technique that provides information about molecular vibrations and consequently chemical structure of the analyzed specimen. These advantages make RS highly attractive for forensic applications. This study demonstrates how RS can be used for confirmatory, non-invasive and non-destructive detection and identification of urine directly on fabrics. This is very important because there have been cases of correctional officers subjected to urine from prisoners. One would envision that conclusive evidence other than eyewitness testimonies will help in potential prosecution of such cases, especially if both urine and DNA can be simultaneously detected. In this study, we show that using a handheld Raman spectrometer we can detect and identify urine in liquid samples, both on cotton and synthetic fabric, as well as on sweat-contaminated clothes. We also demonstrate that RS is capable of detection and identification of urine directly on police uniform. Finally, we show that coupling of partial least squares discriminant analysis with RS allows for high accuracy prediction of urine on all studied types of fabric.
... Their initial values were set, c_f, c_1, and c_z at each L/P were calculated according to Equations (3)-(6), and then the concentration matrix C was output. The matrix of the pure component spectra S for M2 in native, monomeric, and oligomeric states was calculated using the classical least-squares approach [39]: ...
... The 2-norm of the error E = ||CS − D||_2 was minimized by repeating the calculation with different initial values of K, K_1z, z, and n. The obtained solutions for the parameters were then used to calculate the fractions of M2 in the three states at each L/P using the least-squares method [39]: ...
Article
Full-text available
The antimicrobial peptide magainin 2 (M2) interacts with and induces structural damage in bacterial cell membranes. Although extensive biophysical studies have revealed the interaction mechanism between M2 and membranes, the mechanism of membrane-mediated oligomerization of M2 is controversial. Here, we measured the synchrotron-radiation circular dichroism and linear dichroism (LD) spectra of M2 in dipalmitoyl-phosphatidylglycerol lipid membranes in lipid-to-peptide (L/P) molar ratios from 0–26 to characterize the conformation and orientation of M2 on the membrane. The results showed that M2 changed from random coil to α-helix structures via an intermediate state with increasing L/P ratio. Singular value decomposition analysis supported the presence of the intermediate state, and global fitting analysis revealed that M2 monomers with an α-helix structure assembled and transformed into M2 oligomers with a β-strand-rich structure in the intermediate state. In addition, LD spectra showed the presence of β-strand structures in the intermediate state, disclosing their orientations on the membrane surface. Furthermore, fluorescence spectroscopy showed that the formation of β-strand oligomers destabilized the membrane structure and induced the leakage of calcein molecules entrapped in the membrane. These results suggest that the formation of β-strand oligomers of M2 plays a crucial role in the disruption of the cell membrane.
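
For orientation, the textbook form of the classical least-squares steps mentioned in the excerpts above is sketched below. The citing paper's own equations are elided in the excerpts and may differ in detail; the hats (marking estimates) and the assumption that C has full column rank and S has full row rank are added here for illustration:

```latex
\hat{S} = (C^{\mathsf T} C)^{-1} C^{\mathsf T} D,
\qquad
\hat{C} = D\, S^{\mathsf T} (S\, S^{\mathsf T})^{-1},
\qquad
E = \lVert C S - D \rVert_2
```

The first expression gives the pure-component spectra for a known concentration matrix C, the second gives the fractions for known spectra S, and E is the quantity that the excerpt describes as being minimized over the model parameters.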
... An initial estimate of S is used to build an estimate for C, which, after being filtered by constraints specific to C, is used to obtain a new estimate of S, which, after being in turn filtered by constraints specific to S, gives a new C estimate, repeating this process until declaring convergence [486,487]. Typical constraints are nonnegativity, unimodality, closure, selectivity, or continuity. These constraints need to be informative enough to break the rotational and intensity ambiguities inherent to the decomposition of a single matrix into the product of two matrices. ...
... These constraints need to be informative enough to break the rotational and intensity ambiguities inherent to the decomposition of a single matrix into the product of two matrices [486,487]. The application of MCR to time-resolved IR data of the photocycle of bacteriorhodopsin has provided nonsensical results [488,489], most likely due to the fact that the very informative positivity constraint cannot be applied to S when reconstructing difference spectra. MCR has been shown to be useful to remove artifacts in perfusion experiments. ...
Article
Full-text available
Infrared difference spectroscopy probes vibrational changes of proteins upon their perturbation. Compared with other spectroscopic methods, it stands out by its sensitivity to the protonation state, H-bonding, and the conformation of different groups in proteins, including the peptide backbone, amino acid side chains, internal water molecules, or cofactors. In particular, the detection of protonation and H-bonding changes in a time-resolved manner, not easily obtained by other techniques, is one of the most successful applications of IR difference spectroscopy. The present review deals with the use of perturbations designed to specifically change the protein between two (or more) functionally relevant states, a strategy often referred to as reaction-induced IR difference spectroscopy. In the first half of this contribution, I review the technique of reaction-induced IR difference spectroscopy of proteins, with special emphasis given to the preparation of suitable samples and their characterization, strategies for the perturbation of proteins, and methodologies for time-resolved measurements (from nanoseconds to minutes). The second half of this contribution focuses on the spectral interpretation. It starts by reviewing how changes in H-bonding, medium polarity, and vibrational coupling affect vibrational frequencies, intensities, and bandwidths. It is followed by band assignments, a crucial aspect mostly performed with the help of isotopic labeling and site-directed mutagenesis, and complemented by integration and interpretation of the results in the context of the studied protein, an aspect increasingly supported by spectral calculations. Selected examples from the literature, predominately but not exclusively from retinal proteins, are used to illustrate the topics covered in this review.
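
The alternating scheme described in the excerpts above (estimate C from S, constrain C, estimate S from C, constrain S, repeat until convergence) can be sketched as follows. This is a minimal, dependency-free illustration and not the implementation used in any of the cited works. It is limited to two components so that a closed-form 2x2 inverse suffices, it applies only a nonnegativity constraint by clipping (a simplification of proper nonnegative least squares), and it takes the first and last spectra as the initial estimate of S. All names and the toy data are hypothetical.

```python
"""Minimal two-component MCR-ALS sketch (pure Python, illustrative only)."""


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inv2(m):
    # Closed-form inverse of a 2x2 matrix; this is why the sketch stops at K = 2.
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("singular 2x2 matrix: components are not independent")
    return [[d / det, -b / det], [-c / det, a / det]]


def clip_nonneg(m):
    # Nonnegativity constraint applied by clipping (a simplification).
    return [[max(0.0, x) for x in row] for row in m]


def residual(d, c, s):
    # 2-norm (Frobenius) of D - C S.
    fit = matmul(c, s)
    return sum((x - y) ** 2 for dr, fr in zip(d, fit) for x, y in zip(dr, fr)) ** 0.5


def mcr_als(d, s0, max_iter=100, tol=1e-10):
    s = [row[:] for row in s0]
    c = None
    prev = float("inf")
    for _ in range(max_iter):
        st = transpose(s)
        # C = D S^T (S S^T)^-1, then constrain C.
        c = clip_nonneg(matmul(matmul(d, st), inv2(matmul(s, st))))
        ct = transpose(c)
        # S = (C^T C)^-1 C^T D, then constrain S.
        s = clip_nonneg(matmul(inv2(matmul(ct, c)), matmul(ct, d)))
        err = residual(d, c, s)
        if abs(prev - err) < tol:  # declare convergence
            break
        prev = err
    return c, s


# Hypothetical noise-free toy data: three spectra, three channels, two components.
S_TRUE = [[1.0, 2.0, 1.0], [2.0, 1.0, 1.0]]
C_TRUE = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
D = matmul(C_TRUE, S_TRUE)

# Initial estimate of S: the first and last measured spectra (assumed purest).
C_EST, S_EST = mcr_als(D, s0=[D[0], D[-1]])
```

Because the toy data are noise-free and the first and last spectra happen to be pure, the loop settles almost immediately. With real data, the choice of initial estimate and of constraints largely determines whether the rotational ambiguity mentioned in the excerpt is resolved.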
... In this paper we show that if the cause of the light scattering is independent from the peptide structural changes, the CD spectra can be corrected for said scattering using principal component analysis (PCA) [12]. Although PCA decomposition itself and its use to analyse titration data are well established [13,14], the application presented here to correct and analyse light scattering in CD spectroscopy is useful to extract the maximum amount of information from such an experiment, particularly in the context of automated screening approaches. We demonstrate that this correction leads to a better estimate of the secondary structure content. ...
... Principal Component Analysis. The independence of scattering and structural changes should make it possible to decompose the spectra by principal component analysis [12-14], a decomposition method based (usually) on singular value decomposition (SVD). PCA and SVD have been used extensively to decompose CD spectra into their structural constituents [6] and are well established methods for the analysis of spectroscopic data [14], but we found no reports where PCA or other decomposition methods are used to correct for light scattering. ...
Article
Full-text available
Circular Dichroism data are often decomposed into their constituent spectra to quantify the secondary structure of peptides or proteins, but the estimation of the secondary structure content fails when light scattering leads to spectral distortion. If peptide-induced liposome self-association occurs, subtracting control curves cannot correct for this. We show that if the cause of the light scattering is independent from the peptide structural changes, the CD spectra can be corrected using principal component analysis (PCA). The light scattering itself is analysed and found to be in good agreement with backscattering experiments. This method therefore makes it possible to follow structural changes related to peptide-liposome binding and peptide-induced liposome self-association simultaneously. We apply this method to study the structural changes and liposome binding of vectofusin-1, a transduction-enhancing peptide used in lentivirus-based gene therapy. Vectofusin-1 binds to POPC/POPS liposomes, causing a reversal of the negative liposome charge at high peptide concentrations. When the peptide charges exactly neutralise the lipid charges on both leaflets, reversible liposome self-association occurs. These results are in good agreement with biological observations and provide further insight into the conditions required for efficient transduction enhancement.
... (2) The position and intensity of the observed modes are the sum of the contributions from each peptide bond as a function of the geometrical constraints imposed by the secondary structure within which they are found [40,41]. Thus dUVRR spectra can be readily deconvoluted to resolve the content and contribution of the different secondary α-helical, β-sheet, and disordered structural elements present [42-46]. Strong intensities in the amide III₁ and III₂ submodes (1280-1300 cm⁻¹ and 1310-1330 cm⁻¹, respectively) are characteristic of canonical α-helical content. ...
... Finally, intensities in the amide III₃ submode coupled to an increase in amide S intensity are directly proportional to polyproline II helix (PPII) and β-strand content [33,39,47,48]. Notably among these submodes, the amide III₃ submode has been shown to be dependent on the psi (ψ) dihedral angles of the peptide backbone (Fig. 1B) [44,59,60]. This correlation allows the determination of ψ-angles within ±8° by measuring the amide III₃ shift and using Equation 1 [49]. ... (3) Not limited by macromolecular size, and as background signal from buffer, lipid, and detergent is small and easily subtractable, dUVRR spectroscopy is ideally suited to measure secondary structure of membrane proteins in detergent micelles and lipid bilayers [50-54]. ...
Chapter
We present a new method based on deep-UV resonance Raman spectroscopy to determine the backbone conformation of intramembrane protease substrates. The classical amide vibrational modes reporting on the conformation of just the transmembrane region of the substrate can be resolved from solvent exchangeable regions outside the detergent micelle by partial deuteration of the solvent. In the presence of isotopically triple-labeled intramembrane protease, these amide modes can be accurately measured to monitor the transmembrane conformation of the substrate during intramembrane proteolysis.
... Moreover, accuracy will be affected further by the overlapping of analytes spectra. The third category of methods is known as source separation or "deconvolution" methods. The source separation methods, also known as multivariate curve resolution (MCR) methods, extract concentration and spectra of individual components from multicomponent mixtures spectra [24]. ...
... Linear mixture model (LMM) is commonly used in chemometrics [24,32-39] in general and in NMR spectroscopy in particular [32,34-38]. It is the model upon which linear instantaneous BSS methods are based [25,28-31]. Taking into account the fact that NMR signals are intrinsically time domain harmonic signals with amplitude decaying exponentially with some time constant [49], the linear mixture model in the absence of additive noise reads as: ...
Article
We introduce an improved model for sparseness-constrained nonnegative matrix factorization (sNMF) of amplitude nuclear magnetic resonance (NMR) spectra of mixtures into a greater number of component spectra. In the proposed method, the selected sNMF algorithm is applied to the square of the amplitude of the NMR spectrum of the mixture instead of to the amplitude spectrum itself. Afterwards, the square roots of separated squares of the component spectra and the concentration matrix yield estimates of the true component amplitude spectrum and of the concentration matrix. The proposed model remains linear on average when the number of overlapping components is increasing, while the model based on the amplitude spectra of the mixtures deviates from the linear one when the number of overlapping components is increased. This is demonstrated through the conducted sensitivity analysis. Thus, the proposed model improves the capability of the sparse NMF algorithms to separate correlated (overlapping) component spectra from the smaller number of mixture NMR spectra. This is demonstrated in two experimental scenarios: extraction of three correlated component spectra from two 1H NMR mixture spectra and extraction of four correlated component spectra from three COSY NMR mixture spectra. The proposed method can increase efficiency in a spectral library search by reducing the occurrence of false positives and false negatives. That, in turn, can yield better accuracy in biomarker identification studies, which makes the proposed method important for natural product research and the field of metabolic studies.
... The use of Web server-based predictors is highly attractive as supplements to experimental techniques like NMR, X-ray crystallography and FTIR because only the primary structure of the protein is required as input. Using computational approaches in combination with FTIR spectroscopy [7,13,14] has much to offer in protein secondary structure determination and structure-function relation, and it was known that studies involving FGF structure can benefit from FTIR spectroscopy [15,16]. FGF is a protein involved in distinct biological activities [12,17-22]. ...
Article
Full-text available
Fourier Transform Infrared (FTIR) spectroscopy can provide the relative proportions of secondary structure elements in a protein. However, extracting this information from the Amide I band area of an FTIR spectrum is difficult. In addition to experimental methods, several web-based protein secondary structure prediction algorithms can be used as supplementary tools that require only protein amino acid sequences as inputs. In addition, web-server based docking tools can provide structure information when proteins are mixed and potentially interacting. Accordingly, we aimed to utilize web-server based structure predictors in fibroblast growth factor (FGF) protein structure determination through the FTIR data. Seven such predictors were selected and tested on basic FGF (bFGF) protein, to predict FGF secondary structure. Results were compared to available structure files deposited in the Protein Data Bank (PDB). Then, FTIR spectra of bFGF and of the acidic form of the protein with 50-fold more bovine serum albumin as carrier protein (1FGFA/50BSA) were collected. Optimized Amide I curve-fit parameters of bFGF with low (<5) root mean square deviation (RMSD) in the PDB data and the predictions were obtained. Those parameters were applied in curve-fitting of 1FGFA/50BSA data. Secondary structure was also inspected by applying models derived from previously established methods. Results of model-based secondary structure estimation from FTIR data were compared with secondary structure calculated as 1 part contribution from the 1FGFA/1BSA complex and 49 parts contribution from BSA. The complex structure was obtained through docking. RMSD values for the PDB data and the predictions were 3.05 and 2.39, respectively, with the optimized parameters. Those parameters did not work well for the 1FGFA/50BSA data. Models are better in this case, wherein one model (Model-1') with the lowest average RMSD has 8.38 RMSD in the bFGF and 4.78 RMSD in the 1FGFA/50BSA structures.
Model-based secondary structure predictions are better for determining bFGF and 1FGFA/50BSA secondary structures through the curve-fit approach that we followed, under non-optimal conditions like protein/BSA mixtures. Web servers can assist experimental studies investigating proteins with unknown structures. Any web-based structure prediction that supports the experimental results would reinforce the findings, but unsupported results would not necessarily falsify the experimental data.
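
As an illustration of the comparison metric named in the abstract above, the sketch below computes a root mean square deviation between two sets of secondary structure percentages. The abstract does not give the exact RMSD definition or the structure classes used, so the formula (root of the mean squared difference over classes), the class names, and the numbers are all assumptions chosen for illustration.

```python
import math


def rmsd(estimate, reference):
    """Root mean square deviation between two secondary-structure compositions.

    Both arguments map a structure class to a percentage. The class set is
    whatever the caller supplies; it must be identical in both inputs.
    """
    if estimate.keys() != reference.keys():
        raise ValueError("both compositions must use the same structure classes")
    squared = [(estimate[k] - reference[k]) ** 2 for k in reference]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical percentages, not taken from the study.
reference = {"helix": 10.0, "sheet": 40.0, "turn": 20.0, "other": 30.0}
estimate = {"helix": 12.0, "sheet": 38.0, "turn": 22.0, "other": 28.0}

score = rmsd(estimate, reference)
```

Under this definition, a score below 5 would correspond to the "low" threshold quoted in the abstract, though whether the authors computed it in exactly this way is not stated.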
... Yan et al. [10] combined terahertz spectroscopy and support vector machines for regression (SVR) to identify individual components. Shashilov et al. [11] provided a classification of advanced statistical and numerical methods that can be used for qualitative spectroscopic characterization. ...
Article
Full-text available
A mid-infrared absorption-based laser sensor is developed for selective and simultaneous benzene, toluene, ethylbenzene, and xylenes (BTEX) measurements under ambient conditions. The sensor is based on a distributed feedback inter-band cascade laser emitting near 3.3 µm. Wavelength tuning and deep neural networks were employed to differentiate the broadband absorbance of BTEX species. The sensor was validated with gas mixtures and real-time measurements were demonstrated at a temporal resolution of 1 s. Minimum detection limits for BTEX in air are 8, 20, 5, and 46 ppm, respectively. This sensor can be utilized to monitor BTEX emissions in the petrochemical, rubber, and paint industries to avoid hazardous health effects.
... Partial least squares discriminant analysis (PLS-DA) is a frequently used classification method for spectroscopic data (Shashilov and Lednev, 2010). It combines linear discriminant analysis (LDA) with the multicollinearity tolerance of partial least squares. ...
Article
Full-text available
A growing body of evidence suggests that Raman spectroscopy (RS) can be used for diagnostics of plant biotic and abiotic stresses. RS can also be utilized for identification of plant species and their varieties, as well as assessment of the nutritional content and commercial values of seeds. The power of RS in such cases depends to a large extent on chemometric analyses of spectra. In this work, we critically discuss three major approaches that can be used for advanced analyses of spectroscopic data: summary statistics, statistical testing and chemometric classification. Using Raman spectra collected from roses as an example, we demonstrate the outcomes and the potential of all three types of spectral analyses. We anticipate that our findings will help to design the optimal spectral processing and preprocessing required to achieve the desired results. We also expect that the reported collection of results will be useful to all researchers who work on spectroscopic analyses of plant specimens.
... (Shashilov and Lednev, 2010; Zhang et al., 2016). Recently, an on-line tool named DeltaSA, which was initially developed by Pernigotti and Belis (2018) to assess source apportionment model performance in inter-comparison exercises in Europe (Belis et al. ...
Article
To identify sources of pollutants, source apportionment studies are conducted using receptor models. Although a receptor model such as Positive Matrix Factorization (PMF) can effectively retrieve factor profiles, the necessary step of linking these profiles with actual sources is rather subjective and might be inconsistent due to a lack of harmonization in the literature or in databases of source fingerprints. To address this gap, this study developed a numerical method integrating distance- and probability-based profile matching approaches to objectively identify apportioned factor profiles, thus facilitating source identification and enhancing the comparability of the interpretation results. The applicability of this method was evaluated with profile data derived from the U.S. Environmental Protection Agency's SPECIATE database. Results showed that the matching accuracy of each category using the probability-based profile matching approach was greater than 80% except for diesel emissions (61%), demonstrating the potential of using numerical methods for factor identification. In addition, the integrated multi-step method is feasible for identifying mixed source profiles, with a matching accuracy >85% after post-hoc grouping. This approach would be more effective with a harmonized database of source fingerprints. Therefore, international cooperation among research groups interested in source apportionment studies is highly recommended either for database expansion or validation.
... Moreover, this must be based on the premise that each component in the mixture has at least one characteristic absorption peak [20]. Subsequently, an effective solution is source separation by utilizing machine learning methods, which extract the characteristic spectra of individual components from multicomponent-mixture spectra [21]. Particularly, blind source separation (BSS) methods [22,23] based on multivariate data analysis methods are competent for blind (unsupervised) extraction of components from multicomponent-mixture spectra. ...
Article
Preservatives are universally used in synergistic combination to enhance their antimicrobial effect. Identifying the composition and quantifying the components of preservatives are crucial steps in quality monitoring to guarantee merchandise safety. In this work, the three most common preservatives, sorbic acid, potassium sorbate and sodium benzoate, are deliberately mixed in pairs with different mass ratios, treated as "unknown" multicomponent systems, and measured by terahertz (THz) time-domain spectroscopy. Three major challenges are then addressed with machine learning methods. Singular value decomposition (SVD) effectively obtains the number of components in the mixed preservatives. The component spectra are then successfully extracted by non-negative matrix factorization (NMF) and self-modeling mixture analysis (SMMA), and they match well with the measured THz spectra of the pure reagents. Moreover, a support vector machine for regression (SVR) builds a model for the target components and simultaneously determines the content of each individual component in validation mixtures with a coefficient of determination R² = 0.989. By taking advantage of the fingerprint-based THz technique and machine learning methods, the approach shows great potential as a useful strategy for detecting preservative mixtures in practical applications.
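
To make the NMF step mentioned in the abstract above more concrete, the sketch below factorizes a small nonnegative mixture matrix V into contributions W and component spectra H using the well-known Lee-Seung multiplicative updates for the squared-error objective. The abstract does not say which NMF variant or software the authors used, so this is a swapped-in textbook algorithm; the function names, the toy spectra, and the mass ratios are hypothetical.

```python
import random


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def frobenius_error(v, w, h):
    fit = matmul(w, h)
    return sum((x - y) ** 2 for vr, fr in zip(v, fit) for x, y in zip(vr, fr)) ** 0.5


def nmf_multiplicative(v, rank, n_iter=300, seed=0, eps=1e-12):
    """Factorize v (samples x channels) as w @ h with nonnegative factors."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random start; multiplicative updates keep signs unchanged.
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(n_cols)] for _ in range(rank)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors


# Hypothetical pure-component spectra (2 components x 5 channels)
# and mass ratios for four binary mixtures.
PURE = [[1.0, 3.0, 5.0, 3.0, 1.0], [1.0, 1.0, 2.0, 4.0, 2.0]]
RATIOS = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4], [0.8, 0.2]]
V = matmul(RATIOS, PURE)

W, H, ERRORS = nmf_multiplicative(V, rank=2)
```

The recovered rows of H are only defined up to scaling and ordering, which is why studies of this kind compare extracted spectra against measured spectra of the pure reagents rather than relying on the factorization alone.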
... Partial least squares discriminant analysis (PLS-DA) was performed to differentiate between the experimental classes and identify spectral regions that best explained separation between the classes. Our own experimental results, as well as findings reported by other groups, show that PLS-DA performs equally well or better than other chemometric methods, such as linear discriminant analysis (LDA) or soft independent modeling by class analogy (SIMCA) (Lee et al., 2018; Sanchez, Pant, Irey, et al., 2019; Sanchez, Pant, Mandadi, & Kurouski, 2020; Shashilov & Lednev, 2010). Therefore, we have selected PLS-DA for statistical analysis of spectra collected in this study. ...
Article
Full-text available
Water deficit and salinity are two major abiotic stresses that have a tremendous effect on crop yield worldwide. Timely identification of these stresses can help limit associated yield loss. Confirmatory detection and identification of water deficit stress can also enable proper irrigation management. Traditionally, unmanned aerial vehicle (UAV)-based imaging and satellite-based imaging, together with visual field observation, are used for diagnostics of such stresses. However, these approaches can only detect salinity and water deficit stress at the symptomatic stage. Raman spectroscopy (RS) is a noninvasive and nondestructive technique that can identify and detect plant biotic and abiotic stress. In this study, we investigated the accuracy of Raman-based diagnostics of water deficit and salinity stresses on two greenhouse-grown peanut accessions, one tolerant and one susceptible to water deficit. Plants were grown for 76 days prior to application of the water deficit and salinity stresses. Water deficit treatments received no irrigation for 5 days, and salinity treatments received 1.0 L of 240-mM salt water per day for the duration of 5-day sampling. Every day after the stress was imposed, plant leaves were collected and immediately analyzed by a hand-held Raman spectrometer. RS and chemometrics could identify control and stressed (either water deficit or salinity) susceptible plants with 95% and 80% accuracy just 1 day after treatment. Water deficit and salinity stressed plants could be differentiated from each other with 87% and 86% accuracy, respectively. In the tolerant accessions at the same timepoint, the identification accuracies were 66%, 65%, 67%, and 69% for control, combined stresses, water deficit, and salinity stresses, respectively. The high selectivity and specificity for presymptomatic identification of abiotic stresses in the susceptible line provide evidence for the potential of Raman-based surveillance in commercial-scale agriculture and digital farming.
... There are numerous supervised chemometric methods, including soft independent modelling of class analogies (SIMCA), partial least squares discriminant analysis (PLS-DA), partial least squares regression (PLSR), and linear discriminant analysis (LDA). A recently reported review by Shashilov and Lednev suggests that all supervised methods perform equivalently well in prediction of the spectral classes [35]. ...
Article
Full-text available
Our civilization has to enhance food production to feed the world’s expected population of 9.7 billion by 2050. These food demands can be met by implementation of innovative technologies in agriculture. This transformative agricultural concept, also known as digital farming, aims to maximize the crop yield without an increase in the field footprint while simultaneously minimizing the environmental impact of farming. There is a growing body of evidence that Raman spectroscopy, a non-invasive, non-destructive, and laser-based analytical approach, can be used to: (i) detect plant diseases, (ii) detect abiotic stresses, and (iii) enable label-free phenotyping and digital selection of plants in breeding programs. In this review, we critically discuss the most recent reports on the use of Raman spectroscopy for confirmatory identification of plant species and their varieties, as well as Raman-based analysis of the nutrition value of seeds. We show that the high selectivity and specificity of Raman make this technique ideal for optical surveillance of fields, which can be used to improve agriculture around the world. We also discuss potential advances in synergetic use of RS and already established imaging and molecular techniques. This combinatorial approach can be used to reduce associated time and cost, as well as enhance the accuracy of diagnostics of biotic and abiotic stresses.
... Incorrect usage of these processing steps can have a serious impact on the reliability of data [45,46]. Post-collection spectral processing such as second derivative analysis is commonly used for quantification of proteins; it can compensate for the slightly changing backgrounds between different samples and accentuate spectral features [47][48][49][50]. Researchers have proposed slightly altered techniques such as simultaneous fitting of absorption and second derivative spectra [51], and have conducted comparability studies on the most effective second derivative techniques [52]. ...
Article
Attenuated Total Reflection Fourier Transform Infrared (ATR-FTIR) spectroscopy is a label-free, non-destructive technique that can be applied to a vast range of biological applications, from imaging cancer tissues and live cells, to determining protein content and secondary structure composition. This review summarises the recent advances in applications of ATR-FTIR spectroscopy to biopharmaceuticals, the application of this technique to biosimilars, and the current uses of FTIR spectroscopy in biopharmaceutical production. We discuss the use of ATR-FTIR spectroscopic imaging to investigate biopharmaceuticals, and finally, give an outlook on the possible future developments and applications of ATR-FTIR spectroscopy and spectroscopic imaging to this field. Throughout the review comparisons will be made between FTIR spectroscopy and alternative analytical techniques, and areas will be identified where FTIR spectroscopy could perhaps offer a better alternative in future studies. This review focuses on the most recent advances in the field of using ATR-FTIR spectroscopy and spectroscopic imaging to characterise and evaluate biopharmaceuticals, both in an industrial and an academic research based environment.
... In general, the main drawbacks of many previous methods for Raman spectral classification with a large number of classes are manual tuning during training and testing, preprocessing and feature engineering, and the over-fitting problem. A list of machine learning techniques used for spectroscopy can be found in a study by Shashilov & Lednev (2010). (Acquarelli et al., 2017; Carey et al., 2015; Fan et al., 2019; Jinchao Liu et al., 2017) Partial least square regression (PLSR) (Arrobas et al., 2015; Wold et al., 2001) Multiple linear regression (LR) (Galvão et al., 2013) Support vector machines for regression (H. ...
Preprint
Raman spectroscopy is a powerful analytical tool with applications ranging from quality control to cutting-edge biomedical research. One particular area which has seen tremendous advances in the past decade is the development of powerful handheld Raman spectrometers. They have been adopted widely by first responders and law enforcement agencies for the field analysis of unknown substances. Field detection and identification of unknown substances with Raman spectroscopy rely heavily on the spectral matching capability of the devices on hand. Conventional spectral matching algorithms (such as correlation, dot product, etc.) have been used to identify an unknown Raman spectrum by comparing it to a large reference database. This is typically achieved through brute-force summation of pixel-by-pixel differences between the reference and the unknown spectrum. Conventional algorithms have noticeable drawbacks. For example, they tend to work well for identifying pure compounds but less so for mixtures. In addition, limited reference spectra in accessible databases, with a large number of classes relative to the number of samples, have been a setback for the widespread usage of Raman spectroscopy in field analysis applications. State-of-the-art deep learning methods (specifically convolutional neural networks, CNNs), as an alternative approach, present a number of advantages over conventional spectral comparison algorithms. With optimization, they are well suited for deployment in handheld spectrometers for field detection of unknown substances. In this study, we present a comprehensive survey of the use of one-dimensional CNNs for Raman spectrum identification. Specifically, we highlight the use of this powerful deep learning technique for handheld Raman spectrometers, taking into consideration the potential limits in power consumption and computational ability of handheld systems.
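The conventional matching approach mentioned in this abstract (correlation and dot-product scores against a reference library) can be sketched as follows. This is a minimal illustration in plain Python, assuming all spectra are sampled on the same axis; the function names, the two-entry library, and the intensity values are hypothetical and are not taken from the cited preprint.

```python
import math

def cosine_score(a, b):
    """Normalized dot product of two spectra sampled on the same axis."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def correlation_score(a, b):
    """Pearson correlation: the cosine score of the mean-centered spectra."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_score([x - mean_a for x in a], [y - mean_b for y in b])

def best_match(unknown, library, score=correlation_score):
    """Return (name, score) of the library entry most similar to `unknown`."""
    scored = ((name, score(unknown, ref)) for name, ref in library.items())
    return max(scored, key=lambda item: item[1])

# Hypothetical reference library and unknown spectrum (illustrative values).
library = {
    "compound_a": [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0],
    "compound_b": [0.0, 0.0, 0.0, 1.0, 4.0, 1.0, 0.0],
}
unknown = [0.1, 1.1, 3.9, 1.0, 0.1, 0.0, 0.05]

name, value = best_match(unknown, library)
print(name, round(value, 3))
```

Because each library entry is scored independently, a mixture spectrum scores only moderately against every pure reference, which is one way to see the limitation with mixtures that the abstract describes.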
... However, only the accurate application of such statistical methods prevents misinterpretation of the results, which might otherwise lead to erroneous understanding. 18 This particularly refers to well-established fitting procedures, which often appear to be poorly documented in scientific articles. Hence, the provision of comprehensive numerical information necessary for the reproduction of such an analysis is indispensable. ...
Article
Full-text available
A comprehensive molecular analysis of a simple aqueous complexing system - U(VI) acetate - selected to be independently investigated by various spectroscopic (vibrational, luminescence, X-ray absorption, and nuclear magnetic resonance spectroscopy) and quantum chemical methods was achieved by an international round-robin test (RRT). Twenty laboratories from six different countries with a focus on actinide or geochemical research participated and contributed to this scientific endeavor. The outcomes of this RRT were considered on two levels of complexity: first, within each technical discipline, conformities as well as discrepancies of the results and their sources were evaluated. The raw data from the different experimental approaches were found to be generally consistent. In particular, for complex setups such as accelerator-based X-ray absorption spectroscopy, the agreement between the raw data was high. By contrast, luminescence spectroscopic data turned out to be strongly related to the chosen acquisition parameters. Second, the potentials and limitations of coupling various spectroscopic and theoretical approaches for the comprehensive study of actinide molecular complexes were assessed. Previous spectroscopic data from the literature were revised and the benchmark data on the U(VI) acetate system provided an unambiguous molecular interpretation based on the correlation of spectroscopic and theoretical results. The multimethodologic approach and the conclusions drawn address not only important aspects of actinide spectroscopy but particularly general aspects of modern molecular analytical chemistry.
... Next, we applied multivariate data analysis [47] to determine whether RS could be used for highly accurate diagnostics of HLB and ND on grapefruit and orange leaves [48]. The loading plot (Fig. 2) and misclassification table (Tables 2, 3, 4, and 5) were then generated using this final model, which contained 3 predictive components, 4 orthogonal components, and 1651 original wavenumbers. ...
Article
Full-text available
Huanglongbing (HLB) or citrus greening is a devastating disease of citrus trees that is caused by the gram-negative Candidatus Liberibacter spp. bacteria. The bacteria are phloem limited and transmitted by the Asian citrus psyllid, Diaphorina citri, and the African citrus psyllid, Trioza erytreae, which allows for a wider dissemination of HLB. Infected trees exhibit yellowing of leaves, premature leaf and fruit drop, and ultimately the death of the entire plant. Polymerase chain reaction (PCR) and antibody-based assays (ELISA and/or immunoblot) are commonly used methods for HLB diagnostics. However, they are costly, time-consuming, and destructive to the sample and often not sensitive enough to detect the pathogen very early in the infection stage. Raman spectroscopy (RS) is a noninvasive, nondestructive, analytical technique which provides insight into the chemical structures of a specimen. In this study, by using a handheld Raman system in combination with chemometric analyses, we can readily distinguish between healthy and HLB (early and late stage)-infected citrus trees, as well as plants suffering from nutrient deficits. The detection rate of Raman-based diagnostics of healthy vs HLB infected vs nutrient deficit is ~98% for grapefruit and ~87% for orange trees, whereas the accuracy of early- vs late-stage HLB infected is 100% for grapefruits and ~94% for oranges. This analysis is portable and sample agnostic, suggesting that it could be utilized for other crops and conducted autonomously.
... 17−19 The typical protein DUVRR spectrum is dominated by amide bands, which characterize the polypeptide backbone conformation, and aromatic amino acid bands, which report on their local environment. 19 The DUVRR spectra of both pH 2.0 and pH 6.0 HET-s (218−289) prion fibrils show sharp, intense Amide I and II bands as well as a high intensity of the C α −H band that are indicative of an extended β-sheet conformation ( Figure 3). It was found that the positions and relative intensities of all bands remained the same, which shows that the secondary cross-β core structure of HET-s (218−289) prion fibrils does not change upon reversal of the filament supramolecular chirality. ...
Article
Amyloid fibril polymorphism is not well understood despite its potential importance for biological activity and associated toxicity. Controlling the polymorphism of mature fibrils including their morphology and supramolecular chirality by post-fibrillation changes in local environment is the subject of this study. Specifically, the effect of pH on the stability and dynamics of HET-s (218-289) prion fibrils has been determined through the use of vibrational circular dichroism (VCD), deep UV resonance Raman and fluorescence spectroscopies. It was found that a change in solution pH causes deprotonation of Asp and Glu amino acid residues on the surface of HET-s (218-289) prion fibrils and triggers rapid transformation of one supramolecular chiral polymorph into another. This process involves changes in higher-order arrangements like lateral filament and fibril association and their supramolecular chirality, while the fibril cross-β core remains intact. This work suggests a hypothetical mechanism for HET-s (218-289) prion fibril refolding and proposes that the interconversion between fibril polymorphs driven by the solution environment change is a general property of amyloid fibrils.
Article
Protein-membrane interactions play an important role in various biological phenomena, such as material transport, demyelinating diseases, and antimicrobial activity. We combined vacuum-ultraviolet circular dichroism (VUVCD) spectroscopy with theoretical (e.g., molecular dynamics and neural networks) and polarization experimental (e.g., linear dichroism and fluorescence anisotropy) methods to characterize the membrane interaction mechanisms of three soluble proteins (or peptides). α1 -Acid glycoprotein has the drug-binding ability, but the combination of VUVCD and neural-network method revealed that the membrane interaction causes the extension of helix in the N-terminal region, which reduces the binding ability. Myelin basic protein (MBP) is an essential component of the myelin sheath with a multi-layered structure. Molecular dynamics simulations using a VUVCD-guided system showed that MBP forms two amphiphilic and three non-amphiphilic helices as membrane interaction sites. These multivalent interactions may allow MBP to interact with two opposing membrane leaflets, contributing to the formation of a multi-layered myelin structure. The antimicrobial peptide magainin 2 interacts with the bacterial membrane, causing damage to its structure. VUVCD analysis revealed that the M2 peptides assemble in the membrane and turn into oligomers with a β-strand structure. Linear dichroism and fluorescence anisotropy suggested that the oligomers are inserted into the hydrophobic core of the membrane, disrupting the bacterial membrane. Overall, our findings demonstrate that VUVCD and its combination with theoretical and polarization experimental methods pave the way for unraveling the molecular mechanisms of biological phenomena related to protein-membrane interactions.
Article
Raman spectroscopy has been widely used to provide the structural fingerprint for molecular identification. Due to interference from coexisting components, noise, baseline, and systematic differences between spectrometers, component identification with Raman spectra is challenging, especially for mixtures. In this study, a method entitled DeepRaman has been proposed to solve those problems by combining the comparison ability of a pseudo-Siamese neural network (pSNN) and the input-shape flexibility of spatial pyramid pooling (SPP). DeepRaman was trained, validated, and tested with 41,564 augmented Raman spectra from two databases (pharmaceutical material and S.T. Japan). It can achieve 96.29% accuracy, 98.40% true positive rate (TPR), and 94.36% true negative rate (TNR) on the test set. Another six data sets measured on different instruments were used to evaluate the performance of the proposed method from different aspects. DeepRaman can provide accurate identification results and significantly outperform the hit quality index (HQI) method and other deep learning models. In addition, it performs well in cases of different spectral complexity and low-content components. Once the model is established, it can be used directly on different data sets without retraining or transfer learning. Furthermore, it also obtains promising results for the analysis of surface-enhanced Raman spectroscopy (SERS) data sets and Raman imaging data sets. In summary, it is an accurate, universal, and ready-to-use method for component identification in various application scenarios.
Article
Full-text available
A transcriptional regulatory system called heat shock response (HSR) has been developed in eukaryotic cells to maintain proteome homeostasis under various stresses. Heat shock factor-1 (Hsf1) plays a central role in HSR, mainly by upregulating molecular chaperones as a transcription factor. Hsf1 forms a complex with chaperones and exists as a monomer in the resting state under normal conditions. However, upon heat shock, Hsf1 is activated by oligomerization. Thus, oligomerization of Hsf1 is considered an important step in HSR. However, the lack of information about Hsf1 monomer structure in the resting state, as well as the structural change via oligomerization at heat response, impeded the understanding of the thermosensing mechanism through oligomerization. In this study, we applied solution biophysical methods, including fluorescence spectroscopy, nuclear magnetic resonance, and circular dichroism spectroscopy, to investigate the heat-induced conformational transition mechanism of Hsf1 leading to oligomerization. Our study showed that Hsf1 forms an inactive closed conformation mediated by intramolecular contact between leucine zippers (LZs), in which the intermolecular contact between the LZs for oligomerization is prevented. As the temperature increases, Hsf1 changes to an open conformation, where the intramolecular LZ interaction is dissolved so that the LZs can form intermolecular contacts to form oligomers in the active form. Furthermore, since the interaction sites with molecular chaperones and nuclear transporters are also expected to be exposed in the open conformation, the conformational change to the open state can lead to understanding the regulation of Hsf1-mediated stress response through interaction with multiple cellular components.
Article
Here, we report a new phenomenon in which lysozyme fibrils formed in a solution of acetic acid spontaneously refold to a different polymorph through a disassembled intermediate upon the removal of acetic acid. The structural changes were revealed and characterized by deep-UV resonance Raman spectroscopy, nonresonance Raman spectroscopy, intrinsic tryptophan fluorescence spectroscopy, and atomic force microscopy. A PPII-like structure with highly solvent-exposed tryptophan residues predominates the intermediate aggregates before refolding to polymorph II fibrils. Furthermore, the disulfide (SS) bonds undergo significant rearrangements upon the removal of acetic acid from the lysozyme fibril environment. The main SS bond conformation changes from gauche-gauche-trans in polymorph I to gauche-gauche-gauche in polymorph II. Changing the hydrophobicity of the fibril environment was concluded to be the decisive factor causing the spontaneous refolding of lysozyme fibrils from one polymorph to another upon the removal of acetic acid. Potential biological implications of the discovered phenomenon are discussed.
Article
Nanoparticles and protein bioconjugates have been studied for multiple biomedical applications. We sought to investigate the interaction and structural modifications of bovine serum albumin (BSA) with iron oxide nanoparticles (IONPs). The IONPs were green synthesized using E. crassipes aqueous leaf extract following characterization using transmission electron microscopy, energy dispersive X-ray analysis and X-Ray Diffraction. Two different concentrations of native/glycated albumin (0.5 and 1.5 mg/ml) with IONPs were allowed to interact for 1 h at 37 °C. Glycation markers, protein modification markers, cellular antioxidant, and hemolysis studies showed structural modifications and conformational changes in albumin due to the presence of IONPs. UV–Visible absorbance resulted in hyperchromic and bathochromic effects of IONPs-BSA conjugates. Fluorescence measurements of tyrosine, tryptophan, advanced glycated end products, and ANS binding assay were promising and quenching effects proved IONPs-BSA conjugate formation. In FTIR of BSA-IONPs, transmittance was increased in amide A and B bands while decreased in amide I and II bands. In summary, native PAGE, HPLC, and FTIR analysis displayed a differential behaviour of IONPs with native and glycated BSA. These results provided an understanding of the interaction and structural modifications of glycated and native BSA which may provide fundamental repercussions in future studies.
Article
Expression of heterologous genes in Escherichia coli is a routine technology for recombinant protein production, but the predictable recovery of properly folded and uniformly bioactive material remains a challenge. Misfolded proteins typically accumulate as insoluble inclusion bodies, and a variety of strategies have been employed in efforts to increase the yield of soluble product. One technique is the overexpression of E. coli protein chaperones during recombinant protein induction, in an effort to increase the folding capacity of the bacterial host. We have developed an alternative approach, by supplementing the host protein folding machinery with chaperones from other species. Extremophiles have evolved under conditions (extremes of temperature, salinity, pressure, and/or pH) that make them attractive candidates for possessing chaperones with novel folding activities. The green fluorescent protein (GFP) of Aequorea victoria, which is predominantly insoluble under typical recombinant expression culture conditions, was employed as an in vivo indicator of protein folding activity for chaperone homologs from a variety of extremophiles. For a subset of the chaperones tested, co-expression with GFP promoted an increase in both fluorescence signal intensity as well as the amount of GFP recovered in the soluble protein fraction. Several archaeal chaperones were also found to be able to refold soluble Lyt_Orn C40 peptidase from inclusion bodies in vitro. In particular, Pf Cpn(MA), a mutant chaperonin which exhibited significant refolding activity, is also shown to deconstruct the morphology and structure of inclusion bodies (Kurouski et al., 2012). Hence, the simple and rapid GFP assay provides a tool to screen for extremophilic chaperones that exhibit folding activity under E. coli growth conditions, and suggests that increasing the repertoire of heterologous chaperones might provide a partial but general solution to the problem of recombinant protein insolubility.
Article
Full-text available
Identification of organic acids is crucial in the environmental, healthcare, and industrial fields, but it involves time-consuming and labor-intensive procedures. Simultaneous detection in complex mixtures has become a new target, while large-scale instruments remain the mainstream measurement approach. With the rapid development of machine learning for solving complex problems, progress is being made toward quick detection of mixtures. The next challenge is to provide sufficient data to meet the needs of the algorithms. In this work, we propose a modified spectrometer technology combined with deep learning to quantify mixed organic acids. Organic acids interact with light against various backgrounds, and the information is mapped in the form of images. Deep learning can establish the relationship between images and concentrations. According to the results, a trained convolutional neural network can achieve simultaneous measurement of organic acids, which speeds up parallel detection and suggests new twists in the field of chemistry.
Article
Full-text available
Raman spectroscopy (RS) is an emerging analytical technique that can be used to develop and deploy precision agriculture. RS allows for confirmatory diagnostics of biotic and abiotic stresses on plants. Specifically, RS can be used for Huanglongbing (HLB) diagnostics on both orange and grapefruit trees, as well as detection and identification of various fungal and viral diseases. The questions that remain to be answered are how early RS can detect and identify the disease and whether RS is more sensitive than qPCR, the “gold standard” in pathogen diagnostics. Using RS and HLB as a case study, we monitored healthy (qPCR-negative) in-field grown citrus trees and compared their spectra to the spectra collected from healthy orange and grapefruit trees grown in a greenhouse with restricted insect access and confirmed as HLB free by qPCR. Our results indicated that RS was capable of early prediction of HLB and that nearly all in-field qPCR-negative plants were infected by the disease. Using advanced multivariate statistical analysis, we also showed that qPCR-negative plants exhibited HLB-specific spectral characteristics that can be distinguished from unrelated nutrition deficit characteristics. These results demonstrate that RS is capable of much more sensitive diagnostics of HLB compared to qPCR.
Article
Due to its capability for high-throughput screening, 1H nuclear magnetic resonance (NMR) spectroscopy is commonly used for metabolite research. The key problem in 1H NMR spectroscopy of multicomponent mixtures is the overlapping of component signals, which increases with the number of components, their complexity and structural similarity. This makes metabolic profiling, which is carried out by matching acquired spectra with metabolites from a library, a hard problem. Here, we propose a method for nonlinear blind separation of highly correlated component spectra from a single 1H NMR mixture spectrum. The method transforms a single nonlinear mixture into multiple high-dimensional reproducing kernel Hilbert spaces (mRKHSs). Therein, highly correlated components are separated by sparseness-constrained nonnegative matrix factorization in each induced RKHS. Afterwards, metabolites are identified through comparison of the separated components with a library comprising 160 pure components. A significant number of them are expected to be related to diabetes type 2. A conceptually similar methodology for nonlinear blind separation of correlated components from two or more mixtures is presented in the Supplementary material. Single-mixture blind source separation is exemplified on: (i) annotation of five component spectra separated from one 1H NMR model mixture spectrum; (ii) annotation of fifty-five metabolites separated from one 1H NMR mixture spectrum of urine of subjects with and without diabetes type 2. Arguably, this is the first time that a method for blind separation of a large number of components from a single nonlinear mixture has been proposed. Moreover, the proposed method pinpoints urinary creatine, glutamic acid and 5-hydroxyindoleacetic acid as the most prominent metabolites in samples from subjects with diabetes type 2, when compared to healthy controls.
Article
Full-text available
Raman spectroscopy is widely used as a fingerprint technique for molecular identification. However, Raman spectra contain molecular information from multiple components and interferences from noise and instrumentation. Thus, component identification using Raman spectra is still challenging, especially for mixtures. In this study, a novel approach entitled deep learning-based component identification (DeepCID) was proposed to solve this problem. Convolutional neural network (CNN) models were established to predict the presence of components in mixtures. Comparative studies showed that DeepCID could learn spectral features and identify components in both simulated and real Raman spectral datasets of mixtures with higher accuracy and significantly lower false positive rates. In addition, DeepCID showed better sensitivity when compared with the logistic regression (LR) with L1-regularization, k-nearest neighbor (kNN), random forest (RF) and back propagation artificial neural network (BP-ANN) models for ternary mixture spectral datasets. In conclusion, DeepCID is a promising method for solving the component identification problem in the Raman spectra of mixtures.
Article
Full-text available
Rapid detection and identification of crop pathogens is essential for improving crop yield. Typical pathogen assaying methods, such as polymerase chain reaction (PCR) or enzyme-linked immunosorbent assay (ELISA), are time-consuming and destructive to the sample. Raman spectroscopy (RS) is a non-invasive non-destructive analytical technique which provides insight on the chemical structure of the specimen. In this study, we demonstrate that using a handheld Raman spectrometer, in combination with chemometric analyses, we can distinguish between healthy and diseased maize (Zea mays) kernels, as well as between different diseases with 100% accuracy. Our analysis is portable and sample-agnostic, suggesting that it could be retooled for other crops and conducted autonomously.
Article
Full-text available
Protein misfolding and aggregation are key attributes of various neurodegenerative diseases. Misfolded and aggregated proteins are intrinsically disordered, which rules out structure-based drug design. Comprehensive characterization of misfolded proteins and the associated aggregation pathways is a prerequisite for developing therapeutics for neurodegenerative diseases caused by protein aggregation. Visible protein aggregates are usually the final stage of the aggregation mechanism. Structural analysis of the intermediate steps of such aggregation will help to discern the conformational roles and the pathways subsequently involved. Structural analysis of protein aggregation using various biophysical methods may aid the development of improved therapeutics for neurodegenerative diseases related to protein misfolding and aggregation. In this mini review, we summarize different spectroscopic methods, such as fluorescence spectroscopy, circular dichroism (CD), nuclear magnetic resonance (NMR) spectroscopy, Fourier transform infrared spectroscopy (FTIR), and Raman spectroscopy, for structural analysis of protein aggregation. We believe that understanding the invisible intermediates of misfolded proteins and the key steps involved in protein aggregation mechanisms may advance therapeutic approaches for targeting neurological diseases caused by misfolded proteins.
Article
In this paper, we demonstrate a promising new resonance Raman (RR)-based method for the determination of Fe3+ concentrations in aqueous solutions. Iron ions were quantified at a low concentration range by employing hydroxylamine hydrochloride as the reductant and phenanthroline as the complexing agent, thereby reducing Fe3+ to Fe2+. The addition of Fe3+ to the detection reagent resulted in a rapid color change from colorless to orange-red, together with an obvious new RR band appearing at 1459 cm⁻¹. The RR intensity of the phenanthroline-Fe2+ complex strengthened with increasing Fe3+ concentration, as identified from the variation of the Raman spectra. Therefore, we successfully detected Fe3+ at lower concentrations using the proposed method, illustrating its great potential for the detection of Fe3+ with abundant RR fingerprint information. More importantly, the proposed method exhibited a wide linear range from 0.05 to 10 μg/mL.
Article
8-Methoxypsoralen (8-MOP) is a naturally occurring furanocoumarin with various biological activities. However, there is little information on the binding mechanism of 8-MOP with trypsin. Here, the interaction between 8-MOP and trypsin in vitro was determined by multi-spectroscopic methods combined with the multivariate curve resolution-alternating least squares (MCR–ALS) chemometrics approach. An expanded UV–vis spectral data matrix was analysed by MCR–ALS, and the concentration profiles and pure spectra for the three reaction species (trypsin, 8-MOP and 8-MOP–trypsin) were obtained to monitor the interaction between 8-MOP and trypsin. The fluorescence data suggested that a static type of quenching mechanism occurred in the binding of 8-MOP to trypsin. Hydrophobic interaction dominated the formation of the 8-MOP–trypsin complex on account of the positive enthalpy and entropy changes, and trypsin had one high affinity binding site for 8-MOP with a binding constant of 3.81 × 10⁴ L mol⁻¹ at 298 K. Analysis of three dimensional fluorescence, UV–vis absorption and circular dichroism spectra indicated that the addition of 8-MOP induced the rearrangement of the polypeptide carbonyl hydrogen-bonding network and conformational changes in trypsin. The molecular docking predicted that 8-MOP interacted with the catalytic residues His57, Asp102 and Ser195 in trypsin. The binding patterns and trypsin conformational changes may result in the inhibition of trypsin activity. This study has provided insights into the binding mechanism of 8-MOP with trypsin.
Chapter
Representative results showing the position and strength of two-dimensional (2D) correlation spectroscopy in protein research are surveyed in this article. Special emphasis is placed on infrared spectroscopy. Different types of external perturbations that are particularly useful for exploring properties of proteins are discussed. Most promising developments in 2D correlation spectroscopy are demonstrated through results obtained for protein systems. The aim of this article has been to present 2D correlation spectroscopy as a simple method that significantly improves information about protein structure gained from infrared spectra.
Article
Regression and classification chemometrical algorithms were used to achieve effective discrimination of pure body fluids from their binary mixtures. Raman spectra of dried blood, semen, and their mixtures in different ratios, collected in an automatic mapping manner, were used as a model system. Mixtures of blood and semen in different ratios were prepared. Each microscope slide was covered with a piece of aluminum foil, which has a small Raman and fluorescence profile, and a 10 μL sample was deposited on the foil. The entire data set was formed from spectra recorded by a Renishaw inVia confocal Raman spectrometer equipped with a Leica microscope with a 50× objective and a Prior Scientific automatic stage. A 785-nm laser beam was used for excitation. The lowest concentrations that were detected were 5% of blood in semen and approximately 1% of semen in blood stains. Lower concentrations could be detected, but the accuracy of detection decreased significantly.
Article
Discovering effective solutions to human health problems necessitates the use of state-of-the-art techniques in medicinal, industrial, and service-providing applications. Proteins are undoubtedly the workhorses of biological systems and play vital roles in a wide variety of important processes. Thus, monitoring of proteins in cells and tissues using vibrational spectroscopy can be valuable for diagnosis and screening. Vibrational spectroscopy is particularly attractive as it can be used to probe proteins in complex systems including cells, tissues, biofluids and even whole organisms without the need for potentially perturbing probe molecules. Thanks to developments in instrumentation and data processing tools, researchers and laboratory technicians can overcome with relative ease the hurdles associated with preparing biological specimens for protein research and with handling data collected from a large number of samples and a huge variety of sources. In this chapter, besides experimental techniques and methods used in protein screening, applications of vibrational spectroscopy to different biological systems will be discussed.
Article
Deep UV resonance Raman spectroscopy is a powerful technique for probing the structure and formation mechanism of protein fibrils, which are traditionally difficult to study with other techniques owing to their low solubility and noncrystalline arrangement. Utilizing a tunable deep UV Raman system allows for selective enhancement of different chromophores in protein fibrils, which provides detailed information on different aspects of the fibrils' structure and formation. Additional information can be extracted with the use of advanced data treatment such as chemometrics and 2D correlation spectroscopy. In this chapter we give an overview of several techniques for utilizing deep UV resonance Raman spectroscopy to study the structure and mechanism of formation of protein fibrils. Clever use of hydrogen-deuterium exchange can elucidate the structure of the fibril core. Selective enhancement of aromatic amino acid side chains provides information about the local environment and protein tertiary structure. The mechanism of protein fibril formation can be investigated with kinetic experiments and advanced chemometrics.
Article
Amyloid fibrils are β-sheet rich protein aggregates that are strongly associated with various neurodegenerative diseases. Raman spectroscopy has been broadly utilized to investigate protein aggregation and amyloid fibril formation and has been shown to be capable of revealing changes in secondary and tertiary structure at all stages of fibrillation. When coupled with atomic force (AFM) and scanning electron (SEM) microscopies, Raman spectroscopy becomes a powerful spectroscopic approach that can investigate the structural organization of amyloid fibril polymorphs. In this review, we discuss the applications of Raman spectroscopy, a unique, label-free and non-destructive technique, for the structural characterization of amyloidogenic proteins, prefibrillar oligomers, and mature fibrils.
Article
A combination of analytical instrumentation and multivariate statistics is widely applied to improve in-line process monitoring. Currently, post combustion CO2 capture (PCC) technology often involves the use of multi-amine based chemical reagents for carbon dioxide removal from flue gas. The CO2 capture efficiency and overall process performance may be improved by introduction of the chemometrics analytical methods for flexible and reliable process monitoring. In this study, six variables were measured (conductivity, pH, density, speed of sound, refractive index, and near-infrared absorbance spectra). A compact data-collecting chemometric setup was constructed and installed at an industrial pilot plant for real-case testing. This setup was applied to the characterisation of CO2 absorption into aqueous 2-amino-2-methyl-1-propanol (AMP) activated by piperazine (PZ) as the absorption agent. A partial least squares (PLS) regression model was calibrated and validated based on the measurements conducted in the laboratory environment. The developed approach was applied to predict the concentrations of AMP, PZ, and CO2 with accuracies of ± 2.1%, ± 3.5%, and ± 4.3%, respectively. The model was constructed to be temperature independent in order to make it insensitive to operational temperature fluctuations during a CO2 capture process. The setup and model have been tested for almost 850 hours of in-line measurements at a post-combustion CO2 capture pilot plant. To provide validation of the chemometrics approach, an off-line analysis of the samples has been conducted. The results of the validation techniques benchmarking appear to be consistent with values predicted in-line, with average deviations of ± 1.8%, ± 1.3%, and ± 3.9% for the concentrations of AMP, PZ, and CO2, respectively.
Article
Determination of protein secondary structure (α-helical, β-sheet, and disordered motifs) has become an area of great importance in biochemistry and biophysics as protein secondary structure is directly related to protein function and protein related diseases. While NMR and X-ray crystallography can predict the placement of each atom in a protein to within an angstrom, optical methods (i.e. CD, Raman, and IR) are the preferred techniques for rapid evaluation of protein secondary structure content. Such techniques require calibration data to predict unknown protein secondary structure content where accuracy may be improved with the application of multivariate analysis. Here, a comparison of the protein secondary structure predictions obtained from multivariate analysis of ultraviolet resonance Raman (UVRR) and circular dichroism (CD) spectroscopic data using classical least squares (CLS), partial least squares (PLS), and multivariate curve resolution-alternating least squares (MCR-ALS) is made. Results of the multivariate analysis suggest that CD measurements provide more accurate prediction of protein α-helical content whereas UVRR more accurately predicts β-sheet content, an observation that is consistent with previous studies. Based on this analysis, it is suggested that the best approach to rapid and accurate protein secondary structure determination is to combine both CD and UVRR spectroscopic data.
Article
Recent decades have witnessed the development of techniques able to obtain information at the submolecular level and their applications, especially in the biomedical field. As an example, the stimulating work of studying the role of conformational dynamics in reaction mechanisms has resulted in numerous advances in life sciences. Interferometry belongs to a family of techniques in which waves, usually electromagnetic, are superimposed in order to gain information about them. Although the complexity of the optical technology generally requires the use of sophisticated equipment, dual polarization interferometry (DPI) employs a simplified slab waveguide interferometer, where the reference layer is located under the sensing waveguide. Slab waveguides are structures with a planar geometry, which guide light in only one transverse direction as lateral modes become effectively infinite. An important advantage of this configuration is the absence of scattering between transverse and lateral modes.
Article
A fundamental assumption in biology is that protein structure is more conserved than protein sequence. This opinion stems from observations of the fold distribution of protein structures present in the protein data bank (PDB), where homologous proteins are found to display the same fold. However, the set of proteins in PDB is not representative of protein sequence space. The predominant experimental method used for the structural characterization of the proteins in PDB is X-ray crystallography. Thus, conformational flexibility is lost and a static snapshot of a protein’s dynamic structure remains. Using structural disorder as a proxy for protein dynamism, it seems that at least 30% of eukaryotic proteins have dynamic regions, that is, regions with conformational flexibility. Little is known about the conservation of structurally disordered regions, except that they often evolve at an elevated rate compared to the structured regions. Our preliminary results indicate that there are regions that show fast evolutionary dynamics of structural disorder. A feature of structurally disordered proteins is their functional promiscuity; these proteins often interact with many different biomolecules in a signalling fashion. A central tenet of protein structure and function is that a protein’s function is given by its structure. Structurally disordered proteins exist as conformational ensembles and hence, it is intuitive that a functional ensemble would accompany the structural ensemble. The different conformations in the ensemble are rapidly interconverting over a shallow free energy landscape with multiple minima of similarly low stability. In accordance with the extended view of allostery (conformational selection), stabilizing interactions with other biomolecules can drive the population of conformations towards a particular functional conformation. 
Further, by coupling the conformational ensemble free energy landscape to an adaptive fitness landscape for gene/protein function, we are investigating how proteins with conformational and functional ensembles evolve. Here, a study across flaviviruses is presented. Flaviviruses depend on conformational flexibility at many steps in their life cycle, and the entire flavivirus proteome consists of a polyprotein about 3400 amino acids long that is spliced into 11–12 proteins. Therefore, full genome studies are feasible. We find support for mutation-driven conformational selection, as amino acid substitutions can increase the rate of divergence of conformational flexibility among different homologous proteins. Some regions are more prone to shift from order to disorder or vice versa, and some lineages show rapid shifts from order to disorder. Lineage-specific shifts between disorder and order can alter the conformational ensemble for the same protein in different species, causing a subtle functional change. Thus, rapid evolutionary dynamics of structural disorder could be a potential driving force for biological divergence among flaviviruses.
Article
Amyloid fibrils formed by peptides found in semen have been shown to enhance HIV infectivity in vitro. The first of these peptides to be identified was the 248–286 fragment of prostatic acid phosphatase (PAP248–286) (Munich et al., 2007). PAP248–286 is highly cationic, and its fibrils might facilitate infection by decreasing the electrostatic repulsion between the negatively charged surfaces of the virus and the target cell. Whereas PAP248–286 can easily form fibrils in seminal fluid, it needs rapid agitation in other environments, and certain ions have been shown to be critical for its assembly into fibrils (Olsen et al., 2012). However, mutation of the positively charged residues to alanine results in a peptide (PAP248–286Ala) that can more easily form fibrillar aggregates. We studied PAP248–286 and PAP248–286Ala fibril formation in water and water + NaCl environments. While PAP248–286Ala can efficiently form fibrils in both water and water + NaCl, PAP248–286 can only do so in a water + NaCl solution. The inability of PAP248–286 to form fibrils in water could be due solely to repulsion between the positively charged peptides, an effect that might be diminished by the presence of salt. However, it is also possible that the explanation lies in PAP248–286’s failure to populate conformations that can easily lead to ordered aggregates. To answer this question, using molecular dynamics simulations, we characterized the ensemble of conformations populated by the two peptides in water and water + NaCl environments. The results indicate that PAP248–286Ala favors contacts that stabilize a strand-turn-strand, or β-arch, motif around P31, the only proline residue in the sequence. Because β-arches are a common feature in amyloid fibrils, and because it is very unlikely that a proline residue would be in any position other than the β-arch, we expect the formation of this motif to be the rate-limiting step in PAP248–286Ala / PAP248–286 fibril formation.
Moreover, the contacts stabilizing the β-arch would bring positively charged residues into contact in PAP248–286, which, consistent with the experimental results, would be facilitated by the presence of negative ions. To summarize, we have tried to understand if the inability of PAP248–286 to efficiently form fibrils in water is only due to a slower aggregation caused by electrostatic repulsion between the positively charged peptides. Our data suggest that this effect is also due to electrostatic repulsion between the residues within each monomeric peptide, which prevents PAP248–286 from populating conformations that would lead to ordered aggregates.
Article
Amyloid fibrils are associated with many neurodegenerative diseases. All known amyloids, including pathogenic and nonpathogenic forms, display functional and structural heterogeneity (polymorphism), which determines the level of their toxicity. Despite its significant biological and biomedical importance, the nature of amyloid fibril polymorphism remains elusive. We utilized, for the first time, three of the most advanced vibrational techniques to probe the core, the surface, and the supramolecular chirality of fibril polymorphs. A new type of folding-aggregation phenomenon, spontaneous refolding from one polymorph to another, was discovered (Kurouski, Lauro et al., 2010). Hydrogen–deuterium exchange deep UV resonance Raman spectroscopy (Oladepo, Xiong et al., 2012) combined with advanced statistical analysis (Shashilov & Lednev, 2010) allowed for structural characterization of the highly ordered cross-β core of amyloid fibrils. We reported several examples showing significant variations in the core structure for fibril polymorphs. Amyloid fibrils are generally composed of several protofibrils and may adopt variable morphologies, such as twisted ribbons or flat-like sheets. We discovered the existence of another level of amyloid polymorphism, namely, that associated with fibril supramolecular chirality. Two chiral polymorphs of insulin, which can be controllably grown by means of small pH variations, exhibit opposite signs of vibrational circular dichroism (VCD) spectra (Kurouski, Dukor et al., 2012). VCD supramolecular chirality is correlated not only with the apparent fibril handedness but also with the sense of supramolecular chirality from a deeper level of chiral organization at the protofilament level of fibril structure. A small pH change initiates spontaneous transformation of insulin fibrils from one polymorph to another. As a result, fibril supramolecular chirality overturns, accompanied by both morphological and structural changes (Kurouski, Dukor et al., 2012).
No conventional methods could probe the fibril surface despite its significant role in the biological activity. We utilized tip-enhanced Raman spectroscopy (TERS) to characterize the surface structure of an individual fibril due to a high depth and lateral spatial resolution of the method in the nanometer range (Kurouski, Deckert-Gaudig et al. 2012). It was found that the surface is strongly heterogeneous and consists of clusters with various protein conformations and amino acid composition.
Article
Portable, even handheld, Raman spectrometers can provide adequate spectral resolution for identifying materials in situ. Raman spectra are information rich but not easy to interpret, especially for mixtures. The ability to identify the components of a mixture is therefore of considerable interest and a considerable challenge. In this study, a promising solution for rapid mixture identification was developed with the assistance of a handheld Raman spectrometer, a spectral database and chemometrics. The Raman spectra of raw materials commonly used in the pharmaceutical industry were acquired under suitable conditions and inserted into the spectral database. The classic reverse searching procedure was modified according to the features of Raman spectra, based on automatic and accurate peak detection in the wavelet space. The match quality can be calculated by counting the negative ratio in the subtractive spectrum between mixture and database (scaled by the minimal ratio of the reversely matched peaks). On the basis of the modified reverse searching and non-negative least squares (RSearch-NNLS), a practical method is proposed for mixture analysis. The results show that the proposed RSearch procedure is superior to the method based on the correlation coefficient for identifying compounds in a mixture. Non-negative least squares can further refine the search results and estimate the ratios of the compounds in the mixture. The proposed RSearch-NNLS method may be a promising procedure for solving the mixture analysis problem for Raman spectra in some applications.
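The non-negative least squares step mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the mixture spectrum is a non-negative linear combination of reference spectra and uses a simple projected-gradient solver, which is an illustrative stand-in and not necessarily the solver used in the study. The function name `nnls_projected_gradient` and the three five-channel "library spectra" are hypothetical.

```python
def nnls_projected_gradient(A, b, iters=2000):
    """Minimize ||A x - b||^2 subject to x >= 0 by projected gradient descent.

    A is a list of rows (channels x components); b is the mixture spectrum.
    Illustrative solver only, not the algorithm of the cited study.
    """
    m, n = len(A), len(A[0])
    # Normal-equation pieces: Gram matrix A^T A and vector A^T b.
    AtA = [[sum(A[i][p] * A[i][q] for i in range(m)) for q in range(n)] for p in range(n)]
    Atb = [sum(A[i][p] * b[i] for i in range(m)) for p in range(n)]
    # The trace of A^T A bounds its largest eigenvalue, so 1/trace is a safe step.
    step = 1.0 / sum(AtA[p][p] for p in range(n))
    x = [0.0] * n
    for _ in range(iters):
        grad = [sum(AtA[p][q] * x[q] for q in range(n)) - Atb[p] for p in range(n)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[p] - step * grad[p]) for p in range(n)]
    return x

# Hypothetical library: columns are reference spectra sampled on 5 channels.
A = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [2.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
# Synthetic mixture: 0.7 of component 0 plus 0.3 of component 2, no component 1.
b = [0.7, 0.0, 0.3, 1.7, 0.0]

ratios = nnls_projected_gradient(A, b)
```

In this toy case the recovered coefficients approach the mixing ratios used to build `b`, and the absent component stays at zero because of the non-negativity constraint.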
Article
The cellular and matrix cues that induce stem cell differentiation into distinct cell lineages must be identified to permit the ex vivo expansion of desired cell populations for clinical applications. Combinatorial biomaterials enable screening multiple different microenvironments while using small numbers of rare stem cells. New methods to identify the phenotypes of individual cells in cocultures with location specificity would increase the efficiency and throughput of these screening platforms. Here, we demonstrate that partial least-squares discriminant analysis (PLS-DA) models of calibration Raman spectra from cells in pure cultures can be used to identify the lineages of individual cells in more complex culture environments. The calibration Raman spectra were collected from individual cells of four different lineages, and a PLS-DA model that captured the Raman spectral profiles characteristic of each cell line was created. The application of these models to Raman spectra from test sets of cells indicated individual, fixed and living cells in separate monocultures, as well as those in more complex culture environments, such as cocultures, could be identified with low error. Cells from populations with very similar biochemistries could also be identified with high accuracy. We show that these identifications are based on reproducible cell-related spectral features, and not spectral contributions from the culture environment. This work demonstrates that PLS-DA of Raman spectra acquired from pure monocultures provides an objective, noninvasive, and label-free approach for accurately identifying the lineages of individual, living cells in more complex coculture environments.
Article
Determination of risedronate sodium (RIS) in its commercial tablet formulations in the presence of excipients was performed by derivative spectrophotometry and continuous wavelet transform. No preliminary separation step was used for the quantitative analysis by the proposed methods. First, the direct absorbance measurement (DAM) method was applied to the analysis of RIS in samples, but this method did not give desirable results for the commercial tablet samples. For this reason, the signal processing methods, first derivative spectrophotometry (DS) and the Morlet and Biorthogonal2.8 continuous wavelet transforms (Morlet-CWT and Bior2.8-CWT, respectively), were applied to the quantitative resolution of the samples containing RIS and tablet excipients without using any separation step. These methods were found to be suitable for the analysis of the drug. Calibration graphs were obtained by using the relationships between the first derivative and CWT signals and concentration. The linearity ranges of the DS and CWT methods were found to be 0.8–120 μg/mL and 3.0–100.0 μg/mL, respectively, and good correlations were observed for the calibration equations. The proposed methods were validated and applied to the determination of RIS in pharmaceutical preparations. The experimental results obtained by the proposed methods were statistically compared with each other and with those obtained by a spectrophotometric method reported in the literature. In the statistical analysis, no significant difference was found between the assay results. It was observed that the DS, Morlet-CWT and Bior2.8-CWT methods were accurate, sensitive, precise, rugged and useful for the quality control of RIS in commercial pharmaceutical samples without interference from the excipients.
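A short Python sketch of first derivative spectrophotometry may help here. One well-known property of derivative spectra is that differentiation removes a constant baseline offset; whether that is the main interference in this study is not stated in the abstract. The function name `first_derivative` and the synthetic Gaussian band are assumptions made for illustration only.

```python
import math

def first_derivative(wavelengths, absorbance):
    """Central-difference estimate of dA/dλ at the interior wavelength points."""
    return [
        (absorbance[i + 1] - absorbance[i - 1]) / (wavelengths[i + 1] - wavelengths[i - 1])
        for i in range(1, len(absorbance) - 1)
    ]

# Synthetic, made-up data: one Gaussian band, with and without a flat background.
wavelengths = [200.0 + i for i in range(101)]  # nm, 1 nm spacing
band = [math.exp(-((w - 250.0) / 10.0) ** 2) for w in wavelengths]
with_offset = [a + 0.25 for a in band]  # same band on a constant baseline

d_band = first_derivative(wavelengths, band)
d_offset = first_derivative(wavelengths, with_offset)
```

The two derivative traces agree to within rounding error even though the raw absorbances differ by 0.25 everywhere, and the derivative crosses zero at the band maximum.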
Article
The lack of reliable methods for accurate estimation of protein secondary structure from infrared spectra of proteins is a major barrier to the widespread use of infrared spectroscopy in protein secondary structure characterisation. Here we report a method for protein secondary structure estimation from FTIR spectra of proteins, based on a multi-layer feed-forward neural network approach using an enhanced “resilient backpropagation” learning algorithm. The method utilises a database consisting of infrared spectra of 18 proteins with known X-ray structure as the reference set. Our study revealed that providing the neural network analysis with only part of the amide I region, taken from empirically determined structure-sensitive regions, in combination with appropriate pre-processing of the spectral data produced the best overall results. This led to a standard error of prediction (SEP) of 4.47% for α-helix, an SEP of 6.16% for β-sheet, and an SEP of 4.61% for turns. Compared to a previous factor analysis study by Lee et al., using the same set of 18 FTIR spectra of proteins, the error in prediction of α-helix and β-sheet was improved by 3.33% and 3.54%, respectively, with a minor increase of 0.31% for turns. Generally, our neural network analysis achieved comparable, and in most cases even better, prediction accuracy than most of the alternative pattern recognition based methods that were previously reported, indicating the significant potential of this approach.
Article
We demonstrate the Bayesian spectral analysis approach for analyzing neutron scattering molecular tunneling data. It is a generalized form of model fitting, which is appropriate when the number of parameters to be optimized is not known. Specifically, it addresses the question of how many excitation lines there is evidence for in the data. We review the theory of Bayesian spectral analysis relevant to our particular application, describe an efficient algorithm for its implementation, and illustrate its use with both simulated and real data. We believe that this powerful method of analysis will be a very useful tool in experimental molecular spectroscopy.
Article
The secondary structure of proteins has been predicted using neural networks (NN) from their Fourier transform infrared spectra. A leave-one-out approach has been used to demonstrate the applicability of the method. A form of cross-validation is used to train the NN to prevent overfitting. Multiple neural network outputs are averaged to reduce the variance of predictions. The networks realized have been tested, and rms errors of 7.7% for α-helix, 6.4% for β-sheet and 4.8% for turns have been achieved. These results indicate that the methodology introduced is effective and that estimation accuracies are in some cases better than those previously reported in the literature.
Article
The problem of the non-robustness of the classical estimates in the setting of quadratic and linear discriminant analysis has been addressed by many authors: Todorov et al. [19, 20], Chork and Rousseeuw [1], Hawkins and McLachlan [4], He and Fung [5], Croux and Dehon [2], Hubert and Van Driessen [6]. To obtain high breakdown, these methods are based on high breakdown point estimators of location and covariance matrix like MVE, MCD and S. Most of the authors also use one-step re-weighting after the high breakdown point estimation in order to obtain increased efficiency. We propose to use M-iteration as described by Woodruff and Rocke [22] instead, since this is the preferred means of achieving efficiency with high breakdown. Further, we experiment with the pairwise class of algorithms proposed by Maronna and Zamar [10], which have not been used up to now in the context of discriminant analysis. The available methods for robust linear discriminant analysis are compared on two real data sets and in a large-scale simulation study. These methods are implemented as R functions in the package for robust multivariate analysis rrcov.
Article
We describe an automated spectro-goniometer (ASG) that rapidly measures the spectral hemispherical-directional reflectance factor (HDRF) of snow in the field across the wavelength range 0.4–2.5 μm. Few measurements of snow's HDRF exist in the literature, in part because of the lack of a portable instrument capable of rapid, repeatable sampling. The ASG is a two-link spherical robot coupled to a field spectroradiometer. The ASG is the first revolute joint and first automated field goniometer for use over snow and other smooth surfaces. It is light enough (50 kg) to be portable in a sled by an individual. The ASG samples the HDRF at arbitrary angular resolution and a 0.5 Hz sampling rate. The arm attaches to the fixed-point frame 0.65 m above the surface. With vertical and oblique axes, the ASG places the sensor of the field spectroradiometer at any point on the hemisphere above a snow target. In standard usage, the ASG has the sun as the illumination source to facilitate in situ measurements over fragile surfaces not easily transported to the laboratory and to facilitate simultaneous illumination conditions for validation and calibration of satellite retrievals. The kinematics of the ASG is derived using Rodrigues' formula applied to the 2 degree-of-freedom arm. We describe the inverse kinematics for the ASG and solve the inverse problem from a given view angle to the necessary rotation about each axis. Its two-dimensional hemispheric sampling space facilitates the measurement of spectral reflectance from snow and other relatively smooth surfaces into any direction. The measurements will be used to validate radiative transfer model results of directional reflectance and to validate/calibrate directional satellite measurements of reflectance from these smooth surfaces. © 2003 American Institute of Physics.
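For readers unfamiliar with Rodrigues' formula, the vector form rotates a vector v about a unit axis k by an angle θ as v cos θ + (k × v) sin θ + k (k · v)(1 − cos θ). The Python sketch below applies it twice to mimic a two-axis arm. The function name `rodrigues_rotate`, the 45° oblique axis and the joint angles are hypothetical values chosen for illustration and do not describe the actual ASG geometry.

```python
import math

def rodrigues_rotate(v, k, theta):
    """Rotate vector v about unit axis k by theta radians (Rodrigues' formula)."""
    kx, ky, kz = k
    vx, vy, vz = v
    c, s = math.cos(theta), math.sin(theta)
    dot = kx * vx + ky * vy + kz * vz                              # k . v
    cross = (ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx)  # k x v
    return tuple(
        vi * c + ci * s + ki * dot * (1.0 - c)
        for vi, ci, ki in zip(v, cross, k)
    )

# Hypothetical two-joint pose: rotate about an oblique axis, then about the vertical.
sensor = (1.0, 0.0, 0.0)
oblique = (math.sin(math.radians(45.0)), 0.0, math.cos(math.radians(45.0)))
vertical = (0.0, 0.0, 1.0)

after_oblique = rodrigues_rotate(sensor, oblique, math.radians(30.0))
pointing = rodrigues_rotate(after_oblique, vertical, math.radians(60.0))
```

Because each step is a pure rotation, the length of the sensor vector is preserved. Solving the inverse problem (finding the two angles that give a desired view direction) is a separate step that this sketch does not attempt.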
Article
Stabilization of Protein Structure (Kenneth P. Murphy); Protein Stabilization by Naturally Occurring Osmolytes (D. Wayne Bolen); The Thermodynamic Linkage Between Protein Structure, Stability, and Function (Ernesto Freire); Measuring the Conformational Stability of a Protein by Hydrogen Exchange (Beatrice M.P. Huyghues-Despointes, C. Nick Pace, S. Walter Englander, and J. Martin Scholtz); Modeling the Native State Ensemble (Vincent J. Hilser); Conformational Entropy in Protein Folding: A Guide to Estimating Conformational Entropy via Modeling and Computation (Trevor P. Creamer); Turn Scanning: Experimental and Theoretical Approaches to the Role of Turns (Carl Frieden, Enoch S. Huang, and Jay W. Ponder); Laser Temperature-Jump Methods for Studying Folding Dynamics (James Hofrichter); Kinetics of Conformational Fluctuations by EX1 Hydrogen Exchange in Native Proteins (T. Sivaraman and Andrew D. Robertson); Molecular Dynamics Simulations of Protein Unfolding/Folding (Valerie Daggett).
Article
Support Vector Machines; Basic Methods of Least Squares Support Vector Machines; Bayesian Inference for LS-SVM Models; Robustness; Large Scale Problems; LS-SVM for Unsupervised Learning; LS-SVM for Recurrent Networks and Control.
Chapter
Fourier transform infrared spectroscopy is becoming an important method for studying protein secondary structure. The amide I region of the protein infrared spectrum is widely used, whereas the amide III region has been comparatively neglected because of its low signal. Since there is no water interference in the amide III region and, more importantly, the different secondary structures of proteins differ more significantly in their amide III spectra, the amide III region is quite promising for determining protein secondary structure. In our current study, the partial least-squares (PLS) method was used to predict protein secondary structures from protein IR spectra. The IR spectra of H2O solutions of 9 different proteins of known crystal structure were recorded, and the amide I region, the amide III region, and the combined amide I and amide III regions of these spectra were used to set up the calibration set for the PLS algorithm. Our results correlate quite well with the data from X-ray studies, and the prediction from the amide III region is better than that from the amide I region or the combined amide I and amide III regions.
Article
Background: The aim of this study was to investigate the potential of infrared (IR) spectroscopy as a fast and reagent-free adjunct tool in the diagnosis and screening of β-thalassemia. Methods: Blood was obtained from 56 patients with β-thalassemia major, 1 patient with hemoglobin H disease, and 35 age-matched controls. Hemolysates of blood samples were centrifuged to remove stroma. IR absorption spectra were recorded for duplicate films dried from 5 μL of hemolysate. Differentiation between the two groups of hemoglobin spectra was by two statistical methods: an unsupervised cluster analysis and a supervised linear discriminant analysis (LDA). Results: The IR spectra revealed changes in the secondary structure of hemoglobin from β-thalassemia patients compared with that from controls, in particular, a decreased α-helix content, an increased content of parallel and antiparallel β-sheets, and changes in the tyrosine ring absorption band. The hemoglobin from β-thalassemia patients also showed an increase in the intensity of the IR bands from the cysteine −SH groups. The unsupervised cluster analysis, statistically separating spectra into different groups according to subtle IR spectral differences, allowed separation of control hemoglobin from β-thalassemia hemoglobin spectra, based mainly on differences in protein secondary structure. The supervised LDA method provided 100% classification accuracy for the training set and 98% accuracy for the validation set in partitioning control and β-thalassemia samples. Conclusion: IR spectroscopy holds promise in the clinical diagnosis and screening of β-thalassemia.
Article
This paper summarizes our recent activities to develop analytical spectroscopic tools for high-throughput screening of combinatorial materials libraries and the adaptation of the developed techniques for measurements on more traditional scales. The development strategies are presented, followed by the review of requirements for monitoring of chemical reactions in combinatorial and scaled-up reactors. Examples are provided to detail development of multivariate data-analysis methods for prediction of properties of combinatorial materials, determination of contributing factors to combinatorial-scale chemical reactions, and high-throughput optimization of process parameters.
Article
The independent component analysis (ICA) method was applied as a processing step for Raman spectra. 136 Raman spectra were acquired from urine samples from 18 subjects. Each spectrum was acquired from a different sample. A 785 nm, 100 mW (at sample) laser with a 2048-element linear silicon TE-cooled CCD was used. In order to separate information of glucose, creatinine, urea nitrogen, uric acid and invaluable information from the urine spectrum, ICA by the Maximum Likelihood (ML) fast fixed-point estimation algorithm was applied. By looking for maximum likelihood, independent information could be separated from the urine spectra. Among the separated information, high-frequency noise, which could be generated by ambient noise, and low-frequency noise, which contains information on baseline shift, were observed. Additionally, peak information of each component was observed. The processing time was very short because the fast fixed-point algorithm was added to the ML estimation method. Before applying ICA, all spectra were mean centered in order to enhance the peak information. In addition, all spectra were pre-processed to have unit variance in order to shorten calculation time. This first study on applying ICA suggested that this algorithm can be used as a pattern recognition algorithm to extract information from Raman spectra. Additionally, because ICA can provide statistically independent information, further studies on whether ICA can substitute for PCA will be performed.
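The two preprocessing steps named in this abstract (mean centering and scaling to unit variance) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' code: it assumes that spectra are stored one per row and that centering and scaling are applied per spectrum, which the abstract does not specify. The function name `standardize_spectra` is hypothetical.

```python
import numpy as np

def standardize_spectra(spectra):
    """Mean-center each spectrum and scale it to unit variance.

    spectra: 2-D array with one spectrum per row (assumed layout).
    """
    spectra = np.asarray(spectra, dtype=float)
    # Subtract the mean intensity of each spectrum (row-wise centering).
    centered = spectra - spectra.mean(axis=1, keepdims=True)
    std = centered.std(axis=1, keepdims=True)
    # Guard against flat spectra so that we never divide by zero.
    std[std == 0] = 1.0
    return centered / std
```

If the intent is instead to standardize each wavenumber channel across samples, the same sketch applies with `axis=0`.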
Article
Amyloid fibrils are associated with many neurodegenerative diseases. The application of conventional biophysical techniques including solution NMR and X‐ray crystallography for structural characterization of fibrils is limited because they are neither crystalline nor soluble. The Bayesian approach was utilized for extracting the deep UV resonance Raman (DUVRR) spectrum of the lysozyme fibrillar β‐sheet based on the hydrogen‐deuterium exchange spectral data. The problem was shown to be unsolvable when using blind source separation or conventional chemometrics methods because of the 100% correlation of the concentration profiles of the species under study. Information about the mixing process was incorporated by forcing the columns of the concentration matrix to be proportional to the expected concentration profiles. The ill‐conditioning of the matrix was removed by concatenating it to the diagonal matrix with entries corresponding to the known pure spectra (sources). Prior information about the spectral features and characteristic bands of the spectra was taken into account using the Bayesian signal dictionary approach. The extracted DUVRR spectrum of the cross‐β sheet core exhibited sharp bands indicating the highly ordered structure. Well resolved sub‐bands in Amide I and Amide III regions enabled us to assign the fibril core structure to anti‐parallel β‐sheet and estimate the amide group facial angle Ψ in the cross‐β structure. The elaborated Bayesian approach was demonstrated to be applicable for studying correlated biochemical processes.
Article
The alpha1-beta2 subunit contacts in the half-ligated hemoglobin A (Hb A) have been explored with ultraviolet resonance Raman (UVRR) spectroscopy using the Ni-Fe hybrid Hb under various solution conditions. Our previous studies demonstrated that Trpbeta37, Tyralpha42, and Tyralpha140 are mainly responsible for UVRR spectral differences between the complete T (deoxyHb A) and R (COHb A) structures [Nagai, M., Wajcman, H., Lahary, A., Nakatsukasa, T., Nagatomo, S., and Kitagawa, T. (1999) Biochemistry, 38, 1243-1251]. On the basis of it, the UVRR spectra observed for the half-ligated alpha(Ni)beta(CO) and alpha(CO)beta(Ni) at pH 6.7 in the presence of IHP indicated the adoption of the complete T structure similar to alpha(Ni)beta(deoxy) and alpha(deoxy)beta(Ni). The extent of the quaternary structural changes upon ligand binding depends on pH and IHP, but their characters are qualitatively the same. For alpha(Ni)beta(Fe), it is not until pH 8.7 in the absence of IHP that the Tyr bands are changed by ligand binding. The change of Tyr residues is induced by binding of CO, but not of NO, to the alpha heme, while it was similarly induced by binding of CO and NO to the beta heme. The Trp bands are changed toward R-like similarly for alpha(Ni)beta(CO) and alpha(CO)beta(Ni), indicating that the structural changes of Trp residues are scarcely different between CO binding to either the alpha or beta heme. The ligand induced quaternary structural changes of Tyr and Trp residues did not take place in a concerted way and were different between alpha(Ni)beta(CO) and alpha(CO)beta(Ni). These observations directly indicate that the phenomenon occurring at the alpha1-beta2 interface is different between the ligand binding to the alpha and beta hemes and is greatly influenced by IHP. A plausible mechanism of the intersubunit communication upon binding of a ligand to the alpha or beta subunit to the other subunit and its difference between NO and CO as a ligand are discussed.
Article
A mathematical technique for the identification of components in the near-infrared spectra of liquid mixtures without any prior chemical information is demonstrated. Originally, the technique was developed for searching mid-infrared spectral libraries. It utilizes principal component analysis to generate an orthonormal reference library and to compute the projections or scores of a mixture spectrum onto the principal space spanned by the orthonormal set. Both library and mixture spectra are analyzed and processed in the Fourier domain to enhance the searching performance. A calibration matrix is calculated from library scores and is used to predict the mixture composition. Five liquid mixtures were correctly identified with the use of the calibration algorithm, whereas only one mixture was correctly characterized with a straight dot-product metric. The predictions were verified with the use of an adaptive filter to remove each of the resulting components from the library and the mixture spectra. In addition, a similarity index between the original mixture spectrum and a regenerated mixture spectrum is used as a final confirmation of the predictions. The effects of random noise on the searching method were also examined, and further enhancements of searching performance are suggested for identifying poor-quality mixture spectra.
Article
Recently, a new SIMPLISMA approach was described where both the conventional spectral intensities (for pure variables of wide spectral features) and second derivative spectral intensities (for pure variables of narrow spectral features, overlapping with wide spectral features) are used. The term conventional is used to contrast the original data to the second derivative data. This new SIMPLISMA approach is able to resolve spectra with wide and narrow peaks properly and minimizes baseline problems by resolving them as separate components. A problem with this approach is that the mathematically calculated spectra shown during the interactive process are difficult to interpret. For example, a background spectrum may have a significant negative contribution from the other components in the mixture. In this paper, an alternative approach will be described based on a different algorithm which will avoid some of the problems with SIMPLISMA. Furthermore, this new approach is also capable of dealing with concentrations that are constant in the data set. The method is based on the determination of the maximum angle between variables and is called stepwise maximum angle calculations (SMAC). Examples will be given of NMR spectra of bound and free surfactants and Raman microspectrometry data of dust particles taken from ore stocks of a lead and zinc factory.
Article
The synthesis and hydrolysis of ethyl acetate were monitored with a Raman spectrometer equipped with a 785-nm laser, confocal optics, an optical fiber, a holographic transmission grating, and a charge coupled device (CCD) detector. In the Raman spectra, the signals were seen to vary in proportion to the concentrations of ethanol, acetic acid and ethyl acetate. First-order reaction rate constants were determined for the hydrolysis reaction under different experimental conditions according to a full factorial experimental design. The rate constants were determined either by using the peak height at one single wavelength, or by using score values from principal components analysis (PCA) or partial least squares (PLS). The rate constants determined agree well with literature values. It is shown that the single wavelength and the PCA/PLS approaches give similar reaction profiles, where the standard normal variate (SNV) transformation was used for spectral pretreatment. The use of Raman spectroscopy for reaction monitoring in opalescent mixtures is also shown and discussed.
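As an illustration of the single-wavelength route to a first-order rate constant, the following Python sketch fits a straight line to the logarithm of a decaying signal. It is a sketch under stated assumptions rather than the procedure of the paper: it assumes a baseline-corrected, strictly positive peak height that is proportional to the reactant concentration and decays as s(t) = s0 exp(-k t). The function name `first_order_rate_constant` is hypothetical.

```python
import numpy as np

def first_order_rate_constant(t, signal):
    """Estimate k and s0 for signal = s0 * exp(-k * t).

    Assumes signal > 0 everywhere (e.g. a baseline-corrected peak height).
    """
    t = np.asarray(t, dtype=float)
    signal = np.asarray(signal, dtype=float)
    # ln(signal) = ln(s0) - k * t, so a degree-1 fit gives slope = -k.
    slope, intercept = np.polyfit(t, np.log(signal), 1)
    return -slope, np.exp(intercept)
```

The same function could be applied to PCA or PLS score values in place of a peak height, provided the scores are shifted so that they decay toward zero.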
Article
A novel approach for the pre-selection of wavelengths, to be used in combination with Partial Least Squares (PLS) or other multivariate regression techniques, is presented. This variable selection method makes use of the purity function, originally suggested in the SIMPLe-to-use Interactive Self-modeling Mixture Analysis (SIMPLISMA) algorithm, to map up the regions of potentially influential variables. The selected intervals are then individually tested in practical modeling and prediction, and an optimal subset of variables is obtained. The algorithm is simple and intuitive and does not rely on iterative variable searches. The method was tested on a set of infrared protein spectra in order to improve the quantitative determination of the fractions of two secondary structure elements, α-helices and β-strands (β-sheets) in the protein polypeptide chain. Comparable results to those obtained through interval PLS (iPLS), an exhaustive search-based algorithm, were achieved in this study. Our method was shown to be particularly beneficial in combination with variable weighting by their inverse standard deviation.
Article
Two examples are given demonstrating the use of multivariate modeling in Raman process control applications. In one example, principal component analysis (PCA) and principal component regression (PCR) are used to model the curing of a high performance thermoset. The PCA results are found to give more accurate results when compared to univariate methods. In a second example, the octane number of gasoline is accurately modeled using partial least squares (PLS) regression analysis. For both examples, methods of normalization are considered in an effort to overcome the limitations of the single beam nature of Raman spectra.
Article
Software using maximum entropy (MaxEnt) analysis has been developed, and used to deconvolute complete electrospray spectra of protein mixtures. It automatically produces zero-charge mass spectra on a molecular mass scale, along with probabilistic quantification so that the reliability of features in the spectrum can be ascertained. Because maximum entropy is faithful to the experimental data, the results tend to have improved resolution and signal-to-noise ratio. This improved performance, particularly regarding resolution, is demonstrated on a haemoglobin containing two β-globins separated by 12 Da at m/z 15 867 (0.08%). A separation of 12 Da was previously the closest at which mass measurement of two globins was practicable. Also, two hitherto unresolved β-globins from a second haemoglobin, separated by 9 Da (0.06%) were resolved by MaxEnt and their masses accurately measured. These are the first results using rigorous MaxEnt analysis in electrospray mass spectrometry.
Article
This work is mainly oriented to give an overview of the progress of multivariate curve resolution methods in the last 5 years. Conceived as a review that combines theory and practice, it will present the basics needed to understand what is the use, prospects and limitations of this family of chemometric methods with the latest trends in theoretical contributions and in the field of analytical applications.
Article
Ab initio quantum mechanical computations of force fields (FF) and atomic polar and axial tensors (APT and AAT) were carried out for triamide strands Ac-A-A-NH-CH3 clustered into single-, double-, and triple-strand β-sheet-like conformations. Models with phi, psi, and omega angles constrained to values appropriate for planar antiparallel and parallel as well as coiled antiparallel (two-stranded) and twisted antiparallel and parallel sheets were computed. The FF, APT, and AAT values were transferred to corresponding larger oligopeptide β-sheet structures of up to five strands of eight residues each, and their respective IR and vibrational circular dichroism (VCD) spectra were simulated. The antiparallel planar models in a multiple-stranded assembly give a unique IR amide I spectrum with a high-intensity, low-frequency component, but they have very weak negative amide I VCD, both reflecting experimental patterns seen in aggregated structures. Parallel and twisted β-sheet structures do not develop a highly split amide I, their IR spectra all being similar. A twist in the antiparallel β-sheet structure leads to a significant increase in VCD intensity, while the parallel structure was not as dramatically affected by the twist. The overall predicted VCD intensity is quite weak but predominantly negative (amide I) for all conformations. This intrinsically weak VCD can explain the high variation seen experimentally in β-forming peptides and proteins. An even larger variation was predicted in the amide II VCD, which had added complications due to non-hydrogen-bonded residues on the edges of the model sheets.
Article
Thermal-induced unfolding of α-chymotrypsin has been monitored with circular dichroism spectroscopy, which shows a far-UV-CD region sensitive to changes in the protein secondary structure and a near-UV-CD region, which gives information at the tertiary structure level. Changes in CD signals in both the far-UV and the near-UV are used to monitor comprehensively the loss of protein structure during unfolding. The application of the chemometric method multivariate curve resolution–alternating least-squares (MCR–ALS) to the spectroscopic measurements allowed for the recovery of the concentration profiles and spectra of three different protein conformations, one of them not obtainable experimentally. Joining the resolved information about the evolution of the tertiary structure and the results coming from methods devoted to the elucidation of the protein secondary structure, the three protein conformations can be characterised as: a native conformation, with both secondary and tertiary structure organized as in the natural active protein; a second conformation, with a modified secondary structure richer in β-sheet and a native-like tertiary structure; and a third conformation, with a secondary structure very similar to the second conformation and with the tertiary structure unfolded.
Article
A model of the curing reaction between phenyl glycidyl ether (PGE) and aniline as the curing agent was studied isothermally at 95 °C and monitored in situ by near-infrared spectroscopy (NIR). The spectra were recorded every 5 min. The ubiquitous problem of rank deficiency in reaction network systems was solved by assembling an augmented column-wise matrix containing five process runs from different initial conditions. The data were analyzed using a two-way multivariate curve resolution alternating least squares method (MCR-ALS). Initial estimates of spectra required by MCR-ALS were given by a SIMPLe-to-use Interactive Self-modeling Mixture Analysis (SIMPLISMA) approach. The reactants, product and intermediate spectra were successfully resolved and the concentration profiles properly represented the system studied. The performance of the model was evaluated by two parameters: ALS lack of fit (lof = 0.88%) and explained variance (R2 = 99.99%). To validate the MCR-ALS results, the similarity coefficients (r) between the recovered spectra and the pure species spectra were calculated. These were: PGE (r = 0.998), aniline (r = 0.994) and tertiary amine (r = 0.999).
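The two figures of merit quoted in this abstract can be computed from a data matrix and its bilinear reconstruction as sketched below. The formulas are the definitions commonly used in the MCR-ALS literature; that the paper used exactly these definitions is an assumption, and the function names are hypothetical.

```python
import numpy as np

def lack_of_fit_percent(D, D_hat):
    """lof (%) = 100 * sqrt(sum(residual^2) / sum(D^2))."""
    D = np.asarray(D, dtype=float)
    resid = D - np.asarray(D_hat, dtype=float)
    return 100.0 * np.sqrt(np.sum(resid**2) / np.sum(D**2))

def explained_variance_percent(D, D_hat):
    """R2 (%) = 100 * (1 - sum(residual^2) / sum(D^2))."""
    D = np.asarray(D, dtype=float)
    resid = D - np.asarray(D_hat, dtype=float)
    return 100.0 * (1.0 - np.sum(resid**2) / np.sum(D**2))
```

Under these assumed definitions the two quantities are linked by R2 = 100 (1 - (lof/100)^2), so a lack of fit of 0.88% corresponds to an explained variance of about 99.99%, in line with the values reported.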
Article
Massive amounts of tandem mass spectra are produced in high-throughput proteomics studies. The manual interpretation of these spectra is not feasible. Instead, search engines are used to match the tandem mass spectra with sequence information contained in proteomics and genomics databases. Typically, these search engines provide a list of the best matching peptide sequences for an individual tandem mass spectrum. As well, they provide scores that are somewhat related to the confidence level in the match. Many peptide tandem mass spectra search engines have been reported. These search engines provide very different results depending on the type of mass spectrometers used and their input parameters. Here we describe a comparative analysis of different search engines using validated test sets of tandem mass spectra. We have defined test sets of MS/MS spectra derived from high throughput proteomics experiments performed by HPLC-ESI-MS/MS on ion trap (LCQ) and tandem quadrupole time-of-flight instruments with a pulsar functionality (Qstar Pulsar) mass spectrometers. We analyzed the ability of the different search engines to identify the correct peptides, and the cross-validations of the different search engines.
Article
Estimating the solvent content in protein crystals is one of the first steps in a macromolecular structure determination. We apply a new statistical technique, independent component analysis (ICA), to determine the volume fraction of the asymmetric unit of proteins occupied by the solvent. The results for several crystal forms are in good agreement with available values, which validates the method. Its main advantage with respect to existing techniques is that it requires only the crystallographic structure factor data and no a priori information about the protein.
Article
Ion association in aqueous solutions of varied concentrations of LiNO3, Mg(NO3)2, Co(NO3)2, Li2SO4 and MgSO4 was studied by means of Raman spectrometry assisted by principal-components (PCA) and evolving-factor (EFA) analyses. Formation of one Raman-active associated species, {M···L(R)} (M=Li+, Mg2+, Co2+; L=NO3−, SO42−), was detected at higher salt concentrations (2 mol dm−3 for LiNO3, 1 mol dm−3 for Co(NO3)2, 0.2 mol dm−3 for Li2SO4, 0.5 mol dm−3 for MgSO4). Spectral profiles of L(aq) and {M···L(R)} species were computed by means of EFA, as well as their equilibrium concentrations in each solution. Maximum fractions of anions engaged in {M···L(R)} species amounted (at highest salt concentrations) to ≈0.6 (LiNO3), ≈0.4 (Co(NO3)2), ≈0.60 (Li2SO4) and ≈0.3 (MgSO4). Since, for a given salt concentration, the equilibrium concentration of Raman-active species, [M···L(R)], was always lower than the concentration of analogous UV-active species, [M···L(UV)] (reported in a previous paper), it is suggested that {M···L(UV)} could be a precursor of {M···L(R)}. In all instances, apparent stability constants of {M···L(R)} species, falling into the range of (0.01 < K1/dm3 mol−1 < 1), have a progressively upward trend with increasing salt concentration.
Article
A library mixture search method originally developed for infrared spectra has been successfully applied to UV-visible spectra. This novel approach for searching a spectral library performs a principal component analysis (PCA) on the entire library of spectra for pure compounds. The library spectra are represented by their PCA scores, and the concentrations (assumed to be unity) are regressed onto these scores. The scores for an unknown spectrum projected onto the PCA basis set are multiplied by the regression matrix to predict pseudo-concentrations or composition indices. After the first pass through the library, a subgroup of the top 20 hits (10% of the library) is selected and the PCR analysis is repeated on this set to improve the selection process. Spectra of each of the individual target components are adaptively filtered from the subgroup of library spectra and from the unknown spectrum prior to the repeat of the PCR analysis. The application of the adaptive filter greatly improves the success rate on hitting the second and third components by removing the first hit during each pass through the library. Computation times for training and applying the Mix-Match algorithm are greatly reduced by pre-processing with Fourier Transforms. A 200-compound library could be trained in 45 min and searched in 9 s; a 20-compound subgroup could be adaptively filtered and searched in 37 s. Both components in 12 two-component mixtures and one component in each of two two-component mixtures were correctly identified; the algorithm failed on both components in only one out of 15 two-component mixtures. All three components were correctly identified in one three-component mixture, and one component was correctly identified in another three-component mixture.
Book
A comprehensive introduction to ICA for students and practitioners. Independent Component Analysis (ICA) is one of the most exciting new topics in fields such as neural networks, advanced statistics, and signal processing. This is the first book to provide a comprehensive introduction to this new technique complete with the fundamental mathematical background needed to understand and utilize it. It offers a general overview of the basics of ICA, important solutions and algorithms, and in-depth coverage of new applications in image processing, telecommunications, audio signal processing, and more. Independent Component Analysis is divided into four sections that cover:
* General mathematical concepts utilized in the book
* The basic ICA model and its solution
* Various extensions of the basic ICA model
* Real-world applications for ICA models
Authors Hyvarinen, Karhunen, and Oja are well known for their contributions to the development of ICA and here cover all the relevant theory, new algorithms, and applications in various fields. Researchers, students, and practitioners from a variety of disciplines will find this accessible volume both helpful and informative.
Article
We propose a filter based on independent component analysis (ICA) for Poisson noise reduction. In the proposed filter, the image is first transformed to the ICA domain and then the noise components are removed by soft thresholding (shrinkage). The proposed filter, which is used as a preprocessing step for the reconstruction, has been successfully applied to penumbral imaging. Both simulation results and experimental results show that the reconstructed image is dramatically improved in comparison to that without the noise-removing filter.
Article
The aim of this study is the determination of the ionic distribution versus the spatial coordinates within the membrane polymers. Confocal Raman spectroscopy is a useful tool for this study because it allows investigation of the nature of the species within materials. The studied membrane is an AW from Solvay (with a poly-4-vinylpyridine graft and an ethylene tetrafluoroethylene matrix). The first results are the pK value (2.25), verification of the graft's homogeneity in the whole membrane, and an initial study of ionic transport within the membrane and of spectra analysis with factor analysis (principal components analysis and evolving factor analysis with multivariate curve resolution).
Article
We have examined the UV Raman spectra of a series of amino acids, dipeptides, tripeptides, and the polypeptides poly(L-glutamic acid) and poly(L-lysine) in their random coil, α-helical and β-sheet forms. Our study examines the assignments of the resonance-enhanced amide bands and characterizes their resonance enhancement mechanisms. We have for the first time characterized the conformational dependence of the intense newly assigned band that derives from the overtone of the amide V vibration. We have examined the pH and conformational dependence of the amide band frequencies and Raman cross sections and relate these dependences to changes in the resonant electronic transition frequency and oscillator strength. We use these spectral parameters to monitor the thermal conversion of poly(L-glutamic acid) (PGA) and poly(L-lysine) (PLL) from the α-helix to β-sheet form and to determine the pH dependence of the transition of PLL to the β-sheet form. We also demonstrate an α-helix-like intermediate of PGA during the transition to the β-sheet conformation. We propose a quantitative relationship between the observed UV Raman spectral cross sections and frequencies and protein secondary structure that will prove useful for conformational studies. These results will serve as the background for analyses of protein secondary structure.
Article
One of the major methods used to resolve spectral data by self-modeling techniques requires the presence of pure variables. A pure variable is a variable that has an intensity contribution from only one of the components in the data set. For spectral data obtained in the near-infrared (near-IR) region (ca. 1-2.5 μm), pure variables are often not available, due to the width of the spectral peaks and the presence of a baseline. The application of self-modeling mixture analysis techniques has to be used with caution for these data. In this paper, it will be shown that, despite the absence of pure variables in near-IR data, it is possible to resolve the data properly by using the second-derivative spectra as an intermediate step. The basic technique will be demonstrated with the recently developed SIMPLISMA (SIMPLe-to-use Interactive Self-modeling Mixture Analysis) approach using a simulated data set. A complete example is given for a five-component solvent mixture using near-IR data.
Article
In the analytical environment, spectral data resulting from the analysis of samples often represent mixtures. To extract information about pure components often is a major problem, especially when reference spectra are not available. For this type of problem, self-modeling mixture analysis techniques have been developed. Although successful commercial applications have been developed, the application of these techniques to complex data sets requires skilled operators. Furthermore, no general purpose software is available. In order to make self-modeling mixture analysis more accessible, a new method has been developed. For the approach described here, all the intermediate steps can be presented straightforwardly in the form of spectra, and it is possible to direct the procedure by using chemical knowledge of the samples. Examples will be shown of Raman spectroscopic data of a reaction, where spectra of intermediates are extracted, and of FT-IR microscopic data of a polymer laminate, where it will be shown that spectra of layers below the resolution of the FT-IR microscope can be calculated.
Article
Raman spectra of proteins that are obtained with deep ultraviolet excitation contain resonance-enhanced amide bands of the polypeptide backbone, as well as aromatic side chain bands. The amide bands are sensitive to conformation, and can be used to estimate the backbone secondary structure. UV Raman spectra are reported at 206.5 and 197 nm, for a set of 12 proteins with varied secondary structure content, and are used to establish quantitative signatures of secondary structure via least-squares fitting. Amide band enhancement is greater at 197 nm, where basis spectra are established for β-turn, as well as α-helix, β-sheet and unordered structures; the lower signal strength at 206.5 nm does not provide a reliable spectrum for the first of these. Application of these basis spectra is illustrated for the melting of apo-myoglobin. The amide band positions and cross sections are discussed. Copyright © 2006 John Wiley & Sons, Ltd.
Article
This paper describes a contribution to Elsevier's data base of files. The spectral files are: (a) Raman spectra of a reaction followed in time, (b) FTIR microscopy spectra of a polymer laminate, (c) NIR spectra of mixtures of five solvents, and (d) time resolved mass spectra of a three-component mixture. The application of self-modeling mixture analysis will be demonstrated using the Simplisma and Tsimplisma approach. Matlab functions to reproduce these results are included in the paper.
Article
The performance of five curve resolution methods was compared systematically for the identification and quantification of impurities in drug impurity profiling. These methods are alternating least-squares (ALS) with either random or iterative key-set factor analysis (IKSFA) initialisation, iterative target transformation factor analysis (ITTFA), evolving factor analysis (EFA), and heuristic evolving latent projections (HELP). Real and simulated high-performance liquid chromatography diode array detection (HPLC-DAD) data were obtained for drug mixtures containing one main compound and two impurities. The elution order of the main compound and the impurities was varied. Furthermore, resolutions were varied from 0.56 to 3.36 and impurity levels from 30% down to 0.1%. For simulated data, ALS with IKSFA initialisation and HELP perform better than ITTFA and EFA, which perform better than ALS with random initialisation. ITTFA works better than EFA for almost completely separated data, while the opposite is true for moderately or strongly overlapping data. Only ALS with IKSFA initialisation and HELP were found to resolve the required 0.1% level for moderately overlapping data. For real data, comparison of the methods provides similar results. ITTFA performs clearly better than EFA. However, none of the curve resolution methods can identify or quantify impurities at the required 0.1% level. The results for real data are worse than for simulated data because of heteroscedasticity, nonlinearity, and the acquisition resolution of the A/D-converter.
Article
We used UV resonance Raman spectroscopy to characterize the equilibrium conformation and the kinetics of thermal denaturation of a 21 amino acid, mainly alanine, α-helical peptide (AP). The 204-nm UV resonance Raman spectra show selective enhancements of the amide vibrations, whose intensities and frequencies strongly depend on the peptide secondary structure. These AP Raman spectra were accurately modeled by a linear combination of the temperature-dependent Raman spectra of the pure random coil and the pure α-helix conformations; this demonstrates that the AP helix-coil equilibrium is well-described by a two-state model. We constructed a new transient UV resonance Raman spectrometer and developed the necessary methodologies to measure the nanosecond relaxation of AP following a 3-ns T-jump. We obtained the T-jump by using a 1.9-μm IR pulse that heats the solvent water. We probed the AP relaxation using delayed 204-nm excitation pulses which excite the Raman spectra of the amide backbone vibrations. We observe little AP structural changes within the first 40 ns, after which the α-helix starts unfolding. We determined the temperature dependence of the folding and unfolding rates and found that the unfolding rate constants show Arrhenius-type behavior with an apparent ∼8 kcal/mol activation barrier and a reciprocal rate constant of 240 ± 60 ns at 37 °C. However, the folding rate constants show a negative activation barrier, indicating a failure of transition-state theory in the simple two-state modeling of AP thermal unfolding, which assumes a temperature-independent potential energy profile along the reaction coordinate. Our measurements of the initial steps in the α-helical structure evolution support recent protein folding landscape and funnel theories; our temperature-dependent rate constants sense the energy landscape complexity at the earliest stages of folding and unfolding.
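The two-state modeling described in this abstract, in which a measured spectrum is a linear combination of pure α-helix and pure random coil spectra, can be sketched as a one-parameter least-squares fit. The sketch assumes basis spectra at a single fixed temperature and uses made-up intensity values; the function name `helix_fraction` is hypothetical and this is not the authors' fitting procedure.

```python
def helix_fraction(observed, helix, coil):
    """Least-squares estimate of f in: observed ≈ f*helix + (1 - f)*coil.

    Rearranging gives (observed - coil) ≈ f * (helix - coil), a
    one-parameter linear fit with a closed-form solution.
    """
    diff = [h - c for h, c in zip(helix, coil)]
    shifted = [o - c for o, c in zip(observed, coil)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("basis spectra are identical; fraction is undefined")
    return sum(s * d for s, d in zip(shifted, diff)) / denom

# Illustrative basis spectra (arbitrary intensities, not measured data).
helix = [1.0, 0.6, 0.1, 0.0]
coil = [0.0, 0.2, 0.7, 1.0]

# Build a noise-free mixture with a known helical fraction, then recover it.
true_f = 0.3
observed = [true_f * h + (1.0 - true_f) * c for h, c in zip(helix, coil)]
f_est = helix_fraction(observed, helix, coil)
print(f"estimated helical fraction: {f_est:.3f}")
```

With real spectra, a large fit residual would be a sign that more than two spectral species contribute, which is the kind of check a two-state claim rests on.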
Article
We used ultraviolet resonance Raman (UVRR) spectra to examine the spatial dependence and the thermodynamics of α-helix melting of an isotopically labeled α-helical, 21-residue, mainly alanine peptide. The peptide was synthesized with six natural abundance amino acids at the center and mainly perdeuterated residues elsewhere. Cα deuteration of a peptide bond decouples Cα-H bending from N-H bending, which significantly shifts the random coil conformation amide III band; this shift clearly resolves it from the amide III band of the nondeuterated peptide bonds. Analysis of the isotopically spectrally resolved amide III bands from the external and central peptide amide bonds shows that the six central amide bonds have a higher α-helix melting temperature (∼32 °C) than that of the exterior amide bonds (∼5 °C).
Article
A recently developed algorithm, called Convex Constraint Analysis (CCA), was successfully applied to determine the circular dichroism (CD) spectra of the pure β-pleated sheet in globular proteins. On the basis of X-ray diffraction determined secondary structures, the original data set used (Perczel, A., Hollosi, M., Tusnady, G., Fasman, G.D. Convex constraint analysis: A natural deconvolution of circular dichroism curves of proteins, Prot. Eng., 4:669–679, 1991) was improved by the addition of proteins with high β-pleated sheet content. The analysis yielded CD curves of the pure components of the main secondary structural elements (α-helix, antiparallel β-pleated sheet, β-turns, and unordered conformation), as well as a curve attributed to the “aromatic contribution” in the wavelength range of 195–240 nm. Upon deconvolution the curves obtained were assigned to various secondary structures. The calculated weights (percentages determining the contributions of each pure component curve in the measured CD spectra of a given protein) were correlated with the X-ray diffraction determined percentages in an assignment procedure and were evaluated. The Pearson product correlation coefficients (R) are significant for all five components. The new pure component curves, which were obtained through deconvolution of the protein CD spectra alone, are promising candidates for determining the percentages of the secondary structural components in globular proteins without the necessity of adopting an X-ray database. The CD spectrum of the CheY protein was interesting because it has the characteristic shape associated with the α-helical structure, but upon analysis yielded a considerable amount of β-sheet in agreement with the X-ray structure. © 1992 Wiley-Liss, Inc.
Article
With the aim of developing PLS models with improved predictive properties, an interactive variable selection (IVS) approach for PLS regression was introduced in Part I of this series. IVS‐PLS is based on a dimension‐wise selective removal of single elements in the PLS weight vector w. IVS uses cross‐validation (CV) as a guiding tool. The present paper illustrates the use of IVS‐PLS on both simulated data and real examples from chemistry. In the first example, spectrophotometric data were simulated according to an experimental design. The objective was to see how IVS‐PLS was influenced by different levels of noise in X and Y and by the number of predictor variables (K). The results of the modelling are shown as response surfaces. In addition, four real examples were modelled by the IVS‐PLS technique. The real data sets were chosen to reflect different types of data from chemistry. For each example a comparison of ‘prediction error sum of squares’ (PRESS) between IVS‐PLS and classical PLS is made. For most of the examples containing many predictor variables IVS‐PLS shows an improvement in predictive properties over classical PLS. Also, improvements for IVS‐PLS2 (modelling of more than one y‐variable) models were found. For data sets with a moderate number of variables the influence of the IVS method becomes less pronounced.
Article
Characterizing and classifying molecular variations within biological samples are critical for determining the fundamental mechanisms of biological processes. Toward these ends, time-of-flight secondary ion mass spectrometry (ToF-SIMS) was used to examine increasingly complex samples of biological relevance. The large, multivariate datasets were analyzed using five common statistical and chemometric techniques: principal component analysis (PCA), linear discriminant analysis (LDA), partial least-squares discriminant analysis (PLSDA), soft independent modeling of class analogy (SIMCA), and decision-tree analysis by recursive partitioning. PCA was found to provide insight into both the relative groupings of samples and the molecular basis for those groupings. For monosaccharide, pure protein, and complex protein mixture samples, LDA, PLSDA, and SIMCA all produced excellent classification. For mouse embryo tissues, however, SIMCA did not classify samples as accurately. The decision-tree analysis was the least successful for all tested samples, providing neither as accurate a classification nor chemical insight. Based on these results we conclude that as the complexity of samples increases, so must the sophistication of the multivariate technique used to classify the samples. PCA is a preferred first step for understanding ToF-SIMS data that can be followed by either LDA or PLSDA for effective classification. This study demonstrates the strength of the combination of ToF-SIMS and multivariate analysis to classify increasingly complex biological samples. Applying these techniques to information-rich mass spectral data opens the possibilities for new applications including classification of subtly different biological samples that may provide insights into cellular processes, disease progress, and disease diagnosis. Copyright © 2008 John Wiley & Sons, Ltd.
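The role of PCA as a first exploratory step, revealing both sample groupings (scores) and the variables responsible for them (loadings), can be sketched with a singular value decomposition of mean-centered data. The data below are synthetic stand-ins for spectra, not ToF-SIMS measurements, and the function name `pca_scores` is hypothetical.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA via SVD of the mean-centered data matrix.

    Returns scores (samples x components), loadings (components x
    variables), and the fraction of variance explained per component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained

# Two synthetic sample groups differing in three adjacent channels.
rng = np.random.default_rng(0)
pattern = np.zeros(20)
pattern[5:8] = 1.0
group_a = 2.0 * pattern + 0.05 * rng.standard_normal((10, 20))
group_b = -2.0 * pattern + 0.05 * rng.standard_normal((10, 20))
X = np.vstack([group_a, group_b])

scores, loadings, explained = pca_scores(X)
pc1_a, pc1_b = scores[:10, 0], scores[10:, 0]
print("variance explained by PC1:", round(float(explained[0]), 3))
```

In this sketch the two groups fall on opposite sides of the first principal component, and the largest PC1 loadings point to the channels that differ between them. A supervised classifier such as LDA or PLSDA would be a separate follow-up step.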