About
321 publications
170,468 reads
27,980 citations
Introduction
Louise Jespensgaard just defended a very nice MSc report:
"Evaluation of an Electronic Tongue in the Pharmaceutical Industry – Performance Evaluation and Sensory Validation of Taste Masking".
She did the project together with the pharmaceutical company Lundbeck.
Additional affiliations
January 2013 - December 2015
Publications (321)
Unlike other food products, virgin olive oil must undergo an organoleptic assessment that is currently based on a trained human panel, which presents drawbacks that might affect its efficiency and robustness. Therefore, having instrumental methods available that could serve as screening tools to support sensory panels is of paramount importance. The pr...
Tensor decompositions, such as CANDECOMP/PARAFAC (CP), are widely used in a variety of applications, such as chemometrics, signal processing, and machine learning. A broadly used method for computing such decompositions relies on the Alternating Least Squares (ALS) algorithm. When the number of components is small, regardless of its implementation,...
The Canonical Polyadic (CP) tensor decomposition is frequently used as a model in applications in a variety of different fields. Using jackknife resampling to estimate parameter uncertainties is often desirable but results in an increase of the already high computational cost. Upon observation that the resampled tensors, though different, are nearl...
Higher-order tensor data analysis has been extensively employed to understand complicated data, such as multi-way GC-MS data in untargeted/targeted analysis. However, the analysis can be complicated when one of the modes shifts, e.g., when the elution profiles of specific compounds shift with respect to retention time, something which violates the assump...
A total of 56 key volatile compounds present in natural and alkalized cocoa powders have been rapidly evaluated with a non-target approach based on stir bar sorptive extraction gas chromatography mass spectrometry (SBSE-GC-MS) coupled to Parallel Factor Analysis 2 (PARAFAC2) automated in PARADISe. Principal component analysis (PCA) explained 80% of t...
Increasing awareness of the ability to transform data into knowledge has steered more focus on data science within the educational system as well as the development of machine learning methods capable of handling complex problems with minimal or no human interaction. In principle, this raises the question on where human–computer interaction is supe...
Analyzing multi-way measurements with variations across one mode of the dataset is a challenge in various fields including data mining, neuroscience and chemometrics. For example, measurements may evolve over time or have unaligned time profiles. The PARAFAC2 model has been successfully used to analyze such data by allowing the underlying factor ma...
Gas chromatography – mass spectrometry (GC-MS) is an important tool in contemporary untargeted chemical analysis, where the batch analysis of sample series and subsequent generation of peak tables are still commonly subject to software-uncertainty leading to issues in reproducibility and hypothesis testing. Using tensor-based modelling in combinati...
One of the most important types of evidence in certain criminal investigations is traces of human blood. For a detailed investigation, blood samples must be identified and collected at the crime scene. The present study aimed to evaluate the potential of the identification of human blood in stains deposited on different types of floor tiles (five t...
PARAFAC2 is a useful algorithm for decomposing tensors that lack the low-rank variation that, for example, PARAFAC requires. It has been applied in analyzing different types of multi-way data, such as GC-MS data. Since the optimization in fitting the loss function of PARAFAC2 is non-convex, the PARAFAC2 model suffers from local minima. In this paper,...
PARAFAC2 is a well-established method for specific types of tensor decomposition problems, for example when observations have different lengths or measured profiles slightly change position in the multi-way data. The most commonly used PARAFAC2-ALS algorithms are very slow. In this paper, we propose novel implementations of extrapolation-based PARAFAC2...
Calibration model maintenance is often overlooked but is a significant part of successful use of multivariate calibration models, for example, in process monitoring and optimization. In some cases, companies are maintaining tens or even hundreds of calibration models. This could be partial least squares (PLS) calibration models pertaining to differ...
Elevated levels of particulate matter (PM) in urban atmospheres are one of the major environmental challenges of the Anthropocene. To effectively lower those levels, identification and quantification of sources of PM is required. Biomonitoring methods are helpful tools to tackle this problem but have not been fully established yet. An example is th...
Consumer interest in beer has been on the rise during the past decade: new approaches and ingredients get tested, expanding the traditional recipe for brewing beer. As a consequence, the field of "beeromics" has also been constantly growing, as well as the demand for quick and exhaustive analytical methods. In this study, we...
This work describes the development of methodology based on the hierarchical soft classification method by combining multivariate analysis techniques and Hyperspectral Near Infrared Images (HSI-NIR) to confirm identification of bloodstains on colored and printed fabrics. The term hierarchical is used to designate that the classification is done seq...
Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA). It combines variance maximization and sparsity with the ultimate goal of improving data interpretation. A main application of sPCA is to handle high-dimensional data, for example biological omics data. In Part I of this...
In this paper, we discuss the validity of using score plots of component models such as partial least squares regression, especially when these models are used for building classification models, and models derived from partial least squares regression for discriminant analysis (PLS-DA). Using examples and simulations, it is shown that the currentl...
Matrix factorization methods are extensively employed to understand complex data. In this paper, we introduce the cross-product penalized component analysis (X-CAN), a matrix factorization based on the optimization of a loss function that allows a trade-off between variance maximization and structural preservation, with a focus on highlighting diff...
Laser-induced breakdown spectroscopy (LIBS) was used to characterize base (Al and Cu) and noble (Au and Ag) elements on a printed circuit board (PCB) from a hard disk (HD). A PCB was cut into 77 fragments, and a matrix of 4 rows and 4 columns with 10 laser pulses at each point of the matrix was acquired in each fragment by LIBS. For each element, a spe...
In this section, multilinear models for multi-way arrays requiring iterative fitting algorithms are outlined. Among them: the PARAFAC (PARAllel FACtor analysis) model and one of its variants (the PARAFAC2 model); Tucker models in which one or more modes are reduced (viz., the N-way Tucker-N and Tucker-m models); hybrid models having intermediate pr...
We propose networkmetrics, a new data-driven approach for monitoring, troubleshooting and understanding communication networks using multivariate analysis. Networkmetric models are powerful machine-learning tools to interpret and interact with data collected from a network. In this paper, we illustrate the application of Multivariate Big Data Analy...
Matrix factorization methods are extensively employed to understand complex data. In this paper, we introduce the cross-product penalized component analysis (XCAN), a sparse matrix factorization based on the optimization of a loss function that allows a trade-off between variance maximization and structural preservation. The approach is based on pr...
Analysis of untargeted gas-chromatographic data is time consuming. With the earlier introduction of the PARAFAC2 (PARAllel FACtor analysis 2) based PARADISe (PARAFAC2 based Deconvolution and Identification System) approach in 2017, this task was made considerably more time-efficient. However, there are still a number of manual steps in the analysis...
Multivariate exploratory data analysis allows revealing patterns and extracting information from complex multivariate data sets. However, highly complex data may not show evident groupings or trends in the principal component space, e.g. because the variation of the variables is not grouped but rather continuous. In these cases, classical explorat...
The spectra responsible for natural dissolved organic matter (DOM) fluorescence in 90 peer-reviewed studies have been compared using new similarity metrics. Numerous spectra cluster in specific wavelength regions. The emerging...
NMR is one of the most powerful analytical techniques of our time. It allows detailed investigation of qualitative and quantitative characteristics of complex chemical and biological samples. The resulting NMR data provides a wealth of information about the samples, but the NMR data analysis has been and still is suffering from oversimplified appro...
Modeling variability in tensor decomposition methods is one of the challenges of source separation. One possible solution to account for variations from one data set to another, jointly analysed, is to resort to the PARAFAC2 model. However, so far imposing constraints on the mode with variability has not been possible. In the following manuscript,...
PARAFAC2 is a powerful decomposition method which is ideally suited for modeling gas chromatography-mass spectrometry (GC-MS) data. However, the most widely used fitting algorithms (alternating least squares, ALS) are very slow, which hinders use of the model. In this paper, an iterative method called geometric search is proposed to fit the PARAFAC2...
Parallel factor analysis (PARAFAC) of food fluorescence has found many applications in food science, such as in non-contact and non-destructive food characterization, the detection of food adulteration, and the authentication of geographical and botanical origins of food products. This Chapter presents a theoretical background of the PARAFAC method...
It has become easy to obtain multivariate chemical data of high dimensions. However, it may be expensive or time consuming to obtain a large number of samples or to acquire reference measures, so the number of samples available for multivariate calibration modelling may be limited. If data contains nonlinear relationships, nonlinear methods are req...
Demonstrates the use of PARAllel FACtor analysis for excitation emission fluorescence.
Foodomics is a newly developed discipline that has become more and more important in the last years where focus on food and the understanding of food systems has increased significantly. In this review, the flow of a typical foodomics study will be followed with a focus on the core components, where chemometric expertise is more deeply involved. Th...
Flavour matching can be viewed as trying to reproduce a specific flavour. This is a time consuming task and may lead to flavour mixtures that are too complex or too expensive to be commercialized. In order to facilitate the matching, we have developed a new mathematical model, called Prioritizer. Based on the chemical composition of a mixture of vo...
Data fusion, i.e., extracting information through the fusion of complementary data sets, is a topic of great interest in metabolomics since analytical platforms such as Liquid Chromatography - Mass Spectrometry (LC-MS) and Nuclear Magnetic Resonance (NMR) spectroscopy commonly used for chemical profiling of biofluids provide complementary informati...
Evaluation of GC–MS data may be challenging due to the high complexity of data including overlapped, embedded, retention time shifted and low S/N ratio peaks. In this work, we demonstrate a new approach, PARAFAC2 based Deconvolution and Identification System (PARADISe), for processing raw GC–MS data. PARADISe is a computer platform independent free...
Multi-way data arrays are becoming more common in several fields of science. For instance, analytical instruments can sometimes collect signals at different modes simultaneously, e.g. fluorescence and LC/GC-MS. Higher order data can also arise from sensory science, where product scores can be reported as a function of sample, judge and attribute. A...
This IUPAC Technical Report describes and compares the currently applied methods for the calibration and standardization of multi-dimensional fluorescence (MDF) spectroscopy data as well as recommendations on the correct use of chemometric methods for MDF data analysis. The paper starts with a brief description of the measurement principles for the...
Significant improvements can be realized by converting conventional batch processes into continuous ones. The main drivers include reduction of cost and waste, increased safety, and simpler scale-up and tech transfer activities. Re-designing the process layout offers the opportunity to incorporate a set of process analytical technologies (PAT) embr...
In many areas of science multiple sets of data are collected pertaining to the same system. Examples are food products which are characterized by different sets of variables, bio-processes which are on-line sampled with different instruments, or biological systems of which different genomics measurements are obtained. Data fusion is concerned with...
The focus of the present paper is to propose and discuss different procedures for performing variable selection in a multi-block regression context. In particular, the focus is on two multi-block regression methods: Multi-Block Partial Least Squares (MB-PLS) and Sequential and Orthogonalized Partial Least Squares (SO-PLS) regression. A small simula...
The aim was to investigate the effects of increased water or dairy intake on total intake of energy, nutrients, foods and dietary patterns in overweight adolescents in the Milk Components and Metabolic Syndrome (MoMS) study (n = 173). Participants were randomly assigned to consume 1 L/d of skim milk, whey, casein or water for 12 weeks. A decrease in...
With a goal of identifying biomarkers/patterns related to certain conditions or diseases, metabolomics focuses on the detection of chemical substances in biological samples such as urine and blood using a number of analytical techniques, including nuclear magnetic resonance (NMR) spectroscopy, liquid chromatography-mass spectrometry (LC-MS), an...
Little is known about the development of dietary patterns during toddlerhood and the relation to growth and health. The study objective was to characterise the development of dietary patterns from 9-36 mo of age and investigate the association to body size, body composition and metabolic risk markers at 36 mo. Food records were filled out at 9, 18...
We consider factoring low-rank tensors in the presence of outlying slabs. This problem is important in practice, because data collected in many real-world applications, such as speech, fluorescence, and some social network data, fit this paradigm. Prior work tackles this problem by iteratively selecting a fixed number of slabs and fitting, a proced...
It is important to increase the awareness of indicators associated with adverse infant dietary patterns to be able to prevent or to improve dietary patterns early on. The aim of this study was to investigate the association between a wide range of possible family and child indicators and adherence to dietary patterns for infants aged 9 months. The...
Tensor factorisations have proven useful to model amplitude and spectral information of brain recordings. Here, we assess the usefulness of tensor factorisations in the multiway analysis of other brain signal features in the context of complexity measures recently proposed to inspect multiscale dynamics. We consider the "refined composite multiscal...
Assessment of lameness prevalence and severity requires visual evaluation of the locomotion of a cow. Welfare schemes including locomotion assessments are increasingly being adopted, and more farmers and their veterinarians might implement a locomotion-scoring routine together. However, high within-observer agreement is a prerequisite for obtaining...
Near infrared (NIR) spectroscopy in combination with partial least-squares regression (PLS) is widely applied in process control for non-destructive measurement of quality parameters during production. PLS assumes an approximate linear relationship between the parameter to be estimated and the intensity of its absorption bands. Spectra, however, ma...
Breast cancer is a major cause of death for women. To improve treatment, current oncology research focuses on discovering and validating new biomarkers for early detection of cancer; so far with limited success. Metabolic profiling of plasma samples and auxiliary lifestyle information was combined by chemometric data fusion. It was possible to crea...
Excitation-Emission Matrix (EEM) fluorescence spectroscopy combined with second order decomposition algorithms such as PARAFAC provides interesting opportunities in analytical chemistry. However, the intrinsic presence of scattering effects in the EEM measurements poses a practical problem. Appropriate handling of the scatter is necessary to avoid...
Fructooligosaccharides (FOS) are popular components of functional foods produced by the enzymatic transfer of fructose units to sucrose. Improving β-fructofuranosidase traits by protein engineering is restricted by the absence of a rapid, direct screening method for the fructooligosaccharide products produced by enzyme variants. The use of standard...
Background/objectives: Differences in the quality of complementary feeding between infants of obese and nonobese mothers have not been examined sufficiently. The aim of this paper was to compare dietary patterns, foods, nutrients and energy intakes of 9-month-old Danish infants in a cohort comprising obese mothers (SKOT II, n=184; SKOT, Danish abb...
Fluorescence spectroscopy coupled with parallel factor analysis (PARAFAC) and Partial least squares Discriminant Analysis (PLS DA) were used for characterization and classification of honey. Excitation emission spectra were obtained for 95 honey samples of different botanical origin (acacia, sunflower, linden, meadow, and fake honey) by recording e...
Lameness is prevalent in dairy herds. It causes decreased animal welfare and leads to higher production costs. This study explored data from an automatic milking system (AMS) to model on-farm gait scoring from a commercial farm. A total of 88 cows were gait scored once per week, for two 5-wk periods. Eighty variables retrieved from AMS were summarize...
Analysis of data from multiple sources has the potential to enhance knowledge discovery by capturing underlying structures, which are, otherwise, difficult to extract. Fusing data from multiple sources has already proved useful in many applications in social network analysis, signal processing and bioinformatics. However, data fusion is challenging...
Aquatic Organic Matter Fluorescence, edited by Paula G. Coble (Cambridge Core, Geochemistry and Environmental Chemistry).