Academic Editors: Oana Bianca Oprea and Ignat Tolstorebrov
Received: 13 January 2025
Revised: 8 February 2025
Accepted: 8 February 2025
Published: 10 February 2025
Citation: Suzuki, K.; Akiyama, R.; Llave, Y.; Matsumoto, T. Origin and Variety Identification of Dried Kelp Based on Fluorescence Fingerprinting and Machine Learning Approaches. Appl. Sci. 2025, 15, 1803. https://doi.org/10.3390/app15041803
Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Origin and Variety Identification of Dried Kelp Based on
Fluorescence Fingerprinting and Machine Learning Approaches
Kana Suzuki, Rikuto Akiyama, Yvan Llave and Takashi Matsumoto *
Department of Food Science and Technology, Tokyo University of Marine Science and Technology,
4-5-7 Konan, Minato-ku, Tokyo 108-8477, Japan; m246009@edu.kaiyodai.ac.jp (K.S.);
s213065@edu.kaiyodai.ac.jp (R.A.); pllave1@kaiyodai.ac.jp (Y.L.)
*Correspondence: tmatsu55@kaiyodai.ac.jp; Tel.: +81-3-5463-0635
Abstract: Accurate labeling of the origin of food ingredients is essential to ensure quality
and safety; however, establishing a reliable identification method remains an urgent task.
The origin and variety of dried kelp are generally identified based on their morphological
characteristics; however, they are difficult to distinguish unless experts are involved. In
addition, genetically close varieties have almost no differences in their base sequences;
therefore, the accuracy of conventional identification methods using genetic analysis is
limited. This study aimed to develop a system for identifying the origin and variety of
dried kelp using fluorescence fingerprint data obtained by fluorescence spectroscopy and a
convolutional neural network (CNN). The fluorescence characteristics of dried kelp were
measured in the range between 250 and 550 nm. The obtained fluorescence fingerprint
data were converted into image data and analyzed using a CNN model implemented in
Python, TensorFlow, and Keras. In contrast to conventional methods that rely on morphological
characteristics and genetic analyses, the combination of fluorescence spectroscopy and a CNN
achieved a high identification accuracy of 98.86%, even for genetically close varieties.
These results highlight the excellent potential of fluorescence fingerprints in identifying the
origin and variety of food and are expected to contribute to food fraud prevention and
quality control.
Keywords: machine learning; convolutional neural network (CNN); fluorescence spectroscopy;
fluorescence fingerprinting; food origin identification; food variety identification; kelp; kombu
1. Introduction
Labeling the origin of food ingredients is an important factor when consumers assess
the quality and safety of a product. In Japan, labeling the origin of ingredients in processed
foods has been mandatory since April 2022 [1], and interest in the accuracy of this labeling
has been increasing. This trend is not limited to Japan but is observed globally. For example,
labeling the origin of food is mandatory under European Union Law [2] and in the US
under the Code of Federal Regulations [3]. The introduction of such systems has enabled
consumers to obtain transparent information, and the labeling of origin contributes to
improving the reliability of food.
In contrast, the problem of food fraud remains a significant issue. People try to profit
by fraudulently using the names of production areas and varieties with high market value.
Fraudulent acts, such as false labeling of the origin of high-value products, pose a risk to
consumer trust and safety. Therefore, a reliable method is required to verify the origin of
food and to support accurate labeling.
The flavor and ingredients of kelp are different depending on the origin and variety,
and there are large differences in distribution prices. Identifying the origin and variety of
kelp is important for food quality assurance, distribution management, and regional brand
protection. DNA and chemical analyses based on the soil composition of the production
area have been used to identify the origin of kelp using physicochemical tests. Hattori
et al. [4] have performed inorganic element analysis of kelp produced in Japan and China
using inductively coupled plasma mass spectrometry (ICP–MS) and developed a method
for determining the origin of kelp using inorganic elements as indicators. Morohashi
et al. [5] have also demonstrated the effectiveness of elemental analysis of wakame seaweed
using ICP–MS for identifying the origin of kelp. Shimizu et al. [6] have developed a
DNA extraction method with high amplification by polymerase chain reaction, extracted
mitochondrial DNA from 10 types of kelp, and performed genetic analysis. In that study,
six types of kelp were identified; however, almost no difference was found in base
sequences between Ma-kombu (Laminaria japonica) and its varieties Rishiri-kombu (Laminaria
ochotensis) and Rausu-kombu (L. ochotensis), indicating that further detailed studies are
necessary. Ma-kombu and its varieties, Rausu-kombu and Rishiri-kombu, have a low
degree of speciation, and the base sequences of rDNA internal transcribed spacer 1 (ITS-1)
and RuBisco spacer, which are effective in detecting species differences in many brown
algae, are completely identical [7].
Various studies have been conducted to identify the geographical origins of food, such
as coffee beans, fruit spirits, and Chinese red wine [8–11]. The methods used to distinguish
the origins include terahertz (THz) spectroscopy, Raman spectroscopy, ultraviolet–visible
(UV–VIS) spectroscopy, and NIR spectroscopy.
Fluorescence spectroscopy is a method for measuring food characteristics by using the
properties of substances to absorb light at certain wavelengths and emit light at different
wavelengths. Hu et al. [12] have demonstrated the possibility of using excitation-emission
matrix fluorescence spectroscopy combined with chemometrics to distinguish the geographical
origins of rice from other varieties. Riza et al. [13] have identified the variety and
geographical origin of Italian olive oil. Karoui et al. [14] have proposed a method for
identifying the botanical origin of Swiss honey. Strelec et al. [15] have detected insects infesting
wheat grains using front-face fluorescence and UV–VIS spectroscopy. The effectiveness
of fluorescence spectroscopy has been confirmed for a wide variety of foods, including
rice, olive oil, honey, and wheat. Fluorescence spectroscopy yields a large amount of
information more simply and efficiently than conventional physicochemical analysis. Since
it is highly sensitive to minute differences in composition and is not easily affected by
moisture, it was thought to be an effective method for identifying the origin and variety
of dried kelp with high accuracy. This perspective has not been sufficiently examined in
previous research.
Various machine learning methods have been developed and combined with other
methods to identify the geographical origin of food such as peaches, Chinese Longjing
tea, and Pu’er tea, including VIS–NIR, fluorescence spectroscopy, image-processing
technology, ¹H nuclear magnetic resonance spectroscopy, and hyperspectral imaging (HSI)
technology [16–18]. Chen et al. [19] have proposed a method for identifying the
adulteration of camellia oil and quantifying the level of adulteration using excitation–emission
matrix spectroscopy and a CNN. Wu et al. [20] have proposed a method for detecting
the adulteration of nine types of vegetable oils using three-dimensional (3D) fluorescence
spectroscopy and a CNN. In addition to a CNN, k-nearest neighbor (KNN), Random Forest
(RF), Support Vector Machine (SVM), and Partial Least Squares have been used for
comparison. Hu et al. [21] have compared SVM, RF, and a CNN on 3D fluorescence spectra
for detecting counterfeit camellia oil and reported that the CNN shows high classification accuracy.
In this study, fluorescence fingerprint data assessed using fluorescence spectroscopy
were used in a machine learning model to evaluate the identification of kelp varieties.
Building on the findings of previous studies, this study makes new academic and practical
contributions to the literature. First, it is novel as it uses fluorescence fingerprint data
obtained by fluorescence spectroscopy to identify the origin and variety of dried kelp.
Fluorescence spectroscopy has not been used for dried kelp, and no attempts have been
made to analyze the data with high accuracy using machine learning. This study aims to fill
this technological gap. Second, it aims to extract information on the origin and variety from
fluorescence fingerprint data with high accuracy using a CNN, a type of deep learning. A
CNN can automatically learn the features of complex patterns and is expected to have a
higher classification accuracy than conventional machine learning algorithms. Therefore,
improved accuracy is expected by applying this method to identify the origin and variety
of dried kelp. Furthermore, this study aims to provide a highly practical method for
identification to prevent food fraud and ensure quality assurance.
2. Materials and Methods
2.1. Target Foods
In this study, we investigated the origins and varieties of kombu in Hokkaido, a
representative kombu-producing region in Japan. Three types of dried kombu of different
origins and varieties were studied: Rausu-kombu, Rishiri-kombu (L. ochotensis), and
Mitsuishi-kombu (Laminaria angustata; labeled Hidaka in the figures and tables), which are
genetically similar to Ma-kombu (L. japonica). Eleven commercially available dried kombu
samples were collected. Each product was
procured from a different manufacturer or with a different expiry date, with the intention
of obtaining data that reflects differences in origin, season, and manufacturing process. The
samples were ground and homogenized in a blender, then sealed in a powder cell, and
their fluorescence fingerprints were measured using a spectrofluorometer. In total, 570
fluorescence fingerprints (190 for each variety) were obtained. This resulted in the
construction of a dataset for identifying differences in the chemical and physical properties
between varieties using machine learning.
2.2. Analysis Method
A fluorescence spectrophotometer (F-7100; Hitachi High-Tech Science, Ibaraki, Japan)
was used to measure the fluorescence properties of the samples. The excitation and
fluorescence wavelength ranges were both 250–550 nm, the excitation sampling interval
was 10.0 nm, and the fluorescence sampling interval was 5.0 nm. The excitation and
emission wavelengths were selected in the range of 250–550 nm so that they did not
overlap, as shown in Figures 1 and 2.
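As a quick check of the measurement grid described above, the following minimal sketch counts the sampling points implied by the stated ranges and intervals. It assumes that both endpoints (250 and 550 nm) are sampled, which the text does not state explicitly, and the function name `wavelength_axis` is illustrative only.

```python
def wavelength_axis(start_nm, stop_nm, step_nm):
    """Return the sampled wavelengths, assuming both endpoints are included."""
    count = (stop_nm - start_nm) // step_nm + 1
    return [start_nm + i * step_nm for i in range(count)]


excitation = wavelength_axis(250, 550, 10)  # 10.0 nm excitation interval
emission = wavelength_axis(250, 550, 5)     # 5.0 nm fluorescence interval

n_ex, n_em = len(excitation), len(emission)
grid_points = n_ex * n_em  # size of one full excitation-emission matrix
print(n_ex, n_em, grid_points)
```

Under this assumption, one fingerprint is a grid of 31 excitation by 61 emission wavelengths before any region is excluded.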
Figure 1. Unified scale dataset: (a) Rausu, (b) Rishiri, and (c) Hidaka.
Figure 2. Individual scale dataset: (a) Rausu, (b) Rishiri, and (c) Hidaka.
Before measurement, the dried kelp samples were homogenized by grinding in a
mill (DM-7452; DR MILLS, Guangzhou, Guangdong, China) for approximately 1 min
(10 s × six times) to obtain uniform fluorescence properties. The ground samples were
stored in an airtight container to prevent humidification and oxidation, which may change
the fluorescence properties. No solvent was used for sample treatment. The sample was
placed in a solid sample holder, and measurements were performed in an orthogonal
configuration, with the excitation light and the fluorescence detection set at 90 degrees
to each other. To maintain consistency, minimize sample-to-sample variation, and ensure
accurate data collection, approximately 0.3 g of sample was sealed in a powder cell and
placed in the fluorescence spectrophotometer just before measurement.
2.3. Development of Machine Learning Model
2.3.1. Data Acquisition and Preprocessing
The photomultiplier voltage was set to 480 V. Based on the spectra acquired under
these measurement conditions, a 3D fluorescence matrix (FD3 file) was generated for each
sample. The obtained data were presented as 2D maps.
The maximum and minimum fluorescence intensity values of all samples were
extracted and unified as the maximum and minimum values of the scale for all 2D images.
The maximum and minimum values of the fluorescence intensity scale were 1712 and 38,
respectively.
Furthermore, the 2D images were cropped to a square (360 px vertical × 360 px
horizontal) using Python (ver. 3.13.2), and the graduations, scales, and unnecessary white
spaces common to all images were removed. These 2D images and the CSV file containing
the image classes (variety) were used as input data for the machine learning model.
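The difference between a unified and a per-image intensity scale can be illustrated with a minimal sketch. The limits 38 and 1712 are the values reported above; the function names and the small intensity matrix are hypothetical and are not measured data, and the authors' actual rendering code is not given in the text.

```python
GLOBAL_MIN, GLOBAL_MAX = 38.0, 1712.0  # scale limits reported in the text


def scale_unified(matrix, lo=GLOBAL_MIN, hi=GLOBAL_MAX):
    """Map intensities to [0, 1] with one scale shared by all samples."""
    return [[min(max((v - lo) / (hi - lo), 0.0), 1.0) for v in row]
            for row in matrix]


def scale_individual(matrix):
    """Map intensities to [0, 1] using this sample's own minimum and maximum."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: avoid division by zero
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


# Hypothetical 2 x 3 intensity matrix of a weakly fluorescent sample
weak_sample = [[38.0, 200.0, 456.5], [100.0, 875.0, 300.0]]
unified = scale_unified(weak_sample)
individual = scale_individual(weak_sample)
print(max(max(r) for r in unified), max(max(r) for r in individual))
```

With the unified scale, a weak sample uses only part of the color range, so absolute intensity differences between samples stay visible; with the individual scale, every image spans the full range and only the shape of the pattern remains.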
2.3.2. Model Construction
Several candidate models were created using a CNN, as well as KNN,
RF, SVM, and logistic regression (LR); the CNN was adopted as it had the highest validation
accuracy. The CNN was implemented using Python, TensorFlow (ver. 2.18.0), and the
Keras (ver. 3.9) library. The CNN architecture was simple and was designed with reference
to the LeNet-5 architecture. Specifically, multiple Conv2D and MaxPooling2D layers were used,
and the data were flattened in a Flatten layer and connected to a fully connected layer.
• Convolutional layers: The first convolutional layer used 16 filters (3 × 3 kernel size)
and applied the ReLU function as an activation function for feature extraction. This
process extracts the local patterns from the image. L2 regularization was also applied,
and the weights were penalized to suppress the overfitting of the model and
improve stability;
• Pooling layers: A 2 × 2 max pooling was performed to shrink the feature map and
retain important information. Consequently, the computational load was reduced, and
overfitting was suppressed;
• Additional convolutional and pooling layers: Convolutional layers with 32 and
64 filters were combined with subsequent pooling layers to extract higher-level
features. L2 regularization was also applied to these convolutional layers;
• Fully connected layers: After converting the feature map to one dimension in the
Flatten layer, it was passed through a fully connected layer of 256 units for the final
classification. L2 regularization was also applied to this fully connected layer;
• Dropout layer: Randomly invalidating 30% (Dropout = 0.3) of the output of the fully
connected layer suppressed overfitting and improved robustness;
• Output layer: The probability of each variety was calculated using the SoftMax
activation function. This outputs the probability that the sample belongs to a particular
variety, and the final prediction is made.
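To make the layer stack above concrete, the following sketch traces feature-map sizes and parameter counts through such a network without requiring TensorFlow. Several details are assumptions, since the text does not state them: a three-channel 360 × 360 input, stride-1 convolutions without padding, one convolutional layer per block, and three output classes. The helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """Output width of a stride-1 convolution without padding (assumed)."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width of non-overlapping max pooling (floor division)."""
    return size // pool


def trace_cnn(side=360, channels=3, filters=(16, 32, 64),
              dense_units=256, classes=3):
    """Return (layer name, output shape, trainable parameters) per layer."""
    rows = []
    for i, f in enumerate(filters, start=1):
        side = conv_out(side)
        # 3 x 3 kernel weights per input channel and filter, plus one bias per filter
        rows.append((f"conv{i}", (side, side, f), 3 * 3 * channels * f + f))
        side = pool_out(side)
        rows.append((f"pool{i}", (side, side, f), 0))
        channels = f
    flat = side * side * channels
    rows.append(("flatten", (flat,), 0))
    rows.append(("dense", (dense_units,), flat * dense_units + dense_units))
    rows.append(("dropout", (dense_units,), 0))  # rate 0.3, no parameters
    rows.append(("softmax", (classes,), dense_units * classes + classes))
    return rows


layers = trace_cnn()
total_params = sum(params for _, _, params in layers)
for name, shape, params in layers:
    print(f"{name:8s} {str(shape):16s} {params:>10d}")
print("total", total_params)
```

Under these assumptions, almost all trainable parameters sit in the 256-unit fully connected layer, which is consistent with applying L2 regularization and dropout there.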
2.3.3. Training and Optimization
The Adam optimizer was used to train the model. Adam adapts the step size of each
parameter using running estimates of the first and second moments of the gradients, so that
optimization proceeds efficiently. Categorical cross-entropy was adopted as the loss function, and the settings
were made suitable for multiclass classification problems. The batch size was set to 32. In
addition, early stopping was introduced to improve training efficiency while preventing
overfitting. Considering that this was a multiclass classification problem, the number of
samples for each class was constant, and the accuracy of the model was important, the
validation accuracy was set as the monitored metric. Furthermore, patience (waiting
period) was set to 10, and training was terminated if the monitored metric did not improve
within 10 epochs. The maximum number of epochs was set to 60 so that learning could
continue until no improvement in the validation accuracy was observed. Additionally, by
integrating the prediction results of all fold models rather than a single-fold model, it was
designed to be more robust and have a high generalization performance. The test data was
evaluated using the ensemble model.
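The early-stopping rule described above can be sketched in plain Python as follows. This is an illustration of the patience logic only, not the authors' training loop; the function name and the validation curve are hypothetical.

```python
def train_with_early_stopping(val_accuracy_per_epoch, patience=10, max_epochs=60):
    """Return (stop_epoch, best_epoch, best_accuracy); epochs count from 1."""
    best_acc = float("-inf")
    best_epoch = 0
    epochs_without_gain = 0
    stop_epoch = 0
    for epoch, acc in enumerate(val_accuracy_per_epoch[:max_epochs], start=1):
        stop_epoch = epoch
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break  # no improvement for `patience` consecutive epochs
    return stop_epoch, best_epoch, best_acc


# Hypothetical curve: improves for 5 epochs, then plateaus slightly lower
curve = [0.60, 0.80, 0.90, 0.95, 0.98] + [0.97] * 55
stop_epoch, best_epoch, best_acc = train_with_early_stopping(curve)
print(stop_epoch, best_epoch, best_acc)
```

In this example, training stops well before the 60-epoch cap, and the weights from the best epoch would typically be the ones kept.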
2.3.4. Evaluation of Model Performance
In this study, ensemble learning was performed based on the prediction results of each
model obtained from five folds, and the final evaluation was performed using a weighted
average method based on the validation accuracy. Specifically, weights were calculated
based on the validation accuracy of the models obtained in each fold, and the predictions
of the test data by the models in each fold were weighted to emphasize the predictions of
the models with high validation accuracy. Finally, the accuracy of the test data obtained via
ensemble learning was used as an evaluation index for the model.
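A minimal sketch of such a weighted-average ensemble is given below. The exact weighting formula is not stated in the text, so weights proportional to each fold's validation accuracy (normalized to sum to 1) are an assumption, and all probabilities and accuracies in the example are hypothetical.

```python
def weighted_ensemble(fold_probs, fold_val_acc):
    """Average per-fold class probabilities, weighted by validation accuracy."""
    total = sum(fold_val_acc)
    weights = [a / total for a in fold_val_acc]  # assumed: proportional weights
    n_samples = len(fold_probs[0])
    n_classes = len(fold_probs[0][0])
    combined = []
    for s in range(n_samples):
        combined.append([
            sum(w * probs[s][c] for w, probs in zip(weights, fold_probs))
            for c in range(n_classes)
        ])
    return combined


def accuracy(probabilities, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    predicted = [max(range(len(p)), key=p.__getitem__) for p in probabilities]
    return sum(int(a == b) for a, b in zip(predicted, labels)) / len(labels)


# Hypothetical outputs of five fold models for two test samples, three classes
a = [[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]
b = [[0.8, 0.1, 0.1], [0.1, 0.5, 0.4]]
fold_probs = [a, b, a, b, a]
fold_val_acc = [0.98, 0.97, 0.99, 0.96, 0.98]

ensemble_probs = weighted_ensemble(fold_probs, fold_val_acc)
ensemble_acc = accuracy(ensemble_probs, labels=[0, 2])
print(ensemble_probs, ensemble_acc)
```

Because validation accuracies of well-trained folds are usually close to one another, such weights are nearly uniform, so the result is close to a plain average of the fold predictions.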
2.3.5. Prediction for Unknown Data
For unknown samples, the trained model was used to predict the probability of each
variety. The variety with the highest predicted probability was determined as the origin
and variety of the unknown sample. This enabled the identification of new samples.
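The decision rule for an unknown sample can be sketched as follows. The class order and the raw scores are hypothetical; only the rule itself (softmax followed by selecting the most probable variety) comes from the text.

```python
import math

CLASS_NAMES = ["Hidaka", "Rausu", "Rishiri"]  # order is an assumption


def softmax(scores):
    """Convert raw scores to probabilities in a numerically stable way."""
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_variety(scores):
    """Return the most probable variety and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]


label, confidence = predict_variety([0.2, 2.5, 1.1])  # hypothetical scores
print(label, round(confidence, 3))
```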
3. Results
3.1. Development of the Variety Identification System
The kelp variety identification system was developed based on three steps: data
acquisition and preprocessing, model building, and training.
During preprocessing, the fluorescence intensity scale was unified to a maximum
value of 1712 and a minimum value of 38 for all image data to minimize data variability. In
addition, unnecessary scales and white spaces were removed from the images to reduce the
influence of noise. This process reduces computational load and improves data consistency.
The dataset was split into training (80%) and test (20%) data. The training data were
further split into five folds using stratified k-fold cross-validation. This method improved
the generalization performance of the model.
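The split sizes implied by this procedure can be checked with a small stratified-split sketch. It is a plain-Python illustration rather than the authors' code; the round-robin fold assignment and the random seed are assumptions, and a library splitter may distribute the remainder differently.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, n_folds=5, seed=0):
    """Return (test_idx, folds); each fold is a list of validation indices."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    test_idx = []
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])  # same test share for every class
        for position, idx in enumerate(indices[n_test:]):
            folds[position % n_folds].append(idx)  # round-robin per class
    return test_idx, folds


labels = ["Rausu"] * 190 + ["Rishiri"] * 190 + ["Hidaka"] * 190
test_idx, folds = stratified_split(labels)
print(len(test_idx), [len(f) for f in folds])
```

For 570 fingerprints with 190 per variety, this gives 114 test samples (38 per variety) and 456 training samples, of which roughly 90 serve as validation data in each fold.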
A convolutional neural network (CNN) was used to build the model. This ensemble
model achieved an accuracy of 98.86% for the test data and effectively captured the fine
characteristics of the varieties.
By combining fluorescent fingerprint data and a CNN, this system achieved high
accuracy in identifying the origin and variety of dried kelp, demonstrating that it is a
promising method for preventing food fraud and ensuring quality control.
3.2. Influence of Fluorescence Intensity Scale Setting Method
In the preprocessing of the fluorescence fingerprint data, two image datasets were
created: a uniform scale dataset (unified scale dataset), in which the maximum and minimum
values of the fluorescence intensity scale were the same for all images, and a non-uniform
scale dataset (individual scale dataset), in which the maximum and minimum values of the
scale were set for each image.
Examples from the uniform scale and non-uniform scale datasets are shown in
Figures 1 and 2, and the learning curves are shown in Figures 3 and 4.
The images of these datasets revealed the following points. The uniform scale images
clearly visualized the differences in fluorescence intensity between kelp varieties, while the
non-uniform scale images emphasized the trends and patterns of each image, making it
easier to capture the features.
Figure 3. Unified scale dataset learning curve.
Figure 4. Individual scale dataset learning curve.
The uniform scale images were expected to reduce noise and improve the learning
accuracy of the model because the data were provided to the model in a relatively consistent
state. However, discriminant analysis using a CNN indicated that the uniform scale
images did not show a significant improvement in accuracy compared with that of the
nonuniform scale images. A comparison of the learning curves showed that training on
the nonuniform scale images (individual scale dataset) converged in
relatively few epochs.
Furthermore, the dataset of nonuniform scale images confirmed that the fluorescence
patterns of Rausu-kombu and Rishiri-kombu were similar; however, the dataset of uniform
scale images revealed differences in fluorescence intensities between the two.
3.3. Identification of Kombu Varieties
The training and validation accuracies were 0.9982 and 0.9796, indicating excellent
performance of the entire model (Table 1). In addition, the training and validation losses
also showed low values, and no significant discrepancy was observed between training
and validation. The average test accuracy was 0.9886, confirming that predictions could be made
with an average accuracy above 98%, even for samples not used in training (Table 2). In the process
of model construction, Hidaka-kombu showed a characteristic fluorescent fingerprint
pattern and achieved high classification accuracy. However, the identification accuracies
of Rausu-kombu and Rishiri-kombu were slightly lower than that of Hidaka-kombu,
and the tendency for some samples to be misclassified was confirmed. This difference
was probably owing to similarities in the fluorescence characteristics of Rausu-kombu
and Rishiri-kombu. However, the misclassification was improved by reviewing the data
preprocessing method (method of setting the fluorescence intensity scale). Analysis using
the confusion matrix revealed the accuracy rate and misclassification tendency for each
variety. The overall model performance was high, with the Precision, Recall, and F1-Score
exceeding 0.95 (Table 3). In the evaluation of each class, Hidaka-kombu had a Recall of 1.0,
whereas Precision was slightly lower at 0.97, suggesting the possibility of false positives
in predictions. The Precision and Recall values confirmed that Rausu-kombu achieved
the highest classification accuracy. Rishiri-kombu had slightly lower Recall than Precision,
indicating the possibility of misclassification.
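The relationship between a confusion matrix and the per-class Precision, Recall, and F1-Score can be sketched as follows. The confusion matrix here is hypothetical (38 test samples per variety with a single Rishiri sample predicted as Hidaka) and is not taken from the study; it merely reproduces the kind of pattern discussed above.

```python
def per_class_metrics(confusion, names):
    """Compute (precision, recall, F1) per class.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    metrics = {}
    for k, name in enumerate(names):
        tp = confusion[k][k]
        predicted = sum(row[k] for row in confusion)  # column sum
        actual = sum(confusion[k])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = (precision, recall, f1)
    return metrics


names = ["Hidaka", "Rausu", "Rishiri"]
# Hypothetical confusion matrix, not measured data
confusion = [
    [38, 0, 0],
    [0, 38, 0],
    [1, 0, 37],
]
metrics = per_class_metrics(confusion, names)
overall_accuracy = sum(confusion[k][k] for k in range(3)) / sum(map(sum, confusion))
for name, (p, r, f1) in metrics.items():
    print(f"{name:8s} precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this example, one misclassified Rishiri sample lowers the Recall of Rishiri and the Precision of Hidaka at the same time, while Rausu is unaffected.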
Table 1. Model evaluation: Accuracy and loss of learning data by weighted average ensemble method
based on the accuracy of each fold of 10 trials.
Trial Number   Training Accuracy   Validation Accuracy   Training Loss   Validation Loss
1              0.9968              0.9763                0.07083         0.1483
2              1.000               0.9869                0.03911         0.1055
3              0.9941              0.9739                0.05920         0.1300
4              1.000               0.9870                0.03515         0.1164
5              0.9984              0.9849                0.03445         0.1012
6              0.9978              0.9849                0.04721         0.09288
7              0.9979              0.9682                0.04644         0.1457
8              1.000               0.9828                0.03829         0.1376
9              0.9969              0.9808                0.04789         0.1359
10             1.000               0.9699                0.04979         0.1211
Average        0.9982              0.9796                0.04684         0.1235
Analysis of the learning curve showed that the difference between the training and
validation losses was small, and no signs of overfitting were noticed. Accuracy approached
1 as the number of epochs increased, and the overall performance of the model was
evaluated to be good (Figure 5). Furthermore, the validation accuracy of each fold
was stable and high. Therefore, the model performed consistently across the entire dataset.
However, the model converged very early, suggesting that the learning rate may have been
high or that the initial weights had an influence. In addition, the validation loss temporarily
increased partway through training in some folds (Fold 4), suggesting temporary instability of the model.
Figure 5. Learning curves showing the trends of training and validation losses during model training.
Furthermore, analysis of the fluorescence fingerprint images confirmed that
characteristic peaks of fluorescence intensity appeared for each variety. For
Hidaka-kombu, a strong fluorescence peak derived from amino acids was observed in the
290/330 nm excitation/emission region. Although the patterns of Rishiri-kombu and
Rausu-kombu were similar at approximately 280–330 and 350–440 nm, Rausu-kombu showed a
relatively weak overall fluorescence intensity. Further localization of this peak may
provide additional insights but was not determined here as it is beyond the scope of
this study. These observations suggest that the differences in the chemical components
of each variety are reflected in their fluorescent properties, providing useful
information for identification.
These results confirmed that the combination of fluorescence spectroscopy and a CNN
is an effective method for identifying the origin and variety of kelp.
Table 2. Accuracy and loss of test data by weighted average ensemble method based on the accuracy
of each fold of 10 trials.
Trial Number Accuracy Loss
1 0.9912 0.5934
2 0.9912 0.5709
3 0.9737 0.5884
4 0.9912 0.5728
5 0.9912 0.5735
6 0.9912 0.5758
7 0.9912 0.5777
8 0.9825 0.5755
9 0.9912 0.5837
10 0.9912 0.5742
Average 0.9886 0.5786
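The weighted average ensemble used for Table 2 can be illustrated with a minimal sketch. The exact weighting scheme is not restated here, so the sketch assumes that each fold model's class probabilities are weighted in proportion to that fold's validation accuracy; the function names, toy probabilities, and fold accuracies are hypothetical and are not the study's data.

```python
# Minimal sketch of a weighted average ensemble (assumption: weights are
# proportional to each fold's validation accuracy). All values are toy data.

def weighted_average_ensemble(fold_probs, fold_accuracies):
    """Combine per-fold class probabilities into one probability table.

    fold_probs: list over folds; each item is a list over samples of
                per-class probability lists.
    fold_accuracies: one validation accuracy per fold.
    """
    total = sum(fold_accuracies)
    weights = [acc / total for acc in fold_accuracies]  # weights sum to 1
    n_samples = len(fold_probs[0])
    n_classes = len(fold_probs[0][0])
    combined = []
    for i in range(n_samples):
        row = [
            sum(w * probs[i][c] for w, probs in zip(weights, fold_probs))
            for c in range(n_classes)
        ]
        combined.append(row)
    return combined


def predict_labels(combined, class_names):
    """Pick the class with the largest combined probability for each sample."""
    return [class_names[max(range(len(row)), key=row.__getitem__)] for row in combined]


CLASS_NAMES = ["Hidaka", "Rausu", "Rishiri"]

# Hypothetical outputs of three fold models for two test samples.
fold_probs = [
    [[0.7, 0.2, 0.1], [0.1, 0.5, 0.4]],
    [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]],
    [[0.8, 0.1, 0.1], [0.1, 0.6, 0.3]],
]
fold_accuracies = [0.98, 0.97, 0.99]  # hypothetical validation accuracies

combined = weighted_average_ensemble(fold_probs, fold_accuracies)
labels = predict_labels(combined, CLASS_NAMES)
print(labels)
```

In this toy case the second fold model alone would assign the second sample to Rishiri, but the weighted vote of all three models assigns it to Rausu, which illustrates how the ensemble can smooth out the errors of an individual fold.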
Table 3. Classification performance evaluation index values for each variety (average of 10 trials).
Class Precision Recall F1-Score
Hidaka 0.970 1.000 0.990
Rausu 0.995 0.997 0.996
Rishiri 0.997 0.965 0.985
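For reference, the indices in Table 3 follow the standard definitions of precision, recall, and F1 score, which can be computed per class from a confusion matrix. The following is a minimal sketch; the confusion matrix counts are hypothetical and were not taken from this study.

```python
# Minimal sketch: per-class precision, recall, and F1 from a confusion matrix.
# confusion[i][j] = number of samples of true class i predicted as class j.
# The counts below are hypothetical and only illustrate the calculation.

def per_class_metrics(confusion, class_names):
    n = len(class_names)
    metrics = {}
    for k, name in enumerate(class_names):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(confusion[k]) - tp                       # truly k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[name] = (precision, recall, f1)
    return metrics


CLASS_NAMES = ["Hidaka", "Rausu", "Rishiri"]
confusion = [
    [38, 0, 0],   # true Hidaka
    [0, 38, 0],   # true Rausu
    [1, 0, 37],   # true Rishiri: one sample misclassified as Hidaka
]

metrics = per_class_metrics(confusion, CLASS_NAMES)
for name, (p, r, f1) in metrics.items():
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

A single Rishiri sample predicted as Hidaka lowers the precision of Hidaka and the recall of Rishiri while leaving Rausu unaffected, which is the same qualitative pattern as in Table 3.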
4. Discussion
In this study, by combining fluorescence fingerprint data with a CNN, the origin and
variety of dried kelp from Hokkaido were identified with high accuracy. Model
performance was evaluated using accuracy, precision, recall, and F1 score (Table 3).
The test accuracy was 0.9886 (average of 10 trials), showing high prediction accuracy
even for samples of unknown variety. Comparison of the unified and individual scale
datasets (Figures 1 and 2) showed that scale standardization improved classification
performance. These results support the robustness of the proposed method. The accuracy
was similar to or better than that reported in previous machine learning studies [8,9].
In addition, although the training loss was very low at 0.04684 (average of 10 trials),
the validation loss was higher (0.1235, average of 10 trials), suggesting the
possibility of overfitting and the need for improved preprocessing methods.
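The averages quoted above can be reproduced from the per-trial training and validation losses tabulated in the results. The short sketch below recomputes both means and expresses the gap between them as a difference and a ratio; using the ratio as an indicator of overfitting is an illustrative choice made here, not a criterion used in this study.

```python
# Minimal sketch: recompute the mean losses from the per-trial values reported
# in the results table and quantify the gap between validation and training loss.
from statistics import mean

train_loss = [0.07083, 0.03911, 0.05920, 0.03515, 0.03445,
              0.04721, 0.04644, 0.03829, 0.04789, 0.04979]
val_loss = [0.1483, 0.1055, 0.1300, 0.1164, 0.1012,
            0.09288, 0.1457, 0.1376, 0.1359, 0.1211]

mean_train = mean(train_loss)
mean_val = mean(val_loss)
gap = mean_val - mean_train      # absolute difference between the two means
ratio = mean_val / mean_train    # how many times larger the validation loss is

print(f"mean training loss   = {mean_train:.5f}")
print(f"mean validation loss = {mean_val:.4f}")
print(f"gap = {gap:.4f}, ratio = {ratio:.2f}")
```

Both losses are small in absolute terms, but the validation loss is more than twice the training loss, which is why the possibility of overfitting cannot be excluded.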
Although elemental analysis by ICP-MS [4] and DNA analysis [6] have a certain degree
of accuracy, ICP-MS requires expensive equipment and technical skill, and the
identification accuracy of DNA analysis decreases when the genetic similarity is high.
The method proposed in this study was particularly effective in identifying
Rausu-kombu and Rishiri-kombu. Although these varieties have similar fluorescence
properties, CNN analysis captured subtle differences in their fluorescence patterns.
Moreover, Hidaka-kombu had a prominent fluorescence peak derived from amino acids,
which helped in achieving high identification accuracy. Therefore, the fluorescence
fingerprints reflected the differences in chemical properties among the kombu
varieties. Identification methods that utilize such differences in chemical properties
are expected to be applied in the fields of food quality assurance and origin
certification, as has been done for identifying the geographical origin of rice [12].
In this study, fluorescence fingerprint data were converted from text data to image
data and analyzed using a CNN model. The results indicate that a CNN is superior to
conventional machine learning methods (such as KNN and SVM) in pattern recognition of
image data, which is consistent with a previous report [21]. However, because the CNN
model requires a large amount of data and computational resources, a hybrid approach
using lightweight models (e.g., MobileNet and EfficientNet) or other algorithms
(e.g., XGBoost and RF) can be considered for future applications.
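The conversion of fluorescence fingerprint data to image data, and the difference between individual and unified intensity scales, can be sketched as follows. This is an illustration under stated assumptions rather than the preprocessing pipeline of this study: it assumes that each fingerprint is a small excitation-emission intensity matrix, that pixel values are obtained by linear min-max scaling to 0–255, and that the unified scale uses the minimum and maximum over all samples. The function names and the 2 × 2 toy matrices are hypothetical.

```python
# Minimal sketch: map excitation-emission intensity matrices to 8-bit grayscale
# pixel values with an individual (per-sample) or unified (dataset-wide) scale.
# Assumption: linear min-max scaling; the toy matrices are not measured data.

def to_grayscale(eem, vmin, vmax):
    """Linearly map intensities in [vmin, vmax] to integers in [0, 255]."""
    span = vmax - vmin
    if span == 0:
        return [[0 for _ in row] for row in eem]
    return [
        [round(255 * (min(max(v, vmin), vmax) - vmin) / span) for v in row]
        for row in eem
    ]


def individual_scale(eem):
    """Scale one sample using its own minimum and maximum intensity."""
    flat = [v for row in eem for v in row]
    return to_grayscale(eem, min(flat), max(flat))


def unified_scale(eems):
    """Scale all samples using the minimum and maximum over the whole dataset."""
    flat = [v for eem in eems for row in eem for v in row]
    lo, hi = min(flat), max(flat)
    return [to_grayscale(eem, lo, hi) for eem in eems]


# Two toy samples with the same pattern but a fourfold difference in intensity.
strong = [[0.0, 400.0], [800.0, 200.0]]
weak = [[0.0, 100.0], [200.0, 50.0]]

print(individual_scale(strong), individual_scale(weak))
print(unified_scale([strong, weak]))
```

With the individual scale, the two toy samples produce identical images, so the difference in overall intensity is lost. With the unified scale, the weaker sample remains darker, which preserves the kind of intensity difference described for Rausu-kombu relative to Rishiri-kombu.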
The fluorescence properties of dried kelp are significantly affected by components
such as amino acids and polyphenols. In particular, fluorescence peaks derived from
amino acids were prominent in Hidaka-kombu, and this characteristic may be the main
factor in achieving high identification accuracy. However, Rausu-kombu and
Rishiri-kombu had similar fluorescence characteristics at approximately 280–330 and
350–440 nm, and this similarity may have been a cause of misclassification. Lia et al.
[22] identified fluorescent substances, such as chlorophyll and tocopherol, and
distinguished olive oil from Malta from that of other origins using fluorescence
spectroscopy. Xie et al. [23] used the fluorescence properties of caffeine to
determine coffee quality using a fluorescence fingerprinting method. Ali et al. [24]
determined the quality of Sidr honey using a fluorescence fingerprinting method by
focusing on specific phenolic compounds (caffeic acid, chlorogenic acid, and ferulic
acid). The accuracy of the identification model may therefore be improved by combining
fluorescence fingerprinting with chemical component analysis.
This model has the potential to be applied to other food categories and to identify
geographical origins, production areas, and varieties. Fluorescence spectroscopy has
been used to identify the geographic origin and varieties of rice [12], the varieties
and geographic origin of olive oil [13], and the botanical origin of Swiss honey [14].
Therefore, the effectiveness of fluorescence spectroscopy has been confirmed for a
wide variety of foods, and its application for identifying kelp varieties may
contribute to strengthening quality control and traceability in the food manufacturing
industry.
Future research may improve identification accuracy by integrating other spectroscopic
techniques, such as near-infrared (NIR) spectroscopy. NIR is well suited to measuring
the chemical components of food, and combining it with fluorescence spectroscopy would
enable more multifaceted analysis. Multimodal machine learning that integrates multiple
types of spectroscopic data is therefore expected to improve the accuracy of origin
identification.
In this study, only three types of kelp from Hokkaido were used, and the effects of
differences in region and harvest time were not taken into consideration. In the
future, the versatility of the model will be improved by using samples from other
regions and different harvest times.
5. Conclusions
The combination of fluorescence spectroscopy and a CNN is effective in identifying
the origin and variety of kelp, demonstrating its versatility and potential applications.
Future studies should aim to contribute to the prevention of fraud and quality control in
the food industry.
Expanding the dataset is necessary for the practical use of this model in the food
manufacturing industry. In addition to the three varieties of kelp studied here, the
versatility of the model can be increased by including samples from other regions and
different harvest times. Furthermore, transfer learning may allow highly accurate
models to be built from small datasets and applied to various food categories.
The identification accuracy can also be improved by integrating fluorescence
fingerprint data with other analytical methods. For example, the supplementary use of
component analysis and spectral data would allow food characteristics to be evaluated
from multiple angles and enable an identification system with high accuracy and
reliability.
Ultimately, the results of this study will contribute to strengthening quality control,
preventing origin fraud in the food manufacturing industry, and improving consumer
trust.
Author Contributions: Conceptualization, T.M.; methodology, K.S. and R.A.; software, K.S., R.A. and
T.M.; validation, T.M.; data curation, K.S. and R.A.; writing—original draft preparation, K.S.; review,
Y.L.; writing—review and editing, T.M.; project administration, T.M.; funding acquisition, T.M. All
authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by JSPS KAKENHI Grant Number JP22K02156.
https://www.jsps.go.jp/j-grantsinaid/16_rule/rule.html (accessed on 25 December 2024).
Data Availability Statement: The data that support the findings of this study are available from the
corresponding author upon reasonable request.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Ministry of Agriculture, Forestry and Fisheries. About the System for Labeling the Origin of Ingredients in Processed Foods. Available online: https://www.maff.go.jp/j/syouan/hyoji/gengen_hyoji.html (accessed on 25 December 2024). (In Japanese)
2. European Union Laws. Available online: https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX:32018R0775 (accessed on 25 December 2024).
3. Code of Federal Regulations. Available online: https://www.ecfr.gov/current/title-19/chapter-I/part-134 (accessed on 25 December 2024).
4. Hattori, S.; Tsukuda, M.; Homura, Y. Determination of the geographic origin of dried kelp by inorganic analysis. Nippon Suisan Gakkaishi 2009, 75, 77–82. (In Japanese)
5. Morohashi, T.; Aoyama, K.; Namikoshi, A.; Kimura, Y.; Hattori, S. Determination of the geographic origin of boiled and salted wakame Undaria pinnatifida products by element analysis. Nippon Suisan Gakkaishi 2011, 77, 243–245. (In Japanese)
6. Shimizu, K.; Kato, Y.; Kato, S.; Inoue, A.; Ojima, T.; Yasokawa, D. Technology for Geographic Origin Identification of Edible Kelps; Report No. 11; Hokkaido Industrial Technology Center: Sapporo, Japan, 2010. (In Japanese)
7. Kawai, T.; Yotsukura, N. Current remarks of phylogeny and taxonomy on genus Laminaria. Rishiri Stud. 2005, 24, 37–47. (In Japanese)
8. Yang, S.; Li, C.; Mei, Y.; Liu, W.; Liu, R.; Chen, W.; Han, D.; Xu, K. Determination of the geographical origin of coffee beans using terahertz spectroscopy combined with machine learning methods. Front. Nutr. 2021, 8, 680627.
9. Berghian-Grosan, C.; Magdas, D.A. Application of Raman spectroscopy and machine learning algorithms for fruit distillates discrimination. Sci. Rep. 2020, 10, 21152.
10. Gu, H.W.; Zhou, H.H.; Lv, Y.; Wu, Q.; Pan, Y.; Peng, Z.X.; Zhang, X.H.; Yin, X.L. Geographical origin identification of Chinese red wines using ultraviolet-visible spectroscopy coupled with machine learning techniques. J. Food Compos. Anal. 2023, 119, 105265.
11. Yan, X.; Xie, Y.; Chen, J.; Yuan, T.; Leng, T.; Chen, Y.; Xie, J.; Yu, Q. NIR spectrometric approach for geographical origin identification and taste related compounds content prediction of Lushan Yunwu tea. Foods 2022, 11, 2976.
12. Hu, L.; Zhang, Y.; Ju, Y.; Meng, X.; Yin, C. Rapid identification of rice geographical origin and adulteration by excitation-emission matrix fluorescence spectroscopy combined with chemometrics based on fluorescence probe. Food Control 2023, 146, 109547.
13. Al Riza, D.F.; Kondo, N.; Rotich, V.K.; Perone, C.; Giametta, F. Cultivar and geographical origin authentication of Italian extra virgin olive oil using front-face fluorescence spectroscopy and chemometrics. Food Control 2021, 121, 107604.
14. Karoui, R.; Dufour, E.; Bosset, J.-O.; De Baerdemaeker, J. The use of front face fluorescence spectroscopy to classify the botanical origin of honey samples produced in Switzerland. Food Chem. 2007, 101, 314–323.
15. Strelec, I.; Kucko, L.; Roknic, D.; Mrsa, V.; Ugarcic-Hardi, Z. Spectrofluorimetric, spectrophotometric and chemometric analysis of wheat grains infested by Sitophilus granarius. J. Stored Prod. Res. 2012, 50, 42–48.
16. Yang, Q.; Tian, S.; Xu, H. Identification of the geographic origin of peaches by VIS-NIR spectroscopy, fluorescence spectroscopy and image processing technology. J. Food Compos. Anal. 2022, 114, 104843.
17. Hou, Z.; Jin, Y.; Gu, Z.; Zhang, R.; Su, Z.; Liu, S. 1H NMR spectroscopy combined with machine-learning algorithm for origin recognition of Chinese famous green tea Longjing tea. Foods 2024, 13, 2702.
18. Chen, M.; Guo, W.; Yi, X.; Jiang, Q.; Hu, X.; Peng, J.; Tian, J. Hyperspectral imaging combined with convolutional neural network for Pu’er ripe tea origin recognition. J. Food Compos. Anal. 2025, 139, 107093.
19. Chen, A.Q.; Wu, H.L.; Wang, T.; Wang, X.Z.; Sun, H.B.; Yu, R.Q. Intelligent analysis of excitation-emission matrix fluorescence fingerprint to identify and quantify adulteration in camellia oil based on machine learning. Talanta 2023, 251, 123733.
20. Wu, M.; Li, M.; Fan, B.; Sun, Y.; Tong, L.; Wang, F.; Li, L. A rapid and low-cost method for detection of nine kinds of vegetable oil adulteration based on 3-D fluorescence spectroscopy. LWT 2023, 188, 115419.
21. Hu, Y.; Wei, C.; Wang, X.; Wang, W.; Jiao, Y. Using three-dimensional fluorescence spectroscopy and machine learning for rapid detection of adulteration in camellia oil. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2025, 329, 125524.
22. Lia, F.; Formosa, J.P.; Zammit-Mangion, M.; Farrugia, C. The first identification of the uniqueness and authentication of Maltese extra virgin olive oil using 3D-fluorescence spectroscopy coupled with multi-way data analysis. Foods 2020, 9, 498.
23. Xie, J.Y.; Tan, J. Front-face synchronous fluorescence spectroscopy: A rapid and non-destructive authentication method for Arabica coffee adulterated with maize and soybean flours. J. Consum. Prot. Food Saf. 2022, 17, 209–219.
24. Ali, H.; Khan, S.; Ullah, R.; Khan, B. Fluorescence fingerprints of Sidr honey in comparison with uni/polyfloral honey samples. Eur. Food Res. Technol. 2020, 246, 1829–1837.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.