Citation: Zhang, X.; Guo, C. Research on Multimodal Prediction of E-Commerce Customer Satisfaction Driven by Big Data. Appl. Sci. 2024, 14, 8181. https://doi.org/10.3390/app14188181
Academic Editor: Andrea Prati
Received: 4 August 2024
Revised: 8 September 2024
Accepted: 9 September 2024
Published: 11 September 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Research on Multimodal Prediction of E-Commerce Customer
Satisfaction Driven by Big Data
Xiaodong Zhang 1,† and Chunrong Guo 2,*,†
1 School of Economics and Management, Inner Mongolia Agricultural University, Hohhot 010010, China; ethan@imau.edu.cn
2 School of Economics and Management, Ningbo University of Technology, Ningbo 315211, China
* Correspondence: guochr@126.com
† These authors contributed equally to this work.
Abstract: This study deeply integrates multimodal data analysis and big data technology, proposing a multimodal learning framework that consolidates various information sources, such as user geographic location, behavior data, and product attributes, to achieve a more comprehensive understanding and prediction of consumer behavior. By comparing the performance of unimodal and multimodal approaches in handling complex cross-border e-commerce data, it was found that multimodal learning models using the Adam optimizer significantly outperformed traditional unimodal learning models in terms of prediction accuracy and loss rate. The improvements were particularly notable in training loss and testing accuracy. This demonstrates the efficiency and superiority of multimodal methods in capturing and analyzing heterogeneous data. Furthermore, the study explores and validates the potential of big data and multimodal learning methods to enhance customer satisfaction in the cross-border e-commerce environment. Based on the core findings, specific applications of big data technology in cross-border e-commerce operations were further explored. A series of innovative strategies aimed at improving operational efficiency, enhancing consumer satisfaction, and increasing global market competitiveness were proposed.
Keywords: multimodal learning; customer satisfaction; cross-border e-commerce; big data;
intelligent business
1. Introduction
The cross-border e-commerce industry necessitates intelligent empowerment. As the world's largest trading nation, China has seen rapid growth in its cross-border e-commerce sector. Since 2014, customs codes "9610" and "1210" have included B2C retail and bonded models in statistics. Starting in 2020, codes "9710" and "9810" were introduced for B2B direct exports and exports to overseas warehouses. Cross-border e-commerce transactions surged from 36 billion RMB in 2015 to 2.38 trillion RMB in 2023, with exports consistently accounting for over 75% of the total. In 2023, export transactions reached 1.83 trillion RMB (76.9%), while imports totaled 548.3 billion RMB (23.1%). To foster foreign trade, innovation, and internationalization, the State Council expanded the number of comprehensive pilot zones to 165 from 2015 to 2022. The rapid global development of online trade has positioned cross-border e-commerce as a crucial infrastructure for foreign trade, with significant impacts on agriculture, manufacturing, and services. Leveraging new digital technologies effectively addresses supply–demand matching issues in the cross-border e-commerce supply chain. Intelligent decision-making in product development empowers upstream manufacturers to penetrate overseas markets and offers integrated international sales services, enhancing big data accumulation and an understanding of market demands. Downstream, intelligent decision-making aids distributors by identifying consumer needs and optimizing product selection and stocking, thus integrating global resources in manufacturing, logistics, marketing, and services and resolving issues from production to post-sale.
Appl. Sci. 2024,14, 8181. https://doi.org/10.3390/app14188181 https://www.mdpi.com/journal/applsci
Predicting customer satisfaction can better serve downstream distributors by accurately identifying consumer needs based on the accumulation of big data in cross-border e-commerce. This capability assists cross-border e-commerce sellers in selecting and stocking products, facilitating the integration of global resources in manufacturing, logistics, marketing, and services and effectively addressing a series of issues from production to post-sale service [1]. Traditional satisfaction prediction requires the establishment of an indicator system and surveys of consumers' subjective evaluations through questionnaires. This approach is costly and time-consuming, and its accuracy depends on the design of the questionnaire and the precision of the survey sample [2]. E-commerce platforms typically rely on customer satisfaction ratings post-purchase, which is not applicable to products that are still in the testing phase or are newly launched with few consumers. New products often lack historical data, making multimodal big data, such as product positioning and attributes, particularly suitable as feature variables for decision-making. This approach is not only faster and more cost-effective but also retains more sample information, enhancing decision-making capabilities. This study integrates multiple attributes and complex features of cross-border e-commerce sales data to develop a multimodal learning model that combines product and market attributes. This model offers a more comprehensive, precise, and in-depth understanding of consumer behavior, habits, preferences, and characteristics [3]. It rapidly collects extensive information on apparel features, enabling brands to accurately assess which products are suitable for market expansion and estimate consumer satisfaction levels. The model assists cross-border e-commerce operators in making well-informed decisions regarding product listings, selection, and development, allowing them to estimate sales and profits, optimize customer experiences, and support the internationalization and intelligent growth of cross-border e-commerce operations [4].
2. Literature Review
2.1. Research on Multimodal Learning
Multimodal Machine Learning (MMML) involves creating models that enable machines to learn from various modalities and facilitate information exchange across these modalities [5]. Multimodal learning has demonstrated effective applications across fields such as education, healthcare, smart hardware, and gaming, showcasing strong potential for development [6]. In the education sector, multimodal technology can offer students richer learning resources and more interactive experiences [7,8]. In the healthcare industry, the application of multimodal techniques, combining image recognition, speech recognition, and natural language processing, enables the intelligent analysis and interpretation of medical imaging data, assisting doctors in making accurate diagnoses. In smart hardware, multimodal functions enhance devices' perception and interaction capabilities; by integrating voice and image recognition technologies, these devices can more accurately understand user commands and offer a broader range of functionalities [9]. In the gaming industry, the combination of image recognition, speech recognition, and gesture recognition technologies creates more immersive virtual reality gaming experiences. Additionally, multimodal functionalities enable richer character expressions and action interactions, thereby enhancing the enjoyment and interactivity of games [10]. From the perspective of modality fusion, two main approaches exist: data fusion and feature fusion. Feature fusion methods train a model for each modality, make decisions, and then use an attention mechanism to merge these decisions into a final comprehensive decision [11]. Data fusion directly concatenates feature data from different modalities and uses the fused data for model training.
2.2. Data Empowerment and Intelligent Marketing
Data empowerment is crucial for the digital transformation of enterprises [12]. Advances in data technology and analytical techniques have significantly enhanced data empowerment [13,14]. Intelligent sales link the two ends of the supply chain, enabling differentiated demand mining at the front end and personalized production at the back end [15]. The apparel supply chain, characterized by rapid changes, short cycles, and flexibility, benefits from the integration of artificial intelligence, which offers innovative management models for the fashion supply chain system [16]. Leveraging business big data technology and machine learning algorithms can significantly enhance supply chain intelligence [17]. Artificial intelligence drives supply chain transformation in areas such as platform reconstruction, ecosystem reshaping, and advantage rebuilding [18,19].
Due to market uncertainty and intense competition, enterprises are compelled to try various strategies to improve performance [20]. Business intelligence helps companies quickly generate insights, guiding decision-makers to improve operational efficiency, seek new opportunities, and stand out from the competition [1,21,22]. E-commerce facilitates resource sharing, coordination, and optimized allocation, promoting intelligent marketing [23]. Intelligent marketing requires integrating networking, digital, and intelligent technologies for deep supply chain cooperation [24]. According to Senyo et al. [25], digital innovation fundamentally alters cooperation and competition among enterprises, focusing on building digital business ecosystems and collaboration networks. Innovations in digital business models and open organizational structures promote global innovation networks, better meeting customer needs and enhancing innovation capabilities [26]. Digital intelligent technologies drive supply chain reform from both supply and demand perspectives [27,28]. Intelligent marketing expands the breadth and depth of resource allocation, creating competitive advantages at different product lifecycle stages [29].
2.3. Intelligent Forecasting in Cross-Border E-Commerce
With the rapid development of cross-border e-commerce, sales enterprises aim to accurately predict sales performance and develop products rationally to achieve greater profits, attract more investment, and provide a better customer experience. When shopping online, users often communicate their needs to customer service through various modalities, including text, images, and videos, resulting in a significant amount of unstructured data [30]. Multimodal data, such as text and images, play a crucial role in e-commerce customer service [31]. However, traditional unimodal methods can only capture information from a single dimension, often failing to fully reflect the complexity of customer behavior. This limitation makes multimodality a significant challenge for e-commerce customer service systems [32]. Bi et al. [33] explored e-commerce product classification using a multimodal late fusion approach based on text and image modalities. Their research demonstrated that the proposed method outperformed traditional unimodal methods in multimodal product classification tasks. Cai et al. [34] proposed a spatial feature fusion and grouping strategy based on multimodal data and developed a neural network model for predicting e-commerce product demand. The experimental results confirmed the superiority and effectiveness of the proposed algorithm. Xu et al. [35] designed a multimodal analysis framework to predict product return rates in live-streaming e-commerce. Experiments using real-world data from Taobao Live demonstrated that multimodal signals from products and anchors effectively predict return rates. Wróblewska et al. [36] developed a machine learning-based recommendation system that supports the fusion of various data representations through multimodal methods. Their research showed that this system outperformed state-of-the-art techniques on open datasets. Xu et al. [37] designed a multimodal analysis framework for predicting product sales in live-streaming e-commerce and explored the impact of anchor reputation on sales. Experiments using real-world data from Douyin Live demonstrated the effectiveness of the constructed multimodal anchor reputation signals in predicting product sales. To address the challenges of sentiment and emotion modeling posed by unstructured big data with different modalities, Seng and Ang proposed a new architecture for multimodal sentiment and emotion modeling, which was validated for its performance [38]. Shoumy et al. highlighted that a new architecture combining different modalities can achieve more complex and accurate sentiment analyses [39]. These studies indicate that multimodal deep learning techniques in e-commerce demonstrate higher predictive accuracy and model robustness when dealing with heterogeneous data. Utilizing multimodal data to predict customer satisfaction not only improves prediction accuracy but also enhances the model's adaptability to different customer groups [40].
Although deep learning has made significant progress in intelligent recognition and decision-making, especially in image processing, its application in management theory and practice has lagged behind. The globalization of cross-border e-commerce markets and the digitization of operations offer ample opportunities for the development of artificial intelligence and big data in management decision-making. Traditional methods, which do not account for the complex attributes of sales products and market characteristics, often result in time-consuming, labor-intensive, and insufficiently accurate predictions. Current research on intelligent decision-making in product development encompasses aspects such as customer value co-creation, dynamic capability evolution, supply chain collaboration, value chain enhancement, and open innovation [41]. However, there is limited focus on specific technologies for intelligent development, data analysis, and deployment implementation. Research methods primarily rely on traditional case analysis, econometric analysis, and structural equation modeling [2,42], elaborating on definitions, influencing factors, and significance. Basic theories and key technologies of artificial intelligence in intelligent decision-making for product development are rarely addressed. E-commerce market positioning and product attributes play a crucial role in consumer satisfaction, a point that intelligent marketing research has yet to fully consider. Introducing multimodal learning into the field of intelligent sales in cross-border e-commerce is highly beneficial for improving the accuracy of intelligent decision-making [43]. Research on intelligent marketing must also engage in interdisciplinary studies to promote the dual development of theoretical exploration and practical application.
3. Technical Construction
3.1. Data Collection and Processing
Taking the dress category in the cross-border e-commerce market as an example, data
from the product detail pages of 862 dresses on the Amazon platform were collected. The
data include variables such as Style, Price, Size, Season, Waistline, Neckline, Sleeve Length,
Material, Fabric Type, Decoration, Pattern Type, and Rating. Among these, Style, Price,
Size, Season, and Waistline are market positioning feature variables, while Neckline, Sleeve
Length, Material, Fabric Type, Decoration, and Pattern Type are product attribute feature
variables. Rating is used as the label variable. These 12 variables were further transformed
into feature values, as shown in Table 1.
Table 1. Variables and feature values.
Variable Name Feature Value
Style 1—Bohemian; 2—brief; 3—casual; 4—cute; 5—fashion; 6—flare;
7—novelty; 8—OL party; 9—sexy; 10—vintage; 11—work
Price 1—low; 2—average; 3—medium; 4—high; 5—very high
Size 1—small; 2—S; 3—M; 4—L; 5—XL; 6—free
Season 1—autumn; 2—winter; 3—spring; 4—summer
Waistline 1—dropped; 2—empire; 3—natural; 4—princess; 5—null
Neckline
1—O-neck; 2—backless; 3—boat-neck; 4—bowneck; 5—halter;
6—mandarin collar; 7—open; 8—Peter Pan collar; 9—ruffled; 10—scoop;
11—slash-neck; 12—square collar; 13—sweetheart; 14—turndown collar;
15—V-neck
Sleeve Length 1—full; 2—half sleeve; 3—butterfly; 4—sleeveless; 5—short;
6—three-quarter; 7—turndown; 8—cap sleeves
Material
1—cotton; 2—wool; 3—microfiber; 4—polyester; 5—silk; 6—chiffon
fabric; 7—nylon; 8—linen; 9—rayon; 10—Lycra; 11—milk silk;
12—acrylic; 13—spandex; 14—mix; 15—cashmere; 16—knitted;
17—chiffon; 18—viscose; 19—lace; 20—modal; 21—other
Fabric Type
1—chiffon; 2—broadcloth; 3—jersey; 4—batik; 5—worsted; 6—woolen;
7—satin; 8—flannel; 9—poplin; 10—dobby; 11—knitting; 12—flannel;
13—tulle; 14—satin; 15—organza; 16—lace; 17—corduroy; 18—terry;
19—none
Decoration
1—ruffles; 2—embroidery; 3—bow; 4—lace; 5—beading; 6—sashes;
7—hollow out; 8—pockets; 9—sequin; 10—applique; 11—button;
12—tiered; 13—rivet; 14—feathers; 15—flowers; 16—pearls; 17—pleat;
18—crystal; 19—ruched; 20—draped; 21—tassels; 22—plain;
23—cascading; 24—none
Pattern Type
1—animal; 2—print; 3—dot; 4—solid; 5—patchwork; 6—striped;
7—geometric; 8—plaid; 9—leopard; 10—floral; 11—character; 12—splice;
13—leopard; 14—none
Rating 1~5
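For illustration, the mapping from raw attribute strings to the integer codes of Table 1 can be sketched as follows. This is a minimal example covering two of the twelve variables; the dictionary and function names are ours, not part of the authors' pipeline:

```python
# Sketch: encode categorical product attributes into the integer codes of Table 1.
# Only Season and Price are shown; the remaining variables follow the same pattern.

SEASON_CODES = {"autumn": 1, "winter": 2, "spring": 3, "summer": 4}
PRICE_CODES = {"low": 1, "average": 2, "medium": 3, "high": 4, "very high": 5}

def encode_record(record: dict) -> dict:
    """Replace string attribute values with their Table 1 integer codes."""
    return {
        "Season": SEASON_CODES[record["Season"]],
        "Price": PRICE_CODES[record["Price"]],
        "Rating": record["Rating"],  # the label variable is already numeric (1~5)
    }

sample = {"Season": "summer", "Price": "low", "Rating": 4.6}
print(encode_record(sample))  # {'Season': 4, 'Price': 1, 'Rating': 4.6}
```

The same lookup-table scheme extends to all feature variables listed above.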
The feature values in the dataset vary across different ranges, which can distort training. Before training, each feature was standardized using the formula (X − mean)/std, so that all features are distributed with a mean of 0 and a variance of 1. The holdout method was employed for data splitting, dividing the 862 records into two mutually exclusive sets: 80% of the data (689 records) was used for training, and 20% (173 records) was used for testing.
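These two preprocessing steps can be sketched in a few lines. The original preprocessing code is not published, so the function names and details below are illustrative assumptions:

```python
import statistics

def standardize(values):
    """Z-score a feature column: (x - mean) / std gives mean 0 and variance 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def holdout_split(records, train_frac=0.8):
    """Split records into two mutually exclusive sets (shuffling omitted here)."""
    cut = int(len(records) * train_frac)
    return records[:cut], records[cut:]

data = list(range(862))          # stand-in for the 862 dress records
train, test = holdout_split(data)
print(len(train), len(test))     # 689 173
```

In practice the records would be shuffled before splitting so that both sets reflect the overall distribution.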
3.2. Unimodal Learning Model
The market and positioning modality (Modal I) includes data such as Style, Price, Size, Season, Waistline, and Rating. A deep neural network constructs a fully connected layer (dense_1) as the input layer for this modality, using the ReLU activation function to transform the 5-dimensional input into a 128-dimensional output. This layer generates 768 parameters to be estimated. To prevent overfitting, a dropout layer (dropout_1) is added next, randomly disconnecting 20% of the input neurons' connections during each parameter update. Another fully connected layer (dense_2) is then established, using the ReLU activation function to convert the 128-dimensional input into a 32-dimensional output, generating 4128 parameters to be estimated. Finally, a fully connected layer (dense_3) is constructed, applying the ReLU function to transform the 32-dimensional input into a 1-dimensional output Y, as shown in Table 2.
Table 2. Deep neural network structure of Modal I.
Layer (Type) Output Shape Param
dense_1 (Dense) (None, 128) 768
activation_1 (Activation) (None, 128) 0
dropout_1 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 32) 4128
activation_2 (Activation) (None, 32) 0
dense_3 (Dense) (None, 1) 33
activation_3 (Activation) (None, 1) 0
Total params: 4929
Trainable params: 4929
Non-trainable params: 0
The product attribute modality (Modal II) includes data such as Neckline, Sleeve Length, Material, Fabric Type, Decoration, Pattern Type, and Rating. A deep neural network is constructed for Modal II as follows:
Input Layer (dense_1): Uses the ReLU activation function to transform the 6-dimensional input into a 128-dimensional output, generating 896 parameters to be estimated.
Dropout Layer (dropout_1): Randomly disconnects 20% of the input neurons' connections during training to prevent overfitting.
Hidden Layer (dense_2): Uses the ReLU activation function to convert the 128-dimensional input into a 32-dimensional output, generating 4128 parameters to be estimated.
Output Layer (dense_3): Uses the ReLU function to transform the 32-dimensional input into a 1-dimensional output Y, as shown in Table 3.
Table 3. Deep neural network structure of Modal II.
Layer (Type) Output Shape Param
dense_1 (Dense) (None, 128) 896
activation_1 (Activation) (None, 128) 0
dropout_1 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 32) 4128
activation_2 (Activation) (None, 32) 0
dense_3 (Dense) (None, 1) 33
activation_3 (Activation) (None, 1) 0
Total params: 5057
Trainable params: 5057
Non-trainable params: 0
3.3. Multimodal Learning Model
A tensor is a higher-order extension of vectors and matrices, where the dimensions
determine the tensor’s order. Tensor fusion networks can effectively integrate interaction
information correlated between different modalities, preserving the original data to the
maximum extent. This improves the recognition and prediction accuracy of multimodal data.
The market positioning information is input as
P={pn}N
n=1
and the product attributes
are input as
A={an}N
n=1
. These two feature vectors are encoded and merged into the
same space, forming the tensor fusion input:
M=npn,an,oN
n=1
. The multimodal fusion
tensor, formed based on the tensor product, undergoes intelligent learning and training.
The model is shown in Figure 1.
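Given this notation, the merge step can be illustrated in two common variants: simple concatenation of the two encoded vectors (data fusion) and the outer product that underlies tensor-product fusion. The feature values below are illustrative codes from Table 1; the authors' exact encoder is not published, so this is a sketch under those assumptions:

```python
def fuse_concat(p, a):
    """Data fusion by concatenation: one fused input vector per sample."""
    return list(p) + list(a)

def fuse_outer(p, a):
    """Tensor-product (outer-product) fusion: captures every pairwise
    cross-modal interaction p_i * a_j between the two modalities."""
    return [[pi * aj for aj in a] for pi in p]

p = [1, 3, 4, 4, 3]        # Style, Price, Size, Season, Waistline codes
a = [15, 4, 1, 1, 22, 4]   # Neckline, Sleeve Length, Material, Fabric Type, Decoration, Pattern Type

print(len(fuse_concat(p, a)))   # 11 fused features
m = fuse_outer(p, a)
print(len(m), len(m[0]))        # 5 6  (a 5 x 6 interaction matrix)
```

Concatenation keeps the input compact, while the outer product exposes cross-modal interactions explicitly at the cost of a larger input.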
The neural network structure and parameters for multimodal learning are shown in
Table 4.
Table 4. Multilayer deep neural network structure.
Layer (Type) Output Shape Param
dense_1 (Dense) (None, 128) 1536
activation_1 (Activation) (None, 128) 0
dropout_1 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 32) 4128
activation_2 (Activation) (None, 32) 0
dense_3 (Dense) (None, 1) 33
activation_3 (Activation) (None, 1) 0
Total params: 5697
Trainable params: 5697
Non-trainable params: 0
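The Param columns in these tables follow directly from the layer shapes: a fully connected layer with in_dim inputs and out_dim outputs holds in_dim × out_dim weights plus out_dim biases. A quick sketch to verify (layer sizes taken from Tables 2 and 4):

```python
def dense_params(in_dim, out_dim):
    """A fully connected layer stores in_dim * out_dim weights plus out_dim biases."""
    return in_dim * out_dim + out_dim

def total_params(dims):
    """Total trainable parameters for a stack of Dense layers, e.g. 5 -> 128 -> 32 -> 1."""
    return sum(dense_params(i, o) for i, o in zip(dims, dims[1:]))

print([dense_params(i, o) for i, o in zip([5, 128, 32], [128, 32, 1])])  # [768, 4128, 33]
print(total_params([5, 128, 32, 1]))     # 4929 (Modal I, Table 2)
print(total_params([11, 128, 32, 1]))    # 5697 (multimodal, Table 4)
```

The multimodal network's 1536 first-layer parameters correspond to the 11 fused features (5 market positioning + 6 product attribute) feeding 128 units.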
Figure 1. Multimodal deep neural network structure. Note: FCL: fully connected layer; DL: dropout layer.
4. Analysis and Testing
Before machine learning, data preprocessing ensures that the data structure, features,
quality, attributes, and distribution meet the standard requirements. Modal I and Modal II
each use 689 data entries for training and 173 for testing.
In deep learning, RMSProp and Adam are high-performance optimizers. RMSProp
(Root Mean Square Propagation) introduces a decay coefficient to reduce oscillations in
gradient descent, improving convergence speed and stability. Adam (Adaptive Moment
Estimation) combines momentum gradient descent and adaptive learning rate optimization.
It uses exponential moving averages of gradients, computing first and second moment
estimates, and designs adaptive learning rates for different parameters.
This study uses the RMSProp and Adam algorithms for deep learning and federated
learning, performing a comparative analysis of the training and testing results.
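The two update rules can be written compactly. Below is a scalar-form sketch with conventional default hyperparameters; the paper reports only the RMSProp learning rate of 0.001, so the remaining values are assumptions:

```python
import math

def rmsprop_step(w, g, s, lr=0.001, rho=0.9, eps=1e-8):
    """RMSProp: a decayed moving average of squared gradients (s) damps
    oscillations and scales the step size for each parameter."""
    s = rho * s + (1 - rho) * g * g
    w = w - lr * g / (math.sqrt(s) + eps)
    return w, s

def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: exponential moving averages of the gradient (first moment m) and
    its square (second moment v), with bias correction, give each parameter
    an adaptive learning rate."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)   # bias-corrected second moment
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Minimizing f(w) = w**2 (gradient 2w) from w = 5.0:
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 501):
    w, m, v = adam_step(w, 2 * w, m, v, t)
```

With one (m, v) pair per weight, the same updates apply element-wise across a full network.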
4.1. Market and Positioning Modal Learning
Modal I was trained for 500 epochs using the Keras framework and the Adam optimizer. The results are shown in Table 5. The training loss decreased from 7.6809 to 2.6202, and the training accuracy (mean absolute error) improved from 2.4985 to 1.2045. The test loss was 3.9180 and the test accuracy was 1.4277, indicating a moderate learning effect.
Using the Keras learning framework and the RMSProp optimizer with a learning rate of 0.001, Modal I was trained for 500 epochs. The results are shown in Table 6. The training loss decreased from 6.2128 to 2.7196, and the training accuracy (mean absolute error) improved from 2.2700 to 1.2271. The test loss was 4.1295 and the test accuracy was 1.4984, indicating poorer learning performance than with Adam.
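The loss and "accuracy" columns reported in Tables 5 and 6 are mean squared error and mean absolute error, respectively, which can be computed as follows (the sample values are illustrative, not data from the study):

```python
def mse(y_true, y_pred):
    """Mean squared error: the training/test loss reported in Tables 5 and 6."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: the 'accuracy' metric reported alongside the loss."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

ratings = [4.5, 3.0, 5.0, 2.0]   # hypothetical true ratings
preds = [4.0, 3.5, 4.5, 3.0]     # hypothetical model predictions
print(mse(ratings, preds))       # 0.4375
print(mae(ratings, preds))       # 0.625
```

Lower values are better for both metrics; since ratings lie on a 1~5 scale, an MAE near 1.4 means predictions miss the true rating by about 1.4 stars on average.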
Table 5. Learning results of Modal I using Adam.

Training Epochs    Training Time                                Training Loss (MSE)    Training Accuracy (MAE)
Epoch 1/500        689/689 [=============] 0 s 418 us/step      loss: 7.6809           mae: 2.4985
Epoch 2/500        689/689 [=============] 0 s 110 us/step      loss: 4.6210           mae: 1.6709
Epoch 3/500        689/689 [=============] 0 s 116 us/step      loss: 4.4499           mae: 1.7674
Epoch 4/500        689/689 [=============] 0 s 110 us/step      loss: 4.1976           mae: 1.7066
Epoch 5/500        689/689 [=============] 0 s 99 us/step       loss: 4.1244           mae: 1.6571
...
Epoch 496/500      689/689 [=============] 0 s 115 us/step      loss: 2.6092           mae: 1.2116
Epoch 497/500      689/689 [=============] 0 s 106 us/step      loss: 2.6684           mae: 1.1917
Epoch 498/500      689/689 [=============] 0 s 104 us/step      loss: 2.6892           mae: 1.2178
Epoch 499/500      689/689 [=============] 0 s 116 us/step      loss: 2.5525           mae: 1.2200
Epoch 500/500      689/689 [=============] 0 s 116 us/step      loss: 2.6202           mae: 1.2045

Test Samples       Testing Time                                 Test Loss              Test Accuracy
173/173            [=============] 0 s 219 us/step              loss: 3.9180           mae: 1.4277
Table 6. Learning results of Modal I using RMSProp.

Training Epochs    Training Time                                Training Loss (MSE)    Training Accuracy (MAE)
Epoch 1/500        689/689 [=============] 0 s 238 us/step      loss: 6.2128           mae: 2.2700
Epoch 2/500        689/689 [=============] 0 s 93 us/step       loss: 4.6579           mae: 1.8378
Epoch 3/500        689/689 [=============] 0 s 105 us/step      loss: 4.3028           mae: 1.7373
Epoch 4/500        689/689 [=============] 0 s 105 us/step      loss: 4.3562           mae: 1.7212
Epoch 5/500        689/689 [=============] 0 s 104 us/step      loss: 4.1525           mae: 1.6840
...
Epoch 496/500      689/689 [=============] 0 s 99 us/step       loss: 2.7705           mae: 1.2560
Epoch 497/500      689/689 [=============] 0 s 93 us/step       loss: 2.7571           mae: 1.2304
Epoch 498/500      689/689 [=============] 0 s 93 us/step       loss: 2.7844           mae: 1.2510
Epoch 499/500      689/689 [=============] 0 s 87 us/step       loss: 2.7852           mae: 1.2515
Epoch 500/500      689/689 [=============] 0 s 81 us/step       loss: 2.7196           mae: 1.2271

Test Samples       Testing Time                                 Test Loss              Test Accuracy
173/173            [=============]                              loss: 4.1295           mae: 1.4984
Appl. Sci. 2024,14, 8181 9 of 18
Table 6. Cont.
Training Epochs Training Time
Training Loss
(MSE) Mean
Squared Error
Training Accuracy
(MAE) Mean
Absolute Error
Appl. Sci. 2024, 14, x FOR PEER REVIEW 9 of 18
Table 6. Learning results of Modal I using RMSProp.
Training Epochs Training
Time
Training Loss
(MSE) Mean
Squared Error
Training Accu-
racy (MAE)
Mean Absolute
Error
Epoch 1/500 689/689 [=============] 0 s 238
us/step -loss: 6.2128 -mae: 2.2700
Epoch 2/500 689/689 [=============] 0 s 93 us/step -loss: 4.6579 -mae: 1.8378
Epoch 3/500 689/689 [=============] 0 s 105
us/step -loss: 4.3028 -mae: 1.7373
Epoch 4/500 689/689 [=============] 0 s 105
us/step -loss: 4.3562 -mae: 1.7212
Epoch 5/500 689/689 [=============] 0 s 104
us/step -loss: 4.1525 -mae: 1.6840
……
Epoch 496/500 689/689 [=============] 0 s 99 us/step -loss: 2.7705 -mae: 1.2560
Epoch 497/500 689/689 [=============] 0 s 93 us/step -loss: 2.7571 -mae: 1.2304
Epoch 498/500 689/689 [=============] 0 s 93 us/step -loss: 2.7844 -mae: 1.2510
Epoch 499/500 689/689 [=============] 0 s 87 us/step -loss: 2.7852 -mae: 1.2515
Epoch 500/500 689/689 [=============] 0 s 81 us/step -loss: 2.7196 -mae: 1.2271
Test Samples Testing Time Test Loss Test Accuracy
173/173 [========================] 0 s 167
us/step -loss: 4.1295 -mae: 1.4984
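The two optimizers compared above differ in how they scale raw gradients. The paper does not publish its training code; the following pure-Python sketch of the standard update rules (with typical default hyperparameters assumed) shows the difference: RMSProp divides each gradient by a running average of its magnitude, while Adam additionally applies momentum with bias correction, which often yields the smoother convergence observed in these experiments.

```python
def rmsprop_step(w, grad, state, lr=0.001, rho=0.9, eps=1e-7):
    """One RMSProp update: scale the gradient by a running RMS of past gradients."""
    state["v"] = rho * state["v"] + (1 - rho) * grad ** 2
    return w - lr * grad / (state["v"] ** 0.5 + eps)

def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update: RMSProp-style scaling plus gradient momentum, bias-corrected."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (v_hat ** 0.5 + eps)

# Toy problem: minimize f(w) = w**2 (gradient 2w) from w = 5 with each optimizer.
w_r, s_r = 5.0, {"v": 0.0}
w_a, s_a = 5.0, {"m": 0.0, "v": 0.0, "t": 0}
for _ in range(2000):
    w_r = rmsprop_step(w_r, 2 * w_r, s_r)
    w_a = adam_step(w_a, 2 * w_a, s_a)
# Both move w steadily toward the minimum at 0 at roughly lr per step.
```

This is only a schematic of the update rules, not the paper's implementation; in the experiments both optimizers were used through Keras with a learning rate of 0.001.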
4.2. Product Attribute Modal Learning
Modal II was trained for 500 epochs using the Keras learning framework and the Adam optimizer. The results are shown in Table 7. The training loss decreased from 8.8630 to 1.4559, and the training accuracy (mean absolute error) improved from 2.4872 to 0.8144. The test loss was 4.1140 and the test accuracy was 1.4259, indicating that the learning performance was similar to that of Modal I.
Table 7. Learning results of Modal II using Adam.

Training Epochs   Training Time                                Training Loss (MSE)   Training Accuracy (MAE)
Epoch 1/500       689/689 [=============] 0 s 217 us/step      8.8630                2.4872
Epoch 2/500       689/689 [=============] 0 s 59 us/step       4.7590                1.8383
Epoch 3/500       689/689 [=============] 0 s 56 us/step       4.4530                1.7244
Epoch 4/500       689/689 [=============] 0 s 57 us/step       4.2655                1.7089
Epoch 5/500       689/689 [=============] 0 s 57 us/step       4.1099                1.6387
……
Epoch 496/500     689/689 [=============] 0 s 48 us/step       1.4138                0.8143
Epoch 497/500     689/689 [=============] 0 s 52 us/step       1.4424                0.8234
Epoch 498/500     689/689 [=============] 0 s 50 us/step       1.4560                0.8057
Epoch 499/500     689/689 [=============] 0 s 52 us/step       1.4690                0.8216
Epoch 500/500     689/689 [=============] 0 s 49 us/step       1.4559                0.8144

Test Samples      Testing Time                                 Test Loss (MSE)       Test Accuracy (MAE)
173/173           [========================] 0 s 294 us/step   4.1140                1.4259
Using the Keras learning framework and the RMSProp optimizer with a learning rate of 0.001, Modal II was trained for 500 epochs. The results are shown in Table 8. The training loss decreased from 6.4326 to 1.8072, and the training accuracy (mean absolute error) improved from 2.2166 to 0.9188. The test loss was 4.2477 and the test accuracy was 1.5463, again worse than the results obtained with the Adam optimizer.
Table 8. Learning results of Modal II using RMSProp.

Training Epochs   Training Time                                Training Loss (MSE)   Training Accuracy (MAE)
Epoch 1/500       689/689 [=============] 0 s 164 us/step      6.4326                2.2166
Epoch 2/500       689/689 [=============] 0 s 50 us/step       4.9162                1.8606
Epoch 3/500       689/689 [=============] 0 s 55 us/step       4.5942                1.7759
Epoch 4/500       689/689 [=============] 0 s 56 us/step       4.4481                1.7342
Epoch 5/500       689/689 [=============] 0 s 49 us/step       4.2125                1.6909
……
Epoch 496/500     689/689 [=============] 0 s 45 us/step       1.8812                0.9679
Epoch 497/500     689/689 [=============] 0 s 39 us/step       1.9394                0.9694
Epoch 498/500     689/689 [=============] 0 s 38 us/step       1.8735                0.9542
Epoch 499/500     689/689 [=============] 0 s 43 us/step       1.7950                0.9267
Epoch 500/500     689/689 [=============] 0 s 39 us/step       1.8072                0.9188

Test Samples      Testing Time                                 Test Loss (MSE)       Test Accuracy (MAE)
173/173           [========================] 0 s 173 us/step   4.2477                1.5463
4.3. Multimodal Learning
The multimodal model was trained for 500 epochs using the Keras learning framework and the Adam optimizer. The results are shown in Table 9. The training loss decreased from 8.8611 to 0.5282, and the training accuracy (mean absolute error) improved from 2.4433 to 0.4083. The test loss was 3.3166 and the test accuracy was 1.0464, indicating a significant improvement in learning performance compared to the unimodal models.
Table 9. Learning results of multimodal model using Adam.

Training Epochs   Training Time                                Training Loss (MSE)   Training Accuracy (MAE)
Epoch 1/500       689/689 [=============] 0 s 210 us/step      8.8611                2.4433
Epoch 2/500       689/689 [=============] 0 s 58 us/step       4.5633                1.7869
Epoch 3/500       689/689 [=============] 0 s 65 us/step       4.4982                1.7732
Epoch 4/500       689/689 [=============] 0 s 53 us/step       4.1600                1.6923
Epoch 5/500       689/689 [=============] 0 s 56 us/step       4.0209                1.6389
……
Epoch 496/500     689/689 [=============] 0 s 46 us/step       0.4467                0.3770
Epoch 497/500     689/689 [=============] 0 s 59 us/step       0.6286                0.4305
Epoch 498/500     689/689 [=============] 0 s 51 us/step       0.5423                0.4168
Epoch 499/500     689/689 [=============] 0 s 69 us/step       0.4598                0.3890
Epoch 500/500     689/689 [=============] 0 s 67 us/step       0.5282                0.4083

Test Samples      Testing Time                                 Test Loss (MSE)       Test Accuracy (MAE)
173/173           [========================] 0 s 190 us/step   3.3166                1.0464
Using the Keras learning framework and the RMSProp optimizer with a learning
rate of 0.001, the multimodal model was trained for 500 epochs. The results are shown
in Table 10. The training loss decreased from 5.8074 to 0.5500, and the training accuracy
(mean absolute error) improved from 2.0678 to 0.4818. The test loss was 3.3702 and the test
accuracy was 1.0887, indicating good learning performance.
Table 10. Learning results of multimodal model using RMSProp.

Training Epochs   Training Time                                Training Loss (MSE)   Training Accuracy (MAE)
Epoch 1/500       689/689 [=============] 0 s 138 us/step      5.8074                2.0678
Epoch 2/500       689/689 [=============] 0 s 52 us/step       4.7239                1.8163
Epoch 3/500       689/689 [=============] 0 s 53 us/step       4.2364                1.7059
Epoch 4/500       689/689 [=============] 0 s 56 us/step       4.1844                1.6532
Epoch 5/500       689/689 [=============] 0 s 55 us/step       4.1077                1.6797
……
Epoch 496/500     689/689 [=============] 0 s 42 us/step       0.6323                0.5014
Epoch 497/500     689/689 [=============] 0 s 43 us/step       0.5574                0.4812
Epoch 498/500     689/689 [=============] 0 s 42 us/step       0.5874                0.4866
Epoch 499/500     689/689 [=============] 0 s 44 us/step       0.5736                0.4876
Epoch 500/500     689/689 [=============] 0 s 43 us/step       0.5500                0.4818

Test Samples      Testing Time                                 Test Loss (MSE)       Test Accuracy (MAE)
173/173           [========================] 0 s 271 us/step   3.3702                1.0887
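The paper does not publish its network architecture; as a minimal NumPy sketch of the early-fusion idea behind the multimodal model, the forward pass below runs one dense branch per modality (market positioning/geolocation features and product attributes) and concatenates them before a regression head. All layer sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """Fully connected layer with ReLU activation."""
    return np.maximum(x @ w + b, 0.0)

# Hypothetical feature widths: 4 market-positioning features, 6 product attributes.
n_geo, n_prod, hidden = 4, 6, 8

# One branch per modality.
w_geo, b_geo = rng.normal(size=(n_geo, hidden)), np.zeros(hidden)
w_prod, b_prod = rng.normal(size=(n_prod, hidden)), np.zeros(hidden)
# Regression head mapping the fused representation to a satisfaction score.
w_out, b_out = rng.normal(size=(2 * hidden, 1)), np.zeros(1)

def predict(x_geo, x_prod):
    h = np.concatenate([dense(x_geo, w_geo, b_geo),
                        dense(x_prod, w_prod, b_prod)], axis=1)  # early fusion
    return h @ w_out + b_out  # linear output, suitable for an MSE loss

batch = predict(rng.normal(size=(5, n_geo)), rng.normal(size=(5, n_prod)))
print(batch.shape)  # (5, 1): one predicted satisfaction score per sample
```

In Keras this structure corresponds to a functional-API model with two `Input` layers joined by a `Concatenate` layer; the unimodal baselines in Sections 4.1 and 4.2 use only one branch each.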
5. Conclusions and Recommendations
5.1. Conclusions
The training loss, training accuracy, test loss, and test accuracy for the market positioning modality with Adam optimization, the market positioning modality with RMSProp optimization, the product attribute modality with Adam optimization, the product attribute modality with RMSProp optimization, the multimodal model with Adam optimization, and the multimodal model with RMSProp optimization are shown in Table 11.
In terms of training loss, multimodal learning with Adam optimization has the smallest value, at 0.5282. It also achieves the best training accuracy, with the lowest mean absolute error of 0.4083. Its test loss is the lowest, at 3.3166, and its test accuracy is the best, with a mean absolute error of 1.0464, as shown in Figure 2. Overall, multimodal learning demonstrates higher accuracy and lower loss than single-modal learning, and multimodal satisfaction prediction performs better with the Adam optimizer than with the RMSProp optimizer.
Table 11. Learning performance of various modalities.

Modality                                                Training Loss (MSE)   Training Accuracy (MAE)   Test Loss (MSE)   Test Accuracy (MAE)
Market Positioning Modality with Adam Optimization      2.6202                1.2045                    3.9180            1.4277
Market Positioning Modality with RMSProp Optimization   2.7196                1.2271                    4.1295            1.4984
Product Attribute Modality with Adam Optimization       1.4559                0.8144                    4.1140            1.4259
Product Attribute Modality with RMSProp Optimization    1.8072                0.9188                    4.2477            1.5463
Multimodal Adam Optimization                            0.5282                0.4083                    3.3166            1.0464
Multimodal RMSProp Optimization                         0.5500                0.4818                    3.3702            1.0887
Figure 2. Comparison of single-modality and multimodal learning.
5.2. Applications and Development Recommendations
5.2.1. Building a Big Data-Based AI E-Commerce Decision Support System
Promote innovation in e-commerce technology architecture and platforms by designing modular technology architectures that support plug-in technology upgrades. By introducing deep machine learning and data fusion methods, the decision support system can flexibly integrate the latest AI algorithms and data processing technologies, comprehensively considering intra-modal and inter-modal dependencies to address evolving market demands and technological changes. Deeply integrate cross-functional data flows to promote the integration and fusion of data across departments such as marketing, sales, customer service, and supply chain management within enterprises. This enhances multidimensional data analysis capabilities and facilitates information sharing and strategic collaboration between different departments. Through centralized management and analysis of multi-departmental data, the decision support system can gain a more comprehensive understanding of business processes and consumer behavior, leading to the generation of more accurate business insights.
Continuously develop dynamic optimization algorithms that automatically adjust parameters based on real-time market data, optimizing the decision-making process. By leveraging reinforcement learning and online learning technologies, the system can continuously learn and improve during transactions, achieving optimal marketing strategies and inventory management. Promote the practical application of advanced analytics and predictive models. Utilizing time series analysis, sentiment analysis, and complex event processing, companies can gain deep insights into market dynamics and consumer psychology, identify subtle market changes, and predict their potential impact on sales and brand loyalty, enabling proactive responses ahead of competitors.
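As one concrete illustration of the online-learning idea (not a method used in this study), an epsilon-greedy bandit can choose among marketing strategies from live reward feedback while still exploring alternatives. Strategy names and reward values below are hypothetical.

```python
import random

random.seed(42)

def epsilon_greedy(rewards_history, epsilon=0.1):
    """Pick the strategy with the best average reward so far,
    but explore a random strategy with probability epsilon."""
    if random.random() < epsilon or not any(rewards_history.values()):
        return random.choice(list(rewards_history))  # explore
    return max(rewards_history,
               key=lambda k: sum(rewards_history[k]) / max(len(rewards_history[k]), 1))

# Hypothetical strategies with observed conversion-rate rewards:
history = {"discount": [0.12, 0.15], "bundle": [0.08], "free_shipping": [0.11, 0.10]}
chosen = epsilon_greedy(history)
# "discount" has the best average (0.135), so the greedy branch selects it;
# about 10% of calls would instead explore a random strategy.
```

After each campaign cycle, the observed reward is appended to `history[chosen]`, so the policy continuously adapts as market responses shift.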
Enhance the interpretability of decision models to ensure that all stakeholders can
understand the basis of the model’s decisions. This helps build trust among internal
users and maintains transparency with external regulators, especially when using complex
algorithms like deep learning. Establish frameworks for monitoring and auditing AI-
driven decisions to ensure transparency and compliance, thereby strengthening trust
among stakeholders and complying with data regulations. These strategies enable e-
commerce companies to develop efficient, transparent, and compliant big data-driven AI
decision-making systems, enhancing responsiveness, accuracy, and competitiveness in the
global market.
5.2.2. Develop a Multimodal Data Prediction Platform Integrating Geolocation and Product Characteristics
Build a multimodal prediction system based on geolocation and product attributes to analyze and forecast product demands and user preferences in different countries and regions. Establish a centralized cross-border e-commerce data management platform that integrates multiple data sources, including geographical information, product specifications, consumer interactions, and historical purchase records.
Utilize big data technologies like Hadoop or Spark to process and analyze large-scale
datasets. Implement data quality control measures to ensure data accuracy and consistency,
supporting subsequent analyses. Use NLP and image recognition to handle text and
Appl. Sci. 2024,14, 8181 14 of 18
visual data, extracting key information for prediction models. Utilize GIS technology to
analyze consumer distribution and market characteristics, integrating geolocation as a core
dimension in model data. Develop and train hybrid neural network models to identify
regional differences in product preferences. Use machine learning algorithms to process
structured (product specifications, user geolocation) and unstructured data (user reviews,
social media content, images). For instance, text analysis can interpret review sentiments,
while image recognition assesses the impact of product images on decisions. This approach
builds comprehensive user profiles and purchasing behavior models, allowing for tailored
product recommendations and customized assistance.
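The hybrid modeling of structured and unstructured data described above can be sketched, under strong simplifications, as feature-level fusion: structured attributes and a tiny bag-of-words review vector are concatenated and scored by a single logistic unit. The vocabulary, feature choices, and weights below are invented for illustration and are not the authors' model.

```python
import numpy as np

# Hypothetical sketch of feature-level fusion: structured attributes
# (price, rating, a geolocation cluster id) are concatenated with a tiny
# bag-of-words review vector, then scored by one logistic unit.

VOCAB = ["fast", "cheap", "broken", "quality"]  # illustrative vocabulary

def text_vector(review):
    """Count occurrences of each vocabulary word in the review text."""
    tokens = review.lower().split()
    return np.array([tokens.count(w) for w in VOCAB], dtype=float)

def fused_features(price, rating, region_id, review):
    """Concatenate scaled structured features with the text vector."""
    structured = np.array([price / 100.0, rating / 5.0, float(region_id)])
    return np.concatenate([structured, text_vector(review)])

def satisfaction_score(features, weights, bias=0.0):
    """Sigmoid of a linear combination: a stand-in for a trained fusion model."""
    return 1.0 / (1.0 + np.exp(-(features @ weights + bias)))

x = fused_features(59.0, 4.5, 2, "fast delivery and great quality")
w = np.array([-0.2, 1.0, 0.05, 0.3, 0.1, -0.8, 0.6])  # invented weights
score = satisfaction_score(x, w)
```

A production system would learn such weights end to end (e.g., with embedding layers for text and images feeding a shared head), but the fusion step itself is this concatenation of modality-specific features.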
Regularly review and adjust model parameters to maintain prediction accuracy and
relevance, particularly when entering new markets or responding to emerging consumer
trends. Establish a real-time feedback mechanism to translate predictions into actionable
strategies, enabling swift responses from marketing and sales teams. Implement automated
business-intelligence dashboards to monitor key performance indicators (KPIs) and provide
instant data viewing for rapid decision-making. Adopt a continuous learning strategy
using new data to constantly train and improve models, ensuring alignment with market
dynamics. Ensure the system is scalable so that functionality and data-processing
capacity can grow with the business. These strategies provide deep insight into consumer
behavior and market trends, improving the business's responsiveness and accuracy in
adapting to market changes and thus maintaining a competitive edge.
5.2.3. Implement Big Data Analytics-Based E-Commerce Precision Marketing
Optimization Strategies
Utilize big data analytics and multimodal learning technologies to ground decisions and
actions in precise, real-time data insights. Use multimodal learning models to deeply
analyze customer data on the cross-border e-commerce platform. Segment customers by
behavior patterns, purchase history, and social media activity to uncover their distinct
needs and preferences, supporting customized marketing strategies. Develop and deploy
personalized recommendation systems that combine users' purchase histories and browsing
behaviors, implementing machine learning algorithms such as collaborative filtering and
content-based recommendation to improve the relevance and accuracy of
the recommendations.
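A minimal sketch of the collaborative filtering mentioned above, using item-item cosine similarity on an invented rating matrix; real deployments work on far larger, sparse data with libraries such as implicit or Spark MLlib.

```python
import numpy as np

# Hypothetical sketch of item-based collaborative filtering on a tiny
# user-item rating matrix (rows: users, columns: items). Ratings are invented;
# 0.0 means "not yet rated".

ratings = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 4.0],
    [0.0, 1.0, 5.0, 4.0],
])

def item_similarity(r):
    """Cosine similarity between item columns."""
    norms = np.linalg.norm(r, axis=0)
    norms[norms == 0] = 1.0           # avoid division by zero for unrated items
    normalized = r / norms
    return normalized.T @ normalized

def recommend(user_idx, r, sim):
    """Score items by similarity to what the user rated; mask already-rated items."""
    scores = r[user_idx] @ sim
    scores[r[user_idx] > 0] = -np.inf
    return int(np.argmax(scores))

sim = item_similarity(ratings)
best_item = recommend(0, ratings, sim)
```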
Evaluate the effectiveness of different recommendation models through A/B testing
and select the best-performing model for full deployment. Implement dynamic pricing
strategies and intelligent promotions by developing demand-based dynamic pricing
models; use regression analysis and machine learning to predict price sensitivity
and adjust prices to market supply and demand. Analyze how promotional activities
affect different customer groups, customizing targeted offers based on
historical data and consumer behavior patterns. Additionally, use big data analytics to
optimize the timing and content of promotional activities so that marketing efforts
reach the most interested customer segments directly.
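The regression-based price-sensitivity modeling described above can be illustrated with a log-log least-squares fit, whose slope estimates the price elasticity of demand; the price and demand observations below are invented for illustration.

```python
import numpy as np

# Hypothetical sketch: estimating price elasticity with a log-log regression,
# log(demand) = a + b * log(price), where b is the elasticity estimate.
# The observations are invented, not data from the study.

prices = np.array([10.0, 12.0, 15.0, 18.0, 20.0])
demand = np.array([200.0, 165.0, 130.0, 105.0, 95.0])

X = np.column_stack([np.ones_like(prices), np.log(prices)])  # intercept + log price
y = np.log(demand)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
elasticity = coef[1]  # negative: demand falls as price rises
```

An elasticity near -1 would suggest roughly proportional demand response; a dynamic pricing engine would refit such a model per segment and region as new transactions arrive.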
Integrate cross-channel marketing efforts by combining multimodal data analysis to
unify online and offline marketing channels, ensuring consistency in brand messaging and
marketing activities across all touchpoints. Use cross-channel tracking tools to monitor
and analyze the performance of marketing campaigns, and adjust strategies in real time to
maximize return on investment. Establish a real-time feedback mechanism to continuously
monitor the effectiveness of marketing activities and customer feedback, allowing for
ongoing adjustments and optimization of marketing strategies to stay aligned with market
trends and consumer expectations.
By implementing precision marketing strategies, cross-border e-commerce enterprises
can more effectively meet the diverse needs of global consumers, enhance customer
satisfaction, and strengthen market competitiveness.
5.2.4. Develop End-to-End Cross-Border E-Commerce Supply Chain Optimization Solutions
Enhance supply chain data transparency by building an integrated supply chain
management system. Collect and analyze data in real time across the entire supply chain,
from suppliers to end consumers, creating a continuously updated and reliable big data
repository. Utilize big data technologies and multimodal learning models to provide deep
insights, predicting market trends and adjusting production plans accordingly, and to
build a comprehensive understanding of demand characteristics across markets and
customer groups. Use predictive analytics to optimize inventory levels and implement
precise inventory management strategies, and employ efficient logistics planning to
reduce delivery times and costs. Dynamically adjust the supply chain to build a flexible
network that adapts quickly to real-time data and market demand. For example, if demand
surges in a specific region, the system can automatically adjust production priorities
and logistics resources to ensure timely fulfillment.
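As a hedged illustration of predictive inventory management, the sketch below computes a classical reorder point with safety stock; the demand statistics, lead time, and service-level factor are assumed values, not parameters from the study.

```python
import math

# Hypothetical sketch of a reorder-point rule with safety stock.
# z = 1.65 corresponds roughly to a 95% service level under normal demand.

def reorder_point(mean_daily_demand, demand_std, lead_time_days, z=1.65):
    """Reorder when stock falls to expected lead-time demand plus safety stock."""
    safety_stock = z * demand_std * math.sqrt(lead_time_days)
    return mean_daily_demand * lead_time_days + safety_stock

rop = reorder_point(mean_daily_demand=40.0, demand_std=8.0, lead_time_days=4)
```

In the data-driven setting described above, the mean and standard deviation of demand would themselves come from the forecasting models, updated per region and SKU in real time.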
Utilize big data-driven analytical tools to assess and categorize potential risks within
the supply chain. By constructing a comprehensive risk-management framework,
implement effective risk prevention and mitigation measures to ensure the stability and
sustainability of the supply chain. Enhance customer engagement and feedback mechanisms
by using multimodal data analysis to continuously understand customer behavior and
feedback; this ongoing analysis helps optimize supply chain operations and ultimately
improves customer satisfaction. Establish a customer feedback system that lets consumers
directly influence supply chain decisions, for example by shaping inventory and
production through their reviews and feedback.
By implementing these strategies, cross-border e-commerce enterprises can build a
highly optimized and flexible supply chain system that quickly adapts to market changes
and significantly enhances customer satisfaction, thereby gaining a competitive advantage
in the global market. This end-to-end supply chain optimization solution will leverage big
data and artificial intelligence technologies to revolutionize supply chain management and
drive continuous business growth.
6. Contributions and Further Directions
This study can be extended in several directions. First, the multimodal learning
framework can be optimized: given the advantages of multimodal learning in
predicting customer satisfaction, the feature extraction process within each modality
can be further refined. For example, deep convolutional neural networks (CNNs) can be
introduced for image feature extraction, and Transformer-based models can be used for
text analysis. Exploring more sophisticated fusion techniques, such as hierarchical
fusion or feature-level fusion, can improve the efficiency of integrating signals from
different data sources and thereby enhance the overall performance of the model.
Second, optimizer selection warrants further testing and tuning. The results of this
study indicate that the Adam optimizer performs well in multimodal learning. A broader
range of optimizers can be tested systematically, including recently proposed adaptive
learning-rate optimizers such as AdaBelief, LAMB, and Lookahead. Experimenting with
different combinations of learning rates and batch sizes will help identify the best
configuration for the current dataset, and learning-rate decay or cyclical learning-rate
schedules can further improve the model's performance.
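A triangular cyclical learning-rate schedule of the kind mentioned above can be sketched in a few lines; the bounds and cycle length below are illustrative assumptions (frameworks such as PyTorch also ship a built-in `CyclicLR` scheduler).

```python
# Hypothetical sketch of a triangular cyclical learning-rate schedule:
# the rate ramps linearly from base_lr up to max_lr and back within each cycle.

def triangular_lr(step, base_lr=1e-4, max_lr=1e-3, cycle_steps=200):
    """Return the learning rate for a given training step."""
    half = cycle_steps / 2
    position = step % cycle_steps
    fraction = position / half if position <= half else (cycle_steps - position) / half
    return base_lr + (max_lr - base_lr) * fraction

lrs = [triangular_lr(s) for s in range(400)]  # two full cycles
```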
Third, expanding the breadth and depth of the research by applying the current model
to other types of cross-border e-commerce products, such as electronics or home goods, will
verify the model’s generality and robustness. By integrating model interpretation tools and
techniques, such as SHAP (Shapley Additive Explanations) and LIME (Local Interpretable
Model-agnostic Explanations), the research can provide deeper insights, helping merchants
understand the factors that most significantly affect customer satisfaction.
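As a simpler, model-agnostic stand-in for the SHAP/LIME-style attribution discussed above, the sketch below computes permutation importance for an invented linear model: features whose shuffling increases prediction error the most matter most. In practice the `shap` or `lime` packages would be applied to the trained model; everything here is synthetic.

```python
import numpy as np

# Hypothetical sketch: permutation importance on synthetic data.
# y depends strongly on feature 0, weakly on feature 1, and not at all on feature 2.

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

def model(X):
    # A fixed linear "model" standing in for a trained predictor.
    return 3.0 * X[:, 0] + 0.5 * X[:, 1]

def permutation_importance(model, X, y):
    """Increase in MSE when each feature column is shuffled independently."""
    base_error = np.mean((model(X) - y) ** 2)
    importances = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-target link
        importances.append(np.mean((model(Xp) - y) ** 2) - base_error)
    return np.array(importances)

imp = permutation_importance(model, X, y)
```

For merchants, ranking features this way (or with SHAP values) indicates which signals, such as delivery speed or review sentiment, drive predicted satisfaction most strongly.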
Author Contributions: Conceptualization, X.Z. and C.G.; methodology, X.Z.; software, X.Z.; valida-
tion, X.Z.; formal analysis, X.Z. and C.G.; investigation, X.Z.; data curation, X.Z.; writing—original
draft preparation, X.Z. and C.G.; writing—review and editing, X.Z. and C.G.; visualization, X.Z. and
C.G.; supervision, X.Z.; funding acquisition, X.Z. All authors have read and agreed to the published
version of the manuscript.
Funding: This work was supported by the Inner Mongolia Natural Science Foundation, “Research
on Intelligent Marketing of Cross-Border E-commerce with Multimodal Learning and Federated
Learning Collaborative Embedding” (Project Number: 2024MS07009); Interdisciplinary Research
Fund of Inner Mongolia Agricultural University, “Research on Open Innovation Intelligent Decision-
making in E-commerce Based on Federated Learning” (Project No. BR231518); Program for Young
Talents of Science and Technology in Universities of Inner Mongolia Autonomous Region, “Research
on Intelligent Marketing in E-commerce Based on Multimodal Learning” (Project No. NJYT24014);
National Key R&D Program of China, “Research on Sino-Mongolian Agricultural and Pastoral Supply
Chain Collaboration” (Project Number: 2021YFE0190200); National Social Science Fund of China
Post-funding Project, “Research on the Internationalization Development of Chinese Cross-border
E-commerce Brands” (Project Number: 20FGLB033); China Society of Logistics and China Federation
of Logistics & Purchasing General Research Project, “Research on the Operation of Agricultural and
Animal Husbandry Supply Chain between China and Mongolia under the Digital Trade Environment”
(Project Number: 2024CSLKT3-022); Inner Mongolia Autonomous Region Graduate Education
Teaching Reform Project, “Research on the Training Model for New Business Graduates in Inner
Mongolia under the Background of Digital Economy” (Project Number: JGCG2022059).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data used in this article are available upon request from the
corresponding author.
Conflicts of Interest: The authors declare no conflicts of interest.
... Platforms like Taobao and JD.com have deployed CLIP-based product image-text retrieval systems to enable "search by image" functionality [23]. In the autonomous driving field, multimodal models are used for image-navigation instruction understanding, integrating perception and decision-making [24]. ...
Article
Full-text available
In recent years, Vision-Language Models (VLMs) have emerged as a significant breakthrough in multimodal learning, demonstrating remarkable progress in tasks such as image-text alignment, image generation, and semantic reasoning. This paper systematically reviews current VLM pretraining methodologies, including contrastive learning and generative paradigms, while providing an in-depth analysis of efficient transfer learning strategies such as prompt tuning, LoRA, and adapter modules. Through representative models like CLIP, BLIP, and GIT, we examine their practical applications in visual grounding, image-text retrieval, visual question answering, affective computing, and embodied AI. Furthermore, we identify persistent challenges in fine-grained semantic modeling, cross-modal reasoning, and cross-lingual transfer. Finally, we envision future trends in unified architectures, multimodal reinforcement learning, and domain adaptation, aiming to provide systematic reference and technical insights for subsequent research.
... Sentiment-aware chatbots analyze user emotions in real-time, adapting their responses to provide empathetic and contextually relevant support. The integration of RL in chatbot development further enhances conversational flow by enabling adaptive dialogue management based on user interactions [90], [91], [92]. ...
Article
Full-text available
The rapid evolution of e-commerce has been significantly influenced by the integration of machine learning (ML) and data science techniques. The present survey provides a comprehensive overview of how ML methods are applied across various functional domains in e-commerce, including personalized recommendations, dynamic pricing, fraud detection, customer segmentation, and behavioral analysis. We categorize and evaluate a wide range of ML paradigms, namely supervised, unsupervised, reinforcement, and hybrid learning, as well as emerging approaches such as neurosymbolic artificial intelligence (AI), federated learning (FL), and quantumML(QML).Key challenges related to scalability, interpretability, cold-start problems, data sparsity, and privacy are critically analyzed. Additionally, we highlight underexplored areas, such as continual learning (CL) and multi-agent architectures in commerce. The survey incorporates comparative tables, real-world use cases, and a taxonomy of methods to support both academic and industrial perspectives. Ultimately, by analyzing trends and gaps in the literature, we provide a forward-looking research roadmap that bridges ML innovations with the evolving demands of e-commerce ecosystems.
... For instance, Lyu et al. [27] enhanced short-text similarity calculations by embedding a Chinese KG, which aided complex reasoning tasks. Zhang et al. [28] leveraged multimodal knowledge bases combining text and image data to improve recommendation accuracy in e-commerce, while Li et al. [29] incorporated a legal knowledge base into a BERT-based model for Chinese legal document analysis, achieving superior document matching by addressing terminology complexity. ...
Article
Full-text available
Short text similarity, as a pivotal research domain within Natural Language Processing (NLP), has been extensively utilized in intelligent search, recommendation systems, and question-answering systems. Most existing short-text similarity models focus on aligning the overall semantic content of an entire sentence, often ignoring the semantic associations between individual phrases in the sentence. It is particular in the Chinese context, as synonyms and near-synonyms can cause serious interference in the computation of text similarity. To overcome these limitations, a novel short text similarity computation method integrating both sentence-level and phrase-level semantics was proposed. By harnessing vector representations of Chinese words/phrases as external knowledge, this approach amalgamates global sentence characteristics with local phrase features to compute short text similarity from diverse perspectives, spanning from the global to the local level. Experimental results demonstrate that the proposed model outperforms previous methods in the Chinese short text similarity task. Specifically, the model achieves an accuracy of 90.16% in LCQMC, which is 2.23% and 1.46%, respectively, better than ERNIE and Glyce + BERT.
Article
The article provides a thorough analysis of the impact of consumer behavior forecasting on the effective optimization of enterprise production volumes based on the use of marketing approaches. The article reviews and systematizes key concepts, in particular, forecasting consumer behavior, optimizing production volumes and the marketing approach as a methodological basis for research. Particular attention is paid to the conceptualization of consumer behavior and its importance in the formation of business strategies. On the basis of relevant scientific works and analysis of practical cases, the author identifies marketing research tools suitable for accurate forecasting of the level of demand. On the basis of relevant scientific works and analysis of practical cases, the marketing research tools suitable for accurate forecasting of the level of demand are identified. The article highlights the importance of integrating consumer behavior forecasting into the process of optimizing production capacities, focusing on the relationship between demand forecasting and building adaptive production optimization models. The article also discusses modern software and analytical tools that provide opportunities for effective production management based on marketing forecasts. The article concludes with practical recommendations for implementing marketing forecasting strategies to improve the efficiency of production processes, taking into account the dynamic conditions of the modern market. Predicting consumer behavior plays a key role in ensuring that production volumes are optimized in today's business environment. With a marketing-based approach to forecasting, companies can not only anticipate changes in demand, but also quickly adapt their production processes, which in turn helps to minimize costs, manage inventory more efficiently, and increase overall productivity. 
In brief, anticipating consumer behavior through marketing helps with more than just production planning; it also leads to better resource and marketing efficiency, ultimately boosting the company's market competitiveness.
Technical Report
Full-text available
This article offers a comprehensive exploration of strategies for managing and analyzing Big Data, High-Dimensional Data, and Multimodal Data, all critical challenges in the current era of digital transformation. For Big Data, the solutions revolve around distributed computing frameworks like Hadoop and Spark, bolstered by cloud computing and predictive analytics to ensure scalability and efficiency. When addressing High-Dimensional Data, the paper highlights dimensionality reduction techniques such as Sparse Principal Component Analysis (PCA), sparse representation methods, and subspace clustering algorithms, all supported by parallel computing and compressed sensing. Finally, for Multimodal Data, it delves into data preprocessing, feature extraction, and advanced fusion techniques leveraging deep learning, attention mechanisms, and cross-modal transformers to integrate diverse data types and enhance predictive accuracy across various applications.
Article
Full-text available
Research Summary Multimodal data, comprising interdependent unstructured text, image, and audio data that collectively characterize the same source, with video being a prominent example, offer a wealth of information for strategy researchers. We emphasize the theoretical importance of capturing the interdependencies between different modalities when evaluating multimodal data. To automate the analysis of video data, we introduce advanced deep machine learning and data fusion methods that comprehensively account for all intra‐ and inter‐modality interdependencies. Through an empirical demonstration focused on measuring the trustworthiness of grassroots sellers in live streaming commerce on Tik Tok, we highlight the crucial role of interpersonal interactions in the business success of microenterprises. We provide access to our data and algorithms to facilitate data fusion in strategy research that relies on multimodal data. Managerial Summary Our study highlights the vital role of both verbal and nonverbal communication in attaining strategic objectives. Through the analysis of multimodal data—incorporating text, images, and audio—we demonstrate the essential nature of interpersonal interactions in bolstering trustworthiness, thus facilitating the success of microenterprises. Leveraging advanced machine learning techniques, such as data fusion for multimodal data and explainable artificial intelligence, we notably enhance predictive accuracy and theoretical interpretability in assessing trustworthiness. By bridging strategic research with cutting‐edge computational techniques, we provide practitioners with actionable strategies for enhancing communication effectiveness and fostering trust‐based relationships. Access our data and code for further exploration.
Article
Full-text available
The incessant evolution of online platforms has ushered in a multitude of shopping modalities. Within the food industry, however, assessing the delectability of meals can only be tentatively determined based on consumer feedback encompassing aspects such as taste, pricing, packaging, service quality, delivery timeliness, hygiene standards, and environmental considerations. Traditional text data mining techniques primarily focus on consumers’ emotional traits, disregarding pertinent information pertaining to the online products themselves. In light of these aforementioned issues in current research methodologies, this paper introduces the Bert BiGRU Softmax model combined with multimodal features to enhance the efficacy of sentiment classification in data analysis. Comparative experiments conducted using existing data demonstrate that the accuracy rate of the model employed in this study reaches 90.9%. In comparison to single models or combinations of three models with the highest accuracy rate of 7.7%, the proposed model exhibits superior accuracy and proves to be highly applicable to online reviews.
Preprint
Full-text available
The incessant evolution of online platforms has ushered in a multitude of shopping modalities. Within the food industry, however, assessing the delectability of meals can only be tentatively determined based on consumer feedback encompassing aspects such as taste, pricing, packaging, service quality, delivery timeliness, hygiene standards, and environmental considerations. Traditional text data mining techniques primarily focus on consumers' emotional traits, disregarding pertinent information pertaining to the online products themselves. In light of these aforementioned issues in current research methodologies, this paper introduces the Bert BiGRU Softmax model combined with multimodal features to enhance the efficacy of sentiment classification in data analysis. Comparative experiments conducted using existing data demonstrate that the accuracy rate of the model employed in this study reaches 90.9%. In comparison to single models or combinations of three models with the highest accuracy rate of 7.7%, the proposed model exhibits superior accuracy and proves highly applicable to online reviews.
Article
Full-text available
Cross-border e-commerce logistics activities increasingly use multimodal transportation modes. In this transportation mode, the use of high-performance optimizers to provide decision support for multimodal transportation for cross-border e-commerce needs to be given attention. This study constructs a logistics distribution optimization model for cross-border e-commerce multimodal transportation. The mathematical model aims to minimize distribution costs, minimize carbon emissions during the distribution process, and maximize customer satisfaction as objective functions. It also considers constraints from multiple dimensions, such as cargo aircraft and vehicle load limitations. Meanwhile, corresponding improvement strategies were designed based on the Sand Cat Swarm Optimization (SCSO) algorithm. An improved swarm intelligence algorithm was proposed to develop an optimizer based on the improved swarm intelligence algorithm for model solving. The effectiveness of the proposed mathematical model and improved swarm intelligence algorithm was verified through a real-world case of cross-border e-commerce logistics transportation. The results indicate that using the proposed solution in this study, the cost of delivery and carbon emissions can be reduced, while customer satisfaction can be improved.
Article
Full-text available
With the advent of the digital economy, enterprises have been engaging in brand management activities through cross-border e-commerce platforms to secure brand identification (BI) and capture market share. However, scant attention has been given to the impact of perceived brand globalness (PBG) and perceived brand localness (PBL) on brand identification in cross-border e-commerce platforms. This study delves into the underlying mechanisms governing the formation of brand identification in the context of cross-border e-commerce platforms. To this end, we employed the AMOS 26.0 software to conduct structural equation analysis on a corpus of 300 survey questionnaires. The results show that: (1) PBG and PBL can exert a positive influence on BI through customer perceived value; (2) acculturation (AC) plays a positive moderating role in the influence of PBG and PBL on emotional value (EV) and functional value (FV), respectively; and (3) platform reputation (PR) plays a constructive moderating role in the impact of PBG on FV.
Article
Full-text available
With the expansion of communication technologies and their impact on trade patterns, e-commerce strategies have also undergone significant changes. Adapting to these changes requires artificial intelligence techniques to automate a wide range of processes and further develop e-commerce. This article employs a combination of machine learning techniques and multi-objective optimization to enhance supply chain performance in Cross-Border E-Commerce (CBEC). To achieve this, a framework for intelligent CBEC based on Internet of Things (IoT) technology is proposed. By deploying machine learning models within this framework, supply chain performance is improved through demand volume prediction. The predictive model is an ensemble system based on the Adaptive Neuro-Fuzzy Inference System (ANFIS), which uses weighted averaging to predict demand volume for each retail unit. The configuration of this prediction model is carried out at two levels using Particle Swarm Optimization (PSO): at the first level, the hyperparameters of each ANFIS model are optimized, and at the second level, the weight values of each learning component are optimized. The performance of this predictive model in enhancing the CBEC supply chain structure is evaluated on real-world data. Based on the results, the proposed model achieves a mean absolute error of 2.54 in demand volume prediction, a reduction of at least 8.58% compared to previous research. Moreover, the resulting improvement in supply chain performance reduces delays and increases efficiency in CBEC, demonstrating the effectiveness of the proposed model.
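The second optimization level described above, tuning the ensemble's averaging weights with PSO, can be sketched as follows. This is an illustrative toy, not the paper's system: simple biased linear predictors stand in for the ANFIS components, the demand data are synthetic, and the PSO coefficients are conventional textbook values.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in component predictors (the paper uses ANFIS models; biased linear
# predictors are used here purely to illustrate the weighted average).
def make_component(bias):
    return lambda x: 2.0 * x + bias

components = [make_component(b) for b in (-1.0, 0.5, 2.0)]

def ensemble_predict(weights, x):
    """Weighted-average ensemble: weights are normalized to sum to 1."""
    w = np.asarray(weights) / np.sum(weights)
    return sum(wi * c(x) for wi, c in zip(w, components))

# toy "demand" data generated by y = 2x, so the right mix cancels the biases
x = rng.uniform(0, 10, 200)
y = 2.0 * x

def mae(weights):
    return np.mean(np.abs(ensemble_predict(weights, x) - y))

# minimal PSO over the averaging weights (the second optimization level)
n, dims = 20, len(components)
pos = rng.uniform(0.01, 1, (n, dims)); vel = np.zeros((n, dims))
pbest, pbest_val = pos.copy(), np.array([mae(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()
for _ in range(100):
    r1, r2 = rng.random((n, dims)), rng.random((n, dims))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0.01, 1)
    vals = np.array([mae(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()
print("optimized MAE:", round(mae(gbest), 3))
```

The first level in the paper, tuning each ANFIS model's own hyperparameters, would wrap a similar PSO loop around the training of each component before the weights are fitted.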
Chapter
Virtual reality (VR) games have gained significant popularity in recent years, offering immersive and interactive experiences. The success of VR games relies heavily on innovative interaction modes that enhance player engagement and immersion. This paper explores the concept of innovative interaction modes in VR games and their impact on player experiences. We examine various forms of interaction modes, including gesture-based controls, motion tracking, and haptic feedback, and analyze their effectiveness in enhancing gameplay. Furthermore, we discuss the challenges and opportunities in designing and implementing innovative interaction modes. Through a comprehensive review of existing research and case studies, this paper provides insights into the potential of innovative interaction modes to revolutionize the gaming industry. The findings underscore the importance of continuous innovation and experimentation in creating compelling and immersive VR game experiences. By understanding and leveraging these innovative interaction modes, game developers can deliver more engaging and memorable gameplay, transforming the way players interact with virtual worlds. In this study, we built a prototype system to evaluate the impact of innovative interaction modes on player engagement and gameplay experiences in VR games, implementing two different interaction modes: gesture-based controls and an in-game control tool.