Dynamic Pricing Model of E-Commerce Platforms Based on Deep Reinforcement Learning

Tech Science Press
Computer Modeling in Engineering & Sciences
Abstract

With the continuous development of artificial intelligence technology, its application field has gradually expanded. To further apply the deep reinforcement learning technology to the field of dynamic pricing, we build an intelligent dynamic pricing system, introduce the reinforcement learning technology related to dynamic pricing, and introduce existing research on the number of suppliers (single supplier and multiple suppliers), environmental models, and selection algorithms. A two-period dynamic pricing game model is designed to assess the optimal pricing strategy for e-commerce platforms under two market conditions and two consumer participation conditions. The first step is to analyze the pricing strategies of e-commerce platforms in mature markets, analyze the optimal pricing and profits of various enterprises under different strategy combinations, compare different market equilibriums and solve the Nash equilibrium. Then, assuming that all consumers are naive in the market, the pricing strategy of the duopoly e-commerce platform in emerging markets is analyzed. By comparing and analyzing the optimal pricing and total profit of each enterprise under different strategy combinations, the subgame refined Nash equilibrium is solved. Finally, assuming that the market includes all experienced consumers, the pricing strategy of the duopoly e-commerce platform in emerging markets is analyzed.
DOI: 10.32604/cmes.2021.014347
ARTICLE
Chunli Yin1,* and Jinglong Han2
1College of Economics and Administration, Tonghua Normal University, Jilin, 130000, China
2Department of Administration Section, Tonghua Normal University, Jilin, 130000, China
*Corresponding Author: Chunli Yin. Email: TH Yin@thnu.edu.cn
Received: 19 September 2020 Accepted: 12 November 2020
KEYWORDS
Deep reinforcement learning; e-commerce platform; dynamic evaluation; game model; pricing strategy
CMES, 2021, vol.127, no.1
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1 Introduction
With the development of the Internet and the popularization of e-commerce, it has become easier for people to obtain comprehensive information on goods and services. Changes in the price of goods or services affect consumers’ shopping behavior almost immediately, which directly affects corporate profits. To maximize efficiency, companies often adjust the prices of goods or services regularly or irregularly based on certain factors, which is consistent with the goal of deep reinforcement learning in the field of artificial intelligence: maximizing long-term benefits. Therefore, the technical means of deep reinforcement learning can achieve the intelligent pricing of goods or services. E-commerce customer purchase behavior prediction estimates, in real time, an online customer’s purchase tendency based on the behavioral patterns contained in the consumer’s historical click operations, server logs, browsing records and product feedback information. Based on these predictions, the platform can recommend products to customers, formulate marketing strategies, and determine the purchase and shipment of platform products.
Dynamic pricing is a strategy in which enterprises dynamically adjust commodity prices based on customer demand, their own supply capacity and other information to maximize revenues [1]; some scholars also call it personalized pricing [2]. With the continuous development of artificial intelligence technology, more and more scholars have sought to use intelligent methods to solve dynamic pricing problems. Deep reinforcement learning is one of the most widely used of these technologies. It is inspired by the ability of people and animals to adapt effectively to their environment, and it learns from the environment through continuous trial and error. As an important branch of machine learning, it has a very wide range of applications in artificial intelligence problem solving, multiagent control, robot control and motion planning, and decision-making control [3,4]. Learning from the environment is one of the core technologies of intelligent system design and decision-making, and it is also a key issue in dynamic pricing strategy research.
The development of the Internet, increasingly fierce market competition, and the need for customer management have transformed the pricing model of commercial enterprises from fixed prices to dynamic pricing. Dynamic pricing in an e-commerce environment is based on the value that the customer places on a product or service [5,6] and is a dynamic price adjustment strategy for different customers or commodities. Sellers can achieve the goal of dynamic pricing by integrating customer databases that meet specific standards of target customers [7,8]. When the quantity demanded is random and price sensitive, dynamic pricing becomes an effective method to maximize profits [9,10]. Varying, dynamic prices are an important feature of e-commerce pricing, and effectively formulating dynamic pricing strategies is an important factor for enterprises to succeed in the field of e-commerce [11,12].
E-commerce companies can adopt four kinds of dynamic pricing decision-making strategies, namely, a time-based pricing strategy, a market segmentation and limited rationing strategy, a dynamic marketing strategy, and the comprehensive application of these strategies [13,14]. The time-based pricing strategy is implemented according to the price difference that consumers can bear at different times. The key is to grasp the psychological difference in customers’ price tolerance at different times [15,16]. The basic principles of the market segmentation and limited rationing strategy are as follows: customers who use different channels, buy at different times, and expend different amounts of effort have different price tolerances; companies develop special product and service portfolios; and companies differentiate pricing based on different product configurations, channels, customer types, and times [17,18]. The dynamic marketing strategy takes advantage of the Internet to quickly and frequently adjust prices based on changes in supply and inventory levels and to provide customers with different products, various promotional offers, multiple delivery methods, and differentiated products. In actual application, an enterprise may implement a single strategy or combine strategies. When formulating pricing strategies, the best approach is to experiment with specific customer groups, select the best pricing model [19,20], and then adjust the model accordingly.
In dynamic pricing, companies can use modeling methods, such as inventory models, data-driven models, game models, machine learning models, and simulation models, to assist analysis and decision-making [21,22]. Data-driven models use statistical or simulation techniques to make effective use of customer data to calculate appropriate dynamic prices. Currently, dynamic pricing is also one of the important research areas of customer relationship management and data mining technology [23,24]. Negotiation is a dynamic interactive process through which the parties involved in a transaction reach an agreement. During the negotiation process, the parties exchange proposals reflecting their beliefs and intentions. In each round of negotiation, the agent proposes negotiation proposals based on its own negotiation strategy and evaluates the received proposals to determine whether to accept the other party’s proposal [25,26]. The negotiation process is usually a dynamic process of learning and updating beliefs.
Therefore, in-depth study of the application of deep reinforcement learning methods in the field of dynamic pricing is of great significance to the development of artificial intelligence, of deep reinforcement learning methods, and of their applications in dynamic pricing and other fields. We review two aspects: deep reinforcement learning technology and its specific application in the field of dynamic pricing. First, based on existing dynamic pricing research, the relevant key technologies of deep reinforcement learning are introduced. Then, the application of deep reinforcement learning in dynamic pricing is reviewed from different perspectives, and the advantages and disadvantages are analyzed. Next, we systematically review platform pricing theory and differential pricing theory, use game theory as the main research method to establish a pricing game model for competing platform enterprises, and analyze network externalities and consumer switching costs in mature and emerging markets, as well as the impact of enterprise pricing strategies on market equilibrium, to systematically analyze the dynamic pricing behavior of platform companies. The first section of this paper is the introduction, the second section introduces the construction of the e-commerce dynamic pricing model based on data mining, the third section studies the deep reinforcement learning transaction recognition model, and the fourth section studies the e-commerce dynamic pricing model. The results and discussion are given in the fifth section, and the sixth section is a summary.
2 Construction of the E-Commerce Dynamic Pricing Model Based on Data Mining
At present, application research on data mining in e-commerce tools focuses mainly on customer relationship management. Although some scholars have proposed applying data mining technology to e-commerce dynamic pricing tools, many of these theories are scattered and general. Theoretical analysis without comprehensive and systematic application analysis lacks an overall grasp of the application of data mining in the dynamic pricing of e-commerce, and the effectiveness of data mining cannot be fully utilized. To this end, this article establishes a dynamic pricing model for e-commerce based on data mining and proposes applying data mining technology to dynamic pricing decisions, which will be of great help to e-commerce companies in pricing decisions. The model is composed of three layers, namely, the data layer, the analysis layer and the decision layer, from top to bottom [27]. These three levels are closely connected, and each level contains the application of related theories and technologies of data mining and dynamic pricing, which together achieve the goal of e-commerce dynamic pricing decisions. The model is shown in Fig. 1.
2.1 Data Layer
The task of the data layer is to collect data related to pricing decisions and preprocess these
data to form a data warehouse to prepare for the next stage of data mining.
After the data source is selected, the data must be collected in a timely and high-quality manner and imported into a series of data files, usually stored in a database. This step can be used to generate and obtain data in the form of network-free action, but it also requires enterprises to build a basic database in advance and update it in time according to inventory, market and sales reports. The data collected through various channels may have considerable redundancy, or there may be inaccurate, incomplete, and inconsistent data. The data must therefore be preprocessed: extraction, verification, cleaning, conversion, integration and other processes improve data quality, form a data collection suitable for data mining, and load it into the data warehouse.
[Figure 1 components: perception layer; online transaction data; customer transaction information; dynamic pricing strategy database; preliminary dynamic pricing strategy; database; dynamic pricing model; deep reinforcement learning; e-commerce platform]
Figure 1: E-commerce dynamic pricing model based on deep reinforcement learning
2.2 Analysis Layer
The main tasks of the analysis layer are to use data mining models and related algorithms to analyze and process the data obtained, to mine knowledge useful for dynamic pricing decisions, and to form the initial knowledge base. The realization of this stage is the core of the whole model construction. In dynamic pricing-assisted decision-making tools, methods such as association rules, classification, clustering, and sequence pattern analysis can be used.
Association analysis aims to mine the data relationships or rules hidden in the database (data warehouse), that is, to discover the laws or knowledge of dependence or association between an event and other events. In e-commerce dynamic pricing tools, association analysis can be used to find associations among customers’ visits to and purchases of various products on a website, to determine various associations in customer buying behavior, and to relate customer buying behaviors to product prices and other product information. The relationships between these types of information can be used to further discover the relationship between demand and price, which is an important point for dynamic pricing decisions. The Apriori algorithm can be applied to the collected basic customer data and transaction data to discover the details of customers’ purchase associations [28,29].
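As an illustration of the association analysis described above, the following is a minimal sketch of Apriori-style frequent itemset mining in plain Python. The basket contents, the `min_support` threshold and the function names `apriori` and `confidence` are hypothetical and chosen only for this example; the paper does not specify its own implementation.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from stored supports."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical purchase baskets
baskets = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone", "charger"},
    {"case"},
]
freq = apriori(baskets, min_support=0.5)
rule_conf = confidence(freq, {"charger"}, {"phone"})
```

In this toy data every basket containing a charger also contains a phone, so the rule charger -> phone has confidence 1.0, the kind of purchase association that could feed a bundled or cross-product pricing decision.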
2.3 Decision-Making Layer
The decision layer is a key part of the realization of the entire model. The main task of this layer is to make dynamic pricing decisions based on the knowledge base established by the analysis layer, combined with the business strategy of the enterprise.
Through the application of data mining technology in the analysis layer, one can obtain the characteristics of the access patterns, purchase patterns, habits and preferences of different customer groups; the correlations between price, demand and the sales of goods, as well as the number of people related to the goods and the amount of sales; the predicted values of inventory time series; etc. Using this basic knowledge, the seller can make preliminary dynamic pricing decisions. In the time-based strategy, an appropriate initial price is first determined, with factors such as historical sales data and cost information comprehensively considered; then, given the initial maximum or minimum price, the price can be adjusted on a double basis, by setting thresholds on time and on the quantity of goods or demand, and then controlling the timing and range of the price changes [30]. When using market segmentation strategies to differentiate pricing based on customer information, the strategies must be understood by customers, and appropriate, targeted dynamic pricing strategies must be adopted for strategic consumers based on their purchase records and price sensitivity [31], thereby achieving customer satisfaction.
The ultimate goal of dynamic pricing for e-commerce companies is to maximize customer satisfaction or maximize corporate profits; moreover, companies have different goals in different periods of their operations and different requirements for pricing strategies. Therefore, the enterprise pricing decision is a multiobjective decision-making process. To this end, we must first establish a multiobjective function. Using the mined information and forecast data, an appropriate demand function can also be established, and the price can be adjusted according to customer demand or corporate sales/inventory. When applying this traditional enterprise dynamic pricing strategy, there are many mature pricing models that can be referenced. For example, the pricing model based on inventory control uses dynamic programming to achieve dynamic pricing, and other mathematical models can be applied as well.
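To make the inventory-control example concrete, here is a minimal sketch of finite-horizon dynamic programming for pricing. It assumes, purely for illustration, that at most one unit can be sold per period and that the probability of a sale depends only on the posted price; the price grid, the sale probabilities and the name `dp_prices` are hypothetical and are not taken from the paper.

```python
def dp_prices(periods, inventory, prices, sale_prob):
    """Backward induction over (period, units left).

    value[t][x]  : expected revenue from period t onward with x units left
    policy[t][x] : revenue-maximising price in period t with x units left
    """
    value = [[0.0] * (inventory + 1) for _ in range(periods + 1)]
    policy = [[None] * (inventory + 1) for _ in range(periods)]
    for t in range(periods - 1, -1, -1):
        for x in range(1, inventory + 1):
            best_price, best_value = None, float("-inf")
            for p in prices:
                q = sale_prob(p)  # assumed probability of selling one unit at price p
                v = q * (p + value[t + 1][x - 1]) + (1 - q) * value[t + 1][x]
                if v > best_value:
                    best_price, best_value = p, v
            value[t][x] = best_value
            policy[t][x] = best_price
    return value, policy

# Hypothetical demand: the low price sells with probability 0.9, the high price with 0.4
demand = {10: 0.9, 20: 0.4}
value, policy = dp_prices(periods=2, inventory=1, prices=[10, 20], sale_prob=demand.get)
```

With these assumed numbers the policy posts the high price in the first period and drops to the low price in the last period, which is the markdown pattern that inventory-based models typically produce when unsold stock has no value after the horizon.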
3 Deep Reinforcement Learning Transaction Recognition Model
A Multi-Agent System (MAS) describes the intelligent behavior of a group of autonomous, intelligent agents and how they coordinate with each other to take action to achieve a certain goal. In MAS, the mutual coordination among agents includes the coordination of knowledge, goals, skills and planning directions. The goal they achieve may be a single solution goal or a set of several solution goals. According to the definition, the multiagent collaborative solution model is shown in Fig. 2.
The input layer of the network has no calculation nodes and is only used to obtain external input signals. The neurons of the hidden layer and the output layer are the calculation nodes. The basis function is a linear function and the activation function is a hard limit function. Suppose the multilayer perceptron (MLP) has only one hidden layer, and its input is $t_1, t_2, \ldots, t_n$. In addition, the hidden layer has $m_1$ neurons, and their outputs are $h_1, h_2, \ldots, h_{m_1}$. Finally, the network output is represented by $\delta_p$. Then, the output of the j-th neuron in the hidden layer is:
$$h_j = f\left(\sum_{i=1}^{n} \omega_{ij} t_i - \delta_j\right), \quad j = 1, 2, \ldots, m_1 \tag{1}$$
where $f$ is the hard limit activation function and $\omega_{ij}$ is the connection weight from input $i$ to hidden neuron $j$.
[Figure 2 components: strategy solution; surroundings; learning model; knowledge base; collaborative solution set; coordination; strategy selection; coordinator]
Figure 2: Multiagent collaborative solution model
When the multilayer perceptron is used to solve practical problems, the connection weights between the input layer and the hidden layer must first be trained; however, because it is difficult to determine the expected output values of the hidden layer, these network weights cannot be trained directly. Therefore, people seek other neural network solutions to solve the linearly inseparable problem, and the backpropagation (BP) network is such a network.
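The role of the hidden layer can be illustrated with the classic linearly inseparable XOR problem. The sketch below evaluates a one-hidden-layer perceptron with hard limit activations, using the weighted sum minus a threshold for each neuron. The weights and thresholds are hand-picked for illustration, not learned, and all names are hypothetical; the point is only that suitable hidden-layer weights exist even though a perceptron learning rule cannot find them without target values for the hidden outputs.

```python
def hard_limit(x):
    """Hard limit (step) activation: 1 if x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0

def hidden_outputs(t, weights, thresholds):
    """Compute h_j = f(sum_i w_ij * t_i - threshold_j) for each hidden neuron j.

    weights[j] holds the weights from all inputs into hidden neuron j.
    """
    return [
        hard_limit(sum(w * t_i for w, t_i in zip(w_j, t)) - th_j)
        for w_j, th_j in zip(weights, thresholds)
    ]

# Hand-picked parameters (an assumption for illustration, not trained values)
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]  # neuron 0 behaves like OR, neuron 1 like AND
HIDDEN_TH = [0.5, 1.5]
OUT_W = [1.0, -1.0]                  # output fires for "OR but not AND"
OUT_TH = 0.5

def xor_mlp(t):
    """Forward pass: inputs -> hidden layer -> single hard-limit output."""
    h = hidden_outputs(t, HIDDEN_W, HIDDEN_TH)
    return hard_limit(sum(w * x for w, x in zip(OUT_W, h)) - OUT_TH)

outputs = [xor_mlp(t) for t in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

Because the hard limit function has no useful gradient, these weights cannot be obtained by gradient descent; BP networks replace it with a differentiable activation so that the output error can be propagated back to the hidden layer.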
An e-commerce platform often needs to analyze and predict customers’ online shopping behavior. Based on the customer information database, the e-commerce platform completes real-time and targeted predictions of customers’ online shopping behaviors, thus embodying intelligent prediction of customer behavior. Therefore, to build a complete predictive model system, we first need to use methods such as data mining, machine learning, and statistics to discover knowledge and extract features from the data. Based on this, we build a knowledge base of customer online shopping behavior for knowledge guidance, storage and representation, and then establish a system from data input to behavior prediction. The main research contents are as follows:
(1) Consumer behavior data processing and feature construction
First, the interaction logs are extracted from the e-commerce interactive system to prepare data related to consumer behavior analysis and prediction. Then, data preprocessing, including data cleaning, filling missing values and removing outliers, is performed to ensure the uniqueness of the data and to provide a sound basis for consumer behavior prediction.
(2) Construction of consumer behavior characteristics
Based on the original data, the user purchase behavior features are extracted. According to different classification methods, the features can be divided into original and extended features or static and dynamic features, and two or more categories of features can be combined into a new feature. The data and features largely determine the upper limit of the model’s prediction performance. Therefore, constructing suitable features is the key to sound analysis of user behavior.
(3) Consumer behavior prediction model
The accuracy of the prediction model is the key to ensuring the prediction and analysis of consumer behavior. Although there are many prediction models at present, they are far from meeting the accuracy requirements under real conditions. How to use static or dynamic consumer data to accurately predict consumer behavior is an extremely critical technology.
(4) Consumer shopping behavior analysis
In the representation learning of data, the goal is to seek better representation methods and create better models to learn these representations from large-scale unlabeled data. The workflow of consumer shopping behavior analysis based on deep learning is mainly divided into the following four steps.
Step 1: Prepare and process the data set. This step includes collecting user interaction information, data cleaning, etc.
Step 2: Feature construction. This step is divided into three stages: feature selection, forming the sample training set and test set, and feature processing. Feature selection is the key to building a prediction model. It selects the features that are most important for classification from a large number of candidates, thereby improving the model’s prediction accuracy and shortening the running time. Inconsistent dimensions and units among the selected features will affect the weights of the features, which in turn affects the model’s prediction performance. Therefore, feature processing must include normalization.
Step 3: Design and train the prediction model. Select the basic model framework, such as a convolutional neural network (CNN) combined with a recurrent neural network (RNN). Then, using the framework, randomly sample negative samples of the data, adjust the number of network layers, determine the loss function, and set the learning rate and other hyperparameters. The BP algorithm back-propagates gradients, and stochastic gradient descent (SGD) or the Adam algorithm is used to optimize the model parameters.
Step 4: Model verification. Held-out data not used in training are used to verify the generalization ability of the model. If the prediction result is not ideal, the model needs to be redesigned and a new round of training conducted. There are several mature deep learning models to date, including deep neural networks (DNNs), convolutional neural networks (CNNs), deep belief networks (DBNs), and recurrent neural networks (RNNs). These methods have been used in machine vision, natural language processing, bioinformatics, speech recognition and other fields and have achieved remarkable results.
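The data preparation in Steps 1 and 2 can be sketched in a few lines of plain Python. The example below fills missing values with the column mean, applies min-max normalization so that features with different units become comparable, and forms training and test sets. The feature names, sample values and helper names are hypothetical; mean imputation and min-max scaling are only one common choice among several.

```python
import random

def fill_missing(column):
    """Replace None entries with the mean of the observed values."""
    observed = [x for x in column if x is not None]
    mean = sum(observed) / len(observed)
    return [mean if x is None else x for x in column]

def min_max_normalize(column):
    """Rescale a numeric column to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical per-user features: click count and spend (different units)
clicks = fill_missing([3, None, 9, 12])
spend = fill_missing([20.0, 180.0, None, 100.0])
features = list(zip(min_max_normalize(clicks), min_max_normalize(spend)))
train_rows, test_rows = train_test_split(features)
```

After scaling, both features lie in [0, 1], so neither dominates a distance or gradient computation merely because of its unit.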
4 Research on E-Commerce Dynamic Pricing Model
4.1 Deep Reinforcement Learning
The working principle of deep reinforcement learning is similar to that of human learning. If an action of the agent obtains a positive reward from the environment, then the agent’s tendency to take that action in the future is strengthened; conversely, if a negative reward is received, the tendency is weakened. The goal of deep reinforcement learning is to learn an action strategy so that the system obtains the largest cumulative reward. In deep reinforcement learning, the agent selects and executes an action a in the environment; the environment changes to a new state s after accepting the action and feeds back a reward signal r to the agent; and the agent selects the subsequent action according to the reward signal. In research related to dynamic pricing, the goal of deep reinforcement learning systems is to enable manufacturers to maximize their overall returns rather than the short-term benefits of a single transaction. A deep reinforcement learning architecture generally includes four elements: the strategy, reward and punishment feedback, the value function, and the environmental model. The environment-related factors of dynamic pricing are numerous and complex. Previous studies of dynamic pricing in deep reinforcement learning were mainly based on the following environmental frameworks.
Deep reinforcement learning can be divided into value-based deep reinforcement learning and policy-based deep reinforcement learning. In deep reinforcement learning based on value functions, commonly used learning algorithms include the Q-learning algorithm, the SARSA algorithm and the Monte Carlo algorithm. These three algorithms are also frequently used in dynamic pricing research based on deep reinforcement learning. (1) Q-learning algorithm. The Q-learning algorithm is a model-free algorithm, and its iteration equation is expressed as:
$$\delta_t = m_{t+1} + \lambda \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \tag{2}$$
where $Q(s_t, a_t)$ is the state-action value at time $t$, $m_{t+1}$ is the reward value, $\lambda$ is the discount factor, $\alpha$ is the learning rate, $\delta_t$ is the temporal-difference error, and $a$ ranges over the actions that can be performed in state $s_{t+1}$. The value estimate is then updated as $Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \delta_t$.
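The Q-learning iteration can be sketched as a small tabular learner. The toy market below is entirely hypothetical (two demand states, two price levels, invented purchase probabilities), and tabular storage stands in for the neural network that a deep variant would use. The function names, the epsilon-greedy exploration rule and the hyperparameter values are assumptions for illustration.

```python
import random

def td_error(Q, s, a, reward, s_next, actions, lam):
    """Temporal-difference error: reward + lam * max_a' Q(s', a') - Q(s, a)."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    return reward + lam * best_next - Q[(s, a)]

def q_learning(step, states, actions, episodes=500, horizon=10,
               alpha=0.1, lam=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = rng.choice(states)
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.choice(actions)  # explore
            else:
                a = max(actions, key=lambda a2: Q[(s, a2)])  # exploit
            reward, s_next = step(s, a, rng)
            Q[(s, a)] += alpha * td_error(Q, s, a, reward, s_next, actions, lam)
            s = s_next
    return Q

def toy_market(state, price, rng):
    """Hypothetical environment: the state is a demand level, the action a price."""
    buy_prob = {"low": {1: 0.9, 2: 0.3}, "high": {1: 0.95, 2: 0.8}}[state][price]
    reward = price if rng.random() < buy_prob else 0.0
    next_state = rng.choice(["low", "high"])
    return reward, next_state

Q = q_learning(toy_market, ["low", "high"], [1, 2])
greedy_price = {s: max([1, 2], key=lambda p: Q[(s, p)]) for s in ["low", "high"]}
```

The learned `greedy_price` table is the pricing policy: for each observed demand state it returns the price with the highest estimated long-run revenue rather than the highest immediate revenue.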
4.2 SARSA Algorithm
SARSA is an on-policy algorithm that can find the optimal strategy through iteration of the state-action value function when the reward function and state transition probability are unknown. When every state-action pair is visited infinitely often, the algorithm converges to the optimal strategy and state-action value function with probability 1. The SARSA algorithm adopts relatively safe actions in learning, so the convergence speed of the algorithm is slow. The iteration equation is expressed as:
$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ m + \lambda Q(s', a') - Q(s, a) \right] \tag{3}$$
where $s'$ is the next state and $a'$ is the action actually selected in $s'$.
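The difference from Q-learning lies only in the bootstrap target: SARSA uses the value of the action actually chosen in the next state instead of the maximum over actions. A minimal sketch of a single update follows; the state and action labels and the numeric values are hypothetical.

```python
def sarsa_update(Q, s, a, reward, s_next, a_next, alpha=0.1, lam=0.9):
    """One on-policy update: the target uses Q(s', a') for the chosen a'."""
    target = reward + lam * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]

# Hypothetical table: state "s" with one price, next state "n" with two prices
Q_table = {("s", "lo"): 1.0, ("n", "lo"): 2.0, ("n", "hi"): 4.0}

# The behaviour policy picked the cautious action "lo" in state "n",
# so the target is 1.0 + 0.5 * 2.0 rather than 1.0 + 0.5 * 4.0.
new_value = sarsa_update(Q_table, "s", "lo", 1.0, "n", "lo", alpha=0.5, lam=0.5)
```

Because exploratory actions enter the target, the learned values reflect the risk of the exploration policy itself, which is one way to read the remark that SARSA adopts relatively safe actions.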
4.3 Monte Carlo Algorithm
The Monte Carlo algorithm does not require complete knowledge of the environment and only requires experience to solve for the optimal strategy. This experience can be obtained online or from a simulation mechanism. The Monte Carlo method keeps a count of the frequency of state-action pairs and their future rewards and establishes their values based on these estimates; that is, it estimates the expected return by averaging sampled returns. For each state, all the returns obtained from that state are kept, and the value of the state is their average. Monte Carlo techniques are especially useful for episodic tasks. Since sampling depends on the current strategy, the strategy only evaluates the reward of the proposed action. The value function update rule is expressed as:
$$V(s_t) \leftarrow V(s_t) + \alpha \left( R_t - V(s_t) \right) \tag{4}$$
where $R_t$ is the reward value at time $t$, and $\alpha$ is the step parameter.
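A minimal sketch of this update applied to one completed episode is given below, in its every-visit, constant-step form. The episode contents and function names are hypothetical, and the discount factor `lam` is an added assumption that can be set to 1.0 to recover plain summed rewards.

```python
def discounted_returns(rewards, lam=1.0):
    """Return G_t = r_t + lam * r_{t+1} + ... for every step of an episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + lam * g
        returns.append(g)
    return returns[::-1]

def mc_update(V, episode, alpha=0.1, lam=1.0):
    """Every-visit Monte Carlo update V(s) <- V(s) + alpha * (G - V(s)).

    episode is a list of (state, reward) pairs from one finished episode.
    """
    states = [s for s, _ in episode]
    rewards = [r for _, r in episode]
    for s, g in zip(states, discounted_returns(rewards, lam)):
        v = V.get(s, 0.0)
        V[s] = v + alpha * (g - v)
    return V

# Hypothetical episode: state "a" is visited twice, state "b" once
V = mc_update({}, [("a", 1.0), ("b", 0.0), ("a", 2.0)], alpha=0.5, lam=0.5)
```

Unlike the temporal-difference updates, nothing is learned until the episode has ended, which is why the method suits tasks with a natural end such as a selling season.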
4.4 E-Commerce Dynamic Pricing Model
Dynamic pricing in e-commerce is one of the fastest growing areas in Internet applications. By applying an online auction-style dynamic pricing model, companies can price products based on the true market value of commodities. In most real markets, only the buyer himself knows exactly how many items he will be willing to buy at a specific price level. The seller does not have perfect knowledge of the market demand and cannot accurately understand the buyer’s valuation; the seller only has statistical information about the market demand. This section starts from the “individual valuation” model and discusses the online auction in which a single seller provides auction items and multiple buyers bid on them, together with the corresponding auction-type dynamic pricing model.
4.4.1 Auction Dynamic Pricing Model and Analysis Based on Uncertain Demand
Suppose that the system is a market environment in which an auctioneer on the Internet auctions many items, there are many demanders, and the quantity of demand is uncertain. Let N be the set of n demand-side agents, and let F be the set of all possible allocation combinations among them. For each allocation $\alpha \in F$, agent $j \in N$ assigns a monetary amount $v_j(\alpha)$, and $v_j(\alpha)$ is private information, that is, an “independent individual valuation”. Independence means that each buyer’s personal information is independent of other bidders’ personal information. Personal valuation means that once a buyer uses his own information to evaluate the value of the auction target, this valuation will not be affected by anything he subsequently learns about any other purchaser’s personal information.
$$V(N) = \max_{\alpha \in F} \sum_{j \in N} v_j(\alpha), \qquad V(N \setminus j) = \max_{\alpha \in F \setminus j} \sum_{i \neq j} v_i(\alpha) \tag{5}$$
If the auction is sealed, the process is as follows: Agents submit their monetary amount functions, and we temporarily assume that they submit them truthfully. Later, it will be explained that false reporting cannot improve the income of any agent. The auctioneer calculates $V(N)$ and $V(N \setminus j)$ and chooses the best allocation $\alpha^{*}$. In this way, the payment of agent $j$ is:
$$V(N \setminus j) - \sum_{i \neq j} v_i(\alpha^{*}) \tag{6}$$
The net income is:
$$v_j(\alpha^{*}) - \left[ V(N \setminus j) - \sum_{i \neq j} v_i(\alpha^{*}) \right] \tag{7}$$
Suppose that the seller agent S has 5 indivisible commodities and that 5 bidders $A_1, A_2, \ldots, A_5$ participate in the bidding. The possible demand and bid of each bidder $A_1, A_2, \ldots, A_5$ are shown in Fig. 3, and the revenue of the seller’s agent S is shown in Fig. 4.
Figure 3: Possible demand and bids of bidders
Figure 4: Income of the seller’s agent S
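The payment rule of this subsection, in which each agent pays the loss its presence imposes on the others, can be sketched for a small multi-unit auction. The code below brute-forces the best allocation, which is feasible only for tiny instances. The three bidders and their valuations are hypothetical and are not the data behind Figs. 3 and 4; `valuations[j][k]` is assumed to be the amount bidder j reports for receiving k units.

```python
from itertools import product

def best_allocation(valuations, units):
    """Brute-force the allocation that maximises total reported value."""
    best_value, best_alloc = 0.0, tuple(0 for _ in valuations)
    for alloc in product(range(units + 1), repeat=len(valuations)):
        if sum(alloc) > units:
            continue  # cannot hand out more units than are for sale
        value = sum(v[k] for v, k in zip(valuations, alloc))
        if value > best_value:
            best_value, best_alloc = value, alloc
    return best_value, best_alloc

def auction_outcome(valuations, units):
    """Return (units won, payment, net income) for every bidder."""
    total, alloc = best_allocation(valuations, units)        # V(N) and the chosen allocation
    results = []
    for j, v_j in enumerate(valuations):
        others = valuations[:j] + valuations[j + 1:]
        v_without_j, _ = best_allocation(others, units)      # V(N \ j)
        others_at_opt = total - v_j[alloc[j]]                # value of the others at the chosen allocation
        payment = v_without_j - others_at_opt                # payment rule
        net = v_j[alloc[j]] - payment                        # net income
        results.append((alloc[j], payment, net))
    return results

# Hypothetical bids for 0, 1 or 2 units from three bidders; two units for sale
bids = [[0, 10, 15], [0, 8, 12], [0, 6, 6]]
outcome = auction_outcome(bids, units=2)
```

In this toy instance the first two bidders each win one unit and each pay 6, the value the third bidder would have obtained from that unit, while the third bidder pays nothing. Each winner's net income equals its marginal contribution V(N) minus V(N \ j).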
4.4.2 Exchange-Based Dynamic Pricing Model Based on Uncertain Demand
In the online auction MDA market environment, it is assumed that there are m buyers and n sellers. The numbers of buyers and sellers are arbitrary; it is not assumed that there are more buyers than sellers or more sellers than buyers. Each buyer $i = 1, 2, \ldots, m$ wants to buy $X_i$ units of homogeneous goods. Each seller $j = 1, 2, \ldots, n$ has $Y_j$ units of homogeneous goods for sale. To simplify the analysis, it is assumed that $X_i$ and $Y_j$ are public information for all participants, that is, the m buyers and n sellers know each other’s quantity demanded or quantity supplied of the commodity. However, the reserve price $b_i$ of buyer i and the reserve price $s_j$ of seller j are private information, that is, each is an “independent individual valuation.” Each agent in the model assumes that its reserve price is static and remains unchanged during the auction.
When the auction is over (that is, at market clearing), assume that buyer i purchases $t_{ij}$ units of goods from seller j, and $m_{ij}$ is the transaction price. In this way, the utility obtained by buyer i after the auction can be defined as:
$$u_{b_i} = \sum_{j=1}^{n} t_{ij} (b_i - m_{ij}) \tag{8}$$
The utility obtained by seller j can be defined as:
$$u_{s_j} = \sum_{i=1}^{m} t_{ij} (m_{ij} - s_j) \tag{9}$$
If all information is public, the maximized total market value, that is, the aggregate utility of all agents participating in the auction, can be obtained through the following linear programming problem, in which the transaction prices $m_{ij}$ cancel when buyer and seller utilities are summed:
$$\max \sum_{i=1}^{m} \sum_{j=1}^{n} t_{ij} (b_i - s_j) \tag{10}$$
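The total-surplus maximization can be illustrated without an LP solver. The sketch assumes the natural constraints, which the text does not state explicitly: buyer i trades at most $X_i$ units, seller j trades at most $Y_j$ units, and quantities are nonnegative. Under these assumptions, and because the objective depends only on the difference between reserve prices, matching the highest-value buyers with the lowest-cost sellers until the price gap closes attains the optimum. All data and names below are hypothetical.

```python
def max_total_surplus(buyers, sellers):
    """Greedy matching for the total-surplus problem.

    buyers  : list of (reserve price b_i, demand X_i)
    sellers : list of (reserve price s_j, supply Y_j)
    Returns (total surplus, list of (buyer index, seller index, units)).
    """
    demand = sorted(((b, x, i) for i, (b, x) in enumerate(buyers)), reverse=True)
    supply = sorted((s, y, j) for j, (s, y) in enumerate(sellers))
    trades, total = [], 0
    bi = sj = 0
    left_b = demand[0][1] if demand else 0
    left_s = supply[0][1] if supply else 0
    while bi < len(demand) and sj < len(supply):
        b, _, i = demand[bi]
        s, _, j = supply[sj]
        if b <= s:
            break  # no remaining pair can trade at a positive gain
        units = min(left_b, left_s)
        if units > 0:
            trades.append((i, j, units))
            total += units * (b - s)
        left_b -= units
        left_s -= units
        if left_b == 0:  # buyer satisfied, move to the next highest bid
            bi += 1
            left_b = demand[bi][1] if bi < len(demand) else 0
        if left_s == 0:  # seller sold out, move to the next lowest ask
            sj += 1
            left_s = supply[sj][1] if sj < len(supply) else 0
    return total, trades

# Hypothetical market: (reserve price, quantity) per agent
buyers = [(10, 2), (7, 1), (4, 3)]
sellers = [(3, 2), (6, 2), (9, 1)]
surplus, trades = max_total_surplus(buyers, sellers)
```

Here three units change hands: two between the highest bidder and the cheapest seller and one between the next pair, after which the remaining buyer values the good below every remaining seller's reserve price.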
5 Results and Discussion
Third-party brushing platforms profit by exchanging information between brushing customers and merchants. To obtain false transaction information, the author entered a third-party brushing platform under a brushing identity and released billing information through it. The author then collected the comments and transaction records of the fake trading products. In addition, to collect data on normal trading commodities, the author chose official Tmall flagship stores with a high reputation (such as Hailan House, ONLY, VERO MODA and Uniqlo) and used their product reviews and transaction records as the training set for regular trading products. On this basis, the author collected nearly 130,000 reviews and the transaction records of the most recent month of each product as the input data set of the recognition model. After normalizing the data, an independent-samples t-test was performed, and the results are shown in Tab. 1.
The convergence of the algorithm is easy to see in Fig. 5. With the number of iterations set to 80, problems of the scale discussed here converge well to the optimal value. As the scale of the problem increases, the maximum number of iterations can be adjusted according to the specific situation.
Taking the dynamic bidding market as an example, in K transaction cycles, N transaction agents bid on M car brands, and the matching agent calculates and matches the bids based on the matching transaction model and algorithm. The trading agents are risk-neutral, and all participate in bidding in a random optimal way. According to the microstructure and dynamic trading mechanism, a market equilibrium forms easily when a single type of commodity is bid on; however, when multiple types of goods are matched at the same time, the market state becomes very complicated. Therefore, we designed market price dynamic fluctuation and equilibrium experiments for single commodities and for multiple types of commodities.
Table 1: Summary statistics of the characteristics
| Characteristic | Is it false | N | Mean | Standard deviation | Standard error of the mean |
|---|---|---|---|---|---|
| Store registration time | 0 | 125 | 0.479 | 0.587 | 0.0534 |
| Store registration time | 1 | 85 | 0.865 | 0.927 | 0.1023 |
| Refund dispute rate | 0 | 116 | 0.4789 | 0.0167 | 0.00145 |
| Refund dispute rate | 1 | 78 | 0.8675 | 1.234 | 0.1356 |
| Product review rate | 0 | 118 | 0.2689 | 0.0428 | 0.00256 |
| Product review rate | 1 | 78 | 0.4456 | 1.267 | 0.1543 |
| Single product review ratio | 0 | 119 | 0.578 | 0.0278 | 0.00267 |
| Single product review ratio | 1 | 77 | 1.036 | 1.0387 | 0.1156 |
| Collection rate | 0 | 116 | 0.234 | 0.3768 | 0.03345 |
| Collection rate | 1 | 75 | 0.367 | 1.467 | 0.1678 |
| Repeat review rate | 0 | 115 | 0.3567 | 0.265 | 0.0243 |
| Repeat review rate | 1 | 82 | 0.6754 | 1.675 | 0.1864 |
| Average comment length | 0 | 122 | 0.4876 | 0.421 | 0.0375 |
| Average comment length | 1 | 82 | 0.7894 | 1.234 | 0.1365 |
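The independent-samples t-test mentioned with Tab. 1 can be approximated from the reported summary statistics alone. The sketch below is a minimal illustration using the "Store registration time" rows (N, mean and standard deviation for groups 0 and 1). It assumes Welch's unequal-variance form of the test, since the paper does not say which variant was used, and the function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean0, sd0, n0, mean1, sd1, n1):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom,
    computed from group means, standard deviations and sample sizes."""
    var0 = sd0 ** 2 / n0  # squared standard error of group 0
    var1 = sd1 ** 2 / n1  # squared standard error of group 1
    t = (mean1 - mean0) / math.sqrt(var0 + var1)
    df = (var0 + var1) ** 2 / (var0 ** 2 / (n0 - 1) + var1 ** 2 / (n1 - 1))
    return t, df


# "Store registration time" rows of Tab. 1 (group 0 vs. group 1).
t, df = welch_t(0.479, 0.587, 125, 0.865, 0.927, 85)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large absolute t value relative to the degrees of freedom suggests that the two groups differ on that characteristic. A p-value would additionally require the t distribution, for example from a statistics library.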
Figure 5: Variation curve of individual target value for 80 iterations under 80 periods
In experiment 1, set N = 26, K = 2, and M = 1; a total of 120 bids were made. In experiment 2, set N = 26, K = 36, and M = 5; a total of 140 bids were made. Bids are matched according to the bid price, and the matching transaction price is calculated. In addition, based on the actual transaction data of 400 groups from an automobile trading market, the standard deviation of the matching transaction price is calculated.
Let EquTe represent the degree of equilibrium of market prices. Then, according to the trading entropy and Walrasian equilibrium, EquTe can be defined as the probability of the occurrence of an equilibrium trading price:
$$\mathrm{EquTe} = \frac{1}{K} \sum_{k=1}^{K} \frac{p_k - E_k(V_k)}{E_k(V_k)} \tag{11}$$
Here, $E_k(V_k)$ is the expectation of the final value of the commodity in trading cycle k, $p_k$ is the equilibrium price at market clearing, K is the number of rounds of the trading cycle, Time represents the trading cycle, and Price Diff represents the current market transaction price correction. The experimental results are shown in Figs. 6 and 7.
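To make the definition of EquTe concrete, the sketch below reads Eq. (11) as the mean relative deviation of the clearing price from the expected final value over the K trading cycles. That reading is an assumption: the operator between the two terms of the numerator is taken to be a subtraction. The function name `equ_te` and the sample numbers are hypothetical.

```python
def equ_te(clearing_prices, expected_values):
    """Mean over K cycles of (p_k - E_k(V_k)) / E_k(V_k).

    ASSUMPTION: the numerator of Eq. (11) is a difference between the
    clearing price and the expected final value.
    """
    if not clearing_prices or len(clearing_prices) != len(expected_values):
        raise ValueError("need one expected value per clearing price")
    k_cycles = len(clearing_prices)
    total = sum((p - e) / e for p, e in zip(clearing_prices, expected_values))
    return total / k_cycles


# Two hypothetical cycles: the first clears exactly at the expected value,
# the second clears 20% above it.
value = equ_te([10.0, 12.0], [10.0, 10.0])
print(value)
```

Under this reading, a value near zero means that clearing prices track the expected values closely across cycles.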
Figure 6: Price fluctuation and equilibrium of a single commodity in 2 rounds of bidding transactions
Fig. 6 shows that in the trading cycle, when the price correction value of the commodity market bidding is at a medium level, the probability that the transaction price reaches an equilibrium is greater. In fact, the market price dispersion measure is close to the ideal value. After multiple trading cycles (L = 2), the peak sequence of price fluctuations forms a Walrasian equilibrium curve that matches the trading market. Fig. 7 shows that when there are multiple types of commodities (M = 80) participating in multiple rounds of bidding transactions (L = 80), the equilibrium point sequence of the various types of commodity transactions forms multiple peaks, which better reflects market competition and balance. The above experimental results show that in the multiagent matching trading model, the price correction is insensitive to individual trading agents, while the market as a whole is sensitive to it. Through multiagent bidding that continuously adjusts the transaction price, the market can reach an equilibrium with better efficiency. The experimental results show that the matching transaction model is cost effective and has good market efficiency.

According to the market prediction model, the transaction price fluctuation trend of a certain car brand in the auto trading market is predicted. There are 27 risk-neutral agents participating in the bidding for a certain type of car (using the buyer's market as an example). The actual transaction price, average bid price and forecast transaction price fluctuation trends over the 7 matching transaction cycles are shown in Fig. 8.
Figure 7: Price fluctuation and equilibrium of commodities in 80 rounds of bidding transactions
Figure 8: Fluctuation curve of actual transaction price, average bid price and transaction predicted
price
Fig. 8 shows that the predicted price fluctuation lies roughly between the actual price level and the average bid level, which better reflects the price trend of this car brand in market matching transactions. Through market price prediction, the trading agent further adjusts its bidding strategy to form its initial trading price and belief.
6 Conclusion
The development of Internet technology and the popularization of networks have expanded the application range of data mining, and data mining is increasingly widely applied in e-commerce tools. This article uses data mining theory and methods together with dynamic pricing strategies to establish an e-commerce dynamic pricing model based on data mining. Based on the mechanism of the model, the auction mechanism is analyzed and discussed, and suggestions for improving pricing strategies are proposed. The model's comprehensive use of data mining in e-commerce dynamic pricing tools is broadly applicable to e-commerce enterprises and can help them improve customer satisfaction and economic efficiency. The e-commerce platform integrates the production and sales of the enterprise, and production and sales constrain each other. In studying the specific substitution effect in multiproduct dynamic pricing, we considered only the production constraints and did not closely integrate production planning with sales. How to adjust commodity prices according to changes in production plans is a question that requires further study.
Funding Statement: This work is supported by the 2020 scientific research planning project of the Jilin Provincial Department of Education: Analysis of the impact of industrial upgrading on employment of college students in Jilin Province (No. JJKH20200505JY).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
References
1. Hsu, L. F. (2016). E-commerce model based on the internet of things. Advanced Science Letters, 22(10),
3089–3091. DOI 10.1166/asl.2016.7992.
2. Huang, P. (2016). Research on the construction mode of e-commerce business platform in higher vocational
colleges based on the government purchase of public service theory. Electronic Test, 16(8X ), 170–171. DOI
10.16520/j.cnki.1000-8519.2016.16.093.
3. Zhang, H., Tian, Y., Zhang, G. (2016). Dynamic option pricing model based on the realized-GARCH-NIG
approach. Open Journal of Social Sciences, 4(3), 66–71. DOI 10.4236/jss.2016.43011.
4. Kraines, S., Koyama, M., Weber, C. (2017). A collaborative platform for sustainable building design based
on model integration over the internet. International Journal of Environmental Technology & Management,
5(2), 135–161. DOI 10.1504/IJETM.2005.006847.
5. Kamalapurkar, R., Klotz, J. R., Walters, P., Dixon, W. E. (2018). Model-based reinforcement learning
in differential graphical games. IEEE Transactions on Control of Network Systems, 5(1), 423–433. DOI
10.1109/TCNS.2016.2617622.
6. Hu, C. (2016). Application of e-learning assessment based on AHP-BP algorithm in the cloud com-
puting teaching platform. International Journal of Emerging Technologies in Learning, 11(8), 27. DOI
10.3991/ijet.v11i08.6039.
7. Oliveira, S. M. D., Häkkinen, A., Lloyd-Price, J., Tran, H. Kandavalli, V. et al. (2016). Temperature-
dependent model of multi-step transcription initiation in Escherichia coli based on live single-cell measure-
ments. PLoS Computational Biology, 12(10), e1005174. DOI 10.1371/journal.pcbi.1005174.
8. Yun, Q. J., Fei, Z., Yue, Z. (2016). Change and prediction of the land use/cover in Ebinur Lake Wet-
land Nature Reserve based on CA-Markov model. Journal of Applied Ecology, 27(11), 3649–3658. DOI
10.13287/j.1001-9332.201611.027.
306 CMES, 2021, vol.127, no.1
9. Nuan, W., Zheng, H. L., Ling, P. Z. (2017). Deep reinforcement learning and its application on
autonomous shape optimization for morphing aircrafts. Journal of Astronautics, 38(11), 1153–1159. DOI
10.3873/j.issn.1000-1328.2017.11.003.
10. Wan, C., Li, T., Guan, Z. H. (2017). Spreading dynamics of an e-commerce preferential information
model on scale-free networks. Physica A Statistical Mechanics & Its Applications, 467, 192–200. DOI
10.1016/j.physa.2016.09.035.
11. Li, C., Cao, L., Chen, X. (2018). Cloud reasoning model-based exploration for deep reinforcement learn-
ing. Dianzi Yu Xinxi Xuebao/Journal of Electronics & Information Technology, 40(1), 244–248. DOI
10.11999/JEIT170347.
12. Zeigheimat, F., Ebadi, A., Rahmati-Najarkolaei, F., Ghadamgahi, F. (2016). An investigation into the effect
of health belief model-based education on healthcare behaviors of nursing staff in controlling nosocomial
infections. Journal of Education & Health Promotion, 5(1), 23–35. DOI 10.4103/2277-9531.184549.
13. Li, H. H., Cang, Y. C. (2016). GM (0,N) model-based analysis of the inuence factors of network english
learning platform. Journal of Grey System, 19(1), 31–40. DOI 10.30016/JGS.
14. Alladio, E., Giacomelli, L., Biosa, G., Corcia, D. D. Gerace, E. et al. (2018). Development and validation
of a partial least squares-discriminant analysis (PLS-DA) model based on the determination of ethyl
glucuronide (EtG) and fatty acid ethyl esters (FAEEs) in hair for the diagnosis of chronic alcohol abuse.
Forensic Science International, 282, 221–234. DOI 10.1016/j.forsciint.2017.11.010.
15. Zare, M., Ghodsbin, F., Jahanbin,I. (2016).The effect of health belief model-based education on knowledge
and prostate cancer screening behaviors: A randomized controlled trial. International Journal of Community
Based Nursing & Midwifery, 4(1), 57–68. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4709816/.
16. Qin, R., Zeng, S., Li, J. J. (2017). Parallel enterprises resource planning based on deep reinforcement learning.
Zidonghua Xuebao/Acta Automatica Sinica, 43(9), 1588–1596. DOI 10.16383/j.aas.2017.c160664.
17. Liang, M., Wang, B., Yan, T. (2017). Dynamic optimization of robot arm based on exible multi-body
model. Journal of Mechanical Science and Technology, 31(8), 3747–3754. DOI 10.1007/s12206-017-0717-9.
18. Li, L., Han, Y., Chen, W., Lv, C., Sun, D. et al. (2016). An improved wavelet packet-chaos model
for life prediction of space relays based on Volterra series. PLoS One, 11(6), e0158435. DOI
10.1371/journal.pone.0158435.
19. Ivan, C. F., Jones, E. S., Thiago, V. C. (2016). Development of a predictive control based on Takagi-Sugeno
model applied in a non-linear system of industrial refrigeration. Chemical Engineering Communications,
204(1), 39–54. DOI 10.1080/00986445.2016.1230850.
20. Wei, W. Z. (2018). Research on social responsibility of e-commerce platform. IOP Conference Series:
Materials Science and Engineering, 439(3), 32063. DOI 10.1088/1757-899X/439/3/032063.
21. Ge, F., Ding, X. (2016). Uncertain type of multiple-attribute electronic commerce investment decision
model based on the close degree of the scheme and its applications. iBusiness, 8(2), 31–35. DOI
10.4236/ib.2016.82004.
22. Hartadiyati, E., Rizqiyah, K., Wiyanto, Rusilowati, A., Prasetia, A. P. B. (2017). The integrated model of
sustainability perspective in spermatophyta learning based on local wisdom. Journal of Physics Conference,
895(1), 12051. DOI 10.1088/1742-6596/895/1/012051.
23. Shen, S., Zhu, D. H. (2017). Chinese place name recognition based on deep learning. Bei-
jing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 37(11), 1150–1155. DOI
10.15918/j.tbit1001-0645.2017.11.08.
24. Gang, J. T., Huang, L., Zhao, Z. W. (2015). Dynamic simulation of a SEIQR-V epidemic,
model based on cellular automata. Numerical Algebra Control & Optimization, 5(4), 327–337. DOI
10.3934/naco.2015.5.327.
25. Nikolaos, A., Christodoulou, N. E., Tousert, E. C. (2016). A modular repository-based infrastructure for
simulation model storage and execution support in the context of in silico oncology and in silico medicine.
Cancer Informatics, 2016(15), 219–235. DOI 10.4137/CIN.S40189.
26. Larson, D. B., Chen, M. C., Lungren, M. P., Halabi, S. S. Stence, N. V. et al. (2018). Performance of a
deep-learning neural network model in assessing skeletal maturity on pediatric handradiographs. Radiology,
287(1), 313–322. DOI 10.1148/radiol.2017170236.
CMES, 2021, vol.127, no.1 307
27. Ding, P., Li, Y. (2016). An electromechanical transient model of VSC and DC grid based on multi-rate
simulation method and simplied discrete newton method. Proceedings of the Chinese Society of Electrical
Engineering, 36(24), 6809–6819. DOI 10.13334/j.0258-8013.pcsee.160398.
28. Qu, S., Xi, Y., Ding, S. (2018). Image caption description of trafc scene based on deep learning. Journal of
Northwestern Polytechnical University, 36(3), 522–527. DOI 10.1051/jnwpu/20183630522.
29. Luo, N., Wang, X., Van, F. (2015). Integrated simulation platform of chemical processes based
on virtual reality and dynamic model. Computer Aided Chemical Engineering, 37, 581–586. DOI
10.1016/B978-0-444-63578-5.50092-X.
30. Salah, E. B., Jamila, E. A., Youssef, L. (2015). Learners’attitudes towardsextended-blended learning experi-
ence based on the S2P learning model. International Journal of Advanced Computer Science & Applications,
6(70), 78. DOI 10.14569/IJACSA.2015.061010.
31. Hsu, P. S. (2015). The design concept of self-determination based on-linelearning platform. Information
Japan, 16(9), 6531–6538. https://www.researchgate.net/publication/289210371_The_design_concept_of_
self-determination_based_on-line_learning_platform.
... One significant advantage of RL in dynamic pricing is its ability to balance exploration and exploitation. By experimenting with different pricing strategies, RL algorithms identify optimal price points that maximize long-term revenue while minimizing risks associated with customer dissatisfaction or loss of market share [1]. Additionally, RL integrates seamlessly with big data analytics, leveraging historical and real-time data to refine strategies continuously. ...
Article
Full-text available
Dynamic pricing has revolutionized the e-commerce industry by enabling businesses to adapt prices in real time to maximize revenue and customer satisfaction. This paper explores the application of reinforcement learning (RL) in dynamic pricing models, highlighting how RL can optimize pricing strategies by learning from historical and real-time data. The discussion includes an overview of traditional dynamic pricing methods, the advantages of RL in this context, implementation challenges, and real-world applications. The findings suggest that RL offers significant potential for improving pricing efficiency, enhancing customer experience, and driving competitive advantages in e-commerce.
... From the consumer side, it will have a negative impact on consumer psychology and their behaviour, as it will strengthen people's holdout mentality, which may cause market development to stagnate and corporate inventories to surge, which is detrimental to the development of the company and the industry, and ultimately makes the product and brand image damaged to a certain extent [17,18]. ...
Article
Full-text available
In the field of e-commerce, price wars have become a common means for firms to compete for market share. This paper takes price wars in e-commerce platforms as a background, and reviews the game theories used in them, as well as the ways in which firms adjust and optimize their pricing strategies in the context of dynamic competition. Based on the definitions of price wars e-commerce, sting models of static and dynamic games, the paper analyses the patterns of price wars and proposes suggested strategies for firms to cope with them, and illustrates the impact on market stability and firms’ long-term revenues. Finally, the paper discusses the impact of price wars on the long-term development of the industry, and puts forward corresponding policy recommendations and future plans, with the aim that through rational strategy optimization, enterprises can achieve sustainable competitive advantages and maximize their market performance in the midst of fierce price competition.
... Аналіз стратегій ціноутворення. До цієї групи варто віднести роботи, присвячені просторовому та гедоністичному моделюванню цін [8]; дослідженню динамічного ціноутворення [9]; оцінці оптимальної цінової стратегії для платформ електронної комерції [10]. Незважаючи на високі результати точності прогнозування, слід наголосити на локальному характері апробації розглянутих моделей глибокого навчання. ...
Article
The article is devoted to the study of the prospects of applying deep learning models in the field of economics and finance. The main types of deep learning architecture that have been applied in the economic sphere are identified. Based on the analysis of publications, the main areas of application of deep learning models in economics are identified, namely in the areas of macroeconomics and microeconomics for analyzing consumer behavior, pricing strategies, and competition. It is noted that most of the work on using machine learning models for market analysis relates to financial markets rather than commodity markets. More deep learning models must be developed for most goods and services markets. It has been established that the financial sector is one of the critical areas for using deep learning models. In the financial sector, deep learning is used for analyzing the situation in the financial sector and forecasting financial market indicators (stock prices, exchange rates, and cryptocurrencies); analysis of financial statements; analysis and management of risks (credit risk analysis, fraud detection, securities portfolio risk analysis, securities portfolio optimization), etc. The following areas of application of deep learning models are considered: financial market forecasting, foreign exchange market forecasting, algorithmic trading, credit risk analysis and assessment, and fraud detection. The author identifies several problems and limitations of using deep learning models in economics and finance: lack of research comprehensiveness; problem of bringing to a single period; problem of availability and quality of source data; need for large amounts of data for model training; complexity of interpretation; risk of overfitting models; limited computing resources. It is concluded that deep learning has proven to be effective in forecasting economic indicators by analyzing large and complex data sets to identify patterns and create accurate forecasts. 
Keywords: economic research, artificial intelligence, machine learning, deep learning, macroeconomics, microeconomics, finance, market analysis.
... The outcomes pointing to major effects on day-to-day management functions, such as voucher distribution, which affects about 10% of total revenue, and the potential for AI-driven dynamic pricing optimization. In the same subject, like in [26], author proposed an e-commerce dynamic pricing model based on the theory and practices of reinforcement learning technology, With the goal of enhancing pricing strategies, improving consumer happiness, and increasing economic efficiency. In online retail environments, the model aims to improve both economic efficiency and customer satisfaction. ...
Article
Full-text available
In the world of digital commerce, Artificial intelligence (AI) is starting to shift the game and transform how organizations run. AI benefits firms and consumers both in retail and digital commerce. This review provides a comprehensive analysis of the critical role AI plays in fostering value creation in the digital commerce industry, with a particular emphasis on the ways in which task and information complexity affect the application of AI technologies. The review then looks at AI possibilities in digital commerce from a variety of angles, such as supply chain efficiency, cost savings, product recommendation, enhanced customer experience and marketing plans. The aim of this review is to exploring how AI applications create value in e-commerce. The approach that was employed described the search technique for locating pertinent academic sources and was based on a survey of the literature on recent 15 studies about the effect of AI for Value Creation in Digital Commerce. Finally, the findings demonstrate that utilizing AI as an advanced instrument in the digital commerce sector appears to be a positive move since it applying AI may foster creativity, improve decision-making, and enhance overall marketing performance.
Article
Full-text available
In an era defined by data, Big Data and Predictive Analytics have become indispensable tools for driving economic growth, innovation, and resilience. For Indonesia, one of Southeast Asia's most dynamic digital economies, these technologies offer a transformative pathway to industrial modernization and global competitiveness. With over 212 million internet users and a digital economy projected to hit $146 billion by 2025, Indonesia is poised to harness the power of data to revolutionize sectors such as finance, healthcare, e-commerce, and manufacturing (Antara News, 2022). This study delves into the multifaceted landscape of Big Data in Indonesia, offering a comprehensive analysis of its economic potential and implementation challenges. It highlights how predictive analytics is reshaping industries, enabling businesses to optimize supply chains, enhance customer experiences, and mitigate risks with unprecedented precision. At the same time, it addresses pressing concerns such as data privacy, cybersecurity vulnerabilities, and the ethical implications of AI-driven decision-making. To unlock the full potential of Big Data, this study proposes actionable policy recommendations, including investments in data infrastructure, the development of ethical AI frameworks, and the expansion of STEM education and workforce training programs. Indonesia can create a long-term data ecosystem that balances innovation and responsibility by encouraging collaboration among government, industry, and academics. As Indonesia stands at the crossroads of the Fourth Industrial Revolution, the strategic integration of Big Data and Predictive Analytics is no longer optional-it is imperative. This study serves as a roadmap for Indonesia to harness the power of data, ensuring that these technologies drive not only economic growth but also inclusive development and digital resilience in an increasingly data-driven world.
Conference Paper
Modern dynamic pricing has emerged as a computer-automated strategy that has revolutionized pricing methodologies and consumer interactions across various industries. This paper delves into the intricate domain of dynamic pricing, presenting an overview of its significance and the complex factors influencing its implementation. Within the realms of computer science, this subject involves heuristic development, data science, and the design of robust system architectures. The paper extensively examines the application of dynamic pricing in critical sectors such as e-commerce, travel, energy, electric vehicle charging, cloud computing, and ride-sharing services. It highlights the strategies and algorithms employed in each sector, showcasing their effectiveness in adapting to dynamic market conditions. The pivotal role of competitive forces, demand dynamics, inventory management, and price discrimination in shaping dynamic pricing strategies is explored in depth. Furthermore, this paper identifies promising avenues for future research in dynamic pricing, spanning industries such as transportation, energy, online platforms, industrial parks, automated machine learning, emerging technologies, and real-time pricing. It serves as a comprehensive guide to the contemporary state-of-the-art in dynamic pricing, shedding light on its multifaceted nature, broad applications, and exciting prospects for advancement across diverse industries.
Article
Full-text available
The realistic predicament of social responsibility of e-commerce platform is that illegal and bad information is flooding, information security and protection are prominent, and public education and supervision are weak. The purpose of this thesis was to analyzes the social responsibility of e-commerce platform by adopting international standard ISO 26000, and explores the main reasons of the dilemma of social responsibility of e-commerce platform, and then make some suggestions on how to improve the social responsibility of e-commerce platform.
Article
Full-text available
It is a hard issue to describe the complex traffic scene accurately in computer vision. The traffic scene is changeable, which causes image captioning easily interfered by light changes and object occlusion. To solve this problem, we propose an image caption generation model based on attention mechanism. Combining convolutional neural network (CNN) and recurrent neural network (RNN) to generate an end-to-end description for traffic images. To generate a semantic description with distinct degree of discrimination, the attention mechanism is applied to language model. Using Flickr8K,Flickr30K and MS COCO benchmark datasets to validate the effectiveness of our method. The accuracy is promoted maximally by 8.6%,12.4%,19.3% and 21.5% in different evaluation metrics. Experiments show that our algorithm has good robustness in four different complex traffic scenarios, such as light change, abnormal weather environment, road marked target and various kinds of transportation tools. © 2018, Editorial Board of Journal of Northwestern Polytechnical University. All right reserved.
Article
Full-text available
In present condition, culture is diminished, the change of social order toward the generation that has no policy and pro-sustainability; As well as the advancement of science and technology are often treated unwisely so as to excite local wisdom. It is therefore necessary to explore intra-curricular local wisdom in schools. This study aims to produce an integration model of sustainability perspectives based on local wisdom on spermatophyta material that is feasible and effective. This research uses define, design and develop stages to an integration model of sustainability perspectives based on local wisdom on spermatophyta material. The resulting product is an integration model of socio-cultural, economic and environmental sustainability perspective and formulated with preventive, preserve and build action on spermatophyta material consisting of identification and classification, metagenesis and the role of spermatophyta for human life. The integration model of sustainability perspective in learning spermatophyta based on local wisdom is considered proven to be effective in raising sustainability's awareness of high school students.
Article
Reinforcement learning which has self-improving and online learning properties gets the policy of tasks through the interaction with environment. But the mechanism of "trial-and-error" usually leads to a large number of training episodes. Knowledge includes human experience and the cognition of environment. This paper tries to introduce the qualitative rules into the reinforcement learning, and represents these rules through the cloud reasoning model. It is used as the heuristics exploration strategy to guide the action selection. Empirical evaluation is conducted in OpenAI Gym environment called "CartPole-v2" and the result shows that using exploration strategy based on the cloud reasoning model significantly enhances the performance of the learning process.
Article
This paper considers a class of simplified morphing aircraft and autonomous shape optimization for aircraft based on deep reinforcement learning is researched. Firstly, based on the model of an abstract morphing aircraft, the dynamic equation of shape and the optimal shape functions are derived. Then, by combining deep learning and reinforcement learning of deterministic policy gradient, we give the learning procedure of deep deterministic policy gradient(DDPG).After learning and training for the deep network, the aircraft is equipped with higher autonomy and environmental adaptability, which will improve its adaptability, aggressivity and survivability in the battlefield. Simulation results demonstrate that the convergence speed of learning is relatively fast, and the optimized aerodynamic shape can be obtained autonomously during the whole flight by using the trained deep network parameters.
Article
Based on recurrent neural network and the nature of Chinese word and character, the input and output of place name recognition task were redefined and a label model of recurrent network was proposed for Chinese character level based on deep learning method. Compared with recurrent network based on word level, the model proposed based on Chinese character level in this paper, achieves significant improvement on precision, recalling and F value, the F value gets an improvement of 2.88%. When place names contain rare words, the model can improve the F value more to 26.41%. © 2017, Editorial Department of Transaction of Beijing Institute of Technology. All right reserved.
Article
Traditional enterprise resource planning (ERP) usually adopts static business processes design and does not take the key role of "human" into consideration. It rarely involves the systematic process modeling, which makes it impossible to tackle the management complexity of modern enterprises. Considering the big data driven environment of modern enterprises, we utilize the ACP (Artiflcial societies, computational experiments, parallel execution) theory integrated with deep reinforcement learning approaches to establish a parallel management system for modern ERP management. We flrst propose a framework for ERP systems based on multi-agent technology where a sequential game model is included. Then, we seek for the optimal strategy using a deep reinforcement learning based neural network. Our proposed framework and approaches can well deal with uncertainty, diversity and complexity of modern ERP systems.
Article
The chronic intake of an excessive amount of alcohol is currently ascertained by determining the concentration of direct alcohol metabolites in the hair samples of the alleged abusers, including ethyl glucuronide (EtG) and, less frequently, fatty acid ethyl esters (FAEEs). Indirect blood biomarkers of alcohol abuse are still determined to support hair EtG results and diagnose a consequent liver impairment. In the present study, the supporting role of hair FAEEs is compared with that of indirect blood biomarkers in the contexts in which hair EtG interpretation is uncertain. Receiver Operating Characteristic (ROC) curves and multivariate Principal Component Analysis (PCA) demonstrated a much stronger correlation of EtG results with FAEEs than with any single indirect biomarker or their combinations. Partial Least Squares Discriminant Analysis (PLS-DA) models based on hair EtG and FAEEs were developed to maximize the biomarkers' information content on a multivariate background. The final PLS-DA model yielded 100% correct classification on a training/evaluation dataset of 155 subjects, including both chronic alcohol abusers and social drinkers. Then, the PLS-DA model was validated on an external dataset of 81 individuals, providing optimal discrimination ability between chronic alcohol abusers and social drinkers in terms of specificity and sensitivity. The PLS-DA scores obtained for each subject, with respect to the PLS-DA model threshold that separates the probabilistic distributions for the two classes, furnished a likelihood ratio value, which in turn conveys the strength of the experimental data's support for the classification decision, within a Bayesian logic. Typical borderline real cases from daily work are also discussed.
Article
Purpose: To compare the performance of a deep-learning bone age assessment model based on hand radiographs with that of expert radiologists and that of existing automated models. Materials and Methods: The institutional review board approved the study. A total of 14 036 clinical hand radiographs and corresponding reports were obtained from two children's hospitals to train and validate the model. For the first test set, composed of 200 examinations, the mean of bone age estimates from the clinical report and three additional human reviewers was used as the reference standard. Overall model performance was assessed by comparing the root mean square (RMS) and mean absolute difference (MAD) between the model estimates and the reference standard bone ages. Ninety-five percent limits of agreement were calculated in a pairwise fashion for all reviewers and the model. The RMS of a second test set composed of 1377 examinations from the publicly available Digital Hand Atlas was compared with published reports of an existing automated model. Results: The mean difference between bone age estimates of the model and of the reviewers was 0 years, with a mean RMS and MAD of 0.63 and 0.50 years, respectively. The estimates of the model, the clinical report, and the three reviewers were within the 95% limits of agreement. RMS for the Digital Hand Atlas data set was 0.73 years, compared with 0.61 years of a previously reported model. Conclusion: A deep-learning convolutional neural network model can estimate skeletal maturity with accuracy similar to that of an expert radiologist and to that of existing automated models. © RSNA, 2017.
Article
This paper examines how torsional stiffness affects flexible joints, and the dynamic optimization of a six-degree-of-freedom industrial robot arm. The design optimization of the robot arm is investigated based on the rotor-torsional spring model and the finite element method. The flexible multi-body dynamic model of the robot arm is established by considering the flexible characteristics of arms and joints, and the natural frequencies of the robot arm are calculated to obtain the torsional stiffness of the flexible joints. The natural frequencies increase gradually as joint stiffness improves. Using the established dynamic model, topology optimization of the robot arm is carried out with light weight as the design goal and total displacement as the constraint. The tare-load ratio and dynamic performance of the optimized robot arm are significantly enhanced compared with the original design model. This research can provide a theoretical basis for the dynamic optimization and upgrading of lightweight robot arms.