Digital-Physical Parity for Food Fraud Detection
Sin Kuang Lo1,2(B
1,2, Chen Wang1, Ingo Weber1,2, Paul Rimba1,
Qinghua Lu1,2, and Mark Staples1,2
1Data61, CSIRO, Sydney, Australia
2School of Computer Science and Engineering, UNSW, Sydney, Australia
Abstract. Food fraud has an adverse impact on all stakeholders in the
food production and distribution process. Lack of transparency in food
supply chains is a strong factor contributing to food fraud. With lim-
ited transparency, the insights on food supply chains are fragmented,
and every participant has to rely on trusted third parties to assess food
quality. Blockchain has been introduced to the food industry to enable
transparency and visibility, but it can only protect the integrity of a
digital representation of physical food, not the physical food directly.
Tagging techniques, like barcodes and QR codes that are used to con-
nect the physical food to its digital representation, are vulnerable to
attacks. In this paper, we propose a blockchain-based solution to link
physical items, like food, to their digital representations using physical
attributes of the item. This solution is generic in its support for diﬀerent
methods to perform the physical checks; as a concrete example, we use
machine learning models on visual features of food products, through
regular and thermal photos. Furthermore, we use blockchain to intro-
duce a reward system for supply chain participants, which incentivizes
honesty and supplying data. We evaluate the technical feasibility of com-
ponents of this architecture for food fraud detection using a real-world
scenario, including machine-learning models for distinguishing between
grain-fed and grass-fed beef.
Keywords: Blockchain · Machine learning · Food fraud
1 Introduction
Food fraud is a US$40 billion per year industry, which has a significant negative
impact on all stakeholders in the food production and distribution process. Food
fraudsters leverage consumers’ perceived safety, trust, and value associated with
prominent brands and certiﬁcations to charge premium prices for inferior, or even
dangerous, food products. However, it is diﬃcult, if not impossible, to eliminate
the fraud industry considering the big proﬁt it generates for beneﬁciaries.
Springer Nature Switzerland AG 2019
J. Joshi et al. (Eds.): ICBC 2019, LNCS 11521, pp. 65–79, 2019.
Lack of transparency and visibility of the food supply chain is a strong factor
contributing to food fraud. Food supply chains are complex multi-party systems
that involve diﬀerent participants, such as farmers, food production processors,
and retailers. However, regulations in major markets, such as the US, the EU, and
China generally only require one-up and one-down traceability regarding supply
chain participants. In a food supply chain with limited transparency, insights
are fragmented and thus each consumer or producer has to rely on trusted third
party agencies to monitor food quality across the whole food supply chain.
Blockchain technology has been applied to food supply chains [3,5] to enable
transparency and visibility because every participant within the blockchain net-
work has access to all the records and historical movements of information in
the entire system. For instance, a blockchain-based traceability system has been
proposed for tracking food products along international supply chains. However,
blockchain only guarantees the integrity of the digital representation of
food but not the integrity of the physical attributes of a food product.
Commonly used tagging technologies, like barcodes, serial numbers, and QR
codes can connect physical food to its digital representation. Such techniques
are vulnerable to attacks, e.g., a counterfeit product with authentic packaging
and tag1. To counter fraud, it is necessary to enable supply chain participants
to verify the connection between the food itself and the digital information.
Fig. 1. Conceptual overview of food fraud detection. Labels in the figure:
Physical World (Off-chain); (RFID, Barcode, QR code); Blockchain (On-chain);
links marked (a).
A conceptual overview of the relation between a food product, its physical
label, ID, attributes and a digital representation is shown in Fig. 1. The integrity
we seek to establish is between the physical food product and its digital represen-
tation. We approach this goal by connecting the physical item via its label and
physical attributes to the digital system (links marked (a) in Fig. 1).
Conceptually, this architecture takes an approach similar to multi-factor
authentication and biometric cryptosystems, where physiological and behavioral
features are used to identify a user. The main difference is that the physical
characteristics of a food product are rich and diverse, and thus difficult to
capture using a closed system.
Stable isotope ratio analysis is one technique in this direction: chemical
isotope levels in physical food products, e.g., in the muscles of cows, can be
analyzed to verify claims of origin. However, isotope analysis is not
economically feasible at scale for tackling food fraud. Similarly, a national
lab for drug substance testing recently tested honey samples sourced from
supermarkets and markets in Australia; adulteration was detected in 18% of the
38 samples. While independent labs can provide test results, confirming their
findings takes time. Test methods need to be fast and cheap enough to enable
widespread use.
We propose using machine learning techniques to assess food products
through exploiting distinct visual features to verify their physical attributes.
Machine learning models provide an eﬀective way to encapsulate knowledge of
diﬀerentiating objects based on features. Data can be collected from cameras,
smartphones, X-ray, and ultrasound scanners. By training models and perform-
ing feature extraction across training data, we can verify the physical attributes
of a food product against its digital information. Recipients in the supply chain
can check food authenticity based on features they collect.
In this paper, we make the following contributions:
– We propose a blockchain-based architecture that enables checking the parity
of physical items and their digital representation. The proposal is generic in
its support for methods to check physical attributes against the digital claims
about these attributes.
– As one concrete method, we propose machine learning on distinct visual
features of food. With this method, machine learning models make knowledge
about food fraud accessible to consumers.
– We devise incentive mechanisms that distribute coins to food supply chain
participants to reward honest behavior and supplying data. These rewards can
further be shared with machine learning developers, who contribute models
that allow fraud detection.
We evaluate the technical feasibility of components of this architecture for food
fraud detection using a real-world scenario, including machine-learning models
for distinguishing between grain-fed and grass-fed beef. To the best of our
knowledge, there is no related work that uses machine learning and blockchain to
link the physical attributes of physical food products to their digital
representations.
The remainder of the paper is organized as follows. An overview of the
generic architecture is given in Sect. 2, followed by the resulting system design
in Sect. 3. Section 4 provides technical details of the system. The evaluation with
a beef case study is described in Sect. 5. Finally, Sect. 6 concludes the paper and
outlines future work.
2 A Generic Architecture for Food Digital-Physical Parity
The generic architecture for food digital-physical parity is shown in Fig. 2. The
architecture aims to connect the physical items in a supply chain with their
digital representation. The attributes of a physical item (e.g., visual features,
geographical location and chemical composition) can be assessed via various
techniques. There is no perfect solution that ﬁts all diﬀerent kinds of food prod-
ucts: for some, it might be more suitable to check their visual attributes but
for others it might only be possible to check via their chemical composition.
Hence, our generic architecture allows the use of diﬀerent types of assessment
methods to verify the claimed attributes on physical products. For the digital
representation of the physical items, blockchain is chosen due to its immutable
and transparent nature and ability to span across dynamic networks of parties.
Functionally, blockchain can be used as a data storage and a computational
infrastructure in a software system. As a data storage, blockchain stores all
transactions that have ever occurred in the network, which cannot be deleted
or modiﬁed. As a computational infrastructure, blockchain can run programs
called smart contracts, the result of which is stored in the distributed trusted
Fig. 2. A generic architecture for food digital-physical parity
The methods for assessing claimed attributes require computational resources
that blockchains currently cannot provide. Therefore, the assessment is
computed off-chain. The data required to execute the assessment are also
stored off-chain.
Information related to the food supply chain, such as the claimed attributes
of food products and the details of the stakeholders in the supply chain, is
recorded on the blockchain. The transactions on the blockchain enable the recipients
to trace the origin of the food product and the processing activities performed
on it in the corresponding supply chain. An incentive mechanism is deployed as a
smart contract on the blockchain to motivate participants to contribute actively
to the supply chain. The results from the assessment of the claimed attributes
are used as the input for the incentive mechanism to determine the reward to the
participants.
As blockchain is unable to communicate with the external world, an oracle
is required to enable the interaction between blockchain and the oﬀ-chain com-
ponents. The oracle is connected to a unifying API, which can interact with the
diﬀerent attributes assessment methods.
Our generic architecture contains a decision-making module that decides
whether an item is considered authentic: it takes the assessments from the
available methods and decides whether the claimed physical attributes match
the observed ones well enough. The decision-making module can be deployed
either on-chain or off-chain, depending on the specific situation. Where the
decision is made in a proprietary, confidential way by a single authority,
the decision-making module would be kept off-chain. If the decision making has
been established by consensus among a set of authorities, and its workings need
not be kept confidential, the module can be kept on the blockchain
to ensure the transparency of its rules and implementation.
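The decision step can be illustrated with a minimal sketch. It assumes that each assessment method reports a match score between 0 and 1 for the claimed attribute, plus a relative weight. The names `Assessment` and `decide_authentic`, the weighted-mean rule, and the 0.8 threshold are illustrative assumptions, not part of the proposed architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    """Result of one assessment method (hypothetical structure)."""
    method: str          # e.g. "visual-ml" or "isotope-analysis"
    match_score: float   # agreement between claimed and observed attribute, 0..1
    weight: float = 1.0  # relative trust placed in this method (assumption)


def decide_authentic(assessments: list[Assessment], threshold: float = 0.8) -> bool:
    """Return True if the weighted mean match score reaches the threshold."""
    if not assessments:
        raise ValueError("at least one assessment is required")
    total_weight = sum(a.weight for a in assessments)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_mean = sum(a.match_score * a.weight for a in assessments) / total_weight
    return weighted_mean >= threshold
```

Whether such a rule runs on-chain or off-chain does not change its logic; only the transparency of the weights and threshold differs.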
2.2 Food Fraud Detection
Food fraud encompasses the deliberate and intentional substitution, addition, or
misrepresentation of food, food ingredients or food packaging for economic gain.
Introducing blockchain into a supply chain together with the Internet of Things
(IoT) enables real-time monitoring of physical movement via tags on products,
with immutable food provenance recorded on the blockchain. In addition, smart
contracts on the blockchain can be used to enable compliance checking. The
provenance information of the authentic food can be used to detect simulation,
where the illegitimate product is designed to look like the legitimate product
with the same label, or overrun, where the legitimate product is made in excess
of production agreements. In these two types of food fraud, there are duplicated
labels. However, it is difficult to determine which item is authentic and which
is a simulation or overrun based on the digital representation only. Diversion
means the sale or distribution of legitimate products outside of intended
markets. If the conditions of sale and distribution associated with the products
are recorded on the blockchain, a smart contract can automatically check
compliance against legislation. However, other types of food fraud, like
adulteration, where a component of the finished product is fraudulent, cannot be
detected using provenance information, because the coupling between the physical
food and its digital representation is loose when only basic tagging techniques
are used. Adulteration becomes automatically detectable when the binding between
the physical food and its digital representation is tightened by any of the
attribute assessment methods.
3 Food Fraud Detection System Design
We instantiate a food fraud detection system design based on the proposed
generic architecture. The system helps to manage the knowledge of diﬀerenti-
ating fraud from non-fraud. The knowledge about food fraud is accumulated
through analysis of multimodal data (regular camera, thermal camera, and spe-
cially designed sensors) as well as the interaction between food producers, inde-
pendent food experts, and recipients in a food supply chain. A more detailed
machine learning assessment module of the system is shown in Fig. 3. Although
we demonstrate the system design using an image processing technique, the
architecture can use other machine learning techniques.
As discussed above, the existing methods of assessing the integrity of the
physical attributes of food products are either expensive (some require
complicated lab procedures) or vulnerable to attack (an adversary can easily swap
the RFID tag of a physical food product). This food fraud detection system
focuses on visible, distinguishable features of food products that can be captured
by a regular camera, thermal camera, or other specially designed sensors. Take
wild salmon as an example. Wild salmon get their characteristic colour from
what they eat: the unique colour reflects a diet of shrimp and krill. Salmon
from different geographic locations consume different proportions of
carotenoid-rich creatures, which affects how pink or red the salmon becomes.
Fig. 3. Machine learning assessment of our food fraud detection system. Labels
in the figure: Machine Learning on Visual Features; Model Database; Food Data.
The main components of this system are a data repository, a knowledge base
that captures the differences between fraudulent and non-fraudulent food
products, and a blockchain-based trust management module. The knowledge is
represented as machine learning models that use food product data collectible by
food suppliers and recipients to detect fraud. The data and models are often
provided by food suppliers or independent parties. A consumer can use
these models to check whether the food product she purchases is fraudulent.
There are incentives for a high quality food product supplier to provide data
containing identiﬁable features to diﬀerentiate their high quality products. Sim-
ilarly, food fraudsters also have the motivation to control and manipulate the
data/model in the system to confuse the consumers to gain beneﬁt.
The users in the system are:
1. Food product suppliers and experts who are knowledgeable about the diﬀer-
ence between fraudulent and authentic food products.
2. Recipients in a food supply chain who can collect data of a particular food
product.
3. Machine learning developers who provide services for recipients to check for
food fraud based on the data provided by the recipients.
3.2 Data Provenance
For food products with potential digital features, aggregating various test
methods for transparency and reproducibility is essential. As mentioned in
Sect. 2, we apply blockchain to track data and model provenance and to link
them to the responsible party for accountability.
All the datasets within the food database and the classiﬁers within the model
database are registered on the blockchain, thus, publicly available to all the
users of the system. Other than registration, all the activities conducted on the
datasets and the models have separate records on the blockchain. The tamper-
proof log of events on blockchain provides the connection between the data
(as the digital representation of the food product) and the historical activities
executed on the data. Machine learning models are viewed as a type of data.
A simplified data analytics life cycle is depicted in Fig. 4a. Food suppliers and
recipients contribute to this process by uploading batch data or new data items
of a food product. Machine learning developers contribute by cleaning the raw
data and training or re-training models that capture the knowledge of the food
experts. The data analysts register all analytic activities, for example, who
cleaned which dataset at what time. Every activity record on the blockchain is
signed by the person who conducted the activity using their digital signature.
Such information allows tracing back the data analysis activities that have been
conducted on a particular dataset. It allows the system users to track changes
to the food fraud detection knowledge represented by the data and the models,
and allows the system to quantify the contribution of individual activities
towards the final model.
The incentive mechanisms discussed in Sect. 3.3 use the data provenance
and food traceability information recorded on the blockchain to distribute
rewards to the contributors and honest participants in the supply chain.
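The signed, tamper-evident activity log described above can be sketched as follows. This is a simplified off-chain illustration, not the smart contract implementation: entries are hash-chained to mimic the blockchain's tamper-proof log, and HMAC-SHA256 with a per-user secret stands in for a real public-key digital signature. The class `ProvenanceLog` and all field names are hypothetical.

```python
import hashlib
import hmac
import json

FIELDS = ("actor", "activity", "dataset_id", "timestamp", "prev_hash")


def sign(record: dict, secret_key: bytes) -> str:
    """HMAC-SHA256 over canonical JSON (stand-in for a digital signature)."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(secret_key, payload, hashlib.sha256).hexdigest()


def link_hash(prev_hash: str, signature: str) -> str:
    """Hash that commits an entry to its predecessor."""
    return hashlib.sha256((prev_hash + signature).encode("utf-8")).hexdigest()


class ProvenanceLog:
    """Append-only log in which each entry commits to the previous one."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, actor: str, activity: str, dataset_id: str,
               timestamp: int, secret_key: bytes) -> dict:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        record = {"actor": actor, "activity": activity, "dataset_id": dataset_id,
                  "timestamp": timestamp, "prev_hash": prev_hash}
        signature = sign(record, secret_key)
        entry = {**record, "signature": signature,
                 "entry_hash": link_hash(prev_hash, signature)}
        self.entries.append(entry)
        return entry

    def history(self, dataset_id: str) -> list[tuple[str, str, int]]:
        """Who did what, and when, on a given dataset."""
        return [(e["actor"], e["activity"], e["timestamp"])
                for e in self.entries if e["dataset_id"] == dataset_id]

    def verify(self, keys: dict[str, bytes]) -> bool:
        """Recompute every signature and hash link; False on any mismatch."""
        prev_hash = "0" * 64
        for e in self.entries:
            record = {k: e[k] for k in FIELDS}
            expected_sig = sign(record, keys[e["actor"]])
            if (e["prev_hash"] != prev_hash
                    or not hmac.compare_digest(e["signature"], expected_sig)
                    or e["entry_hash"] != link_hash(prev_hash, expected_sig)):
                return False
            prev_hash = e["entry_hash"]
        return True
```

Calling `history("beef-v1")` on such a log answers the "who does cleaning on which dataset at what time" question, while `verify` fails as soon as any recorded activity is altered.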
3.3 Trust Management Through Incentive
We use blockchain to provide an incentive infrastructure for the food fraud detec-
tion system. With cryptocurrency, blockchain can also provide a trading infras-
tructure that enables contingent payment implemented as a smart contract for
trading items registered on the blockchain. In the context of incentive, honesty
and trust are the “tradable” items. We introduce two coins as the incentive
mechanisms. Contribution-coin is the incentive for ML scientists and experts to
use the system and share their knowledge and data. Honesty-coin is used as the
reward to the honest recipients in the supply chain.
Reward to Contribution. There are two mechanisms to reward contribution.
The first is a rule-based automated mechanism that rewards the users of the
platform based on their contribution. The contribution to the platform can be
roughly quantified based on metrics proposed by machine learning experts. The
value of data is determined by its size and diversity. A dataset has more value
if it differs from the data already in the system, for example, data from a new
food product or data on a new feature of a food product. A model with higher
accuracy has more value to the system. How to calculate the contribution based
on these or further metrics is left for future work. A point-to-point reward
model, as shown in Fig. 4a, will be implemented. The reward for a given model is
split between the developer who cleans the data and the developer who trains the
model if more than one developer is involved in the data analytics process. How
to split the reward is decided by the machine learning developers.
The second is a human-driven point-to-point reward mechanism, like the
digital reward systems on social media, where rewards can be sent between users
arbitrarily. The rewards will be subject to constraints, such as a monetary
limit. Since all reward transactions are recorded on the blockchain, they are
transparent for public auditing. For example, if ML scientist A is rewarded
x Contribution-coins by supplier B, a link is established between A and B with
weight x. If the supplier is a fraudster or is colluding with the ML scientist,
the resulting links are expected to show unusual patterns.
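The link-with-weight idea can be made concrete with a small sketch that aggregates recorded reward transfers into a weighted graph. The text does not specify how unusual patterns are detected, so the heuristic below (flagging heavy links that supply most of a receiver's rewards) is purely an illustrative assumption, as are the function names and thresholds.

```python
from collections import defaultdict


def build_reward_graph(transfers):
    """Aggregate (sender, receiver, amount) transfers into weighted links."""
    graph = defaultdict(int)
    for sender, receiver, amount in transfers:
        if amount <= 0:
            raise ValueError("reward amounts must be positive")
        graph[(sender, receiver)] += amount
    return dict(graph)


def suspicious_links(graph, share_threshold=0.8, min_weight=10):
    """Flag links that are heavy and dominate the receiver's income.

    Illustrative heuristic only; real auditing would need richer analysis.
    """
    received = defaultdict(int)
    for (_sender, receiver), weight in graph.items():
        received[receiver] += weight
    return {
        link for link, weight in graph.items()
        if weight >= min_weight and weight / received[link[1]] >= share_threshold
    }
```

Because every transfer is publicly recorded, any auditor could rebuild this graph and apply their own detection rule.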
Reward to Honesty. Food fraud detection primarily checks compliance
between the authenticity information on the blockchain and the food product data
uploaded by recipients. As shown in Fig. 4b, any participant in the supply chain
can report a potential fraud case via our system. If the food product
complies with the authenticity information provided by the farm, some
honesty-coins are rewarded equally to all the participants in the supply chain
according to the food traceability information on the blockchain. This
incentivized reporting mechanism motivates every participant in the same supply
chain to cross-check the product that arrives at their position.
However, a negative result of fraud detection does not indicate which
participant in the food supply chain is the fraudster. If the food product does
not comply with the authenticity information, a food expert is selected and
requested to validate the result. If the result is invalid, a machine learning
developer is alerted and requested to re-train the model. If the food expert
confirms the validity of the result, honesty-coins owned by all the previous
participants in the supply chain are deducted to pay the
reporter. This mechanism punishes all the participants in the supply chain as an
Fig. 4. Coins distribution: (a) simplified data analytics life cycle and
contribution-coin distribution; (b) simplified beef supply chain and
honesty-coin distribution. Labels in the figure include: Consumer / Supplier;
Add More Data.
incentive for honest participants to ﬁnd out the food fraudster or partner with
other participants with a higher balance of honesty coin.
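The reward and penalty flow for honesty-coins can be sketched as a single settlement function. The coin amounts, the reading of "previous participants" as those upstream of the reporter, and the rule that balances never go negative are assumptions made for illustration; `settle_report` is a hypothetical name.

```python
def settle_report(balances, chain, reporter, compliant,
                  expert_confirms=False, reward=1, penalty=1):
    """Return updated honesty-coin balances after one authenticity check.

    balances: mapping participant -> coins; chain: participants in
    supply-chain order; reporter: the participant who submitted the check.
    """
    updated = dict(balances)
    if compliant:
        # Product matches the claim: reward every participant equally.
        for participant in chain:
            updated[participant] = updated.get(participant, 0) + reward
        return updated
    if not expert_confirms:
        # Expert judged the detection result invalid: no coins move
        # (the model would be re-trained instead).
        return updated
    # Confirmed fraud: deduct from everyone upstream of the reporter.
    collected = 0
    for participant in chain[:chain.index(reporter)]:
        deduction = min(penalty, updated.get(participant, 0))  # never negative
        updated[participant] = updated.get(participant, 0) - deduction
        collected += deduction
    updated[reporter] = updated.get(reporter, 0) + collected
    return updated
```

Under these assumptions, a participant with a zero balance cannot be penalized further, which is one reason a higher honesty-coin balance can serve as a signal when choosing partners.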
3.4 Oﬀ-Chain Components
As the blockchain lacks the scalability to execute computation-heavy
processes on-chain and cannot store large amounts of raw data on-chain, we have
chosen to execute our machine learning models and store all their datasets
off-chain. The inherent disadvantage of using off-chain components is that there
is no native way for a blockchain to fetch data from off-chain. Hence,
interoperability between the off-chain components and the blockchain is achieved
with an oracle, which is needed to inject off-chain data into the blockchain.
Details are discussed in Sect. 4. The training datasets and the models are all
stored off-chain. The off-chain databases and modules are described below:
–Food database—A collection of datasets that show the diﬀerence between
fraudulent and non-fraudulent instances of various food products. Food fraud is
a huge market, and a continuous data collection mechanism is vital to help
detection techniques keep up with fraud techniques, using new data collection
devices and knowledge built on top of new datasets;
–Model database —Storing various classiﬁers and their associated metadata. We
assume that the classiﬁers are learned of the system, based on the datasets
within the food database. Every classiﬁer is aimed to distinguish fraudulent
food from non-fraudulent food under certain circumstances. The deﬁnition of
fraud is context speciﬁc. One example is to distinguish grain-fed beef from
grass-fed beef, as discussed in Sect. 5. All the newly uploaded product data
and the corresponding detection results are stored, and a machine learning
developer is selected regularly to double-check and confirm the results. The
resulting labeled datasets are used to improve the performance of the models.
–Model matchmaker—When a recipient submits a query with the data she
collects about a speciﬁc food product to the system, the system needs to
search for its model database for suitable models to answer the query. The
“model matchmaker” is responsible for this task.
–Out-of-distribution detector—A user-submitted query is likely to contain
patterns that have never been seen by a model. It is important to detect such a
mismatch to avoid arbitrary classifications. The “Out-of-distribution detector”
is responsible for checking whether the user-submitted data follows a different
distribution from the data on which a model was trained. It may trigger the
re-training of a model based on changing data, or the training of a new model
when the data distribution has changed significantly.
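As a rough illustration of the out-of-distribution check, the sketch below compares a submitted feature vector with per-feature statistics of the training data and flags inputs whose largest z-score exceeds a threshold. This is a deliberately simple stand-in and not the detector used in the system; the feature vectors (for example, summary statistics of an image), the class name, and the threshold of 3.0 are assumptions.

```python
import statistics


class ZScoreOODDetector:
    """Flags inputs that lie far from the training feature distribution."""

    def __init__(self, training_features, threshold=3.0):
        columns = list(zip(*training_features))
        self.means = [statistics.fmean(column) for column in columns]
        # Guard against zero spread so the division below is always defined.
        self.stdevs = [statistics.pstdev(column) or 1e-9 for column in columns]
        self.threshold = threshold

    def score(self, features):
        """Largest absolute z-score across all features."""
        return max(abs(x - mean) / stdev
                   for x, mean, stdev in zip(features, self.means, self.stdevs))

    def is_out_of_distribution(self, features):
        return self.score(features) > self.threshold
```

A counter over flagged queries could then decide when re-training, or training of a new model, is triggered.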
An overall deployment architecture of the food fraud detection system is shown
in Fig. 5. It has four main components: (1) a blockchain with registries and incen-
tives implemented in smart contracts; (2) classiﬁers and an out-of-distribution
detector which are hosted on AWS; (3) an Amazon S3 bucket for storing images
uploaded from the users and other datasets; and (4) an Oraclize-based oracle as
the main connector between the blockchain and AWS.
Fig. 5. Overall deployment diagram. Labels in the figure include: Incentive
coin contract; AWS Cloud; Out of distribution.
4.1 Registries as Smart Contracts
We implemented the food fraud detection system with Ethereum Rinkeby Test-
net. All the datasets within the food database and the classiﬁers within the
model database are registered to smart contracts called dataset registry and
model registry. The registries store the metadata of the datasets (description,
ownership, the location of the data) and the classiﬁers (description, ownership,
purpose, accuracy). The raw data (the machine learning models and photos of
food products) are stored oﬀ-chain in the Amazon S3 bucket. A registry, food
supply chain registry, is used to establish the relationship of all the participants
in the supply chain for a food product. The stakeholders of a supply chain play
an essential role in ensuring a customer gets a genuine product as described on
its label by the food supplier. Hence, all the details of related participants for
food products will be recorded on the blockchain. Another registry, food product
registry, is used to register the metadata of the food product image submitted
by the recipient. Metadata includes the checksum of the image and a pointer
that links to the raw image. Once a food product image is labeled and used to
train a model, it is removed from the food product registry and is registered in
the dataset registry as part of a dataset.
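The life cycle of an image record across the food product registry and the dataset registry can be sketched in plain Python. The actual registries are Ethereum smart contracts; the in-memory structures, method names, and example pointer below are hypothetical and only mirror the described behaviour.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ImageRecord:
    checksum: str  # checksum of the raw image
    pointer: str   # link to the raw image in off-chain storage


@dataclass
class Registries:
    # image_id -> ImageRecord (images submitted by recipients, not yet labeled)
    food_products: dict = field(default_factory=dict)
    # dataset_id -> {image_id: (ImageRecord, label)}
    datasets: dict = field(default_factory=dict)

    def register_image(self, image_id: str, checksum: str, pointer: str) -> None:
        if image_id in self.food_products:
            raise KeyError(f"image {image_id!r} is already registered")
        self.food_products[image_id] = ImageRecord(checksum, pointer)

    def promote_to_dataset(self, image_id: str, dataset_id: str, label: str) -> None:
        """Move a labeled image from the food product registry into a dataset."""
        record = self.food_products.pop(image_id)  # KeyError if unknown
        self.datasets.setdefault(dataset_id, {})[image_id] = (record, label)
```

Popping the record before adding it to a dataset ensures that an image is listed in exactly one of the two registries at any time.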
4.2 Incentive Mechanisms as Smart Contracts
We introduced contribution-coin and honesty-coin to incentivize the honesty of
food suppliers and the contribution of participants to the system. Both coins
are implemented in smart contracts that comply with the ERC20 token
standard2. The balances of all token owners are recorded by the token contract.
A ratioConfig(address tokenOwner, uint ratio) function allows
authorized admins (e.g., verified ML scientists) to adjust the ratio governing
the distribution of the tokens. For contribution-coin, the experts who verify the
accuracy of a newer model are permitted to set the coin distribution ratio
according to the task description entered by the ML scientist. Tasks
related to retraining a model include preprocessing and cleaning the dataset.
Contribution-coin is rewarded to the ML scientists for their contributions to
improve and maintain the machine learning model. Experts are also rewarded
for checking the result of the classiﬁer and label given on the training datasets
periodically. The initial distribution ratio of coin is set to 6:4 between the ML
scientist and experts that verify the classiﬁer and label. Honesty-coin is given
to participants of the supply chain for their honesty in supplying genuine food
products to the public. If the classifier determines that a food product is
genuine as stated on its food label, every participant in the supply chain for
that food product is rewarded Honesty-coin based on the weight predefined in the
food supply chain registry.
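The 6:4 split of contribution-coin can be illustrated with integer arithmetic, which is what a token contract would have to use. The function below is a Python sketch, not the contract code; the name `split_reward` and the choice to give any rounding remainder to the ML scientist are assumptions.

```python
def split_reward(amount: int, ratio: tuple[int, int] = (6, 4)) -> tuple[int, int]:
    """Split an integer coin amount between (ML scientist, expert).

    The expert share is rounded down; the remainder goes to the ML scientist
    (an assumption), so the two shares always sum to the full amount.
    """
    scientist_part, expert_part = ratio
    if amount < 0 or scientist_part < 0 or expert_part < 0:
        raise ValueError("amount and ratio parts must be non-negative")
    if scientist_part + expert_part == 0:
        raise ValueError("ratio parts must not both be zero")
    expert_share = amount * expert_part // (scientist_part + expert_part)
    return amount - expert_share, expert_share
```

For example, `split_reward(100)` yields 60 coins for the ML scientist and 40 for the experts, and a changed ratio set by an authorized admin would simply be passed as the second argument.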
4.3 Oﬀ-Chain Storage and Model Execution
The images of food products uploaded by recipients are stored in oﬀ-chain stor-
age. In our prototype, we have opted for AWS S3 due to its availability and
resilience of the stored data. A pointer that links to the image stored in AWS
S3 and the checksum of the image are recorded on the blockchain. Storing only
the metadata of images on the blockchain avoids the high cost and storage
limitations of keeping images on-chain while still allowing detection of
any tampering with the images. The pointer and the checksum of each image are
stored on-chain via a transaction.
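The pointer-plus-checksum pattern can be sketched in a few lines. The text does not name the checksum algorithm, so SHA-256 is an assumption here, and the bucket path is a made-up placeholder.

```python
import hashlib


def image_metadata(image_bytes: bytes, pointer: str) -> dict:
    """Metadata that would be recorded on-chain for an off-chain image."""
    return {"pointer": pointer,
            "checksum": hashlib.sha256(image_bytes).hexdigest()}


def is_untampered(image_bytes: bytes, metadata: dict) -> bool:
    """Re-hash the downloaded image and compare with the recorded checksum."""
    return hashlib.sha256(image_bytes).hexdigest() == metadata["checksum"]
```

Any change to the stored image, even a single byte, produces a different digest, so a mismatch against the on-chain record reveals tampering without the image itself ever being stored on the blockchain.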
2https://theethereum.wiki/w/index.php/ERC20 Token Standard.
The trained models and the training datasets are also stored off-chain. The
ML model executes on an Ubuntu 16.04 LTS AWS EC2 instance. The machine learning
model is written in Python with TensorFlow. This module provides two main
operations: (1) image data upload; and (2) image data classification. The latter
returns classification results together with saliency maps of the result, as
well as the confidence that the input image is within the knowledge scope of the
model used.
4.4 Oracle for On-Chain and Oﬀ-Chain Interoperation
An oracle is a mechanism that fetches data from the external world to the isolated
execution environment of a blockchain. We have selected Oraclize3 for our oracle
implementation. It provides various proof mechanisms to ensure the validity of
the information acquired from the data source. In our prototype, Oraclize is used
to obtain the result of the ML classiﬁer for food fraud detection and distribution
of the reward coins to the participants. It is triggered once the result is generated
from the model classiﬁer. The oracle will inject the result of the classiﬁcation
into the blockchain and use it as an input for the incentive coin distribution.
5 Example Case Study
We use a simple case study of a beef supply chain to demonstrate the feasibility
of our food fraud detection system, which is used to distinguish grain-fed beef
(lower quality and price) from grass-fed beef (higher quality and price).
Using the food fraud detection system, a beef recipient can verify the claimed
attributes of the beef product (written on the label or in the digital
representation). To use the system, the recipient needs to take a photo of a
full top view of the beef product under good lighting, currently without plastic
packaging or other covers, and upload the image to our system. Once the image
has been
uploaded successfully, the checksum of the image will be calculated, and a script
will invoke the Food Product Registry smart contract to add a data item. The
pointer to the image and the checksum of the image will be entered as input
data on the smart contract to record them on the blockchain. The oﬀ-chain ML
system is triggered once the data has been recorded on the blockchain. It
retrieves the input data from the blockchain (pointer and checksum) to download
the picture, verifies the checksum, and runs the image through the classifier
to check whether it shows grain-fed or grass-fed beef. The result is checked
against the claim, and a threshold-based decision is made and recorded in the
result database, triggering the incentive mechanisms.
5.1 Machine Learning for Classifying Beef Types
Grain-fed beef and grass-fed beef are visually diﬀerent as grain-fed beef has more
fat. However, the diﬀerence can be subtle and overwhelm consumers. Examples
of the two types of beef are shown in Fig. 6. We trained a classiﬁer based on a
small set of grass-fed and grain-fed beef images to help detect the fraud.
(a) grain-fed beef (b) grass-fed beef (c) Anomaly
Fig. 6. Beef images
We built a neural network-based classifier to distinguish grain-fed beef from
grass-fed beef based on the meat texture in input images. The binary classifier
contains two convolutional layers with 24 and 32 filters at the first and the
second layer, respectively, each with a kernel size of (5, 5), two max pooling
layers, one dropout layer, and two fully connected layers with 64 units and
2 units, respectively. The texture difference can be learned using a small
amount of training data obtained from supplier and supermarket websites. The
model accuracy is 92.5% on a dataset randomly picked from the Web. The amount of
training
data grows as recipients or producers keep collecting and contributing data to
the platform, which enables the classiﬁer to be tuned to distinguish a greater
variety of beef of the two types. The data diversity enhances the capability of
the classiﬁer, thereby enriching the knowledge about food product diﬀerences
accumulated in the platform.
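To give a sense of the size of the network described above, the following sketch computes layer output shapes and parameter counts in plain Python. Several details are assumptions not stated in the text: a 64x64 RGB input, no padding with stride 1 in the convolutions, 2x2 pooling, and the layer order conv, pool, conv, pool, dropout, dense, dense.

```python
def conv2d(shape, filters, kernel=5):
    """Output shape and parameter count of an unpadded, stride-1 convolution."""
    h, w, c = shape
    params = kernel * kernel * c * filters + filters  # weights + biases
    return (h - kernel + 1, w - kernel + 1, filters), params


def max_pool(shape, size=2):
    """Output shape of non-overlapping max pooling (no parameters)."""
    h, w, c = shape
    return (h // size, w // size, c)


def dense(n_in, n_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return n_in * n_out + n_out


def summarize(input_shape=(64, 64, 3)):
    """Trace the assumed layer order and return (layer, shape, params) rows."""
    rows = []
    shape, p = conv2d(input_shape, 24)
    rows.append(("conv1", shape, p))
    shape = max_pool(shape)
    rows.append(("pool1", shape, 0))
    shape, p = conv2d(shape, 32)
    rows.append(("conv2", shape, p))
    shape = max_pool(shape)
    rows.append(("pool2", shape, 0))
    rows.append(("dropout", shape, 0))  # dropout adds no parameters
    flat = shape[0] * shape[1] * shape[2]
    rows.append(("dense1", (64,), dense(flat, 64)))
    rows.append(("dense2", (2,), dense(64, 2)))
    return rows


if __name__ == "__main__":
    layers = summarize()
    for name, shape, params in layers:
        print(f"{name:8s} {str(shape):14s} {params}")
    print("total parameters:", sum(p for _, _, p in layers))
```

Under these assumptions the first fully connected layer dominates the parameter count, which is one reason a small training set may call for the dropout layer mentioned in the text.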
The learned “knowledge” of a model is useful when a recipient-collected image
preserves the pattern known by the classiﬁer even though the machine learning
model may not be directly trained from the same type of images. When a beef
steak is cooked, the features differentiating grass-fed from grain-fed are not
visible, as shown in Fig. 6c. However, when a recipient uses a thermal camera to
take a picture of the beef and submits the image, as shown in Fig. 6d, to the
system for checking, the pattern of grain-fed beef becomes highly distinguishable
because the fat has a higher temperature, and thus a specific model can classify
it correctly. This demonstrates the knowledge-capturing capability of machine
learning models in a food fraud detection scenario.
5.2 Incentive Mechanism
In the proposed incentive mechanism, the two types of coins are the Honesty-
coin and Contribution-coin, which are distributed as follows. Honesty-coin will
be distributed to all the participants recorded on the supplier registry via their
accounts if the product bought by the ﬁnal recipient is veriﬁed to be genuine.
Once the result from the classifying process is available, the system triggers the
incentive coin smart contract. If the result shows “genuine”, the smart contract
will distribute Honesty-coin to participants according to a predefined split. In
the implementation and the case study, we set an equal split for every
participant. If the result indicates fraud, the involved participants could be
penalized; however, in a decentralized system, that would require the participants
to put up a stake upfront from which penalties could be deducted.
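The equal-split rule for Honesty-coin can be illustrated off-chain in a few lines of Python. This is only a sketch of the logic, not the smart contract itself: the function name, the integer reward amount, and the handling of any indivisible remainder (left undistributed here) are assumptions.

```python
from typing import Dict, List


def distribute_honesty_coin(result: str, participants: List[str],
                            reward: int = 100) -> Dict[str, int]:
    """Split a hypothetical integer reward equally among participants.

    Coins are paid only when the classification result is "genuine".
    Any remainder that cannot be divided evenly is not distributed.
    """
    if result != "genuine" or not participants:
        return {}
    share = reward // len(participants)
    return {account: share for account in participants}
```

For example, with three registered participants and a reward of 90 coins, each account would receive 30 coins; a "fraud" result pays nothing.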
The system tracks out-of-distribution cases. Once the number of occurrences
reaches a certain threshold, the ML scientists are notified to check and tune
their model or create a new one. The new model is adjusted by checking all
recipient-submitted image inputs. When a model is updated, a smart contract
with an oracle is used to inject the model's performance, in terms of accuracy,
into the blockchain. The actual model is stored in a cloud data registry. If the
updated model achieves higher accuracy than the previous one, Contribution-coin
is distributed to the scientist, the data contributors, and the experts who help
to verify the performance of the updated model. Currently, the coins are split
among all parties in a predefined fixed ratio.
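The model update cycle can be sketched as a small off-chain tracker. The class name, the threshold of 10 cases, the 5:3:2 ratio, the reward amount, and the resetting of the counter after an accepted update are all hypothetical; the text states only that a threshold triggers notification and that coins are split in a fixed ratio when accuracy improves.

```python
from typing import Dict


class ModelTracker:
    """Tracks out-of-distribution cases and rewards accuracy improvements."""

    def __init__(self, accuracy: float, ood_threshold: int = 10):
        self.accuracy = accuracy            # accuracy of the current model
        self.ood_threshold = ood_threshold  # assumed notification threshold
        self.ood_count = 0

    def record_out_of_distribution(self) -> bool:
        """Count one case; return True when scientists should be notified."""
        self.ood_count += 1
        return self.ood_count >= self.ood_threshold

    def register_update(self, new_accuracy: float, weights: Dict[str, int],
                        reward: int = 100) -> Dict[str, int]:
        """Pay Contribution-coin by fixed ratio if accuracy improved."""
        if new_accuracy <= self.accuracy:
            return {}
        self.accuracy = new_accuracy
        self.ood_count = 0  # assumption: counter restarts for the new model
        total = sum(weights.values())
        return {role: reward * w // total for role, w in weights.items()}


if __name__ == "__main__":
    tracker = ModelTracker(accuracy=0.925)
    ratio = {"scientist": 5, "data_contributor": 3, "expert": 2}
    print(tracker.register_update(0.95, ratio))
```

Integer weights are used instead of fractions so that the split stays exact, which mirrors the integer arithmetic typical of token contracts.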
Our system provides a transparent platform for recipients to verify claims about
a food product against its physical attributes. It ensures the originality and
integrity of uploaded data and allows tracing back to all participants in case of
food fraud. By using incentives, the system encourages participants continuous
Our design takes future needs into consideration. The classiﬁers are loosely
coupled with other components within the system, thus allowing for ML models
to be swapped and modiﬁed according to the needs and data. Furthermore,
the architecture supports other methods for assessing physical attributes of a
product, like isotopic or genomic analysis.
Limitations arise from using blockchain and machine learning for food parity.
The beef classifier itself might be rigged for the maximum economic benefit of
all participants in the supply chain except the buyer. Although we added the
expert role as a trusted intermediary to verify and ensure the trustworthiness
of the dataset and classifier, an influential supply chain market player might
still find a way to manipulate the result or the machine learning model outside
of the blockchain and supply chain. As participants on the blockchain are
transparent, supply chain participants might communicate with each other outside
of the supply chain and agree to work together to commit fraud. Another problem
is malicious recipients submitting fake beef products. For example, they could
have received a good-quality product but substitute beef from elsewhere as input
to the system, for personal gain or to purposely harm the participants in the
supply chain.
6 Conclusion and Future Work
We proposed a generic architecture that connects physical food products in a
supply chain with their digital representation. The architecture uses blockchain
for immutability and an incentive system. We instantiate this architecture with a
concrete system using machine learning on visual features of beef, to check if the
digital claims on the beef product match the physical attributes. By evaluating
our system with a real-world scenario on distinguishing the types of beef, we
determine that it is feasible to implement the components in our proposed generic
architecture to achieve parity between a physical food product and its digital
representation. To the best of our knowledge, this is the first research work that
combines physical attributes of food products with their digital representation
via blockchain and machine learning. Our future work will focus on rewarding
participants relative to their contribution, and on introducing a reputation coin
that can be used to assess the trustworthiness of participants.
Acknowledgements. This research is supported by the Science and Industry Endow-
ment Fund of Australia.
1. Food fraud vulnerability assessment and mitigation-are you doing enough to pre-
vent food fraud? Technical report, PWC (2016)
2. Fleder, M., Kester, M.S., Pillai, S.: Bitcoin transaction graph analysis. arXiv
preprint arXiv:1502.01657 (2015)
3. Lo, S.K., Xu, X., Chiam, Y.K., Lu, Q.: Evaluating suitability of applying
blockchain. In: The 22nd ICECCS, November 2017
4. Osorio, M.T., Moloney, A.P., Schmidt, O., Monahan, F.J.: Beef authentication
and retrospective dietary veriﬁcation using stable isotope ratio analysis of bovine
muscle and tail hair. J. Agric. Food Chem. 59(7), 3295–3305 (2011)
5. Staples, M.: Risks and opportunities for systems using blockchain and smart con-
tracts. Technical report, Data61 (CSIRO), Sydney (2017)
6. Tian, F.: A supply chain traceability system for food safety based on HACCP,
blockchain & internet of things. In: The 14th International Conference on Service
Systems and Service Management, June 2017
7. Uludag, U., Pankanti, S., Prabhakar, S., Jain, A.K.: Biometric cryptosystems:
issues and challenges. Proc. IEEE 92(6), 948–960 (2004)
8. Xu, X., Lu, Q., Liu, Y., Yao, H., Zhu, L., Vasilakos, T.: Designing blockchain-
based applications: a case study for imported product traceability. Futur. Gener.
Comput. Syst. 92, 399–406 (2019)
9. Xu, X., et al.: The blockchain as a software connector. In: The 13th Working
IEEE/IFIP Conference on Software Architecture, April 2016
10. Zhou, X., Taylor, M.P., Davies, P.J., Prasad, S.: Identifying sources of environmen-
tal contamination in european honey bees (Apis mellifera) using trace elements and
lead isotopic compositions. Environ. Sci. Technol. 52(3), 991–1001 (2018)