Mathematics | Article
P-NUT: Predicting NUTrient Content from Short Text Descriptions
Gordana Ispirova 1,2,*, Tome Eftimov 1 and Barbara Koroušić Seljak 1
1 Computer Systems Department, Jožef Stefan Institute, 1000 Ljubljana, Slovenia; tome.eftimov@ijs.si (T.E.); barbara.korousic@ijs.si (B.K.S.)
2 Jožef Stefan International Postgraduate School, 1000 Ljubljana, Slovenia
* Correspondence: gordana.ispirova@ijs.si; Tel.: +386-14773519
Received: 14 September 2020; Accepted: 8 October 2020; Published: 16 October 2020


Abstract: Assessing nutritional content is very relevant for patients suffering from various diseases and for professional athletes, and, for health reasons, it is becoming part of everyday life for many. However, it is a very challenging task, as it requires complete and reliable sources. We introduce a machine learning pipeline for predicting macronutrient values of foods using learned vector representations from short text descriptions of food products. On a dataset obtained from health specialists, containing short descriptions of foods and macronutrient values, we generate paragraph embeddings, introduce clustering into food groups using graph-based vector representations that include food domain knowledge, and train regression models for each cluster. The predictions are for four macronutrients: carbohydrates, fat, protein and water. The highest accuracy was obtained for the carbohydrate predictions (86%), compared to the baselines of 27% and 36%. The protein predictions yielded the best results across all clusters, with 53%–77% of the values falling within the tolerance-level range. These results were obtained using short descriptions; the embeddings can be improved if they are learned on longer descriptions, which would lead to better prediction results. Since the task of calculating macronutrients requires exact quantities of ingredients, these results, obtained only from short descriptions, are a huge leap forward.
Keywords: macronutrient prediction; representation learning; machine learning; data mining; word embeddings; paragraph embeddings; single-target regression
1. Introduction
There is no denying that nutrition has become a core factor in today's society, and an undeniable part of the solution to the global health crisis [1–4]. The path towards making the average human diet healthier and environmentally sustainable is a fundamental part of the solution to numerous challenges from ecological, environmental, societal and economic perspectives, and awareness of this has only just started to grow and be fully appreciated.
We live in a time of a global epidemic of obesity, diabetes and inactivity, all connected to bad dietary habits. Many chronic diseases such as high blood pressure, cardiovascular disease, diabetes, some cancers [5], and bone-health diseases are linked to, again, poor dietary habits [6]. Dietary assessment is essential for patients suffering from many diseases (especially diet- and nutrition-related ones); it is also very much needed for professional athletes, and, because of the accessibility of meal-tracking mobile applications, it is becoming part of the everyday habits of a vast number of individuals, for health, fitness, or weight loss/gain. Obesity is rising every day in developed Western countries, and this contributes to raised public health concern about some subcategories of macronutrients, specifically saturated fats and added or free sugar. Nutritional epidemiologists are also raising concern about micronutrients like sodium, whose intake should be monitored for individuals suffering from specific diseases like osteoporosis, stomach cancer and kidney disease, and about fiber, whose intake is critical for patients suffering from irritable bowel syndrome (IBS).
Nutrient content can vary a lot from one food to another, even when the foods have roughly the same type of ingredients. This makes nutrient tracking and calculation very challenging, and predicting nutrient content very complicated. In this paper, we propose an approach, called P-NUT (Predicting NUTrient content from short text descriptions), for predicting the macronutrient values of a food item from learned vector representations of the text describing it. Food items are generally unbalanced in terms of macronutrient content. Across a broad variety of foods, the content of one macronutrient can go from one extreme to another; for example, the content of fat ranges from 'fat free' foods to 'fat based' foods (e.g., different kinds of nut butters), which makes it a good basis for grouping foods. Therefore, a single general model will not be efficient for macronutrient prediction. For this reason, we decided to apply unsupervised machine learning, namely clustering, as a method to separate foods in order to obtain clusters (groups) of foods with similar characteristics. Subsequently, on these separate clusters, we predict the macronutrients by applying supervised machine learning.
Predicting macronutrients is not a task that has been approached in such a manner before; usually the nutrient content of a food is calculated or estimated from measurements and exact ingredients [7–9]. These calculations are quite demanding; the detailed procedure for calculating the nutrient content of a multi-ingredient food has a few major steps: selection or development of an appropriate recipe, data collection for the nutrient content of the ingredients, correction of the ingredient nutrient levels for the weight of edible portions, adjustment of the content of each ingredient for the effects of preparation, summation of ingredient composition, final weight (or volume) adjustment, and determination of the yield and final volumes. This applies when all the ingredients and measurements are available. When the data for the ingredients are not available, the procedure gets even more complicated [7,8].
Using just short text descriptions of the food products, either simple foods or complex recipe dishes, the results of this study show that this way of combining representation learning with unsupervised and supervised machine learning yields accuracies as high as 86%; compared to the baseline (mean and median, calculated from the values of a certain macronutrient of all the food items in a given cluster), in some cases there are differences in accuracy of up to 50%.
The structure of the rest of the paper is as follows: Section 2 begins with the related work (Section 2.1), where we present the published research needed to understand P-NUT; Section 2.2 provides the structure and a description of the data used in the experiments; and Section 2.3 explains the methodology in detail. The experimental results and the methodology evaluation are presented in Section 3. In Section 4, we review the outcome of the methodology, the benefits of such an approach, and its novelty. At the end, in Section 5, we summarize the importance of the methodology and give directions for future work.
2. Materials and Methods
To the best of our knowledge, predicting the nutritional content of foods/recipes using only short text descriptions has never been done before. There has been some machine learning work in this direction, mainly involving image recognition: employing different deep learning models for accurate food identification and classification from food images [10], dietary assessment through food image analysis [11], and calculating calorie intake from food images [12,13]. All this work in the direction of predicting total calories also relies strongly on textual data retrieved from the Web. There are numerous mobile and web applications for tracking macronutrient intake [14,15]. Systems like these are used for achieving dietary goals, allergy management or, simply, maintaining a healthy balanced diet. Their biggest downside is the fact that they require manual input of details about the meal/food.
2.1. Related Work
In this subsection we present a review of the concepts relevant to P-NUT, the algorithms that were
used, and recent work done in this area.
2.1.1. Representation Learning
Representation learning is learning representations of input data by transforming it or extracting features from it, which then makes it easier to perform a task like classification or prediction [16]. There are two different categories of vector representations: non-distributed or sparse representations, which are much older, and distributed or dense representations, which have come into use in the past few years. Our focus is on distributed vector representations.
Word Embeddings
Word representations were first introduced as an idea in 1986 [17]. Since then, word representations have changed language modelling [18]. Follow-up work includes applications to automatic speech recognition and machine translation [19,20], and a wide range of Natural Language Processing (NLP) tasks [21–27]. Word embeddings have been used in combination with machine learning to improve results in biomedical named entity recognition [28], capture word analogies [29], extract latent knowledge from scientific literature and move towards a generalized approach to mining scientific literature [30], etc. We previously explored the idea of applying text-based representation methods in the food domain for the task of finding similar recipes based on the cosine similarity between embedding vectors [31]. Word embeddings are vector space models (VSM) that represent words as real-valued vectors in a low-dimensional semantic space (much smaller than the vocabulary size). Having distributed representations of words in a vector space helps improve the performance of learning algorithms for various NLP tasks.
Word2Vec, introduced by Mikolov et al. at Google in 2013 [32], is a neural-network-based word embedding method. There are two different Word2Vec approaches, Continuous Bag of Words and Continuous Skip Gram [33]:

- Continuous Bag-of-Words Model (CBOW): this architecture consists of a single hidden layer and an output layer. The algorithm tries to predict the center word based on the surrounding words, which are considered the context of this word. The inputs of this model are the one-hot encoded context word vectors.
- Skip-gram Model (SG): in the SG architecture we have the center word, and the algorithm tries to predict the words before and after it, which make up its context. The output of the SG model is C vectors of dimension V, where C is the number of context words which we want the model to return and V is the vocabulary size. The SG model is trained to minimize the summed prediction error, and it gives better vectors as C increases [32,33].

If compared, CBOW is a lot simpler and faster to train, but SG performs better with rare words.
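As an illustration, here is a minimal sketch of training both architectures with the Gensim library (assuming Gensim >= 4.0, where the dimension parameter is `vector_size` and the architecture is chosen with the `sg` flag); the toy corpus is a hypothetical stand-in for tokenized food descriptions:

```python
# Minimal sketch: training CBOW and Skip-gram Word2Vec models with Gensim.
from gensim.models import Word2Vec

corpus = [
    ["potato", "mashed", "dehydrated", "prepared", "from", "flake"],
    ["milk", "chocolate", "with", "30", "cocoa", "gorenjka"],
]

cbow = Word2Vec(corpus, vector_size=100, window=5, sg=0, min_count=1)      # CBOW
skipgram = Word2Vec(corpus, vector_size=100, window=5, sg=1, min_count=1)  # SG

print(cbow.wv["potato"].shape)               # (100,): one dense vector per word
print(skipgram.wv.most_similar("milk", topn=2))
```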
GloVe [34] is another method for generating word embeddings. It is a global log-bilinear regression model for unsupervised learning of word representations, which has been shown to outperform
model for unsupervised learning of word representations, that has been shown to outperform
other models on word analogy, word similarity, and named entity recognition tasks. It is based on
co-occurrence statistics from a given corpus.
Paragraph Embeddings
In 2014, an unsupervised paragraph embedding method called Doc2Vec was proposed [35]. Doc2Vec, in contrast to Word2Vec, generates vector representations of whole documents, regardless of their length. The paragraph vector and word vectors are concatenated in a sliding window and the next word is predicted; the training is done with a gradient descent algorithm. The Doc2Vec algorithm also takes into account word order and context. The inspiration, of course, comes from the Word2Vec algorithm: the first part, called the Distributed Memory version of Paragraph Vector (PV-DM), is an extension of the CBOW model with an additional vector (the Paragraph ID) added, with the difference of including another feature vector, unique to the document, in the next-word prediction. The word vectors represent the concept of a word, while the document vector represents the concept of a document.

The second algorithm, called the Distributed Bag of Words version of Paragraph Vector (PV-DBOW), is similar to the Word2Vec SG model. In PV-DM the algorithm considers the concatenation of the paragraph vector with the word vectors for the prediction of the next word, whereas in PV-DBOW the algorithm ignores the context words in the input, and the words are predicted by random sampling from the paragraph in the output.

The authors recommend using a combination of the two models, even though the PV-DM model performs better and will usually achieve state-of-the-art results by itself.
Graph-Based Representation Learning
Besides word embeddings, there are methods for embedding data represented as graphs, consequently named graph embedding methods. Usually, embedding methods learn vector embeddings in Euclidean vector space, but as graphs are hierarchical structures, in 2017 the authors of [36] introduced an approach for embedding hierarchical structures into hyperbolic space, the Poincaré ball. Poincaré embeddings are vector representations of symbolic data in which the semantic similarity between two concepts is the distance between them in the vector space, and their hierarchy is captured by the magnitudes of the vectors. Graph embeddings have improved performance over many existing models on tasks such as text classification, distantly supervised entity extraction, and entity classification [37]; they have also been used for unsupervised feature extraction from sequences of words [38]. In [39], the authors generate graph embeddings (Poincaré) for the FoodEx2 hierarchy [40]. FoodEx2 version 2 is a standardized system for food classification and description developed by the European Food Safety Authority (EFSA); it has domain knowledge embedded in it, and it contains descriptions of a vast set of individual food items combined into food groups and broader food categories in a hierarchy that exhibits parent-child relationships. The domain knowledge contained in the FoodEx2 hierarchy is carried over into the graph embeddings, which the authors later use to group the food items from the FoodEx2 system into clusters. The clustering is done using the Partitioning Around Medoids algorithm [41], and the number of clusters is determined using the silhouette method [42].
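To make this pipeline concrete, here is a hedged sketch on a toy hierarchy, assuming Gensim's `PoincareModel` and the `KMedoids` (PAM) implementation from the scikit-learn-extra package; the parent-child pairs are hypothetical stand-ins for FoodEx2 relations:

```python
# Sketch: embed a toy parent-child hierarchy with Poincare embeddings, then
# cluster the leaf embeddings with PAM, choosing k via the silhouette score.
from gensim.models.poincare import PoincareModel
from sklearn_extra.cluster import KMedoids       # PAM implementation
from sklearn.metrics import silhouette_score
import numpy as np

relations = [("grains", "food"), ("rice", "grains"), ("bread", "grains"),
             ("dairy", "food"), ("cheese", "dairy"), ("yogurt", "dairy")]
model = PoincareModel(relations, size=10, negative=2)
model.train(epochs=100)

leaves = ["rice", "bread", "cheese", "yogurt"]   # bottom end of the hierarchy
X = np.array([model.kv[leaf] for leaf in leaves])

best_k, best_score = None, -1.0
for k in range(2, len(leaves)):                  # pick k with the best silhouette
    labels = KMedoids(n_clusters=k, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score = k, score
print(best_k, best_score)
```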
2.2. Data
In our experiments we used a dataset containing nutritional information about food items, recently collected as food consumption data in Slovenia in collaboration with subject-matter experts for the aims of the EFSA EU Menu project [43], which was designed for more accurate exposure assessments and, ultimately, to support risk managers in their decision-making on food safety, the goal being to enable quick assessment of exposure to chronic and acute substances possibly found in the food chain [44]. The dataset contains 3265 food items, some of which are simple food products while others are recipes with short descriptions; a few instances are presented in Table 1 as an example. For each food item, the dataset provides the name in Slovene, the name in English, the FoodEx2 code, and the nutrient values for carbohydrates, fat, protein and water. We repeated our experiments for both the English and the Slovene names of the food products and recipes.
Table 1. Subset from the dataset used in the experiments (SLO—Slovenian, ENG—English).

| SLO Food Name | ENG Food Name | FoodEx2 Code | Energy (kcal) | Water (g) | Fat (g) | Carb (g) | Protein (g) |
|---|---|---|---|---|---|---|---|
| Zelenjavna rižota s parboiled rižem, sezonsko zelenjavo in repičnim oljem | Vegetable risotto with parboiled rice, seasonal vegetables and rapeseed oil | A041G#F04.A036V | 90.07 | 79.36 | 1.55 | 16.77 | 1.79 |
| Medenjaki iz pirine in ržene moke ter hojevega medu | Gingerbread biscuit made of spelt and rye flour and honey | A00CT#F04.A004H$F04.A003J$F04.A033K | 423.68 | 0.00 | 1.81 | 91.41 | 8.96 |
| Čokoladna rezina Kit Kat | Candies, KIT KAT Wafer Bar | A009Z | 517.87 | 1.63 | 25.99 | 64.59 | 6.51 |
| Zeleni ledeni čaj z medom, arizona | Tea, ready-to-drink, green iced tea, Arizona | A03LD | 27.65 | 93.00 | 0.00 | 6.80 | 0.01 |
2.3. Methodology
A flowchart of the methodology is presented in Figure 1. Our methodology consists of three separate parts: representation learning and unsupervised machine learning, conducted independently, which are then combined in supervised machine learning.

Figure 1. Flowchart of the methodology.

The idea is: (i) represent the text descriptions in vector space using embedding methods, i.e., semantic embeddings at the sentence/paragraph level of short food descriptions; (ii) cluster the foods based on their FoodEx2 codes [40] using graph embeddings [39]; (iii) perform post-hoc cluster merging in order to obtain more evenly distributed clusters at a higher level of the FoodEx2 hierarchy; (iv) apply different single-target regression algorithms on each cluster, with the embedding vectors as features, to predict the separate macronutrient values (carbohydrates, fat, protein and water); (v) evaluate the methodology by comparing the predicted values of the macronutrients with the actual ones.
2.3.1. Representation Learning
The starting point is the textual data, in our case the short text descriptions of the food products/recipes, alongside their FoodEx2 codes and macronutrient values. To represent the textual data as vectors, embeddings are generated for the whole food product name/description, using two different approaches:

1. Learning word vector representations (word embeddings) with the Word2Vec and GloVe methods. The vector representation of the whole description is obtained by merging the separate word embeddings generated for each word in the sentence (food product name/description). If $D$ is a food product description consisting of $n$ words:

$$D = \{word_1, word_2, \dots, word_n\} \quad (1)$$

and $E[word_a]$ is the vector representation (embedding) of a separate word:

$$E[word_a] = [x_{a1}, x_{a2}, \dots, x_{ad}] \quad (2)$$

where $a \in \{1, \dots, n\}$, $n$ is the number of words in the description, and $d$ is the dimension of the word vectors, which is defined manually for both Word2Vec and GloVe. These vectors are representations of words; to obtain the vector representation for the food product description, we apply two different heuristics for merging the separate word vectors. Our two heuristics of choice are:
Average: the vector representation for the food product description is calculated as the element-wise average of the vectors of the words it consists of:

$$E_{average}[D] = \left[ \frac{x_{11} + \dots + x_{n1}}{n}, \frac{x_{12} + \dots + x_{n2}}{n}, \dots, \frac{x_{1d} + \dots + x_{nd}}{n} \right] \quad (3)$$

Sum: the vector representation for each food product/recipe description is calculated by summing the vector representations of the words it consists of:

$$E_{sum}[D] = [x_{11} + \dots + x_{n1}, x_{12} + \dots + x_{n2}, \dots, x_{1d} + \dots + x_{nd}] \quad (4)$$

where $E_{average}[D]$ and $E_{sum}[D]$ are the merged embeddings, i.e., embeddings for the whole description. When generating the Word2Vec and GloVe embeddings, we considered different values for the dimension size and the sliding window size. The dimension sizes of choice are [50, 100, 200], and for the Word2Vec embeddings we considered the two available types of feature extraction: CBOW and SG. For each dimension we assign different values to the parameter called 'sliding window', which indicates the maximum distance within a sentence between the current word and the word being predicted. The values of choice are [2, 3, 5, 10], because our food product descriptions are not very long (the average number of words in a food product description in the dataset is 11, while the maximum is 30). Combining these parameter values, 24 Word2Vec models were trained, which, counting the two merging heuristics, gives a total of 48 models, while with GloVe a total of 24 models were trained. A sketch of the merging heuristics follows below.
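The two heuristics reduce to element-wise NumPy operations; the sketch below assumes a trained Gensim model (such as the `cbow` model from the earlier sketch) and skips out-of-vocabulary tokens:

```python
# A direct implementation of the merging heuristics in Equations (3) and (4):
# the description embedding is the element-wise average or sum of word vectors.
import numpy as np

def merge_word_vectors(model, tokens, heuristic="average"):
    """Merge the per-word embeddings of one tokenized description."""
    vectors = np.array([model.wv[t] for t in tokens if t in model.wv])
    return vectors.mean(axis=0) if heuristic == "average" else vectors.sum(axis=0)

# e.g. merge_word_vectors(cbow, ["milk", "chocolate", "cocoa"], heuristic="sum")
```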
2. Learning paragraph vector representations with the Doc2Vec algorithm. The Doc2Vec algorithm is used to generate a vector representation for each description (sentence). If $D$ is the description of the food product/recipe, then $E_{Doc2Vec}[D]$, the sentence vector representation generated with Doc2Vec, is as follows:

$$E_{Doc2Vec}[D] = [x_1, x_2, \dots, x_d] \quad (5)$$

where $d$ is the predefined dimension of the vectors. As with the two chosen word embedding methods, we considered different dimension sizes and sliding window sizes, specifically [2, 3, 5, 10] for the sliding window and [50, 100, 200] for the dimension size. We also considered the two architectures of the Doc2Vec model, PV-DM and PV-DBOW, and we used the non-concatenative mode (separate models for the sum option, and separate ones for the average option), because using the concatenation of context vectors rather than their sum/average would result in a much larger model. Taking all these parameters into account, 48 Doc2Vec models were trained in total.
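A minimal sketch of this training grid (assuming Gensim >= 4.0, where `dm` selects PV-DM vs. PV-DBOW and `dm_mean` toggles sum vs. average of the context vectors in non-concatenative mode); the two tagged toy documents are hypothetical:

```python
# Sketch of the Doc2Vec training grid: both architectures, all dimension and
# window sizes, in non-concatenative mode (dm_concat=0).
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [TaggedDocument(["vegetable", "risotto", "with", "rice"], [0]),
        TaggedDocument(["green", "iced", "tea", "arizona"], [1])]

models = {}
for dm in (1, 0):                       # 1: PV-DM, 0: PV-DBOW
    for dm_mean in (0, 1):              # 0: sum of context vectors, 1: average
        for dim in (50, 100, 200):
            for window in (2, 3, 5, 10):
                models[(dm, dm_mean, dim, window)] = Doc2Vec(
                    docs, vector_size=dim, window=window, dm=dm,
                    dm_mean=dm_mean, dm_concat=0, min_count=1, epochs=40)

print(len(models))                        # 48 models, as described above
print(models[(1, 1, 50, 2)].dv[0].shape)  # (50,): embedding of document 0
```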
2.3.2. Unsupervised Machine Learning
Foods exhibit large variations in nutrient content and therefore have very unbalanced macronutrient content. The dataset in our experiments includes a broad variety of foods, which implies that the content of a macronutrient can go from one extreme to another. It therefore goes without saying that, in order to obtain better predictions of macronutrient content, food items should be grouped by some similarity. Here, the available FoodEx2 codes come into use, since they already contain domain knowledge, and based on them food items are grouped into food groups and broader food categories in the FoodEx2 hierarchy [40].

Independently of the representation learning process, we used the method presented in [39], where the FoodEx2 hierarchy is represented as Poincaré graph embeddings and the FoodEx2 codes are then clustered into 230 clusters based on these embeddings. This clustering is performed on the bottom end of the hierarchy, i.e., on the leaves of the graph. Given that our dataset is rather small compared to the total number of FoodEx2 codes in the hierarchy, and that after assigning cluster numbers some of the clusters in our dataset contain very few or no elements at all, we decided to perform a post-hoc cluster merging. The post-hoc cluster merging follows a bottom-up approach: the clusters are merged based on their top-level parents, going a level deeper until the clusters are as evenly distributed as possible; a sketch follows below.
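A hedged sketch of such a merge, assuming a toy child-to-parent mapping in place of the real FoodEx2 hierarchy (`PARENT` and `parent_at_level` are illustrative, not part of any FoodEx2 tooling):

```python
from collections import Counter

# Hypothetical child -> parent mapping standing in for the FoodEx2 hierarchy.
PARENT = {"A041G": "A040", "A040": "A00", "A009Z": "A005", "A005": "A00"}

def parent_at_level(code, level):
    """Walk up the toy hierarchy and return the ancestor at the given depth
    (root = level 0), clamping at the leaf if the path is shorter."""
    path = [code]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[max(len(path) - 1 - level, 0)]

codes = ["A041G", "A040", "A009Z", "A005"]
merged = [parent_at_level(c, level=1) for c in codes]
print(Counter(merged))   # inspect how evenly the merged clusters are filled
```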
2.3.3. Supervised Machine Learning
The last part of the methodology is the supervised machine learning part, which receives as input the outputs of the representation learning part and the unsupervised machine learning part. It consists of applying single-target regression algorithms in order to predict the separate macronutrient values.

Separate prediction models are trained for each macronutrient, because a correlation test (Pearson's correlation coefficient) showed that there is no correlation between the target variables. In a real-world scenario, it is somewhat hard to select the right machine learning algorithm for the purpose. The most widely accepted approach is to select a few algorithms, select ranges for the hyper-parameters of each algorithm, perform hyper-parameter tuning, evaluate the estimators' performances with cross-validation using the same data in each iteration, benchmark the algorithms, and select the best one(s). When working with regression algorithms, the most common baseline is using the mean or median (central tendency measures) of the train part of the dataset for all the predictions. A sketch of this setup follows below.
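Here is a minimal sketch of this per-cluster setup with Scikit-learn; the placeholder data and parameter grids are illustrative assumptions, not the exact values used in the experiments:

```python
# Sketch of the per-cluster regression setup: hyper-parameter tuning with
# GridSearchCV, then cross-validated predictions on the same folds for all
# regressors (the matched sample approach). X and y are placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict

rng = np.random.default_rng(0)
X = rng.random((100, 50))        # placeholder: 100 description embeddings, d = 50
y = rng.random(100) * 30         # placeholder: one macronutrient per 100 g

candidates = {
    "Linear": (LinearRegression(), {}),
    "Ridge": (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
    "Lasso": (Lasso(), {"alpha": [0.01, 0.1, 1.0]}),
    "ElasticNet": (ElasticNet(), {"alpha": [0.01, 0.1, 1.0],
                                  "l1_ratio": [0.2, 0.5, 0.8]}),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)  # same folds for every model
for name, (estimator, grid) in candidates.items():
    tuned = GridSearchCV(estimator, grid, cv=cv).fit(X, y)
    preds = cross_val_predict(tuned.best_estimator_, X, y, cv=cv)
    print(name, tuned.best_params_)  # preds are scored with the tolerance levels
```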
2.3.4. Tolerance for Nutrient Values
The main goal is obtaining macronutrient values, which are expressed in grams and, by international legislation and regulations, can have defined tolerances. The European Commission Health and Consumers Directorate General published a guidance document in 2012 [45], with the aim of providing recommendations for the calculation of acceptable differences between the quantities of nutrients declared on food product labels and the ones established in Regulation EU 1169/2011 [46]. These tolerances for food product labels are important because it is impossible for foods to contain the exact levels of nutrients presented on the labels, as a consequence of the natural variation of foods, as well as the variations occurring during production and storage. However, the nutrient content of foods should not deviate substantially from labelled values to the extent that such deviations could mislead consumers. From the tolerance levels stated in [45], for our particular case we used the tolerance levels for the nutrition declaration of foods that do not include food supplements, out of which we used the information presented in Table 2, where the allowed deviations are given for each of the four macronutrients, depending on their quantity in 100 grams of the food in question. These tolerance levels are included at the very final step of our methodology, in the determination of how accurate the predicted macronutrient values are.
Table 2. Tolerated differences in nutrition content in foods besides food supplements.

| Quantity per 100 g | Carbohydrates, Protein, Water | Fat |
|---|---|---|
| <10 g per 100 g | ±2 g | ±1.5 g |
| 10–40 g per 100 g | ±20% | ±20% |
| >40 g per 100 g | ±8 g | ±8 g |
3. Results
The first step towards the evaluation is pre-processing the data. Our evaluation dataset is a subset of the original dataset, obtained by extracting the English food product descriptions alongside the columns with the macronutrient values (carbohydrates, fat, protein and water). The text descriptions are tokenized. Punctuation signs and numbers that represent quantities are removed, whereas percentage values (of fat, of sugar, of cocoa, etc.), which contain valuable information concerning the nutrient content, and stop words, which add meaning to the description, are kept. The next step is word lemmatization [47]; separate lemmatizers are used for the English names and the Slovene names. In Table 3, a few examples of the pre-processed data for the English names are presented.
Table 3. Examples of pre-processed English descriptions.

| Original Description | Pre-processed Description |
|---|---|
| Potatoes, mashed, dehydrated, prepared from flakes without milk, whole milk and butter added | ['potato', 'mashed', 'dehydrated', 'prepared', 'from', 'flake', 'without', 'milk', 'whole', 'milk', 'and', 'butter', 'added'] |
| Milk chocolate with 30% cocoa, Gorenjka (250 g) | ['milk', 'chocolate', 'with', '30', 'cocoa', 'gorenjka'] |
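A rough approximation of this pre-processing for the English descriptions, assuming NLTK's WordNet lemmatizer (the Slovene pipeline would need a language-specific lemmatizer, and the regular expression is a simplification of the actual rules):

```python
# Rough sketch of the pre-processing: keep words and percentage numbers,
# drop bare quantity numbers and punctuation, then lemmatize.
# Requires: import nltk; nltk.download("wordnet")
import re
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

def preprocess(description):
    # "30" is kept because it precedes '%'; "250" in "(250 g)" is dropped
    tokens = re.findall(r"[a-zA-Z]+|\d+(?=%)", description.lower())
    return [lemmatizer.lemmatize(t) if t.isalpha() else t for t in tokens]

print(preprocess("Milk chocolate with 30% cocoa, Gorenjka (250 g)"))
# ['milk', 'chocolate', 'with', '30', 'cocoa', 'gorenjka', 'g']
```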
After obtaining the data in the desired format, the next step is to apply the algorithms for generating the embeddings. For this purpose we used the Gensim library [48] in Python, and the corresponding packages for the Word2Vec and Doc2Vec algorithms. The embedding vectors are the basis for the next steps.

Independently of this process, the data is clustered, i.e., the instances are divided into clusters based on their FoodEx2 codes. Starting from the clustering in [39], there are 230 clusters; once cluster numbers are assigned, the instances in our dataset are clustered. From this initial clustering we note that not all clusters have elements in them, and some have very few. Therefore, the post-hoc cluster merging is performed, where we merge the clusters following a bottom-up approach. For our dataset we took the parents on the third level of the FoodEx2 hierarchy and obtained 9 clusters. In Table 4, a few examples from each cluster are given (the English names are given for convenience).
Table 4. Example instances from each cluster.

| Cluster Number | Example Food Products |
|---|---|
| Cluster 1 | Oil, industrial, mid-oleic, sunflower, principal uses frying and salad dressings; Homemade minced lard, Mesarija Kragelj; Margarine (with added vegetable sterols 0,75g/10g), line Becel pro-activ, Unilever |
| Cluster 2 | Peanuts, all types, oil-roasted, with salt; Seeds, pumpkin and squash seed kernels, dried; Avocados, raw, California |
| Cluster 3 | Cheese, processed, 60% fat in dry matter; Yogurt, fruit (peach, cereals), low fat 2.6% milkfat; Baby food, cottage cheese, creamed, fruit (strawberry, banana), FruchtZwerge, Danone |
| Cluster 4 | Plums, canned, purple, light syrup pack, solids and liquids; Segedin cabbage with pork meat; Buckwheat porridge sauted with onion and garlic |
| Cluster 5 | Fried chicken file (canola oil, without breadcrumbs); Trout with parsley and garlic sauce; Beef, rib, whole (ribs 6–12), separable lean and fat, trimmed to 1/8 of an inch of fat, all grades, cooked, roasted |
| Cluster 6 | Fruit tea infusion, with sugar and lemon; Soup made of turnip cabbage, peas and tomato (olive oil, stock); Chicken stew with seasonal vegetables, without roux |
| Cluster 7 | Fish, salmon, pink, canned, without salt, solids with bone and liquid; Salty anchovies in vegetable oil; Tuna with beans, canned |
| Cluster 8 | Ham, sliced, regular (approximately 11% fat); Chicken hot dog, pan-fried; Turkey ham, sliced, extra lean, prepackaged or deli-sliced |
| Cluster 9 | Egg, whole, cooked, scrambled; Fried egg (olive oil); Egg spread |
The next step in our methodology is the machine learning part: applying single-target regressions according to the following setup:

1. Select regression algorithms: Linear regression, Ridge regression, Lasso regression, and ElasticNet regression (using the Scikit-learn library in Python [49]).
2. Select parameter ranges for each algorithm and perform hyper-parameter tuning: ranges and values are given a priori for all the parameters of all the regression algorithms. From all the combinations, the best parameters for model training are selected with GridSearchCV (using the Scikit-learn library in Python [49]). This is done for each cluster separately.
3. Apply k-fold cross-validation to estimate the prediction error: we train models for each cluster using each of the selected regression algorithms. The models are trained with the previously selected best parameters for each cluster and then evaluated with cross-validation. We chose the matched sample approach for comparing the regressors, i.e., using the same data in each iteration.
4. Apply tolerance levels and calculate accuracy: the accuracy is calculated according to the tolerance levels in Table 2. If $a_i$ is the actual value of the $i$th instance of the test set in a certain iteration of the k-fold cross-validation, and $p_i$ is the predicted value of the same, $i$th, instance of the test set, then:

$$d_i = |a_i - p_i| \quad (6)$$

where $d_i$ is the absolute difference between the two values. We define a binary variable that is assigned a positive value if the predicted value is within the tolerance level:

$$allowed_i = 1 \;\text{ if: } \begin{cases} a_i \le 10 \,\wedge\, (d_i \le 2 \text{ for protein and carbohydrate},\; d_i \le 1.5 \text{ for fat}) \\ 10 < a_i \le 40 \,\wedge\, d_i \le 0.2 \times a_i \\ a_i > 40 \,\wedge\, d_i \le 8 \end{cases} \quad (7)$$

At the end we calculate the accuracy as the ratio of predicted values that fall in the 'allowed' range, i.e., the tolerance level:

$$Accuracy = \frac{\sum_{i=1}^{n} allowed_i}{n} \quad (8)$$

where $n$ is the number of instances in the test set. The accuracy percentage is calculated for the baseline mean and baseline median as well, as the percentage of baseline values (means and medians from each cluster) that fall in the tolerance-level range, calculated according to Equations (6)–(8), where $a_i$ is the actual value of the $i$th instance of the test set in a certain iteration of the k-fold cross-validation, and instead of $p_i$ we have:

$$b = \begin{cases} \dfrac{\sum_{i=1}^{m} x_i}{m}, & \text{if the baseline is the mean} \\ \dfrac{X_{\lfloor (m+1)/2 \rfloor} + X_{\lceil (m+1)/2 \rceil}}{2}, & \text{if the baseline is the median} \end{cases} \quad (9)$$

where $m$ is the number of instances in the train set, and $X$ is the train set sorted in ascending order. A sketch of this accuracy computation follows below.
The accuracy percentages are calculated for each fold in each cluster, and at the end, for each cluster, we calculate the average of the percentages across folds. In Table 5, the results obtained from the experiments with the embeddings generated from the English names are presented, and in Table 6 those with the embeddings generated from the Slovene names.
Table 5. Accuracy percentages after k-fold cross-validation on each cluster, obtained with the embeddings for the English names of the food products. Target: C—Carbohydrates, F—Fat, P—Protein, W—Water. The bolded numbers represent the overall best performance for each macronutrient in the given cluster.

| Cluster | Target | Word2Vec | GloVe | Doc2Vec | Mean | Median |
|---|---|---|---|---|---|---|
| 1 | C | **59.21** | 47.84 | 50.11 | 1.00 | 17.47 |
| 1 | F | 44.26 | 35.95 | **49.32** | 5.05 | 10.21 |
| 1 | P | 56.37 | **60.32** | 56.95 | 13.16 | 14.26 |
| 1 | W | 40.32 | **52.32** | 48.21 | 8.05 | 9.26 |
| 2 | C | **34.84** | 34.32 | 33.22 | 10.95 | 13.27 |
| 2 | F | **67.22** | 64.69 | 64.69 | 7.93 | 60.55 |
| 2 | P | **63.87** | 61.34 | 59.22 | 7.58 | 31.89 |
| 2 | W | 50.44 | **52.83** | 52.41 | 17.89 | 19.73 |
| 3 | C | 46.51 | 46.18 | **46.98** | 11.13 | 15.74 |
| 3 | F | **67.42** | 63.62 | 64.00 | 6.84 | 59.81 |
| 3 | P | 69.64 | 65.47 | **70.74** | 8.75 | 58.55 |
| 3 | W | 56.68 | **60.85** | 58.70 | 12.18 | 29.83 |
| 4 | C | 40.92 | **43.32** | 40.53 | 12.95 | 16.40 |
| 4 | F | **68.28** | 66.40 | 66.67 | 4.79 | 62.43 |
| 4 | P | **72.50** | 70.85 | 71.71 | 7.23 | 66.07 |
| 4 | W | 59.09 | **61.51** | 60.99 | 11.24 | 33.86 |
| 5 | C | **46.38** | 37.65 | 46.07 | 9.58 | 15.80 |
| 5 | F | **66.12** | 62.38 | 62.38 | 4.57 | 42.43 |
| 5 | P | 66.12 | 63.63 | **66.83** | 8.73 | 52.38 |
| 5 | W | 49.87 | 48.95 | **53.98** | 12.80 | 21.53 |
| 6 | C | 29.46 | 30.55 | **33.30** | 7.90 | 10.24 |
| 6 | F | 41.66 | 41.08 | **43.26** | 6.68 | 29.76 |
| 6 | P | 53.35 | 54.37 | **55.81** | 15.09 | 20.11 |
| 6 | W | 38.01 | 39.69 | **41.28** | 11.03 | 15.45 |
| 7 | C | **72.78** | **72.78** | **72.78** | 11.11 | 41.11 |
| 7 | F | 42.78 | 48.33 | **53.33** | 5.56 | 11.11 |
| 7 | P | **73.89** | **73.89** | **73.89** | 31.67 | 15.00 |
| 7 | W | 46.67 | 48.89 | **57.22** | 15.56 | 20.00 |
| 8 | C | **58.31** | 51.60 | 55.58 | 0.95 | 21.69 |
| 8 | F | 48.27 | 39.74 | **50.17** | 6.58 | 15.15 |
| 8 | P | 60.48 | 63.25 | **67.06** | 7.62 | 19.74 |
| 8 | W | 41.60 | 48.27 | **49.83** | 11.34 | 11.26 |
| 9 | C | **86.36** | 81.82 | 72.73 | 27.27 | 36.36 |
| 9 | F | **50.00** | 40.91 | 45.45 | 4.55 | 4.55 |
| 9 | P | **77.27** | 72.73 | 63.64 | 36.36 | 31.82 |
| 9 | W | 45.45 | 40.91 | **50.00** | 9.09 | 18.18 |
Table 6. Accuracy percentages after k-fold cross-validation on each cluster, obtained with the embeddings for the Slovene names of the food products. Target: C—Carbohydrates, F—Fat, P—Protein, W—Water. The bolded numbers represent the overall best performance for each macronutrient in the given cluster.

| Cluster | Target | Word2Vec | GloVe | Doc2Vec | Mean | Median |
|---|---|---|---|---|---|---|
| 1 | C | **61.37** | 54.11 | 52.00 | 1.00 | 17.47 |
| 1 | F | **44.26** | 37.00 | 41.26 | 5.05 | 10.21 |
| 1 | P | **58.26** | 50.00 | 53.26 | 13.16 | 14.32 |
| 1 | W | 34.05 | **37.05** | 33.89 | 8.05 | 9.26 |
| 2 | C | 27.47 | 24.43 | **32.15** | 10.95 | 14.00 |
| 2 | F | **70.28** | 67.04 | 64.69 | 7.93 | 60.55 |
| 2 | P | **63.72** | 60.12 | 59.22 | 7.58 | 31.54 |
| 2 | W | **49.96** | 43.28 | 48.38 | 17.89 | 19.55 |
| 3 | C | **47.28** | 41.51 | 45.00 | 11.13 | 16.13 |
| 3 | F | **67.42** | 63.99 | 63.62 | 6.84 | 59.81 |
| 3 | P | **69.27** | 65.83 | 69.20 | 8.75 | 58.55 |
| 3 | W | 52.86 | 43.97 | **54.34** | 12.18 | 29.44 |
| 4 | C | 34.78 | 28.49 | **40.33** | 12.95 | 16.93 |
| 4 | F | **70.13** | 67.74 | 66.40 | 4.79 | 62.43 |
| 4 | P | **72.50** | 69.58 | 70.38 | 7.23 | 66.07 |
| 4 | W | 54.24 | 47.66 | **55.79** | 11.24 | 33.86 |
| 5 | C | **47.63** | 41.40 | 45.95 | 9.58 | 15.80 |
| 5 | F | **66.12** | 62.80 | 62.38 | 4.57 | 42.43 |
| 5 | P | **66.12** | 64.47 | 64.47 | 8.73 | 52.38 |
| 5 | W | 48.18 | 41.48 | **51.02** | 12.80 | 21.12 |
| 6 | C | 31.42 | 25.61 | **33.74** | 7.90 | 10.75 |
| 6 | F | 39.34 | 34.97 | **44.42** | 6.68 | 29.98 |
| 6 | P | 53.36 | 50.73 | **63.13** | 15.09 | 20.33 |
| 6 | W | 41.21 | 34.67 | **41.85** | 11.03 | 15.45 |
| 7 | C | **72.78** | 67.78 | **72.78** | 11.11 | 41.11 |
| 7 | F | **58.33** | 37.78 | 48.33 | 5.56 | 11.11 |
| 7 | P | 63.89 | 63.89 | **69.44** | 31.67 | 15.00 |
| 7 | W | **46.11** | 36.67 | 41.11 | 15.56 | 20.00 |
| 8 | C | 56.41 | 49.91 | **56.45** | 0.95 | 21.69 |
| 8 | F | **47.32** | 43.51 | 44.55 | 6.58 | 15.15 |
| 8 | P | **64.29** | 59.70 | 61.43 | 7.62 | 19.74 |
| 8 | W | 35.11 | **36.80** | 35.11 | 11.34 | 11.26 |
| 9 | C | **86.36** | 72.73 | **86.36** | 27.27 | 36.36 |
| 9 | F | **50.00** | 31.82 | **50.00** | 4.55 | 4.55 |
| 9 | P | 63.64 | 59.09 | **68.18** | 36.36 | 31.82 |
| 9 | W | **54.55** | 36.36 | 45.45 | 9.09 | 18.18 |
In these tables we give the accuracy percentages of the predictions for each target macronutrient in each cluster. From them we can see that having the Word2Vec and Doc2Vec embeddings as features for the regressions yielded better results in more cases than having the GloVe embedding vectors as inputs, but this difference is not big enough to say that these two embedding algorithms outperformed GloVe. In Figures 2–5, the results for each target macronutrient are presented graphically.

Figure 2. Best prediction accuracies for carbohydrate predictions obtained from the embeddings for the English names and Slovene names for each cluster, compared to the baseline mean and median for the particular cluster.
Figure 3. Best prediction accuracies for fat predictions obtained from the embeddings for the English names and Slovene names for each cluster, compared to the baseline mean and median for the particular cluster.

Figure 4. Best prediction accuracies for protein predictions obtained from the embeddings for the English names and Slovene names for each cluster, compared to the baseline mean and median for the particular cluster.

Figure 5. Best prediction accuracies for water predictions obtained from the embeddings for the English names and Slovene names for each cluster, compared to the baseline mean and median for the particular cluster.
In the graphs, for each target macronutrient and each cluster, we give the best result obtained with the embedding vectors from the English and Slovene names and compare them with the baseline mean and median for the particular cluster. In the graphs, the embedding algorithm that yields the best result, alongside its parameters and heuristic, is given as:

$$E\_h\_d\_w, \quad \begin{cases} h \in \{sum, average\} & \text{is the chosen heuristic} \\ d \in \{50, 100, 200\} & \text{is the dimension} \\ w \in \{2, 3, 5, 10\} & \text{is the sliding window} \end{cases} \quad (10)$$

where $E$ is the embedding algorithm (Word2Vec, GloVe or Doc2Vec). We can see that the embedding algorithm yielding the best results changes, but in all cases the embedding algorithms give better results than the baseline methods. In Table 7, we present the embedding algorithms (with all the parameters used) that gave the best results for each target macronutrient in each cluster, alongside the regression algorithm used for making the predictions.
Table 7. Embedding and regression algorithms which yielded the highest accuracies for each macronutrient prediction in each cluster. Target: C—Carbohydrates, F—Fat, P—Protein, W—Water.

| Cluster | Target | Embedding (ENG) | Embedding (SLO) | Regression (ENG) | Regression (SLO) |
|---|---|---|---|---|---|
| 1 | C | Word2VecCBOW_avg_100_2 | Word2VecCBOW_avg_50_2 | ElasticNet | Ridge |
| 1 | F | Doc2VecPV-DM_avg_200_2 | Word2VecCBOW_avg_50_2 | Lasso | Ridge |
| 1 | P | GloVe_sum_50_10 | Word2VecCBOW_sum_200_2 | Lasso | Ridge |
| 1 | W | GloVe_avg_50_10 | GloVe_sum_200_2 | ElasticNet | Ridge |
| 2 | C | Word2VecSG_avg_200_2 | Doc2VecPV-DBOW_avg_200_2 | Ridge | Ridge |
| 2 | F | Word2VecCBOW_sum_50_5 | Word2VecCBOW_avg_100_2 | Lasso | Ridge |
| 2 | P | Word2VecSG_sum_100_5 | Word2VecCBOW_avg_100_2 | Ridge | Ridge |
| 2 | W | GloVe_avg_100_2 | Word2VecSG_avg_200_10 | Ridge | Ridge |
| 3 | C | Doc2VecPV-DM_avg_50_2 | Word2VecCBOW_avg_100_2 | Ridge | ElasticNet |
| 3 | F | Word2VecCBOW_avg_200_2 | Word2VecCBOW_avg_200_3 | Ridge | Ridge |
| 3 | P | Doc2VecPV-DBOW_avg_200_10 | Word2VecCBOW_avg_200_5 | Ridge | Ridge |
| 3 | W | GloVe_avg_200_3 | Doc2VecPV-DBOW_avg_200_2 | Ridge | ElasticNet |
| 4 | C | GloVe_sum_200_3 | Doc2VecPV-DBOW_avg_200_5 | Ridge | ElasticNet |
| 4 | F | Word2VecCBOW_avg_100_5 | Word2VecCBOW_avg_100_2 | Lasso | Ridge |
| 4 | P | Word2VecCBOW_avg_50_3 | Word2VecCBOW_avg_50_2 | Lasso | Ridge |
| 4 | W | GloVe_sum_200_3 | Doc2VecPV-DBOW_avg_200_5 | Ridge | ElasticNet |
| 5 | C | Word2VecCBOW_avg_200_2 | Word2VecCBOW_avg_100_2 | Ridge | Lasso |
| 5 | F | Word2VecCBOW_avg_200_2 | Word2VecCBOW_avg_200_3 | Ridge | Ridge |
| 5 | P | Doc2VecPV-DBOW_sum_200_5 | Word2VecCBOW_avg_200_5 | ElasticNet | Ridge |
| 5 | W | Doc2VecPV-DBOW_avg_200_10 | Doc2VecPV-DBOW_sum_50_3 | Ridge | Lasso |
| 6 | C | Doc2VecPV-DBOW_sum_200_10 | Doc2VecPV-DBOW_sum_200_2 | Ridge | Ridge |
| 6 | F | Doc2VecPV-DBOW_avg_200_10 | Doc2VecPV-DBOW_avg_200_5 | Ridge | Ridge |
| 6 | P | Doc2VecPV-DBOW_sum_200_10 | Doc2VecPV-DBOW_avg_200_2 | Ridge | Ridge |
| 6 | W | Doc2VecPV-DBOW_avg_200_3 | Doc2VecPV-DBOW_avg_200_2 | Ridge | Ridge |
| 7 | C | Word2VecCBOW_sum_50_2 | Word2VecCBOW_sum_50_2 | Linear | Linear |
| 7 | F | Doc2VecPV-DM_sum_50_5 | Word2VecCBOW_avg_100_2 | ElasticNet | Linear |
| 7 | P | Word2VecSG_avg_200_5 | Doc2VecPV-DM_sum_50_10 | Linear | ElasticNet |
| 7 | W | Doc2VecPV-DM_sum_50_3 | Word2VecSG_sum_100_2 | Linear | Linear |
| 8 | C | Word2VecCBOW_avg_200_3 | Doc2VecPV-DBOW_sum_200_2 | Ridge | Ridge |
| 8 | F | Doc2VecPV-DBOW_avg_100_5 | Word2VecCBOW_avg_50_2 | Lasso | Ridge |
| 8 | P | Doc2VecPV-DBOW_sum_50_2 | Word2VecCBOW_sum_50_10 | ElasticNet | Ridge |
| 8 | W | Doc2VecPV-DM_sum_100_2 | GloVe_sum_200_2 | Ridge | Ridge |
| 9 | C | Word2VecSG_sum_200_3 | Word2VecCBOW_avg_50_5 | Lasso | Linear |
| 9 | F | Word2VecCBOW_avg_50_5 | Word2VecSG_sum_200_2 | Linear | Linear |
| 9 | P | Word2VecCBOW_avg_100_3 | Doc2VecPV-DBOW_sum_200_3 | Linear | Lasso |
| 9 | W | Doc2VecPV-DM_sum_50_10 | Word2VecCBOW_avg_200_5 | Lasso | Linear |
4. Discussion
From the obtained results we can observe that the highest percentage of correctly predicted macronutrient values is obtained in cluster 9, for the prediction of carbohydrates: 86.36%, 81.82% and 72.73% for the English names and 86.36%, 72.73% and 86.36% for the Slovene names, for the Word2Vec, GloVe and Doc2Vec algorithms respectively, whereas the baseline (both mean and median) reaches less than half of that. Next come the predictions for protein quantity in the same cluster, and then the predictions for protein and carbohydrates in cluster 7. When inspecting these two clusters, we found that they were the only two clusters that were not merged with other ones; therefore, they sit at a deeper level of the FoodEx2 hierarchy, and the foods inside them are more similar to each other than the foods in other clusters. Cluster 9 consists of types of egg products and simple egg dishes; each of these foods has almost identical macronutrients because they contain only one ingredient, eggs. Cluster 7, on the other hand, contains fish products, either frozen or canned. If we do not consider the results from these two clusters, the best results are obtained for the protein predictions in cluster 4 (70%–72%) and the fat predictions (66%–68%), although compared to the baseline median of that cluster they are not much better. However, looking at the protein predictions in cluster 8 (60%–67%), the obtained accuracies are much higher than the baseline mean and median for this cluster. Cluster 8 mainly contains types of processed meats, which can vary notably in fat content but have similarities in the range of protein content.
For comparison, we also ran the single-target regressions without clustering the dataset. The results are presented in Figure 6.
Figure 6.
Best prediction accuracies for each macronutrient obtained from the embeddings for the
English and Slovene names compared to the baseline mean and median from the whole dataset.
From this graph we can draw the same conclusion: the embedding algorithms give better results than the baseline mean and median (in this case of the whole dataset) for each target macronutrient. The best results, again, are obtained for the prediction of protein content (62%–64%).

In Table 8, we give the parameters of the embedding algorithms and the regressors with which the best results were obtained without clustering the data.

From these results, it is worth arguing that machine learning models trained on food data previously clustered based on FoodEx2 codes yield better results than predicting on the whole dataset. If we compare the performances of the three embedding algorithms, it is hard to say whether one outperformed the others, or whether one underperformed compared to the other two. This outcome is due to the fact that we are dealing with fairly short textual descriptions.
Table 8. Embedding and regression algorithms which yielded the highest accuracies for each macronutrient prediction on the whole dataset (without clustering). Target: C—Carbohydrates, F—Fat, P—Protein, W—Water.

| Target | Embedding (ENG) | Embedding (SLO) | Regression (ENG) | Regression (SLO) |
|---|---|---|---|---|
| C | Doc2VecPV-DBOW_avg_200_5 | Doc2VecPV-DBOW_sum_200_3 | Ridge | Ridge |
| F | Doc2VecPV-DBOW_avg_200_2 | Doc2VecPV-DBOW_sum_200_10 | Lasso | ElasticNet |
| P | Doc2VecPV-DBOW_avg_200_5 | Doc2VecPV-DBOW_sum_200_3 | Ridge | Ridge |
| W | GloVe_avg_200_10 | Doc2VecPV-DBOW_sum_200_2 | Ridge | Linear |
Given that the results with clustering are better than the results without, and that we rely so strongly on the FoodEx2 codes to cluster the foods, the availability of FoodEx2 codes is of great importance and is therefore a limitation of the methodology. For this purpose, we can rely on a method such as StandFood [50], a natural language processing methodology developed for classifying and describing foods according to FoodEx2. Once this limitation is overcome, the application of our method can be fully automated.

From a theoretical viewpoint, this methodology demonstrates the benefits of using representation learning as the basis of a predictive study, and shows that dense real-valued vectors can capture enough semantics even from a short text description (without the details needed for the task in question, in our case measurements or exact ingredients) to be used in a predictive study for a complicated and value-sensitive task such as predicting macronutrient content. This study offers fertile ground for further exploration of representation learning and for considering more complex embedding algorithms, for example using transformers [51,52] and fine-tuning them for this task.

From a managerial viewpoint, the application of this methodology opens up many possibilities for facilitating and easing the process of calculating macronutrient content, which is crucial for dietary assessment, dietary recommendations, dietary guidelines, macronutrient tracking, and other such tasks, which are key tools for doctors, health professionals, dieticians, nutritional experts, policy makers, professional sports coaches, athletes, fitness professionals, etc.
5. Conclusions
We live in a modern health crisis. We have a cure for almost everything, and yet the most common causes of the biggest mortality factor, cardiovascular disease, are nutrition and diet related. Knowing what is in our food and understanding its nutritional content (macro- and micronutrients) is the first step, one that is in our power, towards the prevention of diet-related diseases. There is an overwhelming amount of nutrition-related data available, and most of it comes in textual form, structured and unstructured. Data science can help us utilize this data for our benefit. We presented a methodology that combines representation learning and machine learning for the task of predicting macronutrient values from short textual descriptions of food data, a combination of food products and recipes. Taking the learned vector representations of the descriptions as features, and applying different regression algorithms on separate clusters of the data, obtained by clustering based on Poincaré graph embeddings of the FoodEx2 codes, yields results with accuracy as high as 86%, so this approach proves to be very effective for this task. For our future work, we intend to extend this methodology with state-of-the-art embeddings based on transformers (BERT embeddings [51]), clustering on an upper level of the FoodEx2 hierarchy, and including methods for obtaining FoodEx2 codes when they are not available [50], as well as evaluating it on a bigger dataset with longer, more detailed descriptions.
Author Contributions: Conceptualization, G.I., T.E. and B.K.S.; methodology, G.I. and T.E.; software, G.I.; validation, G.I. and T.E.; resources, B.K.S.; data curation, B.K.S.; writing—original draft preparation, G.I.; writing—review and editing, T.E. and B.K.S.; visualization, G.I.; supervision, T.E. and B.K.S.; project administration, B.K.S.; funding acquisition, B.K.S. All authors have read and agreed to the published version of the manuscript.
Funding: This research was supported by the Slovenian Research Agency (research core grant number P2-0098) and the European Union's Horizon 2020 research and innovation programme (FNS-Cloud, Food Nutrition Security) (grant agreement 863059). The information and the views set out in this publication are those of the authors and do not necessarily reflect the official opinion of the European Union. Neither the European Union institutions and bodies nor any person acting on their behalf may be held responsible for the use that may be made of the information contained herein.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Willett, W.; Rockström, J.; Loken, B.; Springmann, M.; Lang, T.; Vermeulen, S.; Garnett, T.; Tilman, D.; DeClerck, F.; Wood, A.; et al. Food in the Anthropocene: The EAT–Lancet Commission on healthy diets from sustainable food systems. The Lancet 2019, 393, 447–492. [CrossRef]
2. Branca, F.; Demaio, A.; Udomkesmalee, E.; Baker, P.; Aguayo, V.M.; Barquera, S.; Dain, K.; Keir, L.; Lartey, A.; Mugambi, G.; et al. A new nutrition manifesto for a new nutrition reality. The Lancet 2020, 395, 8–10. [CrossRef]
3. Keeley, B.; Little, C.; Zuehlke, E. The State of the World's Children 2019: Children, Food and Nutrition–Growing Well in a Changing World; UNICEF: New York, NY, USA, 2019.
4. Mbow, H.-O.P.; Reisinger, A.; Canadell, J.; O'Brien, P. Special Report on Climate Change, Desertification, Land Degradation, Sustainable Land Management, Food Security, and Greenhouse Gas Fluxes in Terrestrial Ecosystems (SR2); IPCC: Geneva, Switzerland, 2017.
5. Ijaz, M.F.; Attique, M.; Son, Y. Data-Driven Cervical Cancer Prediction Model with Outlier Detection and Over-Sampling Methods. Sensors 2020, 20, 2809. [CrossRef] [PubMed]
6. World Health Organization. Diet, Nutrition, and the Prevention of Chronic Diseases: Report of a Joint WHO/FAO Expert Consultation; World Health Organization: Geneva, Switzerland, 2003; Volume 916.
7. Rand, W.M.; Pennington, J.A.; Murphy, S.P.; Klensin, J.C. Compiling Data for Food Composition Data Bases; United Nations University Press: Tokyo, Japan, 1991.
8. Greenfield, H.; Southgate, D.A. Food Composition Data: Production, Management, and Use; Food and Agriculture Organization: Rome, Italy, 2003; ISBN 978-92-5-104949-5.
9. Schakel, S.F.; Buzzard, I.M.; Gebhardt, S.E. Procedures for estimating nutrient values for food composition databases. J. Food Compos. Anal. 1997, 10, 102–114. [CrossRef]
10. Yunus, R.; Arif, O.; Afzal, H.; Amjad, M.F.; Abbas, H.; Bokhari, H.N.; Haider, S.T.; Zafar, N.; Nawaz, R. A framework to estimate the nutritional value of food in real time using deep learning techniques. IEEE Access 2018, 7, 2643–2652. [CrossRef]
11. Jiang, L.; Qiu, B.; Liu, X.; Huang, C.; Lin, K. DeepFood: Food Image Analysis and Dietary Assessment via Deep Model. IEEE Access 2020, 8, 47477–47489. [CrossRef]
12. Pouladzadeh, P.; Shirmohammadi, S.; Al-Maghrabi, R. Measuring calorie and nutrition from food image. IEEE Trans. Instrum. Meas. 2014, 63, 1947–1956. [CrossRef]
13. Ege, T.; Yanai, K. Image-based food calorie estimation using recipe information. IEICE Trans. Inf. Syst. 2018, 101, 1333–1341. [CrossRef]
14. Samsung Health (S-Health). Available online: https://health.apps.samsung.com/terms (accessed on 11 May 2020).
15. MyFitnessPal. Available online: https://www.myfitnesspal.com/ (accessed on 11 May 2020).
16. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828. [CrossRef]
17. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Cogn. Modeling 1988, 5, 1. [CrossRef]
18. Bengio, Y.; Ducharme, R.; Vincent, P.; Jauvin, C. A neural probabilistic language model. J. Mach. Learn. Res. 2003, 3, 1137–1155.
19. Mikolov, T. Statistical Language Models Based on Neural Networks; Presentation at Google, Mountain View, 2 April 2012; Brno University of Technology: Brno, Czech Republic, 2012; Volume 80.
20. Caracciolo, C.; Stellato, A.; Rajbahndari, S.; Morshed, A.; Johannsen, G.; Jaques, Y.; Keizer, J. Thesaurus maintenance, alignment and publication as linked data: The AGROVOC use case. Int. J. Metadata Semant. Ontol. 2012, 7, 65–75. [CrossRef]
21. Weston, J.; Bengio, S.; Usunier, N. Wsabie: Scaling up to large vocabulary image annotation. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Barcelona, Spain, 16–22 July 2011.
22. Socher, R.; Lin, C.C.; Manning, C.; Ng, A.Y. Parsing natural scenes and natural language with recursive neural networks. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), Bellevue, WA, USA, 28 June–2 July 2011; pp. 129–136.
23. Glorot, X.; Bordes, A.; Bengio, Y. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th International Conference on Machine Learning (ICML-11); Omnipress: Madison, WI, USA, 2011; pp. 513–520.
24. Turney, P.D. Distributional semantics beyond words: Supervised learning of analogy and paraphrase. Trans. Assoc. Comput. Linguist. 2013, 1, 353–366. [CrossRef]
25. Turney, P.D.; Pantel, P. From frequency to meaning: Vector space models of semantics. J. Artif. Intell. Res. 2010, 37, 141–188. [CrossRef]
26. Mikolov, T.; Yih, W.; Zweig, G. Linguistic regularities in continuous space word representations. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Atlanta, GA, USA, 9–14 June 2013; pp. 746–751.
27. Eckart, C.; Young, G. The approximation of one matrix by another of lower rank. Psychometrika 1936, 1, 211–218. [CrossRef]
28. Habibi, M.; Weber, L.; Neves, M.; Wiegandt, D.L.; Leser, U. Deep learning with word embeddings improves biomedical named entity recognition. Bioinformatics 2017, 33, i37–i48. [CrossRef]
29. Drozd, A.; Gladkova, A.; Matsuoka, S. Word embeddings, analogies, and machine learning: Beyond king-man+woman=queen. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–17 December 2016; pp. 3519–3530.
30. Tshitoyan, V.; Dagdelen, J.; Weston, L.; Dunn, A.; Rong, Z.; Kononova, O.; Persson, K.A.; Ceder, G.; Jain, A. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 2019, 571, 95–98. [CrossRef]
31. Ispirova, G.; Eftimov, T.; Seljak, B.K. Comparing Semantic and Nutrient Value Similarities of Recipes. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 5131–5139.
32. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781.
33. Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.S.; Dean, J. Distributed representations of words and phrases and their compositionality. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 3111–3119.
34. Pennington, J.; Socher, R.; Manning, C. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
35. Le, Q.; Mikolov, T. Distributed representations of sentences and documents. In Proceedings of the International Conference on Machine Learning, Beijing, China, 21–26 June 2014; pp. 1188–1196.
36. Nickel, M.; Kiela, D. Poincaré embeddings for learning hierarchical representations. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 6338–6347.
37. Yang, Z.; Cohen, W.; Salakhutdinov, R. Revisiting semi-supervised learning with graph embeddings. In Proceedings of the 33rd International Conference on Machine Learning; Balcan, M.F., Weinberger, K.Q., Eds.; PMLR: New York, NY, USA, 2016; Volume 48, pp. 40–48.
38. Ristoski, P.; Paulheim, H. RDF2Vec: RDF graph embeddings for data mining. In Proceedings of the International Semantic Web Conference, Hyogo, Japan, 17–21 October 2016; Springer: Cham, Switzerland, 2016; pp. 498–514.
39. Eftimov, T.; Popovski, G.; Valenčič, E.; Seljak, B.K. FoodEx2vec: New foods' representation for advanced food data analysis. Food Chem. Toxicol. 2020, 138, 111169. [CrossRef]
40. European Food Safety Authority. The food classification and description system FoodEx2 (revision 2). EFSA Supporting Publ. 2015, 12, 804E.
41. Van der Laan, M.; Pollard, K.; Bryan, J. A new partitioning around medoids algorithm. J. Stat. Comput. Simul. 2003, 73, 575–584. [CrossRef]
42. Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987, 20, 53–65. [CrossRef]
43.
The European Food Safety Authority. Available online: https://www.efsa.europa.eu/en/data/food-
consumption-data (accessed on 11 May 2020).
44.
Authority, E.F.S. Use of the EFSA comprehensive European food consumption database in exposure
assessment. EFSA J. 2011,9, 2097. [CrossRef]
45.
European commission health and consumers directorate-general GUIDANCE DOCUMENT FOR
COMPETENT AUTHORITIES FOR THE CONTROL OF COMPLIANCE WITH EU LEGISLATION ON:
Regulation (EU) No 1169/2011 of the European Parliament and of the Council of 25 October 2011 on the
provision of food information to consumers, amending Regulations (EC) No 1924/2006 and (EC) No 1925/2006
of the European Parliament and of the Council, and repealing Commission Directive 87/250/EEC, Council
Directive 90/496/EEC, Commission Directive 1999/10/EC, Directive 2000/13/EC of the European Parliament
and of the Council, Commission Directives 2002/67/EC and 2008/5/EC and Commission Regulation (EC) No
608/2004Devlin. Available online: https://ec.europa.eu/food/sites/food/files/safety/docs/labelling_nutrition-
supplements-guidance_tolerances_1212_en.pdf (accessed on 11 May 2020).
46.
European Commission. Regulation (EU) No 1169/2011 of the European Parliament and of the Council of 25
October 2011 on the provision of food information to consumers, amending Regulations (EC) No 1924/2006
and (EC) No 1925/2006 of the European Parliament and of the Council, and repealing Commission Directive
87/250/EEC, Council Directive 90/496/EEC, Commission Directive 1999/10/EC, Directive 2000/13/EC of the
European Parliament and of the Council, Commission Directives 2002/67/EC and 2008/5/EC and Commission
Regulation (EC) No 608/2004. O. J. Eur. Union L 2011,304, 18–63.
47.
Korenius, T.; Laurikkala, J.; Järvelin, K.; Juhola, M. Stemming and lemmatization in the clustering of
finnish text documents. In Proceedings of the thirteenth ACM international conference on Information and
knowledge management, Washington, DC, USA, 8–13 November 2004; pp. 625–633.
48.
Rehurek, R.; Sojka, P. Gensim—Statistical Semantics In Python; NLP Centre, Faculty of Informatics, Masaryk
University: Brno, Czech Republic, 2011.
49.
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.;
Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res.
2011
,12,
2825–2830.
50.
Eftimov, T.; Korošec, P.; Korouši´c Seljak, B. StandFood: Standardization of foods using a semi-automatic
system for classifying and describing foods according to FoodEx2. Nutrients 2017,9, 542. [CrossRef]
51.
Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for
language understanding. arXiv 2018, arXiv:1810.04805.
52.
Sun, Y.; Wang, S.; Li, Y.; Feng, S.; Chen, X.; Zhang, H.; Tian, X.; Zhu, D.; Tian, H.; Wu, H. Ernie: Enhanced
representation through knowledge integration. arXiv 2019, arXiv:1904.09223.
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
... Existing approaches struggle to effectively address these two tasks in everyday situations. For the first task, unsupervised learning models are used to analyze textual nutritional data [25,26], but they fail to handle unstructured patient descriptions. Thus, most research works either assume prior knowledge of the nutrients in the patient's meals [6,43] or require the patient to adhere to a predefined set of food intake [18,27]. ...
... We compare the performance of DIETS with the following state-of-the-art models. P-Nut [26] is a machine learning pipeline for predicting macronutrient values of foods using unsupervised learning for clustering, ...
... We use the record of home-cooked dishes in the BOOHEE dataset to evaluate the capability of tailored LLMs for the dietary analysis module. We compare the performance of several popular LLMs, including GPT-4 [1], GPT-4o [39], Mistral-22B [28], Llama3 [45], Gemini [44], and one existing work, P-Nut [26]. ChatGPT cannot analyze the dietary data in this experiment and is thus excluded. ...
Preprint
Full-text available
People with diabetes need insulin delivery to effectively manage their blood glucose levels, especially after meals, because their bodies either do not produce enough insulin or cannot fully utilize it. Accurate insulin delivery starts with estimating the nutrients in meals and is followed by developing a detailed, personalized insulin injection strategy. These tasks are particularly challenging in daily life, especially without professional guidance. Existing solutions usually assume prior knowledge of the nutrients in meals and primarily rely on feedback from professional clinicians or simulators to develop Reinforcement Learning-based models for insulin management, leading to extensive consumption of medical resources and difficulties in adapting the models to new patients due to individual differences. In this paper, we propose DIETS, a novel diabetic insulin management framework built on the transformer architecture, to help people with diabetes effectively manage insulin delivery in everyday life. Specifically, DIETS tailors a Large Language Model (LLM) to estimate the nutrients in meals and employs a titration model to generate recommended insulin injection strategies, which are further validated by a glucose prediction model to prevent potential risks of hyperglycemia or hypoglycemia. DIETS has been extensively evaluated on three public datasets, and the results show it achieves superior performance in providing effective insulin delivery recommendations to control blood glucose levels.
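To make the three-stage flow the DIETS abstract describes more concrete, here is a minimal Python control-flow sketch. Every function body is a hypothetical placeholder and every name is invented for illustration; the real system uses a tailored LLM, a titration model, and a glucose predictor, none of which are reproduced here.

from dataclasses import dataclass

@dataclass
class Nutrients:
    carbs_g: float
    protein_g: float
    fat_g: float

def estimate_nutrients(meal_text: str) -> Nutrients:
    # Stage 1 placeholder: the tailored LLM would map a free-text meal
    # description to nutrient estimates; here we return a fixed guess.
    return Nutrients(carbs_g=60.0, protein_g=25.0, fat_g=15.0)

def propose_dose(n: Nutrients) -> float:
    # Stage 2 placeholder: the titration model; here a crude carb-ratio rule.
    return n.carbs_g / 10.0

def dose_is_safe(dose: float, n: Nutrients) -> bool:
    # Stage 3 placeholder: the glucose-prediction model screens the dose
    # for hyper-/hypoglycemia risk before it is recommended.
    predicted_peak = 120 + 2.0 * n.carbs_g - 20.0 * dose
    return 70 <= predicted_peak <= 180

def recommend(meal_text: str):
    n = estimate_nutrients(meal_text)
    dose = propose_dose(n)
    return dose if dose_is_safe(dose, n) else None

print(recommend("rice bowl with chicken and vegetables"))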
... Recipe1M is the only publicly available recipe dataset that has structured data -separated list of ingredients, quantities and measurements, as well as nutrient values per recipe and per ingredient. In [1] we present a ML pipeline (called P-NUT), for predicting nutrient values of a food item considering learned vector representations of text describing the food item. Based on this ML pipeline, in [2] we propose a domain heuristic for merging text embeddings. ...
... Values using Text Descriptions: P-NUT [1] is a novel ML pipeline for learning predictive models that incorporates the domain knowledge encoded in external semantic resources. The ML pipeline (P-NUT) consists of three parts: a. Representation learning: Introduced by Mikolov et al. in 2013 [33] and Pennington et al. in 2014 [34], word embeddings have become indispensable for natural language processing (NLP) tasks in the past couple of years, and they have enabled various ML models that rely on vector representation as an input to benefit from these high-quality representations of text input. ...
... Using the concept of our proposed ML pipeline presented in [1] and [35] we constructed a representation learning pipeline (presented in [2]) in order to explore how the prediction results change when, instead of using the vector representations of the recipe description, we use the embeddings of the list of ingredients. The nutrient content of one food depends on its ingredients; therefore, the text of the ingredients contains more relevant information. ...
Preprint
Although recipe data are very easy to come by nowadays, it is really hard to find a complete recipe dataset - one with a list of ingredients, nutrient values per ingredient and per recipe, allergens, etc. Recipe datasets are usually collected from social media websites where users post and publish recipes. They are usually written with little to no structure, using both standardized and non-standardized units of measurement. We collect six different recipe datasets, publicly available, in different formats, some including data in different languages. Bringing all of these datasets to the format needed for applying a machine learning (ML) pipeline for nutrient prediction [1], [2] includes data normalization using dictionary-based named entity recognition (NER), rule-based NER, as well as conversions using external domain-specific resources. From the list of ingredients, domain-specific embeddings are created using the same embedding space for all recipes - one ingredient dataset is generated. The result of this normalization process is two corpora - one with predefined ingredient embeddings and one with predefined recipe embeddings. The ML pipeline is evaluated on all six recipe datasets. The results from this use case also confirm that the embeddings merged using the domain heuristic yield better results than the baselines.
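As a rough illustration of the two embedding routes this preprint compares, the following sketch uses gensim (the library the original pipeline itself relies on) to build a Doc2Vec paragraph vector for a recipe title and, separately, to merge Word2Vec ingredient vectors. The plain mean used for merging is a stand-in, not the preprint's actual domain heuristic, and the two toy recipes are invented.

from gensim.models import Word2Vec
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
import numpy as np

recipes = [
    ("chocolate cake", ["flour", "sugar", "cocoa", "butter", "eggs"]),
    ("greek salad", ["tomato", "cucumber", "feta", "olive_oil"]),
]

# Route 1: a paragraph embedding of the recipe title.
docs = [TaggedDocument(title.split(), [i]) for i, (title, _) in enumerate(recipes)]
d2v = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)

# Route 2: merge ingredient word vectors; a plain mean stands in for the
# preprint's domain heuristic.
w2v = Word2Vec([ings for _, ings in recipes], vector_size=50, min_count=1, epochs=40)

def merged_recipe_vector(ingredients):
    return np.mean([w2v.wv[i] for i in ingredients if i in w2v.wv], axis=0)

print(d2v.dv[0][:5])                            # title embedding of recipe 0
print(merged_recipe_vector(recipes[0][1])[:5])  # merged ingredient embedding

The cited work motivates the second route by noting that nutrient content depends on the ingredients rather than the title, so the ingredient text carries the more relevant signal.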
... A novel ML/DM pipeline for learning predictive models that incorporates the domain knowledge encoded in external semantic resources. This work has been presented in a peer-reviewed journal article [39]. ...
... SC4. Evaluation of the proposed ML/DM pipeline by benchmarking it against models obtained without incorporating the domain knowledge in a predictive task. This work has been presented in a peer-reviewed journal article [39] and a conference paper [40]. ...
... Nutrient content can vary a lot from one complex food to another, even though they have roughly the same types of ingredients, which complicates nutrient tracking and calculation and makes predicting nutrient content extremely difficult. In Chapter 3, we proposed an approach – an ML/DM pipeline, published in [39] (called P-NUT), for predicting nutrient values of a food item considering learned vector representations of text describing the food item. The ML/DM pipeline consists of three parts: an RL part – learning vector representations from short text descriptions of recipes; unsupervised ML – introducing domain knowledge for obtaining separate clusters of data; and supervised ML – obtaining predictions for the nutrient values of the recipes. ...
Thesis
Full-text available
Human knowledge about food and nutrition has evolved drastically with time. With food and nutrition-related data being mass produced and easily accessible, the next step is to use Artificial Intelligence (AI) to translate data into knowledge. The majority of AI research is model-driven, and classical Machine Learning (ML) pipelines concentrate on the model-centric approach, prioritizing training the best model for a specific task, with the main focus on improving model parameters, overlooking the importance of data. We propose a novel ML pipeline that fuses data- and domain-driven knowledge for a predictive task from the Food and Nutrition domain – fast prediction of nutrient values from unstructured recipe text. Our proposed pipeline consists of three parts: representation learning (RL), unsupervised ML, and supervised ML. In the RL part, word and paragraph embeddings are learned for short text descriptions of foods (recipe titles); in the unsupervised ML part, the recipes are separated into clusters based on a domain-specific coding (FoodEx2 classification) from an external domain resource; and in the supervised ML part, the two are combined – separate predictive models are trained for each cluster for separate nutrients, using the learned embeddings as input features. The pipeline is evaluated with criteria defined using domain knowledge (nutrient tolerance levels) and compared to baselines calculated using the same criteria. As the evaluation results showed that including the domain knowledge in the unsupervised ML part improved the results compared to the baseline, we propose an alteration of the ML pipeline. We include two different external sources of domain knowledge for clustering in the unsupervised ML part, to explore the domain bias for the same prediction task. To further improve the ML pipeline, we include domain knowledge in the RL part of the pipeline. Instead of obtaining recipe title embeddings, we introduce a domain heuristic for merging embeddings of the ingredients of the recipe. This proved to be a successful way to train high-performing predictive models for nutrient values, as the accuracies obtained were significantly higher than the baseline. As the domain-specific embeddings proved to be highly performant, through the process of data normalization using dictionary- and rule-based Named Entity Recognition and data mapping to a Food Composition Database from six heterogeneous multilingual recipe datasets, we composed two predefined corpora of embeddings – ingredient and recipe embeddings. Training embeddings tailored for a specific task is a very time-consuming process; therefore, these corpora of predefined embeddings can be used for research purposes as well as transferred to other tasks for application purposes. To explore the major impact data have on model performance, we focused on the generalization of predictive models by defining a generalizability index that indicates how much a predictive model learned on one dataset can be trusted when transferred to another. Going a step further to show the importance of data in predictive modeling, we examine different ways of selecting a representative training dataset, and the results demonstrate how different selections produce different outcomes. The training data should be representative of the data expected in deployment, covering all variations that deployment data will present.
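The cluster-then-regress core of this pipeline can be sketched in a few lines of scikit-learn. Note the assumptions: KMeans stands in for the medoid-based clustering the thesis uses, the embeddings and nutrient targets below are random placeholders, and the silhouette score (as in reference [42]) picks the number of clusters.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))    # recipe embeddings (placeholder data)
y = rng.uniform(0, 60, size=300)  # target nutrient, e.g. carbohydrates g/100 g

# Pick the number of clusters by silhouette score (2..7).
def labels_for(k):
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

best_k = max(range(2, 8), key=lambda k: silhouette_score(X, labels_for(k)))
labels = labels_for(best_k)

# Train one single-target regressor per cluster.
models = {c: RandomForestRegressor(random_state=0).fit(X[labels == c], y[labels == c])
          for c in np.unique(labels)}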
... For the group of problems concerned with determining nutrient content, various machine learning models have likewise been studied and developed to address different characteristics. The P_NUT model uses natural language processing (NLP) to predict the content of macronutrients, including fat, protein and carbohydrates, of a food based on a text description of that food [9]. A newer version of this model can even predict fat, protein and carbohydrate content from a cooking recipe [10]. ...
Article
Understanding the nutritional content of food after processing is important for the food processing industry. Choosing the appropriate processing method allows users to retain healthy micronutrients. In fact, collecting nutrient information for food before and after processing poses many challenges due to biological changes and interactions of nutritional components. Currently, the approach is to collect data on each nutritional component before and after processing. Conventional machine learning models will then use this data to produce good prediction results, but with limited stability. Therefore, we proposed using a deep learning model trained on a dataset of 27 nutritional components that change through two processing methods, boiling and frying, extracted from a United States standard reference dataset. The results show accurate predictions and an 8.6% improvement in stability. The study shows the potential of improving deep learning models in predicting post-processing nutritional composition in food processing.
... There are 12,844 food entities in the FoodBase corpus, including 2,105 different foods. Ispirova et al. [10] grouped nutritional data from Slovenian food consumption data. There were 3,265 food data points in the sample. ...
... Previously published approaches to estimating the added sugar content of foods are often time-intensive and rely on several, often manual, steps that pose challenges when incorporated into an automated process [9][10][11]. More recently, fully automated supervised machine learning-based approaches have been successfully employed to estimate nutrient profiles [12][13][14][15]. Most importantly, Davies et al. [15] showed that an approach using the k nearest neighbors (kNN) method can be successfully used on a curated dataset which includes category labels and complete ingredient information. ...
Article
Full-text available
Obesity and diabetes have emerged as an increasing threat to public health, and the consumption of added sugar can contribute to their development. Though nutritional content information can positively influence consumption behavior, added sugar is not currently required to be disclosed in all countries. However, a growing proportion of the world’s population has access to mobile devices, which allow for the development of digital solutions to support health-related decisions and behaviors. To test whether advances in computational science can be leveraged to develop an accurate and scalable model to estimate the added sugar content of foods based on their nutrient profile, we collected comprehensive nutritional information, including information on added sugar content, for 69,769 foods. Eighty percent of this data was used to train a gradient boosted tree model to estimate added sugar content, while 20% of it was held out to assess the predictive accuracy of the model. The performance of the resulting model showed 93.25% explained variance per default portion size (84.32% per 100 kcal). The mean absolute error of the estimate was 0.84 g per default portion size (0.81 g per 100 kcal). This model can therefore be used to deliver accurate estimates of added sugar through digital devices in countries where the information is not disclosed on packaged foods, thus enabling consumers to be aware of the added sugar content of a wide variety of foods.
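Under the setup the abstract states (80/20 split, gradient boosted trees, explained variance and mean absolute error on the held-out portion), a minimal scikit-learn sketch on synthetic stand-in data looks as follows; the study's actual 69,769-food dataset is of course not reproduced here.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import explained_variance_score, mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(size=(2000, 10))                     # nutrient profile per food (synthetic)
y = 20 * X[:, 0] + rng.normal(scale=1.0, size=2000)  # added sugar, g (synthetic)

# 80% for training, 20% held out for evaluation.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)
model = GradientBoostingRegressor(random_state=1).fit(X_tr, y_tr)
pred = model.predict(X_te)
print(explained_variance_score(y_te, pred), mean_absolute_error(y_te, pred))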
... Several models have addressed attributes related to nutrient profiles. Natural language processing (NLP) methods were used to predict the macronutrient (proteins, fats and carbohydrates) content of foods from a text description of the food [19]. USDA investigators predicted the content of 3 label nutrients (carbohydrates, protein and sodium) in processed foods from the ingredient list, using the Branded Foods datatype in Food Data Central (FDC) [20]. ...
Preprint
Full-text available
The future of personalized health relies on knowledge of dietary composition. The current analytical methods are impractical to scale up, and the computational methods are inadequate. We propose machine learning models to predict the nutritional profiles of cooked foods given the raw food composition and cooking method, for a variety of plant- and animal-based foods. Our models (trained on USDA's SR dataset) were on average 31% better than baselines, based on the RMSE metric, and particularly good for leafy green vegetables and various cuts of beef. We also identified and remedied a bias in the data caused by representation of composition per 100 grams. The scaling methods are based on a process-invariant nutrient, and the scaled data improve prediction performance. Finally, we advocate for an integrated approach of data analysis and modeling when generating future composition data to make the task more efficient and less costly, and to support the development of reliable models.
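The scaling idea can be illustrated with a small hypothetical pandas example: if some nutrient is assumed process-invariant, the cooked row is rescaled so that nutrient matches its raw value, removing the per-100 g yield bias before modeling. The numbers below and the choice of calcium as the invariant are illustrative only, not taken from the study.

import pandas as pd

df = pd.DataFrame({
    "food": ["spinach_raw", "spinach_boiled"],
    "iron_mg": [2.7, 3.6],        # per 100 g as reported
    "calcium_mg": [99.0, 136.0],  # treated as process-invariant here
})

# Rescale so the invariant nutrient matches its raw value across rows,
# putting raw and cooked composition on a comparable basis.
ref = df.loc[df["food"] == "spinach_raw", "calcium_mg"].item()
df["iron_mg_scaled"] = df["iron_mg"] * ref / df["calcium_mg"]
print(df[["food", "iron_mg", "iron_mg_scaled"]])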
Article
The information on nutritional profile of cooked foods is important to both food manufacturers and consumers, and a major challenge to obtaining precise information is the inherent variation in composition across biological samples of any given raw ingredient. The ideal solution would address precision and generability, but the current solutions are limited in their capabilities; analytical methods are too costly to scale, retention-factor based methods are scalable but approximate, and kinetic models are bespoke to a food and nutrient. We provide an alternate solution that predicts the micronutrient profile in cooked food from the raw food composition, and for multiple foods. The prediction model is trained on an existing food composition dataset and has a 31% lower error on average (across all foods, processes and nutrients) than predictions obtained using the baseline method of retention-factors. Our results argue that data scaling and transformation prior to training the models is important to mitigate any yield bias. This study demonstrates the potential of machine learning methods over current solutions, and additionally provides guidance for the future generation of food composition data, specifically for sampling approach, data quality checks, and data representation standards.
Article
Full-text available
Globally, cervical cancer remains the most prevalent cancer in females. Hence, it is necessary to distinguish the importance of risk factors of cervical cancer to classify potential patients. The present work proposes a cervical cancer prediction model (CCPM) that offers early prediction of cervical cancer using risk factors as inputs. The CCPM first removes outliers using outlier detection methods such as density-based spatial clustering of applications with noise (DBSCAN) and isolation forest (iForest), then increases the number of cases in the dataset in a balanced way, for example through the synthetic minority over-sampling technique (SMOTE) or SMOTE with Tomek links (SMOTETomek). Finally, it employs random forest (RF) as a classifier. Thus, CCPM rests on four scenarios: (1) DBSCAN + SMOTETomek + RF, (2) DBSCAN + SMOTE + RF, (3) iForest + SMOTETomek + RF, and (4) iForest + SMOTE + RF. A dataset of 858 potential patients was used to validate the performance of the proposed method. We found that combinations of iForest with SMOTE and iForest with SMOTETomek provided better performance than those of DBSCAN with SMOTE and DBSCAN with SMOTETomek. We also observed that RF performed the best among several popular machine learning classifiers. Furthermore, the proposed CCPM showed better accuracy than previously proposed methods for forecasting cervical cancer. In addition, a mobile application that can collect cervical cancer risk factor data and provide results from CCPM is developed for instant and proper action at the initial stage of cervical cancer.
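Scenario 4 of CCPM (iForest + SMOTE + RF) maps directly onto standard library calls. The sketch below uses synthetic data shaped like the study's 858-case dataset rather than the real patient records; SMOTE comes from the separate imbalanced-learn package.

import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest, RandomForestClassifier

# Synthetic imbalanced stand-in for the 858-patient dataset.
X, y = make_classification(n_samples=858, weights=[0.94], random_state=0)

# 1) Remove outliers flagged by Isolation Forest (label -1).
keep = IsolationForest(random_state=0).fit_predict(X) == 1
X, y = X[keep], y[keep]

# 2) Balance the minority class with SMOTE.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)

# 3) Classify with a random forest.
clf = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)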
Article
Full-text available
Food is essential for human life and has been the concern of many healthcare conventions. Nowadays, new dietary assessment and nutrition analysis tools enable more opportunities to help people understand their daily eating habits, explore nutrition patterns and maintain a healthy diet. In this paper, we develop a deep model-based food recognition and dietary assessment system to study and analyze food items from daily meal images (e.g., captured by smartphone). Specifically, we propose a three-step algorithm to recognize multi-item (food) images by detecting candidate regions and using a deep convolutional neural network (CNN) for object classification. The system first generates multiple region proposals on input images by applying the Region Proposal Network (RPN) derived from the Faster R-CNN model. It then identifies each region proposal by mapping it into feature maps, classifies it into a food category, and locates it in the original image. Finally, the system analyzes the nutritional ingredients based on the recognition results and generates a dietary assessment report by calculating the amount of calories, fat, carbohydrate and protein. In the evaluation, we conduct extensive experiments using two popular food image datasets - UEC-FOOD100 and UEC-FOOD256. We also generate a new type of dataset about food items based on FOOD101, with bounding boxes. The model is evaluated through different evaluation metrics. The experimental results show that our system is able to recognize the food items accurately and generate the dietary assessment report efficiently, which will benefit users with clear insight into a healthy diet and guide their daily recipes to improve body health and wellness.
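The proposal-then-classify pattern this paper builds on is available off the shelf in torchvision. The sketch below runs a generic COCO-pretrained Faster R-CNN (not the paper's food-tuned model) on a placeholder image and keeps the confident candidate regions.

import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Generic pretrained detector (torchvision >= 0.13 weights API).
model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

image = torch.rand(3, 480, 640)  # placeholder meal photo, RGB in [0, 1]
with torch.no_grad():
    out = model([image])[0]      # dict with boxes, labels, scores per region
keep = out["scores"] > 0.8       # keep confident candidate regions
print(out["boxes"][keep], out["labels"][keep])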
Article
Full-text available
The overwhelming majority of scientific knowledge is published as text, which is difficult to analyse by either traditional statistical analysis or modern machine learning methods. By contrast, the main source of machine-interpretable data for the materials research community has come from structured property databases [1,2], which encompass only a small fraction of the knowledge present in the research literature. Beyond property values, publications contain valuable knowledge regarding the connections and relationships between data items as interpreted by the authors. To improve the identification and use of this knowledge, several studies have focused on the retrieval of information from scientific literature using supervised natural language processing [3–10], which requires large hand-labelled datasets for training. Here we show that materials science knowledge present in the published literature can be efficiently encoded as information-dense word embeddings [11–13] (vector representations of words) without human labelling or supervision. Without any explicit insertion of chemical knowledge, these embeddings capture complex materials science concepts such as the underlying structure of the periodic table and structure–property relationships in materials. Furthermore, we demonstrate that an unsupervised method can recommend materials for functional applications several years before their discovery. This suggests that latent knowledge regarding future discoveries is to a large extent embedded in past publications. Our findings highlight the possibility of extracting knowledge and relationships from the massive body of scientific literature in a collective manner, and point towards a generalized approach to the mining of scientific literature.
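The analogy-style queries such embeddings support can be reproduced with any pretrained word vectors. The sketch below uses gensim's downloadable GloVe vectors as a stand-in, since the materials-science model itself is not bundled here; it answers the classic king − man + woman query from reference [29].

import gensim.downloader as api

# Small general-purpose vectors (downloads on first use).
wv = api.load("glove-wiki-gigaword-50")
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))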
Article
Full-text available
There has been a rapid increase in dietary ailments during the last few decades, caused by unhealthy food routines. Mobile-based dietary assessment systems that can record real-time images of meals and analyze them for nutritional content can be very handy and improve dietary habits, therefore resulting in a healthier life. This paper proposes a novel system to automatically estimate food attributes such as ingredients and nutritional value by classifying the input image of food. Our method employs different deep learning models for accurate food identification. In addition to image analysis, attributes and ingredients are estimated by extracting semantically related words from a huge corpus of text collected over the Internet. We performed experiments with a dataset comprising 100 classes, averaging 1000 images per class, achieving a top-1 classification rate of up to 85 percent. An extension of the benchmark dataset Food-101 is also created to include sub-continental foods. Results show that our proposed system is equally efficient on the basic Food-101 dataset and its extension for sub-continental foods. The proposed system is implemented as a mobile app that has its application in the healthcare sector.
Chapter
Computational modeling plays a central role in cognitive science. This book provides a comprehensive introduction to computational models of human cognition. It covers major approaches and architectures, both neural network and symbolic; major theoretical issues; and specific computational models of a variety of cognitive processes, ranging from low-level (e.g., attention and memory) to higher-level (e.g., language and reasoning). The articles included in the book provide original descriptions of developments in the field. The emphasis is on implemented computational models rather than on mathematical or nonformal approaches, and on modeling empirical data from human subjects. Bradford Books imprint
Article
In food and toxicology science, a huge amount of research and other data has been collected. To enable its full utilization, advanced statistical and computer methods are required. All data are related to food items, but additionally include different kinds of information. Nowadays, the consumption of avocado has increased. To understand the full impact of this increased consumption on public health and the environment, different data related to avocado need to be considered. In this paper, we present an approach for representing foods in the form of vectors of continuous numbers (food embeddings) as an alternative solution to manual indexing. The utility of representing food data as a vector of continuous numbers was evaluated and demonstrated in four tasks: i) automated determination of different food groups, ii) automated detection of the food class for each food concept (raw, derivative or composite), iii) identification of the most similar food concepts for a given food concept, and iv) qualitative evaluation by a food expert. The experimental results showed that these kinds of vector representations outperform the traditional representational methods used for food data analysis, and thus present a step forward to more advanced food data analysis for discovering new knowledge.
Article
Recently, mobile applications for recording everyday meals have drawn much attention for self-managed dietary assessment. However, most of the applications return food calorie values simply associated with the estimated food categories, or require users to indicate the rough amounts of foods manually. In fact, estimating food calories from a food photo with practical accuracy has not yet been achieved, and it remains an unsolved problem. In this paper, we propose estimating food calories from a food photo by simultaneous learning of food calories, categories, ingredients and cooking directions using deep learning. Since there exists a strong correlation between food calories and food category, ingredient and cooking-direction information in general, we expect that training them simultaneously boosts performance compared to independent single-task training. To this end, we use a multi-task CNN. In addition, in this research, we construct two datasets: one of calorie-annotated recipes collected from Japanese recipe sites on the Web, and one collected from an American recipe site. In the experiments, we trained both multi-task and single-task CNNs and compared them. As a result, the multi-task CNN achieved better performance on both food category estimation and food calorie estimation than the single-task CNNs. For the Japanese recipe dataset, introducing the multi-task CNN improved the correlation coefficient by 0.039, while for the American recipe dataset it rose by 0.090 compared to the result of the single-task CNN. In addition, we showed that the proposed multi-task CNN-based method outperformed previously proposed search-based methods.
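A multi-task head of the kind this paper describes is straightforward to express in PyTorch. The sketch below is a deliberately tiny stand-in network (not the paper's CNN, and all names are invented) showing how a shared backbone feeds a calorie-regression head and a category-classification head trained under one joint loss.

import torch
import torch.nn as nn

class MultiTaskFoodNet(nn.Module):
    def __init__(self, n_categories=100):
        super().__init__()
        self.backbone = nn.Sequential(                     # tiny stand-in CNN
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.calorie_head = nn.Linear(16, 1)               # regression head
        self.category_head = nn.Linear(16, n_categories)   # classification head

    def forward(self, x):
        h = self.backbone(x)
        return self.calorie_head(h).squeeze(1), self.category_head(h)

model = MultiTaskFoodNet()
images = torch.randn(4, 3, 224, 224)
kcal, logits = model(images)
# One joint objective: MSE on calories plus cross-entropy on categories.
loss = nn.functional.mse_loss(kcal, torch.rand(4) * 500) \
     + nn.functional.cross_entropy(logits, torch.randint(0, 100, (4,)))
loss.backward()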