Computer Vision-Based Wood Identification: A Review

Abstract

Wood identification is an important tool in many areas, from biology to cultural heritage. In the fight against illegal logging, it has a more necessary and impactful application. Identifying a wood sample to genus or species level is difficult, expensive and time-consuming, even when using the most recent methods, resulting in a growing need for a readily accessible and field-applicable method for scientific wood identification. Providing fast results and ease of use, computer vision-based technology is an economically accessible option currently applied to meet the demand for automated wood identification. However, despite the promising characteristics and accurate results of this method, it remains a niche research area in wood sciences and is little known in other fields of application such as cultural heritage. To share the results and applicability of computer vision-based wood identification, this paper reviews the most frequently cited and relevant published research based on computer vision and machine learning techniques, aiming to facilitate and promote the use of this technology in research and encourage its application among end-users who need quick and reliable results.
Forests 2022, 13, 2041.
José Luís Silva 1,*, Rui Bordalo 1, José Pissarra 2 and Paloma de Palacios 3
1 Research Center for the Science and Technology of the Arts, School of Arts, Universidade Católica
Portuguesa, 4169-005 Porto, Portugal
2 Green UPorto and Department of Biology, Faculty of Sciences, University of Porto, 4169-005 Porto, Portugal
3 Department of Natural Systems and Resources, School of Forestry and Natural Environment Engineering,
Universidad Politécnica de Madrid, 28040 Madrid, Spain
* Correspondence:
Keywords: computer vision; machine learning; deep learning; convolutional neural networks;
image recognition; wood anatomy; wood identification; illegal logging
1. Introduction
Illegal logging is one of the most pressing environmental issues, particularly in trop-
ical countries with large forest areas and botanical groups that are highly valued in inter-
national markets. Illegal logging is currently the most profitable ecological crime world-
wide, accounting for 10 to 30% of the global timber trade [1,2].
Although Amazonian forests are traditionally seen as the hotspot of illegal logging,
areas such as Southeast Asia, Central Africa and Russia, home to roughly 60% of the
world’s forests, are unfortunately experiencing a surge in this crime [1,3]. The financial
impact of illegal logging is estimated at 52 to 157 billion dollars a year [1], but, more im-
portantly, the environmental damage, in many cases irreversible, can also have a global
ecological impact [4].
Several institutional and international legal measures have been put in place to prevent
overexploitation and irreversible loss of species and habitats [5–7]. Innovative programmes
are emerging [8,9], solid research is under way [10–14], and research with significant
impact and news items are being shared worldwide [15].
The impact of wood identification extends beyond illegal trading and ecological issues.
Wood identification is paramount for the timber industry, civil and structural engineering,
criminology, archaeology, art history, ethnography, conservation and restoration, and many
other disciplines.
Despite the multiple wood identification methods now available, the varied results,
costs, accessibility, deployment time and limiting factors hinder their applicability to real-
world identification. This paper presents an overview of the changes that have occurred
in wood identification methods and a review of computer vision-based wood identifica-
tion, which is currently one of the fastest developing research areas in artificial intelligence
(AI) with very promising results and high identification accuracy. In this technique, visual
data are processed from any given image to extract the relevant features in order to make
a decision.
Analogic and Digital Systems
Historically, wood identification methods mainly comprised the study of the chemical,
physical and anatomical features of wood. Methods such as macroscopy, which uses the physical
characteristics of wood observable to the naked eye or with a 10× hand lens, and microscopy,
which uses compound light microscopes to observe the multiple cell types that constitute the
wood, were the first to be used. The main limitation of these methods is that wood cannot
always be identified at the species level. As a result, multiple techniques have emerged, such
as near-infrared spectroscopy [16–18], DNA barcoding [19–21], mass spectrometry [22–24], and
X-ray tomography [25–27], with optical microscopy still used as a confirmation method for the
results of these techniques.
However, an important contribution was made with the advent of computer-based technologies,
which rapidly became a preferred option for constructing species databases and hosting
identification tools. Several wood identification databases and software based on digital
technology have been made available, including GUESS [28,29], CSIROID [30] and the DELTA
system [31], three of the most significant early programmes in achieving the goal of wood
identification. As proofs of concept, these systems were fundamental to the development of
what is today defined as computer-assisted wood identification. They represented considerable
progress over earlier methods, especially with regard to the time required and identification
accuracy.
The DELTA-Intkey for commercial timbers is the only one of these three systems still in
2. Online Reference Databases for Wood Identification
This section briefly describes all the digital reference databases available online, to
the best of our knowledge. The common objective of online identification keys is to enable
and facilitate analysis of wood anatomical features and, ultimately, identification of the
wood. Table 1 summarises the computer-assisted wood identification systems mentioned
Table 1. Summary of computer-assisted wood identification systems.
System | Number of taxa | Region | Features | Availability | Observations
Commercial timbers: descriptions, illustrations, identification, and information retrieval | 404 hardwoods | Major forest regions of the world | Microscopic descriptions and illustrations | Freely available |
Anatomy of European and North American Woods | 325 hardwoods; 101 softwoods | Europe and North America | Microscopic descriptions and illustrations | Freely available | Includes features adapted to identification of carbonised woods from archaeological contexts
Wood database of the Forestry and Forest Products Research Institute | | | Microscopic descriptions and illustrations | Freely available |
InsideWood | 7653 modern hardwoods; 235 modern softwoods; 2173 fossil hardwoods | | Microscopic descriptions and illustrations | Freely available | 58,146 modern; 3807 fossil; 1482 modern. Includes 61,578 searchable images
Wood anatomy of central European Species | 133 hardwoods and softwoods | | Microscopic descriptions and illustrations | Freely available | Includes macroscopic and microscopic images and descriptions
CITESwoodID | 44 CITES woods; 31 look-alike species | Specific forest regions of the world | Macroscopic descriptions and illustrations | Freely available | Includes abundant extra information
Key to a Selection of Arid Australian Hardwoods and Softwoods | 58 hardwoods and softwoods | | Microscopic descriptions and illustrations | Freely available | Detailed information about each species
Brazilian Commercial Timbers | 275 species | | Macroscopic features; chemical and physical tests | Freely available |
Pl@ntwood | 110 hardwoods | | Microscopic descriptions and illustrations | |
Forest Species Database – Microscopic | 112 hardwoods and softwoods | Tropical forests | Microscopic descriptions and illustrations | Freely available | Includes 2240 searchable microscopic images
Forest Species Database – Macroscopic | 41 hardwoods and softwoods | | Macroscopic descriptions and illustrations | Freely available | Includes 2942 searchable macroscopic images
MacroHOLZdata | 150 hardwoods and softwoods | | Macroscopic descriptions and illustrations | Free of charge on request | Available in English, German and Spanish
Softwoods | 53 softwoods | Major forest regions of the world | Microscopic descriptions and illustrations | Freely available |
Forest Species Classifier | 112 hardwoods and softwoods; 41 hardwoods and softwoods | | Macroscopic and microscopic illustrations | Freely available |
Charcoal | 44 hardwoods | | Microscopic features | Available for research only |
CharKey | 507 hardwoods and softwoods | French Guiana | Microscopic descriptions and illustrations | Freely available | Highly detailed SEM images
Softwood Retrieval System for Coniferous Wood | 180 softwoods | | | Under development |
UTForest – UTFPR Classificador | 44 hardwoods and softwoods | | Macroscopic descriptions | Freely available |
Mader app | | | Microscopic features | | Database with 1000 images per species
n.a. – not available.
2.1. Commercial Timbers: Descriptions, Illustrations, Identification and Information Retrieval
Among the several DELTA-INTKEY online identification keys that have been developed,
including CITESwoodID and Softwoods, this interactive identification key, developed in 2000
and updated in 2018–2019, is an integrated database of microscopic descriptions and
illustrations of 409 internationally traded hardwood taxa [32]. It covers major forest
regions of the world and is freely available online.
2.2. Anatomy of European and North American Woods
Created in 2000, this system includes 426 wood taxa [33] and provides an interactive
identification key for the common non-commercial wood species of Europe and North
America. It includes 325 hardwood species and 101 softwood species (native and intro-
duced) with 145 features and 15 sets. It has two extra sets that isolate the features applica-
ble to identification of modified and carbonised wood from palaeobotanical contexts. A
useful feature, which is missing from the commercial timbers key, is a set that isolates the
features extracted from the IAWA standards.
2.3. Wood Database of the Forestry and Forest Products Research Institute
This online database, created in 2003, focuses on the identification of 781 Japanese tree
species, basing its descriptions on the IAWA list of hardwood features [53]. It includes a
multiple-entry key and an image database and is freely available online [34].
2.4. InsideWood
Developed in 2004, InsideWood is by far the largest and best-known online identifi-
cation key [35]. It is a multiple access key based on the IAWA hardwood list [53] and is
freely available online. It includes keys for hardwoods, softwoods and fossil hardwoods,
and more than 10,030 microscopic anatomic descriptions covering all regions of the world,
with more than 63,435 searchable images. It is a centralised database that integrates all the
anatomical data available for modern wood.
2.5. Wood Anatomy of Central European Species
This is a completely revised and updated version of Schweingruber’s work [54], cre-
ated in 2004 [36] and last updated in 2007. It is a web-based identification key with 133
species, accompanied by macroscopic and microscopic descriptions. It also provides in-
formation such as sample preparation, staining and other procedures, and is freely avail-
able online.
2.6. CITESwoodID
CITES (Convention on International Trade in Endangered Species of Wild Fauna and
Flora) [55] is an agreement that was drawn up in 1963 after a meeting of members of IUCN
(The World Conservation Union). One of many initiatives intended to contribute to the
goals of the agreement, CITESwoodID was developed in 2005 [37] and last updated in
2017. As the name indicates, the platform focuses on CITES species and is an interactive
identification key of macroscopic descriptions with an integrated database. It includes
illustrations of 44 CITES-protected woods and 31 look-alike trade species. It provides
comprehensive, detailed information about each species, with advice on how to avoid
misinterpretations, and numerous explanatory notes on the relevant features and procedures
for description and identification. It is freely available online.
2.7. Key to a Selection of Arid Australian Hardwoods and Softwoods
Stemming from doctoral research [38], this interactive key focuses on Australian
woods. It is hosted on the Lucid website and includes 58 wood-producing species of arid
Australia, particularly non-commercial species. It is mostly based on specimens from
northeast South Australia, southwest Queensland and far western New South Wales and
is freely available online.
2.8. Brazilian Commercial Timbers – Interactive Wood Identification Key
As the name suggests, this is an interactive identification system focusing on Brazil-
ian species. Made available in 2010 [39], it was developed in collaboration with the Forest
Products Laboratory (LPF) and the Brazilian Forest Service (SFB). It is hosted on the Lucid
website and includes 275 species, among them Brazilian CITES-listed timber species. All
the nomenclature was revised in 2020 according to the Brazilian Flora Species List. The
key works by analysing macroscopic features and chemical and physical tests on the
woods. It is freely available online.
2.9. Pl@ntwood
Pl@ntwood [40] was developed in 2011 and is described by the authors as an interac-
tive graphical identification tool based on the IDAO system, specifically designed to be
user friendly. It comprises 110 Amazonian tree species belonging to 34 angiosperm fami-
lies and includes microscopic morphological features.
2.10. The Forest Species Database – Microscopy (FSDM)
Created in 2013, this online database [41,42] comprises 2240 microscopic images of
112 species, 85 genera and 30 families of both hardwoods and softwoods. It is freely avail-
able online.
2.11. The Forest Species Database – Macroscopy (FSDM)
This online database for forest species identification [43,44] was made available in
2014. It includes 2942 macroscopic images of 41 Brazilian forest species and is freely avail-
able online.
2.12. MacroHOLZdata
MacroHOLZdata [45], created in 2002 and made available for the first time in 2016,
is another interactive identification key with an integrated database for macroscopic wood
descriptions. Completely redesigned in 2022, it is available in German, English and Span-
ish, and includes 150 common hardwood and softwood commercial timbers. The database
is free of charge.
2.13. Softwoods
Softwoods [46] is a database that was made available from 2016 onwards. This inter-
active identification key with an integrated database focuses only on gymnosperms, as
the title suggests. It provides microscopic descriptions and illustrations of 53 taxa in the
world’s main forest regions. It is free on request.
2.14. Forest Species Classifier
Made available in 2018, Forest Species Classifier is the result of a master’s degree [47].
It is a user-friendly online database focusing on Brazilian forest species. It uses macro-
scopic [43] and microscopic [41] databases and includes microscopic images of 112 species
and macroscopic images of 41 species, with a total of 5182 images. It is freely available
2.15. UTForest – UTFPR Classificador
This new version of the Forest Species Classifier platform [51] has been available
since 2021. It allows macroscopic identification of 44 native species of Brazil and includes
1318 images.
2.16. Charcoal
Developed in 2018, this database comprises charcoal samples of 44 Brazilian hard-
wood forest species, using 528 images [48,56]. It is available for research purposes only.
2.17. CharKey
This 2019 electronic identification key is described by the authors [49] as the first
computer-aided identification key designed for charcoals from French Guiana. It uses
SEM photographs to illustrate the anatomical features of 507 species belonging to 274 gen-
era and 71 families. Most of the descriptions were taken from Détienne et al. [57], and
follow the IAWA list of microscopic features for hardwood identification [53]. The key
contains 289 “items”, and its main aim is to identify specimens to the genus level. It is
freely available online.
2.18. Softwood Retrieval System (SRS) for Coniferous Wood
The Softwood Retrieval System (SRS) for Coniferous Wood [50] is an online identifi-
cation key with descriptions and micrographs of 180 Chinese coniferous wood species
(155 species with descriptions and microphotographs and 25 species with only micropho-
tographs) from nine families and more than 1000 images showing anatomical details. The
system is searchable by an interactive multiple-entry key. The microphotographs were
collected from slices of 115 coniferous species provided by the Wood Collection of the
Chinese Academy of Forestry (Beijing, CAFw) and 40 coniferous species from the Herbar-
ium of Southwest Forestry University (Kunming, SWFUw). The descriptions use features
from the IAWA List of Microscopic Features for Softwood Identification [58]. The system
supports three retrieval methods for coniferous wood retrieval: species name, anatomical
characteristics, and microscopic anatomical images (in test).
2.19. Mader App
This mobile app is under development. Its goal is to contribute to the global wood
identification effort and the fight against illegal logging using AI [52]. The project com-
prises 26 species, with a vast image database of 1000 images per species, aiming to obtain
maximum intraspecific variability for each species. The images were taken using a porta-
ble microscope and the app aims to obtain real-time recognition of samples. Preliminary
data from the authors indicate that accuracy is 95%. The authors intend to make the app
available soon on the Play Store, and the database used will be freely available for neural
network training [52].
3. Computer Vision-Based Wood Identification
The digital systems described above are the foundation of the systems which, despite their
limitations, are currently used to identify wood, mostly based on computer vision technology.
Computer vision is applicable to several fields of research and industry, including
neurobiology, autonomous vehicles, and facial recognition. Computer vision systems process
visual data from any given image or video to extract the required and relevant features to
make a decision [59].
This image recognition ability, also known as image classification, is one of the most
important research areas in AI and is most frequently based on supervised learning. In
supervised learning, the model learns classification rules from labelled images and then
classifies new input data according to those rules (generally used for image
classification). In unsupervised learning, the model discovers previously unknown structure
in unlabelled data (generally used for image clustering) [60].
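The contrast between the two learning modes can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the two-dimensional feature vectors, the labels "oak" and "pine", and the function names are all hypothetical, and a nearest-centroid rule and a basic k-means loop stand in for the far richer models used in practice.

```python
# Toy 2-D feature vectors; real systems would use features extracted from wood images.
labelled = [((1.0, 1.2), "oak"), ((0.8, 1.0), "oak"),
            ((4.0, 4.2), "pine"), ((4.3, 3.9), "pine")]

def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_centroids(samples):
    """Supervised: learn one centroid per label from labelled examples."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: mean(pts) for label, pts in by_label.items()}

def classify(centroids, features):
    """Assign the label whose centroid is closest to the query features."""
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))

def kmeans(points, k, iterations=10):
    """Unsupervised: group unlabelled points into k clusters (Lloyd's algorithm)."""
    centres = list(points[:k])  # deterministic initialisation, for this sketch only
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(centres[i], p))
            groups[nearest].append(p)
        centres = [mean(g) if g else centres[i] for i, g in enumerate(groups)]
    return [min(range(k), key=lambda i: sq_dist(centres[i], p)) for p in points]

centroids = fit_centroids(labelled)
unlabelled = [features for features, _ in labelled]
clusters = kmeans(unlabelled, k=2)
```

The supervised path returns a species name because names were supplied during training; the unsupervised path returns only anonymous cluster indices, which is why it suits image clustering rather than identification.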
Machine learning can also decide what to do without human assistance from the data
recognised by computer vision (input data), using predesigned algorithms [61,62]. This
removes the need to teach the model the necessary features or procedures for wood iden-
tification [63].
Computer vision technology is very appealing to many researchers because of its verifiable
potential for field application [64] and proven ability to recognise and quantify wood
structure variations that are not easily discernible using strictly “human” analysis. It is
also an affordable resource [65] and, therefore, scalable. However, for the software to
correctly interpret the specific structure of the samples analysed to such a high level of
precision, reference material must be constantly added to the image database so that it can
recognise natural variations in wood structure [12].
Computer vision-based wood identification is the real-world application of combin-
ing two types of software with different approaches within AI [66,67].
Figure 1 shows a pipeline of this method.
Figure 1. General scheme of machine learning method for image classification (based on [63,68]).
3.1. Machine Learning
Machine learning operates primarily as software that recognises patterns from input
images that are processed to define a descriptive structure to which the unknown image
will be referenced [69]. This involves various stages, as follows.
3.2. Image Acquisition
The most frequently used types of image are macroscopic images (obtained without
magnification using a normal digital camera) [70–73], stereograms (stereoscopic images
obtained with hand lens magnification, ca. 10×) [66,67,74–76], micrographs (optical
microscopic images) [77–79], SEM images (up to 10,000×) [80], and X-ray computed tomography
(CT) images [26,81].
Light control and uniformity are significant issues in image processing [67,82,83]. They
are addressed with techniques that filter and normalise image brightness [84–86].
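As an illustration of brightness normalisation, the sketch below applies a linear contrast stretch to a greyscale image stored as a list of rows. This is one simple option chosen for illustration, not the specific method of the cited studies; the function name and the sample images are hypothetical.

```python
def stretch_contrast(image, low=0, high=255):
    """Linearly rescale pixel intensities so that they span [low, high]."""
    flat = [p for row in image for p in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:  # uniform image: there is no contrast to stretch
        return [[low for _ in row] for row in image]
    scale = (high - low) / (brightest - darkest)
    return [[round(low + (p - darkest) * scale) for p in row] for row in image]

# The same pattern photographed under weak and strong lighting.
under_exposed = [[40, 50, 60], [50, 60, 70]]
over_exposed = [[140, 150, 160], [150, 160, 170]]

normalised_a = stretch_contrast(under_exposed)
normalised_b = stretch_contrast(over_exposed)
```

After stretching, both images map to the same intensity values, so a uniform brightness offset between acquisition sessions no longer separates otherwise identical samples.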
3.3. Image Datasets
Image dataset construction or availability is one of the most significant factors among
the multiple issues that can affect the performance of computer vision-based wood iden-
tification systems.
The more extensive the dataset is, the more naturally occurring biological variations
within a species will be accessed and learned by the model. However, because constructing a
dataset of wood samples is such a difficult and time-consuming task, most studies use wood
collections for references [41,70,75,82,87–90].
This limitation is countered to some extent by initiatives such as ImageNet [91]. Aiming to
advance computer vision and deep learning research, the ImageNet dataset was made freely
available to researchers worldwide. It contains 14.2 million images across more than 20,000
classes. A similar process is under way with herbaria digitalisation [92–94]. However,
despite the efforts made [42,79,88,95,96], the lack of free access to worldwide wood image
datasets continues to be the main constraint for computer vision-based wood identification
[63].
Table 2 shows the main currently available datasets that have useful data for com-
puter vision-based wood identification research.
Table 2. Wood image datasets available for computer vision-based wood identification research,
adapted from [63].
Number of
Number of Images
Commercial hardwood species of Malaysia
Commercial hardwood species of Indonesia
Wood species at Zhejiang A&F University
Commercial wood species of Central Africa
Major Fagaceae species of Japan
Lauraceae species of East Asia
Wood species of Greece
Wood species of Brazil
3.4. Image Processing
Machine learning comprises two independent procedures: feature processing, also
known as extraction (extraction of relevant features from input images), and classification
(learning extracted features and querying image classification). There is, however, a pre-
vious step to image processing.
Pre-processing aims to convert the image into data that a specific algorithm can use
to extract the required features, thus reducing computational complexity and facilitating
subsequent processing [100]. The techniques used for this include greyscale conversion
and image cropping [72,75,79,89,101,102], filtering [84,86], image sharpening [75,103] and
denoising [80,104,105].
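Two of the pre-processing steps named above, greyscale conversion and cropping, can be sketched as follows. The code assumes an image held as rows of (R, G, B) tuples and uses the ITU-R BT.601 luma weights for the conversion; the function names are hypothetical, and the reviewed studies may use other weights or libraries.

```python
def to_greyscale(rgb_image):
    """Convert rows of (R, G, B) pixels to greyscale with ITU-R BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]

def centre_crop(image, height, width):
    """Keep the central height x width window of an image stored as a list of rows."""
    top = (len(image) - height) // 2
    left = (len(image[0]) - width) // 2
    return [row[left:left + width] for row in image[top:top + height]]

rgb = [[(255, 255, 255), (0, 0, 0)]]
grey = to_greyscale(rgb)

grid = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4 x 4 test pattern
patch = centre_crop(grid, 2, 2)
```

Greyscale conversion reduces three channels to one and cropping reduces the pixel count, which is how these steps lower the computational cost of the feature extraction that follows.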
Another important pre-processing procedure is data splitting, where the dataset is split
into subsets, most commonly a training set, a validation set and a test set, in order to
later evaluate the model performance. To understand the reasons for these sets, consider that
machine learning systems mimic the human process of learning from examples. From the training
set, the system learns to generalise in order to correctly classify the images. The
validation set is used to prevent the system from simply memorising the training images
during this generalisation process. Finally, the test set is used to check the reliability of
the learning process. Separating these distinct data samples is one of the earliest
pre-processing steps needed to evaluate any model’s performance.
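A minimal sketch of such a split is given below. The 70/15/15 proportions, the fixed seed and the function name are assumptions made for illustration; the text does not prescribe any ratio.

```python
import random

def split_dataset(items, train=0.70, val=0.15, seed=0):
    """Shuffle once, then cut into training, validation and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

image_ids = list(range(100))  # stand-ins for image file names
train_set, val_set, test_set = split_dataset(image_ids)
```

Every image lands in exactly one subset, so the test set stays unseen during training. With wood collections it is usually safer to split by specimen rather than by image, so that pictures of the same block do not appear on both sides; that refinement is omitted here.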
More specifically, the system will interpret the images extracted from the designated
training set as nothing more than a combination of pixels. Each pixel will have a specific
intensity represented by a number, and in this way a matrix of numbers is formed. Image
processing is based on extracting elements such as points, blobs, angles, corners and edges,
and the patterns they form. Variability in the anatomy of each wood species is represented as
patterns of distinct pixel intensities, arrangement, distribution, and aggregation. The
variations detected by computer vision will be learned by machine learning. This process is
the fundamental operating principle of computer vision for all applications, including wood
identification [63].
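To make the "matrix of numbers" concrete, the sketch below stores a tiny greyscale image as nested lists and locates a vertical edge from the difference between neighbouring intensities. The pixel values, the threshold and the function names are hypothetical; practical systems use more robust edge and keypoint detectors.

```python
# A 3 x 5 greyscale image: dark region on the left, bright region on the right.
image = [
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
]

def horizontal_gradient(img):
    """Absolute intensity difference between horizontally adjacent pixels."""
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)] for row in img]

def edge_columns(img, threshold=50):
    """Column indices where the intensity jump to the next pixel exceeds the threshold."""
    grad = horizontal_gradient(img)
    return sorted({x for row in grad for x, g in enumerate(row) if g > threshold})

edges = edge_columns(image)
```

Here the only large jump lies between columns 2 and 3, so the detected edge is a pattern in the numbers themselves, which is the sense in which anatomical variability is represented as patterns of pixel intensities.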
After the extracted features have been learned, a classification model is established
by a classifier and a test set is formed to evaluate the system’s learning. The images are
then input, allowing the classification model to complete the identification through feed-
back of the predicted classes of each image [63].
Computer vision detects and “sees” the input image using multiple feature extraction
algorithms, while machine learning selects the types of features to be extracted, in most
cases texture and local features.
Texture features work with the combination and arrangement of image elements (pixel
intensities and resulting patterns) [66,75,97,103,106]. The most frequently used techniques
are grey level co-occurrence matrix (GLCM), grey level aura matrix (GLAM), local binary
pattern (LBP), higher-order local autocorrelation (HLAC), and Gabor filter-based features
(GFBF). Despite the individual capabilities of each technique, fusion of different types of
texture features has shown superior classification accuracy.
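As an example of a texture feature, the following sketch builds a grey level co-occurrence matrix for a single pixel offset and derives two common descriptors from it, contrast and energy. It assumes an image already quantised to a small number of grey levels and containing at least one valid pixel pair; the function names and test images are hypothetical, and published pipelines typically combine several offsets and descriptors.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised grey level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def contrast(p):
    """Large when neighbouring pixels differ strongly in grey level."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))

def energy(p):
    """Large when a few grey-level pairs dominate, i.e. the texture is regular."""
    return sum(v * v for row in p for v in row)

striped = [[0, 3, 0, 3], [0, 3, 0, 3]]  # alternating dark and bright columns
uniform = [[1, 1, 1, 1], [1, 1, 1, 1]]  # no texture at all

striped_contrast = contrast(glcm(striped, levels=4))
uniform_contrast = contrast(glcm(uniform, levels=4))
uniform_energy = energy(glcm(uniform, levels=4))
```

The striped image yields a high contrast value while the uniform one yields zero, showing how a single number summarises the arrangement of intensities across the whole image.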
Local features differ from texture features by not describing an image as a unit, but by
describing significant and important specific features (keypoints) such as edges, corners or
points. The most frequently used algorithms are scale-invariant feature transform (SIFT),
speeded up robust features (SURF), oriented FAST (features from accelerated segment test) and
rotated BRIEF (binary robust independent elementary features) (ORB), and Accelerated-KAZE
(AKAZE).
Beyond feature type, factors such as dimensionality reduction and feature selection are
also important, as a large number of features extracted from an image can substantially
reduce the computational efficiency of classification models. To achieve this balance,
methods such as R AutoEncoder [109], principal component analysis (PCA), linear discriminant
analysis (LDA), and genetic algorithms (GA) are used for dimension reduction of data sets
[110].
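A minimal sketch of PCA-style dimension reduction follows. It finds only the first principal component, using power iteration on the sample covariance matrix, and projects two-dimensional feature vectors onto it. The data, the iteration count and the function names are assumptions for illustration; real pipelines would rely on a numerical library and keep several components.

```python
def first_principal_component(data, iterations=100):
    """Return (feature means, unit vector of greatest variance) via power iteration."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix of the centred features.
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

def project(data, means, component):
    """Reduce each feature vector to one coordinate along the component."""
    return [sum((x - m) * c for x, m, c in zip(row, means, component)) for row in data]

# Two strongly correlated features: the second is roughly twice the first.
features = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
means, pc1 = first_principal_component(features)
reduced = project(features, means, pc1)
```

Because the two features carry almost the same information, one coordinate per sample preserves nearly all of the variation, which is the trade-off between descriptive power and computational cost that the paragraph describes.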
Another important element is the classification models created to learn the extracted
features and establish classification rules. The most frequently used classifiers are k-near-
est neighbours (k-NN), support vector machines (SVM), and artificial neural networks
(ANN) [111,112]. These classification procedures can be executed either in on-site hard-
ware [113] or on a cloud-based interface [90].
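Of the classifiers listed, k-NN is the simplest to sketch: a query is assigned the majority label among its k closest training samples. The feature vectors and the labels "A" and "B" below are hypothetical, and squared Euclidean distance is an assumed choice of metric.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training samples closest to the query."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))
    nearest = sorted(train, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# (feature vector, label) pairs, e.g. texture descriptors extracted from reference images.
training_samples = [
    ((0.10, 0.20), "A"), ((0.20, 0.10), "A"), ((0.15, 0.25), "A"),
    ((0.90, 0.80), "B"), ((0.80, 0.90), "B"), ((0.85, 0.75), "B"),
]

prediction_a = knn_predict(training_samples, (0.2, 0.2))
prediction_b = knn_predict(training_samples, (0.8, 0.8))
```

k-NN has no training phase beyond storing the reference features, so its accuracy depends directly on how well the reference collection covers the natural variation of each species.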
Machines can be easily misled by factors such as the source of images, which can be
acquired in the field using mobile phones [90] or in a laboratory-controlled environment
[71,114], and variables including different thicknesses, orientation, staining, digital arte-
facts and other variations, which is why many thin sections from historical wood collec-
tions are useful only to the trained human eye [10].
4. Deep Learning
Deep learning is among the most notable and promising of the many branches of
machine learning research.
As a neural network that attempts to simulate the function, structure and behaviour
of the human brain (Figure 2), it has the capacity to process and “learn” large amounts of
data [115,116].
Figure 2. General pipeline of deep learning models for image classification (based on [63,68]).
Its multiple different architectures include ANN [117], deep neural networks (DNN)
[118], recurrent neural networks (RNN) [119], deep reinforcement learning (DRL) [120],
and convolutional neural networks (CNN) [121]. The fields to which these have been ap-
plied are so vast that they are very difficult to summarise, but they include computer vi-
sion [122], forensic research [123], climate science [124], machine translation [125], classic
literature [126] and bioinformatics [127], to name just a few.
Among these multiple architectures, it is mostly ANNs and CNNs that are applied
to wood characterisation and identification.
Table 3 summarises this research and the applications of deep learning technologies.
Table 3. Research and applications of deep learning technologies.
Species Geo-
graphic Origin
Image Type
Section Type
Number of
Similar Species
Number of Im-
cessing Im-
Features De-
Classification Accu-
Samples from natural
Canary Islands
Biometric data
Feedforward mul-
tilayer perceptron
No specific source
Multiple classifica-
tion methods
Hardwoods vs. soft-
(LOOCV) 89%,
(EVT) 93%.
7 species
(LOOCV) 81%,
(EVT) 80%.
Samples from natural
with linear kernel
Texture fusion
Two-level divide-
Samples from natural
Iberian Penin-
Biometric data
Resilient back-
Feedforward mul-
tilayer perceptron
Samples from the tim-
ber industry
Democratic Re-
public of the
CellB (ver-
sion 3.2,
88% species level
89% genus level
90% family level
class of
Central and
South America,
trained image clas-
Scale dataset 100%
Macroscopic dataset
Microscopic dataset
Samples from the tim-
ber industry
Radial & between
the two planes
Central and
South America
CM, Inc.;
Central America
and Central Af-
Slivers for metabo-
lome profiling
Samples from natural
Amazonia At-
lantic region
CM, Inc.;
Central America
and Central Af-
Species level
Genus level
North America
2012 ImageNet
1.2 million
Samples from Lumber
North America
SGD optimizer
Adam optimizer
Radial & in be-
tween the two
from trunks of leafing
Train set
tional en-
coder net-
Wood patch classifi-
Wood core classifi-
Democratic Re-
public of the
n.a. – not available.
Institutions: CM, Inc. – Carlton McLendon, Inc.; CVLO-CELOS – Centrum voor Landbouwkundig
Onderzoek/Centre for Agricultural Research in Suriname; DSB-FWRC – Department of Sustainable
Bioproducts/Forest and Wildlife Research Center; FFPRI – Forestry and Forest Products Research
Institute (Japan); GACD – Garman Art Conservation Department, SUNY Buffalo State; HNMIB –
Herbario Nacional de México, Instituto de Biología; LWA-UFP – Laboratory of Wood Anatomy at
the Federal University of Parana; MADw – USDA Forest Products Laboratory Wood Collection of
Madison, Wisconsin; OSU – Oregon State University; PV – Private vendor; RBw – Botanic Garden
of Rio de Janeiro, Brazil; SJRw – Samuel J. Record Collection; TXWD-RMCA – Tervuren Xylarium
Wood Database, Royal Museum for Central Africa.
Models & Techniques: DART-TOFMS – Direct Analysis in Real Time, Time of Flight Mass
Spectrometry; EVT – External validation test; GLCM – Grey level co-occurrence matrix; GSC –
Grey scale conversion; kNN – k-nearest neighbours; LDA – Linear discriminant analysis; LOOCV –
Leave-one-out cross-validation; MVRF – Multi-view random forest; PCA – Principal component
analysis; RF – Random Forests; RiLPQ – Rotation Invariant Local Phase Quantisation; SGD
optimizer – Stochastic gradient descent; SVM – Support vector machine; TL – Transfer learning;
XRF – X-ray fluorescence spectrometry.
4.1. Artificial Neural Networks (ANN)
Artificial neural networks are not only one of the main investigation methods, but
also constitute the foundation of deep learning [63]. These mathematical structures,
inspired by biological neural networks, can be trained in a supervised or unsupervised
manner and show a high ability to learn from the examples given to them and to
extrapolate that information to future unidentified samples. This ability to reproduce,
model and “learn” nonlinear processes has given ANNs widespread applications in
multiple disciplines [79,117].
In the field of wood differentiation and identification, examples of research applying
ANNs include:
- Esteban et al. [128] used a feedforward multilayer perceptron (MLP), a type of ANN,
to distinguish between Juniperus cedrus and J. phoenicea var. canariensis, obtaining
a 92% probability of correctly differentiating the species;
- Mallik et al. [80] applied SEM to wood cross sections with 1500× magnification to
obtain species-level identification through the shape, number, area and distribution
of earlywood tracheids, processed by image segmentation, object recognition and
statistical methods. When distinguishing between hardwoods and softwoods, they
obtained an accuracy of 0.89 using leave-one-out cross-validation and 0.93 using an
external validation test (EVT). When differentiating seven wood species, they
obtained an accuracy of 0.81 using leave-one-out cross-validation and 0.80 using
an EVT;
- The same microscopic features analysis was applied by Martins et al. [78], who used
microscopic transverse sections applying local phase quantisation (LPQ), local binary
patterns (LBP) and grey-level co-occurrence matrix (GLCM) to identify Brazilian
species. The process was applied to 112 species, 85 genera and 30 families, obtaining a
recognition rate of 98.6% for differentiation of hardwoods and softwoods and 86%
for discrimination of the 112 species;
- Turhan [129] used a support vector machine (SVM) classifier to differentiate Salix alba,
S. caprea and S. eleagnos, obtaining a 95.2% success rate;
- Filho et al. [72] used a two-level divide-and-conquer classification strategy to
differentiate 41 species of Brazilian flora, with a highest reported accuracy of 97.77%;
- Esteban et al. [131] used a multilayer perceptron (MLP) to differentiate Pinus sylvestris
L. and P. nigra Arn subsp. salzmannii (Dunal) Franco, obtaining 81.2% accuracy in the
testing set;
- Silva et al. [79] used microscopic images of cross sections of 77 commercial wood
species from the Democratic Republic of the Congo for surface texture analysis,
reporting 88% successful identifications at species level, 89% at genus level and 90% at
family level;
- He et al. [136] applied machine learning classifiers (SVM, Naive Bayes (NB), Decision
Tree C5.0 and ANN) to discriminate between Swietenia macrophylla King, S. mahagoni
(L.) Jacq and S. humilis Zucc. The best results were obtained with SVM, with an
overall accuracy of 91.4%;
- Deklerck et al. [137] used machine learning not for image-based data processing, but
for metabolome profiles obtained through direct analysis in real-time (DART™)
ionisation combined with time-of-flight mass spectrometry (TOFMS) to study the
heartwood of 175 samples of 10 species of the Meliaceae family. Combining these
techniques resulted in accuracy levels of 82.2%;
- de Andrade et al. [74] generated 2000 macroscopic images of 21 species using a
smartphone and samples manually polished with a knife to replicate field conditions.
A grey level co-occurrence matrix for the development of classifiers based on SVM
was used, resulting in accuracies of 97.7%;
- Silva et al. [141] used 77 Congolese wood species as a reference base for applying a
multi-view random forest (MVRF) model for species-level identification. To ensure
information was not missed, the authors used images of the three anatomical planes.
The results showed that concatenating features from the transverse and tangential
planes clearly outperforms transverse-only analysis, while adding the radial plane
improves the results only minimally. The MVRF model in turn outperformed
concatenation of LPQ features, so both the supplementary information from
three-plane analysis and the model type considerably improve the final results.
Moreover, because k-fold cross-validation could have led to overestimation of the
results, the authors applied a leave-k-tree-out approach during cross-validation.
This approach dramatically decreased accuracy compared with traditional
cross-validation schemes.
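The leave-k-tree-out idea mentioned in the last item can be sketched as follows. This is a minimal illustration in plain Python, not the procedure of Silva et al. [141]: the function name `leave_k_trees_out`, the tree identifiers and the value of k are all assumptions made for the example. The point it shows is that every image from a held-out tree goes to the test fold, so the classifier is never tested on a tree it has already seen.

```python
import random
from collections import defaultdict


def leave_k_trees_out(tree_ids, k, seed=0):
    """Yield (train, test) index lists; each test fold holds ALL images of k trees.

    tree_ids[i] is the (hypothetical) identifier of the tree that image i came from.
    """
    by_tree = defaultdict(list)
    for index, tree in enumerate(tree_ids):
        by_tree[tree].append(index)

    trees = sorted(by_tree)               # deterministic starting order
    random.Random(seed).shuffle(trees)    # reproducible shuffle of the trees

    for start in range(0, len(trees), k):
        held_out = set(trees[start:start + k])
        test = [i for i, t in enumerate(tree_ids) if t in held_out]
        train = [i for i, t in enumerate(tree_ids) if t not in held_out]
        yield train, test


# Toy example: seven images taken from four trees (identifiers are made up).
tree_ids = ["t1", "t1", "t2", "t2", "t3", "t3", "t4"]
folds = list(leave_k_trees_out(tree_ids, k=2))
```

With an ordinary k-fold split, two images of tree `t1` could land on opposite sides of the split, which is one plausible reason why image-level cross-validation can overestimate accuracy.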
4.2. Convolutional Neural Networks (CNN)
Convolutional neural networks are one of the most significant applications of ANNs.
In the AI context, a CNN is a class of feedforward ANN that has been successfully applied
to digital image processing analysis.
A CNN processes images more effectively by applying filtering techniques to ANNs
[116]. This is a powerful and accurate way of solving classification problems, and CNNs
are mainly credited for their role in image analysis, recognition, and classification. The
architecture of a CNN typically has multiple layers between input and output, of three
main types: convolutional layers, pooling layers and fully connected layers. These
layers perform different tasks as the image passes through the network. As the image
progresses through the distinct layers, features such as edges, colours and shapes are
extracted and interpreted. These features are then learned and classified by the deep
neural network, resulting ultimately in the network’s ability to identify a specific
object [63,116,142]. Another advantage is the capacity to automatically recognise
important features without human supervision.
CNNs have difficulty dealing with variance in the data presented, such as tilted or
rotated images. This limits their ability to encode an object’s orientation and position
or to process data in a spatially invariant way.
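The convolution and pooling stages described above can be sketched in plain Python. This is a toy illustration under simplifying assumptions: a single hand-written 3×3 kernel stands in for learned filters, there is no fully connected layer, and all function names and the 6×6 test image are invented for the example. It also illustrates the orientation sensitivity, since a vertical-edge filter responds to a vertical edge but not to the same edge rotated by 90°.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding.

    As in most deep learning libraries, this is technically cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses, set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Downsample by taking the maximum of each non-overlapping 2x2 block."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]


def rotate90(image):
    """Rotate a 2D list clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


# Toy 6x6 image: dark left half, bright right half (one vertical edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
vertical_edge_kernel = [[-1, 0, 1]] * 3

pooled = max_pool_2x2(relu(conv2d_valid(image, vertical_edge_kernel)))
pooled_rotated = max_pool_2x2(
    relu(conv2d_valid(rotate90(image), vertical_edge_kernel))
)
```

In a trained network the kernels are learned rather than hand-written, and many of them are stacked, but the rotated input above still shows why orientation changes can alter the extracted features.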
Research examples applied to wood identification include:
- Hafemann et al. [130] applied the CNN model 3-ConvNeta to identify macro images
of 41 species and micro images of 112 species, achieving 95.77% accuracy
for macroscopic images and 97.32% accuracy for microscopic images;
- Kwon et al. [132] applied six LeNet and MiniVGGNet CNN models to identify five
Korean softwood species (Cryptomeria japonica, Chamaecyparis obtusa, Pinus koraiensis,
P. densiflora, Larix kaempferi), using an iPhone 7 camera to obtain macroscopic images
of rough sawn surfaces from cross sections. Of all the CNN models tested, LeNet3,
which adds two extra layers to the original LeNet architecture, achieved the highest
accuracy and stability. The identification accuracy obtained was 99.3%. The authors
reported that the resulting CNN is small enough for installation on a mobile device
such as a smartphone;
- Maintaining the objective of ensuring field applicability, Kwon et al. [134] acknowl-
edged the real-world limitations of not including longitudinal wood surfaces. Using
mobile device cameras to obtain macroscopic images, they applied a combination of
models, obtaining the best results with LeNet2, LeNet3 and MiniVGGNet4. Their re-
sults showed an overall accuracy of 98% and an improvement on their earlier study,
particularly in the case of P. koraiensis and P. densiflora;
- Figueroa-Mata et al. [87] applied deep convolutional networks for identification of
41 Brazilian forest species from xylotheque samples at species level, achieving an ac-
curacy of 98.3%;
- Ravindran et al. [113] used CNNs to identify 10 neotropical species in the Meliaceae
family (Cabralea canjerana, Carapa guianensis, Guarea glabra, G. grandifolia, Khaya ivoren-
sis, K. senegalensis, and the CITES-listed Swietenia macrophylla, S. mahagoni, Cedrela fis-
silis, and C. odorata), using only the transverse surface. The results showed an accu-
racy of 87.4 to 97.5%;
- To develop an automatic classification system for charcoal, Maruyama et al. [48]
applied two LBP configurations as texture descriptors. Of the state-of-the-art machine
learning classifiers tested, SVM and random forests (RF) showed the best results. The
Inception_v3 CNN was applied for representation learning (RL) evaluation. The
database comprised 44 charcoal samples from Brazilian native species from natural
forests. The authors reported that both handcrafted features and RL achieved
recognition rates of around 95%;
- Oliveira et al. [133] used databases developed by Filho et al. [72] and Martins et al.
[78] to access 2942 macroscopic cross-section images of 41 species and 2240
microscopic images of 112 species, applying CNNs to create three models. Based on
the results, the authors reported 100% recognition accuracy for the scale model,
98.73% for the macroscopic model, and 99.11% for the microscopic model;
- Kanayama et al. [135] applied a deep CNN approach to near-infrared hyperspectral
imaging (NIR-HSI) using a principal component (PC) algorithm to identify 120 sam-
ples of 38 hardwood species. The results obtained showed 90.5% accuracy;
- A CNN was also used by Ravindran and Wiedenhoeft [67] to compare the macro-
scopic field identification programme XyloTron, using an ImageNet pre-trained Res-
Net34 CNN, with mass spectrometry to differentiate 10 Meliaceae species used by
Deklerck et al. [137]. The results showed identification accuracy of 81.9% at the spe-
cies level and 96.1% at the genus level compared to 74.9% and 91.4%, respectively, in
the work by Deklerck et al. [137];
- Lopes et al. [82] applied the InceptionV4_ResNetV2 CNN to analyse macroscopic im-
ages of the end-grain of 10 xylarium North American hardwood species, producing
1869 images using a smartphone fitted with a 14× macro lens. Their results showed
an accuracy of 92.6%;
- de Geus et al. [138] applied the DenseNet CNN to recognise 281 species, using the
largest dataset of microscopic transverse, radial and tangential images available at
the time. Rotation invariant LPQ (RiLPQ) showed the best results of the feature de-
scriptors used. The authors reported an identification accuracy of 98.8%;
- Olschofsky and Köhl [70] applied Inception-v3, an image classification model using
a CNN for feature recognition and classification, pre-trained with 1.2 million images.
The CITES-protected species Cedrela odorata was chosen and compared with 13 other
tropical tree species for recognition. The pre-trained CNN reached 98% accuracy, but
when other tree species not used for training were added, the classification
accuracy fell to 87%;
- The ResNet101 CNN, associated with an SVM as classifier, was applied by Lens et al.
[77] to species-level identification of 112 mainly neotropical tree species, using only
transverse sections but focusing on microscopic rather than macroscopic analysis.
The results showed successful identification in 95.6% of cases;
- Wu et al. [139] applied deep convolutional neural networks (CNNs) for the
identification of 11 rough-sawn North American hardwood species based on tangential
plane images only. The CNNs ResNet-50, DenseNet-121 and MobileNet-V2 were tested,
resulting in an overall accuracy of 98.2%;
- Shugar et al. [140] combined X-ray fluorescence spectrometry (XRF) and a CNN to
identify 48 wood specimens of both hardwoods and softwoods, mostly from heart-
wood and using either tangential or radial sections. They reported 99% identification
accuracy from the 66 datasets;
- In the study by Fabijańska et al. [73], a CNN with residual connections was tested to
identify 312 wood core scanned images of 14 European softwood and hardwood tree
species, developing a wood patch classification and a wood core classification. The
results showed that the proposed model correctly recognised patch images in 93% of
cases and wood core images in 98.7%. Comparison of the results also showed that
this model outperformed the state-of-the-art convolutional neural network-based
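Several of the studies above report accuracy at both species and genus level. How the two figures relate can be sketched as follows, assuming labels are plain binomial names so that the genus is the first word. The helper names and the predictions are hypothetical and chosen only for illustration; the species names are among those mentioned in the list.

```python
def genus_of(binomial):
    """Return the genus part of a binomial name such as 'Cedrela odorata'."""
    return binomial.split()[0]


def accuracy(y_true, y_pred, key=lambda label: label):
    """Fraction of predictions that match the true label after applying `key`."""
    hits = sum(key(t) == key(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Hypothetical ground truth and classifier output (not data from any cited study).
y_true = ["Swietenia macrophylla", "Swietenia mahagoni", "Cedrela odorata",
          "Cedrela fissilis", "Khaya ivorensis"]
y_pred = ["Swietenia mahagoni", "Swietenia mahagoni", "Cedrela odorata",
          "Cedrela odorata", "Khaya ivorensis"]

species_accuracy = accuracy(y_true, y_pred)
genus_accuracy = accuracy(y_true, y_pred, key=genus_of)
```

A confusion between two species of the same genus counts as an error at species level but as a hit at genus level, so genus-level accuracy can never be lower than species-level accuracy on the same predictions.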
4.3. Generative Adversarial Networks (GANs)
Within deep learning, GANs [143] are described as neural networks that can learn to
generate realistic samples from the data on which they were trained.
They use a neural network as a generator that takes a random distribution of data as
input and learns to map that information to output the desired distribution of data. A
second neural network, known as a discriminator (a binary classifier), receives both
training images and generated images and estimates the probability that a given image
originates from the training set (real) or from the generator (fake), thus assessing the
most likely class to which the image belongs [144].
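The opposing goals of the two networks can be sketched with their loss functions. This is a minimal illustration under stated assumptions: it uses the common binary cross-entropy formulation with the non-saturating generator loss, it takes the discriminator's output probabilities as given instead of training any network, and the function names are invented for the example.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy with real images labelled 1 and generated images 0.

    d_real and d_fake are lists of discriminator outputs, i.e. the estimated
    probability that each image is real.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Discriminator that separates real from generated images well ...
loss_when_confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
# ... versus one that is fooled by the generated images.
loss_when_fooled = discriminator_loss([0.9, 0.8], [0.9, 0.8])
```

When the discriminator can do no better than output 0.5 for every image, its loss equals 2 ln 2, which is the point at which generated and real images have become indistinguishable to it.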
Generative adversarial networks can produce highly realistic images using CNNs in
an unsupervised manner [145]. Their application extends to multiple fields of scientific
research, but they remain poorly explored in wood sciences [146–148].
- Addressing the possibility of eliminating economic and processing burdens in ac-
quiring images of worldwide wood species for machine-learning training purposes,
Lopes et al. [145] accessed 119 hardwood species references on the publicly available
Xylarium Digital Database [88]. Applying a style-based GAN, they successfully gen-
erated highly realistic and anatomically meaningful synthetic microscopic cross-sec-
tional images of hardwood species which they reported as virtually indistinguishable
from real cross-sectional images.
- To evaluate the resemblance and quality of the synthetic cross sections relative to
real ones, a structural similarity index measure (SSIM) and Fréchet inception
distance (FID) were applied, and a visual Turing test (VTT) was performed by wood
anatomists to confirm the usefulness and realism of the GAN-generated images.
The results showed that the artificially generated images were indistinguishable
from real microscopic cross-sectional images.
- The authors [145] reported that it is even feasible to generate synthetic hybrids based
on microscopic cross-sectional images from two parental species. This would have
considerable implications for wood science and technology, especially for estimating
the wood permeability, strength, density, or hydraulic potential, for example, of a
species that has not even been planted.
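The SSIM mentioned above compares two images through their means, variances and covariance. A minimal sketch follows, assuming 8-bit greyscale images flattened to lists of pixel values and computing the index over the whole image as a single window. The standard SSIM averages the same expression over many local windows, so this is a simplification, and the function name and toy pixel values are illustrative only.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equally sized lists of pixel values."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    # Small constants that stabilise the ratio when means or variances are near zero.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy "images": a patch, an identical copy, and its photographic negative.
patch = [10, 50, 200, 120]
negative = [255 - v for v in patch]

ssim_identical = global_ssim(patch, patch)
ssim_negative = global_ssim(patch, negative)
```

Identical images score 1, while structurally dissimilar ones score lower. FID, by contrast, compares feature statistics from a pre-trained network and is not reproduced here.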
5. Field Applicable Wood Identification Systems
One of the most interesting features of computer vision-based wood identification
systems is their field application capability. Despite the consensus that it will be a long
time before this technology becomes readily available not only to researchers and law
enforcement bodies, but also the general public, it is evident that this goal is reachable.
Programmes already developed or under development to respond to field application
needs are described below.
5.1. MyWood-ID
Described as an automated wood identification mobile app [90,149], MyWood-ID
uses a smartphone with a retrofitted macro lens and machine vision for macroscopic wood
identification. The system uses a database of 20 species of timber native to Malaysia and
provides a simple and effective way to acquire macroscopic wood digital images. The im-
ages are then uploaded to a cloud server via an internet connection for immediate identi-
fication results. It is intended to be cost-effective, easily accessible and intuitive, and to
provide fast results.
These characteristics are evident when compared with other field deployable systems
[65,75]. As differentiating features of their wood identification system, the authors cite its
portability, lower initial cost, faster field deployment time, intuitive use, and continual
online database update. However, it requires a constant internet connection for results
and is operating-system-dependent (running only on iPhone 6 and 7). The main limitation
of this system is the lack of consistent light control for wood image acquisition, although
the authors indicate that this can be mitigated using the learning capability of a deep
learning algorithm. The results achieved by this system have an accuracy of 96% to 98%.
It is a paid app.
5.2. MyWood-Premium
This is an update of the previous app, developed by FRIM (Wood Anatomy Lab of
Forest Research Institute Malaysia) and UTAR (Universiti Tunku Abdul Rahman) [150].
The updated version comprises a database of 100 wood species native to Malaysia. It is
available only on iPhone, iPod touch and Mac, and requires iOS 8.0 or later. It recom-
mends the Ollo-clip™ Macro Lens with 21× magnification for optimum performance. It is
a free app.
5.3. Xylorix
Another recent approach to rapid field wood identification is Xylorix [151,152], a
platform that combines a suite of apps, tools and services. Xylorix Inspector is a wood
identification mobile app that uses macroscopic features for automated identification. It
is based on trained AI models to automatically identify the wood genus or species. How-
ever, a Xylorix WIDK-24X01 illuminated macro lens must be attached to the mobile phone
camera for correct performance. It is available on either Apple iOS or Android operating
systems and is supported by most mobile phones. Of the 24 species in the system database,
11 are free and the other 13 are paid.
5.4. XyloTron
XyloTron is a paid, open-source, image-based macroscopic field identification pro-
gramme designed for wood and charcoal identification [65,153]. It features adjustable and
controlled visible light, UV illumination capacity, and all the necessary software to control
the device, capture images, and deploy the trained classification models.
It works by capturing high-quality images of wood or charcoal samples with visible
or UV light. The identification accuracy is described as 97.7% for wood, increasing to
99.1% with the use of UV light (which resolves, e.g., identification confusion between
Albizia sp., fluorescent, and Inga sp., not fluorescent), and as 98.7% for charcoal.
One limitation is that it is not a simple or easily deployable on-site system to use,
because it requires a permanent connection to a laptop computer.
Ravindran and Wiedenhoeft [67] compared the performance of XyloTron and mass
spectrometry (MS) for species- and genus-level identification of 10 species of
Meliaceae. The results showed a similar species-level accuracy of the XyloTron and
MS models, but higher genus-level accuracy with XyloTron [67].
5.5. XyloPhone
To overcome visual aberrations (field distortion and spherical aberration), uncon-
trolled light sources, high prices, and a lack of real field applicability, Wiedenhoeft [83]
proposed the XyloPhone. Described as an open-source, 3D-printed imaging attachment
adaptable to virtually any smartphone for macroscopic image capture, it is a small, closed
plastic box that provides a fixed focal distance, exclusion of ambient light, and a choice of
visible or UV illumination. It is powered by a rechargeable external battery and uses a
commercially available lens, making it affordable and, according to the author,
providing comparable image quality to XyloTron.
To document features such as evenness of illumination, distortion, maximum
resolution, and spherical aberration, the author compared the XyloPhone + iPhone (XPi),
the XyloPhone + Samsung (XPs), the Ollo Clip 14× + iPhone (OCi), and the Xylorix +
iPhone, with two distinct configurations. He reported that XyloPhone’s optical
performance, especially when used with more recent smartphones, is clearly superior to
the lenses/lighting arrays of other systems [83].
5.6. WIDER
WIDER is a battery-charged portable system that uses spectroscopy measurement
and machine-learning-based identification software. It comprises a database of 15 species
and the authors [154] reported accuracy results of 95%. It is part of a larger project that
was completed in 2021 and brought into use by USAID PEER Cycle 8 (Development of
Wood Identification System and Timber Tracking Database to Support Legal Trade). The
same project developed the ECVT 4D Dynamic [154] technology for monitoring tree phys-
iological processes.
5.7. IMAIapp
IMAIapp [155] is a wood identification mobile app that uses a lens attached to a
smartphone. It uses convolutional neural networks (CNNs) to identify timber from
macroscopic features observed in magnified photographs. The difficulty of the problem
lies in the number of classes that the method is required to recognise automatically:
a total of 400 wood species and a high number of macroscopic images. An EfficientNet
architecture is enhanced by a novel pre-processing approach that combines computer
vision and data augmentation techniques applied to the original dataset. Classification
models based on deep learning currently offer the best performance, and the approach
used to increase the quality of the training data makes the model integrated in
IMAIapp robust to variations in rotation, illumination and zoom. This means the app
can be used in the field. Using TensorFlow Lite libraries for Apple and Google
platforms, the application works standalone, runs entirely on the mobile device and
does not require a connection to the Internet. IMAIapp is therefore an edge-computing
design intended to cope with the computing constraints of the mobile device on which
it is installed. The project is under development and the app will be free for Android
and iOS.
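Data augmentation of the kind mentioned above can be sketched in plain Python. This is an illustration only and not IMAIapp's actual pre-processing pipeline, which the source does not detail: the choice of transformations (90° rotations, brightness scaling, centre crop as a crude zoom), the factors and all function names are assumptions. Images are represented as 2D lists of 8-bit grey values.

```python
def rotate90(image):
    """Rotate a 2D list clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


def scale_brightness(image, factor):
    """Multiply every pixel by `factor`, clipping to the 0..255 range."""
    return [[min(255, max(0, round(v * factor))) for v in row] for row in image]


def center_crop(image, size):
    """Cut a size x size window from the middle of the image (a crude zoom)."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def augment(image, crop_size, factors=(0.8, 1.0, 1.2)):
    """Return one variant per combination of rotation and brightness factor."""
    variants = []
    rotated = image
    for _ in range(4):                      # 0, 90, 180 and 270 degrees
        for factor in factors:
            variants.append(
                center_crop(scale_brightness(rotated, factor), crop_size)
            )
        rotated = rotate90(rotated)
    return variants


# Toy 4x4 "image" with grey values 0, 10, ..., 150.
image = [[10 * (4 * r + c) for c in range(4)] for r in range(4)]
variants = augment(image, crop_size=2)
```

Training on such variants exposes the network to the same wood surface under different orientations and lighting, which is one common way to reduce sensitivity to those factors.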
6. Discussion
In the last 100 years, what we now call the traditional wood identification method
based on anatomical descriptions has followed well defined, standardised features to suc-
cessfully distinguish and identify the multiple families, genera and species of angiosperm
and gymnosperm trees. However, despite the many positive aspects of this method, it is
now evident that it is reaching its limit.
The main limitations are the identification uncertainty at species level, the time-con-
suming methodology, the lack of anatomists with the necessary training for the task, and
the associated costs of these professionals, mainly when on-site identifications are re-
quired. These limitations have a notable impact on the type of monitoring that can be
carried out, e.g., in the fight against illegal logging. Faster, more accurate and economi-
cally scalable methods are urgently needed.
As a valuable response to this need, the results obtained so far by computer vision-
based identification (CVBI) of wood and, in particular, deep learning approaches, clearly
demonstrate that this method has enormous potential for wood identification and quan-
titative wood anatomy.
Despite the many obstacles remaining, this method is steadily adapting to overcome
limitations such as inter- and intra-anatomical variability, high anatomical resemblance,
non-homogeneous illumination, stained or deformed samples, and limited image
databases, among many other issues.
Of all the resources discussed, deep learning appears to be the most significant and
promising solution in AI developments, and CNN models applied to wood sciences are
one of the leading and most rapidly evolving systems. CNNs have exhibited notably
greater efficiency and accuracy in quantitative wood anatomy and feature recognition,
alongside reduced computational cost.
Field-deployable identification systems appear to be the most important and impact-
ful option in computer vision-based wood identification. This resource, based on ad-
vances in communication technologies, will enable more prolific, increasingly accurate
and faster screening by authorities without human prejudice, particularly with regard to
illegal timber and charcoal trading.
Computer vision-based identification technology could become one of the most ef-
fective and unavoidable weapons in the fight against the illegal timber and charcoal trade,
as it enables individuals who are untrained in traditional identification to obtain highly
accurate and legally binding identifications on the spot.
The multiple future contributions of the technologies underpinning CVBI for wood
sciences are difficult to fully envision at present. However, it is of utmost importance to
overcome or at least mitigate the limitations that are severely hampering the development
and implementation of these systems.
Two of the most pressing issues are the limited number of digital databases, which
are specific to geographically restricted areas/species or inaccessible to the global research
community, and the lack of extensive field testing and verification hindering the accuracy
quantification of the systems.
The most urgent actions required are the construction of a freely accessible global
digital wood image database, the availability of this tool in a cloud-based system for ac-
cess everywhere, by everyone, and priority inclusion of CITES-listed species and their
