Academic Editor: Pedro Couto
Received: 17 March 2025
Revised: 16 April 2025
Accepted: 18 April 2025
Published: 21 April 2025
Citation: Shen, Y.; Kong, M.; Yu, H.; Liu, L. A Texture-Based Simulation Framework for Pose Estimation. Appl. Sci. 2025, 15, 4574. https://doi.org/10.3390/app15084574
Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
A Texture-Based Simulation Framework for Pose Estimation
Yaoyang Shen, Ming Kong *, Hang Yu and Lu Liu *
School of Measurement Technology and Instrumentation, China Jiliang University, Hangzhou 310020, China;
dylan_shenyy@163.com (Y.S.); yuhang@cjlu.edu.cn (H.Y.)
* Correspondence: mkong@cjlu.edu.cn (M.K.); lu_liu@cjlu.edu.cn (L.L.); Tel.: +86-18222286965 (L.L.)
Abstract: Accurate 3D pose estimation of spherical objects remains challenging in industrial inspection and robotics due to their geometric symmetries and limited feature discriminability. This study proposes a texture-optimized simulation framework that enhances pose prediction accuracy by optimizing the surface texture features of the designed samples. A hierarchical texture design strategy was developed, incorporating complexity gradients (low to high) and color contrast principles, and implemented via VTK-based 3D modeling with automated Euler angle annotations. The framework generated 2297 synthetic images across six texture variants, which were used to train a MobileNet
model. The validation tests demonstrated that the high-complexity color textures achieved
superior performance, reducing the mean absolute pose error by 64.8% compared to the
low-complexity designs. While color improved the validation accuracy universally, the
test set analyses revealed its dual role: complex textures leveraged chromatic contrast for
robustness, whereas simple textures suffered color-induced noise (a 35.5% error increase).
These findings establish texture complexity and color complementarity as critical design
criteria for synthetic datasets, offering a scalable solution for vision-based pose estimation.
Physical experiments confirmed the practical feasibility, yielding 2.7–3.3° mean errors.
This work bridges the simulation-to-reality gaps in symmetric object localization, with
implications for robotic manipulation and industrial metrology, while highlighting the
need for material-aware texture adaptations in future research.
Keywords: texture design; dataset construction; pose estimation; deep learning; spherical
particles
1. Introduction
The three-dimensional pose of spherical objects is pivotal for industrial inspection [1], remote sensing applications [2], particle tracking [3], and robotic machining systems [4]. Traditional pose estimation methods rely on geometric feature matching [5] or point cloud registration [6], but they struggle under dynamic lighting and occlusion [7]. Recent research has combined pose estimation with deep learning, training convolutional neural networks on large datasets to perform end-to-end pose estimation [8–11]. Although this approach is convenient, two key challenges remain in practical applications: (1) obtaining densely annotated real-world pose data [12] is costly, and human errors easily occur during annotation; (2) the lack of feature discriminability limits the generalization ability of the model, so performance degrades significantly on unknown targets or under drastic lighting changes [13,14]. Moreover, if the observed object is regularly symmetric, such as a sphere, the rotation ambiguity inherent in symmetry cannot be resolved [14,15].
In recent years, printing characteristic texture patterns on the surface of a sphere has been used to recover particle attitude information in studies of particle rotation dynamics. Zimmermann et al. [16] matched experimental particle textures with synthetic templates using stereo vision, while Mathai et al. [17] optimized binary surface patterns via cost function minimization. Though effective in controlled settings, these methods exhibit limited generalization due to texture sensitivity. Will et al.'s stereolithography-based rendering [18] further highlights the trade-off between pattern complexity and computational feasibility.
Recent advances in texture-driven pose estimation include the work of Zhang et al., who analyzed the importance of color and texture features when using neural networks to estimate coal ash content; their experiments and visualizations showed that color contributed as much as 64.77% of the CNN's decision, while texture features contributed 35.23% [19]. Wang and Zhang constructed a texture optimization network that combines contextual aggregation information and used it for texture restoration to enhance low-light images [20]. These studies show that texture features play a crucial role in deep learning and image processing.
To address these limitations, this work introduces a simulation-driven framework
combining VTK-based synthetic data generation with Tamura texture theory. The specific
research objectives and contents are as follows:
• Objective: bridge the simulation–reality gap in attitude estimation for symmetrical objects using synthetic data and texture theory.
• Texture design: direction-sensitive surface textures governed by Tamura texture principles (coarseness, contrast, and directionality) with six variants (three complexity levels × grayscale/color).
• Data generation: VTK-based synthetic data with automated pose space sampling (3° intervals) and implicit Euler angle encoding.
• Validation: experimental evaluation using 3D-printed textured spheres and real-world attitude measurements.
This study establishes and experimentally validates a set of texture design criteria for
spherical objects. Using 3D-printed textured spheres and actual attitude measurements, we
bridge the simulation–reality gap in the positioning of symmetrical objects. This provides a
robust attitude estimation for robotics and industrial metrology while offering a scalable
synthetic data paradigm.
2. Materials and Methods
This study proposes a simulation dataset construction scheme based on representational texture. Firstly, based on texture-related theory, a series of textures with different complexities are designed. Secondly, texture attachment and pose simulation of spherical particles are implemented with the VTK simulation library, together with automatic pose variation and pose annotation. Finally, a batch of image–label datasets with pose information is obtained, which can support end-to-end deep learning model training.
2.1. Texture Design
Texture, as a visual attribute of an object's surface, reflects the statistical characteristics of the surface microstructure and contains a wealth of structural information, which plays a crucial role in image recognition and object attitude estimation. In the field of computer vision, texture is usually defined as the spatial distribution pattern of gray pixel values within an image area.
Early research shows that the key to texture perception is extracting the basic features of texture. Tamura et al. proposed six basic texture features: coarseness, contrast, directionality, line-likeness, regularity, and roughness [21]. These features can effectively describe the statistical characteristics of texture and provide a theoretical basis for texture analysis and recognition. Recent studies have further quantified the relationship between texture features and recognition accuracy. Dzierżak demonstrated that an optimized feature selection from 290 texture descriptors (including gray-level statistics and wavelet transforms) significantly improves osteoporosis detection in CT scans, with the k-nearest neighbors algorithm achieving 96.75% accuracy using 50 prioritized features [22]. Trevisani et al. proposed a method to quantify surface roughness through multi-scale texture analysis and constructed a scalable roughness index; these indexes can reveal terrain texture characteristics at different spatial scales, provide a new dimension for terrain analysis, and improve the accuracy of geomorphic analysis [23]. He et al. further studied the impact of texture distribution on visual perception and proposed rules for how texture distribution affects it [24]. Their research showed that specific texture distribution patterns can enhance the visual system's perception, thereby improving the accuracy of object recognition.
The stripe spacing annular ratio usually refers to the ratio of the spacing (d) between adjacent stripes in a ring or periodic texture pattern to the characteristic size of the ring structure, such as the circumference L or radius r. Its mathematical expression is as follows:
η = d/(2πr) × 100% (1)
This ratio reflects the distribution density of the fringes in the ring structure and is a key parameter of texture design. In optical measurement or machine vision, if the fringe spacing is too small (the annular ratio is too low), the imaging system may suffer aliasing effects due to insufficient sampling, so the fringes cannot be accurately resolved. According to the Nyquist sampling theorem, the sampling frequency must be at least twice the highest spatial frequency of the signal [25]. In display technology, if the spatial frequency corresponding to the pixel spacing is f_p, the frequency of the moiré fringe needs to satisfy the following:
f_Moiré ≤ f_p/2 → d/(2πr) ≥ 5% (2)
In the fringe projection system, the annular ratio should therefore be ≥ 5% to avoid moiré fringes on the display screen, which would cause phase unwrapping errors [26].
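As a concrete illustration of Equations (1) and (2), the following minimal Python sketch (the helper names and example dimensions are ours, not from the original implementation) computes the annular ratio of a candidate stripe layout and flags designs that fall below the 5% threshold:

```python
import math

def annular_ratio(stripe_spacing_mm: float, sphere_radius_mm: float) -> float:
    """Equation (1): eta = d / (2 * pi * r) * 100%."""
    return stripe_spacing_mm / (2 * math.pi * sphere_radius_mm) * 100.0

def satisfies_sampling_criterion(stripe_spacing_mm: float, sphere_radius_mm: float,
                                 min_ratio_percent: float = 5.0) -> bool:
    """Equation (2): keep the annular ratio at or above 5% to avoid aliasing/moire artifacts."""
    return annular_ratio(stripe_spacing_mm, sphere_radius_mm) >= min_ratio_percent

if __name__ == "__main__":
    # Illustrative example: a 2 mm stripe spacing on a 5 mm radius sphere gives eta ~= 6.4%,
    # which satisfies the >= 5% design rule used for the textures in this work.
    print(round(annular_ratio(2.0, 5.0), 2), satisfies_sampling_criterion(2.0, 5.0))
```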
We considered this when designing the texture patterns, so the fringe spacing ratio is kept above 5%. On this theoretical basis, three textures with a similar proportion distribution and fringe spacing but different complexities are designed. Texture1 is a low-complexity texture composed only of horizontal stripes. Texture2 is a medium-complexity texture that adds columnar stripes, making the pattern more distinguishable. Texture3 adds further columnar stripes and short horizontal stripes, yielding a more complex texture area.
The CIE-Lab color difference (∆E) is an internationally used quantitative index of color difference. The minimum color difference perceptible to the human eye is ∆E ≈ 1, but ∆E ≥ 5 is needed for colors to be reliably distinguished. When ∆E is sufficiently large, the color difference is significant (the human eye can clearly distinguish it) and suitable for robust recognition in machine vision systems. Studies have shown that ∆E > 30 can resist lighting changes, noise interference, and sensor errors, ensuring the stability of color features in complex environments [27]. On this basis, we improve on the original black and white texture. In the Lab color space, the lightness values of black and white are 0 and 100, two extreme colors, while the Lab values of blue are about (30, 68, −112) and those of green are about (46, −52, 49); the color difference between blue and green is ∆E ≈ 200, and the pairwise color differences among the four colors are all far greater than 30. Therefore, a color texture pattern with the same texture distribution is designed, composed of black, white, green, and blue.
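To make this argument concrete, the short sketch below computes the Euclidean (CIE76) color difference between the Lab coordinates quoted above; it is a simplified check (the CIEDE2000 formula [27] weights the terms differently), and the Lab values are the approximate ones given in the text:

```python
import math

# Approximate CIE-Lab coordinates quoted in the text (L*, a*, b*).
LAB = {
    "black": (0.0, 0.0, 0.0),
    "white": (100.0, 0.0, 0.0),
    "blue":  (30.0, 68.0, -112.0),
    "green": (46.0, -52.0, 49.0),
}

def delta_e76(lab1, lab2):
    """Plain Euclidean distance in Lab space (CIE76 color difference)."""
    return math.dist(lab1, lab2)

# Every pair in the palette should exceed the deltaE > 30 robustness threshold;
# blue vs. green comes out at roughly 200, as stated above.
names = list(LAB)
for i, n1 in enumerate(names):
    for n2 in names[i + 1:]:
        print(f"{n1}-{n2}: dE = {delta_e76(LAB[n1], LAB[n2]):.1f}")
```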
A total of six texture patterns, covering three texture complexities and two versions (black and white, and color), are designed, as shown in Figure 1 below. Figure 1a–f correspond to the six different textures. To further differentiate them, these textures are named in order: Texture_1_bw, Texture_1_color, Texture_2_bw, Texture_2_color, Texture_3_bw, and Texture_3_color. Texture1–3 correspond to the three textures of different complexities mentioned above, while bw denotes the black and white type and color denotes the color type.
Figure 1. Figures (a,b) are the black and white type and color type with a low texture complexity;
Figures (c,d) are the black and white type and color type with a medium texture complexity; and
Figures (e,f) are the black and white type and color type with a high texture complexity.
2.2. Dataset Construction
An automated simulation dataset is developed, building on a methodology based on the Visualization Toolkit (VTK) [28]. As an open-source 3D visualization framework, VTK's object-oriented design and cross-platform characteristics support complex scene modeling and high-precision rendering. The core process includes three modules: 3D modeling, attitude control, and automatic annotation. It is implemented in a PyCharm environment using Python 3.8 and its integrated VTK library.
A. 3D modeling and texture mapping
The VTK geometric modeling class is used to build a sphere model, and its radius and
spatial position are set by parameterization. In order to realize the discernability of the
surface features, the designed texture is mapped to the surface of the model, and the texture
coordinate transformation mechanism is used to ensure its continuity and consistency on
the surface. The virtual imaging environment is configured with a simulation camera, a
multi-light source system, and a physical rendering engine, where the camera parameters
(focal length, field angle, and sensor size) strictly mimic real industrial inspection equipment
to ensure the physical consistency of the generated images.
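The generation code itself is not published; the minimal VTK sketch below illustrates the kind of pipeline described in this module, with a parameterized sphere, spherical texture coordinates for seamless wrapping, and an off-screen render window standing in for the virtual camera (the file name, radius, and image size are illustrative placeholders):

```python
import vtk

# Parameterized sphere geometry.
sphere = vtk.vtkSphereSource()
sphere.SetRadius(5.0)                      # illustrative radius
sphere.SetThetaResolution(128)
sphere.SetPhiResolution(128)

# Generate spherical texture coordinates so the pattern wraps continuously.
tex_coords = vtk.vtkTextureMapToSphere()
tex_coords.SetInputConnection(sphere.GetOutputPort())
tex_coords.PreventSeamOn()

# Load one of the designed texture images (path is a placeholder).
reader = vtk.vtkPNGReader()
reader.SetFileName("Texture_3_color.png")
texture = vtk.vtkTexture()
texture.SetInputConnection(reader.GetOutputPort())
texture.InterpolateOn()

mapper = vtk.vtkPolyDataMapper()
mapper.SetInputConnection(tex_coords.GetOutputPort())
actor = vtk.vtkActor()
actor.SetMapper(mapper)
actor.SetTexture(texture)

# Virtual camera and lighting: a basic renderer with off-screen output.
renderer = vtk.vtkRenderer()
renderer.AddActor(actor)
renderer.SetBackground(1.0, 1.0, 1.0)
window = vtk.vtkRenderWindow()
window.SetOffScreenRendering(True)
window.SetSize(224, 224)                   # illustrative image size
window.AddRenderer(renderer)
window.Render()
```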
B. Attitude control and data generation
The Euler angle rotation sequence of the sphere around the x-/y-/z-axes is defined. In order to avoid the angular coupling effect, the independent axis incremental step method is adopted and a sampling interval of 3° is set to generate a discrete attitude set. Each pose corresponds to a unique coordinate transformation matrix. After real-time calculation by the VTK rendering pipeline, the corresponding two-dimensional projection image is outputted; a single projected image corresponds to unique pose information. In addition, to improve the efficiency of the data generation, a batch script is designed to automate the process of iterating, rendering, and storing the attitude parameters.
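A sketch of this batch-generation loop is given below, reusing the `actor` and `window` objects from the previous snippet. The 3° step and the "X_Y_Z.png" naming come from the paper, while the angle range and the one-axis-at-a-time sweep are our reading of the independent-axis increment scheme:

```python
import os
import vtk

def render_pose_sweep(actor, window, step_deg=3, max_deg=360, out_dir="dataset"):
    """Vary one Euler axis at a time (independent-axis increments) and save one image per pose."""
    os.makedirs(out_dir, exist_ok=True)
    grabber = vtk.vtkWindowToImageFilter()
    grabber.SetInput(window)
    writer = vtk.vtkPNGWriter()
    for axis in range(3):                      # 0: x, 1: y, 2: z
        for angle in range(0, max_deg, step_deg):
            euler = [0, 0, 0]
            euler[axis] = angle
            actor.SetOrientation(*euler)       # absolute Euler-angle pose of the sphere
            window.Render()
            grabber.Modified()                 # force a fresh grab of the window contents
            grabber.Update()
            writer.SetInputConnection(grabber.GetOutputPort())
            # Implicit pose label: the file name itself encodes (X, Y, Z) in degrees.
            writer.SetFileName(os.path.join(out_dir, f"{euler[0]}_{euler[1]}_{euler[2]}.png"))
            writer.Write()
```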
C. Pose coding and dataset construction
An implicit encoding method for the pose parameters, based on the file name, is proposed. The Euler angles (pitch, yaw, and roll) are embedded into the image file name in the format "X_Y_Z.png" to avoid the data management redundancy caused by independent annotation files. A parsing script converts the angle values in the file name into a three-dimensional vector, which is used as the ground-truth label for network training (a minimal parsing sketch is given below). The final dataset contains the image pose information that supports end-to-end deep learning model training. Some of the datasets generated under different textures are shown in Figure 2 below.
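The parsing step referenced above reduces to splitting the file stem; a minimal sketch (the function name is ours):

```python
from pathlib import Path
from typing import Tuple

def pose_from_filename(path: str) -> Tuple[float, float, float]:
    """Recover the (pitch, yaw, roll) Euler angles encoded as 'X_Y_Z.png'."""
    x, y, z = Path(path).stem.split("_")
    return float(x), float(y), float(z)

# Example: "dataset/12_45_330.png" -> (12.0, 45.0, 330.0), used as the ground-truth label.
print(pose_from_filename("dataset/12_45_330.png"))
```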
Figure 2. Partial datasets generated by different textures. Figures (a–f) correspond to Texture_1_bw–Texture_3_color.
3. Simulations and Design Criterion
In this section, a series of simulations are conducted to verify the method's effectiveness. Firstly, the simulation image training sets generated from the differently textured particles in the previous chapter are utilized: a CNN model is trained for each texture, and its error on the validation set is evaluated. Secondly, the error on the test set is further examined through visual and quantitative analyses. By comparing the performance of the different texture datasets, a set of design rules for spherical particle texture is defined, and the final texture pattern is determined.
MobileNet is chosen as the model for this experiment [29], and a composite loss function, consisting of the mean absolute error (MAE) and the Pseudo-Huber loss function, is designed. For all the training in this section, 1600 training images, 597 validation images, and 100 test images were used. AdamW is chosen as the optimizer, with a learning rate of 0.0005 scheduled by a cosine annealing strategy. In addition, each model is trained for 20 epochs. The expression of the loss function is as follows:
Loss(y, ŷ) = (1/N) Σᵢ L(aᵢ), with L(a) = a², if a ≤ δ; L(a) = δ²(√(1 + (a/δ)²) − 1), if a > δ (3)
where N is the number of samples, a = |y − ŷ| is the absolute angular residual, and δ = 1° in this example.
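The training code is not released with the paper; the following PyTorch sketch is our reading of Equation (3) (squared error inside the δ band, Pseudo-Huber growth outside it), with the reported optimizer settings indicated in comments as assumptions:

```python
import torch

def composite_pose_loss(pred: torch.Tensor, target: torch.Tensor, delta: float = 1.0) -> torch.Tensor:
    """Piecewise loss of Equation (3): squared error for small residuals,
    Pseudo-Huber growth for residuals larger than delta (all angles in degrees)."""
    a = torch.abs(target - pred)                                    # a = |y - y_hat|
    small = a ** 2
    large = delta ** 2 * (torch.sqrt(1.0 + (a / delta) ** 2) - 1.0)
    return torch.where(a <= delta, small, large).mean()

# Training configuration reported in the paper (MobileNet backbone, AdamW, cosine schedule);
# the concrete model head below is an assumption, e.g. torchvision's mobilenet_v2 as a stand-in:
# model = torchvision.models.mobilenet_v2(num_classes=3)            # one output per Euler angle
# optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
# scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20)
```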
3.1. Verification Set Performance Comparison
In this section, the performance of the CNN on the validation sets under different
texture datasets is visually and quantitatively analyzed. The specific quantitative analysis
results are shown in Table 1, and the visual analysis results are shown in Figure 3. In
the table, (Err_X, Err_Y, Err_Z) represents the average angular error of the attitude angle
corresponding to each coordinate axis; (Std_X, Std_Y, Std_Z) stands for standard deviation
(Std); MAE stands for total mean absolute error; and RMSE stands for total root mean
square error, which accounts for error magnitudes and is less prone to cancellation effects
compared to the mean error metrics.
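As a reference for how these indicators are computed, the short NumPy sketch below implements the per-axis and total error statistics from their definitions; it is a generic illustration, not the authors' evaluation script, and it assumes the per-axis Std is the standard deviation of the absolute angular errors:

```python
import numpy as np

def pose_error_stats(pred: np.ndarray, target: np.ndarray) -> dict:
    """pred, target: arrays of shape (n_samples, 3) holding Euler angles in degrees."""
    err = np.abs(pred - target)
    return {
        "Err_XYZ": err.mean(axis=0),                      # mean absolute error per axis
        "Std_XYZ": err.std(axis=0),                       # spread of the absolute error per axis
        "MAE": err.mean(),                                # total mean absolute error
        "RMSE": np.sqrt(((pred - target) ** 2).mean()),   # total root mean square error
    }
```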
Table 1. Validation set performance index under different textures.
Texture          Err_X   Err_Y   Err_Z   Std_X   Std_Y   Std_Z   MAE     RMSE
Texture_1_bw     1.7485  0.663   1.121   0.706   0.529   0.873   1.178   1.469
Texture_1_color  0.769   0.654   1.499   0.493   0.454   1.534   0.974   1.189
Texture_2_bw     0.764   0.663   1.097   0.539   0.540   0.873   0.842   1.078
Texture_2_color  0.710   0.658   1.004   0.568   0.874   0.657   0.791   1.037
Texture_3_bw     0.305   0.654   0.517   0.246   0.449   0.426   0.492   0.661
Texture_3_color  0.284   0.367   0.596   0.211   0.242   0.466   0.416   0.543
All indicators are measured in degrees (°).
As shown in Table 1, the model's pose estimation accuracy improves progressively with increasing texture complexity. Texture1 yields the highest errors (MAE = 1.178° for black and white and 0.974° for color; RMSE = 1.469° and 1.189°), while the medium-complexity textures (Texture2) reduce the MAE to 0.842° and 0.791° (RMSE = 1.078° and 1.037°), respectively. Texture3, which is composed of complex textures, performs best, with an MAE of 0.492° for the black and white type and 0.416° for the color type and an RMSE of 0.661° and 0.543°, a 64.8% error reduction relative to Texture1. Notably, the RMSE values follow a similar decreasing trend to the MAE but emphasize a greater error magnitude reduction for the high-complexity textures. This trend shows that the complexity of the texture features may be highly correlated with the model's pose estimation ability.
Figure 3. Figures (a–f) plot the distribution of the verification error and Std under different complexities of black and white/color textures.
Color textures consistently outperform their black and white counterparts across all the complexity levels, with a lower MAE, RMSE, and Std, suggesting that color information strengthens feature discriminability. However, in low-complexity scenarios (Texture_1_color), the z-axis errors increase significantly, implying potential noise interference from color in simple patterns. Figure 3 further reveals that a higher texture complexity concentrates error distributions in low-value regions, particularly for color textures, with improved consistency across all axes.
3.2. Test Set Performance Comparison
In this section, the performance on the 100 test set images generated for each texture is analyzed and compared. The trained model is used to estimate the attitude of the test set, and the error distribution is visualized. The following visual analysis diagrams are drawn: a true-versus-predicted scatter plot, an error line plot, and an error frequency histogram. In the scatter plot, the more concentrated the data are on the line x = y, the more accurate the attitude prediction; otherwise, the larger the deviation. The line chart clearly and intuitively pictures the specific error distribution trend, while the histogram provides a statistical analysis of the error distribution along each axis. The detailed analysis is shown in Figure 4 below.
Figure 4. Figures (a–f) plot the true-versus-predicted scatter plot, error line plot, and error frequency histogram under different complexities of black and white/color textures.
In the test set, Texture1's color version underperforms its black and white counterpart, as seen in Figure 4a,b. Despite a lower MAE on the validation set, the color version exhibits a significant z-axis error deviation (around 2°) in the test set, highlighting the potential negative impact of color information on simple textures. Figure 4c,d demonstrate improved overall performance with the medium-complexity textures, with more accurate predictions on the test set; in this case, the color information enhances the prediction performance. While Figure 4c shows a larger z-axis error, Figure 4d presents a z-axis error distribution similar to the other axes. Figure 4e,f reveal the optimal model performance on the test set with the complex textures. The model predictions closely match the actual values, with error distributions within nearly 1° on each axis. Notably, Figure 4f shows a significant reduction in the z-axis error deviation with color information, resulting in a more uniform error distribution across all the axes.
These results indicate that color information has a more positive effect on feature extraction and overall model performance, particularly with complex textures.
3.3. Results, Discussion, and Design Criterion
In the performance test on the test set, the MAE, RMSE, and Std of the test set for each texture are recorded, and the texture characteristics and validation set performance are summarized, as shown in Table 2 below.
Table 2. Summary of test data for different textures.
Texture          Description                          Val_MAE  Val_RMSE  Test_MAE  Test_RMSE  Test_Std
Texture_1_bw     Black and white; low complexity      1.178    1.469     1.052     1.411      0.997, 0.625, 0.916
Texture_1_color  Color; low complexity                0.974    1.189     1.32      1.543      0.649, 0.48, 0.632
Texture_2_bw     Black and white; medium complexity   0.842    1.078     1.039     1.335      0.6, 0.639, 1.059
Texture_2_color  Color; medium complexity             0.791    1.037     1.008     1.237      0.659, 0.737, 0.762
Texture_3_bw     Black and white; high complexity     0.492    0.661     0.758     0.911      0.327, 0.665, 0.483
Texture_3_color  Color; high complexity               0.416    0.543     0.731     0.876      0.421, 0.422, 0.549
All indicators are measured in degrees (°); Test_Std lists the per-axis (x, y, z) standard deviations.
The experimental results demonstrate a strong positive correlation between texture complexity and pose estimation accuracy. The validation set performance progressively improves with increasing complexity, and the high-complexity color textures (Texture_3_color) achieve the optimal results (MAE: 0.416°, RMSE: 0.543°, and per-axis Std ≤ 0.549°). Notably, the grayscale and color variants exhibit parallel trends, with MAE reductions of 58.2% and 57.3%, respectively, from Texture_1 to Texture_3, underscoring complexity's universal benefit across the color modalities.
However, the test set analyses reveal critical nuances in model generalization. While color enhances the validation accuracy universally, its real-world impact proves complexity-dependent: the low-complexity color texture (Texture_1_color) suffers a 35.5% increase in test error relative to its validation error and a 9.4% RMSE degradation relative to its grayscale counterpart (1.411° to 1.543°), suggesting that chromatic noise dominates when structural features are sparse. Conversely, the high-complexity color texture (Texture_3_color) maintains superior test performance (MAE: 0.731°, RMSE: 0.876°), with the color-to-grayscale RMSE advantage persisting (0.876° versus 0.911°) despite the domain shift. This duality establishes texture complexity as a prerequisite for effective color utilization in pose estimation systems.
Based on the analysis results of the above experimental data and the previous theories,
a design criterion for the surface texture of spherical particles for attitude estimation is
finally established:
(1) Orientation uniqueness: it should be ensured that each view corresponds to a unique orientation, so that the model can distinguish between different poses;
(2) Proper proportion distribution: the proportions of the texture areas and blank areas should be appropriate, with a pixel ratio close to 1:1;
(3) Stripe spacing control: the stripe spacing should be moderate, and the annular ratio of the stripe spacing should be greater than or equal to 5%;
(4) Complex texture design: the texture design should include complex texture parts with obvious features to strengthen the feature response;
(5) Color complementarity: a color texture combination of black, white, green, and blue with a CIE-Lab color difference ∆E > 30 should be used to enhance the color information.
Finally, the color texture with a high texture complexity is chosen as the surface texture
of the spherical particles.
4. Experiments
To verify the accuracy of the texture design, the selected texture is applied to the simulated particle model, and physical spherical particles are 3D-printed from it. At the same time, a real machine vision system is built using an industrial CMOS camera, a triaxial angular displacement table, and a personal computer. Figure 5 shows a photo of the entire system setup.
Figure 5. The machine vision system.
In the experiments, the pose of the spherical particle is changed using the angular displacement table. The camera collects the 2D projection image corresponding to each 3D pose, which is then transmitted to the computer. The captured images are preprocessed so that they can be fed to the neural network; the network used is the same MobileNet model trained on the corresponding texture dataset.
The above system is used to collect 40 real images at different attitude angles, and, after processing the images, the MobileNet network is used to estimate the actual attitude. The error between the estimated result and the actual attitude is analyzed statistically. The detailed analysis is shown in Table 3 below, and a more specific visualization is shown in Figure 6 below.
Table 3. Error analysis of test data.
Parameter  Mean Error  Std    RMSE   Maximum
X-axis     2.717       2.34   3.585  11.843
Y-axis     3.275       3.718  4.955  15.273
Z-axis     3.223       2.031  3.810  8.511
All indicators are measured in degrees (°).
Figure 6. Box plot of test image error.
It can be seen from Table 3 that the model trained with the virtual dataset still achieves a low MAE (2.7–3.3°) in the practical application, which indicates good prospects for practical use. In the field of attitude estimation for symmetric objects such as spheres, Zimmermann obtained a matching error of about 2° and a weighted error of 3° by matching synthetic textures with stereo vision. Mathai's method, based on optimized surface patterns, achieved an MAE of about 4° in a controlled environment with SNR = 2. Song extended the matching algorithm to 6-DoF pose detection of complex parts by constructing a multi-view template library offline based on CAD models, achieving position errors < 2 mm and pose errors ≈ 3° [30]. In contrast, the texture optimization framework proposed in this paper achieves a lower average error (2.7–3.3°) in real scenes, indicating its effectiveness. However, the deviation along the y-axis is somewhat large, with a Std of 3.718°, an RMSE of 4.955°, and a maximum prediction error of about 15°, indicating that the model's predictions in this direction are less stable. The test error box plot in Figure 6 further reflects the error distribution along each coordinate axis. As can be seen from the figure, the median errors of the axes are relatively close, distributed around 2–3°, and the upper edge of each box lies almost within 4°, indicating that 75% of the predictions are fairly accurate and that the method is feasible. However, there are several outliers on the x-axis and y-axis, indicating that the prediction performance of the current model is not fully stable and deviations can occur. Further research is needed to understand and mitigate these anomalies for more robust real-world performance.
5. Discussion and Conclusions
The development of robust 3D pose estimation systems for symmetric objects, such as spheres, presents a crucial advancement in industrial automation, robotic manipulation, precision metrology [31,32], and the biomedical field [33]. Existing methods have explored the use of printed texture patterns on spheres for attitude determination, yet these techniques often struggle with limited generalization due to texture sensitivity and the trade-off between pattern complexity and computational demands. This work builds upon these existing efforts by addressing the limitations of traditional approaches through a texture-optimized design tailored for a synthetic dataset. Our results demonstrate that a high-complexity texture design, incorporating both multi-scale directional patterns and high chromatic contrast, leads to significant performance improvements. This enables substantial reductions in the MAE compared to low-complexity textures, thus directly tackling the dual bottlenecks of data scarcity and rotational ambiguity inherent in spherical object localization. This enhancement aligns with Tamura's texture theory, where multi-scale directional features and chromatic contrast improve feature discriminability, enabling robust pose prediction under varying viewpoints.
Theoretically, this work establishes a novel paradigm for texture-driven synthetic data
generation. The observed correlation between texture complexity and model accuracy
highlights the importance of the feature information provided by the texture feature design.
The dual role of color—enhancing the validation accuracy while introducing noise in low-
complexity scenarios—provides new insights into color–texture interactions. Here are the
three key findings from this study:
• Texture complexity dominance: high-complexity color textures (Texture_3_color) achieved the optimal accuracy, reducing errors by 64.8% compared to the low-complexity designs.
• Color–texture synergy: color enhanced performance on complex textures (test MAE of 0.731° and RMSE of 0.876°) but degraded the low-complexity results, emphasizing complexity as a prerequisite for effective color utilization.
• Real-world generalization: the physical tests confirmed the feasibility, with the average attitude error measured by the real system around 3° and 75% of the test data errors below 4°, which supports the feasibility of training the network with 2D data for 3D attitude estimation.
These results provide a foundation for texture-driven synthetic data systems, with applications in industrial detection and in related tasks of target attitude estimation and motion analysis.
This study is subject to two key limitations. First, the simulation framework assumes
ideal material–light interactions, which may not fully capture real-world scenarios with
reflective or translucent surfaces. Second, the rotational symmetries in spherical objects
introduce inherent ambiguity in pose estimation. Due to the rotational symmetry of a
sphere, an insufficient texture design can result in the object’s appearance after rotation
being indistinguishable from its original state, negatively impacting the measurement accu-
racy. This is particularly pronounced under extreme lighting variations, where cameras
struggle to capture subtle differences in surface textures (such as color gradients or tiny
marks). This difficulty further weakens the system’s ability to differentiate between various
rotation angles. Future work should integrate dynamic lighting models, like ray tracing
and material-aware texture mapping, for more realistic simulations. To overcome rotational
ambiguity, research should focus on designing more distinctive textures with invariant fea-
tures. Furthermore, exploring sensor fusion with IMUs and incorporating prior knowledge
of object motion could enhance pose estimation robustness. Moreover, future work should
prioritize the further optimization of the feature extraction capabilities and generalization
performance of datasets and models to achieve better pose estimation accuracy.
Author Contributions: Conceptualization, Y.S. and M.K.; Methodology, Y.S. and M.K.; Software,
Y.S.; Validation, Y.S.; Formal analysis, M.K., H.Y. and L.L.; Investigation, Y.S.; Resources, M.K.; Data
curation, L.L.; Writing—original draft, Y.S.; Writing—review & editing, H.Y. and L.L.; Visualization,
H.Y. and L.L.; Supervision, H.Y. and L.L.; Project administration, M.K. and L.L.; Funding acquisition,
M.K. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding authors.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
MAE Mean absolute error
Std Standard deviation
RMSE Root mean square error
References
1. Zhou, S.; Cao, W.; Wang, Q.; Zhou, M.; Zheng, X.; Lou, J.; Chen, Y. KMFDSST Algorithm-Based Rotor Attitude Estimation for a Spherical Motor. IEEE Trans. Ind. Inform. 2023, 20, 4463–4472. [CrossRef]
2. Hansen, J.G.; de Figueiredo, R.P. Active Object Detection and Tracking Using Gimbal Mechanisms for Autonomous Drone Applications. Drones 2024, 8, 55. [CrossRef]
3. Zhou, Z.; Zeng, C.; Tian, X.; Zeng, Q.; Yao, R. A Discrete Quaternion Particle Filter Based on Deterministic Sampling for IMU Attitude Estimation. IEEE Sens. J. 2021, 21, 23266–23277. [CrossRef]
4. Hao, D.; Zhang, G.; Zhao, H.; Ding, H. A Combined Calibration Method for Workpiece Positioning in Robotic Machining Systems and a Hybrid Optimization Algorithm for Improving Tool Center Point Calibration Accuracy. Appl. Sci. 2025, 15, 1033. [CrossRef]
5. Jiang, J.; Xia, N.; Yu, X. A feature matching and compensation method based on importance weighting for occluded human pose estimation. J. King Saud Univ. Comput. Inf. Sci. 2024, 36, 102061. [CrossRef]
6. Nadeem, U.; Bennamoun, M.; Togneri, R.; Sohel, F.; Rekavandi, A.M.; Boussaid, F. Cross domain 2D-3D descriptor matching for unconstrained 6-DOF pose estimation. Pattern Recognit. 2023, 142, 109655. [CrossRef]
7. Yu, X.; Zhuang, Z.; Koniusz, P.; Li, H. 6DoF object pose estimation via differentiable proxy voting loss. arXiv 2020, arXiv:2002.03923. [CrossRef]
8. Hou, H.; Xu, Q.; Lan, C.; Lu, W.; Zhang, Y.; Cui, Z.; Qin, J. UAV Pose Estimation in GNSS-Denied Environment Assisted by Satellite Imagery Deep Learning Features. IEEE Access 2020, 9, 6358–6367. [CrossRef]
9. Bogaart, M.V.D.; Jacobs, N.; Hallemans, A.; Meyns, P. Validity of Deep Learning-Based Motion Capture Using DeepLabCut to Assess Proprioception in Children. Appl. Sci. 2025, 15, 3428. [CrossRef]
10. Park, S.; Jeong, W.-J.; Manawadu, M.; Park, S.-Y. 6-DoF Pose Estimation from Single RGB Image and CAD Model Retrieval Using Feature Similarity Measurement. Appl. Sci. 2025, 15, 1501. [CrossRef]
11. Kubicki, B.; Janowski, A.; Inglot, A. Multimodal Augmented Reality System for Real-Time Roof Type Recognition and Visualization on Mobile Devices. Appl. Sci. 2025, 15, 1330. [CrossRef]
12. Hodaň, T.; Sundermeyer, M.; Drost, B.; Labbé, Y.; Brachmann, E.; Michel, F.; Rother, C.; Matas, J. BOP challenge 2020 on 6D object localization. In Proceedings of the Computer Vision–ECCV 2020 Workshops, Glasgow, UK, 23–28 August 2020; Proceedings, Part II 16; Springer International Publishing: Cham, Switzerland, 2020; pp. 577–594. [CrossRef]
13. Peng, S.; Liu, Y.; Huang, Q.; Zhou, X.; Bao, H. PVNet: Pixel-wise voting network for 6DoF pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 4561–4570.
14. Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision; Cambridge University Press: Cambridge, UK, 2003.
15. Asher, J.M.; Hibbard, P.B.; Webb, A.L. Perceived intrinsic 3D shape of faces is robust to changes in lighting direction, image rotation and polarity inversion. Vis. Res. 2024, 227, 108535. [CrossRef] [PubMed]
16. Zimmermann, R.; Gasteuil, Y.; Bourgoin, M.; Volk, R.; Pumir, A.; Pinton, J.-F.; International Collaboration for Turbulence. Tracking the dynamics of translation and absolute orientation of a sphere in a turbulent flow. Rev. Sci. Instrum. 2011, 82, 033906. [CrossRef] [PubMed]
17. Mathai, V.; Neut, M.W.M.; van der Poel, E.P.; Sun, C. Translational and rotational dynamics of a large buoyant sphere in turbulence. Exp. Fluids 2016, 57, 51. [CrossRef]
18. Will, J.B.; Krug, D. Dynamics of freely rising spheres: The effect of moment of inertia. J. Fluid Mech. 2021, 927, A7. [CrossRef]
19. Zhang, K.; Wang, W.; Cui, Y.; Lv, Z.; Fan, Y.; Zhao, X. Deep learning-based estimation of ash content in coal: Unveiling the contributions of color and texture features. Measurement 2024, 233, 114632. [CrossRef]
20. Wang, Z.; Zhang, X. Contextual recovery network for low-light image enhancement with texture recovery. J. Vis. Commun. Image Represent. 2024, 99, 104050. [CrossRef]
21. Tamura, H.; Mori, S.; Yamawaki, T. Textural Features Corresponding to Visual Perception. IEEE Trans. Syst. Man Cybern. 1978, 8, 460–473. [CrossRef]
22. Dzierżak, R. Impact of Texture Feature Count on the Accuracy of Osteoporotic Change Detection in Computed Tomography Images of Trabecular Bone Tissue. Appl. Sci. 2025, 15, 1528. [CrossRef]
23. Trevisani, S.; Guth, P.L. Terrain Analysis According to Multiscale Surface Roughness in the Taklimakan Desert. Land 2024, 13, 1843. [CrossRef]
24. He, T.; Zhong, Y.; Isenberg, P.; Isenberg, T. Design Characterization for Black-and-White Textures in Visualization. IEEE Trans. Vis. Comput. Graph. 2023, 30, 1019–1029. [CrossRef] [PubMed]
25. Goodman, J.W. Introduction to Fourier Optics; Roberts and Company Publishers: Colorado, CO, USA, 2005.
26. Zhang, S. High-speed 3D shape measurement with structured light methods: A review. Opt. Lasers Eng. 2018, 106, 119–131. [CrossRef]
27. Luo, M.R.; Cui, G.; Rigg, B. The development of the CIE 2000 colour-difference formula: CIEDE2000. Color Res. Appl. 2001, 26, 340–350. [CrossRef]
28. Schroeder, W.; Martin, K.M.; Lorensen, W.E. The Visualization Toolkit: An Object-Oriented Approach to 3D Graphics; Prentice-Hall, Inc.: Englewood Cliffs, NJ, USA, 1998; pp. 10–52.
29. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [CrossRef]
30. Song, W.; Guo, C.; Shen, L.; Zhang, Y. 3D pose measurement for industrial parts with complex shape by monocular vision. In Proceedings of the SPIE 10827, Sixth International Conference on Optical and Photonic Engineering (icOPEN 2018), Shanghai, China, 8–11 May 2018; p. 1082712. [CrossRef]
31. Balntas, V.; Doumanoglou, A.; Sahin, C.; Sock, J.; Kouskouridas, R.; Kim, T.K. Pose Guided RGBD Feature Learning for 3D Object Pose Estimation. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [CrossRef]
32. Cui, Y.; Hildenbrand, D. Pose estimation based on Geometric Algebra. GraVisMa 2009, 73, 7. Available online: https://www.researchgate.net/publication/286141050_Pose_estimation_based_on_Geometric_Algebra (accessed on 2 January 2025).
33. Ci, J.; Wang, X.; Rapado-Rincón, D.; Burusa, A.K.; Kootstra, G. 3D pose estimation of tomato peduncle nodes using deep keypoint detection and point cloud. Biosyst. Eng. 2024, 243, 57–69. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.