Comprehensive Assessment of Neural Network Synthetic Training Methods
using Domain Randomization for Orbital and Space-based Applications
Marco Peterson∗, Minzhen Du∗, Nadhir Cherfaoui∗, Alby Koolipurackal∗, Daniel D. Doyle†, Jonathan T. Black∗
∗Kevin T. Crofton Department of Aerospace and Ocean Engineering,
†Hume Center for National Security and Technology,
Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
Abstract— The continued advancement of neural networks and other deep learning architectures has fundamentally changed the definition of “state of the art” (SOTA) in a wide and ever-growing range of disciplines. Arguably the most
impacted field of study is that of computer vision, providing
flexible and general function-approximation frameworks capa-
ble of accurately and reliably performing object identification
and classification on a wide range of image datasets. However,
the impressive gains achieved by deep learning methods come at
a cost: the incredibly large number of images required to train a deep network makes these methods prohibitive for applications where large image datasets are limited or simply do not exist.
Collecting the required image data is often too expensive, too
dangerous, or too cumbersome to gather for many problem
sets. Space-based applications are a prime example of an imagery-limited domain due to the complex and extreme environment involved. Conversely, breakthroughs in space-based computer
vision applications would enable a wide range of fundamental
capabilities required for the eventual automation of this critical
domain, including robotics-based construction and assembly,
repair, and surveying tasks of orbital platforms or celestial
bodies. To bridge this gap in capability, researchers have started to rely upon 3D-rendered synthetic image datasets generated by advanced 3D rasterization software.
Generating synthetic data is only step one. Space-based computer vision is more complex than traditional terrestrial tasks due to the extreme variances encountered, which can confuse or degrade optical sensors and computer vision (CV) and machine learning (ML) algorithms. These
include orientation and translation tumble in all axes, inability
for a model to orient on a horizon, and extreme light saturation
on lit sides of celestial bodies and occlusion on dark sides
and in shadows. To produce a more reliable and robust result
for computer vision architectures, a new method of synthetic
image generation called domain randomization has started to
be applied to more traditional computer vision problem sets.
This method involves creating an environment of randomized
patterns, colors, and lighting while maintaining rigid structures
for objects of interest. This may prove to be a promising solution to the space-based variance problem. This paper explores
computer vision, domain randomization, and the necessary
computational hardware required to apply them to space-based
applications.
I. INTRODUCTION
Optical-based navigation deployed onboard the Mariner 6 and Mariner 7 missions [1] to Mars in 1969 is arguably the first use of computer vision for a space application.

*This work was supported by the Hume Center for National Security and Technology, and the Institute for Critical Technology and Applied Science (ICTAS): https://hume.vt.edu/ and https://ictas.vt.edu/

Fig. 1: Synthetic Training (top image) used for real-world inferencing (bottom image) on space-based applications.
This capability would later prove to be mission essential
technology for the Voyager missions a decade later. Today,
the most recent use of this technology can be found onboard
the recent history-making Mars Helicopter Ingenuity [2],
which employed a downward-facing monochrome camera
to capture relative motion velocity for real-time navigation
using velocimetry. This method of local navigation was selected in lieu of absolute navigation because sufficient image data to perform feature matching on the surface of Mars did not exist. So how do we extend computer vision capability past simple feature extraction and matching on known image datasets to generalized automation tasks in a wide range of environments, and how do we obtain the image datasets required?
Pioneering convolutional neural network (CNN) architectures such as AlexNet [4], GoogLeNet [5], ResNet [6], and, most recently, YOLO [7] have provided researchers and engineers with an automation tool to perform object identification and classification. However, the success of these neural network architectures has long been a product of training with human-labeled, real-world imagery.
A promising alternative to overcome the dependence on
these often time-consuming real world datasets is the adapta-
tion of physics and environmental engines such as the Unreal
Engine [8], Unity [9], and Blender [10] to generate synthetic datasets, allowing a faster and more efficient data collection process. Although it accelerates the impact of
machine learning on industries such as robotics, synthetic
data generation comes with numerous limitations due to
discrepancies between simulation and the real world. Even when the parameters of the simulation are altered to match those of the physical world, the process remains error-prone and ultimately unreliable due to physical behaviors, such as fluid dynamics, that cannot be fully incorporated into the simulator, a
problem otherwise known as the reality gap. In this paper,
we explore the concept of domain randomization [18]–[21], [23], [24], its effectiveness in closing the reality gap, the most recent advances in CNNs, and how to deploy them in orbit (Fig. 1).
A. Today’s State-of-the-Art CNNs
CNNs are networks of computational layers that employ
convolutions over grid-like topology input data (e.g. images
or time-series data) to extract feature maps. These architec-
tures are most commonly used to derive features from images
and videos because they are capable of deducing relation-
ships and dependencies between pixel structures. Due to the
rapid development of machine learning, new architectures
are developed monthly (if not weekly) that advance the state
of the art, but almost all of them use the same fundamental
building blocks. Today’s CNNs have two general functions:
feature extraction and classification.
Feature extraction is accomplished via the convolutional
block, which is further divided into three subparts. The first
is the convolution operation itself, which is governed by the
size and weights of the convolution kernel. Downstream,
an activation function (generally zero-centered) defines the
output of a node and adds non-linearity to the network.
Lastly, a pooling operation is often included which serves
to refine a large matrix space into smaller feature maps by
prioritizing the most-pronounced components of the input
and discarding the rest. After pooling is completed, a feature
map matrix containing the relevant information encoded
within the input image is returned. Multiple convolutional
blocks can be (and often are) chained together to further
refine the feature map.
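To make the convolutional block concrete, the following minimal PyTorch sketch chains a convolution, an activation function, and a pooling operation; the channel counts, kernel size, and input resolution are illustrative assumptions rather than values taken from any specific architecture.

```python
import torch
import torch.nn as nn

# One convolutional block: convolution -> activation -> pooling.
conv_block = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),  # learnable kernel weights
    nn.ReLU(),                     # adds non-linearity to the node outputs
    nn.MaxPool2d(kernel_size=2),   # keeps the most pronounced response in each 2x2 window
)

x = torch.randn(1, 3, 64, 64)      # dummy 64x64 RGB input
feature_map = conv_block(x)        # refined feature map of shape (1, 16, 32, 32)
```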
Classification is the last operation employed by (most)
CNNs. During this step, the feature map is processed via a
series of fully connected layers. This results in a probability score that represents the CNN's confidence that the input image contains or matches a defined object. This classification
is then compared against the ground truth of the (manually)
labeled input, and an error metric is computed. The error is
then backpropagated throughout the neural network to each
weight via (stochastic) gradient descent which minimizes the
error between the final probability output and the ground
truth.
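As an illustration of the classification stage described above, the short sketch below flattens a feature map, passes it through fully connected layers, computes an error against a labeled ground truth, and backpropagates that error with stochastic gradient descent; the layer sizes and class count are assumptions for demonstration only.

```python
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Flatten(),
    nn.Linear(16 * 32 * 32, 128),
    nn.ReLU(),
    nn.Linear(128, 10),            # confidence scores for 10 hypothetical classes
)
criterion = nn.CrossEntropyLoss()  # error metric against the labeled ground truth
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)

feature_map = torch.randn(1, 16, 32, 32)  # stand-in for the convolutional blocks' output
label = torch.tensor([3])                 # manually labeled ground-truth class index

loss = criterion(classifier(feature_map), label)
loss.backward()                           # backpropagate the error to every weight
optimizer.step()                          # gradient-descent update that reduces the error
```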
Given the extremely rapid pace of development in machine learning algorithms, techniques, and methodologies, it is nearly impossible to describe all of them exhaustively. Some of the best-performing algorithms for these tasks today are R-CNN [11], Fast R-CNN [12], and YOLO [7]. The YOLO architecture, which stands for “You Only Look Once,” and its accompanying Darknet framework are consistently regarded as leaders in the field of image classification and are the only ones capable of acceptable frame rates on lower-power edge hardware. The architecture has gone through four iterations since its conception to keep up with new advancements. YOLO derives its detection speed in part from the introduction of anchor boxes, defined as a predefined collection of boxes with widths and heights chosen to match object sizes in a particular dataset, or some other criteria based on the problem set. Instead of using the sliding-window method, in which an arbitrarily sized box requires multiple passes over an image in the hope of finding an object feature that matches the predefined window dimensions nearly perfectly, thousands of candidate anchor boxes are dispersed throughout a given input image. Each anchor box is used to calculate the Intersection over Union (IoU) and is assigned an objectness score that estimates the probability that an object exists within that anchor box, regardless of the predicted label, thereby requiring only a single pass over the image (Fig. 2). Objectness scores combined with object classification probabilities determine where an object is and what it is for a given input image. Both of these parameters are incrementally updated via backpropagation, improving the size and shape of the anchor boxes as the model is trained.
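The IoU measure that underpins the objectness score can be stated compactly; the following sketch computes it for two axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates, an assumed but common box convention.

```python
def iou(box_a, box_b):
    # Overlap area between the two boxes.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union = sum of the two areas minus the overlap counted twice.
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# An anchor box overlapping a ground-truth box over a 5x5 region:
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175, roughly 0.14
```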
B. How CNNs are Evaluated
Mean average precision is one of the industry standards for evaluating a machine learning architecture, labeling methodology, and input imagery with regard to their combined ability to classify and localize an object or set of objects. This metric is a function of the precision and recall curves defined in Equations 1 and 2, respectively, where TP = true positive, TN = true negative, FP = false positive, and FN = false negative. Precision and recall serve as accuracy metrics that quantify classification performance during training.
Fig. 2: Intersection over Union (IoU) bounding box threshold
examples.
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{1} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{2} \]
Average Precision is a measure of the area under the
Precision-Recall Curve (PR Curve) and is often defined as in
Equation 3. The Mean Average Precision (mAP) is simply
defined as in Equation 4.
\[ AP = \int_{0}^{1} p(r)\, dr \tag{3} \]

\[ mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i \tag{4} \]
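For reference, Equations 3 and 4 can be approximated numerically from sampled precision-recall points, as in the sketch below; the sample values are purely illustrative.

```python
import numpy as np

def average_precision(recall, precision):
    # Area under the precision-recall curve (Equation 3), trapezoidal rule.
    r = np.asarray(recall, dtype=float)
    p = np.asarray(precision, dtype=float)
    order = np.argsort(r)
    r, p = r[order], p[order]
    return float(np.sum((r[1:] - r[:-1]) * (p[1:] + p[:-1]) / 2.0))

def mean_average_precision(ap_per_class):
    # Mean of the per-class average precisions (Equation 4).
    return float(np.mean(ap_per_class))

ap_a = average_precision([0.0, 0.5, 1.0], [1.0, 0.8, 0.6])  # toy PR points, class A
ap_b = average_precision([0.0, 0.5, 1.0], [1.0, 0.6, 0.3])  # toy PR points, class B
print(mean_average_precision([ap_a, ap_b]))
```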
A brief background on computer vision used in space applications, the state of the art of CNNs, and one of the industry standards for evaluating classifier performance have been covered in this section. In Section II, an approach to
synthetic imagery is covered along with design questions that
help to address training data development for CNN models.
After initial training data development, improving model
performance through domain randomization is covered in
Section III. The ability to move from one domain to another
by domain adaptation is covered in Section IV. Next, Section V covers computer vision in the space industry today. Section VI addresses the space hardware required for deployment of a space-based computer vision system and takes a look at the tempo of related research. Lastly, conclusions and future work are provided in Section VII.
II. SYNTHETIC IMAGERY
The precision of neural networks has often been attributed
to architectural decisions such as number of layers, number
of neurons populating those layers, learning rates, as well
as the nonlinear and pooling functions adopted. Over the past decade, several best practices have been developed and combined in a variety of ways to produce more capable neural networks. However, the quality (or lack thereof) of the input data also has a profound impact on performance.
Models trained entirely on computer-generated worlds were introduced in 2017 [25] as a means of generating massive amounts of training data for models that could solve the complex self-driving automobile problem. Furthermore, by specifically leveraging the photo-realistic environment of Grand Theft Auto V (GTA 5) [21], this method demonstrated the ability to augment datasets with controllable parameters such as geographic region, time of day, and weather to increase model robustness. With a generated dataset of over 200,000 synthetic images pushed through the Faster R-CNN architecture, this method achieved a mean average precision of just under 70 percent at an IoU threshold of 0.7 when validated against the KITTI dataset [17]. Fortunately for this particular problem set, a high-fidelity world had already been created, with liberal copyright terms for research applications, and data could be generated directly from the game engine's buffers. The next step is expanding this capability to a wider range of use cases. Given the recent advancement
and proliferation of state of the art 3D rasterization engines
capable of rendering hyper-realistic environments at ever
increasing frame rates, it is now possible to substitute this
data with photo-realistic or cel-shaded versions generated
within a digital world. Tools such as Nvidia’s “Ray Tracing”
[14] and “deep learning super-sampling” (DLSS) [15], as
well as The Unreal Engine’s “Meta Human” [16] Creator
have removed significant technical barriers to entry. Synthetic
datasets are now becoming more and more practical for a growing number of research teams. Moreover, effectively
controlling desired parameters such as size, shape, distance,
and angle of every object and asset in a scene is a capability
difficult to replicate in the real world without the use of
a studio workspace. For these reasons, synthetic data has become increasingly popular both to reduce the cost of the data collection process and to introduce greater flexibility and variety. It provides a thorough understanding of and control over the simulated environment and can therefore be used to create large datasets. Several design questions must nevertheless be addressed:
•How accurately can the real world with all its complex
physical attributes be simulated?
•Which characteristics are relevant for modeling and
which ones are unnecessary?
•Which augmentations matter for a specific problem set
and which do not?
III. DOMAIN RANDOMIZATION
Domain randomization is simply the process of introducing enough variability into a simulation that real-world objects eventually appear as just another randomized permutation, allowing models to generalize to a greater extent than those trained on non-domain-randomized datasets. In the past, the use of
synthetic data has led to a problem known as the reality gap,
or the unavoidable deviation of any simulation from reality.
Domain Randomization is a method for creating simulated
data by constantly changing the environment of any given
dataset, reducing the need for hyper-realistic simulations.
Producing simulated data via domain randomization has, in some use cases, increased a neural network's ability to detect and classify real-world objects, not only inside the intended domain but across a wide range of domains.
Given the wide range of parameters that can be changed
dynamically within a synthetic environment, it is natural to
ask which parameters impact mean average precision the
most. The University of Toronto conducted an ablation study
[18] evaluating parameters such as lighting, asset textures,
orientation, Gaussian filters, and the use of flying distrac-
tors when validating their model against the KITTI dataset
[17]. Each parameter individually affected overall mAP50 by anywhere from 2% to 7%, with lighting variation having the largest impact. This approach was also applied to the task of
self-driving cars. For space-domain-specific computer vision applications, the approach can be posed as a question of which other variations can be used to increase model performance (a minimal sampling sketch follows the list below), such as:
•Variation of the size and distance of the objects of
interest
•Variation in the object of interest itself (if applicable)
•Angle of the camera with respect to the object of interest
•Light source intensity and camera aperture size
•Inferencing with supplemental lighting only
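A minimal sketch of how such variations might be sampled for each rendered image is given below; the parameter names and ranges are hypothetical placeholders and are not tied to any particular rendering engine or to the values used in the cited studies.

```python
import random

def sample_scene_config():
    # One randomized scene configuration per rendered training image.
    return {
        "object_distance_m": random.uniform(0.5, 50.0),    # size/distance variation
        "object_scale": random.uniform(0.8, 1.2),          # variation in the object itself
        "camera_elevation_deg": random.uniform(-90, 90),   # camera angle w.r.t. the object
        "camera_azimuth_deg": random.uniform(0, 360),
        "light_intensity": random.uniform(0.0, 10.0),      # light source intensity
        "camera_aperture_fstop": random.choice([1.4, 2.8, 5.6, 11.0]),
        "supplemental_light_only": random.random() < 0.2,  # occasionally no primary light source
        "background_texture_id": random.randrange(1000),   # randomized pattern/color
        "num_flying_distractors": random.randint(0, 20),
    }

configs = [sample_scene_config() for _ in range(10000)]    # one entry per synthetic image
```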
To apply these gains in mean average precision to real-world manipulation, specifically robotic arm control, several researchers, including a team from UC Berkeley [23], have implemented domain randomization. When applied to robotic arms, object classification, while still important, comes second to the ability to perform accurate object localization. Domain randomization generally improves model localization, allowing robotic arms to be accurate to within 1.5 cm while retaining the ability to identify and track objects even if they are partially occluded. However, domain randomization has another use case related to robotics. In the years to come, robotic manipulators will be tasked with grasping and maneuvering small parts and fasteners, such as bolts, screws, and other fastening hardware, to assemble larger structures. Synthetically generated, domain-randomized imagery can and should be used to train robotic platforms to identify and localize the often small parts necessary for construction or repair tasks. A sample is shown in Fig. 3.
Fig. 3: Synthetic Data Domain Randomization (Unreal Engine 4)

Domain randomization has a secondary benefit: its ability to supplement existing datasets by training a single end-to-end model on a wide range of imagery so that it can generalize and classify objects of interest across a wide range of domain environments. In relation to the self-driving problem set, this translates into the ability to classify a vehicle in any environment or weather condition across the planet, or the ability to classify new classes of vehicles outside the norm that might be found on roadways, such as golf carts. This approach has been shown to achieve state-of-the-art performance by using episodic training to gradually introduce new domains to the network using only real-world imagery [22]. Incorporating domain randomization into a large, domain-generalized dataset will not only be more cost effective when collecting large amounts of data, but may also provide grounds for significant increases in performance.
IV. DOMAIN ADAPTATION
While domain generalization attempts to train one model
for several domain use cases, domain adaptation is the ability to effectively train a new model for a specific target domain based on what has already been learned by another model in a different source domain. This
methodology [22] [27] is related to what is known as transfer
learning. Domain shift dramatically affects a deep network's performance because the extracted features become more and more specific with each training epoch. If the network becomes too specific, overfitting occurs, causing the network to fail to generalize to objects within the domain on which it was trained and therefore making it ineffective for other domains. A number of approaches have been proposed, including re-training the model in the target domain, adapting the weights of the model based on the statistics of the source and target domains, learning invariant features between domains, and learning a mapping from the target domain to the source
domain. Researchers in the reinforcement learning commu-
nity have also studied the problem of domain adaptation
by learning invariant feature representations, adapting pre-
trained networks, and other methods.
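One of the approaches listed above, adapting a pre-trained network, can be sketched as freezing the source-domain feature extractor and retraining only the final layers on target-domain data; the model below is a generic stand-in rather than any specific published architecture.

```python
import torch
import torch.nn as nn

# Source-domain model: a "backbone" feature extractor followed by a classification "head".
backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())
source_head = nn.Linear(16, 10)

for p in backbone.parameters():
    p.requires_grad = False            # freeze the features learned in the source domain

target_head = nn.Linear(16, 4)         # new head for the target domain's classes
adapted_model = nn.Sequential(backbone, target_head)

# Only the new head's parameters are updated during target-domain training.
optimizer = torch.optim.SGD(target_head.parameters(), lr=0.01)
```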
A group of researchers from Google Brain [26] used domain adaptation to reduce the amount of target-domain training data required, employing a generative adversarial network (GAN) to train a robotic arm to grasp objects with a dataset of nine million real images and more than eight million synthetic, domain-randomized images. By varying the percentage of real data used, the neural networks were tested on their ability to pick up previously unseen physical objects. They were trained with only synthetic data, only real data, and then synthetic data plus a percentage of real data. The neural network was able to match the accuracy of the full real dataset with only 2% of the real data when mixed with the synthetic data. In addition, combining the synthetic with the real data produced a significant gain in accuracy compared to only synthetic or only real data.
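A simple way to reproduce this kind of mixing in practice is to concatenate the full synthetic dataset with a small random subset of the real dataset; the sketch below assumes synthetic_ds and real_ds are placeholder PyTorch Dataset objects and that a 2% real-data fraction is desired.

```python
import random
from torch.utils.data import ConcatDataset, Subset

def mix_datasets(synthetic_ds, real_ds, real_fraction=0.02, seed=0):
    # Keep all synthetic images and a small random sample of the real images.
    rng = random.Random(seed)
    n_real = max(1, int(len(real_ds) * real_fraction))
    real_indices = rng.sample(range(len(real_ds)), n_real)
    return ConcatDataset([synthetic_ds, Subset(real_ds, real_indices)])
```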
Current trained neural networks will likely not work as well in space because they were trained under gravity and on solid foundations. When moving objects
in space, the reactionary forces caused by the robot’s own
movement and that of the object it is moving would come
into effect. Disturbances and vibrations become more im-
portant in space as these can propagate through the arm or
even move the spacecraft. However, space is simply another
domain that can be adapted to from a model trained on Earth.
In addition, by training these networks on synthetic data and a limited amount of real data from space, the neural networks can gain accuracy at a significantly higher rate and lower cost compared to training solely on real data.
V. SPACE INDUSTRY COMPUTER VISION
Perhaps the two most widely discussed topics in the field of space-based automation are On-Orbit Servicing, the process of rendezvousing and docking with a damaged or inoperable satellite and effecting the necessary repairs, and In-Space Assembly, the manufacture and/or assembly of the necessary materials to construct a structure in micro-gravity. The space shuttle program fulfilled servicing
missions on several occasions, including five repair missions
to the Hubble telescope. Of course, these missions were
completed with astronauts performing extravehicular activities (EVAs), increasing overall mission risk. These EVAs helped lay the groundwork for the International Space Station
(ISS). However, as our capabilities grow, a desire to shift these responsibilities from astronauts to robotic systems has percolated throughout the industry. Furthermore, automating these capabilities will become mission-critical technology as we look to exploit cislunar space and as our spacecraft travel further into the solar system. The aerospace industry has dedicated significant time and resources to solving these problem sets over the last several years, as detailed in Fig. 4.
Several thousand publications from leading organizations such as the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computing Machinery (ACM), and the Society of Photo-Optical Instrumentation Engineers (SPIE), illustrating solutions such as optimal orbits, the best materials for additive manufacturing, inverse kinematics in space, and the economics of such a venture, have been introduced over the last five years. However, publications employing the machine learning and computer vision capabilities needed to provide a true human-out-of-the-loop solution are far less numerous, as detailed in Fig. 5.
To date, there are even fewer publications employing domain randomization for computer vision use cases, as illustrated in Fig. 6, and, to our knowledge, no literature deploying domain randomization as a means to overcome the variance problem within the space domain.
VI. SPACE HARDWARE REQUIRED FOR
DEPLOYMENT
Once a computer vision model has been sufficiently trained for a desired task, deploying that model on-board a satellite, rover, or space station with the available hardware remains quite challenging.

Fig. 4: Number of In-Space Assembly and On-Orbit Assembly related publications from major Conferences and Journals 2016 - 2020

Fig. 5: Number of Computer Vision based In-Space Assembly and On-Orbit Assembly related publications from major Conferences and Journals 2016 - 2020
Conventional avionics control architectures are governed by one or more central on-board computers (OBCs), which act as a central hub for all other subsystems, including sensor data, and can generally be implemented with 8-bit microcontrollers. However, performing object inferencing in real time is computationally demanding, requiring a high-performance embedded computer with high-bandwidth communication protocols to the OBC.
Providing space-grade radiation hardening for such a subsystem is, for now, inherently expensive. The computational architecture will more than likely be modeled after a CUDA-capable [29] Graphics Processing Unit (GPU) with several gigabytes of onboard video memory (VRAM), and it will need to be protected against radiation, extreme vibration, hard vacuum, and large temperature variations.
Fig. 6: Number of Domain Randomization based Computer Vision related publications from major Conferences and Journals 2016 - 2020

Traditional commercial-off-the-shelf (COTS) hardware is ill-suited for the extreme rigors of space. There are several edge-computing COTS devices available today, such as the Jetson TX2 and Jetson Xavier [28], capable of running on-board computer vision architectures at acceptable frame rates, but almost all compute devices sold for terrestrial applications have a temperature rating between -30°C and +70°C, while exterior temperatures of the International Space Station (ISS) range between -157°C and +121°C. Lastly, perhaps the most problematic issue for space electronics is the possibility of bit flipping and data corruption caused by the high levels of ionizing radiation found within the Van Allen belts encapsulating our planet, solar particles, and cosmic rays from outside our solar system. Memory modules containing the trained computer vision model will require hardened electronics: corrupting the weights of the neural network, or the network itself, will degrade its capability to perform object detection and localization. However, to our knowledge, a correlation study between corrupted model data and its effect on mean average precision has not been conducted.
As of today, the latest radiation-tolerant 32-bit microprocessors, such as the LEON4 [30] or RAD750 [31], are capable of just over 400 DMIPS (Dhrystone Million Instructions Per Second), an architecture-independent performance metric. Compared to the 32 TeraOPS (trillion operations per second) required to run the most modern computer vision architectures, such as YOLOv4, above 30 frames per second (FPS), space-grade CPUs are orders of magnitude short of the necessary computational capability. However, dedicated graphics processing units use 32-bit floating-point arithmetic for rasterizing polygons, a level of mathematical precision not necessarily required for the multiply-and-accumulate operations used in machine learning. Using dedicated analog or digital machine learning integrated circuits capable of high volumes of low-precision calculations, such as 8-bit integer computational architectures, will allow significantly more calculations at a lower energy cost.
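As a simple illustration of trading precision for efficiency, the sketch below applies post-training dynamic quantization to the linear layers of a toy PyTorch model, converting their weights to 8-bit integers; the model is a placeholder, and a real deployment would quantize the trained detection network instead.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 10))
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 1024)
print(quantized(x).shape)  # inference now uses int8 weights for the linear layers
```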
Fig. 7: Radiation Hardened vs. Traditional Commercial-off-
the-shelf computational hardware
Given the constraints of satellite design, particularly the
cost to deliver mass to orbit and limited power distribution
provided by onboard solar panels, designing high perfor-
mance avionics and computation architectures required to
effectively deploy not only computer vision algorithms but
general machine learning technologies to the space domain
is still a significant challenge that will need to be addressed
before true automation can be achieved.
VII. CONCLUSIONS AND FUTURE WORK
Given the promising applications of CNN-based computer vision architectures, synthetic data generation, and domain randomization, applying these technologies to space-based problem sets may prove to be the mission-essential end-to-end solution needed to achieve a truly autonomous on-orbit robotics capability.
In future work, we intend to start exploring this capability
by bridging both the reality gap as it pertains to the space
environment and the gap in literature on several of the unan-
swered questions outlined in this paper by demonstrating:
•The performance of a neural network trained solely on domain-randomized synthetic datasets when validated against real-world imagery within the space domain.
•How effective the combination of real-world and synthetic training (domain generalization) is at improving accuracy.
•The impact of domain randomization’s characteristics
such as the introduction of flying distractors as well as
various textures and lighting on mean average precision
results.
•The limitations of simulation and the overall performance of neural networks trained on synthetic data.
REFERENCES
[1] “Mariner 6 amp; 7,” NASA, 07-Sep-2019. [Online]. Available:
https://mars.nasa.gov/mars-exploration/missions/mariner-6-7/
[2] J. Balaram, M. M. Aung, and M. P. Golombek, “The Ingenuity
Helicopter on the Perseverance Rover,” Space Sci. Rev., vol. 217, no.
4, pp. 1–11, 2021
[3] “ImageNet Large Scale Visual Recognition Challenge 2017 (ILSVRC2017).”
[4] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. ”Imagenet
classification with deep convolutional neural networks.” Advances in
neural information processing systems 25 (2012): 1097-1105.
[5] C. Szegedy et al., “Going deeper with convolutions,” Proc. IEEE
Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 07-12-June-
2015, pp. 1–9, 2015, doi: 10.1109/CVPR.2015.7298594.
[6] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image
recognition,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern
Recognit., vol. 2016-December, pp. 770–778, 2016.
[7] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “YOLOv4: Optimal Speed and Accuracy of Object Detection,” arXiv [cs.CV], 2020.
[8] Unreal Engine. [Online]. Available: https://www.unrealengine.com/en-
US/unreal
[9] U. Technologies, “Solutions,” Unity. [Online]. Available:
https://unity.com/solutions.
[10] Blender Foundation, “blender.org - Home of the Blender project - Free
and Open 3D Creation Software,” Blender.org. [Online]. Available:
https://www.blender.org.
[11] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Tech report (v5)
R-CNN: Regions with CNN features,” Proc. ieee Conf. Comput. Vis.
pattern Recognit., 2014
[12] E. Hanna and M. Cardillo, “Faster R-CNN2015,” Biol. Conserv., vol.
158, pp. 196–204, 2013.
[13] A. Geiger, P. Lenz, and R. Urtasun, “VSLAM Datasets,” Proc. IEEE
Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 3354–3361,
2012.
[14] B. Bitterli, C. Wyman, M. Pharr, P. Shirley, A. Lefohn, and W. Jarosz,
“Spatiotemporal reservoir resampling for real-time ray tracing with
dynamic direct lighting,” ACM Trans. Graph., vol. 39, no. 4, 2020,
doi: 10.1145/3386569.3392481.
[15] “NVIDIA DLSS 2.0: A Big Leap In AI Rendering,” Nvidia.com. [On-
line]. Available: https://www.nvidia.com/en-us/geforce/news/nvidia-
dlss-2-0-a-big-leap-in-ai-rendering/.
[16] Unreal Engine, “Early Access to MetaHuman Creator is now
available!,” Unreal Engine, 14-Apr-2021. [Online]. Available:
https://www.unrealengine.com/en-US/blog/early-access-to-
metahuman-creator-is-now-available.
[17] A. Geiger, P. Lenz, and R. Urtasun, “VSLAM Datasets,” Proc. IEEE
Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 3354–3361,
2012.
[18] J. Tremblay et al., “Training deep networks with synthetic data:
Bridging the reality gap by domain randomization,” arXiv [cs.CV],
2018
[19] J. Tremblay, A. Prakash, D. Acuna, M. Brophy, V. Jampani, C. Anil,
T. To, E. Cameracci, S. Boochoon, and S. Birchfield, “Training Deep
Networks with Synthetic Data: Bridging the Reality Gap by Domain
Randomization,” 2018 IEEE/CVF Conference on Computer Vision and
Pattern Recognition Workshops (CVPRW), 2018.
[20] J. Huang, D. Guan, A. Xiao, and S. Lu, “FSDR: Frequency Space
Domain Randomization for Domain Generalization,” pp. 6891–6902,
2021, [Online]. Available: http://arxiv.org/abs/2103.02370.
[21] X. Yue, Y. Zhang, S. Zhao, A. Sangiovanni-Vincentelli, K. Keutzer,
and B. Gong, “Domain randomization and pyramid consistency:
Simulation-to-real generalization without accessing target domain
data,” Proc. IEEE Int. Conf. Comput. Vis., vol. 2019-Octob, pp.
2100–2110, 2019, doi: 10.1109/ICCV.2019.00219.
[22] B. Huang, S. Chen, F. Zhou, C. Zhang, and F. Zhang, “Episodic Train-
ing for Domain Generalization Using Latent Domains,” Commun.
Comput. Inf. Sci., vol. 1397 CCIS, pp. 85–93, 2021, doi: 10.1007/978-
981-16-2336-37.
[23] J. Tobin, R. Fong, A. Ray, J.Schneider, W.Zaremba, P. Abbeel,
”Domain Randomization for Transferring Deep Neural Networks
from Simulation to the Real World,” 2017 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS)
[24] E. Ameperosa, P. Bhounsule, ”Domain Randomization Using Deep
Neural Networks for Estimating Positions of Bolts,” Journal of Com-
puting and Information Science in Engineering, 2018,
[25] M. Johnson-Roberson, C. Barto, R. Mehta, S. N. Sridhar, K. Rosaen,
and R. Vasudevan. Driving in the matrix: Can virtual worlds replace
human-generated annotations for real world tasks? In ICRA, 2017
[26] K. Bousmalis, A. Irpan, P. Wohlhart, Y. Bai, M. Kelcey, M. Kalakrishnan, L. Downs, J. Ibarz, P. Pastor, K. Konolige, S. Levine, and V. Vanhoucke, “Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping,” 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018.
[27] M. Wang and W. Deng, “2018 Survey,” Neurocomputing, vol. 312,
pp. 135–153, 2018.
[28] “Jetson AGX Xavier Developer Kit,” Nvidia.com, 09-Jul-2018. [On-
line]. Available: https://developer.nvidia.com/embedded/jetson-agx-
xavier-developer-kit
[29] “CUDA Toolkit,” Nvidia.com, 02-Jul-2013. [Online]. Available: https://developer.nvidia.com/cuda-toolkit.
[30] “LEON4,” Gaisler.com. [Online]. Available: https://www.gaisler.com/index.php/products/processors/leon4.
[31] Baesystems.com. [Online]. Available:
https://www.baesystems.com/en-us/product/radiation-hardened-
electronics