Vastextures: Vast repository of textures and PBR materials
extracted from real-world images using unsupervised methods
Sagi Eppel
Abstract¹
Vastextures is a vast repository of 500,000 textures and PBR materials extracted from real-world
images using an unsupervised process. The extracted materials and textures are extremely
diverse and cover a vast range of real-world patterns, but are at the same time less refined than
those in existing repositories. The repository is composed of 2D textures cropped from natural
images and SVBRDF/PBR materials generated from these textures. Textures and PBR materials
are essential for CGI. Existing material repositories focus on games, animation, and art, which
demand a limited number of high-quality assets. However, virtual worlds and synthetic data are
becoming increasingly important for training AI systems for computer vision. This application
demands a huge number of diverse assets but is, at the same time, less affected by noisy and
unrefined ones. Vastextures aims to address this need with a free, huge, and diverse asset
repository that covers as many real-world materials as possible. The materials are automatically
extracted from natural images in two steps: 1) Automatically scanning a vast number of images
to identify and crop regions with uniform textures. This is done by splitting each image into a grid
of cells and identifying regions in which all of the cells share a similar statistical distribution.
2) Extracting the properties of the PBR material from the cropped texture. This is done by
randomly guessing every correlation between the properties of the texture image and the
properties of the PBR material. The resulting PBR materials exhibit a vast number of real-world
patterns as well as unexpected emergent properties. Neural nets trained on this repository
outperformed nets trained using handcrafted assets. 500,000 textures and PBR materials have
been made available at 1,2,3,4.
¹ Vastextures was published as part of “Learning Zero-Shot Material States Segmentation, by Implanting Natural Image Patterns in
Synthetic Data”[1]; please refer to that work in citations. This document gives a more focused and detailed description of the repository.
Figure 1) Sample PBR materials (top) and 2D textures (bottom) from the Vastextures repository.
500,000 textures and SVBRDF/PBR materials have been made freely and publicly available.
1. Introduction
Textures and physics-based rendering (PBR) materials are crucial for representing the visual
appearance of materials in computer-generated images (CGI) for animation, games, and other
arts2-14. Textures and PBR materials are also becoming increasingly important for generating
virtual worlds and synthetic data for training artificial intelligence systems15-24. Materials are
usually represented either as 2D textures for 2D scenes or as physics-based rendering (PBR)
materials (also called SVBRDF materials), which describe the distribution of material properties
as a set of maps. Each map describes the distribution of one property across the material surface25,26.
Existing texture and PBR material repositories2-14 are mostly created using a handcrafted
approach in which artists identify interesting textures in images and use intuition to “guess” the
physical properties of the material from the RGB image27-30. An alternative approach uses AI
(trained on handcrafted PBRs) to generate new PBR materials from input text or images31-41.
Both handcrafted and AI-based approaches focus on generating assets for CGI artists, who
demand highly refined, visually appealing assets. However, handcrafted methods are limited in
the quantity of assets that can be created27-30, while AI is limited by the diversity of the data on
which it was trained, which is mostly composed of handcrafted assets31-41. As such, both have
limited diversity and are not well suited for generating the diverse and huge amount of data
needed for training AI. Vastextures aims to fill this gap by creating a giant and highly diverse
repository extracted from natural images with minimal restrictions. The core idea is to
cast a wide net, capturing as many textures and real-world patterns as possible, even at the cost
of capturing unrelated patterns and noisier assets in the process. The repository is composed of
two parts: 1) 2D textures extracted from real-world images (Figure 1, bottom), and 2) PBR
materials generated from these textures (Figure 1, top).
2D texture repositories and datasets: These have been available mostly for artists and academic
research, and are usually limited to a few hundred to a few thousand texture images divided into
a few classes2-14,32,42-52. In contrast, Vastextures contains 200,000 textures without being limited
to given categories or domains.
PBR material repositories: Existing PBR material repositories are composed of a few hundred
to a few thousand high-quality, manually made materials3,4,9-14. The Vastextures repository does
not aim to compete with these repositories, but rather to supplement them where they are most
limited: the number and diversity of assets. Vastextures contains 300,000 PBR materials
(not including mixes), each extracted from a unique texture image.
2. General approach
Extracting textures and PBR materials from images is done in two steps:
1. Identifying and cropping image regions with uniform textures, which often imply a
uniform material.
2. Using the cropped texture to guess the physical properties of the SVBRDF/PBR material
it represents.
3. Identifying and extracting uniform textures from images
Extracting textures from images is mostly done either manually or using AI. Both approaches
can generate high-quality assets for CGI art. However, manual extraction is limited in quantity,
while AI-based approaches are limited by the diversity of the assets they were trained on31-41.
Figure 2) The material extraction method, shown on two examples: Pick an image and divide it
into a grid of square cells. For every grid cell, extract the distribution of features (colors,
gradients). Identify image regions in which all cells have similar distributions as uniform textures.
d,e) Pick random channels from the extracted texture image, augment them, and use the resulting
maps as the property maps (roughness, metallic, height, etc.) of the PBR material.
3.1. Extracting textures from images using statistical similarity
An image texture can be defined as an area with some uniform distribution of patterns. Hence,
we can find textures by measuring the distribution of patterns across the image and searching for
square regions in which this distribution is uniform. More precisely, we can split the image into
a grid and identify regions in which all the cells share a similar distribution of colors and
gradients as having uniform textures (Figure 2). Methods for extracting textures by statistical
similarity53-58 have been heavily explored since the 1970s, but were mostly discarded in favor of
AI and machine learning tools. This is largely because the extracted textures are noisy and
include not only materials but any region with repeating patterns, like crowds of people, waves at
sea, or uniform blocks of buildings (Figures 1, 4). In addition, these methods are sensitive to
uneven illumination and will not extract regions with uniform materials but uneven light and
shadows. As such, statistical methods are far less useful for producing the high-quality assets
needed by CGI artists. In contrast, neural nets like GPT59 and CLIP60,61 prove that AI can learn
from very noisy data and achieve high levels of world understanding. On the other hand, nets
trained on clean but narrow domains of data perform poorly when applied to general real-world
tasks60,62. Hence, the highly general but noisy nature of the patterns extracted by statistical
methods makes them well suited for training neural nets. While the method might miss a
significant number of textures due to non-uniform illumination, we can expect that, for a large
enough set of images, any important texture will appear in many settings, at least some of them
under uniform light, where it will be easily identified and extracted. Hence, by scanning a large
enough set of images, we are likely to extract any major material or texture. Samples of extracted
textures are shown in Figures 1 and 4. It can be seen that this approach captures a wide range of
materials, but at the same time captures many unrelated regular patterns like waves at sea, dense
vegetation, crowds of people, or tiled architecture.
3.2. Detailed implementation
The approach for texture extraction (Figure 2) uses the following steps (a minimal code sketch is
given after the list):
1. Resize and split the image into a grid of square cells of around 40x40 pixels each.
2. For each cell, collect the distribution (histogram) of feature values in the cell (the
features are the R, G, B colors and their gradients).
3. Identify regions in which all cells share a similar distribution of features as having
uniform textures. In other words, regions of 6x6 or 7x7 cells or more, in which every pair
of cells shares a similar distribution of colors and gradients, are considered to belong to
the same texture and are extracted.
4. The similarity between two cells is measured by the Jensen-Shannon distance between
their color and gradient distributions. Colors are taken as the R, G, B image channels, and
gradients are the vertical and horizontal differences between neighboring pixels (Sobel filter).
5. Filter out regions with too uniform a distribution of colors, as these just represent smooth,
featureless regions.
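The sketch below illustrates this procedure in Python. It is a minimal illustration, not the exact published implementation: the ~40-pixel cells and the 6x6-cell minimal region follow the text, while the number of histogram bins, the Jensen-Shannon threshold, and the simple exhaustive scan are assumptions made for clarity.

```python
# Minimal sketch of texture extraction by statistical similarity (section 3.2).
# Cell size and region size follow the text; the bin count and the similarity
# threshold are illustrative assumptions.
import numpy as np
from scipy.spatial.distance import jensenshannon

CELL = 40         # cell size in pixels (approximate value from the text)
REGION = 6        # minimal uniform region, in cells (6x6 per the text)
THRESHOLD = 0.25  # maximal Jensen-Shannon distance between cells (assumed)
BINS = 32         # histogram bins per feature channel (assumed)

def cell_histograms(image):
    """Per-cell normalized histograms of the R,G,B channels and their
    vertical/horizontal gradients."""
    img = image.astype(np.float32)
    gy, gx = np.gradient(img, axis=(0, 1))
    # 3 color channels + 6 gradient channels, shifted into the [0,255] range.
    feats = np.concatenate([img, gx + 128, gy + 128], axis=2)
    rows, cols = image.shape[0] // CELL, image.shape[1] // CELL
    hists = np.zeros((rows, cols, feats.shape[2], BINS))
    for r in range(rows):
        for c in range(cols):
            patch = feats[r * CELL:(r + 1) * CELL, c * CELL:(c + 1) * CELL]
            for f in range(feats.shape[2]):
                h, _ = np.histogram(patch[..., f], bins=BINS, range=(0, 255))
                hists[r, c, f] = h / (h.sum() + 1e-9)
    return hists

def cells_similar(h1, h2):
    """Two cells match if every feature histogram is close in JS distance."""
    return all(jensenshannon(h1[f], h2[f]) < THRESHOLD for f in range(len(h1)))

def find_uniform_regions(image):
    """Yield pixel coordinates of REGION x REGION cell blocks in which every
    pair of cells shares a similar feature distribution (step 3)."""
    hists = cell_histograms(image)
    rows, cols = hists.shape[:2]
    for r in range(rows - REGION + 1):
        for c in range(cols - REGION + 1):
            block = hists[r:r + REGION, c:c + REGION].reshape(-1, *hists.shape[2:])
            if all(cells_similar(block[i], block[j])
                   for i in range(len(block)) for j in range(i + 1, len(block))):
                yield r * CELL, c * CELL, REGION * CELL  # top, left, size
```

The smoothness filter of step 5 and larger region sizes would be applied on top of this loop; as noted below, the thresholds and sizes were tuned per image repository by visual inspection.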
A lot of work has been done in this field53-62, and the goal here is not to take the most
sophisticated approach, but rather the simplest. Images were taken from the Segment Anything
repository63 and the Open Images repository64,65. Parameters like image resizing, cell size,
minimal texture size, and similarity threshold were all manually tuned for each repository by
visual inspection.
4. Extracting SVBRDF/PBR material properties from texture images
In order to simulate materials in 3D environments, it is not enough to have a picture of the
texture. To predict how the material will appear from different angles and under different
illuminations, it is necessary to know various physical properties, like albedo, roughness,
transparency, and normals, and how they vary across the material surface25-30. Hence, we need to
take the RGB texture and extract a map for each material property from it (Figure 2). There is no
deterministic way to extract material properties from a simple RGB image; all options involve
guessing in one way or another. Guessing can be based on experience and intuition for
handcrafted assets27-30, or on statistical learning for AI methods31-41. Here again, the AI learns by
seeing a large number of examples handcrafted by humans32. Hence, AI is limited by the
diversity of training examples, while humans are limited by the number of assets they can create.
4.1. Casting a wide net: Guessing every correlation
AI and manual approaches try to guess the most likely properties of the material in the image.
In contrast, our approach is to guess every possible correlation between simple image properties
(R, G, B, H, S, V…) and the PBR material property maps (roughness, height, metallic…). More
precisely, we assume that the material properties are correlated with the texture image properties
in some simple way, and can therefore be derived from them: for example, more reflective
regions are brighter, or cracks are darker. However, the exact way in which the image properties
are correlated with the material properties is unknown. The solution is not to try to find the most
statistically likely correlation, but instead to guess every correlation between image properties and
material properties (Figure 2). Hence, for every PBR material property (roughness, metallic,
height, transmission...), we randomly pick one of the image property maps (R, G, B, H, S, V…).
For example, we may randomly pick the brightness channel of the image, scale and augment it,
and then use it as the roughness map of the PBR material (Figure 2); or pick the red channel of
the image and use it, with some augmentation, as the height map. For the normal map, we take
the gradient of the height map. For all maps except normals, we also randomly use uniform
random values, and in some cases soft and hard thresholding.
4.2. Explanation and justification
The reasoning behind the above approach is that the resulting PBR materials will end up using
all the patterns in the texture image, for each and every material property. Hence,
we get maximum variability and maximum utilization of the patterns in the image, with
minimal assumptions. As a result, all the information we can extract from the image will be
embedded in every physical property a PBR can simulate (for a large enough set). The
textures themselves are extracted from a vast range of images in an unsupervised way and
therefore contain an almost unlimited amount of real-world patterns. This means that the resulting
PBRs will likely represent any important or common texture and pattern in the world, in any of
its physical properties (again, the assumption is that important material patterns will appear
in a wide range of images and will be extracted in at least some of them).
This approach is very different from AI and manual approaches, which try to guess the single most
likely correlation based on intuition or past examples, leading to more accurate but also more
narrowly distributed materials. The Vastextures PBRs are far less likely to accurately represent the
properties of the material in the image, but they cover a far broader range of domains and are far
more likely to embed the full range and complexity of patterns in the world.
4.3. Detailed implementation
Materials are generated using the following steps (a minimal code sketch follows the list):
1. For each property of the PBR material (albedo, roughness, metallic, height,
transmission...), randomly choose one or more of the image property maps. The
property maps include the red, green, and blue channels and the hue, saturation, and
value channels.
2. Randomly augment this map by rotating the color space, scaling by a random value
(multiplication), adding a random value, taking the negative of the map (1 - map), or
applying soft or hard thresholding or a color ramp. These augmentations are applied randomly.
3. Use the augmented map (steps 1-2) as the property map of the PBR.
4. For all properties other than normal and color, in a fraction of cases, give the property a
uniform random value.
5. For normal maps, use the gradient of the height map, with random scaling.
6. For base color/albedo, use the RGB of the image. In a small fraction of cases, apply
augmentations like color space rotation.
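The sketch below illustrates these steps in Python for an RGB uint8 texture. It is a simplified illustration: the augmentation probabilities and ranges, the fraction of uniform-value maps, and the normal-map scaling range are assumptions, and color ramps, soft thresholding, and albedo augmentations are omitted.

```python
# Minimal sketch of PBR generation from a texture image (section 4.3).
# Probabilities, ranges, and the omitted augmentations are assumptions.
import numpy as np
import cv2  # used only for the RGB -> HSV conversion

def random_property_map(texture, rng, uniform_prob=0.1):
    """Steps 1-4: pick a random image channel (R,G,B,H,S,V), augment it,
    and return it as a property map in [0,1]."""
    if rng.random() < uniform_prob:  # step 4: uniform random value
        return np.full(texture.shape[:2], rng.random(), dtype=np.float32)
    hsv = cv2.cvtColor(texture, cv2.COLOR_RGB2HSV)
    channels = list(np.moveaxis(texture, 2, 0)) + list(np.moveaxis(hsv, 2, 0))
    prop = channels[rng.integers(len(channels))].astype(np.float32) / 255.0
    if rng.random() < 0.5:  # step 2: random scaling and offset
        prop = prop * rng.uniform(0.3, 1.5) + rng.uniform(-0.2, 0.2)
    if rng.random() < 0.2:  # step 2: negative of the map (1 - map)
        prop = 1.0 - prop
    if rng.random() < 0.2:  # step 2: hard thresholding
        prop = (prop > rng.uniform(0.2, 0.8)).astype(np.float32)
    return np.clip(prop, 0.0, 1.0)

def height_to_normal(height, scale):
    """Step 5: normal map from the (randomly scaled) height-map gradient."""
    gy, gx = np.gradient(height * scale)
    normal = np.stack([-gx, -gy, np.ones_like(height)], axis=2)
    return normal / np.linalg.norm(normal, axis=2, keepdims=True)

def texture_to_pbr(texture, seed=0):
    """Generate a full set of PBR property maps from one RGB texture."""
    rng = np.random.default_rng(seed)
    pbr = {"base_color": texture.astype(np.float32) / 255.0}  # step 6
    for prop in ("roughness", "metallic", "height", "transmission"):
        pbr[prop] = random_property_map(texture, rng)
    pbr["normal"] = height_to_normal(pbr["height"], scale=rng.uniform(1, 10))
    return pbr
```

Because each property map is drawn independently from the same texture, the same image pattern can end up controlling roughness in one material and height in another, which is the source of the emergent behaviors discussed in Section 4.4.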
4.4. Resulting PBR materials
Samples of the resulting PBRs are shown in Figures 1 and 3. It can be seen that these materials
contain a vast number of real-world material textures, as well as things like waves, branches,
feathers, and many others. Some of the generated PBRs seem to be good representations of the
materials in the images from which they were generated. However, many of the generated
materials show emergent, unique properties that do not appear in the images but nonetheless
represent possible materials. For example, a pattern of tree branches can be mapped to the lower
regions of a height map, creating crack-like structures; when the same branching structure is
mapped to a bump map, it gives vein-like patterns. These emergent properties are similar to those
of procedural generation techniques66-68. However, procedural textures are limited to a set of
predefined building blocks, while for Vastextures the building blocks are themselves unlimited
natural patterns.
5. Using Vastextures materials for training AI systems
To examine the validity of this repository for training neural nets, we used the MatSeg dataset for
zero-shot segmentation of materials1. We re-created this dataset twice: once using handcrafted
assets that encompass almost all the freely and publicly available PBRs, and once using the
Vastextures PBRs. The net trained on data constructed using Vastextures achieved 92% accuracy,
compared to 89% for the net trained on handcrafted assets.
Figure 3) More samples of Vastextures PBR materials.
Figure 4) More samples of extracted textures (textures extracted from the Open Images dataset).
6. References
[1] Eppel S, Li J, Drehwald M, Aspuru-Guzik A. Learning Zero-Shot Material States Segmentation, by
Implanting Natural Image Patterns in Synthetic Data. arXiv preprint arXiv:2403.03309. 2024 Mar 5.
[2] Poliigon. High-quality photoreal textures, models, brushes, and HDRIs. Available:
https://www.poliigon.com/
[3] ambientCG. Free PBR materials and textures under the CC0 license. Available:
https://ambientcg.com/
[4] Poly Haven. High-quality 3D models and textures under a CC0 license. Available:
https://polyhaven.com/
[5] 3DXO. Free texture images for diverse creative projects. Available: https://www.3dxo.com
[6] Pixar One Twenty Eight. Textures for rendering in Pixar's Renderman. Available:
https://renderman.pixar.com/pixar-one-twenty-eight
[7] Texturer. High-resolution free textures for 3D artists and designers. Available: http://texturer.com/.
[8] Textures.com. Available: https://www.textures.com
[9] FreePBR. Free PBR texture repository. Available: https://freepbr.com/
[10] Giuseppe Vecchio and Valentin Deschaintre. Matsynth: A modern pbr materials dataset. arXiv
preprint arXiv:2401.06056, 2024.
[11] HumanMaterial Project. HumanMaterial: Physics and Neural-driven Full-body Human
Material Estimation. Available: https://humanmaterial.github.io/
[13] MaterialX Library. MaterialX Library. https://matlib.gpuopen.com/main/materials/all.
[14] Physically Based. The PBR values database. https://physicallybased.info.
[15] Stephan R Richter, Vibhav Vineet, Stefan Roth, and Vladlen Koltun. Playing for data: Ground truth
from computer games. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The
Netherlands, October 11-14, 2016, Proceedings, Part II 14, pages 102–118. Springer, 2016.
[16] Manuel Lagunas, Sandra Malpica, Ana Serrano, Elena Garces, Diego Gutierrez, and Belen
Masia. A similarity measure for material appearance. arXiv preprint arXiv:1905.01562, 2019.
[17] Eppel S, Xu H, Wang YR, Aspuru-Guzik A. Predicting 3D shapes, masks, and properties of
materials inside transparent containers, using the TransProteus CGI dataset. Digital Discovery.
2022;1(1):45-60.
[18] Sergey I Nikolenko. Synthetic data for deep learning, volume 174. Springer, 2021.
[19] Perroni-Scharf M, Sunkavalli K, Eisenmann J, Hold-Geoffroy Y. Material swapping for 3d scenes
using a learnt material similarity measure. In Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition 2022 (pp. 2034-2043).
[20] Jonathan Dyssel Stets, Rasmus Ahrenkiel Lyngby, Jeppe Revall Frisvad, and Anders Bjorholm
Dahl. Material-based segmentation of objects. In Image Analysis: 21st Scandinavian Conference, SCIA
2019, Norrköping, Sweden, June 11–13, 2019, Proceedings 21, pages 152–163. Springer,
2019.
[21] Drehwald MS, Eppel S, Li J, Hao H, Aspuru-Guzik A. One-shot recognition of any material
anywhere using contrastive learning with physics-based rendering. In Proceedings of the IEEE/CVF
International Conference on Computer Vision 2023 (pp. 23524-23533).
[22] Jian Song, Hongruixuan Chen, and Naoto Yokoya. Syntheworld: A large-scale synthetic dataset for
land cover mapping and building change detection. In Proceedings of the IEEE/CVF Winter Conference
on Applications of Computer Vision, pages 8287–8296, 2024.
[23] Li Z, Gan R, Luo C, Wang Y, Liu J, Zhang ZZ, Li Q, Yin X, Zhang Z, Peng J. MaterialSeg3D:
Segmenting Dense Materials from 2D Priors for 3D Assets. arXiv preprint arXiv:2404.13923. 2024 Apr
22.
[24] Sharma P, Philip J, Gharbi M, Freeman B, Durand F, Deschaintre V. Materialistic: Selecting similar
materials in images. ACM Transactions on Graphics (TOG). 2023 Aug 1;42(4):1-4.
[25] Clara Asmail. Bidirectional scattering distribution function (bsdf): a systematized bibliography.
Journal of research of the National Institute of Standards and Technology, 96(2):215, 1991.
[26] Matt Pharr, Wenzel Jakob, and Greg Humphreys. Physically based rendering: From theory to
implementation. Morgan Kaufmann, 2016.
[27] "FREE TOOL For Creating PBR Material Maps from Photos - Materialize!",YouTube.
https://www.youtube.com/watch?v=GItohAbyzCM
[28] "Creating a PBR Texture from Start to Finish - Substance Painter & Materialize" , YouTube.
https://www.youtube.com/watch?v=gF2bQd9nn6U&list=PLjwvP__WVRTRs052yCvrnQYMAmsgx0Pz
H
[29] "Creating Seamless Textures for Games - Substance Designer Tutorial" , YouTube.
https://www.youtube.com/watch?v=YJqWHsllczY&list=PLjwvP__WVRTRs052yCvrnQYMAmsgx0PzH
&index=3
[30] "Making Tileable Textures in Blender - A Quick Guide" , YouTube:
https://www.youtube.com/watch?v=rMKHv4wvB1g&list=PLjwvP__WVRTRs052yCvrnQYMAmsgx0Pz
H&index=7
[31] Munkberg, J., Hasselgren, J., Shen, T., Gao, J., Chen, W., Evans, A., Müller, T., & Fidler, S. (2022).
Extracting Triangular 3D Models, Materials, and Lighting From Images. In Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), 8280-8290.
[32] Lopes I, Pizzati F, de Charette R. Material palette: Extraction of materials from a single image. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp.
4379-4388).
[33] Toggle3D. Generate PBR Material with AI, 2023.
[34] Adobe Inc. Image To Material | Substance 3D Sampler, 2023.
[35] Zhengqin Li, Mohammad Shafiei, Ravi Ramamoorthi, Kalyan Sunkavalli, and Manmohan
Chandraker. Inverse rendering for complex indoor scenes: Shape, spatially-varying lighting and
svbrdf from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision
and Pattern Recognition, pages 2475–2484, 2020.
[36] Valentin Deschaintre, Miika Aittala, Fredo Durand, George Drettakis, and Adrien Bousseau.
Single-image svbrdf capture with a rendering-aware deep network. ACM Transactions on
Graphics (ToG), 37(4):1–15, 2018.
[37] Ta-Ying Cheng, Prafull Sharma, Andrew Markham, Niki Trigoni, and Varun Jampani. Zest:
Zero-shot material transfer from a single image. arXiv preprint arXiv:2404.06425, 2024.
[38] Ye Fang, Zeyi Sun, Tong Wu, Jiaqi Wang, Ziwei Liu, Gordon Wetzstein, and Dahua Lin.
Make-it-real: Unleashing large multimodal model’s ability for painting 3d objects with realistic
materials. arXiv preprint arXiv:2404.16829, 2024.
[39] Yiwei Hu, Paul Guerrero, Milos Hasan, Holly Rushmeier, and Valentin Deschaintre. Generating
procedural materials from text or image prompts. In ACM SIGGRAPH 2023 Conference Proceedings,
pages 1–11, 2023.
[40] Pandey V, Kalra T, Gubba M, Faisal M. Texture Extraction Methods Based Ensembling Framework
for Improved Classification. arXiv preprint arXiv:2206.04158. 2022 Jun 8.
[41] Lucas de Assis Soares, Klaus Fabian Côco, Patrick Marques Ciarelli, and Evandro Ottoni Teatini
Salles. A class-independent texture-separation method based on a pixel-wise binary classification.
Sensors, 20(18):5432, 2020.
[42] M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, and A. Vedaldi. Describing textures in the wild.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
Available: https://www.robots.ox.ac.uk/~vgg/data/dtd/
[43] B. Jiang, L. Zheng, L. Luo, Y. Tan, and S. Wang. Geo-Textured Outdoor Scenes: A New Benchmark
for Fine-Grained Recognition of Ground Terrain. Proceedings of the IEEE International Conference on
Computer Vision (ICCV), 2019. Available: https://github.com/BeichenJiang/gtos-mobile
[44] B. Caputo, E. Hayman, M. Fritz, and J.-O. Eklundh. Classifying materials in the real world. Image
and Vision Computing, vol. 28, no. 1, pp. 150-163, 2010. Available:
https://www.csc.kth.se/cvap/databases/kth-tips/
[45] S. Bell, P. Upchurch, N. Snavely, and K. Bala. Material recognition in the wild with the materials in
context database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), 2015. Available: http://opensurfaces.cs.cornell.edu/publications/minc/
[46] L. Sharan, R. Rosenholtz, E. H. Adelson, and D. L. Kulkarni. Material perception: What can you see
in a brief glimpse? Journal of Vision, vol. 14, no. 9, pp. 1-20, 2014. Available:
https://people.csail.mit.edu/celiu/CVPR2010/FMD/
[47] S. Lazebnik, C. Schmid, and J. Ponce. A sparse texture representation using local affine regions.
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1265-1278, 2005.
Available: http://www-cvr.ai.uiuc.edu/ponce_grp/data/
[48] K. Dana, B. Van Ginneken, S. Nayar, and J. Koenderink. Reflectance and texture of real-world
surfaces. ACM Transactions on Graphics, vol. 18, no. 1, pp. 1-34, 1999. Available:
http://www.cs.columbia.edu/CAVE/software/curet/
[49] G. Kon. Drexel texture database. Drexel University. Available:
http://www.cs.drexel.edu/~kon/texture/
[50] S. Bell, P. Upchurch, N. Snavely, and K. Bala. OpenSurfaces: A richly annotated
catalog of surface appearance. ACM Transactions on Graphics, vol. 32, no. 4, 2013. Available:
http://opensurfaces.cs.cornell.edu/
[51] P. Brodatz. Textures: A Photographic Album for Artists and Designers. New York: Dover
Publications, 1966. Available: http://multibandtexture.recherche.usherbrooke.ca/original_brodatz.html
[52] P. Brodatz. Textures: A Photographic Album for Artists and Designers. New York: Dover
Publications, 1966. Available: http://multibandtexture.recherche.usherbrooke.ca/original_brodatz.html
[53] Hideyuki Tamura, Shunji Mori, and Takashi Yamawaki. Textural features corresponding to
visual perception. IEEE Transactions on Systems, man, and cybernetics, 8(6):460–473, 1978.
[54] Patrick C Chen and Theodosios Pavlidis. Segmentation by texture using correlation. IEEE
Transactions on Pattern Analysis and Machine Intelligence, (1):64–69, 1983.
[55] Dalal, N., & Triggs, B. (2005). Histograms of Oriented Gradients for Human Detection. Proceedings
of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 1,
886-893.
[56] Jaehyun Park and Ludwik Kurz. Unsupervised segmentation of textured images. Information
sciences, 92(1-4):255–276, 1996.
[57] Anne Humeau-Heurtier. Texture feature extraction methods: A survey. IEEE access, 7:8975–9000,
2019.
[58] Lowe, D. G. (2004). Distinctive Image Features from Scale-Invariant Keypoints. International
Journal of Computer Vision, 60(2), 91-110.
[59] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla
Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are
few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
[60] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini
Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual
models from natural language supervision. In International Conference on Machine Learning, pages
8748–8763. PMLR, 2021.
[61] Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman,
Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. Laion-5b: An
open large-scale dataset for training next generation image-text models. arXiv preprint
arXiv:2210.08402, 2022.
[62] Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul
Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al. The many faces of robustness: A critical analysis of
out-of-distribution generalization. In Proceedings of the IEEE/CVF international conference on computer
vision, pages 8340–8349, 2021.
[63] Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson,
Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al. Segment anything. In
Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4015–4026,
2023.
[64] Kuznetsova A, Rom H, Alldrin N, Uijlings J, Krasin I, Pont-Tuset J, Kamali S, Popov S, Malloci M,
Kolesnikov A, Duerig T. The open images dataset v4: Unified image classification, object detection, and
visual relationship detection at scale. International journal of computer vision. 2020 Jul;128(7):1956-81.
[65] Open Images Dataset V7. Available at https://storage.googleapis.com/openimages/web/index.html.
[66] Dong J, Liu J, Yao K, Chantler M, Qi L, Yu H, Jian M. Survey of procedural methods for
two-dimensional texture generation. Sensors. 2020 Feb 19;20(4):1135.
[67] Franke K, Müller H. Procedural generation of 3D karst caves with speleothems. Computers &
Graphics. 2022 Feb 1;102:533-45.
[68] Alexander Raistrick, Lahav Lipson, Zeyu Ma, Lingjie Mei, Mingzhe Wang, Yiming Zuo,
Karhan Kayan, Hongyu Wen, Beining Han, Yihan Wang, et al. Infinite photorealistic worlds
using procedural generation. In Proceedings of the IEEE/CVF Conference on Computer Vision
and Pattern Recognition, pages 12630–12641, 2023.
Appendix
A7. Mixing PBR materials
Another way to increase the complexity of PBR materials is by mixing two or more materials.
Similar to past work, this is done by mixing the property maps of the two materials using a random
mixing ratio. For each property of the PBR material, we take the weighted average of the
corresponding maps of the two materials. The mixing ratio is determined randomly, either for each
property separately (a different value for each property) or as a single mixing value for all
properties. A minimal sketch is given below.
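The following sketch illustrates this mixing step, assuming both materials are stored as dictionaries of same-shaped property maps (as in the generation sketch of Section 4.3); the even split between per-property and shared ratios is an assumption.

```python
# Minimal sketch of PBR mixing (appendix A7). The 50/50 choice between a
# per-property ratio and a single shared ratio is an assumption.
import numpy as np

def mix_pbr(pbr_a, pbr_b, rng):
    """Weighted average of two materials' property maps, with the mixing
    ratio drawn either once for all properties or per property."""
    shared = rng.random()              # single mixing value for all properties
    per_property = rng.random() < 0.5  # or a different value per property
    mixed = {}
    for prop in pbr_a:
        t = rng.random() if per_property else shared
        mixed[prop] = t * pbr_a[prop] + (1.0 - t) * pbr_b[prop]
    return mixed
```

Note that a full implementation would renormalize the averaged normal map, since a weighted average of unit normals is not itself unit-length.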
A8. Seamless textures and tilability
Textures in the real world are rarely tileable (periodic); as a result, the textures extracted for
Vastextures are in most cases not tileable. This limits their usefulness in CGI applications that
demand seamless textures. There are plenty of automatic and AI-based methods for turning
textures into seamless ones, many of them built into free and commercial graphics tools. At this
point, we chose not to modify the textures to be seamless, mainly because this would mean
changing them in a formulaic way. Making textures tileable is relatively straightforward and can
easily be done with commonly available graphics tools.
A9. Image sources and texture sizes
Another limitation of Vastextures is the texture sizes, which are mostly between 240 and 1000
pixels wide. This is due to the need for large images in order to extract large textures: for the
image repositories we tested, finding textures larger than ¼ of the image size is rare, and larger
than ½ is very rare. The Segment Anything (SAM) repository contains images of around 2000
pixels per side and enables the extraction of textures of 500-1000 pixels, but has a more
restrictive license. The Open Images repository has a very permissive license (Apache) but
smaller images. Extracting textures of sizes >1000 pixels would demand a large repository of
very large images (>4000 pixels per dimension); we are not aware of any such large-scale
repository with a free license.
A10. Dataset source and license
The Vastextures repository is available for download at these URLs: 1,2,3,4
The code used to generate the repository is available under a CC0 license at 1,2,3,4.
The dataset license is CC0, but the license of each texture is derived from the license of the
source image (Open Images, Segment Anything).