Global Ecol Biogeogr. 2021;30:1740–1764. wileyonlinelibrary.com/journal/geb
Received: 22 December 2020 | Revised: 12 May 2021 | Accepted: 18 May 2021
DOI: 10.1111/geb.13346
DATA ARTICLE
sPlotOpen – An environmentally balanced, open-access, global dataset of vegetation plots
Francesco Maria Sabatini1,2 | Jonathan Lenoir3 | Tarek Hattab4 |
Elise Aimee Arnst5 | Milan Chytrý6 | Jürgen Dengler1,7,8 | Patrice De Ruffray9 |
Stephan M. Hennekens10 | Ute Jandt1,2 | Florian Jansen11 |
Borja Jiménez-Alfaro12 | Jens Kattge13 | Aurora Levesley14 | Valério D. Pillar15 |
Oliver Purschke16 | Brody Sandel17 | Fahmida Sultana18 | Tsipe Aavik19 |
Svetlana Aćić20 | Alicia T. R. Acosta21 | Emiliano Agrillo22 | Miguel Alvarez23 |
Iva Apostolova24 | Mohammed A. S. Arfin Khan25 | Luzmila Arroyo26 | Fabio Attorre27 |
Isabelle Aubin28 | Arindam Banerjee29 | Marijn Bauters30,31 |
Yves Bergeron32 | Erwin Bergmeier33 | Idoia Biurrun34 | Anne D. Bjorkman35,36 |
Gianmaria Bonari37 | Viktoria Bondareva38 | Jörg Brunet39 | Andraž Čarni40,41 |
Laura Casella42 | Luis Cayuela43 | Tomáš Černý44 | Victor Chepinoga45 |
János Csiky46 | Renata Ćušterevska47 | Els De Bie48 | André Luis de Gasper49 |
Michele De Sanctis27 | Panayotis Dimopoulos50 | Jiri Dolezal51 | Tetiana Dziuba52 |
Mohamed Abd El-Rouf Mousa El-Sheikh53,54 | Brian Enquist55 | Jörg Ewald56 |
Farideh Fazayeli57,58 | Richard Field59 | Manfred Finckh60 | Sophie Gachet61 |
Antonio Galán-de-Mera62,63,64 | Emmanuel Garbolino65 | Hamid Gholizadeh66 |
Melisa Giorgis67 | Valentin Golub68 | Inger Greve Alsos69 | John-Arvid Grytnes70 |
Gregory Richard Guerin71 | Alvaro G. Gutiérrez72 | Sylvia Haider1,2 |
Mohamed Z. Hatim73,74 | Bruno Hérault75,76,77 | Guillermo Hinojos Mendoza78 |
Norbert Hölzel79 | Jürgen Homeier80 | Wannes Hubau81,82 | Adrian Indreica83 |
John A. M. Janssen84 | Birgit Jedrzejek79 | Anke Jentsch85 | Norbert Jürgens60 |
Zygmunt Kącki86 | Jutta Kapfer87 | Dirk Nikolaus Karger88 | Ali Kavgacı89 |
Elizabeth Kearsley90 | Michael Kessler91 | Larisa Khanina92 | Timothy Killeen93 |
Andrey Korolyuk94 | Holger Kreft95 | Hjalmar S. Kühl1,96 | Anna Kuzemko97 |
Flavia Landucci6 | Attila Lengyel98 | Frederic Lens99,100 |
Débora Vanessa Lingner101 | Hongyan Liu102 | Tatiana Lysenko103,104,105 |
Miguel D. Mahecha1,106 | Corrado Marcenò6,34 | Vasiliy Martynenko107 |
Jesper Erenskjold Moeslund108 | Abel Monteagudo Mendoza109,110 |
Ladislav Mucina111,112 | Jonas V. Müller113 | Jérôme Munzinger114 |
Alireza Naqinezhad115 | Jalil Noroozi116 | Arkadiusz Nowak117,118 |
Viktor Onyshchenko119 | Gerhard E. Overbeck120 | Meelis Pärtel19 |
Aníbal Pauchard121,122 | Robert K. Peet123 | Josep Peñuelas124,125 |
Aaron Pérez-Haase126,127 | Tomáš Peterka6 | Petr Petřík128 | Gwendolyn Peyre129 |
Oliver L. Phillips14 | Vadim Prokhorov130 | Valerijus Rašomavičius131 |
Rasmus Revermann132,133 | Gonzalo Rivas-Torres134 | John S. Rodwell135 |
Eszter Ruprecht136 | Solvita Rūsiņa137 | Cyrus Samimi138 | Marco Schmidt139 |
Franziska Schrodt59 | Hanhuai Shan140 | Pavel Shirokikh107 | Jozef Šik141 |
Urban Šilc142 | Petr Sklenář143 | Željko Škvorc144 | Ben Sparrow145 |
Marta Gaia Sperandii21,146 | Zvjezdana Stančić147 | Jens-Christian Svenning148 |
Zhiyao Tang102 | Cindy Q. Tang149 | Ioannis Tsiripidis150 | Kim André Vanselow151 |
Rodolfo Vásquez Martínez109 | Kiril Vassilev24 | Eduardo Vélez-Martin152 |
Roberto Venanzoni153 | Alexander Christian Vibrans101 | Cyrille Violle154 |
Risto Virtanen1,155,156 | Henrik von Wehrden157 | Viktoria Wagner158 |
Donald A. Walker159 | Donald M. Waller160 | Hua-Feng Wang161 |
Karsten Wesche1,162,163 | Timothy J. S. Whitfeld164 | Wolfgang Willner116 |
Susan K. Wiser5 | Thomas Wohlgemuth165 | Sergey Yamalov166 |
Martin Zobel19 | Helge Bruelheide1,2
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
© 2021 The Authors. Global Ecology and Biogeography published by John Wiley & Sons Ltd.
Francesco Maria Sabatini and Jonathan Lenoir contributed equally to this work.
1German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
2Institute of Biology, Martin-Luther University Halle-Wittenberg, Halle, Germany
3UMR CNRS 7058 “Ecologie et Dynamique des Systèmes Anthropisés” (EDYSAN), Université de Picardie Jules Verne, Amiens, France
4MARBEC, Univ Montpellier, CNRS, IFREMER and IRD, Sète, France
5Manaaki Whenua - Landcare Research, Lincoln, New Zealand
6Department of Botany and Zoology, Masaryk University, Brno, Czech Republic
7Vegetation Ecology Group, Institute of Natural Resource Sciences (IUNR), Zurich University of Applied Sciences (ZHAW), Wädenswil, Switzerland
8Bayreuth Center of Ecology and Environmental Research (BayCEER), University of Bayreuth, Bayreuth, Germany
9Institut de biologie moléculaire des plantes-CNRS, Université de Strasbourg, Strasbourg, France
10Wageningen Environmental Research, Wageningen, the Netherlands
11Faculty of Agricultural and Environmental Sciences, University of Rostock, Rostock, Germany
12Research Unit of Biodiversity (CSIC/UO/PA), University of Oviedo, Mieres, Spain
13Max Planck Institute for Biogeochemistry, Jena, Germany
14School of Geography, University of Leeds, Leeds, UK
15Department of Ecology, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
16Institute for Medical Epidemiology, Biometrics and Informatics (IMEBI), Interdisciplinary Center for Health Sciences, Medical School of the Martin-Luther University Halle-Wittenberg, Halle/Saale, Germany
17Department of Biology, Santa Clara University, Santa Clara, California, USA
18Shahjalal University of Science & Technology, Sylhet, Bangladesh
19Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Estonia
20Faculty of Agriculture, Department of Botany, University of Belgrade, Belgrade-Zemun, Serbia
21Department of Sciences, Roma Tre University, Rome, Italy
22Biodiversity Conservation Department, ISPRA - Italian National Institute for Environmental Protection and Research, Rome, Italy
23University of Bonn, INRES, Bonn, Germany
24Department of Plant and Fungal Diversity and Resources, Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, Sofia, Bulgaria
25Shahjalal University of Science & Technology, Sylhet, Bangladesh
26Dirección de la Carrera de Biología, Universidad Autónoma Gabriel René Moreno, Santa Cruz de la Sierra, Bolivia
27Department of Environmental Biology, Sapienza University of Rome, Rome, Italy
28Natural Resources Canada, Great Lakes Forestry Centre, Canadian Forest Service, Sault Ste Marie, Ontario, Canada
29Department of Computer Science, University of Illinois Urbana Champaign, Urbana, Illinois, USA
30Department Green Chemistry and Technology, Isotope Bioscience Laboratory (UGent-ISOFYS), Ghent University, Ghent, Belgium
31Department Environment, Computational and Applied Vegetation Ecology (UGent-CAVELab), Ghent University, Ghent, Belgium
32Forest Research Institute, Université du Québec en Abitibi-Témiscamingue, Rouyn-Noranda, Quebec, Canada
33Vegetation Ecology and Phytodiversity, University of Göttingen, Göttingen, Germany
34Plant Biology and Ecology, University of the Basque Country UPV/EHU, Bilbao, Spain
35Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
36Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
37Free University of Bozen-Bolzano, Bolzano, Italy
38Laboratory of Phytodiversity Problem and of Phytocoenology, Institute of Ecology of the Volga River Basin, Toljatty, Russian Federation
39Southern Swedish Forest Research Centre, Swedish University of Agricultural Sciences, Alnarp, Sweden
40Institute of Biology, Research Center of the Slovenian Academy of Sciences and Arts, Ljubljana, Slovenia
41School for Viticulture and Enology, University of Nova Gorica, Nova Gorica, Slovenia
42Biodiversity Conservation Department, ISPRA - Italian National Institute for Environmental Protection and Research, Rome, Italy
43Department of Biology and Geology, Physics and Inorganic Chemistry, Universidad Rey Juan Carlos, Móstoles, Spain
44Department of Forest Ecology, Faculty of Forestry and Wood Sciences, Czech University of Life Sciences Prague, Praha 6 - Suchdol, Czech Republic
45Central Siberian Botanical Garden SB RAS, Novosibirsk, Russian Federation
46Department of Ecology, University of Pécs, Pécs, Hungary
47Institute of Biology, Faculty of Natural Sciences and Mathematics, Ss. Cyril and Methodius University, Skopje, Republic of North Macedonia
48Biotope Diversity, Research Institute for Nature and Forest (INBO), Brussels, Belgium
49Universidade Regional de Blumenau, Blumenau, Brazil
50Laboratory of Botany, Division of Plant Biology, Department of Biology, University of Patras, Patras, Greece
51Department of Functional Ecology, Institute of Botany, Czech Academy of Sciences, Trebon, Czech Republic
52M.G. Kholodny Institute of Botany, National Academy of Sciences of Ukraine, Geobotany and Ecology, Kyiv, Ukraine
53Botany and Microbiology Department, College of Science, King Saud University, Riyadh, Saudi Arabia
54Botany Department, Faculty of Science, Damanhour University, Damanhour, Egypt
55Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona, USA
56Hochschule Weihenstephan-Triesdorf, University of Applied Sciences, Freising, Germany
57Google LLC, Mountain View, California, USA
58University of Minnesota - Twin Cities, Minneapolis, Minnesota, USA
59School of Geography, University of Nottingham, Nottingham, UK
60Biodiversity, Ecology and Evolution of Plants, Institute for Plant Science & Microbiology, University of Hamburg, Hamburg, Germany
61CNRS, IRD, IMBE, Aix Marseille Univ, Avignon Université, Marseille, France
62Laboratorio de Botánica, Universidad CEU San Pablo, Madrid, Spain
63Laboratorio de Botánica, Universidad Privada Antonio Guillermo Urrelo, Cajamarca, Peru
64Herbario AQP, Estudios Fitogeográficos del Perú, Paucarpata, Arequipa, Peru
65Nova Sophia - Regus Nova, Climpact Data Science (CDS), CS, Sophia Antipolis Cedex, France
66Department of Plant Biology, University of Mazandaran, Babolsar, Iran
67Ecología Vegetal y Fitogeografía, Instituto Multidisciplinario de Biología Vegetal (IMBIV-CONICET), Córdoba, Argentina
68Laboratory of Phytocoenology, Samara Federal Research Center of the Russian Academy of Sciences, Institute of Ecology of the Volga river basin of the Russian Academy of Science, Toljatty, Russian Federation
69The Arctic University Museum of Norway, UiT - The Arctic University of Norway, Tromsø, Norway
70Department of Biological Sciences, University of Bergen, Bergen, Norway
71School of Biological Sciences, University of Adelaide, Adelaide, Australia
72Departamento de Ciencias Ambientales y Recursos Naturales Renovables, Facultad de Ciencias Agronomicas, Universidad de Chile, Santiago, Chile
73Plant Ecology and Nature Conservation Group - Environmental Sciences Department, Wageningen University, Wageningen, the Netherlands
74Botany and Microbiology Department - Faculty of Science, Tanta University, Tanta, Egypt
75CIRAD, UPR Forêts et Sociétés, Yamoussoukro, Ivory Coast
76University of Montpellier, CIRAD, Montpellier, France
77INP-HB, Institut National Polytechnique Félix Houphouët-Boigny, Yamoussoukro, Ivory Coast
  
78ASES Ecological and Sustainable Services, Aubenas, France
79Institute of Landscape Ecology, University of Münster, Münster, Germany
80Plant Ecology and Ecosystems Research, University of Göttingen, Göttingen, Germany
81Department Environment, Laboratory of Wood Biology (UGent-WoodLab), Ghent University, Ghent, Belgium
82Service of Wood Biology, Royal Museum for Central Africa, Tervuren, Belgium
83Department of Silviculture, Transilvania University of Brasov, Brasov, Romania
84Wageningen University and Research, Wageningen Environmental Research (Alterra), Wageningen, the Netherlands
85Disturbance Ecology, Bayreuth Center of Ecology and Environmental Research, University of Bayreuth, Bayreuth, Germany
86Botanical Garden, University of Wrocław, Wrocław, Poland
87Norwegian Institute of Bioeconomy Research, Tromsø, Norway
88Biodiversity and Conservation Biology, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
89Faculty of Forestry, Kilavuzlar Köyü Öte Karsi Üniversite Kampüsü Merkez, Karabuk University, Karabuk, Turkey
90Department Environment, Computational and Applied Vegetation Ecology (UGent-CAVELab), Ghent University, Ghent, Belgium
91Department of Systematic and Evolutionary Botany, University of Zurich, Zurich, Switzerland
92Branch of the M.V. Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Institute of Mathematical Problems of Biology of RAS, Pushchino, Russian Federation
93Museo de Historia Natural Noel Kempff Mercado, Universidad Autonoma Gabriel Rene Moreno, Santa Cruz de la Sierra, Bolivia
94Geosystem Laboratory, Central Siberian Botanical Garden, Siberian Branch, Russian Academy of Sciences, Novosibirsk, Russian Federation
95Department of Biodiversity, Macroecology and Biogeography, University of Göttingen, Göttingen, Germany
96Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
97Department of Geobotany and Ecology, M.G. Kholodny Institute of Botany of the National Academy of Sciences of Ukraine, Kyiv, Ukraine
98Centre for Ecological Research, Institute of Ecology and Botany, Vácrátót, Hungary
99Research Group Functional Traits, Naturalis Biodiversity Center, Leiden, the Netherlands
100Institute of Biology Leiden, Leiden University, Leiden, the Netherlands
101Departamento de Engenharia Florestal, Universidade Regional de Blumenau, Blumenau, Brazil
102College of Urban and Environmental Sciences, Peking University, Beijing, China
103Laboratory of Vegetation Science, Komarov Botanical Institute RAS, Saint-Petersburg, Russian Federation
104Laboratory of Phytodiversity Problems, Institute of Ecology of the Volga River Basin RAS - Branch of the Samara Scientific Center RAS, Togliatti, Russian Federation
105Group of Ecology of Living Organisms, Tobolsk complex scientific station of Ural Branch RAS, Tobolsk, Russian Federation
106Remote Sensing Centre for Earth System Research, University of Leipzig, Leipzig, Germany
107Ufa Institute of Biology, Ufa Federal Scientific Center of the Russian Academy of Sciences, Ufa, Russian Federation
108Department of Bioscience, Aarhus University, Roende, Denmark
109Jardín Botánico de Missouri Oxapampa, Oxapampa, Pasco, Peru
110Universidad Nacional de San Antonio Abad del Cusco, Cusco, Peru
111Murdoch University, Murdoch, Perth, Western Australia, Australia
112Department of Geography & Environmental Studies, Stellenbosch University, Stellenbosch, South Africa
113Conservation Science, Royal Botanic Gardens, Kew, Ardingly, UK
114CIRAD, CNRS, INRAE, AMAP, Université de Montpellier, Montpellier, France
115Department of Plant Biology, University of Mazandaran, Mazandaran, Iran
116Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
117Botanical Garden - Center for Biodiversity Conservation, Polish Academy of Sciences, Warsaw, Poland
118Institute of Biology, University of Opole, Opole, Poland
119National Academy of Sciences of Ukraine, M.G. Kholodny Institute of Botany, Kyiv, Ukraine
120Department of Botany, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
121Laboratorio de Invasiones Biológicas (LIB), Facultad de Ciencias Forestales, Universidad de Concepción, Concepción, Chile
122Institute of Ecology and Biodiversity (IEB), Santiago, Chile
123Department of Biology, University of North Carolina, Chapel Hill, North Carolina, USA
124Global Ecology Unit CSIC-CREAF-UAB, CSIC, Bellaterra, Catalonia, Spain
125CREAF, Cerdanyola del Valles, Spain
126Department of Biosciences, University of Vic-Central University of Catalonia, Barcelona, Spain
127Department of Evolutionary Biology, Ecology and Environmental Sciences, University of Barcelona, Barcelona, Spain
128Department of Vegetation Ecology, Institute of Botany, Czech Academy of Sciences, Průhonice, Czech Republic
129Department of Civil and Environmental Engineering, University of the Andes, Bogota, Colombia
130Institute of Environmental Sciences, Kazan Federal University, Kazan, Russian Federation
131Institute of Botany, Nature Research Centre, Vilnius, Lithuania
132Biodiversity, Ecology and Evolution of Plants/Institute for Plant Science & Microbiology, University of Hamburg, Hamburg, Germany
133Faculty of Natural Resources and Spatial Sciences, Namibia University of Science and Technology, Windhoek, Namibia
134Universidad San Francisco de Quito, COCIBA, Quito, Ecuador
135Independent Researcher, Lancaster, UK
136Hungarian Department of Biology and Ecology, Faculty of Biology and Geology, Babeș-Bolyai University, Cluj-Napoca, Romania
137Faculty of Geography and Earth Sciences, University of Latvia, Riga, Latvia
138Climatology, Bayreuth Center of Ecology and Environmental Research (BayCEER), University of Bayreuth, Bayreuth, Germany
139Palmengarten, Stadt Frankfurt am Main - Der Magistrat, Frankfurt am Main, Germany
140Microsoft, Redmond, Washington, USA
141Institute of Botany, Plant Science and Biodiversity Centre Slovak Academy of Sciences, Bratislava, Slovakia
142Institute of Biology, Research Centre of Slovenian Academy of Sciences and Arts (ZRC SAZU), Ljubljana, Slovenia
143Department of Botany, Charles University, Prague, Czech Republic
144Faculty of Forestry and Wood Technology, University of Zagreb, Zagreb, Croatia
145TERN, University of Adelaide, Adelaide, South Australia, Australia
146CSIC-UV-GV, Centro de Investigaciones sobre Desertificación, Moncada, Spain
147Faculty of Geotechnical Engineering, University of Zagreb, Varaždin, Croatia
148Department of Biology, Aarhus University, Aarhus C, Denmark
149School of Ecology and Environmental Science, Yunnan University, Chenggong New District, Kunming, China
150School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Greece
151Department of Geography, University of Erlangen-Nuremberg, Erlangen, Germany
152ILEX Consultoria Científica, Porto Alegre, Brazil
153Department of Chemistry, Biology and Biotechnology, University of Perugia, Perugia, Italy
154Univ Montpellier, CNRS, EPHE, IRD, Univ Paul Valéry Montpellier 3, CEFE, Montpellier, France
155Ecology and Genetics Research Unit, Biodiversity Unit, University of Oulu, Oulu, Finland
156Department of Physiological Diversity, Helmholtz Center for Environmental Research - UFZ, Leipzig, Germany
157Institute of Ecology, Leuphana University of Lüneburg, Lüneburg, Germany
158Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
159Institute of Arctic Biology, University of Alaska, Fairbanks, Alaska, USA
160Botany, University of Wisconsin-Madison, Madison, Wisconsin, USA
161Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, College of Tropical Crops, Hainan University, Haikou, China
162Botany Department, Senckenberg Museum of Natural History Görlitz, Görlitz, Germany
163International Institute Zittau, Technische Universität Dresden, Zittau, Germany
164Bell Museum, University of Minnesota, St. Paul, Minnesota, USA
165Forest Dynamics, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
166Laboratory of Wild-Growing Flora, South-Ural Botanical Garden-Institute, Ufa Scientific Centre, Russian Academy of Sciences, Ufa, Russian Federation
Correspondence
Francesco Maria Sabatini, German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Puschstraße 4, 04103, Leipzig, Germany.
Email: francesco.sabatini@botanik.uni-halle.de
  
Funding information
Agence Nationale de la Recherche, Grant/Award Number: ANR-07-BDIV-0006, ANR-07-BDIV-0008 and ANR-07-BDIV-0010; H2020 European Research Council, Grant/Award Number: ERC Advanced Grant 291585 “T-FORCES” and ERC-SyG-2013-610028 IMBALANCE-P; Villum Fonden, Grant/Award Number: 16549; Deutsche Forschungsgemeinschaft, Grant/Award Number: DFG FZT 118, 202548816, DFG Ho3296-2, DFG Ho3296-4, DFG VA 749/1-1, DFG VA 749/4-1 and DFG WE 2601/3-1 3-2 4-1 4-2; Narodowe Centrum Nauki, Grant/Award Number: 2017/25/B/NZ8/00572; Latvia grant, Grant/Award Number: AAP2016/B041//Zd2016/AZ03; NSF, Grant/Award Number: DEB-0415383; Horizon 2020 Framework Programme, Grant/Award Number: 640176; U.S. National Science Foundation, Grant/Award Number: DBI-0213794 and DBI-9905838; Grantová Agentura České Republiky, Grant/Award Number: 19-28491X; German Centre for Integrative Biodiversity Research, Grant/Award Number: 50170649_#7; Fundación BBVA, Grant/Award Number: BIOCON08_044; Akademie Věd České Republiky, Grant/Award Number: RVO 67985939; Spanish Research Agency, Grant/Award Number: AEI/10.13039/501100011033; National Research, Development and Innovation Office, Hungary, Grant/Award Number: PD-12399; Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung, Grant/Award Number: 20BD21_184131; Basque Government, Grant/Award Number: 640176; Russian Foundation for Basic Research, Grant/Award Number: 16-04-00747a; Brazil’s National Council of Scientific and Technological Development, Grant/Award Number: 307689/2014-0 and 310022/2015-0; Volkswagen Foundation, Grant/Award Number: AZ I/81 976
Handling Editor: Amanda Bates
Abstract
Motivation: Assessing biodiversity status and trends in plant communities is critical for understanding, quantifying and predicting the effects of global change on ecosystems. Vegetation plots record the occurrence or abundance of all plant species co-occurring within delimited local areas. This allows species absences to be inferred, information seldom provided by existing global plant datasets. Although many vegetation plots have been recorded, most are not available to the global research community. A recent initiative, called ‘sPlot’, compiled the first global vegetation plot database, and continues to grow and curate it. The sPlot database, however, is extremely unbalanced spatially and environmentally, and is not open-access. Here, we address both these issues by (a) resampling the vegetation plots using several environmental variables as sampling strata and (b) securing permission from data holders of 105 local-to-regional datasets to openly release data. We thus present sPlotOpen, the largest open-access dataset of vegetation plots ever released. sPlotOpen can be used to explore global diversity at the plant community level, as ground truth data in remote sensing applications, or as a baseline for biodiversity monitoring.
Main types of variable contained: Vegetation plots (n = 95,104) recording cover or abundance of naturally co-occurring vascular plant species within delimited areas. sPlotOpen contains three partially overlapping resampled datasets (c. 50,000 plots each), to be used as replicates in global analyses. Besides geographical location, date, plot size, biome, elevation, slope, aspect, vegetation type, naturalness, coverage of various vegetation layers, and source dataset, plot-level data also include community-weighted means and variances of 18 plant functional traits from the TRY Plant Trait Database.
Spatial location and grain: Global, 0.01–40,000 m².
Time period and grain: 1888–2015, recording dates.
Major taxa and level of measurement: 42,677 vascular plant taxa, plot-level records.
Software format: Three main matrices (.csv), relationally linked.
KEYWORDS
big data, biodiversity, biogeography, database, functional traits, macroecology, vascular plants, vegetation plots
1 | BACKGROUND & SUMMARY
Biodiversity is facing a global crisis. As many as 1 million species are currently threatened with extinction, the vast majority due to anthropogenic impacts such as land-use and climate change (IPBES, 2019; WWF, 2020). In addition, the rates of biodiversity homogenization and redistribution are accelerating (Fricke & Svenning, 2020; Lenoir et al., 2020; Staude et al., 2020). Biological assemblages are becoming progressively more similar to each other globally, as local and endemic species go extinct and are replaced by more widespread and competitive native or alien species (IPBES, 2019; Staude et al., 2020). Many terrestrial and marine species are also shifting their geographical distribution as a response to climate change (Lenoir et al., 2020). This has profound potential impacts on ecosystems and human health (Bonebrake et al., 2018; Pecl et al., 2017).
Plant communities are no exception to this biodiversity crisis (Cardinale et al., 2011; Lenoir et al., 2008; Staude et al., 2020). This is particularly worrying since terrestrial vegetation accounts for 80% (450 Gt C) of the living biomass on Earth (Bar-On et al., 2018). Given the central role of vegetation in ecosystem productivity, structure, stability and functioning (Cardinale et al., 2011), assessing biodiversity status and trends in plant communities is paramount for other kingdoms of life and human societies alike.
Monitoring trends in plant biodiversity requires adequate data across a range of spatio-temporal scales (Kühl et al., 2020; Pimm, 2021). Large independent collections of plant occurrence data do exist at the global or continental extent via the Botanical Information and Ecology Network (BIEN; Enquist et al., 2016), the Global Inventory of Floras and Traits (GIFT; Weigelt et al., 2020) or the Global Biodiversity Information Facility (GBIF; https://www.gbif.org/). However, these databases suffer from one or several of the following limitations: (a) imbalance towards tree species only; (b) lack of data on how individual plant species co-occur and interact locally to form plant communities; and (c) coarse spatial resolutions (e.g., one-degree grid cells), which preclude intersection with high resolution remote sensing data and the assessment of biodiversity trends at the plant community level (Boakes et al., 2010).
There is a long tradition among botanists and phytosociologists of recording the cover or abundance of each plant species that occurs in a vegetation plot (here used as a synonym of ‘relevé’ or ‘quadrat’) of a given size (i.e., surface area) at a given time (e.g., Stebler & Schröter, 1892). Compared to presence-only data, vegetation-plot data present many advantages. As all visible plant species are recorded, plots contain information on which plant species do, and do not, co-occur in the same locality at a given moment in time (Chytrý et al., 2016). This is important for testing hypotheses related to biotic interactions among plant species. Vegetation-plot data also provide crucial information on where and when a species was absent, thereby improving predictions from current species distribution models (Phillips et al., 2009). Being spatially explicit, vegetation plots can be resurveyed through time to assess potential changes in plant species composition relative to a baseline (Perring et al., 2018; Staude et al., 2020; Steinbauer et al., 2018). As they normally contain information on the relative cover or abundance of each species, vegetation plots are also more appropriate for detecting biodiversity changes than data representing only the occurrence of individual species (Beck et al., 2018; Jandt et al., 2011).
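The way a complete species list allows absences to be inferred can be illustrated with a minimal Python sketch. It assumes a long-format table of (plot, species, cover) records; the plot identifiers, species and cover values below are hypothetical and are not taken from sPlotOpen.

```python
from collections import defaultdict


def plot_by_species_matrix(records):
    """Turn long-format (plot_id, species, cover) records into a plot x species table.

    A species that was not recorded in a plot gets cover 0.0, which is read
    here as an inferred absence (assumption: every plot was fully surveyed).
    """
    plots = sorted({plot for plot, _, _ in records})
    species = sorted({sp for _, sp, _ in records})
    cover = defaultdict(float)
    for plot, sp, value in records:
        cover[(plot, sp)] += value  # sum duplicate entries, e.g. several layers
    return {p: {s: cover.get((p, s), 0.0) for s in species} for p in plots}


# Hypothetical records, for illustration only.
records = [
    ("P1", "Fagus sylvatica", 60.0),
    ("P1", "Anemone nemorosa", 15.0),
    ("P2", "Anemone nemorosa", 5.0),
    ("P2", "Quercus robur", 70.0),
]

matrix = plot_by_species_matrix(records)
absent_in_p2 = [sp for sp, c in matrix["P2"].items() if c == 0.0]
print(absent_in_p2)
```

The zero entries are only meaningful as absences when the plot records all species present, so this reading should not be applied to incompletely sampled plots.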
Globally, however, vegetation-plot data are very fragmented, as they typically stem from a myriad of local research and survey projects (Bruelheide et al., 2019). These are fine-grained data (e.g., 1–10,000 m²) normally covering small spatial extents (e.g., 1–1,000 km²). With their disparate sampling protocols, standards and taxonomic resolutions, aggregating and harmonizing vegetation plot data proves extremely challenging (Bruelheide et al., 2018). It is not surprising, therefore, that these data are rarely used in global-scale research on the biodiversity of plant communities (Aubin et al., 2020; Franklin et al., 2017; Wiser, 2016).
The sPlot initiative tries to close this data gap. It consolidates numerous local to regional vegetation-plot datasets to create a harmonized and comprehensive global database of georeferenced terrestrial plant species assemblages (Bruelheide et al., 2019). Established in 2013, sPlot v3.0 currently contains more than 1.9 million vegetation plots, and is fully integrated with the TRY database (Kattge et al., 2020), from which it derives information on plant functional traits. The sPlot database is increasingly being used to study continental-to-global scale vegetation patterns (Cai et al., 2021; Testolin, Attorre, et al., 2021; Testolin, Carmona, et al., 2021), such as the relative contribution of regional versus local factors to the global patterns of fern richness (Weigand et al., 2020), the mechanisms underlying the spread and abundance of native versus invasive tree species (van der Sande et al., 2020), and worldwide trait–environment relationships in plant communities (Bruelheide et al., 2018).
Yet, most of these data are not open- access. Here, we secured
permission from data holders in the sPlot database to openly release
a dataset composed of 95,104 vegetation plots. We selected the
plots to be released using a replicated environmental stratification,
in order to represent the entire environmental space covered by the
sPlot database. This maximizes the benefits of releasing these data
for a wide range of potential uses. The selected vegetation plots
stem from 105 databases and span 114 countries (Figure 1). This
resampled dataset (sPlotOpen hereafter) is composed of: (a) plot-
level information, including metadata and basic vegetation structure
descriptors; (b) the vascular plant species composition of each vege-
tation plot, including species cover or abundance information when
available; and (c) community-level functional information obtained
by intersection with the TRY database (Kattge et al., 2020).
sPlotOpen is specifically designed for global macroecological
studies, for example, the exploration of functional diversity patterns
of communities with continental-to-global extent. We expect, however,
that sPlotOpen might likewise prove useful to answer a range
of different questions, related for instance to species co-occurrence
patterns, the definition of species pools, the link between regional
versus local determinants of species diversity, or the niche overlap
between co-occurring species. Yet, data in sPlotOpen should not be
considered as representative of the distribution of plant communi-
ties worldwide, especially when working at local spatial extents. This
should be kept in mind for applications such as species distribution
models (SDMs) or joint SDMs, whose results might be affected by
the uneven geographical distribution of sPlotOpen's data. We refer
the reader to the section ‘Usage notes’ for additional guidance on
critical issues related, for instance, to incompletely sampled vegeta-
tion plots, varying plot size, and nested vegetation plots.
2 | METHODS
2.1 | Vegetation plot data sources
We started from the sPlot database v2.1 (created in October 2016),
which contains 1,121,244 unique vegetation plots and 23,586,216
species records. Most of the data in sPlot refer to natural and semi-
natural vegetation, while vegetation shaped by intensive and re-
peated human interference, such as cropland or ruderal communities,
is hardly represented. Data originate from 110 different vegetation-
plot datasets of regional, national or continental extent, some of
which stem from regional or continental initiatives (see Bruelheide
et al., 2019, for more information). For instance: 48 vegetation-
plot datasets derive from the European Vegetation Archive (EVA;
Chytrý et al., 2016); three major African datasets derive from the
Tropical African Vegetation Archive (TAVA); and multiple vegetation
datasets in the USA and Australia derive from the VegBank (Peet,
Lee, Boyle, et al., 2012; Peet, Lee, Jennings, et al., 2012) and TERN’s
AEKOS (Chabbi & Loescher, 2017) archives, respectively. Data from
other continents (South America, Asia) or countries were contributed
as separate standalone datasets. The metadata of each individual
vegetation-plot dataset stored in sPlot are managed through
the Global Index of Vegetation-Plot Databases (GIVD; Dengler
et al., 2011), using the GIVD code as the unique dataset identifier.
2.2 | Resampling method
Data in the sPlot database are unevenly distributed across veg-
etation types and geographical regions (see Bruelheide et al., 2018).
Mid-latitude regions in developed countries (mostly Europe, the
USA and Australia) are overrepresented in sPlot, while regions in the
tropics and subtropics are underrepresented, which is a typical geo-
graphical bias in biodiversity data (see Lenoir et al., 2020; Lenoir &
Svenning, 2015 for similar geographical bias in species redistribution).
Such a geographical bias usually translates into an environmental bias,
with temperate climates typically better represented than tropical or
Mediterranean climates. Unbalanced sampling effort in the environmental
space is of particular concern for comparative macroecological
studies (Bruelheide et al., 2018; Lenoir et al., 2010). To reduce this
imbalance as much as possible, we performed a stratified resampling
approach within the environmental space using several environmental
variables available at global extent as sampling strata.
FIGURE 1 Top: global distribution of all vegetation plots contained in sPlotOpen (n = 95,104). Each colour represents a different
source dataset (n = 105 – different datasets might have the same colour). Bottom: spatial distribution of vegetation plot density for the
environmentally balanced dataset selected by the first resampling iteration (n = 49,787). Densities are calculated in hexagonal cells with a
spatial resolution of approximately 70,000 km². Map projection is Eckert IV [Colour figure can be viewed at wileyonlinelibrary.com]
First, we removed vegetation plots without geographical coor-
dinates or with a location uncertainty higher than 3 km. We also
removed vegetation plots identified by the respective data con-
tributors as having been recorded in wetlands or in anthropogenic
vegetation types, since these data were available only for a few
geographical regions, mostly in Europe. This left 799,400 of the
initial 1,121,244 vegetation plots.
We then ran a global principal component analysis (PCA) on a
matrix of all terrestrial grid cells at a spatial resolution of 2.5 arcmin
(n = 8,384,404), based on 30 climatic and soil variables. For climate,
we used the 19 bioclimatic variables from CHELSA (Climatologies
at high resolution for the earth's land surface areas) v1.2 (Karger
et al., 2017), as well as two other bioclimatic variables reflecting the
growing-season length (growing degree days above 1 °C, GDD1,
and 5 °C, GDD5), which were derived from CHELSA’s monthly
temperatures as in Synes and Osborne (2011). In addition, we considered
an index of aridity and a layer for potential evapotranspiration
from the Consortium of Spatial Information (CGIAR-CSI, Trabucco
& Zomer, 2010). For soil, we extracted seven variables from the
SoilGrids database (Hengl et al., 2017), namely: (a) soil organic carbon
content in the fine earth fraction; (b) cation exchange capacity; (c)
pH; as well as the fractions of (d) coarse fragments; (e) sand; (f) silt;
and (g) clay. The results of this PCA represent the full environmen-
tal space of all terrestrial habitats on Earth, irrespective of whether
a grid cell hosted vegetation plots or not (Supporting Information
Figure S1). We then subdivided the PCA ordination space, represented
by the first two principal components (PC1–PC2), which
accounted for 47 and 23%, respectively, of the total environmental
variation in terrestrial grid cells, into a regular 100 × 100 grid. This
PC1–PC2 two-dimensional space was subsequently used to balance
our sampling effort across all PC1–PC2 grid cells for which vegetation
plots were available. After excluding 42,878 vegetation plots for
which no PC1 or PC2 values were available, due to missing data in
the bioclimatic or soil variables, we projected the remaining 756,522
vegetation plots onto this PC1–PC2 grid. We finally calculated how
many vegetation plots occurred in each PC1–PC2 grid cell (Figure 2).
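The gridding step can be illustrated with a minimal Python sketch. It assumes that PC1 and PC2 scores have already been computed for every plot; the axis ranges, the example scores and the names `grid_cell` and `plots_per_cell` are hypothetical and are not taken from the sPlot workflow.

```python
from collections import Counter

def grid_cell(pc1, pc2, bounds, n_bins=100):
    """Return the (column, row) index of the grid cell containing a PC1-PC2 point."""
    (min1, max1), (min2, max2) = bounds

    def bin_index(value, lo, hi):
        idx = int((value - lo) / (hi - lo) * n_bins)
        # Clamp so that a point on the upper edge falls into the last bin.
        return min(max(idx, 0), n_bins - 1)

    return bin_index(pc1, min1, max1), bin_index(pc2, min2, max2)

# Hypothetical axis ranges and PC1-PC2 scores for five plots.
bounds = ((-10.0, 10.0), (-5.0, 5.0))
plot_scores = [(-9.95, -4.97), (-9.90, -4.95), (0.0, 0.0), (0.05, 0.02), (10.0, 5.0)]

# Number of plots falling into each occupied cell of the 100 x 100 grid.
plots_per_cell = Counter(grid_cell(pc1, pc2, bounds) for pc1, pc2 in plot_scores)
```

Cells holding many plots are then candidates for resampling, whereas sparsely occupied cells are kept whole.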
In total, vegetation plots were available for 1,720 out of the
4,125 PC1–PC2 grid cells covered by the 8,384,404 terrestrial grid
cells of the geographical space. We then resampled those PC1–PC2
grid cells (n = 858) with more than 50 vegetation plots, which is
FIGURE 2 Distribution of vegetation plots from sPlotOpen in the global environmental space based on a principal component analysis
(PCA) using 30 climate and soil variables. Top: spatial distribution of PCA values across all terrestrial grid cells (n = 8,384,404, spatial grain
= 2.5 arcmin). Bottom left: distribution of plots compared to the distribution of all terrestrial 2.5 arc-minute cells (grey background) in the
PCA space. Only the plots in the environmentally balanced dataset selected in the first resampling iteration are shown (n = 49,787). The PCA
space was divided into a 100 × 100 regular grid. The first and second PCA axes explained 47 and 23% of the total variance, respectively.
Bottom right: geographical distribution of the vegetation plots contained in four randomly selected PCA grid cells [Colour figure can be
viewed at wileyonlinelibrary.com]
the median number of plots occurring across occupied grid cells in
sPlot. This threshold of 50 vegetation plots represents a compromise
between selecting a high number of plots, and keeping the
resampled dataset as balanced as possible across the PC1–PC2
environmental space. To select these 50 vegetation plots we used the
heterogeneity-constrained random resampling algorithm (Lengyel
et al., 2011). This algorithm quantifies the variability in plant species
composition among a set of vegetation plots by computing the mean
and the variance of the Jaccard’s dissimilarity index (Jaccard, 1912)
between all possible pairs of vegetation plots. More precisely, for a
given PC1–PC2 grid cell containing more than 50 vegetation plots,
we generated 1,000 random selections of 50 vegetation plots and
ranked each selection according to the mean (ascending order) and
variance (descending order) value of the Jaccard’s dissimilarity index.
Ranks from both sortings were summed for each random selection,
and the selection with the highest summed rank, that is, the one
combining a high mean dissimilarity with a low variance, was considered to
provide the most balanced/even representation of vegetation types
within the focal grid cell. Where a grid cell contained 50 or fewer
plots, we retained all of them. In this way, we reduced the imbalance
towards over-sampled climate types while ensuring that the resampled
dataset represents the entire environmental gradient covered
by the original sPlot database. This approach optimizes the selection
of a subset of vegetation plots that encompasses the highest
variability in species composition while avoiding peculiar and rare
communities, which may represent outliers. As such, our approach
maximizes variability over representativeness within each grid cell.
We repeated the whole resampling procedure three times to get
three different environmentally balanced, resampled subsets of our
vegetation plots. These three resampling iterations can therefore be
used as separate replicates, although they are not completely independent,
as the same plots might have been drawn in two or even three
of the three resampling iterations. In addition, those plots located
in PC1–PC2 grid cells with 50 or fewer vegetation plots are completely
shared by all three iterations.
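A minimal Python sketch of the selection within a single grid cell is given below. It is an illustration, not the implementation of Lengyel et al. (2011). It assumes the stated goal of combining a high mean Jaccard dissimilarity with a low variance: ranks are assigned in ascending order of the mean and in descending order of the variance, and the trial with the largest rank sum is kept. The function names, the toy species lists and the use of the population variance are assumptions made for this example.

```python
import random
from itertools import combinations
from statistics import mean, pvariance

def jaccard_dissimilarity(a, b):
    """1 - |A & B| / |A | B| for two sets of species names."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def hcr_select(plots, k, n_trials=1000, seed=0):
    """Select k plot IDs from `plots` (dict: plot ID -> set of species)."""
    ids = sorted(plots)
    if len(ids) <= k:
        return ids  # sparsely occupied cells are kept whole
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        subset = rng.sample(ids, k)
        d = [jaccard_dissimilarity(plots[p], plots[q])
             for p, q in combinations(subset, 2)]
        trials.append((subset, mean(d), pvariance(d)))
    # Rank 1 = lowest mean dissimilarity; rank 1 = highest variance.
    by_mean = sorted(range(n_trials), key=lambda t: trials[t][1])
    by_var = sorted(range(n_trials), key=lambda t: -trials[t][2])
    score = [0] * n_trials
    for rank, t in enumerate(by_mean, start=1):
        score[t] += rank
    for rank, t in enumerate(by_var, start=1):
        score[t] += rank
    best = max(range(n_trials), key=score.__getitem__)
    return sorted(trials[best][0])

# Hypothetical grid cell with six plots; three are selected.
toy_cell = {
    "p1": {"Fagus", "Quercus", "Acer"},
    "p2": {"Fagus", "Quercus"},
    "p3": {"Pinus", "Betula"},
    "p4": {"Pinus", "Picea", "Betula"},
    "p5": {"Carex", "Festuca"},
    "p6": {"Carex", "Poa", "Festuca"},
}
selected = hcr_select(toy_cell, k=3, n_trials=200, seed=1)
```

In the procedure described above, k is 50 and 1,000 trials are drawn per cell; the toy values only keep the example small.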
2.3 | Permission to release the data as open access
The resampling procedure resulted in 56,486, 56,501 and 56,494
vegetation plots selected during resampling iterations #1, #2 and #3,
respectively, for a total of 107,238 unique vegetation plots. Since
the sPlot database is a consortium of independent datasets whose
copyright belongs to the data contributors, we used this preliminary
potential selection to ask each dataset’s custodian (i.e., either the
owner of a dataset or its authorized representative in the case of
a collective dataset) for permission to release the data of selected
vegetation plots as open access. For 12,134 unique vegetation plots,
permission could not be granted because, for instance, the data
are unpublished, confidential or sensitive. The number of vegetation
plots for which the open-access permission was not granted
in resampling iterations #1, #2 and #3 was 6,699, 6,690 and 6,705,
respectively.
To mitigate the imbalance due to the exclusion of these confi-
dential plots, we created a ‘consensus’ dataset. We started from
resampling iteration #1, and replaced the 6,699 plots not granted as
open access with plots selected in the second and third iterations,
for which such permission could be granted (‘reserve’ plots, here-
after). We imposed the constraint that each candidate vegetation
plot in the reserve pool should belong to the same environmental
stratum, that is, the same PC1–PC2 grid cell, as the confidential
vegetation plot, even though we acknowledge that this procedure
does not maximize the variability in plant species composition of
the replacement plots. Even after drawing from reserves, there
were 3,150 plots that could not be replaced. These were distributed
across 279 PC1–PC2 grid cells (16.2% of occupied cells), each
cell having on average 11 irreplaceable plots (min. = 1, median = 5,
max. = 50).
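The replacement logic can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the text does not say how a reserve plot is chosen among several candidates in the same grid cell, so the sketch simply takes the first available one. All identifiers and the toy data are hypothetical.

```python
def build_consensus(iteration1, reserves, open_access, cell_of):
    """Replace confidential plots of iteration #1 with reserves from the same cell.

    iteration1  -- plot IDs selected in resampling iteration #1
    reserves    -- plot IDs selected in iterations #2 and #3
    open_access -- set of plot IDs for which permission was granted
    cell_of     -- dict mapping each plot ID to its PC1-PC2 grid cell
    """
    consensus = [p for p in iteration1 if p in open_access]
    used = set(consensus)
    # Pool of open-access reserve plots per grid cell, without duplicates.
    pool = {}
    for p in dict.fromkeys(reserves):
        if p in open_access and p not in used:
            pool.setdefault(cell_of[p], []).append(p)
    unreplaced = []
    for p in iteration1:
        if p in open_access:
            continue
        candidates = pool.get(cell_of[p], [])
        if candidates:
            consensus.append(candidates.pop(0))  # assumption: take the first candidate
        else:
            unreplaced.append(p)
    return consensus, unreplaced

# Hypothetical example: plots "b" and "d" are confidential.
cell_of = {"a": (1, 1), "b": (1, 1), "c": (2, 3), "d": (4, 4), "e": (1, 1), "f": (2, 3)}
consensus, unreplaced = build_consensus(
    iteration1=["a", "b", "c", "d"],
    reserves=["e", "a", "f", "e"],
    open_access={"a", "c", "e", "f"},
    cell_of=cell_of,
)
```

Here plot "b" is replaced by the reserve "e" from the same grid cell, while "d" has no open-access reserve in its cell and stays unreplaced.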
2.4 | Trait information
For each vegetation plot for which open access could be granted,
we computed the community-weighted mean and variance for 18
plant functional traits derived from the TRY database v3.0 (Kattge
et al., 2020). These traits were selected among those that describe
the leaf, wood, and seed economics spectra (Reich, 2014;
Westoby, 1998), and are known to either affect different key ecosystem
processes or respond to macroclimatic drivers, or both
(Bruelheide et al., 2018). The 18 plant functional traits (all concentrations
based on dry weight) were: (a) leaf area (mm²); (b) stem specific
density (g/cm³); (c) specific leaf area (m²/kg); (d) leaf carbon
concentration (mg/g); (e) leaf nitrogen concentration (mg/g); (f) leaf
phosphorus concentration (mg/g); (g) plant height (m); (h) seed mass
(mg); (i) seed length (mm); (j) leaf dry matter content (g/g); (k) leaf
nitrogen per area (g/m²); (l) leaf N:P ratio (g/g); (m) leaf δ¹⁵N (per million);
(n) seed number per reproductive unit; (o) leaf fresh mass (g); (p)
stem conduit density (per mm²); (q) dispersal unit length (mm); and (r)
conduit element length (μm).
Because missing values were particularly widespread in the
species-trait matrix, we calculated community-weighted means
using the gap-filled version of these traits we received from TRY
(Kattge et al., 2020). Gap-filling was performed at the level of individual
observations and relied on hierarchical Bayesian modelling (R
package ‘BHPMF’ – Fazayeli et al., 2014; Schrodt et al., 2015) in R
(R Core Team, 2020). This is a Bayesian machine learning approach,
with no a priori assumptions, except for the data being missing completely
at random. The algorithm ‘learns’ from the data, that is, if
there was a phylogenetic signal in the data, this was used to fill the
gaps but where no such signal was apparent, none was introduced.
After gap-filling, we natural-log-transformed all gap-filled trait values
and averaged each trait by taxon (i.e., at species or
genus level). The gap-filling approach was run only for species having
at least one trait observation (n = 21,854). Additional information on
the gap-filling procedure is available in Bruelheide et al. (2019).
Community-weighted means (CWM) and variances (CWV) were
calculated for every plant functional trait j and every vegetation plot
k as follows (Enquist et al., 2015):
where $n_k$ is the number of species with trait information in vegetation
plot k, $p_{i,k}$ is the relative abundance of species i in vegetation plot k,
calculated as the species’ fraction in cover or abundance of total cover or
abundance, and $t_{i,j}$ is the mean value of species i for trait j.
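A small worked example in Python may help to make these definitions concrete. The CWM is the abundance-weighted sum of the species trait means, and the CWV is the abundance-weighted sum of squared deviations from that mean. The sketch assumes that relative abundances are rescaled over the species that have trait information, so that the weights sum to 1; the text leaves this detail open, and the function name and toy values are hypothetical.

```python
def cwm_cwv(abundances, trait_values):
    """Community-weighted mean and variance of one trait in one plot.

    abundances   -- dict: species -> cover or abundance in the plot
    trait_values -- dict: species -> mean trait value (e.g., log-transformed)
    """
    # Only species with trait information contribute (n_k in the text).
    shared = [s for s in abundances if s in trait_values]
    total = sum(abundances[s] for s in shared)
    if total == 0:
        return None, None
    # Assumption: weights are rescaled over species with trait data.
    p = {s: abundances[s] / total for s in shared}
    cwm = sum(p[s] * trait_values[s] for s in shared)
    cwv = sum(p[s] * (trait_values[s] - cwm) ** 2 for s in shared)
    return cwm, cwv

# Hypothetical plot: species "C" has no trait information and is skipped.
cwm, cwv = cwm_cwv({"A": 60, "B": 30, "C": 10}, {"A": 1.0, "B": 2.0})
```

With weights of 2/3 and 1/3, the example gives a CWM of 4/3 and a CWV of 2/9.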
3 | DATA RECORDS
sPlotOpen contains 95,104 unique vegetation plots from 105 con-
stitutive datasets (Table 1) and from 114 countries covering all con-
tinents except Antarctica (Figure 1). This is the result of pooling
together the three environmentally balanced datasets from resam-
pling iterations #1, #2 and #3 containing 49,787, 49,811 and 49,789
plots, respectively, after excluding the set of plots for which open
access could not be granted by data contributors. The number of
plots shared across all three resampling iterations is 19,672, while
14,939 plots are shared between two iterations. Replacing confi-
dential plots in resampling iteration #1 with reserves from the other
two iterations in the same PC1–PC2 grid cell resulted in a consensus
version containing 53,262 plots. sPlotOpen only contains the spe-
cies composition of vascular plants; information on the composition
of bryophytes and lichens was discarded since it was only available
for a minority of plots (n = 11,001 and n = 6,801, respectively).
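The plot counts reported in the text can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers given in this paper (plots selected and not granted per iteration, and the overlaps among iterations); the variable names are ours.

```python
# Plots selected per resampling iteration and plots without open-access permission.
selected = {1: 56486, 2: 56501, 3: 56494}
not_granted = {1: 6699, 2: 6690, 3: 6705}
released = {i: selected[i] - not_granted[i] for i in selected}

# Overlap among the released iterations, as reported in the text.
in_three, in_two = 19672, 14939
# Each plot is counted once per iteration it belongs to.
in_one = sum(released.values()) - 3 * in_three - 2 * in_two
unique_plots = in_three + in_two + in_one

# Alternative route: unique selected plots minus unique plots not granted.
unique_from_permissions = 107238 - 12134
```

Both routes give 95,104 unique plots, which matches the size of sPlotOpen.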
Information on the size (surface area) of the vegetation survey is
available for 67,022 plots, and ranges between 0.03 and 40,000 m²
(mean = 377 m²; median = 100 m²). Specifically, sPlotOpen contains
12,894 plots with size smaller than 10 m², 25,742 with size 10–
100 m², 24,750 plots with size 100–1,000 m² and 3,075 plots with
size greater than or equal to 1,000 m². Similarly, only for a minority of
plots (n = 24,167) is information on the exact group of plants sampled
in the field available (e.g., complete vegetation, only trees, only
trees > 1 m height, and so on). However, as most data were collected
using the phytosociological method, we deem it safe to assume that,
unless otherwise specified, plots contain information on all vascular
plants. We retained plots with incomplete vegetation, because they
were mostly located in the tropics, that is, in areas where vegetation
plots are particularly scarce otherwise. The number of vascular
plant species per vegetation plot ranges between 1 (i.e., monospecific
stands) and 271 (mean = 20; median = 16).
By capping the number of vegetation plots in overrepresented
environmental conditions, the resampling procedure described
above strongly reduced the bias in the distribution of vegetation
plots within the PC1–PC2 environmental space. Yet, due to the lack
or scarcity of data from some geographical regions, like the tropics,
there is some remaining imbalance in the spatial distribution of veg-
etation plots across geographical regions (Figure 1). This is evident
when comparing the number of plots across continents. When con-
sidering the first resampling iteration only (n = 49,787), Europe is
by far the best represented continent, with 15,920 vegetation plots.
The least represented continents are Africa and South America, with
3,709 and 5,498 vegetation plots, respectively. Some residual imbalance
also remains when considering biomes (Figure 3). With the
exception of the ‘Temperate mid-latitudes’ biome, which includes
14,100 vegetation plots, all other biomes contain between
1,558 (‘Polar and subpolar zone’) and 6,245
(‘Subtropics with year-round rain’) vegetation plots (Figure 3, left).
Despite this residual imbalance, all the Whittaker biomes are cov-
ered by sPlotOpen (Figure 3, right), and our resampling algorithm has
resulted in a much more balanced dataset than many other global
datasets that are available, such as GBIF.
About 40% of the 95,104 vegetation plots in sPlotOpen
belong to forests (n = 38,282) and almost one half to non-forest vegetation
(n = 45,735), with 11.7% of plots remaining unassigned (n = 11,087).
When not directly done by data providers, the assignment of plots
to forests and non-forests was based on multiple lines of evidence,
including the plot-level information on the cover of the tree layer, as
well as traits of species composing a plot, such as growth form and
height. In short, a plot record was considered as forest if the cover
of the tree layer, or alternatively, the sum of the (relative) cover of
all tree taxa (scaled by the sum of all cover values, as a percentage),
was greater than 25%. It was considered a non-forest record if the
sum of relative cover of low-stature, non-tree and non-shrub taxa
was greater than 90%. For an extensive explanation of this classification
scheme, we refer the reader to Bruelheide et al. (2019). Even
though the proportion of forest versus non-forest vegetation plots is
relatively well balanced, the geographical distribution of vegetation
plots belonging to different vegetation types is likely not balanced
in the geographical space, as it depends on the idiosyncrasies of the
constitutive datasets composing the sPlot database. For instance,
the data from New Zealand only include plots collected in non-forest
ecosystems, while data from Chile only refer to forests. We urge potential
users to carefully read the section ‘Usage notes’ below and the
description of each individual dataset in GIVD (Dengler et al., 2011),
and to contact the custodians of each dataset for further information.
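The two thresholds described above can be written as a simple decision rule. The Python sketch below is a simplified illustration only: the full scheme, which draws on several lines of evidence, is described in Bruelheide et al. (2019). The function and argument names are hypothetical, and the order in which the two conditions are checked is an assumption.

```python
def classify_plot(tree_layer_cover=None, rel_cover_trees=None, rel_cover_low=None):
    """Return 'forest', 'non-forest' or None (unassigned).

    All inputs are percentages; None means the information is missing.
    tree_layer_cover -- recorded cover of the tree layer
    rel_cover_trees  -- summed relative cover of all tree taxa
    rel_cover_low    -- summed relative cover of low-stature, non-tree, non-shrub taxa
    """
    # Use the recorded tree-layer cover if present, otherwise the relative cover of trees.
    tree_cover = tree_layer_cover if tree_layer_cover is not None else rel_cover_trees
    if tree_cover is not None and tree_cover > 25:
        return "forest"
    if rel_cover_low is not None and rel_cover_low > 90:
        return "non-forest"
    return None

# Hypothetical plots.
examples = [
    classify_plot(tree_layer_cover=60),
    classify_plot(rel_cover_trees=10, rel_cover_low=95),
    classify_plot(rel_cover_trees=20, rel_cover_low=50),
]
```

Plots that meet neither threshold stay unassigned, which is consistent with the share of unassigned plots reported above.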
4 | DATABASE ORGANIZATION
The environmentally balanced and open-access dataset sPlotOpen
is organized into three main matrices, relationally linked through the
key column ‘PlotObservationID’.
The ‘header’ matrix contains plot-level information for the
95,104 vegetation plots, including: metadata (e.g., plot ID, data
source, sampling date, geographical location, positional accuracy);
sampling design information (e.g., the total surface area used during
the vegetation survey); and a plot-level description of vegetation
$$\mathrm{CWM}_{j,k} = \sum_{i}^{n_k} p_{i,k}\, t_{i,j} \tag{1}$$

$$\mathrm{CWV}_{j,k} = \sum_{i}^{n_k} p_{i,k} \left(t_{i,j} - \mathrm{CWM}_{j,k}\right)^{2} \tag{2}$$
TABLE 1 List of datasets contributing to sPlotOpen, the environmentally balanced and open-access database of vegetation plots

| GIVD ID | Dataset name | Custodian | Deputy custodian | No. open-access plots | Reference |
|---|---|---|---|---|---|
| 00-00-001 | ForestPlots.net | Oliver L. Phillips | Aurora Levesley | 169 | Lopez-Gonzalez et al. (2011) |
| 00-00-003 | SALVIAS | Brian Enquist | Brad Boyle | 3,403 | |
| 00-00-004 | Vegetation Database of Eurasian Tundra | Risto Virtanen | | 519 | |
| 00-00-005 | Tundra Vegetation Plots (TundraPlot) | Anne D. Bjorkman | Sarah Elmendorf | 309 | Elmendorf et al. (2012) |
| 00-RU-001 | Vegetation Database Forest of Southern Ural | Vasiliy Martynenko | Pavel Shirokikh | 68 | |
| 00-RU-002 | Database of Masaryk University’s Vegetation Research in Siberia | Milan Chytrý | | 158 | Chytrý (2012) |
| 00-RU-003 | Database Meadows and Steppes of Southern Ural | Sergey Yamalov | Mariya Lebedeva | 238 | |
| 00-TR-001 | Forest Vegetation Database of Turkey FVDT | Ali Kavgacı | | 45 | |
| AF-00-001 | West African Vegetation Database | Marco Schmidt | Georg Zizka | 258 | Schmidt et al. (2012) |
| AF-00-003 | BIOTA Southern Africa Biodiversity Observatories Vegetation Database | Norbert Jürgens | Ute Schmiedel | 1,015 | Muche et al. (2012) |
| AF-00-006 | SWEA-Dataveg | Miguel Alvarez | Michael Curran | 1,675 | Alvarez et al. (2021) |
| AF-00-008 | PANAF Vegetation Database | Hjalmar S. Kühl | Tene Kwetche Sop | 884 | |
| AF-00-009 | Vegetation Database of the Okavango Basin | Rasmus Revermann | Manfred Finckh | 378 | Revermann et al. (2016) |
| AF-BF-001 | Sahel Vegetation Database | Jonas V. Müller | Marco Schmidt | 556 | Müller (2003) |
| AF-CD-001 | Forest Database of Central Congo Basin | Kim Sarah Jacobsen | Hans Verbeeck | 140 | Kearsley et al. (2013) |
| AF-ET-001 | Vegetation Database of Ethiopia | Desalegn Wana | Anke Jentsch | 67 | Wana & Beierkuhnlein (2011) |
| AF-MA-001 | Vegetation Database of Southern Morocco | Manfred Finckh | | 621 | Finckh (2012) |
| AF-ZW-001 | Vegetation Database of Zimbabwe | Cyrus Samimi | | 31 | Samimi (2003) |
| AS-00-001 | Korean Forest Database | Tomáš Černý | Jiri Dolezal | 1,039 | Černý et al. (2015) |
| AS-00-003 | Vegetation of Middle Asia | Arkadiusz Nowak | Marcin Nobis | 314 | Nowak et al. (2017) |
| AS-00-004 | Rice Field Vegetation Database | Arkadiusz Nowak | | 32 | |
| AS-BD-001 | Tropical Forest Dataset of Bangladesh | Mohammed A. S. Arfin Khan | Fahmida Sultana | 87 | |
| AS-CN-001 | China Forest-Steppe Ecotone Database | Hongyan Liu | Fengjun Zhao | 117 | Liu et al. (2000) |
| AS-CN-002 | Tibet-PaDeMoS Grazing Transect | Karsten Wesche | Yun Jäschke | 58 | Wang et al. (2017) |
| AS-CN-003 | Vegetation Database of the BEF China Project | Helge Bruelheide | | 24 | Bruelheide et al. (2011) |
| AS-CN-004 | Vegetation Database of the Northern Mountains in China | Zhiyao Tang | | 124 | |
| AS-EG-001 | Vegetation Database of Sinai in Egypt | Mohamed Z. Hatim | | 143 | Hatim (2012) |
| AS-ID-001 | Sulawesi Vegetation Database | Michael Kessler | | 24 | |
| AS-IR-001 | Vegetation Database of Iran | Jalil Noroozi | Parastoo Mahdavi | 277 | |
| AS-KZ-001 | Database of Meadow Vegetation in the NW Tien Shan Mountains | Viktoria Wagner | | 13 | Wagner (2009) |
| AS-MN-001 | Southern Gobi Protected Areas Database | Henrik von Wehrden | Karsten Wesche | 1,032 | von Wehrden et al. (2009) |
| AS-RU-001 | Wetland Vegetation Database of Baikal Siberia (WETBS) | Victor Chepinoga | | 9 | Chepinoga (2012) |

(Continues)
TABLE 1 (Continued)

| GIVD ID | Dataset name | Custodian | Deputy custodian | No. open-access plots | Reference |
|---|---|---|---|---|---|
| AS-RU-002 | Database of Siberian Vegetation (DSV) | Andrey Korolyuk | Andrei Zverev | 3,634 | Korolyuk & Zverev (2012) |
| AS-RU-004 | Database of the University of Münster Biodiversity and Ecosystem Research Group's Vegetation Research in Western Siberia and Kazakhstan | Norbert Hölzel | Wanja Mathar | 207 | |
| AS-SA-001 | Vegetation Database of Saudi Arabia | Mohamed Abd El-Rouf Mousa El-Sheikh | | 711 | El-Sheikh et al. (2017) |
| AS-TJ-001 | Eastern Pamirs | Kim André Vanselow | | 221 | Vanselow (2016) |
| AS-TW-001 | National Vegetation Database of Taiwan | Ching-Feng Li | Chang-Fu Hsieh | 912 | |
| AS-YE-001 | Socotra Vegetation Database | Michele De Sanctis | Fabio Attorre | 236 | De Sanctis & Attorre (2012) |
| AU-AU-002 | AEKOS | Ben Sparrow | | 10,976 | Chabbi & Loescher (2017) |
| AU-NC-001 | New Caledonian Plant Inventory and Permanent Plot Network (NC-PIPPN) | Jérôme Munzinger | Philippe Birnbaum | 98 | Ibanez et al. (2014) |
| AU-NZ-001 | New Zealand National Vegetation Databank | Susan K. Wiser | | 1,127 | Wiser et al. (2001) |
| AU-PG-001 | Forest Plots from Papua New Guinea | Timothy J. S. Whitfeld | George D. Weiblen | 60 | Whitfeld et al. (2014) |
| EU-00-002 | Nordic-Baltic Grassland Vegetation Database (NBGVD) | Jürgen Dengler | Łukasz Kozub | 54 | Dengler & Rūsiņa (2012) |
| EU-00-011 | Vegetation-Plot Database of the University of the Basque Country (BIOVEG) | Idoia Biurrun | Itziar García-Mijangos | 2,142 | Biurrun et al. (2012) |
| EU-00-013 | Balkan Dry Grasslands Database | Kiril Vassilev | Armin Macanović | 269 | Vassilev et al. (2012) |
| EU-00-016 | Mediterranean Ammophiletea Database | Corrado Marcenò | Borja Jiménez-Alfaro | 783 | Marcenò & Jiménez-Alfaro (2017) |
| EU-00-017 | European Coastal Vegetation Database | John A. M. Janssen | | 356 | |
| EU-00-018 | The Nordic Vegetation Database | Jonathan Lenoir | Jens-Christian Svenning | 1,735 | Lenoir et al. (2013) |
| EU-00-019 | Balkan Vegetation Database | Kiril Vassilev | Hristo Pedashenko | 484 | Vassilev et al. (2016) |
| EU-00-020 | WetVegEurope | Flavia Landucci | | 127 | Landucci et al. (2015) |
| EU-00-022 | European Mire Vegetation Database | Tomáš Peterka | Martin Jiroušek | 2,560 | Peterka et al. (2015) |
| EU-AL-001 | Vegetation Database of Albania | Michele De Sanctis | Giuliano Fanelli | 31 | De Sanctis et al. (2017) |
| EU-AT-001 | Austrian Vegetation Database | Wolfgang Willner | Christian Berg | 2,310 | Willner et al. (2012) |
| EU-BE-002 | INBOVEG | Els De Bie | | 119 | |
| EU-BG-001 | Bulgarian Vegetation Database | Iva Apostolova | Desislava Sopotlieva | 160 | Apostolova et al. (2012) |
| EU-CH-005 | Swiss Forest Vegetation Database | Thomas Wohlgemuth | | 2,134 | Wohlgemuth (2012) |
| EU-CZ-001 | Czech National Phytosociological Database | Milan Chytrý | Ilona Knollová | 1,287 | Chytrý & Rafajová (2003) |
| EU-DE-001 | VegMV | Florian Jansen | Christian Berg | 15 | Jansen et al. (2012) |
| EU-DE-013 | VegetWeb Germany | Florian Jansen | Jörg Ewald | 587 | Ewald et al. (2012) |
| EU-DE-014 | German Vegetation Reference Database (GVRD) | Ute Jandt | Helge Bruelheide | 762 | Jandt & Bruelheide (2012) |

(Continues)
TABLE 1 (Continued)

| GIVD ID | Dataset name | Custodian | Deputy custodian | No. open-access plots | Reference |
|---|---|---|---|---|---|
| EU-DK-002 | National Vegetation Database of Denmark | Jesper Erenskjold Moeslund | Rasmus Ejrnæs | 332 | |
| EU-ES-001 | Iberian and Macaronesian Vegetation Information System (SIVIM) – Wetlands | Aaron Pérez-Haase | Xavier Font | 580 | |
| EU-FR-003 | SOPHY | Emmanuel Garbolino | Patrice De Ruffray | 7,986 | Garbolino et al. (2012) |
| EU-GB-001 | UK National Vegetation Classification Database | John S. Rodwell | | 3,182 | |
| EU-GR-001 | KRITI | Erwin Bergmeier | | 22 | |
| EU-GR-005 | Hellenic Natura 2000 Vegetation Database (HelNatVeg) | Panayotis Dimopoulos | Ioannis Tsiripidis | 620 | Dimopoulos & Tsiripidis (2012) |
| EU-GR-006 | Hellenic Woodland Database | Ioannis Tsiripidis | Georgios Fotiadis | 17 | Fotiadis et al. (2012) |
| EU-HR-001 | Phytosociological Database of Non-Forest Vegetation in Croatia | Zvjezdana Stančić | | 193 | Stančić (2012) |
| EU-HR-002 | Croatian Vegetation Database | Željko Škvorc | Daniel Krstonošić | 585 | |
| EU-HU-003 | CoenoDat Hungarian Phytosociological Database | János Csiky | Zoltán Botta-Dukát | 46 | Lájer et al. (2008) |
| EU-IT-001 | VegItaly | Roberto Venanzoni | Flavia Landucci | 754 | Landucci et al. (2012) |
| EU-IT-010 | Vegetation database of Habitats in the Italian Alps – HabItAlp | Laura Casella | Pierangela Angelini | 247 | Casella et al. (2012) |
| EU-IT-011 | Vegetation-Plot Database Sapienza University of Rome (VPD-Sapienza) | Emiliano Agrillo | Fabio Attorre | 967 | Agrillo et al. (2017) |
| EU-LT-001 | Lithuanian Vegetation Database | Valerijus Rašomavičius | Domas Uogintas | 81 | |
| EU-LV-001 | Semi-natural Grassland Vegetation Database of Latvia | Solvita Rūsiņa | | 369 | Rūsiņa (2012) |
| EU-MK-001 | Vegetation Database of the Republic of Macedonia | Renata Ćušterevska | | 28 | |
| EU-NL-001 | Dutch National Vegetation Database | Stephan M. Hennekens | Joop H. J. Schaminée | 1,098 | Schaminée et al. (2006) |
| EU-PL-001 | Polish Vegetation Database | Zygmunt Kącki | Grzegorz Swacha | 692 | Kącki & Śliwiński (2012) |
| EU-RO-007 | Romanian Forest Database | Adrian Indreica | Pavel Dan Turtureanu | 166 | Indreica et al. (2017) |
| EU-RO-008 | Romanian Grassland Database | Eszter Ruprecht | Kiril Vassilev | 82 | Vassilev et al. (2018) |
| EU-RS-002 | Vegetation Database Grassland Vegetation of Serbia | Svetlana Aćić | Zora Dajić Stevanović | 217 | Aćić et al. (2012) |
| EU-RU-002 | Lower Volga Valley Phytosociological Database | Valentin Golub | Andrey Chuvashov | 383 | Golub et al. (2012) |
| EU-RU-003 | Vegetation Database of the Volga and the Ural Rivers Basins | Tatiana Lysenko | | 174 | Lysenko et al. (2012) |
| EU-RU-011 | Vegetation Database of Tatarstan | Vadim Prokhorov | Maria Kozhevnikova | 206 | Prokhorov et al. (2017) |
| EU-SI-001 | Vegetation Database of Slovenia | Urban Šilc | Filip Küzmič | 1,029 | Šilc (2012) |
| EU-SK-001 | Slovak Vegetation Database | Milan Valachovič | Jozef Šibík | 2,394 | Šibík (2012) |
| EU-UA-001 | Ukrainian Grasslands Database | Anna Kuzemko | Yulia Vashenyak | 301 | Kuzemko (2012) |
| EU-UA-006 | Vegetation Database of Ukraine and Adjacent Parts of Russia | Viktor Onyshchenko | Vitaliy Kolomiychuk | 96 | |
| NA-00-002 | Tree Biodiversity Network (BIOTREE-NET) | Luis Cayuela | | 241 | Cayuela et al. (2012) |

(Continues)
structure (e.g., percentage cover of each vegetation
layer), vegetation type, and naturalness level (i.e., whether
a plot belongs to the same formation that would occupy the site
without human interference). Plots in Europe are also classified according
to the European Nature Information System (EUNIS) habitat
classification (column ‘ESY’), based on the habitat classification expert
system (ESY, Chytrý et al., 2020). For each vegetation plot, we
further provide information on the dataset it originates from, based
on the IDs used in GIVD (Dengler et al., 2011). We also report four
binary fields describing whether a plot belongs to each of the three
resampling iterations (columns ‘Resample_1’, ‘Resample_2’, ‘Resample_3’),
or to the first resampling iteration after the inclusion of replacement
plots (column ‘Resample_1_consensus’). A brief summary of all the 47
variables in the header matrix is provided in Table 2.
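As a minimal illustration of how the four binary fields can be used, the Python sketch below selects the plots of one replicate from a few made-up header rows. The column names follow the text; representing the rows as dictionaries and encoding the flags as Python booleans are assumptions, since the encoding in the released files is not described here.

```python
# Hypothetical header rows; only the key column and the four flags are shown.
header = [
    {"PlotObservationID": 1, "Resample_1": True, "Resample_2": True,
     "Resample_3": True, "Resample_1_consensus": True},
    {"PlotObservationID": 2, "Resample_1": False, "Resample_2": True,
     "Resample_3": False, "Resample_1_consensus": True},
    {"PlotObservationID": 3, "Resample_1": False, "Resample_2": False,
     "Resample_3": True, "Resample_1_consensus": False},
]

def plots_in(header_rows, column):
    """Return the IDs of the plots flagged in the given binary column."""
    return {row["PlotObservationID"] for row in header_rows if row[column]}

replicate_1 = plots_in(header, "Resample_1")
consensus = plots_in(header, "Resample_1_consensus")
# Plots shared by all three resampling iterations.
shared_by_all = (plots_in(header, "Resample_1")
                 & plots_in(header, "Resample_2")
                 & plots_in(header, "Resample_3"))
```

The resulting sets of ‘PlotObservationID’ values can then be used to subset the other matrices, which share the same key column.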
The ‘DT’ matrix contains data on the species composition of each
plot. It is structured in a long format and contains 1,945,384 records
from 42,680 vascular plant taxa, mostly resolved at the species level.
For each record, we report both the taxon name as originally
contributed by the data custodian (column ‘Original_species’), and the taxon
name after taxonomic standardization (column ‘Species’). For details on
the taxonomic standardization, please see section ‘Technical
validation’ below. For each species we also provide cover/abundance
values. These follow different standards across the datasets constituting
the sPlot database. We, therefore, provide the cover/abundance
value as reported in the original data (column ‘Original_abundance’),
together with the abundance scale that was originally used (column
‘Abundance_scale’). This can take seven values: ‘CoverPerc’ = percentage
cover; ‘pa’ = presence-absence; ‘x_BA’ = basal area (m²/ha, only
for woody species); ‘x_IC’ = individual count, that is, number of
individuals in plot; ‘x_SC’ = stem count, that is, number of stems in plot;
‘x_IV’ = importance value index; and ‘x_PF’ = presence frequency.
The great majority of entries, however, use the percentage cover scale
(n = 1,709,000). Finally, for each entry, we calculated a ‘Relative_cover’,
that is, the cover/abundance of a given taxon divided by the total
cover/abundance of all taxa in that vegetation plot.
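The definition of ‘Relative_cover’ given above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the code used to build sPlotOpen: the records are toy long-format tuples of plot, taxon and abundance, and the handling of plots whose total abundance is zero (returning None) is our own choice.

```python
from collections import defaultdict


def relative_cover(records):
    """Divide each taxon's cover/abundance by the plot total.

    records: iterable of (plot_id, taxon, abundance) tuples in long format.
    Returns a list of (plot_id, taxon, relative_cover) tuples.
    """
    records = list(records)
    totals = defaultdict(float)
    for plot_id, _taxon, abundance in records:
        totals[plot_id] += abundance
    result = []
    for plot_id, taxon, abundance in records:
        total = totals[plot_id]
        # Assumption: a plot with zero total abundance gets None, not an error.
        result.append((plot_id, taxon, abundance / total if total > 0 else None))
    return result


# Toy records: two plots with percentage cover values (invented for illustration).
toy = [
    ("A", "Fagus sylvatica", 60.0),
    ("A", "Poa annua", 20.0),
    ("A", "Carex flacca", 20.0),
    ("B", "Poa annua", 5.0),
    ("B", "Carex flacca", 15.0),
]
rel = relative_cover(toy)
for row in rel:
    print(row)
```

By construction, the relative cover values of each plot sum to one, which makes plots recorded on different abundance scales comparable within a plot, though not necessarily across scales.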
The ‘CWM_CWV’ matrix contains the community-weighted
means and variances calculated for each of the 18 functional traits
mentioned above. It also contains three additional columns. The
| GIVD ID | Dataset name | Custodian | Deputy custodian | No. open-access plots | Reference |
|---|---|---|---|---|---|
| NA-CA-003 | Database of Timberline Vegetation in NW North America | Viktoria Wagner | Toby Spribille | 63 | Wagner et al. (2014) |
| NA-CA-004 | Understory of Sugar Maple Dominated Stands in Quebec and Ontario (Canada) | Isabelle Aubin | | 13 | Aubin et al. (2007) |
| NA-CA-005 | Boreal Forest of Canada | Philippe Marchand | Yves Bergeron | 57 | Harper et al. (2003) |
| NA-GL-001 | Vegetation Database of Greenland | Birgit Jedrzejek | Fred J. A. Daniëls | 441 | Sieg et al. (2006) |
| NA-US-002 | VegBank | Robert K. Peet | Michael T. Lee | 14,965 | Peet, Lee, Jennings, et al. (2012) |
| NA-US-006 | Carolina Vegetation Survey Database | Robert K. Peet | Michael T. Lee | 3,263 | Peet, Lee, Boyle, et al. (2012) |
| NA-US-014 | Alaska-Arctic Vegetation Archive | Donald A. Walker | Amy Breen | 771 | Walker et al. (2016) |
| SA-00-002 | VegPáramo | Gwendolyn Peyre | Xavier Font | 2,010 | Peyre et al. (2015) |
| SA-AR-002 | Vegetation Database of Central Argentina | Melisa Giorgis | Alicia T. R. Acosta | 86 | |
| SA-BO-003 | Bolivia Forest Plots | Michael Kessler | Sebastian Herzog | 44 | |
| SA-BR-002 | Forest Inventory, State of Santa Catarina, Brazil (IFFSC Project) | Alexander Christian Vibrans | André Luís de Gasper | 1,561 | Vibrans et al. (2020) |
| SA-BR-003 | Grasslands of Rio Grande do Sul, Brazil | Eduardo Vélez-Martin | Valério D. Pillar | 306 | |
| SA-BR-004 | Grassland Database of Campos Sulinos | Gerhard E. Overbeck | Valério D. Pillar | 147 | |
| SA-CL-002 | SSAForests_Plots_db | Alvaro G. Gutiérrez | | 155 | |
| SA-CL-003 | Chilean Park Transects – Fondecyt 1040528 | Aníbal Pauchard | Alicia Marticorena | 44 | Pauchard et al. (2013) |
| SA-EC-001 | Ecuador Forest Plot Database | Jürgen Homeier | | 166 | |
Note: Datasets are ordered based on their ID in the Global Index of Vegetation Databases (GIVD ID).
TABLE 1 (Continued)
column ‘Species_richness’ shows the number of species recorded in
each plot. The columns ‘Trait_coverage_cover’ and ‘Trait_coverage_pa’
provide, respectively, the proportion of total cover and the proportion
of species in a plot for which functional trait information was
available. In total, functional trait information was available for 21,854
species. As functional trait information was based on gap-filled data
(see above), each of these 21,854 species had information for all the
18 functional traits. The average proportion of species in each plot for
which functional trait information was available is .85 (median = .95).
For 42,012 plots, the coverage was complete, while we do not have
functional trait information for any of the species occurring in 482
plots. When considering relative cover, the average trait coverage is
.87, with 74,151 plots having functional trait information for species
cumulatively accounting for more than 80% of relative cover. When
considering the number of species, 68,041 plots have functional trait
information for 80% or more of the species occurring in that plot.
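The two coverage measures described above can be illustrated with a short Python sketch. It is a minimal example under our own assumptions rather than the original workflow: a plot is represented as a list of (taxon, relative cover) pairs, the set of taxa with trait information is invented, and an empty plot is assigned a coverage of zero.

```python
def trait_coverage(plot_records, taxa_with_traits):
    """Return (coverage by cover, coverage by species) for one plot.

    plot_records: list of (taxon, relative_cover) pairs for a single plot.
    taxa_with_traits: set of taxa for which trait information is available.
    """
    if not plot_records:
        # Assumption: an empty plot is given zero coverage on both measures.
        return 0.0, 0.0
    covered = [(t, rc) for t, rc in plot_records if t in taxa_with_traits]
    coverage_cover = sum(rc for _t, rc in covered)   # cf. 'Trait_coverage_cover'
    coverage_pa = len(covered) / len(plot_records)   # cf. 'Trait_coverage_pa'
    return coverage_cover, coverage_pa


# Toy plot: three taxa, one of them without trait information.
plot = [("Fagus sylvatica", 0.5), ("Poa annua", 0.3), ("Unidentified sp.", 0.2)]
known = {"Fagus sylvatica", "Poa annua"}
cover_cov, pa_cov = trait_coverage(plot, known)
print(round(cover_cov, 3), round(pa_cov, 3))
```

A user could apply a threshold to either measure, for instance retaining only plots whose coverage by cover is at least 0.8, before analysing community-weighted means.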
sPlotOpen contains two additional objects. The ‘metadata’
matrix contains plot-level metadata, which provide information
on the origin of each individual vegetation plot. This object
contains 15 columns, with information on the dataset of origin
(column ‘GIVD_ID’; Dengler et al., 2011), author or surveyor names
(columns ‘Releve_author’ and ‘Releve_coauthor’), and bibliographic
references both at the dataset (column ‘DB_BIBTEXKEY’) and plot level
(‘Plot_Biblioreference’ and ‘BIBTEXKEY’), when available. Similarly,
the column ‘Project_name’ provides information on the project in
which a vegetation plot was originally recorded. When available,
we also provide information on the numbering of the plots in the
publication where they originally appeared (columns
‘Nr_table_in_publ’, ‘Nr_releve_in_table’), or in the dataset where they were
initially stored (‘Original_nr_in_database’). In the case of nested plots
(n = 1,851), we also provide the original plot and subplot IDs
(columns ‘Original_plotID’, ‘Original_subplotID’). The last two columns
report plot-level ‘Remarks’, and the unique identifier produced
by Turboveg when the vegetation plot was first stored (‘GUID’).
Turboveg is a program specifically designed to store, maintain and
export vegetation plot data (https://www.synbiosys.alterra.nl/turboveg;
Hennekens & Schaminée, 2001).
Finally, the object ‘references’ contains all the bibliographic
references formatted according to the BibTeX standard. Each reference
is tagged with a key corresponding to the fields ‘DB_BIBTEXKEY’
and ‘BIBTEXKEY’ in the metadata. We further provide an R function
(‘sPlotOpen_citation’) to create reference lists, based on a selection of
plots and/or datasets.
Except for the ‘references’ file (format .bib), all objects/matrices
are provided in tab-delimited .txt files. All objects, including the
‘sPlotOpen_citation’ function, are also compiled inside a .RData object.
5 | TECHNICAL VALIDATION
The original sPlot database has a nested structure and consists of
several individual datasets, each validated and maintained by its
respective dataset custodian. In many cases, individual datasets are also
collections whose vegetation plots were provided by their respective
owners (the person who performed the actual vegetation survey) or by
someone who digitized the original data from the published scientific
or grey literature. We have no direct control over the individual
vegetation plots that we provide here in sPlotOpen. Yet, all these
vegetation plots stem from trained professional botanists, or published
scientific work, and are accompanied by detailed information on the
sampling protocols used, thus ensuring data quality and reliability.
Before integration into the sPlot database, each dataset was
further checked for consistency. If the dataset was in a different format,
we converted it to a Turboveg 2 dataset (Hennekens & Schaminée,
FIGURE 3 Distribution of vegetation plots in the first resampling iteration of sPlotOpen (n = 49,787) in the two-dimensional climatic
space represented by mean annual temperature and mean annual precipitation. Left: plots are colour coded based on sBiomes, that is,
sPlot’s definition of biomes (Bruelheide et al., 2019), which derives from Schultz’s (2005) ecozones, modified to include also the alpine biome
from Körner et al. (2017). Right: the same plots superimposed onto Whittaker’s biomes (Whittaker, 1975), as adapted by Ricklefs (2008) and
plotted using the R package ‘plotbiomes’
TABLE 2 Description of the variables contained in the ‘header’ matrix, together with their range (if numeric) or possible levels (if nominal
or binary) and the number of non-empty (i.e., non-NA) records

| Variable | Range/levels | Unit of measurement | No. of plots with information | Type |
|---|---|---|---|---|
| GIVD_ID | see Table 1 | | 95,104 | n |
| Dataset | see Table 1 | | 95,104 | n |
| Continent | Africa, Asia, Europe, North America, Oceania, South America | | 95,104 | n |
| Country | | | 95,104 | n |
| Biome | Alpine, Boreal zone, Dry midlatitudes, Dry tropics and subtropics, Polar and subpolar zone, Subtropics with year-round rain, Subtropics with winter rain, Temperate midlatitudes, Tropics with summer rain, Tropics with year-round rain | | 95,104 | n |
| Date_of_recording | 05-07-1888 – 03-02-2015 | dd-mm-yyyy | 80,085 | d |
| Latitude | −54.82303 – 80.149116 | ° (WGS84) | 95,104 | q |
| Longitude | −162.741433 – 176.4221 | ° (WGS84) | 95,104 | q |
| Location_uncertainty | 1 – 2,750 | m | 95,075 | q |
| Releve_area | 0.03 – 40,000 | m² | 67,022 | q |
| Plant_recorded | All vascular plants, All trees & dominant understory, Dominant trees, Only dominant species, Dominant woody plants >= 2.5 cm dbh, All woody plants, Woody plants >= 1 cm dbh, Woody plants >= 2.5 cm dbh, Woody plants >= 5 cm dbh, Woody plants >= 10 cm dbh, Woody plants >= 20 cm dbh, Woody plants >= 1 m height, Not specified | | 95,104 | n |
| Elevation | −30 – 5,960 | m a.s.l. | 62,968 | q |
| Aspect | 1 – 360 | ° | 42,178 | q |
| Slope | 0 – 90 | ° | 51,246 | q |
| is_forest | FALSE = 45,735; TRUE = 38,282 | | 84,017 | b |
| ESY | | | 39,632 | n |
| Naturalness | 1 = Natural, 2 = Semi-natural | | 60,192 | o |
| Forest | FALSE = 36,282; TRUE = 33,170 | | 69,452 | b |
| Shrubland | FALSE = 58,245; TRUE = 11,207 | | 69,452 | b |
| Grassland | FALSE = 33,800; TRUE = 35,652 | | 69,452 | b |
| Wetland | FALSE = 59,196; TRUE = 10,256 | | 69,452 | b |
| Sparse_vegetation | FALSE = 66,177; TRUE = 3,275 | | 69,452 | b |
| Cover_total | 1 – 990 | % | 19,407 | q |
| Cover_tree_layer | 0.5 – 150 | % | 12,094 | q |
| Cover_shrub_layer | 0.5 – 170 | % | 16,804 | q |
| Cover_herb_layer | 0.2 – 199 | % | 29,668 | q |
| Cover_moss_layer | 1 – 100 | % | 9,681 | q |
| Cover_lichen_layer | 1 – 90 | % | 708 | q |
| Cover_algae_layer | 1 – 100 | % | 41 | q |
| Cover_litter_layer | 1 – 107 | % | 3,161 | q |
| Cover_bare_rocks | 1 – 100 | % | 2,747 | q |
| Cover_cryptogams | 1 – 90 | % | 772 | q |
| Cover_bare_soil | 0 – 99 | % | 2,746 | q |
| Height_trees_highest | 1 – 99 | m | 8,220 | q |
| Height_trees_lowest | 1 – 90 | m | 447 | q |
| Height_shrubs_highest | 0.1 – 9.9 | m | 3,389 | q |
2001). During this conversion, we checked that all datasets contained
the required metadata information, and cross-checked that each plot
was located within the geographical scope of its respective dataset.
All individual Turboveg 2 datasets were then integrated into a Turboveg
3 database, and exported to comma-separated files. Finally, we
harmonized all the taxonomic names from all datasets, based on sPlot’s
taxonomic backbone (Purschke, 2017). This backbone matched all the
taxonomic names (without nomenclatural authors) from all datasets in
sPlot v2.1 and TRY v3.0 (Kattge et al., 2020) to their resolved version
based on the Taxonomic Name Resolution Service web application
(TNRS version 4.0; Boyle et al., 2013). This allowed us to (a) harmonize
all datasets to a common nomenclature and (b) link the sPlot
database to the TRY database (Kattge et al., 2020). The final backbone
only retained matched taxonomic names at the rank of species or
higher. Additional detail on the taxonomic resolution is reported in
Bruelheide et al. (2019), while a description of the workflow, including
R code, is available in Purschke (2017).
6 | USAGE NOTES
The sPlotOpen database can be downloaded from
https://doi.org/10.25829/idiv.3474-40-3292. A short vignette introducing
the use of sPlotOpen in R can be found in Supporting Information
Appendix S1. Users are urged to cite the original sources when using
sPlotOpen in addition to the present paper (see Table 1). For two
datasets (AF-00-009, AF-CD-001), the identification of taxa at
species level is still in progress. Data on lichens and mosses, where
available (e.g., dataset NA-GL-001), can be obtained on request from the
respective dataset custodian or sPlot coordinator. As most of the
constitutive datasets remain under continuous development,
sPlotOpen users are encouraged to get in touch with the custodian(s)
of the data they are planning to use (the updated list of custodian
names is maintained on the sPlot website).
The use of sPlotOpen comes with a number of warnings. First,
sPlotOpen was resampled in a way that maximizes the compositional
variability of vegetation in different environmental conditions. As
such, sPlotOpen should not be considered as representative of the
spatial distribution of plant communities, especially when the focus
has a local or regional spatial extent. Second, for most regions data
were collected opportunistically, and without a randomized sampling
design. This might lead to some vegetation types being oversampled
in some regions, but undersampled in other regions, which might
affect the output of species distribution models, especially at local or
regional spatial extents. Third, not all plots were sampled using the
same plot size, and some plots, mostly located in tropical regions,
only contain data on woody species. This should be accounted for
when exploring biodiversity patterns or comparing biodiversity
indices (e.g., species richness, beta diversity) across plots or regions.
Finally, a small fraction of plots are nested subsets of larger plots.
Depending on the application, this might or might not represent a
problem. Nested plots can be identified using the information in the
‘metadata’ matrix. The most appropriate way to deal with these
issues depends on the problem being analysed. Users are, therefore,
invited to carefully consider the limitations above when designing
applications relying on sPlotOpen.
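As one possible way of acting on the last warning, the Python sketch below separates nested from standalone plots using the ‘metadata’ matrix. It is only an illustration. We assume that nested plots are those with a non-missing ‘Original_subplotID’, that missing values appear as empty strings or ‘NA’, and that plots carry an identifier column here called ‘PlotID’ (a hypothetical name); users should check these assumptions against the released files.

```python
import csv
import io

# Toy stand-in for the tab-delimited metadata file.
# 'PlotID' is a hypothetical identifier column; missing values are assumed
# to be stored either as empty strings or as 'NA'.
METADATA_TXT = (
    "PlotID\tOriginal_plotID\tOriginal_subplotID\n"
    "10\tNA\tNA\n"
    "11\tP7\tP7_a\n"
    "12\tP7\tP7_b\n"
    "13\t\t\n"
)

MISSING = {"", "NA"}


def split_nested(metadata_file):
    """Return (nested_ids, standalone_ids) based on 'Original_subplotID'."""
    reader = csv.DictReader(metadata_file, delimiter="\t")
    nested, standalone = [], []
    for row in reader:
        # Assumption: a non-missing subplot ID marks a nested plot.
        is_nested = row["Original_subplotID"].strip() not in MISSING
        (nested if is_nested else standalone).append(row["PlotID"])
    return nested, standalone


nested_ids, standalone_ids = split_nested(io.StringIO(METADATA_TXT))
print(nested_ids, standalone_ids)
```

Whether nested plots should be dropped, aggregated to their parent plot, or kept depends on the analysis, as noted above.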
The data described here represent the subset of sPlot for which
we were able to secure permission for making these data open.
Additional data from sPlot are available under sPlot’s Governance
and Data Property Rules (https://www.idiv.de/en/splot). Using the
full sPlot dataset is also recommended if a stratification is desired
that is different from the environmental factors used here, for
example by geographical region or plot size.
ACKNOWLEDGMENTS
The authors are grateful to the thousands of vegetation scientists
who sampled vegetation plots in the field or digitized them into
regional, national or international databases. The authors also
| Variable | Range/levels | Unit of measurement | No. of plots with information | Type |
|---|---|---|---|---|
| Height_shrubs_lowest | 0.1 – 9 | m | 263 | q |
| Height_herbs_average | 0.1 – 600 | cm | 5,901 | q |
| Height_herbs_lowest | 1 – 150 | cm | 490 | q |
| Height_herbs_highest | 1 – 600 | cm | 1,083 | q |
| SoilClim_PC1 | −6.233 – 8.172 | | 95,104 | q |
| SoilClim_PC2 | −4.824 – 15.466 | | 95,104 | q |
| Resample_1 | FALSE = 45,317; TRUE = 49,787 | | 95,104 | b |
| Resample_2 | FALSE = 45,293; TRUE = 49,811 | | 95,104 | b |
| Resample_3 | FALSE = 45,315; TRUE = 49,789 | | 95,104 | b |
| Resample_1_consensus | FALSE = 41,842; TRUE = 53,262 | | 95,104 | b |
Note: dbh = diameter at breast height. Variable types can be n = nominal (i.e., qualitative variable); o = ordinal; q = quantitative; b = binary (i.e.,
Boolean); or d = date. Additional details on the variables are in Bruelheide et al. (2019). Global Index of Vegetation Databases (GIVD) codes derive
from Dengler et al. (2011). Biomes refer to Schultz (2005), modified to include also the world mountain regions (Körner et al., 2017). The column ESY
refers to the European Nature Information System (EUNIS) Habitat Classification expert system (ESY, Chytrý et al., 2020).
TABLE 2 (Continued)
appreciate the support of the German Research Foundation for
funding sPlot as one of the iDiv (DFG FZT 118, 202548816) research
platforms, as well as for funding the position of Francesco
Maria Sabatini and the organization of three workshops through
the sDiv calls. The authors acknowledge this support by naming
the database ‘sPlot’, where the ‘s’ refers to the sDiv synthesis
workshops. The authors are also grateful to Anahita Kazem and iDiv's
Data & Code Unit for assistance with curation and archiving of the
dataset.
The study has been supported by the TRY initiative on plant
traits (http://www.try-db.org). The TRY initiative and database is
hosted, developed and maintained by J. Kattge and G. Bönisch
(Max Planck Institute for Biogeochemistry, Jena, Germany). TRY
is currently supported by DIVERSITAS/Future Earth and iDiv
Halle-Jena-Leipzig. Jens Kattge acknowledges support by the
Max Planck Institute for Biogeochemistry (Jena, Germany), Future
Earth, iDiv Halle-Jena-Leipzig and the EU H2020 project BACI,
Grant No. 640176.
Isabelle Aubin was funded through the Natural Sciences and
Engineering Research Council of Canada and Ontario Ministry
of Natural Resources and Forestry. Yves Bergeron was funded
through the Natural Sciences and Engineering Research Council
of Canada. Idoia Biurrun was funded by the Basque Government
(IT936-16). Anne Bjorkman thanks the Herschel Island-Qikiqtaruk
Territorial Park management, Catherine Kennedy, Dorothy Cooley,
Jill F. Johnstone, Cameron Eckert and Richard Gordon for
establishing the ecological monitoring programme. Funding was provided
by Herschel Island-Qikiqtaruk Territorial Park. Luis Cayuela was
supported by project BIOCON08_044 funded by Fundación BBVA
(Banco Bilbao Vizcaya Argentaria). Milan Chytrý, Flavia Landucci,
Corrado Marcenò and Tomáš Peterka were supported by the
Czech Science Foundation (project no. 19-28491X). Brian Enquist
thanks the following individuals and institutions for contributing
data to sPlot via the SALVIAS database: Mauricio Bonifacino,
Saara DeWalt, Timothy Killeen, Susan Letcher, Nigel Pitman, Cam
Webb, The Missouri Botanical Garden, RAINFOR and the Amazon
Forest Inventory Network. Alvaro G. Gutiérrez was funded by
Project FORECOFUN-SSA PIEF-GA-2010–274798 and FONDECYT
1200468. Mohamed Z. Hatim thanks Kamal Shaltout and Joop
Schaminée for MSc thesis supervision, and Joop Schaminée for
support and funding from the Prince Bernard Culture Fund Prize for
Nature Conservation. Jürgen Homeier received funding from BMBF
(Federal Ministry of Education and Science of Germany) and the
German Research Foundation (DFG Ho3296-2, DFG Ho3296-4).
Borja Jiménez-Alfaro was funded by the Spanish Research Agency
through grant AEI/10.13039/501100011033. Dirk N. Karger
received funding from: the Swiss Federal Institute for Forest, Snow and
Landscape Research (WSL) internal grant exCHELSA and ClimEx,
the Joint Biodiversa COFUND project ‘FeedBaCks’ and ‘Futureweb’,
the Swiss Data Science Projects: SPEEDMIND, and COMECO, and
the Swiss National Science Foundation (20BD21_184131). Hjalmar
Kühl gratefully acknowledges the Pan African team and
funding by the Max Planck Society and Krekeler Foundation. Attila
Lengyel was supported by the National Research, Development
and Innovation Office, Hungary (PD-123997). Tatiana Lysenko was
funded by the Russian Foundation for Basic Research (Grant No.
16-04-00747a). Alireza Naqinezhad is supported by a master grant
from the University of Mazandaran. Jérôme Munzinger was
supported by the French National Research Agency (ANR) with grants
INC (ANR-07-BDIV-0008), BIONEOCAL (ANR-07-BDIV-0006)
& ULTRABIO (ANR-07-BDIV-0010), by the National Geographic
Society (Grant 7579-04), and with funding and authorizations of
North and South Provinces of New Caledonia. Arkadiusz Nowak
received support from the National Science Centre, Poland, grant no.
2017/25/B/NZ8/00572. Gerhard E. Overbeck acknowledges
support from Brazil's National Council of Scientific and Technological
Development (CNPq, grant 310022/2015-0). Meelis Pärtel was
supported by the Estonian Research Council (PRG609) and European
Regional Development Fund (Centre of Excellence EcolChange).
Robert Peet acknowledges the support from the National
Center for Ecological Analysis and Synthesis, the North Carolina
Ecosystem Enhancement Program, the U.S. Forest Service, and the
U.S. National Science Foundation (DBI-9905838, DBI-0213794).
Josep Peñuelas acknowledges the financial support from the
European Research Council Synergy grant ERC-SyG-2013-610028
IMBALANCE-P. Petr Petřík and Jiri Dolezal acknowledge the
support of the long-term research development project No. RVO
67985939 of the Czech Academy of Sciences. Oliver Phillips was
funded by an ERC Advanced Grant (291585, ‘T-FORCES’) and a
Royal Society-Wolfson Research Merit Award. Valério D. Pillar
was supported by Brazil's National Council of Scientific and
Technological Development (CNPq, grant 307689/2014-0). Solvita
Rūsiņa was supported by the University of Latvia grant AAP2016/
B041//Zd2016/AZ03 within the ‘Climate change and sustainable
use of natural resources’ framework. Franziska Schrodt was
supported by the University of Minnesota Institute on the Environment
Discovery Grant, the German Centre for Integrative Biodiversity
Research (iDiv) Halle-Jena-Leipzig grant (50170649_#7) and the
University of Nottingham Anne McLaren Fellowship. Jozef Šibík was
funded by The Slovak Research and Development Agency grant no.
APVV16-0431. Jens-Christian Svenning considers this work a
contribution to his VILLUM Investigator project ‘Biodiversity Dynamics
in a Changing World’ funded by VILLUM FONDEN (grant 16549).
Kim André Vanselow would like to thank W. Bernhard Dickoré for
the help in the identification of plant species and acknowledges the
financial support from the Volkswagen Foundation (AZ I/81 976)
and the German Research Foundation (DFG VA 749/1-1, DFG VA
749/4-1). Evan Weiher was funded by NSF DEB-0415383,
UWEC-ORSP, and UWEC-BCDT. Work by Karsten Wesche was supported
by the German Research Foundation (DFG WE 2601/3-1, 3-2, 4-1,
4-2) and by the German Ministry for Science and Education (BMBF,
CAME 03G0808A). Susan Wiser was funded by the New Zealand
(NZ) Ministry for Business, Innovation and Employment's Strategic
Science Investment Fund.
This paper is dedicated to the memory of Dr. Ching-Feng
(Woody) Li.
Viktoria Bondareva https://orcid.org/0000-0002-6676-5722
Jörg Brunet https://orcid.org/0000-0003-2667-4575
Andraž Čarni https://orcid.org/0000-0002-8909-4298
Laura Casella https://orcid.org/0000-0003-2550-3010
Luis Cayuela https://orcid.org/0000-0003-3562-2662
Victor Chepinoga https://orcid.org/0000-0003-3809-7453
János Csiky https://orcid.org/0000-0002-7920-5070
Els De Bie https://orcid.org/0000-0001-7679-743X
André Luis de Gasper https://orcid.org/0000-0002-1940-9581
Michele De Sanctis https://orcid.org/0000-0002-7280-6199
Jiri Dolezal https://orcid.org/0000-0002-5829-4051
Mohamed Abd El-Rouf Mousa El-Sheikh https://orcid.org/0000-0002-0720-7448
Brian Enquist https://orcid.org/0000-0002-6124-7096
Jörg Ewald https://orcid.org/0000-0002-2758-9324
Richard Field https://orcid.org/0000-0003-2613-2688
Manfred Finckh https://orcid.org/0000-0003-2186-0854
Sophie Gachet https://orcid.org/0000-0002-3599-5189
Antonio Galán-de-Mera https://orcid.org/0000-0002-1652-5931
Hamid Gholizadeh https://orcid.org/0000-0002-3694-368X
Melisa Giorgis https://orcid.org/0000-0001-6126-6660
Valentin Golub https://orcid.org/0000-0003-3973-6608
Inger Greve Alsos https://orcid.org/0000-0002-8610-1085
Gregory Richard Guerin https://orcid.org/0000-0002-2104-6695
Alvaro G. Gutiérrez https://orcid.org/0000-0001-8928-3198
Sylvia Haider https://orcid.org/0000-0002-2966-0534
Mohamed Z. Hatim https://orcid.org/0000-0002-0872-5108
Bruno Hérault https://orcid.org/0000-0002-6950-7286
Norbert Hölzel https://orcid.org/0000-0002-6367-3400
Jürgen Homeier https://orcid.org/0000-0001-5676-3267
Anke Jentsch https://orcid.org/0000-0002-2345-8300
Norbert Jürgens https://orcid.org/0000-0003-3211-0549
Dirk Nikolaus Karger https://orcid.org/0000-0001-7770-6229
Ali Kavgacı https://orcid.org/0000-0002-4549-3668
Elizabeth Kearsley https://orcid.org/0000-0003-0046-3606
Michael Kessler https://orcid.org/0000-0003-4612-9937
Larisa Khanina https://orcid.org/0000-0002-8937-5938
Holger Kreft https://orcid.org/0000-0003-4471-8236
Hjalmar S. Kühl https://orcid.org/0000-0002-4440-9161
Anna Kuzemko https://orcid.org/0000-0002-9425-2756
Flavia Landucci https://orcid.org/0000-0002-6848-0384
Attila Lengyel https://orcid.org/0000-0002-1712-6748
Frederic Lens https://orcid.org/0000-0002-5001-0149
Débora Vanessa Lingner https://orcid.org/0000-0002-6391-9343
Hongyan Liu https://orcid.org/0000-0002-6721-4439
Tatiana Lysenko https://orcid.org/0000-0001-6688-1590
Miguel D. Mahecha https://orcid.org/0000-0003-3031-613X
Corrado Marcenò https://orcid.org/0000-0003-4361-5200
Vasiliy Martynenko https://orcid.org/0000-0002-9071-3789
Jesper Erenskjold Moeslund https://orcid.org/0000-0001-8591-7149
Abel Monteagudo Mendoza https://orcid.org/0000-0002-1047-845X
Ladislav Mucina https://orcid.org/0000-0003-0317-8886
Jonas V. Müller https://orcid.org/0000-0001-7049-3048
CONFLICT OF INTEREST
The authors declare no competing interests.
AUTHOR CONTRIBUTIONS
FMS wrote the first draft of the manuscript, with considerable input
from JL and HB. JL and TH wrote the resampling algorithm. FMS
set up the GitHub projects, curated the database, and produced the
graphs. He also coordinated the sPlot consortium. SMH wrote the
Turboveg software, which holds the sPlot database. JKa provided the
trait data from TRY and FSc performed the trait data gap filling. HB
secured the funding for sPlot as a strategic project of iDiv. All other
authors contributed data and/or helped set up the database and/or
helped develop the resampling algorithm. All authors contributed to
revising and approved the manuscript.
DATA AVAILABILITY STATEMENT
The R code used to produce sPlotOpen from the sPlot v2.1
database is contained in the sPlotOpen_code GitHub repository:
https://github.com/fmsabatini/sPlotOpen_Code. This manuscript was
produced using the Manubot workflow (Himmelstein et al., 2019). The
code for reproducing this manuscript is stored in the
sPlotOpen_manuscript GitHub repository:
https://github.com/fmsabatini/sPlotOpen_Manuscript.
ORCID
Francesco Maria Sabatini https://orcid.org/0000-0002-7202-7697
Jonathan Lenoir https://orcid.org/0000-0003-0638-9582
Tarek Hattab https://orcid.org/0000-0002-1420-5758
Elise Aimee Arnst https://orcid.org/0000-0003-2388-7428
Milan Chytrý https://orcid.org/0000-0002-8122-3075
Jürgen Dengler https://orcid.org/0000-0003-3221-660X
Stephan M. Hennekens https://orcid.org/0000-0003-1221-0323
Ute Jandt https://orcid.org/0000-0002-3177-3669
Florian Jansen https://orcid.org/0000-0002-0331-5185
Borja Jiménez-Alfaro https://orcid.org/0000-0001-6601-9597
Jens Kattge https://orcid.org/0000-0002-1022-8469
Valério D. Pillar https://orcid.org/0000-0001-6408-2891
Oliver Purschke https://orcid.org/0000-0003-0444-0882
Brody Sandel https://orcid.org/0000-0003-2162-6902
Tsipe Aavik https://orcid.org/0000-0001-5232-3950
Svetlana Aćić https://orcid.org/0000-0001-6553-3797
Alicia T. R. Acosta https://orcid.org/0000-0001-6572-3187
Emiliano Agrillo https://orcid.org/0000-0003-2346-8346
Miguel Alvarez https://orcid.org/0000-0003-1500-1834
Mohammed A. S. Arfin Khan https://orcid.org/0000-0001-6275-7023
Fabio Attorre https://orcid.org/0000-0002-7744-2195
Isabelle Aubin https://orcid.org/0000-0002-5953-1012
Marijn Bauters https://orcid.org/0000-0003-0978-6639
Yves Bergeron https://orcid.org/0000-0003-3707-3687
Erwin Bergmeier https://orcid.org/0000-0002-6118-4611
Idoia Biurrun https://orcid.org/0000-0002-1454-0433
Anne D. Bjorkman https://orcid.org/0000-0003-2174-7800
Gianmaria Bonari https://orcid.org/0000-0002-5574-6067
Jérôme Munzinger https://orcid.org/0000-0001-5300-2702
Jalil Noroozi https://orcid.org/0000-0003-4124-2359
Arkadiusz Nowak https://orcid.org/0000-0001-8638-0208
Gerhard E. Overbeck https://orcid.org/0000-0002-8716-5136
Meelis Pärtel https://orcid.org/0000-0002-5874-0138
Aníbal Pauchard https://orcid.org/0000-0003-1284-3163
Robert K. Peet https://orcid.org/0000-0003-2823-6587
Josep Peñuelas https://orcid.org/0000-0002-7215-0150
Aaron Pérez-Haase https://orcid.org/0000-0002-5974-7374
Petr Petřík https://orcid.org/0000-0001-8518-6737
Gwendolyn Peyre https://orcid.org/0000-0002-1977-7181
Oliver L. Phillips https://orcid.org/0000-0002-8993-6168
Valerijus Rašomavičius https://orcid.org/0000-0003-1314-4356
Rasmus Revermann https://orcid.org/0000-0002-7044-768X
Gonzalo Rivas-Torres https://orcid.org/0000-0002-2704-8288
Solvita Rūsiņa https://orcid.org/0000-0002-9580-4110
Marco Schmidt https://orcid.org/0000-0001-6087-6117
Franziska Schrodt https://orcid.org/0000-0001-9053-8872
Pavel Shirokikh https://orcid.org/0000-0003-1864-4878
Jozef Šibík https://orcid.org/0000-0002-5949-862X
Urban Šilc https://orcid.org/0000-0002-3052-699X
Željko Škvorc https://orcid.org/0000-0002-2848-1454
Marta Gaia Sperandii https://orcid.org/0000-0002-2507-5928
Jens-Christian Svenning https://orcid.org/0000-0002-3415-0862
Kim André Vanselow https://orcid.org/0000-0003-3299-6220
Eduardo Vélez-Martin https://orcid.org/0000-0001-8028-8953
Roberto Venanzoni https://orcid.org/0000-0002-7768-0468
Cyrille Violle https://orcid.org/0000-0002-2471-9226
Risto Virtanen https://orcid.org/0000-0002-8295-821