SCIENTIFIC DATA | (2022) 9:664 | https://doi.org/10.1038/s41597-022-01786-5
www.nature.com/scientificdata
An all-Africa dataset of energy
model “supply regions” for solar
photovoltaic and wind power
Sebastian Sterl ✉, Hussain, Miketa, Li, Merven, en Tichalabbas, Thiery & Daniel Russo
with the strongest resources may not necessarily be the best candidates for investment in new power
source and open-access all-Africa dataset of “supply regions” for solar photovoltaic and onshore wind
Globally, the deployment of modern renewable electricity sources has reached unprecedented levels, mainly
driven by a strong growth of solar photovoltaic (PV) and wind power generation1. The typical levelised
cost of electricity (LCOE) of solar PV and wind power projects has dropped substantially and resulted in
cost-competitiveness with fossil fuel and hydropower plants2. Among other drivers, this has paved the way
for high penetrations of solar PV and wind power in various countries’ electricity mixes, such as Denmark,
Germany and Uruguay3. As a consequence of declining costs—a trend that is projected to continue—long-term
capacity expansion planning at a national and regional level, based on cost-optimisation procedures, often
suggests solar PV and wind power as priorities for future capacity buildout4.
Solar PV and wind power have a specific characteristic in which they differ from more traditional methods
of power generation: their electricity yield varies as a function of meteorological parameters such as
irradiation, temperature and wind speed. Solar PV and wind power, classified as variable renewable electricity (VRE)
resources, exhibit weather-related variability on all timescales from sub-hourly to interannual, and their yield
is site-specific. A power system with a high share of VRE necessitates increased power system flexibility to
cope with these variabilities5. Long-term energy planning, and models used therein, therefore need to take
these aspects into consideration6. When planning future power systems with potentially high shares of VRE
using such models, it is particularly important to properly represent site-specific characteristics of
VRE, including site-specific hourly, seasonal and interannual variabilities, and site-specific costs accounting
for additional grid and road infrastructure needs as a function of the distance between each site and existing
infrastructure7.
In order to represent VRE investment options whose characteristics differ across space, capacity expansion
models theoretically require a comprehensive set of potential VRE plant sites as input to allow well-informed
planning, each with their own temporal generation profile, similar to representing site-specific hydropower
1International Renewable Energy Agency (IRENA), Bonn, Germany. 2Faculty of Engineering, BClimate group,
Department HYDR, Vrije Universiteit Brussel, Brussels, Belgium. 3World Resources Institute (WRI), Regional Hub for
Africa, Addis Ababa, Ethiopia. 4Energy Systems Research Group, University of Cape Town, Cape Town, South Africa.
5International Atomic Energy Agency (IAEA), Vienna, Austria. 6Institute for Research in Technology (IIT), ICAI School
of Engineering, Comillas Pontifical University, Madrid, Spain. ✉ e-mail: sebastian.sterl@vub.be
options in these models8. While datasets such as the Global Solar Atlas and the Global Wind Atlas9,10 provide
a comprehensive mapping of VRE potential across the world at high resolution, they do not provide the asso-
ciated infrastructure costs of power plant deployment for each site, and neither would it be practical to feed
cost-optimisation models with all sites (pixels) in a given region without any prior cost-based screening, for
reasons related to computational performance. At the same time, using lighter datasets based e.g. on existing
projects as a proxy for an entire region is also of limited use, as model results would not provide any information
on preferences for VRE deployment across different potential sites. For instance, such an approach would not
elucidate the effect of site distance from grid infrastructure on costs11. Clearly, a representative subset of
attractive sites for VRE deployment would be the preferred option to feed into capacity expansion models.
Each site in such a subset would have to be assigned its own resource strength, temporal variability and
associated grid and road network expansion costs. Once fed into a capacity expansion model, this would allow
elucidating the optimal deployment of VRE plants, i.e. a portfolio of solar and wind power plants across the most
appropriate locations. For the African continent, whose burgeoning power systems imply a chance to plan power
grids from the outset to accommodate VRE5, the need for such modelling exercises is especially high.
This need is accentuated by the fast-growing deployment of VRE plants in different parts of the African
continent, and the projected continuation of this deployment over the next decade. Currently, the deployment
of solar PV and wind power in Africa is roughly evenly matched, with installed capacities of solar PV at around
8 GW as of 2020–21 (ref. 12), and wind power at 6.5 GW13. For solar power, this number is strongly dominated by
South Africa and Egypt, which cover around 80% of installed capacity on the continent12. For wind power, the
capacities are somewhat more spread out: South Africa, Egypt and Morocco record nearly two-thirds, with the
remaining one-third mostly in Tunisia, Kenya, Ethiopia and Mauritania13. Given the favourable cost projections
for both solar PV and wind power, the International Energy Agency predicts that these sources could record
strongly increased growth rates across Africa in the period up to 2030, and reach 27% of Africa’s aggregate elec-
tricity mix by that same year14.
One attempt to provide such subsets of attractive VRE plant sites, focused on the Eastern and Southern
African regions, was previously published in literature15,16. However, this methodology focused exclusively on
near-grid resources (within 50–100 km of existing grid infrastructure) and did not calculate annual yields of
VRE in a bottom-up manner based on open-source hourly meteorological conditions.
This study presents an attempt to go beyond refs. 15,16 by developing and open-sourcing a full workflow for
creating spatiotemporally model-ready VRE investment options for Africa based on publicly available data.
We present a novel representative subset of attractive sites for solar PV and onshore wind power for the entire
African continent. Hereafter, we refer to these sites as “Model Supply Regions” (MSRs). This MSR dataset was
created from an in-depth analysis of various existing datasets on resource potential, grid infrastructure, land
use, topography and others (see Methods section “Additional methodological details”), and achieves hourly
temporal resolution and kilometre-scale spatial resolution. This dataset fills an important research need by closing
the gap between comprehensive datasets on African VRE potential (such as the Global Solar Atlas and Global
Wind Atlas) on the one hand, and the input needed to run cost-optimisation models on the other. It also allows
a detailed analysis of the trade-offs involved in exploiting excellent, but far-from-grid resources as compared to
mediocre but more accessible resources, which is a crucial component of power systems planning still to be
elaborated for many African countries. The rest of the paper is organised as follows: we describe the overall approach
in its main steps, with more details provided in the Methods section “Additional methodological details”. This is
followed by a section that presents the results obtained, before concluding with a discussion, conclusions and future
work section.
Methods
The principle of MSR creation is based on combining various geospatial datasets
to arrive at a representative subset of sites that can, in practical terms, be considered attractive sites for VRE
plant deployment. A flowchart of the modelling process described in this section is given in Fig. 1. The model is
implemented in five Python-based scripts which execute the different stages described below: MSR creation, hourly
profile generation, attribution, screening and clustering.
The process of MSR creation is indicated schematically in Fig. 2
for a hypothetical rectangular country (panel (i) in Fig. 2), and summarised hereafter. Starting from the map
of the African continent, the following parameters are used to select a geographically referenced subset of sites
within each country (details are given in the Methods section “Additional methodological details”):
Resource strength. Only sites where average VRE resources (irradiation and wind speed) are above a certain
minimum threshold, typical for commercial exploitation, are considered for inclusion.
Population density. Very densely populated areas, e.g. cities, are excluded from consideration.
Elevation. Locations above a certain elevation are excluded.
Slope. Locations with a slope beyond a given threshold are excluded.
Land use. Sites are only considered if they fall within certain categories of land use.
Protected areas. Natural reserves and other protected sites are excluded.
Distance from roads. Only sites within a certain vicinity of existing road networks are considered.
The above criteria define the exclusion areas, and only regions that meet all criteria are considered in the
subsequent steps as potential areas for VRE deployment (ii).
These inclusion areas contain geographically close areas with steep resource gradients. To separate areas of
different resource strength from each other, each country’s map of inclusion areas is first classified into five bins
(iii) that reflect VRE resources of different strength (see Methods section “Additional methodological details”).
The binned areas are then polygonised (iv), i.e. boundaries are drawn around contiguous included sites, to define
a set of contiguous areas that belong to the same bin.
Fig. 1 The MSR toolset comprises five Python scripts that are run sequentially. A high-level description of
each script and the process flow is illustrated in this figure.
The data in each bin is then separately broken down (v) by applying a raster consisting of equally sized square
cells onto each bin’s polygon features (see Methods section “Additional methodological details”). The raster cell
area is based on a pre-defined maximum capacity threshold (in GW) of the technology in question, chosen to
correspond to typical high voltage transmission level power evacuation infrastructure, and is calculated using
the typical spatial footprint (MW/km², see Methods section “Additional methodological details”) of solar PV or
wind power plants. This breaks the polygon features in each bin down into smaller cells with a clearly defined
maximum size (see Methods section “Additional methodological details”).
Each of these cells is defined as an individual MSR (vi) with its own specific attributed parameters
such as maximum deployable capacity (in MW) based on cell area, average capacity factor (CF), distance
from grid and road infrastructure, resource strength, hourly VRE generation profile, etc. For a complete list of
attributes, the reader is referred to Supplementary Material A. The methodology to obtain hourly VRE generation
profiles is described in detail in section “Additional methodological details” below.
This approach yielded a total of 79,608 MSRs for solar PV and 36,352 MSRs for wind, corresponding to
56 TW and 29 TW of generation capacity, respectively. In theory, MSRs can now be directly input as individual
investment options in capacity expansion models, similar to how hydropower can be represented with
site-specific investment options. However, depending on the scope of the modelling and the computational
power available, the number of MSRs may be impractical for direct use.
Therefore, we developed two further steps aiming at reducing the computational requirements by (1)
pre-selecting the most desirable MSRs for model inclusion (“screening”) and (2) grouping MSRs with similar
characteristics together as a single investment option (“clustering”). These are described next.
In this study, the screening
step was based on the expected LCOE of each MSR. This LCOE was defined to include not only the costs of
potential power plant construction and operation & maintenance in the MSR, but also the additional costs of
substation, transmission line and road construction for grid connection of the MSR (see Supplementary
Material B). The MSRs were then ranked from lowest to highest expected LCOE. This allowed us to select a
top-ranked subset of the MSRs in each country based on LCOE, such that this subset (the “best” sites in
LCOE terms) can be used in capacity expansion planning.
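As a rough illustration of how such an expected LCOE could be assembled, the sketch below uses the textbook annualised-cost formulation (capital recovery factor times investment, plus fixed O&M, divided by annual energy). It is not the paper's cost model, whose parameters are given in Supplementary Material B; all function names, argument names and the default discount rate and lifetime are assumptions made for illustration only.

```python
def capital_recovery_factor(rate, years):
    """Annualise an up-front investment over `years` at discount rate `rate`."""
    if rate == 0:
        return 1.0 / years
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def expected_lcoe(capacity_mw, capacity_factor, plant_capex_per_mw,
                  fixed_om_per_mw_yr, grid_km, road_km,
                  line_cost_per_km, road_cost_per_km, substation_cost,
                  rate=0.07, years=25):
    """Illustrative LCOE (currency units per MWh) for one MSR.

    Includes plant costs plus the grid-connection extras named in the text:
    transmission line, road and substation. All inputs are hypothetical.
    """
    capex = (capacity_mw * plant_capex_per_mw
             + grid_km * line_cost_per_km
             + road_km * road_cost_per_km
             + substation_cost)
    annual_cost = (capex * capital_recovery_factor(rate, years)
                   + capacity_mw * fixed_om_per_mw_yr)
    annual_energy_mwh = capacity_mw * capacity_factor * 8760
    return annual_cost / annual_energy_mwh


def rank_by_lcoe(msrs):
    """Return MSR records sorted from lowest to highest expected LCOE."""
    return sorted(msrs, key=lambda msr: msr["lcoe"])
```

With identical plant costs and capacity factor, a site further from the grid and road network receives a higher LCOE under this formulation, which is the effect the ranking is meant to capture.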
In this analysis, we screened the dataset according to the criterion that the total area of screened MSRs should
not exceed 5% of an individual country’s surface area. Since screening criteria can be arbitrarily defined, we
note that this 5% is purely meant for demonstration purposes. Other criteria (e.g. “the cheapest 45 MSRs per
country”, “all MSRs, ranked by LCOE, whose annual power generation would be equal to the country’s electric-
ity demand”, etc.) could be equally or more valid, depending on the research question or the policy objective.
Overall, the data set is designed such that a range of criteria for selecting an optimal MSR subset can be easily
implemented.
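The 5% area criterion can be sketched as a simple greedy selection over the LCOE ranking. The record keys (`lcoe`, `area_km2`) and the function name are hypothetical, and stopping at the first MSR that would exceed the cap (rather than skipping it and trying smaller ones) is an assumption; the open-sourced code may handle this differently.

```python
def screen_by_area(msrs, country_area_km2, max_share=0.05):
    """Keep the lowest-LCOE MSRs while their cumulative area stays within the cap.

    Assumption: selection stops at the first MSR that would push the
    cumulative area above max_share * country_area_km2.
    """
    cap_km2 = max_share * country_area_km2
    selected, used_km2 = [], 0.0
    for msr in sorted(msrs, key=lambda m: m["lcoe"]):
        if used_km2 + msr["area_km2"] > cap_km2:
            break
        selected.append(msr)
        used_km2 += msr["area_km2"]
    return selected
```

Alternative criteria mentioned above, such as a fixed number of MSRs per country, would only change the stopping condition inside the loop.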
The MSRs screened for this analysis are provided in Excel files along with their metadata, which includes
full hourly profiles and assumed costs, and also as shapefiles (see Data Availability). These are intended to serve
the energy modelling community by providing a comprehensive set of potential VRE plants, representing the
Fig. 2 Process of MSR creation. This schematic shows the various steps of MSR creation (see also Methods),
starting from (i) the boundaries of a hypothetical rectangular country, through (ii) the exclusion of unsuitable
areas, (iii) the classification of the suitable areas into different bins representing VRE resources of different
strength, (iv) the polygonization of the areas in each bin, and (v) the breakdown of each polygon into smaller
cells, to arrive at (vi) a collection of pre-screened MSRs, each with their own specic characteristics, for the
country.
lowest-LCOE sites within each country, as inputs to cost-optimisation models. Since all MSRs come with their
own specific costs and own temporal availability profiles, they can be distinguished as separate technologies in
such models (and their cost parameters potentially further refined). The analysis presented in the remainder of
this paper is based on this screened dataset. Users wishing to use alternative screening criteria are invited to use
the open-sourced Model Supply Regions code (see Code Availability) and adapt it to their needs.
Since screening criteria can be arbitrarily defined, the number of MSRs provided by the screening may still be
too high for computational purposes (in this case of screening up to 5% of a country’s area, the number of MSRs
is still several thousand). In particular, many neighbouring MSRs may have very similar temporal profiles and
very similar costs, potentially leading to long runtimes of optimisation models. To mitigate this issue, one could
simply limit the screening further. However, the latter could lead to a loss of information on the different types
of resource profiles present within a country.
Therefore, a further clustering method is proposed based on the mathematical technique of k-means
clustering, which allows MSRs with very similar profiles within a country to be grouped numerically into a
user-defined number of “clusters”, summing up their overall potential and calculating aggregated temporal
profiles and costs. While this arguably leads to the loss of some spatial granularity, it preserves a wide range of
profiles for inclusion in energy models while allowing practically feasible runtimes. The approach is explained in
more detail in the Methods section “Additional methodological details” (cf. also Supplementary Material E), and
a corresponding clustering script is included in the open-sourced Model Supply Regions code.
While our screening method is based on costs (LCOE), “strongest” and “cheapest” resources are not
synonymous (cf. Figure 4c,d below). In addition, we note that “cheapest” and “optimal” are not synonymous either.
Cost-optimisation models would not automatically prefer the very cheapest MSRs when selecting from either
VRE technology. Notably, the temporal characteristics of power generation potential, especially diurnal and
seasonal profiles, can differ strongly between MSRs. Since cost-optimisation is done on the basis of the entire
system under study11, it is likely that cases exist where it is preferable to deploy VRE plants in locations where
resources are neither the strongest nor the cheapest, e.g. in locations where the seasonality of the resource has the
best fit with other elements of the system (such as complementarity with other resources, with demand, or with
storage needs). For these reasons, we note that a static comparison of MSRs’ LCOE only serves for screening, and
does not replace the need for further analysis of “optimality” using capacity expansion models.
The following section provides deeper methodological details on the above-described steps of MSR creation.
It is followed by a results section which focuses on an analysis of the screened MSRs that can be used as input
to capacity expansion models. It does not include an analysis of the application of the MSRs in such models
themselves.
Datasets used in MSR creation. Average resource strength data for
solar PV was obtained from the Global Solar Atlas9 at 1 × 1 km² resolution; data for wind power was obtained
from the Global Wind Atlas10 at 250 × 250 m² resolution. The inclusion threshold (lower limit) for solar PV
resources was an annual average Global Horizontal Irradiation (GHI) of 4 kWh/m²/day; the inclusion threshold
for wind resources was an annual average wind speed of 6 m/s at 100 m height.
Population density was obtained from the Oak Ridge National Laboratory’s LandScan 2019 dataset17 at 1 × 1
km² resolution, with an exclusion threshold (upper limit) of 100 inhabitants/km².
Elevation was obtained from the Shuttle Radar Topography Mission (SRTM) at 30 × 30 m² resolution18,
with an exclusion threshold (upper limit) of 2000 m above sea level. Slopes were obtained from the same dataset
(Slope = [difference in elevation between two points]/[distance between two points] × 100%), with an exclusion
threshold of 20%.
Land use maps were obtained from the European Space Agency’s GlobCover (2009) map19 at 300 × 300 m² resolution.
The included land cover categories are 11, 14, 20, 30, 110, 120, 130, 140, 150, 180, 190, and 200.
Protected areas were obtained from the World Database on Protected Areas20.
Existing transmission grid infrastructure was obtained from the GridFinder (2020) dataset21 and the existing
road network from the Global Roads Inventory Project (GRIP)22. No exclusion threshold was considered for the
distance of a site to existing grid infrastructure. An upper limit of 50 km was used for the distance between a site
and the road network.
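The thresholds listed above can be combined into a single per-site inclusion test, sketched below. The dictionary keys and function name are hypothetical, and treating each threshold as inclusive (`>=` or `<=`) is an assumption, since the text does not state how values exactly at a limit are handled. The actual scripts operate on raster layers rather than on per-site records.

```python
# GlobCover (2009) land cover categories listed in the text as included.
INCLUDED_LAND_COVER = {11, 14, 20, 30, 110, 120, 130, 140, 150, 180, 190, 200}


def is_included(site, technology):
    """Return True if a candidate site passes every inclusion criterion.

    `site` is a dict with hypothetical keys; thresholds follow the text.
    """
    if technology == "solar":
        resource_ok = site["ghi_kwh_m2_day"] >= 4.0        # annual average GHI
    elif technology == "wind":
        resource_ok = site["wind_speed_100m_ms"] >= 6.0    # annual average at 100 m
    else:
        raise ValueError(f"unknown technology: {technology}")
    return (resource_ok
            and site["population_per_km2"] <= 100
            and site["elevation_m"] <= 2000
            and site["slope_percent"] <= 20
            and site["land_cover"] in INCLUDED_LAND_COVER
            and not site["protected"]
            and site["road_distance_km"] <= 50)
```

No grid-distance test appears here because, as stated above, no exclusion threshold was applied to the distance from existing grid infrastructure.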
The search for the nearest transmission and road infrastructure was done considering exclusively the
infrastructure within a country’s borders. Only in cases where a country did not have one of these infrastructure types
within its borders was the search for that infrastructure type extended to its neighbouring countries.
The GRIP dataset provides roads in different categories, e.g. “primary”, “secondary”, etc. (to distinguish e.g.
highways from dirt roads). The MSR code provides the flexibility to select any subset of these categories. In the
present paper, all GRIP road categories were included in the search.
The different resource bins for MSR creation were obtained by dividing the range between the lower limit
used for the resources and the maximum resource value observed across non-excluded areas into five equally
spaced bins. The resource bins are thus country-specific.
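A minimal sketch of this binning, assuming that the maximum observed value is assigned to the top bin (the text does not say how the upper edge is treated); the function names are hypothetical:

```python
def bin_edges(lower_limit, max_observed, n_bins=5):
    """Equally spaced bin edges between the inclusion threshold and the country maximum."""
    width = (max_observed - lower_limit) / n_bins
    return [lower_limit + i * width for i in range(n_bins + 1)]


def assign_bin(value, lower_limit, max_observed, n_bins=5):
    """Return the 0-based bin index of a resource value within the country's range."""
    if not lower_limit <= value <= max_observed:
        raise ValueError("value lies outside the binned range")
    if max_observed == lower_limit:
        return 0  # degenerate range: everything falls in a single bin
    width = (max_observed - lower_limit) / n_bins
    # Assumption: the maximum itself is clamped into the highest bin.
    return min(int((value - lower_limit) / width), n_bins - 1)
```

Because `max_observed` is taken per country, the same resource value can fall into different bins in different countries.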
The breakdown of the polygons in each resource bin happens by bounding each polygon with a rectangle,
and subsequently dividing this rectangle vertically and horizontally into equidistant rows and columns. The
number of rows and columns is determined by dividing the vertical and horizontal dimensions of the rectangle
by the side length of a square representing the maximum deployable capacity in each MSR (see below), and
rounding this number down to the nearest integer. Each unit of the polygon enclosed by vertical and horizontal
lines and, in certain cases, the polygon boundary, is defined as an MSR.
The maximum deployable capacity in each MSR (i.e. the typical size of a VRE power plant, which defines the
maximum MSR area) was taken to be 2.7 GW (based on evacuation with a four-circuit 500 kV line, with a single
500 kV line assumed capable of evacuating up to 900 MW and one extra line included for N-1 security15). The
spatial footprint of solar PV and wind power plants was taken to be 33 MW/km² for solar PV23,24 and 11.8 MW/km²
for wind power25. In addition, we assumed that investing parties would not allocate all the area within an
MSR to solar panels or wind turbines (given the need for various types of infrastructure, such as on-site roads),
and used an area discount factor (the percentage of an MSR actually used for the power plant itself) of 10% for
solar PV and 25% for wind25. The maximum area of an MSR was thus 818 km² for solar PV, and 915 km² for
wind.
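The maximum areas quoted above follow directly from the stated capacity limit, footprints and area discount factors. The short check below reproduces that arithmetic; the function name is hypothetical, and the side length is shown only because it is the quantity used to grid the polygons.

```python
import math


def max_msr_area_km2(max_capacity_mw, footprint_mw_per_km2, area_discount):
    """Largest MSR area such that the usable share hosts at most max_capacity_mw."""
    return max_capacity_mw / (footprint_mw_per_km2 * area_discount)


# Three 900 MW circuits carry power; the fourth circuit is the N-1 spare.
max_capacity_mw = 3 * 900  # 2700 MW = 2.7 GW

solar_area = max_msr_area_km2(max_capacity_mw, 33.0, 0.10)
wind_area = max_msr_area_km2(max_capacity_mw, 11.8, 0.25)

# Side length of the square raster cell with that area.
solar_side_km = math.sqrt(solar_area)
wind_side_km = math.sqrt(wind_area)

print(round(solar_area), round(wind_area))  # 818 915
```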
Very small MSRs that would represent a capacity potential (MW) below the user-specified minimum size of
(in this case) 20 MW were discarded. This criterion is set to ensure that all MSRs represent a substantial enough
utility-scale power evacuation opportunity.
A list of cost parameters (installation, operation and maintenance of power plants, transmission lines,
substations and road extension) used in the calculation of LCOEs is provided in Supplementary Material B.
We note that the code for all steps of MSR creation has been made publicly available along with this paper
(see Code Availability), and hence, could be readily re-run with other parameter values should future users wish
to do so.
Temporal profiles of VRE generation. While average resource strength can be inferred at high spatial resolution
from the Global Solar Atlas and Global Wind Atlas, these databases do not provide hourly data series of potential
power generation or capacity factors. We obtained hourly data series of Global Horizontal Irradiation (GHI),
ambient temperature, and 100-m wind speed from the ERA5 reanalysis dataset26, whose spatial resolution is
coarser at 31 × 31 km². For each MSR, the hourly profiles of these parameters were extracted from the nearest
ERA5 cell through Nearest Neighbour spatial interpolation, based on centre-to-centre distance between the
MSR in question and the ERA5 cell. We used the meteorological year 2018 to perform the calculations as an
example, noting that any other year or an average across multiple years could be used as well. After extracting
these data, the GHI and wind speed datasets (8760 values, representing all hours in a year) were bias-corrected
to the respective annual average values across MSRs as provided by the Global Solar Atlas and Global Wind
Atlas. This process allowed us to combine the superior spatial resolution of the Global Solar Atlas and Global Wind
Atlas (km-scale) with the superior temporal resolution of ERA5 (hourly).
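The nearest-neighbour lookup can be sketched as follows. Using the great-circle (haversine) distance between centres is an assumption, as the text only specifies a centre-to-centre distance; the function names and the coordinate tuples are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_cell(msr_centre, cell_centres):
    """Index of the reanalysis cell whose centre is closest to the MSR centre.

    Both arguments use (latitude, longitude) tuples in decimal degrees.
    """
    return min(range(len(cell_centres)),
               key=lambda i: haversine_km(*msr_centre, *cell_centres[i]))
```

The hourly series of that cell would then be copied to the MSR before bias-correction.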
For GHI, we applied a simple additive bias-correction. The bias between the annual average Global Solar
Atlas and ERA5 GHI (in kWh/m²/year) was calculated for each MSR. This bias was subsequently divided by the
number of hours in a year with nonzero irradiation, and the resulting value (in kWh/m²/hour) was added to the
hourly time series of GHI extracted from ERA5 (excluding hours with zero irradiation). This bias was generally
very small; across all MSRs, the average absolute difference between annual irradiation levels in MSR centres
from the Global Solar Atlas and ERA5 was 3%.
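A minimal sketch of this additive correction, assuming the hourly series and the target are expressed in consistent units (hourly irradiation and its annual sum); the function name is hypothetical:

```python
def bias_correct_ghi(hourly_ghi, target_annual_total):
    """Shift all sunlit hours by one constant so the annual total matches the target.

    Hours with zero irradiation are left untouched, as described in the text.
    """
    sunlit_hours = [i for i, value in enumerate(hourly_ghi) if value > 0]
    corrected = list(hourly_ghi)
    if not sunlit_hours:
        return corrected
    offset = (target_annual_total - sum(hourly_ghi)) / len(sunlit_hours)
    for i in sunlit_hours:
        corrected[i] += offset
    return corrected
```

Note that a large negative offset could push very small sunlit values below zero; the text reports the bias as small (3% on average), and this sketch does not handle that edge case.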
For wind speeds, we used the rank mapping technique (equivalent to empirical quantile mapping27, using
ranks out of 8760 values as quantiles) to map all hourly wind speed values extracted from ERA5 to a separate
target dataset for each MSR. (We note here that the Global Wind Atlas dataset is based on downscaled ERA5 data.
The downscaling process used for the Global Wind Atlas makes it possible to resolve local topography-induced corridors
of high wind speeds, such as hill ridges, whereas ERA5 grid cells are generally too large to resolve these.) Such a
target dataset should have the same average as the Global Wind Atlas average value for the MSR, but also reflect
the Weibull-shaped distribution of wind speed time series, which is why an additive bias-correction (as for GHI)
is not appropriate here (since it does not preserve the Weibull shape). The mappings and target datasets were
obtained separately for each country, based on all ERA5 cells linked to MSRs within that country, as follows. If a
country has N MSRs for wind power, it accordingly has N time series of 8760 hourly wind speed values obtained
from ERA5. The 8760 values in each time series were assigned a rank between 1 and 8760 (reflecting lowest to
highest values within that series). This resulted in N wind speed values assigned rank 1, N wind speed values
assigned rank 2, etc. Subsequently, the N values in each rank were regressed against the annual average wind
speed from each of the N time series, and 8760 linear fits were thus made (one fit through N data points for each
rank). The bias-correction mapping was then obtained by evaluating those linear fits for each rank at the average
wind speed from the Global Wind Atlas that represents the bias-correction target for each MSR. Thus, N target
datasets (each containing 8760 values) were obtained from N mappings (i.e. the evaluations of 8760 linear fit
equations, unique for each country, at N target mean wind speeds). Any extracted ERA5 time series could thus
be bias-corrected by determining the rank of each value in the time series and subsequently mapping to the
target value based on the linear fit for that rank. This process successfully bias-corrects to the target mean while
preserving a Weibull-shaped distribution of wind speeds at each site. This bias-correction was necessary as the
annual mean wind speeds computed for each MSR from the Global Wind Atlas were, on average, 45% higher
than those obtained from ERA5 due to the latter’s spatial coarseness.
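The rank-mapping procedure can be sketched in plain Python as below. This is an illustrative reading of the description above, not the authors' script: the function names are hypothetical, ordinary least squares is assumed for the per-rank linear fits, ties in wind speed are ranked in order of occurrence, and the country is assumed to contain at least two series with different means.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_rank_mapping(series_list):
    """One linear fit per rank: value at that rank versus the series' annual mean.

    `series_list` holds the N hourly series of one country (equal lengths).
    """
    means = [sum(s) / len(s) for s in series_list]
    ranked = [sorted(s) for s in series_list]
    n_ranks = len(series_list[0])
    return [linear_fit(means, [r[k] for r in ranked]) for k in range(n_ranks)]


def bias_correct_wind(series, target_mean, fits):
    """Replace each value by its rank's fitted value evaluated at the target mean."""
    order = sorted(range(len(series)), key=lambda hour: series[hour])
    corrected = [0.0] * len(series)
    for rank, hour in enumerate(order):
        slope, intercept = fits[rank]
        corrected[hour] = slope * target_mean + intercept
    return corrected
```

Because least squares is linear in the fitted values, averaging the per-rank fits over all ranks reproduces the identity line, so the corrected series has exactly the target mean, consistent with the statement that the process bias-corrects to the target mean.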
After bias-correction, the GHI and temperature datasets were converted to solar PV capacity factor datasets
according to the parameterisations of ref. 28, and the 100-m wind speeds were converted to wind turbine capacity
factors according to ref. 15 (which distinguishes between Class-III, Class-II and Class-I wind turbines based on
the average wind speed: less than 7.5 m/s, between 7.5 and 8.5 m/s, or higher than 8.5 m/s, respectively).
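The turbine class selection reduces to a threshold lookup on the MSR's average wind speed. In the sketch below, the function name is hypothetical and assigning the boundary values 7.5 and 8.5 m/s to Class-II is an assumption, since the text does not specify which class they belong to. The power curves themselves come from ref. 15 and are not reproduced here.

```python
def turbine_class(mean_wind_speed_ms):
    """Pick the turbine class from an MSR's annual average 100-m wind speed."""
    if mean_wind_speed_ms < 7.5:
        return "Class-III"
    if mean_wind_speed_ms <= 8.5:  # assumption: both boundaries belong to Class-II
        return "Class-II"
    return "Class-I"
```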
The hourly capacity factor profiles (8760 values) for each MSR are provided as metadata of the MSR dataset
(see Data Availability). A full list of metadata is provided in Supplementary Material A.
Clustering method. Seeking a balance between computational load and model detail, we propose a clustering
approach to group MSRs based on their similarity to one another in addition to the MSR algorithm discussed
above.
In this approach, MSRs are grouped based on their CF timeseries. This is executed by using the Euclidean
distance between these timeseries to perform k-means clustering. The coarseness of the clusters can be controlled
by setting the maximum number of clusters according to a modeller’s preferences. Representative parameters for
each cluster are determined by taking the sum or, where appropriate, weighted average (by maximum capacity
deployable) of those MSRs that make up the cluster. The resulting hourly capacity factor profile is made up of
the weighted average production of the individual MSRs at each hour.
This algorithm does not intrinsically consider geographical distance between MSRs as a criterion. However,
given the typical spatial correlation observed for VRE potential across certain distances7, the clusters typically
represent areas that are geographically contiguous or near-contiguous (although constrained by the presence of
country borders and exclusion zones). This is shown in Supplementary Material E.
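The clustering and aggregation steps can be illustrated with a small, dependency-free k-means (Lloyd's algorithm with Euclidean distance). This is a sketch under stated assumptions and not the clustering script shipped with the open-sourced code: the function names, the seeded random initialisation and the fixed iteration count are all illustrative choices.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length CF time series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(profiles, k, n_iter=50, seed=0):
    """Assign each CF profile to one of k clusters using Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in profiles]
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def aggregate_clusters(profiles, capacities_mw, labels):
    """Sum capacity per cluster and build its capacity-weighted hourly CF profile."""
    clusters = {}
    for label in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == label]
        total_mw = sum(capacities_mw[i] for i in idx)
        profile = [sum(capacities_mw[i] * profiles[i][h] for i in idx) / total_mw
                   for h in range(len(profiles[0]))]
        clusters[label] = {"capacity_mw": total_mw, "cf_profile": profile}
    return clusters
```

Weighting by deployable capacity means the cluster profile equals the combined output of its member MSRs divided by their combined capacity at each hour, matching the description above.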
Results
The resulting (screened) set of lowest-cost MSRs by country is provided
in Fig. 3a (see Data Availability for individual country-level maps). This screened dataset contains 10,905
MSRs for solar PV across Africa (with an estimated total deployment potential of 4.9 TW at 21.4% average CF) and
7,177 for wind power (3.4 TW at 54.9% average CF). Additionally, we show two examples of local hourly and
seasonal profiles (from a solar PV MSR in Somalia and a wind MSR in Kenya) in Fig. 3b,c. Hourly profiles are provided
for an example meteorological year, in this case 2018; however, since these are derived from reanalysis datasets (see
Methods section “Additional methodological details”), the same analysis can be readily redone with reanalysis data
for any other year.
Given countries’ vast disparity in spatial resource distribution, the identified set of MSRs allows for a deeper
analysis of the combined effect of resource strength and grid distance on site attractiveness. These two aspects
are paramount as the main determinants of the expected LCOE in each MSR are, first, resource strength (the
stronger the resource, the higher the yield per unit of capacity, and the lower the LCOE), and second, the
distance from existing grid and road infrastructure (the higher this distance, the higher the additional costs to
connect power plants to the grid, and the higher the LCOE).
Figure 4 shows the average (MSR area-weighted) distance of each country’s MSRs from the existing transmission
grid (left), as well as those MSRs’ expected (area-weighted) average capacity factors (CF, right). This Figure
reveals several points of interest. First, the CF of wind power is spatially much more divergent than that of solar
PV across countries (a well-known fact, linked to wind power generation scaling with wind speeds to the third
power, as opposed to solar PV power generation scaling nearly linearly with irradiation29). Second, the distance
of MSRs from transmission grid infrastructure is also typically much larger for wind power (close to 160 km)
than for solar PV (close to 30 km).
Keeping in mind that the screened MSRs within each country reflect that country’s cheapest options, this
signifies that there is generally more economic sense in exploiting remote resources for wind than for solar PV.
In other words, paying the “remoteness premium” (additional transmission lines and road infrastructure) is more
worth the effort for wind than for solar PV, since the extra yield obtained by exploiting excellent far-from-grid
wind resources (as opposed to mediocre close-to-grid wind resources) apparently makes up for this premium.
Clearly, the same does not apply to solar PV. This appears logical considering the lower spatial resource diversity
of solar PV as compared to wind power, which makes far-from-grid investments less attractive.
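The screening by LCOE up to a maximum coverage of 5% of a country's area (see the Fig. 3 caption) can be sketched as a simple greedy selection. This is a minimal illustration rather than the toolset's actual implementation: the dictionary keys, the function name and the rule of stopping at the first MSR that would exceed the area budget are assumptions.

```python
def screen_msrs(msrs, country_area_km2, max_share=0.05):
    """Keep the cheapest MSRs until their cumulative area would exceed max_share of the country."""
    budget_km2 = max_share * country_area_km2
    selected, used_km2 = [], 0.0
    # Walk through the MSRs from cheapest to costliest.
    for msr in sorted(msrs, key=lambda m: m["lcoe"]):
        if used_km2 + msr["area_km2"] > budget_km2:
            break  # assumed stopping rule: do not exceed the area budget
        selected.append(msr)
        used_km2 += msr["area_km2"]
    return selected


# Hypothetical MSRs for a country of 1,000 km2 (area budget: 50 km2).
example = [
    {"id": "a", "lcoe": 50.0, "area_km2": 30.0},
    {"id": "b", "lcoe": 40.0, "area_km2": 30.0},
    {"id": "c", "lcoe": 60.0, "area_km2": 30.0},
]
kept = screen_msrs(example, country_area_km2=1000.0)
```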
A follow-up investigation can be performed to assess how substantial the effect of this “remoteness premium”
is in cross-country comparisons. For instance, Chad (TD) has among Africa’s best wind power MSRs with an
average CF of 54% (Fig. 4, right), but these tend to be very far from existing grid infrastructure (left). How do
the costs of these MSRs, which are Chad’s cheapest, compare to the costs of wind MSRs in e.g. Cameroon (CM),
where weaker but still viable (at 35% CF) wind resources are found much closer to existing grid infrastructure
(by a factor of nearly 20 in terms of distance)?
Fig. 3 Spatial distribution of solar PV and wind MSRs across Africa. (a) A map of the African continent
showing all solar PV and wind MSRs screened by LCOE up to a maximum coverage of 5% of a country’s area.
(b,c) Example temporal profiles (diurnal and seasonal) for the two example locations indicated in (a). The
diurnal example in (b) covers the 12th day of March.
We visualise all MSRs for solar PV and wind power as classified by their relative cost in Fig. 5a (cost expressed
in USD2019). All MSRs were binned into five categories by their expected LCOE (separately for solar PV and wind).
Across the African continent, based on the most recent global-level VRE, transmission and road infrastructure
costing data for the present-day available to the authors2, the LCOEs for MSRs range from 97.7 to 148.6 USD/MWh
for solar PV and from 34.5 to 127.4 USD/MWh for wind power. Clearly, the lowest-cost MSRs for both VRE types are
found in a “boomerang” shape which stretches from West to East across the Sahara and parts of the Sahel, then across
the length of the East African Rift from Northeast to Southwest Africa. On the other hand, southern West Africa
and large parts of Central Africa are clearly less well-resourced in the African context, although their solar PV
potential is still markedly above that of e.g. many European countries.
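The binning into five LCOE categories can be sketched as follows. The text does not state how the bin edges were chosen, so equal-width bins between the lowest and highest LCOE of each technology are an assumption here, as is the function name; only the wind LCOE range in the usage lines is taken from the text.

```python
def lcoe_bin(lcoe, lo, hi, n_bins=5):
    """Return a bin index from 0 (cheapest) to n_bins - 1 (costliest), using equal-width bins."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    width = (hi - lo) / n_bins
    index = int((lcoe - lo) // width)
    # Clamp so that the upper edge (and any out-of-range value) stays in a valid bin.
    return min(max(index, 0), n_bins - 1)


# Wind LCOE range quoted in the text (USD/MWh); binning is done per technology.
WIND_LO, WIND_HI = 34.5, 127.4
cheapest_bin = lcoe_bin(34.5, WIND_LO, WIND_HI)
costliest_bin = lcoe_bin(127.4, WIND_LO, WIND_HI)
```

Quantile-based bins would be an equally plausible choice and would put a similar number of MSRs in each category instead.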
In Fig. 5b, we show the division of MSRs within each country (by area) across the different LCOE bins, with
countries ordered from left to right by average MSR LCOE. Interestingly, a few countries (e.g. Kenya, Djibouti)
appear at the high end (i.e. most favourable LCOE in continental comparison) for both resources, whereas
others (e.g. Gabon, Equatorial Guinea) find themselves at the low end for both. This suggests that, in the future
(in which countries’ power systems are expected to become more interconnected), some specific countries may
potentially emerge as typical “VRE hubs” (similarly to how some countries, like Ethiopia and Guinea, are already
known as hydropower hubs5,8).
Fig. 4 Capacity factor of MSRs as compared to their distance from the transmission grid. The left axis shows
the average distance from the transmission grid across all MSRs in a country; the right axis shows the average
capacity factor. Averages are weighted by MSR area. Solar PV and wind MSRs differ in (a) the spatial divergence
of CFs (larger for wind than solar PV) and in (b) the distance-from-grid of the cheapest MSRs (higher for
wind than solar PV). Country abbreviations denote alpha-2 codes; see Supplementary Table 4 for the list of full
names. Countries are ranked vertically according to the alphabetical order of these full names. Note that some
countries do not have any viable wind power potential according to the present methodology, hence bars for
wind power are omitted for those countries in this graph.
The relationship between LCOE and resource strength, and the influence of grid and road infrastructure
costs, is elucidated in Fig. 5c,d. Here, we show the (area-weighted) average LCOE versus (area-weighted)
average CF by country. (Charts showing the breakdown of this trend by individual MSR are given in Supplementary
Material C.) In both cases, a very clear trend of decreasing LCOE with increasing CF is evident, as would be
expected. However, in both cases, there are also clear instances of the same CF resulting in diverging LCOEs.
This is explained by the difference between near-grid and far-from-grid resources in countries with similar
average resource strength.
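The area-weighted country averages used in Figs. 4 and 5c,d can be computed along the following lines. The record layout (keys `country`, `area_km2`, `cf`) and the function name are hypothetical and do not reflect the column names of the published files.

```python
from collections import defaultdict


def area_weighted_mean_by_country(msrs, field):
    """Area-weighted mean of `field` per country code."""
    weighted_sum = defaultdict(float)
    total_area = defaultdict(float)
    for msr in msrs:
        weighted_sum[msr["country"]] += msr[field] * msr["area_km2"]
        total_area[msr["country"]] += msr["area_km2"]
    # Skip countries whose MSRs have zero total area to avoid division by zero.
    return {c: weighted_sum[c] / total_area[c] for c in weighted_sum if total_area[c] > 0}


# Two hypothetical MSRs in one country: the larger MSR dominates the average.
sample = [
    {"country": "TD", "cf": 0.50, "area_km2": 100.0},
    {"country": "TD", "cf": 0.60, "area_km2": 300.0},
]
mean_cf = area_weighted_mean_by_country(sample, "cf")
```

The same function applies to LCOE or grid distance by passing a different field name.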
As an example, the highest-yield MSRs overall for solar PV are found in Namibia and Somalia (Fig. 5c), but
Namibia achieves a substantially lower average LCOE: Namibia has more adequate existing grid infrastructure
and the “remoteness premium” is correspondingly lower than in Somalia. The Somalian LCOE is, in fact, similar
to that expected for solar PV farms in Uganda, which would have a CF of more than a percentage point less, but
where existing grid infrastructure is more adequate.
A similar effect can be seen for wind power when looking at the case of Chad (Fig. 5d). Expected capacity
factors in Chad would be very close to those of Djibouti, the country achieving the lowest average wind
LCOE. Yet, the additional costs for grid connection of Chadian wind farms would be monetarily equivalent to
exploiting resources with a CF of roughly 10 percentage points lower but in the immediate vicinity of existing grid
infrastructure (e.g. as typical for MSRs in Tunisia). However, the costs of these Chadian wind MSRs would still
be lower than those of the Cameroonian MSRs mentioned earlier (nearly 30% cheaper in Chad than in Cameroon).
Fig. 5 MSRs classified by expected LCOE, including installation costs, operation and maintenance costs,
transmission grid extension costs and road network extension costs. (a) All-Africa MSRs, screened by LCOE
up to coverage of 5% of a country’s area, classified by five LCOE categories from cheaper to costlier. (b) The
country-level area-weighted distribution of MSRs across these five categories. Country abbreviations denote
alpha-2 codes; see Supplementary Table 4 for a list of full names. Countries with comparatively low overall VRE
potential are marked with symbols (*), (**) or (***) if total MSR area covered less than 3%, 1% and 0.1% of
the country, respectively, and with (−) in case of absence of MSRs in that country. (c,d) Each country’s average
LCOE as function of average CF (averages weighted by MSR area), for solar PV (c) and wind (d). SO = Somalia,
NA = Namibia, UG = Uganda, DJ = Djibouti, TD = Chad, TN = Tunisia, CM = Cameroon.
Generally, it can be concluded from Figs. 3 to 5 that the relative attractiveness of sites for deployment of
VRE plants across Africa is determined by a number of factors. The most important factor, generally, is resource
strength. For instance, the lowest-cost sites for solar PV and wind are to be found in the countries with the best
resource availability (e.g. Namibia, South Africa and Egypt for solar PV, and Djibouti, Sudan and Kenya for
wind), cf. Fig. 5b.
When intercomparing countries by site attractiveness, resource strength is therefore generally a good
indicator, with the notable exception of countries with very poorly built-out grid infrastructure. For instance, Somalia
(for solar PV) and Chad (for wind) have highly attractive sites from a resource point of view, but these sites
would be equivalent to mediocre near-grid sites in LCOE terms, given the additional costs for grid and road
expansion that would be involved (cf. Fig. 5c,d).
When intercomparing sites within a single country, resource strength is also seen to be the most important
indicator for site attractiveness for wind power, but importantly, not for solar PV (cf. Fig. 4). For solar PV,
given its relatively low spatial disparity within each individual country, the best sites (in LCOE terms) simply
tend to cluster close to existing grid infrastructure.
Discussion
In this study, we provide an all-Africa dataset of locations for solar PV and wind park deployment, and their
metadata, to serve the energy modelling community. The dataset contains enough information to include the
locations in cost-optimisation models for capacity expansion and distinguishes between resources of different
quality and resources of different accessibility.
It is seen that the most attractive locations (in LCOE terms) for solar PV plants tend to cluster near existing
grid infrastructure, whereas the most attractive locations for wind power plants are spatially much more widely
distributed. The dataset has also provided some insights on the possible compromises between resource quality
and grid proximity that may have to be considered for African power systems planning.
Several improvements to the dataset are under consideration for the future. First, as on-the-ground
deployment of VRE plants accelerates across Africa, the reanalysis-inferred power generation profiles of the MSRs
should be compared against observed data from actual power plants to validate the robustness
of the employed methods. Second, as power grids and road networks are presently in full expansion across
Africa to enhance electricity access, the dataset will have to be regularly updated to account for these new
realities, which may make locations that are currently relatively inaccessible for grid connection more attractive in
the future. Third, as the upfront investment and operating & maintenance costs of solar PV and wind power
continue to drop, the relative importance of the “distance-from-grid” criterion vis-à-vis the “resource strength”
criterion will shift; typically, the lower the upfront and running costs of VRE, the closer the cheapest MSRs will
cluster around available grid infrastructure, even if this means slightly lower capacity factors (see Supplementary
Material D). Fourth, the methodology could be extended to offshore wind power, solar CSP, and other types of
VRE, e.g. tidal and wave power. Fifth, the methodology could be improved by including transmission grid
congestion (i.e. scoring the availability of existing grid infrastructure in terms of available capacity, not by the mere
presence of the transmission lines) as a parameter in the creation of MSRs. The current algorithm only scores
for proximity to the existing grid. As generation capacity gets added to MSRs in concentrated areas where there
is an existing grid, the ability of the existing grid to evacuate more power will diminish, requiring additional
investment in grid capacity. This should increase the LCOE of the subsequent MSRs, which, although in close
proximity to the grid, would have to carry additional investment costs. Sixth, we note that the presented dataset
used the meteorological year 2018 as the basis to calculate hourly power supply time series. For applications in
which the correlation between VRE supply and electricity demand for all individual hours of the year is of prime
importance (e.g. models that run at full 8760-hour temporal resolution), we therefore urge users to consider
whether the choice of the year 2018 is adequate or whether another year, or a combination of years into e.g. a
“typical meteorological year”, would be better suited. Seventh, the dataset could eventually be extended to
cover all other continents as well, allowing for better data validation and more extended statistical analysis of
MSR characteristics in space and time.
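For users weighing the sixth point, one simple way to combine several reanalysis years is to pick, for each calendar month, the year whose monthly mean CF lies closest to the multi-year mean for that month. The sketch below illustrates this idea only; it is a strong simplification of established typical-meteorological-year procedures (which compare full statistical distributions of several variables), and the data layout and function name are assumptions.

```python
from statistics import mean


def pick_typical_years(monthly_profiles):
    """monthly_profiles[year][month] is a list of hourly CFs. Returns {month: chosen year}."""
    years = sorted(monthly_profiles)
    months = sorted(monthly_profiles[years[0]])
    chosen = {}
    for month in months:
        # Monthly mean CF in each candidate year.
        monthly_means = {y: mean(monthly_profiles[y][month]) for y in years}
        long_term_mean = mean(monthly_means.values())
        # Select the year closest to the long-term mean (earliest year wins ties).
        chosen[month] = min(years, key=lambda y: abs(monthly_means[y] - long_term_mean))
    return chosen


# Hypothetical two-hour "months" for three years, for illustration only.
profiles = {
    2017: {1: [0.2, 0.2]},
    2018: {1: [0.3, 0.3]},
    2019: {1: [0.5, 0.5]},
}
typical = pick_typical_years(profiles)
```

Stitching months from different years preserves realistic hour-to-hour variability, which simple averaging across years would smooth out; whether this suits a given model still depends on how strongly demand and supply are correlated in that application.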
Data availability
The screened MSRs are available in a public repository on https://doi.org/10.5281/zenodo.7014609 in various
formats: (1) country-level georeferenced maps, showing how the screened MSRs align with load centres, roads
and transmission infrastructure within a country’s borders, and how their CFs and estimated LCOEs differ across
a country’s territory; (2) Excel files (along with their metadata, including hourly profiles) for screened and
pre-screened datasets; and (3) GIS shapefiles for screened datasets30.
Code availability
The Python code used to generate the MSRs along with all their metadata, including hourly profiles, as well as the
code to perform screening and clustering, is openly available on
https://github.com/bhussain89/Model-Supply-Regions-MSR-Toolset.
Received: 6 May 2022; Accepted: 17 October 2022;
Published: xx xx xxxx
References
1. IRENA. Renewable Capacity Statistics 2020. https://www.irena.org/publications/2020/Mar/Renewable-Capacity-Statistics-2020 (2020).
2. IRENA. Renewable Power Generation Costs in 2020. https://www.irena.org/publications/2021/Jun/Renewable-Power-Costs-in-2020 (2021).
3. REN21. Renewables 2020 Global Status Report. https://www.ren21.net/wp-content/uploads/2019/05/gsr_2020_full_report_en.pdf (2020).
4. IRENA. Planning and prospects for renewable power: Eastern and Southern Africa. https://www.irena.org/publications/2021/Apr/Planning-and-prospects-for-renewable-power-Eastern-and-Southern-Africa (2021).
5. Sterl, S. A Grid for all Seasons: Enhancing the Integration of Variable Solar and Wind Power in Electricity Systems Across Africa. Curr. Sustain. Energy Rep. 8, 274–281 (2021).
6. IRENA. Planning for the Renewable Future: Long-term modelling and tools to expand variable renewable power in emerging economies. https://www.irena.org/publications/2017/Jan/Planning-for-the-renewable-future-Long-term-modelling-and-tools-to-expand-variable-renewable-power (2017).
7. Engeland, K. et al. Space-time variability of climate variables and intermittent renewable electricity production – A review. Renew. Sustain. Energy Rev. 79, 600–617 (2017).
8. Sterl, S. et al. A spatiotemporal atlas of hydropower in Africa for energy modelling purposes. Open Res. Eur. 1, 29 (2021).
9. World Bank Group. Global Solar Atlas. at https://globalsolaratlas.info/ (2020).
10. World Bank Group. Global Wind Atlas. at https://globalwindatlas.info/ (2020).
11. Ueckerdt, F., Hirth, L., Luderer, G. & Edenhofer, O. System LCOE: What are the costs of variable renewables? Energy 63, 61–75 (2013).
12. AFSIA. Africa Solar Outlook 2021. http://afsiasolar.com/wp-content/uploads/2021/02/AFSIA-Africa-Solar-Outlook-2021-final-2.pdf (2021).
13. AEEP. Wind Energy: Joining Forces for an African Lift-Off. https://africa-eu-energy-partnership.org/wp-content/uploads/2022/02/AEEP_Wind_Energy_Policy_Brief_2022-3.pdf (2022).
14. IEA. Africa Energy Outlook 2022. https://iea.blob.core.windows.net/assets/6fa5a6c0-ca73-4a7f-a243-fb5e83ecfb94/AfricaEnergyOutlook2022.pdf (2022).
15. IRENA and LBNL. Renewable Energy Zones for the Africa Clean Energy Corridor - Multi-Criteria Analysis for Planning Renewable Energy. https://www.irena.org/-/media/Files/IRENA/Agency/Publication/2015/IRENA-LBNL_Africa-RE-_CEC_2015.pdf (2015).
16. Wu, G. C. et al. Strategic siting and regional grid interconnections key to low-carbon futures in African countries. Proc. Natl. Acad. Sci. 114, E3004–E3012 (2017).
17. ORNL. Oak Ridge National Laboratory - LandScan Documentation. https://landscan.ornl.gov/documentation (2015).
18. Farr, T. G. et al. The Shuttle Radar Topography Mission. Rev. Geophys. 45 (2007).
19. ESA. European Space Agency GlobCover Portal. http://due.esrin.esa.int/page_globcover.php (2010).
20. Protected Planet. Protected Areas (WDPA). https://www.protectedplanet.net/en/thematic-areas/wdpa?tab=WDPA (2021).
21. Arderne, C., Zorn, C., Nicolas, C. & Koks, E. E. Predictive mapping of the global power system using open data. Sci. Data 7, 19 (2020).
22. Meijer, J. R., Huijbregts, M. A. J., Schotten, K. C. G. J. & Schipper, A. M. Global patterns of current and future road infrastructure. Environ. Res. Lett. 13, 064006 (2018).
23. NREL. Land-Use Requirements for Solar Power Plants in the United States. https://www.nrel.gov/docs/fy13osti/56290.pdf (2013).
24. IRENA. Investment opportunities in West Africa: Suitability maps for grid-connected and off-grid solar and wind projects. https://www.irena.org/-/media/Files/IRENA/Agency/Publication/2016/IRENA_Atlas_investment_West_Africa_2016.pdf (2016).
25. IRENA. Estimating the Renewable Energy Potential in Africa: A GIS-based approach. http://www.irena.org/menu/index.aspx?mnu=Subcat&PriMenuID=36&CatID=141&SubcatID=440 (2014).
26. Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146, 1999–2049 (2020).
27. Gudmundsson, L., Bremnes, J. B., Haugen, J. E. & Engen-Skaugen, T. Technical Note: Downscaling RCM precipitation to the station scale using statistical transformations – a comparison of methods. Hydrol. Earth Syst. Sci. 16, 3383–3390 (2012).
28. Huld, T., Gottschalg, R., Beyer, H. G. & Topič, M. Mapping the performance of PV modules, effects of module type and data averaging. Sol. Energy 84, 324–338 (2010).
29. Sterl, S., Liersch, S., Koch, H., van Lipzig, N. P. M. & Thiery, W. A new approach for assessing synergies of solar and wind power: implications for West Africa. Environ. Res. Lett. 13, 094009 (2018).
30. Sterl, S., Hussain, B. & Elabbas, M. A. E. Data for the paper «An all-Africa dataset of energy model ‘supply regions’ for solar PV and wind power» (1.0.1) [Data set]. Zenodo https://doi.org/10.5281/zenodo.7014609 (2022).
Acknowledgements
The authors wish to thank T. Hadjicostas, T. Gumunyu, G. Chileshe, B. Batidzirai, L. Tatry, M. Tot, I. Gherboudj,
M. Nababa, and J. Hampp for inspiring comments and discussions. We acknowledge the European Centre for
Medium-Range Weather Forecasts (ECMWF) for providing the ERA5 reanalysis. The analysis was performed
as part of IRENA’s “Planning and Prospects for Renewable Power” series and is intended to feed into IRENA’s
ongoing support to AUDA-NEPAD in the development of the African Continental Power Systems Masterplan.
Author contributions
S.S. wrote the manuscript with inputs from all other authors. B.H. and Y.L. developed the MSR methodology with
inputs from all other authors. Y.L. developed the clustering algorithm with inputs from M.B.B.T. and B.M. S.S.
analysed the data and created the figures.
Competing interests
The authors declare no competing interests. The designations employed and the presentation of material in
this paper do not imply the expression of any opinion on the part of the authors or their affiliated institutions
concerning the legal status of any region, country, territory, city or area or of its authorities, or concerning the
delimitation of frontiers or boundaries.
Additional information
Supplementary information The online version contains supplementary material available at
https://doi.org/10.1038/s41597-022-01786-5.
Correspondence and requests for materials should be addressed to S.S.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons license and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2022