All content in this area was uploaded by Eduard Kerkhoven on Jan 25, 2024
Regulation of lactose and galactose growth: Insights from a unique
metabolic gene cluster in Candida intermedia
Kameshwara V. R. Peri1, Le Yuan1, Fábio Faria Oliveira1, Karl Persson1, Hanna D Alalam1,
Lisbeth Olsson1,2, Johan Larsbrink1,2, Eduard J Kerkhoven1,3,4 and Cecilia Geijer1
1 Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden.
2 Wallenberg Wood Science Center, Chalmers University of Technology, 412 96, Gothenburg,
Sweden.
3 Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark,
DK-2800 Kgs. Lyngby, Denmark.
4 SciLifeLab, Chalmers University of Technology, 41296, Gothenburg, Sweden.
Corresponding author: cecilia.geijer@chalmers.se
Keywords
Cheese whey, metabolism, evolution, gene clusters, transcriptional regulation, galactose
regulatory system, non-conventional yeast
bioRxiv preprint doi: https://doi.org/10.1101/2023.12.19.572343; this version posted December 19, 2023.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Abstract
Lactose assimilation is a relatively rare trait in yeasts, and Kluyveromyces yeast species have
long served as model organisms for studying lactose metabolism. Meanwhile, the metabolic
strategies of most other lactose-assimilating yeasts remain unknown. In this work, we have
elucidated the genetic determinants of the superior lactose-growing yeast Candida intermedia.
Through genomic and transcriptomic analyses and deletion mutant phenotyping, we identified
three interdependent gene clusters responsible for the metabolism of lactose and its hydrolysis
product galactose: the conserved LAC cluster (LAC12, LAC4) for lactose uptake and hydrolysis,
the conserved GAL cluster (GAL1, GAL7, GAL10) for galactose catabolism, and a unique
“GALLAC” cluster. This novel GALLAC cluster, which has evolved through gene duplication
and divergence, proved indispensable for C. intermedia’s growth on lactose and galactose. The
cluster contains the transcriptional activator gene LAC9, second copies of GAL1 and GAL10
and the XYL1 gene encoding an aldose reductase involved in carbon overflow metabolism.
Notably, the regulatory network in C. intermedia, governed by Lac9 and Gal1 from the
GALLAC cluster, differs significantly from the (ga)lactose regulons in Saccharomyces
cerevisiae, Kluyveromyces lactis and Candida albicans. Moreover, although lactose and
galactose metabolism are closely linked in C. intermedia, our results also point to important
regulatory differences. This study paves the way to a better understanding of lactose and
galactose metabolism in C. intermedia and provides new evolutionary insights into yeast
metabolic pathways and regulatory networks. By extension, the results will facilitate future
development and use of C. intermedia as a cell factory for conversion of lactose-rich whey into
value-added products.
Introduction
Assimilation of lactose is a rather uncommon characteristic among microorganisms, including
yeasts. Growth screening of 332 genome-sequenced yeasts from the Ascomycota phylum
showed that only 24 (<10%) could grow on lactose, and these lactose-utilizers are scattered
throughout the phylogenetic tree [1]. While lactose increased in abundance on earth with the
domestication of lactating mammals about 10,000 years ago [2], ascomycetous yeast clades
formed already millions of years ago [1], suggesting that lactose metabolism may have evolved
several times throughout yeast evolution. Whereas ‘dairy yeasts’ from the Kluyveromyces genus,
including K. lactis and K. marxianus, have been carefully characterized [3-6], other
lactose-metabolizing yeast species remain largely understudied. Elucidating the mechanisms behind
their lactose metabolism can help to shed light on how eukaryotic metabolic pathways and the
associated regulatory networks have evolved. Moreover, it can enable the development of new
yeast species as cell factories for conversion of lactose in the abundant industrial side stream
cheese whey into a range of different products [7].
Lactose is a disaccharide composed of D-glucose and D-galactose connected through a
β-1,4-glycosidic linkage. Its assimilation starts with the hydrolysis of lactose into its monosaccharides
through the action of a lactase – normally an enzyme with β-galactosidase activity. Lactases
are found in several different enzyme families and can act intracellularly or extracellularly.
In Kluyveromyces yeasts, lactose is transported across the plasma membrane by a
LAC12-encoded lactose permease and is subsequently hydrolyzed intracellularly by a LAC4-encoded
β-galactosidase [6]. In contrast, the yeasts Moesziomyces aphidis and M. antarcticus seem to show
β-galactosidase activity both intra- and extracellularly, whereafter glucose and galactose are
imported into the cell [8]. For most other lactose-growing yeasts, comparative genomics and
growth characterization are still needed to determine their lactose uptake and hydrolysis
mechanisms.
In Kluyveromyces (and likely most other lactose-assimilating yeasts), lactose-derived glucose
and galactose moieties are further catabolized through glycolysis and the Leloir pathway,
respectively. The Leloir pathway is carried out by Gal1, Gal7 and Gal10, and starts by
conversion of β-D-galactose into α-D-galactose by the mutarotase domain of Gal10
(aldose-1-epimerase). Gal1 (galactokinase) then phosphorylates α-D-galactose into
α-D-galactose-1-phosphate, whereafter Gal7 (galactose-1-phosphate uridylyltransferase) transfers
a uridylyl (UMP) group from UDP-α-D-glucose to α-D-galactose-1-phosphate, yielding
UDP-α-D-galactose and α-D-glucose-1-phosphate [9]. The epimerase (UDP-galactose-4-epimerase)
domain of Gal10 catalyzes the final step, where UDP-α-D-galactose is converted to
UDP-α-D-glucose [10-12]. In parallel to the
Leloir pathway, some filamentous fungi such as Trichoderma reesei and Aspergillus nidulans
have an alternative galactose catabolic pathway called the oxidoreductive pathway, where
galactose is first converted into galactitol through the action of an aldose reductase [11,12]. A
third galactose catabolic pathway, the DeLey-Doudoroff pathway, has also been described in some
detail [12]. To the best of our knowledge, (ga)lactose-growing yeasts described to date exclusively
use the Leloir pathway, although some reports on galactose-to-galactitol conversion in
Rhodosporidium toruloides and Metschnikowia pulcherrima exist [13-15]. Moreover, 12 out of 332
ascomycetous yeasts have been shown to grow on galactitol [1], indicating that they might possess
an oxidoreductive pathway to catabolize this carbon source.
Comparative genomic studies have revealed that the GAL1, GAL7 and GAL10 genes are often found
located together in a “GAL cluster” in the genomes of yeast and filamentous fungi [16], and
the LAC4 and LAC12 genes likewise form a “LAC cluster” in, for example, K. marxianus and K. lactis [6,16].
Such metabolic gene clusters, identified both in filamentous fungi and yeasts, are particularly
prevalent for pathways involved in sugar and nutrient acquisition, synthesis of vitamins and
secondary metabolites [17]. Some clusters, including the GAL cluster, are conserved over a wide
range of species whereas other clusters appear unique to one or a few species [16,18,19]. Like
bacterial operons, the eukaryotic cluster genes are co-regulated in response to environmental
changes, and clusters sometimes even encode their own transcriptional activators [17]. Clustering
of genes under a common control mechanism allows the microorganism to rapidly adapt to
environmental cues, which can be advantageous to avoid deleterious recombination events and
high concentrations of local protein products. For example, co-regulation of the GAL genes is
necessary to avoid accumulation of the toxic intermediate galactose-1-phosphate in the Leloir
pathway [16,20]. Gene clusters can also propagate together by horizontal transfers to other species,
which is less likely to occur for non-clustered genes [21]. In fact, selective pressures in
lactose-rich environments in dairy farms led to the formation of an efficient lactose utilization system
by rearrangement and horizontal gene transfer (HGT) of the LAC cluster genes in
Kluyveromyces dairy strains [6].
Regulation of galactose metabolism (and lactose where applicable) has been carefully
characterized in yeasts such as S. cerevisiae, K. lactis and Candida albicans [22-25], displaying
both similarities and differences among species. In S. cerevisiae, three regulatory proteins
(ScGal4, ScGal80, ScGal3) are responsible for galactose regulation. In the absence of galactose,
the transcriptional activation domain of ScGal4 is bound to the inhibitor ScGal80. In the
presence of galactose, ScGal3 relieves ScGal4 from ScGal80 in a galactose- and ATP-dependent
manner, resulting in the induction of the GAL structural genes. As in S. cerevisiae,
the K. lactis GAL regulatory system relies on relieving KlLac9 (ortholog of ScGal4) from Gal80
inhibition. However, K. lactis lacks Gal3 and instead uses a bifunctional galactokinase KlGal1
to induce both galactose and lactose genes [26]. There are four KlLac9 binding sites in the LAC
cluster gene promoters, which indicates tight coregulation of lactose and galactose
metabolism in this yeast [27]. Similar to K. lactis, C. albicans lacks Gal3 but possesses a Gal1 with
both enzymatic and regulatory functions, but in this yeast the GAL gene expression is controlled
by transcription factors Rtg1/Rtg3 [28] and/or CaRep1/CaCga1 [29] rather than CaGal4, which
instead is responsible for expression of genes involved in glucose metabolism [22]. Such
transcriptional rewiring is common among yeasts, which calls for coupling of comparative
genomics with detailed mutant phenotyping and transcriptional analysis to decipher how
regulation occurs in individual species.
While (ga)lactose metabolism in S. cerevisiae and K. lactis has long served as a model system
for understanding the function, evolution and regulation of eukaryotic metabolic pathways, the
corresponding knowledge regarding non-conventional yeasts is scarce. One such
non-conventional yeast is Candida intermedia, a haploid yeast belonging to the Metschnikowia
family in the CUG-Ser1 clade, which can grow on a wide range of different carbon sources [1].
C. intermedia has previously received attention as a fast-growing yeast on xylose. The xylose
transporters and xylose reductases responsible for C. intermedia’s xylose-fermentative capacity
have been characterized in several studies [30-35]. C. intermedia is one of very few yeasts in the
Metschnikowia family that can grow on lactose [1], and it has been used for cheese whey
bioremediation in the past [36]. Our previous work characterizing the in-house isolated
C. intermedia strain CBS 141442 in terms of genomics, transcriptomics and physiology [33,37,38] and
the development of a genome editing toolbox for this species [39] provide a stable platform for
exploration of the genetic determinants of lactose metabolism in this yeast.
In this study, we show that C. intermedia possesses a unique ‘GALLAC’ cluster, in addition to
the conserved GAL and LAC clusters, that is essential for growth on lactose and highly
important for growth on galactose. Characterization of the individual genes within the GALLAC
cluster revealed differentiation in their functionality, enabling the yeast to regulate the
expression of galactose and lactose genes differently. This cluster represents a new, interesting
example of metabolic network rewiring in yeast, and likely helps to explain how C. intermedia
has evolved into an efficient lactose-assimilating yeast.
Results
C. intermedia is among the top five lactose-growers out of 332 sequenced ascomycetous yeasts
As a start, we wanted to assess the capacity of C. intermedia to grow on lactose compared to
other yeasts. We cultured 24 of the 332 ascomycetous species that have scored positive for
lactose growth [1], as well as C. intermedia strains CBS 572 (type strain), CBS 141442 and PYCC
4715 (previously characterized for utilization of xylose) [1,34]. The yeast species displayed
different growth patterns in lag phase, doubling time and final biomass (Figure 1, Figure S1).
When ranked by doubling time, shortest first, K. lactis and K. marxianus were the fastest
growers on lactose, closely followed by C. intermedia strains PYCC 4715 and CBS 141442,
Debaryomyces subglobosus and Blastobotrys muscicola (Figure 1, Figure S1). Other species
such as Kluyveromyces aestuarii, Millerozyma acaciae and Lipomyces mesembrius showed poor
or no growth under the conditions tested while others had very long lag phases. Thus, under the
assessed conditions, our results establish Candida intermedia as one of the top five fastest
lactose-growing species within this subset of ascomycetous yeasts [1].
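As a minimal sketch of how a doubling-time ranking of this kind can be derived, the following Python snippet fits a straight line to ln(OD) against time over the exponential phase and converts the slope (the specific growth rate μ) into a doubling time, t_d = ln 2 / μ. The function names, the strain labels and the synthetic readings are assumptions made for illustration; the fitting procedure actually used for Figure 1 is not described in this section.

```python
import math


def doubling_time(times_h, od_values):
    """Return the doubling time (h) from exponential-phase OD readings.

    Fits ln(OD) = ln(OD0) + mu * t by ordinary least squares and
    returns ln(2) / mu.
    """
    if len(times_h) != len(od_values) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD readings")
    if any(od <= 0 for od in od_values):
        raise ValueError("OD readings must be positive")
    log_od = [math.log(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_od))
    mu = sxy / sxx  # specific growth rate, 1/h
    if mu <= 0:
        raise ValueError("no growth in the selected window")
    return math.log(2) / mu


def rank_by_doubling_time(curves):
    """Sort strain labels from shortest to longest doubling time."""
    return sorted(curves, key=lambda name: doubling_time(*curves[name]))


# Synthetic readings (not data from this study): OD doubles every 2 h or 4 h.
hours = [0, 1, 2, 3, 4]
curves = {
    "strain_A": (hours, [0.1 * 2 ** (t / 2) for t in hours]),
    "strain_B": (hours, [0.1 * 2 ** (t / 4) for t in hours]),
}
print(rank_by_doubling_time(curves))
```

In practice the result depends strongly on which time window is treated as exponential, so lag phase and stationary phase readings should be excluded before fitting.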
Genomic and transcriptomic analyses identify three gene clusters involved in lactose and
galactose assimilation.
To identify the genetic determinants for lactose metabolism in C. intermedia CBS 141442, we
searched the genome for orthologs of known genes involved in the uptake and conversion of
lactose, and its tightly coupled hydrolysis-product, galactose. We found several genes encoding
expected transcription factor orthologs including LAC9, GAL4, RTG1, RTG3, REP1 and CGA1
that have been associated with lactose and galactose metabolism in K. lactis [40], S. cerevisiae [41]
and C. albicans [28,29]. In accordance with previous reports for yeasts belonging to the genus
Candida [10], we did not find orthologs of GAL80, strongly suggesting that C. intermedia does
not possess the Gal3-Gal80-Gal4 regulon.
Moreover, the genome of C. intermedia contains the conserved GAL cluster including the GAL1, GAL7 and GAL10
genes as well as an ORF-X gene encoding a putative glucose-4,6-dehydratase, similar to GAL
clusters in Candida/Schizosaccharomyces strains [10,16] (Figure 2). We also identified the
conserved LAC cluster containing the β-galactosidase gene LAC4 and lactose permease gene
LAC12 [3,4], which correlates well with C. intermedia predominantly displaying intracellular
β-galactosidase activity (data not shown) [6]. To our surprise, C. intermedia also possesses a third
cluster, hereafter referred to as the GALLAC cluster, containing a putative transcriptional
regulator gene LAC9 (LAC9_2) next to a second copy of the GAL1 gene (GAL1_2), followed
by one of the three xylose/aldose reductase genes (XYL1_2) previously characterized in
C. intermedia [37] and lastly, a second copy of GAL10 (GAL10_2). Interestingly, the GAL10_2 gene
is shorter than GAL10 in the GAL cluster and seems to encode only the epimerase domain,
similar to GAL10 orthologs in Schizosaccharomyces species and filamentous fungi [10].
Next, we performed transcriptome analysis using RNA-sequencing (RNA-seq) technology on
the CBS 141442 strain cultivated in media containing 2% of either lactose, galactose, or glucose
(Figure 2). All genes in the LAC and GAL clusters were among the most highly upregulated genes
in both galactose and lactose as compared to glucose conditions. Also, the genes in the GALLAC
cluster were highly upregulated on both of these carbon sources with respect to glucose, with
the exception of the constitutively expressed LAC9_2 gene (Figure S2), indicating that the novel
cluster might play an important role in galactose and lactose metabolism in this
non-conventional yeast.
The GALLAC cluster is essential for growth on lactose and unique to C. intermedia.
To decipher the importance of the three clusters for (ga)lactose metabolism in C. intermedia,
we deleted the clusters one by one using the split-marker technique previously developed for
this yeast [39]. The cluster deletion mutants (lacΔ, galΔ and gallacΔ) grew almost as well as the
wild-type strain (WT) in minimal media containing glucose (Figure 3A). As expected, galΔ
failed to grow on galactose, which can be attributed to the complete shut-down of the Leloir
pathway, whereas the lacΔ grew like WT. Interestingly, no growth was observed for the gallacΔ
in galactose during the first 90 h, whereafter it slowly started to grow (Figure 3B). With lactose
as carbon source, both lacΔ and gallacΔ completely failed to grow, whereas galΔ started to
grow slowly after approx. 100 h (Figure 3C). Thus, our results show that the GALLAC cluster
is essential for growth on lactose and highly important for growth on galactose.
To the best of our knowledge, the existence of a GALLAC-like cluster and its interdependence
with the GAL and LAC clusters has never previously been reported. This, along with the severe
growth defects of gallacΔ, encouraged us to determine the origin and prevalence of the cluster
in other yeasts. First, we performed a comparative genomic analysis among the dataset of 332
genome-sequenced ascomycetous yeasts [1]. Although GAL1 and GAL10 were found clustered
together as parts of the conserved GAL clusters in 150/332 species [16], C. intermedia was the
only species where these genes also clustered with LAC9 and XYL1 genes (Figure 3D). Next,
to decipher the evolutionary events that led to the formation of the GALLAC cluster, we
generated phylogenetic trees for each individual gene product of the cluster. Our analysis
revealed that although the amino acid identities between the paralogs in C. intermedia are
relatively low (56% for Gal1 and Gal1_2, 72% for Gal10 and Gal10_2, 49% for Lac9_2 and
Lac9 and 66% and 62% for Xyl1_2 compared to Xyl1 and Xyl1_3, respectively), the identities
between the paralogs are still higher than for most orthologs in other species (Figure 3B, Figure
S3-6). Combined, these results strongly suggest that the unique GALLAC cluster has evolved
within C. intermedia through gene duplication and divergence.
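Percent identities such as those quoted for the paralog pairs are computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over the aligned columns in which neither sequence has a gap. That choice of denominator, the function name and the toy sequences are assumptions; the alignment tool and convention used for the reported values are not given in this section.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns without a gap in either row."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment, not real Gal1 or Gal1_2 sequence.
toy_a = "MKT-LV"
toy_b = "MRTALV"
print(percent_identity(toy_a, toy_b))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for gappy alignments, so identities from different sources are only comparable when the convention matches.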
Deletion of individual genes in the GAL and GALLAC clusters reveals importance of Lac9_2
and Gal1_2 for (ga)lactose metabolism.
To elucidate the physiological function of genes situated in the GALLAC cluster and to better
understand the interdependence between the clusters, we deleted individual genes in both the
GALLAC and GAL clusters. The mutant phenotypes were compared with WT and complete
cluster deletions regarding growth, consumption of sugars and production of metabolites in
defined media containing either 2% galactose or lactose.
With galactose as carbon source, deletion of LAC9_2 located in the GALLAC cluster resulted
in an extended lag phase accompanied by galactitol production, indicating that this putative
transcription factor is involved in regulation of galactose metabolism. However, deletion of the
other genes in the GALLAC cluster did not result in severe growth defects (Figure S7). For
mutants deleted of individual GAL cluster genes, we saw the expected severe growth defects
for gal1Δ, gal7Δ and gal10Δ (Figure 4A). However, gal10Δ repeatedly displayed some growth
after a very long lag phase of approx. 250 h (Figure S8), which could suggest that Gal10_2
from the GALLAC cluster can partly complement the deletion of GAL10 from the GAL cluster.
With lactose as carbon source, lac9_2Δ displayed a delay in the onset of growth as was observed
for galactose while gal10_2Δ and xyl1_2Δ grew like WT. However, in contrast to the galactose
case, deletion of GAL1_2 abolished growth and resembled the deletion of the whole GALLAC
cluster, indicating an important function for this protein in lactose metabolism and a clear
phenotypic difference between the two carbon sources. In contrast, deletion of GAL1 from
the GAL cluster did not fully abolish growth on lactose, but growth was slower and
accompanied by accumulation of galactitol (73% of theoretical yield), suggesting that most
of the lactose-derived galactose is catabolized through the action of an aldose reductase (such
as Xyl1_2), rather than through the putative galactokinase Gal1_2 in this mutant. Also, gal10Δ
grew slowly but with no measurable accumulation of galactose or galactitol, again showing that
the GAL10_2 in the GALLAC cluster can partly complement this deletion. Deletion of the only
copy of the GAL7 gene, encoding galactose-1-phosphate uridylyltransferase, resulted in
complete growth inhibition on lactose (as for galactose), and we speculate that the severe
growth phenotype is due to the accumulation of the toxic intermediate galactose-1-phosphate as
seen in S. cerevisiae in previous studies [20].
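The galactitol figure of 73% of theoretical yield implies a stoichiometric ceiling. A minimal sketch of that arithmetic follows, assuming the ceiling is one mole of galactitol per mole of lactose consumed (all lactose-derived galactose reduced to galactitol, with the glucose moiety used for growth). The basis the authors used for their calculation is not stated in this section, and the concentrations in the example are made-up numbers rather than measurements.

```python
# Molar masses in g/mol (anhydrous lactose C12H22O11, galactitol C6H14O6).
LACTOSE_G_PER_MOL = 342.30
GALACTITOL_G_PER_MOL = 182.17


def percent_of_theoretical(galactitol_g_per_l, lactose_consumed_g_per_l):
    """Galactitol yield as a percentage of the assumed 1 mol/mol ceiling."""
    if lactose_consumed_g_per_l <= 0:
        raise ValueError("lactose consumed must be positive")
    # Maximum mass yield under the 1:1 molar assumption, about 0.53 g/g.
    max_g_per_g = GALACTITOL_G_PER_MOL / LACTOSE_G_PER_MOL
    observed_g_per_g = galactitol_g_per_l / lactose_consumed_g_per_l
    return 100.0 * observed_g_per_g / max_g_per_g


# Hypothetical example: 5 g/L galactitol formed from 20 g/L lactose consumed.
print(round(percent_of_theoretical(5.0, 20.0), 1))
```

Under this assumption, a culture that consumed all of a 2% (20 g/L) lactose supply could form at most about 10.6 g/L galactitol.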
As no single deletion resembled the growth defect seen for gallacΔ on galactose, we
hypothesized that two or more genes must be deleted for the same phenotype to appear. We
therefore deleted both LAC9_2 and GAL1_2, which resulted in a growth defect strikingly
similar to that of the complete GALLAC cluster mutant (Figure S9). Overall, we can conclude
that Lac9_2 and Gal1_2 have important functions during galactose and lactose growth, although
there seem to be significant differences between the two carbon sources.
Lac9 binding motifs are found in promoters in the GALLAC cluster but not in the GAL and LAC
clusters.
To better understand the putative role of Lac9_2 as a transcriptional regulator, we performed
Multiple Em for Motif Elicitation [42] (MEME; Version 5.5.43) analysis to identify conserved
transcription factor binding motifs in gene promoters in the three clusters. The analysis revealed
Lac9 (Gal4) binding motifs (p-value = 8.66×10⁻³) in the promoters of GAL1_2, XYL1_2 and
GAL10_2 in the GALLAC cluster, but not in the promoters in the GAL and LAC clusters (Figure
5A, Figure S10). These results confirm the recently published bioinformatic analysis of the 332
ascomycetous yeasts, showing that C. intermedia and many other CTG clade yeasts lack
Lac9/Gal4 binding sites in their GAL clusters [16]. Although additional analysis would be needed
to better understand the transcriptional regulation exerted by Lac9_2, it is likely that it directly
binds the promoters of genes within the GALLAC cluster.
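MEME discovers motifs de novo, and that procedure is not reproduced here. As a much simpler stand-in, the sketch below scans a promoter sequence for the CGG-N11-CCG arrangement described for binding sites of ScGal4 and its ortholog KlLac9. Whether Lac9_2 in C. intermedia recognizes exactly this pattern is an assumption, and the function name and promoter fragment are invented for illustration.

```python
import re

# CGG, any 11 bases, CCG. The lookahead lets overlapping sites be reported.
GAL4_LIKE_SITE = re.compile(r"(?=(CGG[ACGT]{11}CCG))")


def find_gal4_like_sites(promoter):
    """Return (0-based start, matched 17-mer) for every CGG-N11-CCG site."""
    sequence = promoter.upper()
    return [(m.start(), m.group(1)) for m in GAL4_LIKE_SITE.finditer(sequence)]


# Invented promoter fragment containing a single site.
toy_promoter = "ttat" + "CGG" + "ATATATATATA" + "CCG" + "aaa"
for start, site in find_gal4_like_sites(toy_promoter):
    print(start, site)
```

Because CGG-N11-CCG is its own reverse complement at the fixed positions, scanning one strand is enough to find sites on both. A strict pattern match like this neither scores degenerate sites nor reports statistical significance, which is what a tool such as MEME adds.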
Besides LAC9_2 in the GALLAC cluster, our comparative genomics analysis also identified a
second, non-clustered LAC9 gene (Figure 3) as well as a GAL4 gene. All three proteins have
predicted Gal4-like DNA-binding domains, but they differ substantially in protein sequence
identity (45% for Lac9_2 and Lac9, and 18% and 19% for Gal4 compared to Lac9 and Lac9_2,
respectively). As deletion mutants of LAC9 and GAL4 did not display growth defects on lactose
or galactose (Figure S11), we conclude that they are not important transcriptional regulators
for (ga)lactose metabolism in C. intermedia.
Gal1_2 is required for the induction of LAC cluster genes in C. intermedia
Our deletion mutant phenotyping results suggest that Gal1 and Gal1_2 have at least partly
different physiological functions in C. intermedia (Figure 4). As both genes are highly
upregulated on both galactose and lactose in the WT strain (Figure 2), we speculated that they
must differ in their activities as galactokinases or regulators. To test this, we expressed each
protein in S. cerevisiae BY4741 gal1Δ, which in both cases rescued the mutant’s growth defect
on galactose (Figure 6A). This experiment demonstrates that both proteins have galactokinase
activity, at least when expressed in S. cerevisiae. We also compared the predicted structures of
Gal1 and Gal1_2 using AlphaFold2 [43,44], observing that even though the amino acid sequence
identity between the two proteins is as low as 56%, the protein structures are very similar to
each other (RMSD 0.490 Å; Figure 6B) as well as to the experimentally solved structure of
ScGal1 [42] (RMSD 0.778 and 0.758 Å for Gal1 and Gal1_2, respectively). Additionally, we
observed that the amino acids interacting with galactose in ScGal1 (PDB ID: 2aj4) are identical
to those in the CiGal1 proteins, apart from Asn213 in ScGal1 (Asn205 in CiGal1), which
interacts with the O2 hydroxyl group and is replaced by a serine residue (Ser199) in CiGal1_2.
The active site clefts of all three enzymes are only big enough to accommodate monosaccharides like
galactose. Thus, it is highly unlikely that they bind to other, larger substrates such as lactose
(Figure 6B). In S. cerevisiae, the regulator ScGal3 is similar in structure to the galactokinase
ScGal1 but has lost its galactokinase activity due to the absence of two amino acids (a
Ser-Ala dipeptide) that are present in ScGal1 [45]. However, no such structural changes were
seen for CiGal1 or CiGal1_2 that could help us to predict regulatory functions.
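The RMSD values quoted for the structure comparisons summarize how far paired atoms lie apart once two structures have been superimposed. The sketch below covers only that final step, assuming the atom coordinates (for example Cα atoms) are already paired and superimposed, for instance by a Kabsch fit that is not shown. The function name and the coordinates are invented; the software used for the reported values is not named in this section.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same unit as input) over paired 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Invented coordinates in angstrom: every atom is displaced by 1 A along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model_a, model_b))
```

Sub-angstrom values like those reported are only meaningful relative to the set of atoms that was paired, since many tools prune poorly matching atoms before reporting RMSD.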
Instead, we examined the role of Gal1 and Gal1_2 as regulators of lactose metabolism by
performing β-galactosidase assays with C. intermedia gal1Δ, gal1_2Δ as well as WT (positive
control) and lacΔ (negative control). Our RNA-seq data showed that in WT, LAC4 is expressed
during growth on both galactose and lactose (Figure 2). Thus, we assessed the
lactase activity during growth on both these sugars to include at least one condition where all
strains could grow. For both galactose and lactose, lactase activity was readily detected in WT
and gal1Δ cells but close to zero in the lacΔ and gal1_2Δ mutants (Figure 6C), showing that
Gal1_2 is essential to induce lactase activity. Moreover, qPCR analysis of WT and gal1_2Δ
showed that LAC4 expression was diminished in gal1_2Δ as compared to the WT, indicating
that regulation is exerted on the transcriptional level. In the same mutant we also observed that
GAL1 was still expressed (Figure 6D), supporting the growth phenotyping results where we saw
a clear difference in growth on galactose (+) and lactose (-) for this single mutant. Overall, these
results firmly establish a difference in function between Gal1 and Gal1_2, where lack of Gal1_2
diminishes lactase transcription and activity while lack of Gal1 does not, and further indicate important
differences in regulation of lactose and galactose metabolism and growth.
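This section does not say how the qPCR signal was normalized. One common choice is the 2^-ΔΔCt method, sketched below under that assumption: the target gene is first normalized to a reference gene within each strain, and the mutant is then compared with WT. The function name, the use of a single reference gene and all Ct values are hypothetical.

```python
def relative_expression(ct_target_mutant, ct_ref_mutant,
                        ct_target_wt, ct_ref_wt):
    """Fold change of the target gene in the mutant relative to WT (2^-ddCt)."""
    delta_ct_mutant = ct_target_mutant - ct_ref_mutant  # normalize to reference
    delta_ct_wt = ct_target_wt - ct_ref_wt
    delta_delta_ct = delta_ct_mutant - delta_ct_wt
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 5 cycles later in the
# mutant, while the reference gene is unchanged.
fold = relative_expression(ct_target_mutant=25.0, ct_ref_mutant=18.0,
                           ct_target_wt=20.0, ct_ref_wt=18.0)
print(fold)
```

The method assumes roughly 100% amplification efficiency for both primer pairs; when efficiencies differ, an efficiency-corrected model is the usual alternative.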
Discussion
In this work we have investigated how (ga)lactose is metabolized in the non-conventional yeast
C. intermedia and shed light on the genetic determinants behind this trait. Interestingly, we
found that the genome of C. intermedia contains not only the conserved GAL and LAC clusters,
but also a unique GALLAC cluster that has evolved through gene duplication and divergence.
By combining results from comparative genomics, transcriptomics analysis, deletion mutant
phenotyping and metabolite profiling, we have started to unravel parts of the regulatory
networks and interdependence of the three clusters and can show that the GALLAC cluster plays
a vital role in both galactose and lactose metabolism in this yeast. With the Leloir pathway of
budding yeasts acting like a model system for understanding the function, evolution and
regulation of eukaryotic metabolic pathways, this work adds interesting new pieces to the
puzzle.
Our results show that C. intermedia grows relatively fast on lactose, and strains of this species
have been isolated several times from lactose-rich niches including fermentation products like
white-brined cheese [46] and cheese whey [47]. In these lactose-rich environments, survival likely
necessitates a genetic makeup that can help outcompete rivaling microorganisms. Does the
GALLAC cluster facilitate the fast lactose growth observed for C. intermedia, and if so, how?
This is currently unresolved, but the genes within the cluster and the mutant phenotyping results
provide some clues. First, the GALLAC cluster seems to have important regulatory functions,
which can help to fine-tune metabolic fluxes and growth. We demonstrate that the
cluster-encoded transcription factor Lac9_2 is important for onset of (ga)lactose growth, as deletion of
LAC9_2 leads to increased lag phase on both carbon sources. However, as lac9_2Δ cells
eventually grow, Lac9_2 cannot be solely responsible for expression of the metabolic genes.
Moreover, Lac9 binding motifs were only found in the promoters of GALLAC genes, suggesting
that other transcriptional activators are responsible for induction of the GAL and LAC cluster
genes.
In addition to Lac9_2, Gal1_2 from the GALLAC cluster also seems to be an important regulator
of (ga)lactose growth. The bioinformatic analysis strongly suggests that GAL1_2 in C.
intermedia formed through gene duplication and divergence from the GAL1 gene in the GAL
cluster. Our results also show that Gal1_2 is essential for LAC4 transcription and, by extension,
lactase activity and lactose growth, whereas deletion of GAL1_2 alone did not abolish GAL1
expression and galactose growth. Combined, these results indicate that the original Gal1 has
maintained its function as the main galactokinase, while Gal1_2 has taken on the role of a
regulator. This evolutionary trajectory mirrors the path taken by Gal1 and Gal3 in S.
cerevisiae45, but with a crucial distinction: the Gal1 proteins in C. intermedia have evolved in
response to both lactose and galactose. On galactose, an additional deletion of LAC9_2 was
needed to impair growth, suggesting that the yeast senses and regulates expression of the
galactose and lactose genes somewhat differently. Since Gal1_2 has no DNA-binding
capacity, we hypothesize that Gal1_2 binds galactose and thereafter activates unknown
transcription factor(s) that ultimately bind and induce expression from the LAC and GAL
clusters. Although many details are still to be elucidated, it is clear that C. intermedia has
developed a way of regulating its (ga)lactose metabolism that differs from that of other yeast species
studied to date, including the Gal3-Gal80-Gal4 regulon in S. cerevisiae48, the Gal1-Gal80-Lac9
equivalent in K. lactis40 and the Rep1-Cga1 regulatory complex in C. albicans29. Future research
will include identifying these unknown TFs and fully elucidating the roles of Lac9_2 and
Gal1_2 in sensing, signaling, and regulating the cellular response to changes in the nutritional
environment.
Another interesting feature of the GALLAC cluster is the XYL1_2 gene encoding an aldose
reductase. Although no galactitol or other intermediates of an oxidoreductive pathway
accumulate in the WT under the growth conditions assessed, several of the constructed mutants
(in particular, galΔ and gal1Δ) accumulate galactitol upon growth on lactose. In S. cerevisiae,
galactitol functions as an overflow metabolite, ensuring that cells avoid accumulation of
galactose-1-phosphate, a known toxic intermediate of the Leloir pathway15,20, and it
is reasonable to assume that the same is true for C. intermedia. Moreover, it is interesting to
note that aldose reductases can directly convert β-D-galactose, the hydrolysis product of lactose,
whereas galactokinase requires conversion of β-D-galactose into α-D-galactose before it can be
metabolized via the Leloir pathway. We speculate that induction of an aldose reductase gene in
tandem with the LAC and GAL genes in response to lactose (and galactose) can be an efficient
way to quickly metabolize these sugars, providing a growth advantage in competitive lactose-rich
environments.
In addition to the basic scientific questions that can be answered by studying evolution and
sugar metabolism in lactose-growing yeast species, these yeasts can also be used as cell
factories in industrial biotechnology processes. Here, a better understanding of the genetics
underlying this trait enables metabolic engineering to optimize the conversion of lactose-rich
whey into value-added products. The dairy yeasts K. lactis and K. marxianus have been
developed and used for whey-based production of ethanol49, recombinant proteins50 and
bulk chemicals such as ethyl acetate51, while exploration of new lactose-metabolizing yeasts
allows for additional product diversification. With lactose as substrate, a carbon-partition
strategy can be used for bioproduction, where the glucose moiety is converted into energy and
yeast biomass and the galactose moiety is steered into production of the desired metabolite, or
vice versa13. Through this strategy, the non-conventional yeast C. intermedia can also be
explored to produce various growth-coupled metabolites, including galactitol and derivatives
thereof.
In conclusion, our work on the non-conventional, lactose-metabolizing yeast C. intermedia has
paved the way towards a better understanding of (ga)lactose metabolism in this relatively
under-studied species. To the best of our knowledge, we show for the first time that gene
duplication and divergence resulted in the formation of a unique GALLAC cluster with an
essential role in (ga)lactose metabolism in this yeast, providing new insights into how organisms
can evolve metabolic pathways and regulatory networks. In addition, the proven ability of C.
intermedia to grow relatively well on lactose establishes this yeast as an interesting
lactose-assimilating species also for future industrial applications.
Materials and Methods
Culture conditions and molecular techniques
For amplification of plasmids, E. coli was grown on LB medium (1% tryptone, 1% NaCl and
0.5% yeast extract) containing ampicillin (100 μg/mL) for plasmid selection.
C. intermedia CBS 141442 was grown in YPD medium (1% yeast extract, 2% bactopeptone
and 2% glucose) prior to yeast transformation using the split marker technique as described
previously39. Using this technique, deletion cassettes were constructed as two partially
overlapping fragments, each containing half of the selection marker fused to either upstream or
downstream sequences of the target gene. Deletion fragments were transformed using
electroporation (BioRad Micropulse electroporator). After transformation, cells were plated on
YPD agar containing 200 μg/mL nourseothricin to select for integration and expression of the
CaNAT1 selection marker.
Colony PCR was used to identify transformants with correct gene deletions: single
colonies were resuspended in 50 μL dH2O using a sterile toothpick and then heated to 90 °C for
10 min. After cooling to 12 °C, 2 μL of each suspension was used as a template for PCR using
PHIRE II polymerase (Thermo-Fisher Scientific, USA). For each mutant, three PCR primers
were used, where the first primer was designed to hybridize to the genome outside the flanking
region, the second to the marker gene and the third to the targeted gene (negative control). For
each gene deletion, three correctly targeted transformants were selected for subsequent
phenotyping.
To construct the double gene deletion mutant (lac9_2, the split marker method was used twice 406
in the same strain background, first employing the split CaNAT1 selection marker as described 407
above, and then a split KanMX selection marker PCR amplified from the plasmid 408
pTO149_RFP_CauNEO developed for Candida auris52. Correctly assembled and genome 409
integrated KanMX markers gave rise to C. intermedia transformants resistant to the antibiotic 410
Geneticin (200 μg/mL). 411
For complementation tests in S. cerevisiae, C. intermedia GAL1 and GAL1_2 genes were
synthesized and cloned in a vector backbone (pESC-URA; GenScript Biotech, New Jersey,
USA). CTG codons were changed to an alternative codon prior to optimization of the complete gene
for expression in S. cerevisiae using the GenSmart™ Codon Optimization tool (GenScript
Biotech, New Jersey, USA). S. cerevisiae BY4741/2 GAL1 knockouts used for the
complementation experiment were grown on YP media with 2% glucose and transformed with
the above-mentioned plasmids using the LiAc/PEG heat-shock method53. Transformants were selected
on agar plates with YNB -uracil and 2% glucose, restreaked and then tested for growth in liquid
YNB -URA media with 2% galactose in a GrowthProfiler at 30 °C and 250 rpm. S. cerevisiae
BY4741/2 gal1∆ transformed with p426 (empty vector with URA3 as selection marker) was
used as negative control.
Growth Experiments 423
Growth Profiler 424
To follow growth over time for C. intermedia CBS 141442 and the other yeasts characterized 425
in this work, strains were precultured at 30 °C, 180 rpm overnight in synthetic defined minimal 426
Verduyn media54 containing 2% glucose (w/v). Precultured cells were then inoculated in 427
250 µL minimal media supplemented with 20 g/L carbon source to a starting OD600 = 0.1. All 428
yeast strains were grown in biological triplicates in a 96-well plate setup in a GrowthProfiler 429
960 (Enzyscreen, Netherlands). ‘Green Values’ (GV) measured by the GrowthProfiler 430
correspond to growth based on pixel counts, and GV changes were recorded every 30 min for 431
72 h at 30 °C and 150 rpm. 432
Cell growth quantifier (CGQ) 433
Growth characterization was also performed in shake flasks using Cell Growth Quantifier 434
(CGQ-Scientific Bioprocessing, Germany) 55. Wild type and mutant strains were precultured at 435
30 °C, 200 rpm overnight in synthetic defined minimal Verduyn media containing 2% glucose
(w/v), followed by inoculation of 25 mL of minimal medium supplemented with 2% carbon
source (galactose or lactose) in 100 mL shake flasks to a starting OD600 = 0.1. Growth was
quantified as “Scatter values” by the CGQ system56. Scatter values were recorded for 10 days
at 30 °C and 200 rpm for each strain grown in biological triplicates, and samples were
taken for sugar and polyol analysis.
Lactase activity assay
β-galactosidase activity was determined using the Yeast β-Galactosidase Assay Kit (Thermo-Fisher
Scientific, USA) following the manufacturer’s instructions. Cells were harvested at
different timepoints during growth and tested for lactase activity. A working solution was
prepared by mixing equal amounts of 2X β-galactosidase Assay Buffer (containing
ortho-nitrophenyl-β-galactoside (ONPG)) and Yeast Protein Extraction Reagent. The reaction was
initiated by mixing 100 μL of working solution with 100 μL of cell culture, and the mixture was
incubated for 30 min at 37 °C in a thermomixer. After 30 min, the cell mix was centrifuged at
5000 rpm for 3 min and the supernatant was analyzed for lactase activity by measuring
o-nitrophenol release from ONPG at 420 nm in a microplate reader (FLUOstar Omega; BMG
LabTech, Ortenberg, Germany).
Determination of sugar and polyol concentrations 453
Sugars and galactitol concentrations were measured using a Dionex high-performance liquid 454
chromatography (HPLC) system equipped with an RID-10A refractive index detector and an 455
Aminex HPX-87H carbohydrate analysis column (Bio-Rad Laboratories). Analysis was 456
performed with the column at 80 °C, and 5 mM H2SO4 as mobile phase at a constant flow rate 457
of 0.8 mL/min. Culture samples were pelleted prior to analysis, following which, the 458
supernatant was passed through a 0.22 μm polyether sulfone syringe filter. Chromatogram 459
peaks were identified and integrated using the Chromeleon v6.8 (Dionex) software and 460
quantified against prepared analytical standards. 461
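Quantification against analytical standards can be illustrated with a minimal sketch. It assumes a linear detector response and an ordinary least-squares calibration line fitted per compound; the function names and the example values are hypothetical and are not taken from the Chromeleon workflow used in this work.

```python
def fit_calibration(concentrations, peak_areas):
    """Fit peak_area = slope * concentration + intercept by least squares."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(peak_area, slope, intercept):
    """Convert an integrated peak area into a concentration (same unit as the standards)."""
    return (peak_area - intercept) / slope


# Hypothetical galactitol standards (g/L) and their integrated peak areas
slope, intercept = fit_calibration([1.0, 2.0, 4.0], [10.0, 20.0, 40.0])
sample_concentration = quantify(30.0, slope, intercept)
```

Samples whose peak areas fall outside the range of the standards would normally be diluted and re-run rather than extrapolated.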
Comparative genomics and evolutionary mapping
We established a BLAST database for 332 yeast species based on the work of Shen et al. (2018)1.
We then used tblastn to obtain gene hits for each gene in the three clusters against the 332 yeast
species. Based on the generated data, we further mapped gene hits from species to clade levels.
To investigate the evolution of genes in the GALLAC cluster, a comprehensive pipeline based
on the work of Goncalves and colleagues was developed57. For each candidate gene in the
GALLAC cluster, BLASTP was run against the NCBI non-redundant (nr) protein sequence
database and homologs were selected according to the top 300 BLAST hits to each query
sequence. These homologs were aligned with MAFFT v7.31058 using default settings for
multiple sequence alignment. Poorly aligned regions were removed with trimAl59 using the
‘-automated1’ option. Subsequently, phylogenetic trees were built using IQ-TREE v1.6.1260
with 1000 ultrafast bootstrap replicates61. Each tree was rooted at the midpoint using a
customized script combining R packages ape v5.4-1 and phangorn v2.5.5. Finally,
the resulting phylogenies were visualized using iTOL v562.
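The alignment, trimming and tree-building steps described above could be chained roughly as in the sketch below. This is an illustration only, not the authors' pipeline: the file names and the helper functions are hypothetical, the executables are assumed to be on the PATH as `mafft`, `trimal` and `iqtree`, and the substitution model is left at the IQ-TREE default because the text does not specify one. The BLASTP search, midpoint rooting in R and iTOL visualization are not covered.

```python
import subprocess


def phylogeny_commands(homologs_fasta, prefix):
    """Return (argv, stdout_path) pairs for alignment, trimming and tree inference."""
    aligned = f"{prefix}.aln.fasta"
    trimmed = f"{prefix}.trim.fasta"
    return [
        # MAFFT with default settings writes the alignment to standard output
        (["mafft", homologs_fasta], aligned),
        # trimAl removes poorly aligned regions using the -automated1 heuristic
        (["trimal", "-in", aligned, "-out", trimmed, "-automated1"], None),
        # IQ-TREE 1.6.x: -bb 1000 requests 1000 ultrafast bootstrap replicates
        (["iqtree", "-s", trimmed, "-bb", "1000"], None),
    ]


def run_pipeline(steps):
    """Run each step in order, redirecting stdout to a file where requested."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as handle:
                subprocess.run(argv, check=True, stdout=handle)


# Hypothetical input: FASTA file with the top BLASTP hits for one GALLAC gene
steps = phylogeny_commands("gal1_2_homologs.fasta", "gal1_2")
```

Calling `run_pipeline(steps)` would execute the three tools in sequence; building the command list separately makes it easy to inspect or log the exact invocations first.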
Transcription factor binding motif analysis
To determine the binding motifs of transcription factors in promoter regions of the GAL, LAC
and GALLAC clusters, MEME (Version 5.5.43) promoter binding motif analysis was used.
Promoter regions of all genes from the three clusters were added as query sequences with the
following constraints: maximum number of motifs = 5, maximum length of motif = 25 bases,
any number of motif repetitions (-anr), background model = 0-order model of sequences.
Motif(s) derived from this analysis were then fed as input to Tomtom63 (version 5.5.4) to
compare against the Yeastract64 database.
RNA sequencing 484
Transcriptomics using RNA sequencing was performed as previously described37. In brief, C. 485
intermedia CBS 141442 was grown in controlled stirred 1-L bioreactor vessels (DASGIP, 486
Eppendorf, Hamburg, Germany) containing 500 mL synthetic defined minimal Verduyn media 487
with 2% Glucose, Galactose or Lactose. Reactor conditions were maintained as: Temp = 30 °C; 488
pH = 5.5 (maintained with 2M Potassium Hydroxide); Aeration = 1 Vessel Volume per Minute; 489
stirring = 300 rpm. 490
RNA extraction 491
For RNA extraction, samples (10 mL) were collected when the dissolved oxygen of the culture 492
was 35–40% (v/v). After washing the cells, the pellets were immediately frozen using liquid 493
nitrogen. Frozen pellets were stored at -80 °C until extraction. The frozen pellets were thawed 494
in 500 μL of TRIzol (Ambion—Foster City, CA, USA) and thoroughly resuspended. Then, 495
cells were lysed in 2 mL tubes with Lysing Matrix C (MP Biomedical, Santa Ana, CA, USA) 496
in a FastPrep FP120 (Savant, Carlsbad, CA, USA) for five cycles, at intensity 5.5 for 30 s. 497
Tubes were cooled on ice for a minute between cycles and resuspended once again in 500 μL 498
of TRIzol and vortexed thoroughly. After incubation at room temperature for 5 min, tubes were 499
centrifuged for 10 min at 12,000 rpm and 4 °C. Chloroform was added to the collected 500
supernatants (200 μL of chloroform per mL of supernatant) and vortexed vigorously for 30 s. 501
After centrifugation for 15 min at 12,000 rpm, 4 °C, the top clear aqueous phase was collected 502
and transferred to a new RNase-free tube, to which, equal amount of absolute ethanol was 503
slowly added while mixing. Each sample was loaded into a RNeasy column (RNeasy Mini Kit, 504
Qiagen—Hilden, Germany) and further steps followed the protocol of the manufacturer. The 505
RNA was eluted with RNase-free water and samples were stored at -80 °C until use. 506
Data analysis
RNA samples were analyzed in a TapeStation (Agilent, Santa Clara, CA, USA), and only
samples with an RNA integrity number above 8 were used for library preparation. Sequencing
using the HiSeq 2500 system (Illumina Inc., San Diego, CA, USA), with paired-end 125 bp
read length and v4 sequencing chemistry, was followed by quality control of read data using
the software FastQC version 0.11.565. The software STAR version 2.5.2b66 was used to map reads to
the reference genome. Gene counts were normalized with the weighted trimmed mean of M-values
using the calcNormFactors function from the package edgeR67, and the limma package68 was used
to transform the data and make them suitable for linear modelling. The estimated p-values were corrected
for multiple testing with the Benjamini-Hochberg procedure, and genes were considered
significant if the adjusted p-values were lower than 0.05. The raw counts were filtered such that
genes with CPM > 3.84 in at least 12% (5/43) of the samples were retained. The R function
‘varianceStabilizingTransformation()’ from the R package ‘DESeq2’69 was used to convert raw
counts to variance-stabilized counts (VST). Expression data for C. intermedia on galactose and
lactose were normalized using glucose as the control condition. The RNA-seq datasets are available
in the European Nucleotide Archive (ENA) with the accession number E-MTAB-6670.
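The filtering rule and the multiple-testing correction described above can be sketched in plain Python. This is a simplified illustration, not the R code used in this work: CPM is computed here from raw library sizes, whereas edgeR would use TMM-scaled effective library sizes, and the function names and toy inputs are hypothetical.

```python
def counts_per_million(counts):
    """counts: dict mapping gene -> list of raw counts, one entry per sample."""
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(row[j] for row in counts.values())
                     for j in range(n_samples)]
    return {gene: [1e6 * c / library_sizes[j] for j, c in enumerate(row)]
            for gene, row in counts.items()}


def filter_genes(counts, min_cpm=3.84, min_samples=5):
    """Keep genes with CPM > min_cpm in at least min_samples samples (5 of 43 in the text)."""
    cpm = counts_per_million(counts)
    return [gene for gene, row in cpm.items()
            if sum(value > min_cpm for value in row) >= min_samples]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Genes are called significant when the adjusted p-value is below 0.05
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p < 0.05 for p in adjusted]
```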
Gene expression analysis using qPCR
Primers used for mRNA quantification using qPCR are listed in Table S1. Primers were
designed using Primer3 (https://primer3.ut.ee/) and were checked for efficiency. Only primers
with an efficiency between 90% and 110% were used for qPCR. Cultures were grown at 30 °C and
200 rpm in 100 mL shake flasks containing 25 mL synthetic defined minimal Verduyn media
with either 2% glucose (control), galactose or lactose as carbon source. Cells were
harvested for each strain at lag, early log and late log phases, taking three biological replicates.
Harvested cells were pelleted by centrifugation at 4 °C for 5 min at 5000 rpm and washed
twice by resuspending in ice-cold sterile dH2O and centrifugation. Cell pellets were snap-frozen
using liquid nitrogen and stored at -80 °C for cDNA synthesis. RNA extraction was
performed as described for RNA sequencing above. cDNA synthesis and RT-qPCR analysis
were performed using the Maxima H Minus First Strand cDNA Synthesis Kit (Thermo Fisher) and
Maxima SYBR Green/Fluorescein qPCR Master Mix (2X) (Thermo Fisher), according to the
manufacturer's instructions. Fold change was calculated using the delta-delta Ct method (2^(-∆∆Ct))
with expression values in glucose as the control condition and CiACT1 as the reference gene for
normalization.
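As a worked illustration of the delta-delta Ct calculation, the sketch below computes the fold change of a target gene relative to the glucose control, normalized to the reference gene. It assumes an amplification efficiency of 100% (a doubling per cycle), which is an approximation given the 90-110% efficiency window stated above; the function name and the Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^(-ddCt) method, assuming 100% primer efficiency."""
    # dCt in the test condition (galactose or lactose): target minus reference gene
    delta_ct_test = ct_target_test - ct_ref_test
    # dCt in the control condition (glucose)
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    # ddCt compares the test condition with the control condition
    delta_delta_ct = delta_ct_test - delta_ct_ctrl
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: target gene on lactose vs glucose, reference gene CiACT1
fold_change = fold_change_ddct(
    ct_target_test=20.0, ct_ref_test=18.0,
    ct_target_ctrl=25.0, ct_ref_ctrl=18.0,
)
```

With these example values, the target reaches threshold five cycles earlier on lactose than on glucose after normalization, which corresponds to a 32-fold induction.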
Acknowledgments
The authors would like to thank Peter Dahl from the Department of Chemistry & Molecular Biology,
Gothenburg University, for providing S. cerevisiae BY4741 and BY4742 knockout strains and
Dr. Xiang Jiao from the Department of Life Sciences, Chalmers University of Technology, for
providing the plasmid p426. The authors would also like to thank the ARS culture collection (NRRL)
for providing different lactose-metabolizing strains upon request.
Funding 546
This research was funded by Formas, grant number 2017-01417. The AlphaFold2 structure 547
predictions were enabled by resources provided by the National Academic Infrastructure for 548
Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council 549
through grant agreement no. 2022-06725. 550
Author contributions 551
Conceptualization: K.V.R.P. and C.G.; methodology: K.V.R.P., L.Y., K.P., F.F.O., and C.G.; 552
investigation: K.V.R.P., L.Y., K.P., F.F.O., J.L. and C.G.; original manuscript draft preparation: 553
K.V.R.P. and C.G.; and manuscript review and editing: all. 554
Availability of data
The RNA-Seq datasets are available in the European Nucleotide Archive (ENA) with the
accession number E-MTAB-6670.
References 559
1. Shen, X.X., Opulente, D.A., Kominek, J., Zhou, X., Steenwyk, J.L., Buh, K.V., Haase, 560
M.A.B., Wisecaver, J.H., Wang, M., Doering, D.T., et al. (2018). Tempo and Mode of 561
Genome Evolution in the Budding Yeast Subphylum. Cell 175, 1533-1545 e1520. 562
10.1016/j.cell.2018.10.023. 563
2. Capuco, A.V., and Akers, R.M. (2009). The origin and evolution of lactation. J Biol 8, 564
37. 10.1186/jbiol139. 565
3. Schaffrath, R., and Breunig, K.D. (2000). Genetics and molecular physiology of the 566
yeast Kluyveromyces lactis. Fungal Genet Biol 30, 173-190. 10.1006/fgbi.2000.1221. 567
4. Godecke, A., Zachariae, W., Arvanitidis, A., and Breunig, K.D. (1991). Coregulation 568
of the Kluyveromyces lactis lactose permease and beta-galactosidase genes is achieved 569
by interaction of multiple LAC9 binding sites in a 2.6 kbp divergent promoter. Nucleic 570
Acids Res 19, 5351-5358. 571
5. Lane, M.M., Burke, N., Karreman, R., Wolfe, K.H., O’Byrne, C.P., and Morrissey, J.P. 572
(2011). Physiological and metabolic diversity in the yeast Kluyveromyces marxianus. 573
Antonie van Leeuwenhoek 100, 507-519. 10.1007/s10482-011-9606-x. 574
6. Varela, J.A., Puricelli, M., Ortiz-Merino, R.A., Giacomobono, R., Braun-Galleani, S., 575
Wolfe, K.H., and Morrissey, J.P. (2019). Origin of Lactose Fermentation in 576
Kluyveromyces lactis by Interspecies Transfer of a Neo-functionalized Gene Cluster 577
during Domestication. Curr Biol 29, 4284-4290 e4282. 10.1016/j.cub.2019.10.044. 578
7. Marcus, J.F., DeMarsh, T.A., and Alcaine, S.D. (2021). Upcycling of Whey Permeate 579
through Yeast- and Mold-Driven Fermentations under Anoxic and Oxic Conditions. 580
Fermentation 7, 16. 581
8. Nascimento, M.F., Barreiros, R., Oliveira, A.C., Ferreira, F.C., and Faria, N.T. (2022). 582
Moesziomyces spp. cultivation using cheese whey: new yeast extract-free media, β-583
galactosidase biosynthesis and mannosylerythritol lipids production. Biomass 584
Conversion and Biorefinery. 10.1007/s13399-022-02837-y. 585
9. Thoden, J.B., and Holden, H.M. (2007). The molecular architecture of glucose-1-586
phosphate uridylyltransferase. Protein Sci 16, 432-440. 10.1110/ps.062626007. 587
10. Slot, J.C., and Rokas, A. (2010). Multiple GAL pathway gene clusters evolved 588
independently and by different mechanisms in fungi. Proc Natl Acad Sci U S A 107, 589
10136-10141. 10.1073/pnas.0914418107. 590
11. Mojzita, D., Herold, S., Metz, B., Seiboth, B., and Richard, P. (2012). l-xylo-3-Hexulose 591
Reductase Is the Missing Link in the Oxidoreductive Pathway for d-Galactose 592
Catabolism in Filamentous Fungi*. Journal of Biological Chemistry 287, 26010-26018. 593
https://doi.org/10.1074/jbc.M112.372755. 594
12. Gruben, B.S., Zhou, M., and de Vries, R.P. (2012). GalX regulates the D-galactose 595
oxido-reductive pathway in Aspergillus niger. FEBS Lett 586, 3980-3985. 596
10.1016/j.febslet.2012.09.029. 597
13. Liu, J.J., Zhang, G.C., Kwak, S., Oh, E.J., Yun, E.J., Chomvong, K., Cate, J.H.D., and 598
Jin, Y.S. (2019). Overcoming the thermodynamic equilibrium of an isomerization 599
reaction through oxidoreductive reactions for biotransformation. Nat Commun 10, 600
1356. 10.1038/s41467-019-09288-6. 601
14. Zhang, G., Zabed, H.M., An, Y., Yun, J., Huang, J., Zhang, Y., Li, X., Wang, J., 602
Ravikumar, Y., and Qi, X. (2022). Biocatalytic conversion of a lactose-rich dairy waste 603
into D-tagatose, D-arabitol and galactitol using sequential whole cell and fermentation 604
technologies. Bioresource Technology 358, 127422. 605
https://doi.org/10.1016/j.biortech.2022.127422. 606
15. Jagtap, S.S., Bedekar, A.A., Liu, J.J., Jin, Y.S., and Rao, C.V. (2019). Production of 607
galactitol from galactose by the oleaginous yeast Rhodosporidium toruloides IFO0880. 608
Biotechnol Biofuels 12, 250. 10.1186/s13068-019-1586-5. 609
16. Harrison, M.C., LaBella, A.L., Hittinger, C.T., and Rokas, A. (2022). The evolution of 610
the GALactose utilization pathway in budding yeasts. Trends Genet 38, 97-106. 611
10.1016/j.tig.2021.08.013. 612
17. Rokas, A., Wisecaver, J.H., and Lind, A.L. (2018). The birth, evolution and death of 613
metabolic gene clusters in fungi. Nature Reviews Microbiology 16, 731-744. 614
10.1038/s41579-018-0075-3. 615
18. Wong, S., and Wolfe, K.H. (2005). Birth of a metabolic gene cluster in yeast by adaptive 616
gene relocation. Nat Genet 37, 777-782. 10.1038/ng1584. 617
19. Krause, D.J., Kominek, J., Opulente, D.A., Shen, X.-X., Zhou, X., Langdon, Q.K., 618
DeVirgilio, J., Hulfachor, A.B., Kurtzman, C.P., Rokas, A., and Hittinger, C.T. (2018). 619
Functional and evolutionary characterization of a secondary metabolite gene cluster in 620
budding yeasts. Proceedings of the National Academy of Sciences 115, 11030-11035. 621
doi:10.1073/pnas.1806268115. 622
20. de Jongh, W.A., Bro, C., Ostergaard, S., Regenberg, B., Olsson, L., and Nielsen, J. 623
(2008). The roles of galactitol, galactose-1-phosphate, and phosphoglucomutase in 624
galactose-induced toxicity in Saccharomyces cerevisiae. Biotechnol Bioeng 101, 317-625
326. 10.1002/bit.21890. 626
21. Lawrence, J. (1999). Selfish operons: the evolutionary impact of gene clustering in 627
prokaryotes and eukaryotes. Current Opinion in Genetics & Development 9, 642-648. 628
https://doi.org/10.1016/S0959-437X(99)00025-8. 629
22. Martchenko, M., Levitin, A., Hogues, H., Nantel, A., and Whiteway, M. (2007). 630
Transcriptional rewiring of fungal galactose-metabolism circuitry. Curr Biol 17, 1007-631
1013. 10.1016/j.cub.2007.05.017. 632
23. Van Ende, M., Wijnants, S., and Van Dijck, P. (2019). Sugar Sensing and Signaling in 633
Candida albicans and Candida glabrata. Front Microbiol 10, 99. 634
10.3389/fmicb.2019.00099. 635
24. Peng, G., and Hopper, J.E. (2002). Gene activation by interaction of an inhibitor with a 636
cytoplasmic signaling protein. Proceedings of the National Academy of Sciences 99, 637
8548-8553. 10.1073/pnas.142100099. 638
25. Bhat, P.J., and Murthy, T.V. (2001). Transcriptional control of the GAL/MEL regulon 639
of yeast Saccharomyces cerevisiae: mechanism of galactose-mediated signal 640
transduction. Mol Microbiol 40, 1059-1066. 10.1046/j.1365-2958.2001.02421.x. 641
26. Meyer, J., Walker-Jonah, A., and Hollenberg, C.P. (1991). Galactokinase encoded by 642
GAL1 is a bifunctional protein required for induction of the GAL genes in 643
Kluyveromyces lactis and is able to suppress the gal3 phenotype in Saccharomyces 644
cerevisiae. Molecular and Cellular Biology 11, 5454-5461. 645
doi:10.1128/mcb.11.11.5454-5461.1991. 646
27. Halvorsen, Y.C., Nandabalan, K., and Dickson, R.C. (1990). LAC9 DNA-binding 647
domain coordinates two zinc atoms per monomer and contacts DNA as a dimer. Journal 648
of Biological Chemistry 265, 13283-13289. https://doi.org/10.1016/S0021-649
9258(19)38296-1. 650
28. Dalal, C.K., Zuleta, I.A., Mitchell, K.F., Andes, D.R., El-Samad, H., and Johnson, A.D. 651
(2016). Transcriptional rewiring over evolutionary timescales changes quantitative and 652
qualitative properties of gene expression. Elife 5. 10.7554/eLife.18981. 653
29. Sun, X., Yu, J., Zhu, C., Mo, X., Sun, Q., Yang, D., Su, C., and Lu, Y. (2023). 654
Recognition of galactose by a scaffold protein recruits a transcriptional activator for the 655
GAL regulon induction in Candida albicans. Elife 12. 10.7554/eLife.84155. 656
30. Wu, J., Hu, J., Zhao, S., He, M., Hu, G., Ge, X., and Peng, N. (2018). Single-cell Protein 657
and Xylitol Production by a Novel Yeast Strain Candida intermedia FL023 from 658
Lignocellulosic Hydrolysates and Xylose. Appl Biochem Biotechnol 185, 163-178. 659
10.1007/s12010-017-2644-8. 660
31. Gardonyi, M., Osterberg, M., Rodrigues, C., Spencer-Martins, I., and Hahn-Hagerdal, 661
B. (2003). High capacity xylose transport in Candida intermedia PYCC 4715. FEMS 662
Yeast Res 3, 45-52. 10.1111/j.1567-1364.2003.tb00137.x. 663
32. Fonseca, C., Olofsson, K., Ferreira, C., Runquist, D., Fonseca, L.L., Hahn-Hagerdal, B., 664
and Liden, G. (2011). The glucose/xylose facilitator Gxf1 from Candida intermedia 665
expressed in a xylose-fermenting industrial strain of Saccharomyces cerevisiae 666
increases xylose uptake in SSCF of wheat straw. Enzyme Microb Technol 48, 518-525.
10.1016/j.enzmictec.2011.02.010.
33. Moreno, A.D., Carbone, A., Pavone, R., Olsson, L., and Geijer, C. (2019). Evolutionary
engineered Candida intermedia exhibits improved xylose utilization and robustness to
lignocellulose-derived inhibitors and ethanol. Appl Microbiol Biotechnol 103, 1405-1416.
10.1007/s00253-018-9528-x.
34. Mayr, P., Bruggler, K., Kulbe, K.D., and Nidetzky, B. (2000). D-Xylose metabolism by
Candida intermedia: isolation and characterisation of two forms of aldose reductase
with different coenzyme specificities. J Chromatogr B Biomed Sci Appl 737, 195-202.
10.1016/s0378-4347(99)00380-1.
35. Nidetzky, B., Bruggler, K., Kratzer, R., and Mayr, P. (2003). Multiple forms of xylose
reductase in Candida intermedia: comparison of their functional properties using
quantitative structure-activity relationships, steady-state kinetic analysis, and pH
studies. J Agric Food Chem 51, 7930-7935. 10.1021/jf034426j.
36. Yonten, V., and Aktas, N. (2014). Exploring the optimum conditions for maximizing
the microbial growth of Candida intermedia by response surface methodology. Prep
Biochem Biotechnol 44, 26-39. 10.1080/10826068.2013.782044.
37. Geijer, C., Faria-Oliveira, F., Moreno, A.D., Stenberg, S., Mazurkewich, S., and Olsson,
L. (2020). Genomic and transcriptomic analysis of Candida intermedia reveals the
genetic determinants for its xylose-converting capacity. Biotechnol Biofuels 13, 48.
10.1186/s13068-020-1663-9.
38. Moreno, A.D., Tellgren-Roth, C., Soler, L., Dainat, J., Olsson, L., and Geijer, C. (2017).
Complete Genome Sequences of the Xylose-Fermenting Candida intermedia Strains
CBS 141442 and PYCC 4715. Genome Announc 5. 10.1128/genomeA.00138-17.
39. Peri, K.V.R., Faria-Oliveira, F., Larsson, A., Plovie, A., Papon, N., and Geijer, C.
(2023). Split-marker-mediated genome editing improves homologous recombination
frequency in the CTG clade yeast Candida intermedia. FEMS Yeast Res 23.
10.1093/femsyr/foad016.
40. Wray, L.V., Jr., Witte, M.M., Dickson, R.C., and Riley, M.I. (1987). Characterization
of a positive regulatory gene, LAC9, that controls induction of the lactose-galactose
regulon of Kluyveromyces lactis: structural and functional relationships to GAL4 of
Saccharomyces cerevisiae. Mol Cell Biol 7, 1111-1121. 10.1128/mcb.7.3.1111-1121.1987.
41. Douglas, H.C., and Hawthorne, D.C. (1964). Enzymatic expression and
genetic linkage of genes controlling galactose utilization
in Saccharomyces. Genetics 49, 837-844. 10.1093/genetics/49.5.837.
42. Bailey, T.L., Johnson, J., Grant, C.E., and Noble, W.S. (2015). The MEME Suite.
Nucleic Acids Research 43, W39-W49. 10.1093/nar/gkv416.
43. Thoden, J.B., Sellick, C.A., Timson, D.J., Reece, R.J., and Holden, H.M. (2005).
Molecular structure of Saccharomyces cerevisiae Gal1p, a bifunctional galactokinase
and transcriptional inducer. J Biol Chem 280, 36905-36911. 10.1074/jbc.M508446200.
44. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O.,
Tunyasuvunakool, K., Bates, R., Zidek, A., Potapenko, A., et al. (2021). Highly accurate
protein structure prediction with AlphaFold. Nature 596, 583-589.
10.1038/s41586-021-03819-2.
45. Hittinger, C.T., and Carroll, S.B. (2007). Gene duplication and the adaptive evolution
of a classic genetic switch. Nature 449, 677-681. 10.1038/nature06151.
46. Geronikou, A., Larsen, N., Lillevang, S.K., and Jespersen, L. (2022). Occurrence and
Identification of Yeasts in Production of White-Brined Cheese. Microorganisms 10,
1079.
47. Tanji, M., Namimatsu, K., Kinoshita, M., Motoshima, H., Oda, Y., and
Ohnishi, M. (2004). Content and Chemical Compositions of Cerebrosides in
Lactose-assimilating Yeasts. Bioscience, Biotechnology, and Biochemistry 68, 2205-2208.
10.1271/bbb.68.2205.
48. Yano, K., and Fukasawa, T. (1997). Galactose-dependent reversible interaction of
Gal3p with Gal80p in the induction pathway of Gal4p-activated genes of
Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 94, 1721-1726.
10.1073/pnas.94.5.1721.
49. Tesfaw, A., Oner, E.T., and Assefa, F. (2021). Evaluating crude whey for bioethanol
production using non-Saccharomyces yeast, Kluyveromyces marxianus. SN Applied
Sciences 3, 42. 10.1007/s42452-020-03996-1.
50. Maullu, C., Lampis, G., Desogus, A., Ingianni, A., Rossolini, G.M., and Pompei, R.
(1999). High-level production of heterologous protein by engineered yeasts grown in
cottage cheese whey. Appl Environ Microbiol 65, 2745-2747.
10.1128/aem.65.6.2745-2747.1999.
51. Urit, T., Stukert, A., Bley, T., and Löser, C. (2012). Formation of ethyl acetate by
Kluyveromyces marxianus on whey during aerobic batch cultivation at specific trace
element limitation. Appl Microbiol Biotechnol 96, 1313-1323.
10.1007/s00253-012-4107-z.
52. Santana, D.J., and O'Meara, T.R. (2021). Forward and reverse genetic dissection of
morphogenesis identifies filament-competent Candida auris strains. Nat Commun 12,
7197. 10.1038/s41467-021-27545-5.
53. Gietz, R.D., and Schiestl, R.H. (2007). High-efficiency yeast transformation using the
LiAc/SS carrier DNA/PEG method. Nature Protocols 2, 31-34. 10.1038/nprot.2007.13.
54. Verduyn, C., Postma, E., Scheffers, W.A., and Van Dijken, J.P. (1992). Effect of
benzoic acid on metabolic fluxes in yeasts: a continuous-culture study on the regulation
of respiration and alcoholic fermentation. Yeast 8, 501-517. 10.1002/yea.320080703.
55. Bruder, S., Reifenrath, M., Thomik, T., Boles, E., and Herzog, K. (2016). Parallelised
online biomass monitoring in shake flasks enables efficient strain and carbon source
dependent growth characterisation of Saccharomyces cerevisiae. Microb Cell Fact 15,
127. 10.1186/s12934-016-0526-3.
56. Bruder, S., Reifenrath, M., Thomik, T., Boles, E., and Herzog, K. (2016). Parallelised
online biomass monitoring in shake flasks enables efficient strain and carbon source
dependent growth characterisation of Saccharomyces cerevisiae. Microbial Cell
Factories 15, 127. 10.1186/s12934-016-0526-3.
57. Goncalves, C., Wisecaver, J.H., Kominek, J., Oom, M.S., Leandro, M.J., Shen, X.X.,
Opulente, D.A., Zhou, X., Peris, D., Kurtzman, C.P., et al. (2018). Evidence for loss and
reacquisition of alcoholic fermentation in a fructophilic yeast lineage. Elife 7.
10.7554/eLife.33034.
58. Katoh, K., and Standley, D.M. (2013). MAFFT multiple sequence alignment software
version 7: improvements in performance and usability. Mol Biol Evol 30, 772-780.
10.1093/molbev/mst010.
59. Capella-Gutiérrez, S., Silla-Martínez, J.M., and Gabaldón, T. (2009). trimAl: a tool for
automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25,
1972-1973. 10.1093/bioinformatics/btp348.
60. Nguyen, L.T., Schmidt, H.A., von Haeseler, A., and Minh, B.Q. (2015). IQ-TREE: a
fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies.
Mol Biol Evol 32, 268-274. 10.1093/molbev/msu300.
61. Minh, B.Q., Nguyen, M.A., and von Haeseler, A. (2013). Ultrafast approximation for
phylogenetic bootstrap. Mol Biol Evol 30, 1188-1195. 10.1093/molbev/mst024.
62. Letunic, I., and Bork, P. (2019). Interactive Tree Of Life (iTOL) v4: recent updates and
new developments. Nucleic Acids Res 47, W256-W259. 10.1093/nar/gkz239.
63. Gupta, S., Stamatoyannopoulos, J.A., Bailey, T.L., and Noble, W.S. (2007).
Quantifying similarity between motifs. Genome Biology 8, R24.
10.1186/gb-2007-8-2-r24.
64. Teixeira, M.C., Viana, R., Palma, M., Oliveira, J., Galocha, M., Mota, M.N., Couceiro,
D., Pereira, M.G., Antunes, M., Costa, I.V., et al. (2022). YEASTRACT+: a portal for
the exploitation of global transcription regulation and metabolic model data in yeast
biotechnology and pathogenesis. Nucleic Acids Research 51, D785-D791.
10.1093/nar/gkac1041.
65. Andrews, S. (2010). FastQC: a quality control tool for high throughput sequence data.
Babraham Bioinformatics, Babraham Institute, Cambridge, United Kingdom.
66. Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P.,
Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner.
Bioinformatics 29, 15-21.
67. Robinson, M.D., McCarthy, D.J., and Smyth, G.K. (2010). edgeR: a Bioconductor
package for differential expression analysis of digital gene expression data.
Bioinformatics 26, 139-140.
68. Smyth, G.K. (2005). Limma: linear models for microarray data. In Bioinformatics and
computational biology solutions using R and Bioconductor, (Springer), pp. 397-420.
69. Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and
dispersion for RNA-seq data with DESeq2. Genome Biology 15, 1-21.
Figures and Figure legends:
Figure 1: Candida intermedia is one of the top five fastest lactose-growing yeast species. A)
Representative growth profiles of 10 of the 24 lactose-growing yeast species, including three different
C. intermedia strains. The graphs depict data obtained from the GrowthProfiler in 96-well format,
represented as mean ± standard deviation (shaded region) for biological triplicates. Biomass is shown on
the y-axis in green values (G.V., corresponding to growth based on pixel counts, as determined by a
GrowthProfiler instrument) and is plotted against time (h) on the x-axis. B) Heat map showing doubling
time (h), lag phase duration (h) and final biomass (green values, G.V.) measured for all the tested strains
in minimal media containing lactose as the sole carbon source, plotted as the average of three biological
replicates. Strains are ranked by doubling time, from low to high.
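The doubling times reported in panel B can be related to the growth curves in panel A through the specific growth rate. The sketch below is only an illustration under stated assumptions, not the authors' pipeline: it assumes that the readings are taken from the exponential phase, that the signal is proportional to biomass, and that a least-squares fit of ln(signal) against time is an acceptable estimate of the growth rate. The function name and the example readings are hypothetical.

```python
import math

def doubling_time(times_h, signal):
    """Estimate doubling time (h) as ln(2) / slope of ln(signal) versus time.

    Assumes all points lie in the exponential growth phase (hypothetical helper).
    """
    logs = [math.log(s) for s in signal]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    # Ordinary least-squares slope = specific growth rate (1/h)
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    mu = sxy / sxx
    return math.log(2) / mu

# Hypothetical readings constructed to double every 2 h
times = [0, 1, 2, 3, 4]
signal = [1.0 * 2 ** (t / 2) for t in times]
td = doubling_time(times, signal)
```

Ranking strains by the returned value, from low to high, mirrors the ordering used in the heat map.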
Figure 2: Genomic and transcriptomic analyses identified three gene clusters involved in lactose
and galactose assimilation: A schematic representation of the lactose and galactose metabolic pathways
and results of the RNAseq data analysis, showing expression of the genes (as present in clusters)
upregulated in galactose or lactose compared to glucose. Lactose uptake into the cell is
enabled by the LAC12_3-encoded lactose permease, followed by hydrolysis to glucose (blue circle: Glc) and
galactose (yellow circle: Gal) by the LAC4-encoded β-galactosidase. Glucose is further
metabolized via glycolysis. Galactose is metabolized via the Leloir pathway, encoded by three clustered
genes, GAL1 (galactokinase), GAL7 (galactose-1-phosphate-uridylyltransferase) and GAL10
(mutarotase and UDP-glucose-4-epimerase). The enzymatic functions of the genes are depicted by
dotted lines based on genome sequence data for C. intermedia CBS141442. The legend shows the log2 fold
change, with the carbon sources tested represented as Glc for 2% glucose, Gal for 2% galactose and Lac for
2% lactose containing media. Log2 fold changes in gene expression are calculated relative to glucose as the control.
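As a reading aid for the legend scale, the following minimal sketch shows how a log2 fold change relative to the glucose control can be computed for one gene. It is an assumption-laden illustration, not the differential expression workflow used for the figure: the function name, the pseudocount and the expression values are all hypothetical.

```python
import math

def log2_fold_change(expr, control="Glc", pseudocount=1.0):
    """Return log2((value + pseudocount) / (control + pseudocount)) per condition.

    The pseudocount (an assumption) avoids division by zero for unexpressed genes.
    """
    base = expr[control] + pseudocount
    return {cond: math.log2((value + pseudocount) / base)
            for cond, value in expr.items()}

# Hypothetical normalized expression values for a single gene
example = {"Glc": 15.0, "Gal": 127.0, "Lac": 63.0}
lfc = log2_fold_change(example)
```

With these made-up values the gene is 8-fold higher on galactose and 4-fold higher on lactose than on glucose, that is, log2 fold changes of 3 and 2, while the control itself is 0 by construction.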
Figure 3: The GALLAC cluster is essential for growth on lactose and galactose and is unique to C.
intermedia: Cluster deletion mutants of C. intermedia were characterized by growth on glucose (A),
galactose (B) and lactose (C) in the GrowthProfiler. The graphs show the wild-type strain (black), LAC
cluster deletion mutant (light green), GAL cluster deletion mutant (dark green) and GALLAC cluster deletion
mutant (purple), with biomass as green values (G.V., corresponding to growth based on
pixel counts, as determined by a GrowthProfiler instrument) on the y-axis against time (h) on the x-axis.
Data are represented as mean ± standard deviation (shaded region) for biological triplicates.
D) Graphical representation of the genomic location of the cluster and of the individual genes
that are paralogs of the GALLAC cluster genes, with their protein identity according to comparative
genomics analysis. Arrows depict assumed duplication events, which remain unclear.
Figure 4: Deletion of individual genes in the GAL and GALLAC clusters reveals the importance of Lac9
and Gal1_2 for (ga)lactose metabolism: Growth and metabolite profiles for deletion mutants of
individual genes in the GAL and GALLAC clusters of C. intermedia, in both galactose (top two rows) and
lactose (bottom two rows) containing media. Graphs represent biomass (filled circle; dark green for
galactose, purple for lactose) on the right y-axis, and consumption of the respective sugars (filled
triangle for galactose in g/L or filled square for lactose in g/L) and metabolite production (open circle
for galactitol in g/L) on the left y-axis (labelled saccharides, g/L), plotted against time (h) on the
x-axis. Data are represented as mean ± standard deviation (shaded region for biomass and bars for sugars
and metabolites) for biological triplicates.
Figure 5: Lac9 binding motifs are found in promoters in the GALLAC cluster: A) Graphical
representation of the results of transcription factor binding motif analysis for promoters of individual
genes of the GALLAC cluster, using MEME (version 5.5.43). The GALLAC gene cluster is shown with the
locations of the three statistically significant binding motifs found in the promoters of the cluster
genes. B) Consensus of the binding motif with the lowest E-value for the overall match of the motif in
the input sequences, shown together with the Gal4p consensus sequence and its associated p-value. C) List
of the three (statistically significant) motifs found in the promoters of GALLAC cluster genes and the
transcription factors associated with these motifs, derived from the Yeastract database using Tomtom.
Figure 6: Characterization of C. intermedia’s Gal1 and Gal1_2 proteins reveals important
functional differences: A) Results of complementation of the S. cerevisiae (BY4741) gal1∆ mutant by
heterologous expression of codon-optimized CiGAL1 and CiGAL1_2. Growth profiles are depicted for
Scgal1∆ (dark purple), Scgal1∆ with plasmid p426 containing the URA3 marker (light purple), Scgal1∆ with
pCiGAL1_2 containing the URA3 marker and codon-optimized CiGAL1_2 (dark green) and Scgal1∆ with
pCiGAL1 containing the URA3 marker and codon-optimized CiGAL1. Biomass (green values, G.V.) on the
y-axis is plotted against time (h) on the x-axis. Data are represented as mean ± standard deviation
(shaded region for biomass) for biological triplicates. B) Structure of ScGal1 (grey) in complex with
AMPPNP and α-galactose next to the superimposed AlphaFold2-predicted structures of Gal1 (cyan) and
Gal1_2 (blue) in the same orientation, showing their high structural similarity. C) β-galactosidase assay
on galactose- and lactose-grown cultures of wild type, lac∆, gal1∆ and gal1_2∆ strains of C. intermedia.
Graphs show lactase activity (OD420) on the left y-axis and biomass (OD600) on the right y-axis, plotted
against time (h) on the x-axis. D) Quantitative PCR results for LAC4 and GAL1 gene
expression in C. intermedia wild type and gal1_2∆ grown in glucose or lactose. Samples were taken
during different growth phases (on glucose: lag = 5 h, early log = 10 h, late log = 20 h; on lactose:
lag = 5 h, early log = 24 h, late log = 44 h). Data are represented as mean ± standard deviation (error bars)
for biological and technical triplicates.
Figure 7: Graphic representation of regulatory mechanisms in C. intermedia and other yeast
species: A) Depiction of lactose (green box) and galactose (light yellow box) metabolism in C.
intermedia with the regulation of the GALLAC cluster by the transcription factor CiLac9. On galactose,
Lac9 and Gal1_2 interact directly or indirectly, resulting in the regulation of GAL cluster gene(s) and
thus affecting C. intermedia’s growth. On lactose, our results show that Gal1_2 from the GALLAC cluster
regulates the LAC cluster at the transcriptional level. We speculate that this effect is indirect,
since the predicted structure of Gal1_2 suggests an inability to bind DNA or protein. The graphical
representation also illustrates the overflow metabolism in C. intermedia caused by aldose
reductase-mediated conversion of galactose to galactitol. B) Regulation of galactose metabolism in
S. cerevisiae by the Gal3-Gal80-Gal4 system, where galactose and ATP induce Gal3 to bind Gal80, resulting
in the activation of Gal4. Gal4 then induces the structural GAL genes. C) Regulation of (ga)lactose genes
in K. lactis is mediated by the bi-functional KlGal1. The ScGal3 homolog in K. lactis (KlGal1) is induced
by galactose (or galactose derived from lactose), resulting in sequestering of Gal80 and relieving the
Gal4 homolog, Lac9, which in turn activates the interconnected galactose and lactose metabolic genes in
this yeast. D) Graphic representation of the Rep1- and Cga1-mediated galactose regulatory system in C.
albicans. Galactose physically binds to Rep1, resulting in recruitment of Cga1, and the complex
ultimately induces the structural genes responsible for galactose metabolism in this yeast.
Supplementary figures and figure legends
Figure S1: Growth curves for 24 lactose-growing species from the work of Shen et al. 1. The graphs
depict data obtained from the GrowthProfiler in 96-well format, plotted as mean ± standard deviation
(shaded region) for biological triplicates per strain. Biomass is shown on the y-axis in green
values (G.V., corresponding to growth based on pixel counts, as determined by a GrowthProfiler
instrument) and is plotted against time (h) on the x-axis.
Figure S2: Gene expression pattern for different transcription factor orthologues and Lac12-like genes
in C. intermedia. Gene expression on galactose (GA20) and lactose (L20) has been normalized to the
values on glucose (G20). The legend shows expression (in fold change) from 0 to 10 in an increasing
gradient of red.
Figure S3: Maximum likelihood phylogenetic tree depicting the origin and evolution of the XYL1_2
gene in C. intermedia.

Figure S4: Maximum likelihood phylogenetic tree depicting the origin and evolution of the GAL1_2 gene
in C. intermedia.

Figure S5: Maximum likelihood phylogenetic tree for the origin and evolution of the GAL10_2 gene in
C. intermedia.

Figure S6: Maximum likelihood phylogenetic tree for the origin and evolution of the LAC9 gene in C.
intermedia.

Figure S7: Bar plot with growth rates of different mutants in comparison to the wild-type strain (WT)
on galactose as well as lactose. Significant differences in growth rates compared to the WT strain were
estimated using Student's t-test, and values with p < 0.01 are considered significantly different. Data
are represented as mean ± standard deviation for biological triplicates.
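The comparison described in this legend can be illustrated with a small worked example. The sketch below assumes a two-sided, equal-variance (pooled) Student's t-test on biological triplicates, so that there are 4 degrees of freedom and the tabulated two-sided critical value at the 0.01 level is 4.604; whether the authors used exactly this variant is not stated. The function name and the growth rates are hypothetical.

```python
import math
from statistics import mean, variance

# Two-sided critical value of Student's t at alpha = 0.01 for 4 degrees of freedom
# (two groups of three replicates); valid only for this assumed design.
T_CRIT_DF4_ALPHA_001 = 4.604

def student_t(a, b):
    """Pooled-variance two-sample t statistic (hypothetical helper)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

# Made-up growth rates (1/h) for a wild-type strain and a mutant
wt = [0.30, 0.31, 0.29]
mutant = [0.20, 0.21, 0.19]
t = student_t(wt, mutant)
significant = abs(t) > T_CRIT_DF4_ALPHA_001
```

Comparing |t| with the critical value is equivalent to testing whether the p-value falls below 0.01 for this design; a statistics library would normally be used to obtain the p-value itself.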
Figure S8: Growth and metabolite profile for gal10∆ in galactose containing minimal media. Graph
represents biomass (filled green circle) on the right y-axis, consumption of respective sugars (filled
triangle for galactose in g/L) and metabolite production (open circle for galactitol in g/L) on the left
y-axis, plotted against time (in hours) on the x-axis. Data are represented as mean ± standard deviation for
biological triplicates.
Figure S9: Growth profile for the double deletion mutant gal1_2∆lac9∆ in comparison with WT and
gallac∆ in galactose. Legend shows the wild-type strain (black circle), GALLAC cluster deletion mutant
(light green triangle) and gal1_2∆lac9∆, depicted in the graph with growth as green values (A.U.) on the
y-axis against time (hours) on the x-axis. Data are represented as mean ± standard deviation for biological
triplicates indicated by colors: wild type – yellow, lac cluster mutant – purple, gallac cluster deletion –
dark green and gal cluster mutant – light green.
Figure S10: Results of transcription factor binding analysis performed using MEME (version 5.5.2) on
promoter regions of genes in the GAL and LAC clusters. Results show the predicted binding motifs of
TFs in the promoter regions, ranked by p-value for the motif. Also listed are the predicted
transcription factors that are associated with the binding motifs.
Figure S11: Growth profiles for WT, gal4 and lac9 mutants in glucose (light green), galactose (dark
green) and lactose (purple) containing media. Biomass yield (green values, G.V.) on the y-axis is plotted
against time (in hours) on the x-axis. Data are represented as mean ± standard deviation for biological
triplicates.
primer name    sequence                gene target    orientation
GTB396         ATCCTGGTCCTCAATGCACA    lac4           fwd
GTB397         CTGGAATCTCGAGGTCTCCC    lac4           rev
GTB351         ACCTCCAAGCACTCGGAAAG    GAL1           fwd
GTB352         ACGATAGACCCGCCAAATCC    GAL1           rev
GTB357         TGACCGAGGCTCCAATGAAC    ACT1           fwd
GTB358         CACCGTCACCAGAGTCCAAA    ACT1           rev

Table S1: Primers used for mRNA quantification using qPCR in C. intermedia. Primers were designed
using Primer3 (https://primer3.ut.ee/) and primer pairs were checked for efficiency prior to use.