

Intracellular ligands that bind heavy metals (HMs) and thereby minimize their detrimental effects to cellular metabolism are attracting great interest for a number of applications including bioremediation and development of HM-biosensors. Metallothioneins (MTs) are short, cysteine-rich, genetically encoded proteins involved in intracellular metal-binding and play a key role in detoxification of HMs. We searched approximately 700 genomes and transcriptomes of non-ciliate protists for novel putative MTs by similarity and structural analyses and found 21 unique proteins playing a potential role as MTs. Most putative MTs derive from heterokonts and dinoflagellates and share common features such as (i) a putative metal-binding domain in proximity of the N-terminus, (ii) two putative MT-specific domains near the C-terminus and (iii) one to three CTCGXXCXCGXXCXCXXC patterns. Although the biological function of these proteins has not been experimentally proven, knowledge of their genetic sequences adds useful information on proteins that are potentially involved in HM-binding and can contribute to the design of future biomolecular assays on HM–microbe interactions and MT-based biosensors.
Cite this article: Balzano S, Sardo A. 2022
Bioinformatic prediction of putative
metallothioneins in non-ciliate protists. Biol.
Lett. 18: 20220039.
Received: 1 February 2022
Accepted: 18 March 2022
Subject Areas:
bioinformatics, environmental science
Keywords:
heavy metals, pollution, metallothioneins,
non-ciliate protists
Author for correspondence:
Sergio Balzano
Electronic supplementary material is available
online at
Marine biology
Bioinformatic prediction of putative
metallothioneins in non-ciliate protists
Sergio Balzano
and Angela Sardo
Stazione Zoologica Anton Dohrn Napoli (SZN), Department of Ecosustainable Marine Biotechnology,
via Ammiraglio Ferdinando Acton 55, 80133, Naples, Italy
NIOZ Royal Netherlands Institute for Sea Research, 1790AB Den Burg, The Netherlands
Istituto di Scienze Applicate e Sistemi Intelligenti CNR, via Campi Flegrei 34, 80078 Pozzuoli, Naples, Italy
SB, 0000-0002-3172-1332
1. Introduction
Microorganisms that inhabit heavy metal (HM)-contaminated environments and can incorporate contaminants within the cell are biotechnologically interesting because of their potential use for bioremediation [1]. Passive adsorption of cations onto cell walls and transport across cell membranes are the two major mechanisms of HM uptake by living cells [2,3]. Subsequently, intracellular polypeptides such as enzymatically produced phytochelatins and genetically encoded metallothioneins (MTs) limit the detrimental effect of HMs by complexing and transporting them towards vacuoles, chloroplasts or mitochondria [4,5].
MTs are low-molecular weight proteins exhibiting a low content of aromatic
amino acids and high proportions of cysteine residues (10% or more); they have
been characterized in great detail in multicellular organisms [6,7] as well as
in bacteria [8], yeasts and ciliates [9], and are currently classified in 15 families
that are not phylogenetically related but are likely to result from convergent
evolution [10]. Ciliate MTs are generally longer than average and, along with
MTs from metazoans and fungi, contain greater proportions of cysteine than
MTs from plants and bacteria [11]. In addition to classified proteins, MTs isolated
and characterized experimentally from the brown macroalga Fucus
vesiculosus [12], the excavate Trichomonas vaginalis [10] and different fungi and
metazoans [13], as well as HM-contaminated soils [14], could not be classified
and were suggested to make up novel MT families [13].
The broad genetic diversity spanning living organisms [15,16] and the scarcity
of known MTs in microbial eukaryotes other than fungi and ciliates [17]
© 2022 The Author(s) Published by the Royal Society. All rights reserved.
Downloaded from on 05 May 2022
Table 1. List of protein sequences predicted from eukaryotic genomes and transcriptomes as likely to play a role as MTs, as revealed by InterProScan analyses or motif search.

| protein ID | species | class | supergroup | strain ID | transcriptome ID | database ID | no. identical | stress condition | domain code |
|---|---|---|---|---|---|---|---|---|---|
| **EinvMT** | Entamoeba invadens | Archamoebae | Amoebozoans | IP1 | NA | XP_004259069 | 1 | NA | |
| AlanMT | Armaparvus languidus | vannellids | Amoebozoans | PRA-29 | MMETSP0420 | Tr3694 | 1 | HL | IPR035715 |
| **CsorMT** | Chlorella sorokiniana | green algae | Archaeplastida | 1602 | NA | PRW44601.1 | 1 | NA | IPR002045 |
| **MconMT** | Micractinium conductrix | green algae | Archaeplastida | SAG 241.80 | NA | PSC70917 | 1 | NA | |
| **TvagMT** | Trichomonas vaginalis | parabasalids | Discoba | ATCC PRA-98 | NA | XP_001321197 | 1 | NA | |
| CrotMT | Chrysochromulina rotalis | haptophytes | Hacrobians | UIO044 | MMETSP0287 | Tr26136 | 1 | HL | IPR001008 |
| **CowcMT** | Capsaspora owczarzaki | filozoa | Opisthokonts | ATCC 30864 | NA | XP_011270693 | 1 | NA | |
| **BbigMT** | Babesia bigemina | apicomplexa | SAR | | NA | XP_012768823 | 1 | NA | |
| **BlasMT** | Blastocystis sp. | bigyra | SAR | ATCC 50177 | NA | OAO13187 | 1 | NA | |
| **EsilMT** | Ectocarpus siliculosus | brown algae | SAR | | NA | CBJ32637 | | | IPR001008 |
| **FvesMT** | Fucus vesiculosus | brown algae | SAR | | NA | CAA06729 | | | IPR001008 |
| AglaMT1 | Asterionellopsis glacialis | diatoms | SAR | CCMP134 | MMETSP0708 | Tr19519 | 3 | N-/P- | IPR001008 |
| AglaMT2 | Asterionellopsis glacialis | diatoms | SAR | CCMP1581 | MMETSP1394 | Tr220 | 1 | N-/P- | |
| CwaiMT | Coscinodiscus wailesii | diatoms | SAR | CCMP2513 | MMETSP1066 | Tr41518 | 1 | HL | IPR001008 |
| DbriMT | Ditylum brightwellii | diatoms | SAR | GSO105 | MMETSP0998 | Tr22984 | 8 | HL/No/N-/P- | IPR001008 |
| EspiMT | Extubocellulus spinifer | diatoms | SAR | CCMP396 | MMETSP0697 | Tr10701 | 1 | Si-/No/HL | |
| MpolMT | Minutocellus polymorphus | diatoms | SAR | NH13 | MMETSP1070 | Tr24663 | 2 | No/HL | |
| OaurMT | Odontella aurita | diatoms | SAR | Is-1302-5 | MMETSP0015 | Tr34634 | 2 | HL | |
| PdubMT | Pseudodictyota dubia | diatoms | SAR | CCMP147 | MMETSP1175 | Tr24667 | 1 | HL | IPR001008 |
| SyneMT | Synedropsis sp. | diatoms | SAR | CCMP1620 | MMETSP1176 | Tr28518 | 2 | HL | IPR001008 |
| **TpseMT** | Thalassiosira pseudonana | diatoms | SAR | CCMP1335 | NA | XP_002296843 | 1 | NA | |
| AcatMT | Alexandrium catenella | dinoflagellate | SAR | OF101 | MMETSP0790 | Tr99632 | 1 | No | |
| AmonMT | Alexandrium monilatum | dinoflagellate | SAR | CCMP3105 | MMETSP0096 | Tr45933 | 4 | HL/P- | |
| AzspMT | Azadinium spinosum | dinoflagellate | SAR | 3D9 | MMETSP1037 | Tr93697 | 2 | HL | IPR001008 |
| GspiMT | Gonyaulax spinifera | dinoflagellate | SAR | CCMP409 | MMETSP1439 | Tr79705 | 1 | HL | |
| LpolMT | Lingulodinium polyedrum | dinoflagellate | SAR | CCMP1738 | MMETSP1032 | Tr14667 | 4 | No/HL | |
| AplaMT | Aplanochytrium sp. | labyrinthulids | SAR | PBS07 | MMETSP0956 | Tr7261 | 4 | NA | IPR001008 |
| AstoMT | Aplanochytrium stocchinoi | labyrinthulids | SAR | GSBS06 | MMETSP1349 | Tr9377 | 4 | NA | IPR001008 |
| AuanMT1 | Aureococcus anophagefferens | pelagophyceae | SAR | CCMP1850 | MMETSP0917 | Tr30268 | 3 | N- | |
suggest that both the real diversity of MTs and the number of distinct families are likely to be broader than is currently known. For example, less than 1% of proteins annotated as MTs on GenBank belong to protists; these are mostly associated with parasitic genera (Babesia, Entamoeba, Plasmodium and Trichomonas), and other microorganisms, including microalgae, are highly underrepresented [17]. MTs from both eukaryotic and prokaryotic microbes have recently been reviewed by Gutiérrez et al. [18]: while MTs from ciliates and fungi have been characterized in detail and classified in different families, little is known about MTs from non-ciliate protists. Although some common features, such as a prevalence of CXC motifs, were observed, MTs from non-ciliate protists do not share a common evolutionary origin and are likely to result from the convergent evolution of different genes [18]. Overall, very little is known to date about proteins from microalgae and, in general, from protists other than ciliates. Here we predicted, through a bioinformatic approach, novel potential MTs from eukaryotic microbial genomes and transcriptomes.
2. Material and methods
We searched 44 genomes [19] and 636 transcriptomes [20] for
novel MTs of non-ciliate protists. The amino acid sequences of
the proteins predicted from the genomes were downloaded from
GenBank (electronic supplementary material, table S1), whereas
a re-assembled version of the proteins predicted from the
marine microbial eukaryote transcriptome sequencing project
(MMETSP) database (electronic supplementary material, table
S2) was downloaded from iMicrobe [21,22]. We carried out structural
analyses of the proteins predicted from the abovementioned
databases using InterProScan [23] with default parameters;
proteins found to possess regions identified as MT-domains
with a score (e-value) of less than 5 × 10
were retained for down-
stream analyses (electronic supplementary material, table S3).
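The domain-filtering step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes InterProScan 5 was run with tab-separated (TSV) output in its standard column order, the names `filter_mt_hits` and `MT_INTERPRO_IDS` are hypothetical, the protein identifiers are invented, and the cut-off is passed in as `max_evalue` (the value used in the demo is purely illustrative).

```python
import csv
import io

# InterPro entries that the text associates with MT domains (assumed to be
# sufficient for this sketch; extend the set as needed).
MT_INTERPRO_IDS = {"IPR001008", "IPR002045", "IPR035715"}


def filter_mt_hits(tsv_text, max_evalue):
    """Return (protein, interpro_id, start, stop, evalue) tuples for MT hits.

    Assumes the standard InterProScan 5 TSV layout: column 0 = protein ID,
    6 = start, 7 = stop, 8 = score (e-value), 11 = InterPro accession.
    """
    hits = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 12 or row[11] not in MT_INTERPRO_IDS:
            continue
        try:
            evalue = float(row[8])
        except ValueError:
            continue  # some analyses report "-" instead of a numeric score
        if evalue < max_evalue:
            hits.append((row[0], row[11], int(row[6]), int(row[7]), evalue))
    return hits


# Toy rows with invented identifiers, only to exercise the function.
_rows = [
    ["protA", "md5a", "200", "Pfam", "SIG0001", "MT-like", "150", "190",
     "1.0E-12", "T", "01-01-2022", "IPR001008", "MT domain"],
    ["protB", "md5b", "180", "Pfam", "SIG0001", "MT-like", "10", "40",
     "0.5", "T", "01-01-2022", "IPR001008", "MT domain"],
    ["protC", "md5c", "220", "Pfam", "SIG0002", "HMA", "5", "60",
     "1.0E-20", "T", "01-01-2022", "IPR006121", "HMA domain"],
    ["protD", "md5d", "90", "Other", "SIG0003", "MT-like", "1", "50",
     "-", "T", "01-01-2022", "IPR002045", "MT domain"],
]
DEMO_TSV = "\n".join("\t".join(r) for r in _rows)

print(filter_mt_hits(DEMO_TSV, max_evalue=1e-5))
```

Keeping the threshold as an argument makes it easy to check how sensitive the candidate list is to the chosen cut-off.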
GPS-Prot software [24] was used to plot the position of the different domains within each protein. The resulting proteins were aligned using MAFFT L-INS-i [25], and the analyses revealed the presence of one to three highly conserved CTCGXXCXCGXXCXCXXC patterns in most proteins. We then searched for other proteins possessing the CTCGXXCXCGXXCXCXXC pattern within the abovementioned databases, and the results were added to the previous alignments (electronic supplementary material, figure S1). A sequence logo of the abovementioned pattern was generated using WebLogo [26].
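The pattern search described above can be sketched with a regular expression. This is an illustrative sketch rather than the authors' actual tool: it assumes that X stands for any single amino acid, the helper names `read_fasta` and `find_motif` are hypothetical, and the two toy sequences are invented.

```python
import re

# CTCGXXCXCGXXCXCXXC with X = any single residue (assumption).
MOTIF = re.compile(r"CTCG[A-Z]{2}C[A-Z]CG[A-Z]{2}C[A-Z]C[A-Z]{2}C")


def read_fasta(text):
    """Parse FASTA-formatted text into {identifier: sequence}."""
    records, name, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else "unnamed"), []
        elif line and name is not None:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def find_motif(records):
    """Return {identifier: [1-based start positions]} for motif-bearing proteins."""
    hits = {}
    for name, seq in records.items():
        starts = [m.start() + 1 for m in MOTIF.finditer(seq)]
        if starts:
            hits[name] = starts
    return hits


# Invented toy sequences: toy1 carries two copies of the pattern, toy2 none.
DEMO_FASTA = """>toy1 invented example
MAKCTCGAACACGDDCECKKC
GGCTCGSSCTCGPPCQCNNC
>toy2 invented example
MSCKCGGCKC
"""

print(find_motif(read_fasta(DEMO_FASTA)))
```

Because the pattern is 18 residues long and begins and ends with cysteine, non-overlapping matching is enough to count one to three copies per protein.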
3. Results and discussion
Functional analyses of genomes and transcriptomes
sequenced from non-ciliate protists yielded 10 unique
proteins possessing putative MT-specific domains (table 1;
electronic supplementary material, table S4). AlanMT protein
(Armaparvus languidus, amoebozoan and excavate) possesses
a region sharing similarities with a domain present in yeast
MTs (IPR035715), whereas all the other proteins found here
contain two adjacent regions sharing similarities with
known MT domains from molluscs (IPR001008). Most
(8 out of 10) proteins also contain a putative HM-associated
domain (HMA, IPR006121) located in proximity of the
N-terminus (figure 1), one to three conserved cysteine-rich
patterns 18 AA long (CTCGXXCXCGXXCXCXXC), and
have been originally isolated from species affiliated to the
Table 1. (Continued.)

| protein ID | species | class | supergroup | strain ID | transcriptome ID | database ID | no. identical | stress condition | domain code |
|---|---|---|---|---|---|---|---|---|---|
| **AuanMT2** | Aureococcus anophagefferens | pelagophyceae | SAR | CCMP1984 | NA | XP_009037419 | 1 | NA | IPR001008 |
| PsubMT | Pelagococcus subviridis | pelagophyceae | SAR | CCMP1429 | MMETSP0883 | Tr17315 | 3 | N- | |
| PcalMT | Pelagomonas calceolata | pelagophyceae | SAR | RCC969 | MMETSP1328 | Tr480 | 4 | NA | |

Known MTs identified in previous studies are in bold.
No. identical: in many cases, two or more identical proteins possessing an MT-specific domain or resulting from keyword searches were found in different transcriptomes of the same strain.
Stress condition: condition at which the strain was maintained prior to transcriptome sequencing. 'No' refers to transcriptomes derived from strains cultured at standard conditions. In some cases identical sequences were obtained from different transcriptomes, reflecting either different stress treatments or both stress and non-stress conditions. Abbreviations: NA, not available; N-, nitrogen deprivation (<2 μM); P-, phosphorus deprivation (<0.5 μM); Si-, silica deprivation (<0.5 μM); HL, high light (>300 μE m⁻² s⁻¹).
Domain code: the sequences without an InterProScan code were identified by keyword search of the conserved CTCGXXCXCGXXCXCXXC motif.
Stramenopile–Alveolata–Rhizaria (SAR) supergroup. Twelve
additional unique proteins containing the same 18 AA
pattern were subsequently found in other SAR species (electronic
supplementary material, table S5). Overall, structural
analyses and pattern search allowed the identification of 21
unique proteins (table 1), 19 of which derive from SAR
species and possess a highly conserved cysteine-rich pattern,
that are likely to play a role as MTs (figure 2). Thirteen putative MTs are present in more than one transcriptome of the MMETSP database and are thus very unlikely to result from contamination or sequencing errors. Interestingly, in many cases our putative MTs derive from transcriptomes sequenced from strains maintained under stress conditions such as high light irradiance (greater than 300 µE m⁻² s⁻¹) or nitrogen (less than 2 µM) or phosphorus (less than 0.5 µM) limitation (table 1). Both high light irradiance and nutrient starvation can generate oxidative stress [27,28], which has been reported to induce MT biosynthesis [4,29]. Current data thus suggest that the proteins found here are more likely to be expressed when microorganisms experience oxidative stress, which is consistent with a potential role as MTs.
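The recurrence check mentioned in the paragraph above (identical proteins found in more than one transcriptome) can be sketched as follows. This is only an assumed implementation: the record layout `(transcriptome_id, protein_id, sequence)`, the function names and the identifiers `T1` and `T2` are hypothetical.

```python
from collections import defaultdict


def group_identical(records):
    """Group (transcriptome_id, protein_id, sequence) records by sequence."""
    groups = defaultdict(set)
    for transcriptome_id, protein_id, seq in records:
        # Normalise case and drop a trailing stop symbol before comparing.
        groups[seq.upper().rstrip("*")].add((transcriptome_id, protein_id))
    return groups


def recurrent(groups, min_transcriptomes=2):
    """Keep sequences seen in at least `min_transcriptomes` transcriptomes."""
    out = {}
    for seq, members in groups.items():
        transcriptomes = {t for t, _ in members}
        if len(transcriptomes) >= min_transcriptomes:
            out[seq] = sorted(transcriptomes)
    return out


# Invented records: the first two carry the same protein sequence.
DEMO_RECORDS = [
    ("T1", "a", "MCKC*"),
    ("T2", "b", "mckc"),
    ("T1", "c", "MAAA"),
]

print(recurrent(group_identical(DEMO_RECORDS)))
```

Counting distinct transcriptomes rather than distinct records avoids treating duplicated contigs within a single assembly as independent support.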
Little is known about metal-binding mechanisms in microalgal MTs. MTs are generally known to have affinity for monovalent and divalent ions, with each cation coordinated by 3 to 4 cysteine residues, and each residue coordinating one or two cations [7,30,31]. The number of monovalent or divalent metal cations that can be coordinated by the putative
Figure 1. Proteins from different microbial eukaryotes containing MT-specific domains as found by structural analyses using InterProScan [23]. Numbers indicate protein length and the position of the different domains. Domains specific for MTs are in black (mollusc MTs, IPR001008; crustacean MTs, IPR002045; eukaryotic MT, PF12809), whereas HM-associated domains (HMA, IPR006121) are in grey. Species name and sequence identifiers are indicated on the left of each putative MT, whereas class names are on the right. Sequences shown: Aplanochytrium sp. (MMETSP0956_Tr7261), Aplanochytrium stocchinoi (MMETSP1349_Tr_9377), Coscinodiscus wailesii (MMETSP1066_Tr41518), Asterionellopsis glacialis (MMETSP0708_Tr_19519), Ditylum brightwellii (MMETSP0998_Tr22984), Synedropsis sp. (MMETSP1176_Tr28518), Pseudodictyota dubia (MMETSP1175_Tr24667), Azadinium spinosum (MMETSP1037_Tr93697), Chrysochromulina rotalis (MMETSP0287_Tr26136) and Armaparvus languidus (MMETSP0420_Tr3694).
MTs found here cannot be predicted in silico but needs to be evaluated experimentally. It has been suggested that an MT is able to chelate a number of monovalent cations slightly higher than half the number of its cysteine residues and a number of divalent cations lower than half the number of its cysteine residues [7,30,31]. Short putative MTs such as AlanMT, CrotMT or EspiMT could coordinate around 5–10 cations, whereas the longest proteins found here such as AplaMT (258 AA), AstoMT (264) and SyneMT (255) could coordinate up to 30 cations.
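The rule of thumb above can be turned into a small worked calculation. The sketch below is only a rough reference point, not a prediction: it approximates the number of cysteines from the length and rounded cysteine percentage reported in table 2, and returns half that number, around which the cited heuristic places the monovalent (slightly above) and divalent (below) capacities. The function names are hypothetical.

```python
def approx_cysteines(length_aa, cys_percent):
    """Approximate cysteine count from protein length and rounded Cys %."""
    return round(length_aa * cys_percent / 100)


def half_cysteine_reference(n_cys):
    """Half the cysteine count: monovalent capacity is suggested to be
    slightly above this value, divalent capacity below it."""
    return n_cys / 2


# Length (AA) and cysteine content (%) as listed in table 2.
for name, length_aa, cys_percent in [("AlanMT", 116, 8), ("CrotMT", 112, 19)]:
    n_cys = approx_cysteines(length_aa, cys_percent)
    print(name, n_cys, half_cysteine_reference(n_cys))
```

Because the percentages in table 2 are rounded, the cysteine counts obtained this way are approximate; counting cysteines directly in the sequence would be more precise.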
Current results strongly suggest that at least the proteins
that possess an HMA domain along with two adjacent MT
domains (figure 1) that were found here from SAR represen-
tatives are likely to play a role as MTs. HMA domains have
previously been found in proteins involved in HM transport
and detoxification in mammals [32,33], and two adjacent
MT-domains typically occur in known MTs from plants [7],
mammals [34] and ciliates [35]. Proteins found here from
SAR representatives are longer than most known MTs
(table 1), ranging from 189 (DbriMT) to 320 AA (OaurMT).
The presence of multiple, conserved cysteine-rich patterns
(figure 2), and the fact that such proteins are longer than
average, suggest that putative SAR MTs might have resulted
from gene duplication of shorter MTs, similarly to what has
been hypothesized for very long MTs in fungi [36], molluscs
[37] and T. vaginalis [10].
The cysteine content of our putative MTs is lower than that of most known MTs, ranging from 8% (AlanMT) to 19% (CrotMT and CwaiMT), and is highly variable even within SAR-derived proteins (table 2). Histidine content is very low (less than 2%) in all proteins except for CrotMT (3.6%) and SyneMT (3.9%); aromatic amino acids account for less than 5% in most proteins, whereas lysine contribution ranges from 0.9% (CrotMT) to 10% (AlanMT). Overall, putative SAR MTs found here, along with the known MT AuanMT2, exhibit a similar domain distribution (figure 1), contain cysteine residues mostly clustered in CXC motifs and share one to three conserved 18 AA patterns (figure 2).
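The composition figures discussed above can be computed from a protein sequence as sketched below. This is a hedged illustration: the function name `composition` is hypothetical, the aromatic set is assumed to be phenylalanine, tryptophan and tyrosine (the text does not state which residues were counted), and the toy sequence is invented.

```python
# Aromatic residues assumed to be F, W and Y for this sketch.
AROMATIC = set("FWY")


def composition(seq):
    """Return length and percentage of Cys, His, aromatic and Lys residues."""
    seq = seq.upper().rstrip("*")  # drop a trailing stop symbol if present
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")

    def pct(count):
        return round(100 * count / n, 1)

    return {
        "length": n,
        "cysteine_pct": pct(seq.count("C")),
        "histidine_pct": pct(seq.count("H")),
        "aromatic_pct": pct(sum(seq.count(a) for a in AROMATIC)),
        "lysine_pct": pct(seq.count("K")),
    }


# Invented 10-residue toy sequence.
print(composition("CCHFWYKAAA"))
```

Applied to each sequence in a FASTA file, a helper of this kind would give the per-protein percentages needed to compare candidates with the cysteine-rich, aromatic-poor profile of known MTs.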
Gutiérrez et al. [18] observed a predominance of CXC
motifs, especially CKC, in MTs from non-ciliate protists.
However, while some known MTs like BlasMT, CowcMT
and TvagMT are indeed rich (more than 8) in CKC motifs,
this does not seem to be a common feature among the putative
MTs found here in non-ciliate protists. For example,
AuanMTs and MconMT do not contain such motifs, whereas
only one CKC motif occurs in BbigMT, CsorMT and TpseMT
(table 2). Similarly, among our putative MTs, a CKC motif
occurs five times in OaurMT, but it is repeated three times
or less in the other proteins. In general, CTC and CQC
motifs are more common than CKC motifs in our putative
SAR MTs (table 2). Current data indicate that both proteins
with an experimentally proven HM-binding activity and
putative MTs found here via bioinformatic analyses exhibit
a highly variable content in CKC, CTC and CQC motifs.
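Motif tallies of the kind compared above can be obtained with a short script. The sketch below counts overlapping occurrences, so that a stretch such as CKCTC contributes both a CKC and a CTC; whether the counts in table 2 were made with or without overlap is not stated, so this is an assumption, and the function names and toy sequence are hypothetical.

```python
import re


def count_overlapping(pattern, seq):
    """Count occurrences of a regex pattern, allowing overlaps (lookahead)."""
    return len(re.findall(f"(?=(?:{pattern}))", seq.upper()))


def cxc_profile(seq):
    """Tally generic CXC motifs and the CKC, CTC and CQC sub-types."""
    return {
        "CXC": count_overlapping("C[A-Z]C", seq),
        "CKC": count_overlapping("CKC", seq),
        "CTC": count_overlapping("CTC", seq),
        "CQC": count_overlapping("CQC", seq),
    }


# Invented toy sequence containing one CKC, one CTC and one CQC.
print(cxc_profile("MCKCTCQCGG"))
```

A zero-width lookahead is used because a plain `re.findall` would skip motifs that share a cysteine with the previous match.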
The possible role of our SAR proteins as MTs is further
suggested by the presence in known MTs, from some metazoans,
amoebozoans, fungi and higher plants, of a region slightly
Figure 2. Alignment of the putative MTs from heterokonts (Labyrinthulids, Pelagophyceae and diatoms) and dinoflagellates and sequence logo of the highly conserved motif CTCGXXCXCGXXCXCXXC. Underlined sequence IDs correspond to putative MTs found in the present study, whereas IDs that are not underlined are related known MTs from previous studies, in which MT activity has been previously proven or predicted. Numbers reflect the amino acid position with respect to the longest protein found here (SyneMT from Synedropsis sp. CCMP1620). Cysteine residues are highlighted in black while histidine residues, which might also be involved in HM binding, are in grey. Only the regions corresponding to the HM-associated domains (HMA, IPR006121, positions 197 to 242) and those exhibiting the cysteine-rich motif CTCGXXCXCGXXCXCXXC are shown for clarity, whereas the full alignment is shown in electronic supplementary material, figure S1. The species, strain and treatment associated with each protein abbreviated here are reported in table 1. The sequence logo was created using WebLogo [26].
different from our 18 AA pattern. In this case, the threonine
residue on the second position is replaced by other polar or
positively charged amino acids (electronic supplementary
material, figure S2). Besides this difference, putative SAR
MTs share the same number and position of cysteine residues
with metal-binding domains in Type 1 MTs from plants [7],
copper and cadmium MTs in snails [38], and silver MTs in
fungi [39].
In spite of the similarities found, even putative SAR MTs,
possessing the shared 18 AA pattern, exhibit great differences
among each other, and we could not construct a meaningful
(i.e. bootstrap support greater than 30%, using neighbour
joining or maximum-likelihood algorithms) phylogenetic
tree from the alignment of such sequences. This variability
is likely to reflect the broad genetic diversity of non-ciliate
protists and suggests that, although SAR species share a
common evolutionary origin [16], their MTs are likely to
result from convergent evolution of different genes, in spite
of the shared 18 AA pattern.
Although the putative SAR MTs found here possess two
regions related to metal-binding domains of mollusc MTs
(figure 1) and a conserved 18 AA cysteine-rich region
(figure 2) that can be found, in part, in MTs from different
organisms (electronic supplementary material, figure S2),
none of the putative SAR MTs found here possesses the
motifs previously described for the 15 MT families [10,11]
Table 2. Main features and proportions of amino acids potentially involved in metal chelation, for the putative MTs found in the present study.
species protein ID length
amino acid
histidine (%)
AA (%) CXC
18 AA
Alexandrium catenella AcatMT 189 9 1.6 3.2 6 2 1 1
Alexandrium monilatum AmonMT 196 10 1.0 5.1 6 3 1 1
Aplanochytrium sp. AplaMT 258 15 0.4 1.9 12 0 5 3
Aplanochytrium stocchinoi AstoMT 264 14 1.1 1.5 12 2 6 3
Armaparvus languidus AlanMT 116 8 1.7 7.7 3 0
Asterionellopsis glacialis AglaMT1 204 13 1.5 2.5 9 1 3 2
Asterionellopsis glacialis AglaMT2 196 14 1.0 2.6 9 1 4 1
Aureococcus anophagefferens AuanMT1 232 15 0.0 1.3 11 1 4 1
Azadinium spinosum AzspMT 208 11 0.5 4.8 6 3 1 1
Chrysochromulina rotalis CrotMT 112 19 3.6 6.3 6 1
Coscinodiscus wailesii CwaiMT 312 19 0.0 0.0 18 0 7 4
Ditylum brightwellii DbriMT 193 11 1.0 1.6 6 0 3 1
Extubocellulus spinifer EspiMT 129 12 0.8 1.6 4 0 2 1
Gonyaulax spinifera GspiMT 164 10 0.6 3.7 6 3 1
Lingulodinium polyedrum LpolMT 196 10 1.0 3.6 6 1 1 1
Minutocellus polymorphus MpolMT 203 11 1.0 1.5 6 0 3 1
Odontella aurita OaurMT 320 17 0.0 1.3 17 5 6 3
Pelagococcus subviridis PsubMT 207 11 0.5 2.4 6 3 2 1
Pelagomonas calceolata PcalMT 160 11 0.6 2.5 6 0 1 2
Pseudodictyota dubia PdubMT 260 19 0.0 0.8 15 3 5 3
Synedropsis sp. SyneMT 255 11 3.9 4.7 9 1 2 1
Aureococcus anophagefferens AuanMT2 171 18 0.0 1.2 12 0 3 1
Babesia bigemina BbigMT 214 12 1.9 7.9 2 1
Blastocystis sp. BlasMT 207 40 0.0 0.0 33 29
Capsaspora owczarzaki CowcMT 176 27 0.0 0.0 15 11 1
Chlorella sorokiniana CsorMT 56 32 0.0 0.0 6 1 1
Entamoeba invadens EinvMT 103 35 0.0 1.9 13 4
Micractinium conductrix MconMT 59 30 0.0 0.0 6 0 3
Thalassiosira pseudonana TpseMT 141 13 1.4 5.7 6 1
Trichomonas vaginalis TvagMT 308 30 6.2 2.3 41 9
Values refer to the numbers of amino acids in the sequence.
Specific 18 amino acid motif (CTCGXXCXCGXXCXCXXC) identified in the putative MTs found in the present study.
and thus do not belong to any family described to date. Although ciliates are part of the SAR supergroup, MTs from ciliates (Family 7) are shorter, contain greater proportions of cysteine and differ in their amino acid sequence from the putative SAR MTs found here [9]. In addition, except for AuanMT2, known unclassified MTs from SAR species (BbigMT, BlasMT, EsilMT, FvesMT and TpseMT) do not possess the conserved 18 AA pattern observed here (figure 2), suggesting great differences even within SAR MTs.
MTs can contribute to the development of more efficient HM-sensors. Whole-cell MT-based biosensors have been developed in different microbes [40–42] and ciliates are currently considered the most suitable candidates because they lack a cell wall [43,44]. However, testing the potential of MTs from other microbes for the development of whole-cell biosensors might yield more efficient candidates. Microalgae can be cultured autotrophically in simple seawater or freshwater enriched with basic nutrients, and several green algae, diatoms, dinoflagellates and Eustigmatophyceae are commonly used for genetic editing. In particular, lightly silicified diatoms, unarmoured dinoflagellates and Chlorella spp. are known for their weak cell walls [45], and cell wall-free mutants of Chlamydomonas spp. are currently available. Diatoms and dinoflagellates typically dominate shallow benthic communities [46], including HM-contaminated sediments [47], and might thus be suitable for the development of MT-based biosensors.
Bioinformatic mining of eukaryotic genomes and transcriptomes thus allowed the prediction of 21 putative MTs, 19 of which derive from SAR representatives and share an 18 amino acid-long cysteine-rich motif. The biological function of these proteins remains to be experimentally proven; such validation is needed for a complete structural and functional in vivo characterization, for the quantification of MT expression in polluted environments and in laboratory microcosms by real-time PCR and, finally, for the development of MT-based biosensors. Furthermore, physiological assays of species tolerance to HMs can be combined with gene expression determination to improve our understanding of microbe–HM interactions.
Data accessibility. All of the GenBank and MMETSP IDs for the sequences used in this study are included in the electronic supplementary material, tables [48]. The electronic supplementary material also includes the protein sequences used in this study and the same sequences aligned to identified conserved patterns. Both files are available as FASTA files.
Authors' contributions. S.B.: conceptualization, formal analysis, investigation, writing–original draft and writing–review and editing; A.S.: conceptualization, writing–original draft and writing–review and editing.
Both authors gave final approval for publication and agreed to be
held accountable for the work performed therein.
Competing interests. We declare we have no competing interests.
Funding. We received no funding for this study.
Acknowledgements. The authors are grateful to M. Miralto and L. Ambrosino (RIMAR, SZN) for their support in bioinformatic data processing, and to G. Lanzotti (RIMAR, SZN) for graphical assistance. Analyses were performed using the SZN bioinformatics server Falkor, available at SZN. The authors received no financial support for the research and authorship of this article.
1. Kumar KS, Dahms HU, Won EJ, Lee JS, Shin KH.
2015 Microalgae – a promising tool for heavy metal
remediation. Ecotoxicol. Environ. Saf. 113, 329–352.
2. Blaby-Haas CE, Merchant SS. 2012 The ins and outs
of algal metal transport. Biochim. et Biophys. Acta -
Mol. Cell Res. 1823, 1531–1552. (doi:10.1016/j.
3. Das N, Vimala R, Karthika P. 2008 Biosorption of
heavy metals – an overview. Indian J. Biotechnol. 7,
4. Cobbett C, Goldsbrough P. 2002 Phytochelatins and
metallothioneins: roles in heavy metal detoxification
and homeostasis. Annu. Rev. Plant Biol. 53,
159–182. (doi:10.1146/annurev.arplant.53.100301.
5. Perales-Vela HV, Pena-Castro JM, Canizares-
Villanueva RO. 2006 Heavy metal detoxification in
eukaryotic microalgae. Chemosphere 64, 1–10.
6. Blindauer CA, Leszczyszyn OI. 2010
Metallothioneins: unparalleled diversity in structures
and functions for metal ion homeostasis and more.
Nat. Prod. Rep. 27, 720–741. (doi:10.1039/
7. Leszczyszyn OI, Imam HT, Blindauer CA. 2013
Diversity and distribution of plant metallothioneins:
a review of structure, properties and functions.
Metallomics 5, 1146–1169. (doi:10.1039/
8. Blindauer CA. 2011 Bacterial metallothioneins: past,
present, and questions for the future. J. Biol. Inorg.
Chem. 16, 1011–1024. (doi:10.1007/s00775-011-
9. Gutiérrez JC, Amaro F, Diaz S, de Francisco P, Cubas
LL, Martín-González A. 2011 Ciliate
metallothioneins: unique microbial eukaryotic
heavy-metal-binder molecules. J. Biol. Inorg. Chem.
16, 1025–1034. (doi:10.1007/s00775-011-0820-9)
10. Capdevila M, Atrian S. 2011 Metallothionein protein
evolution: a miniassay. J. Biol. Inorg. Chem. 16,
977–989. (doi:10.1007/s00775-011-0798-3)
11. Ziller A, Yadav RK, Capdevila M, Reddy MS, Vallon
L, Marmeisse R, Atrian S, Palacios O, Fraissinet-
Tachet L. 2017 Metagenomics analysis reveals a new
metallothionein family: sequence and metal-
binding features of new environmental cysteine-rich
proteins. J. Inorg. Biochem. 167, 1–11. (doi:10.
12. Morris CA, Nicolaus B, Sampson V, Harwood JL, Kille
P. 1999 Identification and characterization of a
recombinant metallothionein protein from a marine
alga, Fucus vesiculosus. Biochem. J. 338, 553–560.
13. Ziller A, Fraissinet-Tachet L. 2018 Metallothionein
diversity and distribution in the tree of life: a
multifunctional protein. Metallomics 10,
1549–1559. (doi:10.1039/C8MT00165K)
14. Lehembre F et al. 2013 Soil metatranscriptomics for
mining eukaryotic heavy metal resistance genes.
Environ. Microbiol. 15, 2829–2840. (doi:10.1111/
15. Baldauf SL. 2008 An overview of the phylogeny and
diversity of eukaryotes. J. Syst. Evol. 46, 263–273.
16. Keeling PJ. 2013 The number, speed, and impact of
plastid endosymbioses in eukaryotic evolution.
Annu. Rev. Plant Biol. 64, 583–607. (doi:10.1146/
17. Balzano S, Sardo A, Blasio M, Chahine TB,
Dell'Anno F, Sansone C, Brunet C. 2020
Microalgal metallothioneins and phytochelatins
and their potential use in bioremediation.
Front. Microbiol. 11, 517. (doi:10.3389/fmicb.
18. Gutiérrez JC, de Francisco P, Amaro F, Díaz S,
Martín-González A. 2019 Structural and functional
diversity of microbial metallothionein genes. In
Microbial diversity in the genomic era (eds S Das, HR
Dash), pp. 387–407. New York, NY: Academic Press.
19. Blaby-Haas CE, Merchant SS. 2019 Comparative and
functional algal genomics. Annu. Rev. Plant Biol. 70,
605–638. (doi:10.1146/annurev-arplant-050718-
20. Keeling PJ et al. 2014 The marine microbial eukaryote
transcriptome sequencing project (MMETSP):
illuminating the functional diversity of eukaryotic life in
the oceans through transcriptome sequencing. PLoS
Biol. 12, e1001889. (doi:10.1371/journal.pbio.1001889)
21. Johnson LK, Alexander H, Brown CT. 2019 Re-assembly,
quality evaluation, and annotation of 678 microbial
eukaryotic reference transcriptomes. GigaScience 8,4.
22. Youens-Clark K, Bomhoff M, Ponsero AJ, Wood-Charlson
EM, Lynch J, Choi I, Hartman JH, Hurwitz BL. 2019
iMicrobe: tools and data-driven discovery platform for
the microbiome sciences. GigaScience 8, giz083. (doi:10.
23. Jones P et al. 2014 InterProScan 5: genome-scale
protein function classification. Bioinformatics 30,
1236–1240. (doi:10.1093/bioinformatics/btu031)
24. Fahey ME et al. 2011 GPS-Prot: a web-based
visualization platform for integrating host-pathogen
interaction data. BMC Bioinf. 12, 298. (doi:10.1186/
25. Katoh K, Standley DM. 2013 MAFFT multiple sequence
alignment software version 7: improvements in
performance and usability. Mol. Biol. Evol. 30,
772–780. (doi:10.1093/molbev/mst010)
26. Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004
WebLogo: a sequence logo generator. Genome Res.
14, 1188–1190. (doi:10.1101/gr.849004)
27. Niyogi KK. 1999 Photoprotection revisited: genetic
and molecular approaches. Annu. Rev. Plant Physiol.
Plant Mol. Biol. 50, 333–359. (doi:10.1146/annurev.
28. Zhang YM, Chen H, He CL, Wang Q. 2013 Nitrogen
starvation induced oxidative stress in an oil-
producing green alga Chlorella sorokiniana C3. PLoS
ONE 8, e69225. (doi:10.1371/journal.pone.0069225)
29. Ruttkay-Nedecky B, Nejdl L, Gumulec J, Zitka O,
Masarik M, Eckschlager T, Stiborova M, Adam V,
Kizek R. 2013 The role of metallothionein in
oxidative stress. Int. J. Mol. Sci. 14, 6044–6066.
30. Korkola NC, Scarrow PM, Stillman MJ. 2020 pH
dependence of the non-cooperative binding of Bi³⁺
to human apo-metallothionein 1A: kinetics,
speciation, and stoichiometry. Metallomics: Integr.
Biometal Sci. 12, 435–448. (doi:10.1039/
31. Scheller JS, Irvine GW, Stillman M. 2018 Unravelling
the mechanistic details of metal binding to
mammalian metallothioneins from stoichiometric,
kinetic, and binding affinity data. Dalton Trans. 47,
3613–3637. (doi:10.1039/C7DT03319B)
32. Bull PC, Cox DW. 1994 Wilson disease and Menkes
disease: new handles on heavy-metal transport.
Trends Genet. 10, 246–252. (doi:10.1016/0168-
33. Gitschier J, Moffat B, Reilly D, Wood WI, Fairbrother
WJ. 1998 Solution structure of the fourth metal-binding
domain from the Menkes copper-transporting
ATPase. Nat. Struct. Biol. 5, 47–54.
34. Nielsen AE, Bohr A, Penkowa M. 2007 The balance
between life and death of cells: roles of
metallothioneins. Biomarker Insights 1, 99–111.
35. Zahid MT, Shakoori FR, Zulfiqar S, Al-Ghanim KA,
Shakoori AR. 2018 Growth characteristics, metal
uptake and expression analysis of copper
metallothionein in a newly reported ciliate,
Tetrahymena farahensis. Pakistan J. Zool. 50,
1171–1181. (doi:10.17582/journal.pjz/2018.50.3.
36. Iturbe-Espinoza P, Gil-Moreno S, Lin WY, Calatayud
S, Palacios O, Capdevila M, Atrian S. 2016 The
fungus Tremella mesenterica encodes the longest
metallothionein currently known: gene, protein
and metal binding characterization. PLoS ONE
11, e0148651. (doi:10.1371/journal.pone.
37. Pedrini-Martha V, Koll S, Dvorak M, Dallinger R.
2020 Cadmium uptake, MT gene activation and
structure of large-sized multi-domain
metallothioneins in the terrestrial door snail Alinda
biplicata (Gastropoda, Clausiliidae). Int. J. Mol. Sci.
21, 1631. (doi:10.3390/ijms21051631)
38. Dvorak M et al. 2018 Metal binding functions of
metallothioneins in the slug Arion vulgaris differ
from metal-specific isoforms of terrestrial snails.
Metallomics 10, 1638–1654. (doi:10.1039/
39. Sácký J, Leonhardt T, Borovička J, Gryndler M, Briksí A,
Kotrba P. 2014 Intracellular sequestration of zinc,
cadmium and silver in Hebeloma mesophaeum and
characterization of its metallothionein genes. Fungal
Genet. Biol. 67, 3–14. (doi:10.1016/j.fgb.2014.03.003)
40. Amaro F, Turkewitz AP, Martín-González A, Gutiérrez
JC. 2011 Whole-cell biosensors for detection of
heavy metal ions in environmental samples based
on metallothionein promoters from Tetrahymena
thermophila. Microb. Biotechnol. 4, 513–522.
41. Shetty RS, Deo SK, Liu Y, Daunert S. 2004
Fluorescence-based sensing system for copper using
genetically engineered living yeast cells. Biotechnol.
Bioeng. 88, 664–670. (doi:10.1002/bit.20331)
42. Shitanda I, Takada K, Sakai Y, Tatsuma T. 2005
Amperometric biosensing systems based on motility
and gravitaxis of flagellate algae for aquatic risk
assessment. Anal. Chem. 77, 6715–6718. (doi:10.
43. Gutiérrez JC, Amaro F, Martín-González A. 2009
From heavy metal-binders to biosensors: ciliate
metallothioneins discussed. Bioessays 31, 805–816.
44. Gutiérrez JC, Amaro F, Martín-González A. 2015
Heavy metal whole-cell biosensors using eukaryotic
microorganisms: an updated critical review. Front.
Microbiol. 6, 8. (doi:10.3389/fmicb.2015.00048)
45. Dunker S, Wilhelm C. 2018 Cell wall structure of
coccoid green algae as an important trade-off
between biotic interference mechanisms and
multidimensional cell growth. Front. Microbiol. 9,
719. (doi:10.3389/fmicb.2018.00719)
46. Forster D et al. 2016 Benthic protists: the under-charted
majority. FEMS Microbiol. Ecol. 92, 8.
47. Gu R, Sun P, Wang Y, Yu F, Jiao N, Xu D. 2020
Genetic diversity, community assembly, and shaping
factors of benthic microbial eukaryotes in Dongshan
Bay, Southeast China. Front. Microbiol. 11, 592489.
48. Balzano S, Sardo A. 2022 Bioinformatic prediction of
putative metallothioneins in non-ciliate protists.
Figshare. (