Rev. sci. tech. Off. int. Epiz., 2005, 24 (1), 354-377
Functional genomics: tools for improving farm
animal health and welfare
S. Hiendleder (1), S. Bauersachs (1, 2), A. Boulesteix (3), H. Blum (2),
G.J. Arnold (2), T. Fröhlich (2) & E. Wolf (1, 2)
(1) Institute of Molecular Animal Breeding and Biotechnology, Gene Center of the Ludwig-Maximilian
University, Feodor-Lynen-Strasse 25, 81377 Munich, Germany
(2) Laboratory for Functional Genome Analysis (LAFUGA), Feodor-Lynen-Strasse 25, 81377, Munich, Germany
(3) Department of Statistics, Ludwig-Maximilian University Munich, Ludwigstrasse 33, 80539, Munich, Germany
The first genome sequence assemblies of farm animal species are now
accessible through public domain databases, and further sequencing projects
are in rapid progress. In addition, large collections of expressed sequences have
been obtained, which will aid in constructing annotated transcript maps for
many economically important species. Thus, the breeding of farm animals is
entering the post-genome era. Functional genomics, defined as applying global
experimental approaches to assess gene function, by using the information and
reagents provided by structural genomics (i.e. mapping and sequencing), has
become the focus of interest.
Combining a holistic view of phenotypes at the molecular level with genetic
marker data seems a particularly promising approach for improving health and
welfare traits in farm animals. These traits are often difficult to define. They
suffer from low heritabilities and a corresponding lack of genetic gain in
conventional selection and breeding programmes. At the same time, genomic
information from micro-organisms and parasites offers the potential for new
vaccines and therapeutics. This review describes major functional genomics
tools, lists genomic resources available for farm animals and discusses the
prospects and challenges of functional genomics in improving the health and
welfare of farm animals.
Farm animals – Functional genomics – Functional genomics resources – Genomics –
Health – Proteomics – Structural genomics resources – Transcriptomics – Welfare.
Farm animal health and welfare traits are mostly complex
and multifactorial. They are difficult to define and/or
expensive to measure, and are often characterised by low
heritabilities. This has hindered genetic gains through
conventional selection and breeding programmes. Prime
examples of this broad range of economically important
traits are as follows:
– mastitis susceptibility in cattle (106)
– resistance to gastro-intestinal nematode infections in
– the incidence of splay leg and anal atresia in pigs (119)
– resistance to Marek’s disease in chickens (4)
– resistance to infectious salmon anaemia in Atlantic
salmon (Salmo salar) (84).
Moreover, this spectrum includes fertility and body
conformation traits that are correlated with health and
longevity (52, 71). Marker-assisted selection could
significantly improve health and welfare traits (see papers
by Williams et al. and Gibson et al. in this issue), and many
genome-wide quantitative trait loci (QTL) mapping
experiments have indeed identified chromosome regions
that harbour genes with important effects on such traits in
various species, e.g. cattle (65). However, confidence
intervals for QTL often extend over many centimorgans
with hundreds of genes, and complex interactions such as
epistasis must be considered (3). Identifying functional
candidate genes and causative mutations will thus be a
challenge, especially when traditional reductionist
approaches studying one gene, transcript or protein at a
time are employed.
The term ‘functional genomics’ refers to the development
and application of holistic (i.e. genome-wide or system-
wide) experimental approaches to assess gene function by
using the information and reagents provided by structural
genomics (i.e. genome mapping and sequencing). This
approach is based on data mining, and characterised by
large-scale, high-output experimental methodologies
combined with statistical and computational analyses of
the results. The fundamental strategy behind functional
genomics is to expand the scope of biological investigation
from studying single genes, transcripts or proteins to
studying all genes, transcripts or proteins simultaneously,
in a systematic fashion. Functional genomics thus
promises to narrow the gap between sequence and
function to yield new insights into the behaviour of
biological systems (54).
As of October 2004, genome sequence assemblies of the
honey bee (Apis mellifera), chicken (Gallus gallus) and cow
(Bos taurus) were accessible through public domain
databases. A Danish-Chinese consortium has recently
released a partial pig (Sus scrofa) genome sequence for
public use (135). Further sequencing projects for
important farm animal species are expected in the near
future. In addition, large complementary deoxyribonucleic
acid (cDNA) collections of expressed sequences, known as
‘expressed sequence tags’ (ESTs), are being developed for
economically important species, including farmed fish.
These EST collections are used to generate gene indices
(114) and annotated transcript maps (115). They are also
being used to construct comparative maps which exploit
the extensive genomic resources generated for other
species, such as humans, mice (Mus musculus) and
zebrafish (Brachydanio rerio) (78). In fact, ESTs are also
yielding both the in silico information and the physical
substrates needed for large-scale expression analyses (118).
The structural genomic data from farm animal species are
accompanied by genomic data from various micro- and
macroparasites relevant to farm animals (21, 22).
Structural genomic information about parasites promises
advances in the development of new vaccines and
treatments (19, 68). Thus, the breeding of farm animals, as
well as the treatment of their parasites, is entering the post-
genome era. In this review, the authors describe the major
functional genomics tools, list the genomic resources
relevant to farm animals, and discuss the prospects and
challenges of using functional genomics to improve the
health and welfare of farm animals.
Functional genomics tools
High-throughput marker genotyping
Species-specific genome sequences and large EST
collections (see ‘Structural and functional genomic
resources’, below) aid in extensive data mining for genetic
markers. Genome sequences provide direct access to large
numbers of microsatellite markers and, by database
sequence mining or re-sequencing, single nucleotide
polymorphisms (SNPs). This marker resource can be used
to map and fine map QTL, and identify quantitative trait
genes and causative quantitative trait nucleotides.
Microsatellites provide an efficient medium-resolution
mapping tool for initial QTL mapping, but often lack
resolution for fine mapping when investigating breeds with
limited effective population size, instead of experimental
crosses. Furthermore, microsatellites appear to be much
more common in mammals than in birds (95). The SNPs
occur at a frequency of approximately 1/500 base pairs
(bp) of nucleotide sequence in pigs, cattle and chickens
(30, 32, 134) and represent a vast marker resource.
Considering the costs of high-throughput genotyping, it
should be noted that successfully applying DNA pooling
strategies for QTL mapping in farm animals can
significantly reduce the number of individuals that need to
be analysed (76).
High-throughput microsatellite genotyping, performed in
multiplex polymerase chain reactions (PCR) with capillary
electrophoresis instruments, is cost effective, reliable and
straightforward (69). Various concepts and techniques are
available for high-throughput SNP genotyping (124). All
SNP genotyping technologies have two components, as follows:
– a method for discriminating between alternative alleles
– a method for reporting the presence of alleles in a given sample.
The general methods for allele discrimination are
hybridisation/annealing, primer extension and enzymatic
cleavage. However, methods for reporting the presence of
alleles in a given sample are much more varied. Most signal
detection platforms follow the fate of a label in real time or
at the assay end point. Mass spectrometry (MS) is unique
in that it can be employed to detect the product of the
discrimination assay directly (124). Major high-
throughput technologies are described in more detail
below. Further reading about other techniques, such as
Taqman® real-time PCR, fluorescence polarisation and
‘molecular barcodes’, is provided in Hardenbol et al.,
Kwok, and Twyman and Primrose (46, 72 and 124).
Deoxyribonucleic acid hybridisation to an oligonucleotide microarray
In this approach, PCR-amplified DNA, which contains the
polymorphisms of interest, is hybridised to an
oligonucleotide microarray with thousands of different
oligonucleotides of known sequence. The
oligonucleotides are gridded onto a solid substrate, such as
a glass microscope slide or silicon wafer. Genotyping in
most commercially available devices is achieved by allele-
specific hybridisation or allele-specific primer extension.
Allele identification results from the emission of signals
from specific positions on the chip, which allows the
sequence around the polymorphism to be deduced (124).
An example is the oligonucleotide chip technology offered
by Affymetrix, Inc., whose GeneChip® platform involves
the production of high-density arrays of oligonucleotide
probes using a photolithographic process.
Genomic DNA is digested with a restriction enzyme,
ligated to adapters and amplified by PCR. The amplified
DNA is then fragmented, labelled and hybridised to the
array. Each allele is represented and interrogated by 40
overlapping oligonucleotides and, after washing, allele-
specific hybridisation (i.e. fluorescence) intensities are
recorded and used for genotype calling. Sample
throughput is listed as 96 10-K arrays per person per week,
which corresponds to approximately 960,000 genotypes.
One disadvantage of chip formats that score a standard set
of polymorphisms is inflexibility. However, more flexible
systems, which allow researchers to develop their own
assays, have also been introduced (124).
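Genotype calling from allele-specific hybridisation intensities can be illustrated with a minimal Python sketch. It assumes a simple contrast measure, (A - B)/(A + B), and fixed cut-offs. The function name, the thresholds and the minimum signal are hypothetical values chosen for illustration only; commercial calling software relies on more elaborate statistical models.

```python
def call_genotype(intensity_a, intensity_b, min_signal=100.0, threshold=0.5):
    """Call a SNP genotype from two allele-specific intensities.

    All cut-offs are hypothetical and serve only to illustrate the principle.
    """
    total = intensity_a + intensity_b
    if total < min_signal:
        return "no call"                      # signal too weak to interpret
    contrast = (intensity_a - intensity_b) / total   # ranges from -1 to +1
    if contrast > threshold:
        return "AA"                           # allele A dominates
    if contrast < -threshold:
        return "BB"                           # allele B dominates
    return "AB"                               # both alleles give signal


if __name__ == "__main__":
    for a, b in [(900, 50), (40, 1000), (500, 450), (20, 30)]:
        print(a, b, call_genotype(a, b))
```

In practice the cut-offs would be estimated from the clustering of many samples per SNP rather than fixed in advance.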
The GeneChip® MegAllele™ Genotyping Bovine 10-K SNP
panel contains approximately 10,000 SNPs and combines
molecular inversion probe (MIP) technology from
ParAllele® BioScience, Inc., with the Affymetrix GeneChip®
detection technology. In MIP technology, an
oligonucleotide probe undergoes a unimolecular
rearrangement, from a molecule that cannot be amplified
by PCR into a molecule that can be amplified by PCR. A
single probe is used to detect both alleles of each SNP. The
probes contain two nucleotide sequences, which are
unique to each probe, to target specific genomic DNA
sequences and two PCR-primer sequences which are
common to all probes. The probe rearrangement
(inversion) is mediated by hybridisation to genomic DNA
and an enzymatic ‘gap fill’ process that occurs in an allele-
specific manner. The two specific sequences of each probe
target and hybridise to complementary sites in the genome,
creating a circular conformation with a single nucleotide
gap between the termini of the probe. Unlabelled
deoxyadenosinetriphosphate (dATP), deoxycytidine-
triphosphate (dCTP), deoxyguanosinetriphosphate (dGTP)
or deoxythymidinetriphosphate (dTTP) is added to each of
four reactions. In reactions where the added nucleotide is
complementary to the nucleotide gap, DNA polymerase
and DNA ligase close the gap. The resulting circularised
probe can be separated from cross-reacted or unreacted
probes by an exonuclease reaction. The enzymatically
inverted probe is amplified by PCR, using the primer
sequences common to each probe (46). The PCR product
is then hybridised to appropriate oligonucleotide arrays for
genotype calling. This technology provides a flexible
platform with a very high throughput, which is listed as
0.5 million genotypes per day with the 10-K SNP panel.
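The logic of the four separate gap-fill reactions can be sketched in a few lines of Python. This is a toy model and not the vendor's software: it assumes only that a probe is circularised when the added nucleotide is complementary to the genomic base exposed in the gap, and the function names are hypothetical.

```python
# Watson-Crick complements of the four DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def circularised_reactions(template_alleles):
    """Return the added nucleotides whose reaction closes the gap.

    template_alleles: set of genomic bases present at the SNP in one sample,
    e.g. {"A"} for a homozygote or {"A", "G"} for a heterozygote.
    """
    return {COMPLEMENT[base] for base in template_alleles}


def infer_template_alleles(positive_reactions):
    """Infer the genomic alleles from the reactions that gave a PCR product."""
    return sorted(COMPLEMENT[nucleotide] for nucleotide in positive_reactions)


if __name__ == "__main__":
    print(circularised_reactions({"A", "G"}))   # reactions with dTTP and dCTP
    print(infer_template_alleles({"T"}))        # homozygous A
```

A single probe therefore reports both alleles: a homozygote yields product in one of the four reactions, a heterozygote in two.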
Mass spectrometers are mature and robust devices for
analysing biomolecules and are perfectly suited for SNP
typing. In matrix-assisted laser desorption/ionisation time-
of-flight (MALDI-TOF), analyte molecules (e.g. allele-
specific DNA fragments) and matrix molecules (typically
ultraviolet [UV] or infra-red light-absorbing small
molecules) are mixed in solution. They are then co-
crystallised on a sample plate, which is subsequently
loaded into the vacuum chamber of the mass spectrometer.
The DNA molecules are gently desorbed and ionised along
with the matrix molecules by UV laser irradiation and the
resulting charged ions are accelerated under a constant
electric voltage, which causes them to fly towards the ion
detector. Charged molecules arrive at the detector at
different times, depending on their masses (66). Although
several different strategies for allele discrimination were
previously combined with mass
spectrometric detection, only primer extension methods
are currently applied in practice for large-scale SNP
genotyping (40). An example is the MassARRAY™ system
marketed by Sequenom, Inc., which was one of the first
genotyping assays to be coupled with MS. In this approach,
a common primer is hybridised adjacent to the
polymorphic site. A cocktail mixture of three
deoxynucleotides (dNTPs) and one dideoxynucleotide
triphosphate (ddNTP) that corresponds to the
polymorphic site is added to initiate the primer extension
reaction. This reaction generates allele-specific primer
extension products that are generally from one to four
bases longer than the original primer. The reactions are
performed in 384-well microtitre plates and a small aliquot
is transferred to the SpectroCHIP bioarray, which is placed
into the MALDI-TOF and the mass and correlating
genotype are determined in real time
(www.sequenom.de/applications/hme_assay.php). The MALDI-TOF is among
the most powerful and reliable SNP genotyping
methods and a throughput of more than 30,000 genotypes
per day can be achieved, with high quality results
(40). Furthermore, the system is amenable to
multiplexing and, in principle, the analysis of up to
300 SNPs in a single spot of an MS sample chip is possible.
A 384-spot sample plate could thus identify in excess of
100,000 SNPs (66). In addition, MALDI-TOF MS can also
be used for SNP discovery (70) and microsatellite
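The dependence of arrival time on mass described above can be made concrete with an idealised calculation. The sketch below assumes that each ion acquires a kinetic energy of zeU in the acceleration field and then drifts at constant velocity, so that the flight time is proportional to the square root of m/z. The accelerating voltage (20 kV) and drift length (1 m) are hypothetical values, and initial velocities and the time spent in the acceleration region are ignored.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
DALTON = 1.66053906660e-27            # kilograms


def flight_time(mass_da, charge=1, voltage=20_000.0, drift_length=1.0):
    """Idealised time of flight in seconds for an ion of mass_da daltons.

    voltage and drift_length are illustrative assumptions, not instrument
    specifications.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage   # joules
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)    # metres per second
    return drift_length / velocity


if __name__ == "__main__":
    # Two hypothetical primer extension products differing by about one nucleotide.
    for mass in (6000.0, 6300.0):
        print(mass, flight_time(mass))
```

Because the flight time scales with the square root of the mass, a fourfold heavier ion needs twice as long to reach the detector, and extension products that differ by a single nucleotide are separated in time.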
Another method based on primer extension on a PCR-
amplified template is Pyrosequencing™, marketed by
Biotage AB, which is capable of delivering a short
nucleotide sequence of approximately 50 bp in real-time.
Pyrosequencing is therefore suitable for typing SNPs and
scoring haplotypes (24). This method is based on the
detection of pyrophosphate, a byproduct of DNA
synthesis, which is converted to adenosine triphosphate
(ATP), which then stimulates luciferase activity, causing the
emission of a chemiluminescent signal. The reaction
includes adenosine 5’ phosphosulfate (APS) and ATP
sulfurylase, which converts APS into ATP in the presence
of pyrophosphate. Also present are luciferin (the substrate of
luciferase) and apyrase, which continuously degrades
unincorporated dNTPs and excess ATP. The dNTPs are
added to the reaction one by one, and only a complementary
dNTP will extend the primer, release pyrophosphate in
equimolar quantity and cause the production of ATP,
which generates a chemiluminescent signal (24, 124).
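The sequential dispensation of nucleotides can be simulated with a short Python sketch. It assumes an idealised reaction in which the light signal is exactly proportional to the number of nucleotides incorporated per dispensation and in which apyrase removes every unincorporated nucleotide before the next addition. The function name and the example sequence are hypothetical.

```python
def pyrogram(synthesised_strand, dispensation_order):
    """Simulate idealised pyrosequencing signals.

    synthesised_strand: bases added to the primer, in 5' to 3' order.
    dispensation_order: nucleotides added to the reaction one by one.
    Returns a list of (nucleotide, number incorporated) pairs; the number
    incorporated stands in for the relative light signal.
    """
    position = 0
    signals = []
    for nucleotide in dispensation_order:
        incorporated = 0
        # Only a complementary dNTP extends the primer; runs of identical
        # bases are filled in a single dispensation.
        while (position < len(synthesised_strand)
               and synthesised_strand[position] == nucleotide):
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
    return signals


if __name__ == "__main__":
    print(pyrogram("GGAT", "ACGTACGT"))
```

A SNP shows up as a difference in the signal pattern at the polymorphic position, with heterozygotes giving intermediate peak heights for both alleles.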
Most physiological processes are accompanied by complex
changes in the transcriptome and/or the proteome. A
combined holistic analysis of gene expression in cells or
tissues at the transcript and protein level is therefore
desirable (39). For reasons detailed in ‘Proteome analyses’,
below, the proteome is at present not amenable to a
simultaneous comprehensive analysis, even when ‘state of
the art’ analytical systems are used. The transcriptome, in
contrast, may be analysed in a simultaneous and
comprehensive manner by various high-throughput
methods. These methods include:
– serial analysis of gene expression (SAGE)
– massively parallel signature sequencing (MPSS)
– microarray-based transcript analyses (13, 58, 128).
Such methods can be applied in any species with sufficient
EST data and/or genome information for transcript
annotation. Microarray-based technology relies on
hybridisation signal detection, whereas SAGE and MPSS
are based on DNA sequencing and counting ‘tags’. The
MPSS method generates millions of short sequence tags for
counting messenger ribonucleic acid (mRNA) frequencies
and provides an unprecedented depth of analysis. A single
MPSS experiment can provide a ten-fold coverage of the
transcripts expressed in a mammalian cell. This method is,
therefore, particularly well suited for identifying rare
transcripts and able to generate comprehensive genome-
wide expression profiles (13). However, the MPSS
approach is complex and difficult to apply in a standard
laboratory. Further reading on MPSS is provided in
Brenner et al. and Reinartz et al. (13, 101).
Serial analysis of gene expression
The SAGE technology is based on the principle that a
10-bp cDNA fragment, the tag, contains sufficient
information to unambiguously identify a transcript,
provided that the tag has been isolated from a defined
position within the transcript. The original SAGE protocol
is based on the construction of SAGE libraries which
contain concatenates of up to 50 short sequence tags, each
representing a single mRNA. Initially, double-stranded
cDNA is synthesised with a biotinylated oligo(dT) primer
and bound to streptavidin-coated beads. The cDNA is
cleaved with a restriction endonuclease, the anchoring
enzyme, revealing specific fragments of the 3’ ends of every
single cDNA. Next, specific linker sequences, which
contain a recognition site of a Type IIS restriction
endonuclease, the tagging enzyme, are ligated to these
fragments. Cleavage with this enzyme results in unique
sequence tags of 10 bp or 11 bp, plus the 4-bp recognition
site of the anchoring enzyme, of every cDNA species. The
tags are concatenated and ligated into a cloning vector to
create a SAGE library. Each sequencing reaction of one of
these cloned concatenates determines the tags of up to
50 mRNA species, accelerating the speed of EST
sequencing up to fifty-fold (128). Sequencing appropriate
numbers of clones from SAGE libraries yields hundreds of
thousands of tags, and tag identification and counting have
identified gene expression profiles of many different cell
types and tissues (74). More recently, SAGE protocols were
modified (i.e. Long SAGE, Robust-Long SAGE) to increase
tag length to 21 bp (37, 107). In contrast to conventional
SAGE, which allows tag assignments to EST collections,
Long SAGE allows unique tag assignments to much more
complex genomic sequences. Long SAGE can therefore be
used to identify novel genes and exons, and thus aid in
genome annotation. Further modifications of the Long
SAGE protocol (i.e. 5’ and 3’ Long SAGE) provide 20 bp
tag pairs from the 5’- and 3’-ends of a transcript, which can
be used to determine transcription unit boundaries (48,
133). Various SAGE protocols for very low amounts of
initial RNA material have also been designed (130).
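The tag definition used in conventional SAGE can be sketched in silico. The example below assumes the recognition sequence CATG of NlaIII as the anchoring-enzyme site, which is a common choice but is not specified in the text above, and it takes the 10 bp immediately downstream of the 3'-most site as the tag. Function names and the example sequence are hypothetical, and poly(A) tails and sequencing errors are ignored.

```python
from collections import Counter

ANCHOR_SITE = "CATG"   # assumed anchoring-enzyme site (NlaIII); an illustration only
TAG_LENGTH = 10        # conventional SAGE tag length in bp


def extract_tag(cdna, site=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag next to the 3'-most anchoring site, or None."""
    position = cdna.rfind(site)          # 3'-most occurrence of the site
    if position == -1:
        return None                      # transcript lacks the site
    start = position + len(site)
    tag = cdna[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def count_tags(cdnas):
    """Count tags over a collection of cDNA sequences (digital expression profile)."""
    tags = (extract_tag(cdna) for cdna in cdnas)
    return Counter(tag for tag in tags if tag is not None)


if __name__ == "__main__":
    example = "AAACATGTTTTCATGACGTACGTACGGGAAAAAA"
    print(extract_tag(example))
    print(count_tags([example, example, "GGGG"]))
    print(4 ** TAG_LENGTH)               # number of possible 10-bp tags
```

A 10-bp tag can take 4^10 = 1,048,576 different values, which is why a tag from a defined position is usually sufficient to identify a transcript, whereas longer tags are needed for unique assignment to genomic sequence.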
Within-class variability, i.e. variability due to intrinsic
biological differences among sampled individuals of the
same class, not simply variability due to technical sampling
error, is an important challenge for transcript counting
methods. Within-class variability must be considered in
appropriate statistical models to identify reliable lists of
candidate transcripts that show differential expression
(129). Assigning experimentally obtained data to a known
transcript is a crucial step in SAGE and MPSS. However,
tag-to-transcript assignment is not a straightforward
process, since alternative tags for a given transcript can also
be obtained experimentally. Therefore, SNP-associated
alternative tags must be taken into account (112). In
contrast to microarrays, the SAGE output format is digital
and therefore suitable for direct comparisons with data
generated by other laboratories. Thus, SAGE should
provide a broadly applicable method for the quantitative
cataloguing and comparison of expressed genes in a variety
of developmental and disease states. The main restriction
of the SAGE method is the complexity of the technique
and the prerequisite that a reasonable number of ESTs or
genes must be known to assign the tags.
Deoxyribonucleic acid microarrays
The basic concept behind all microarrays is the precise
positioning and immobilisation of a defined amount of
gene-specific DNA fragments (probes) at high density on a
solid support. These sequences are then queried by
hybridisation with labelled copies of nucleic acids from
biological samples (targets). The underlying theory is that
the greater the expression of a gene, the greater the amount
of labelled target and, hence, the greater the output signal.
In principle, this approach is the reverse of a classical
Northern blot, with the advantage that, in a single
experiment, the expression level of as many genes as are
arrayed on the solid support can be determined (58, 110).
Microarrays vary principally according to:
a) the solid support used (e.g. filters or glass)
b) the surface modifications with various substrates
c) the type of DNA fragments on the array (e.g. cDNA or oligonucleotides)
d) whether the gene fragments are presynthesised and
deposited or synthesised in situ
e) the machinery used to place the fragments on the array
(e.g. ink jet printing or spotting).
Combinations of these variables are used to generate three
main types of microarray:
– filter arrays
– spotted glass slide arrays
– oligonucleotide arrays synthesised in situ.
Filter arrays and spotted arrays can be produced in
academic facilities or purchased from commercial sources
but high-density oligonucleotide arrays which are
synthesised in situ can only be obtained from commercial
sources (1, 58). An example is the Affymetrix GeneChip®
technology, which uses
combinatorial chemistry to synthesise probes on
Ideally, an array for expression profiling consists of
sequence validated probes, in which each sequence is
unique, shows minimal cross-hybridisation to related
sequences and provides, collectively, a comprehensive
representation of the expressed fraction of the genome,
including splice variants. In gene expression microarrays,
either synthetic oligonucleotides or cDNA fragments are
used as probes. The principal probe sources for cDNA
microarrays are cDNA/EST libraries and known open
reading frames in genomic clones such as bacterial artificial
chromosome (BAC) libraries. Various clone sets are
distributed by and available through licensed vendors,
such as the German Resource Centre for Genome Research
(Deutsches Ressourcenzentrum für Genomforschung,
RZPD; www.rzpd.de/). The DNA is typically prepared for
arraying by high-throughput PCR. Quality control is
crucial for ensuring probe identity and correct assignment
(58). Longer oligonucleotide probes (60 mers to 80 mers)
were reported to provide significantly better detection
sensitivity than shorter probes (25 mers to 30 mers).
However, longer oligonucleotides and cDNA probes are
prone to cross-hybridisations.
hybridisation is highly sequence-dependent and probes
binding to different regions of a gene yield different signal
intensities. Multiple oligonucleotides have been used in
array designs to overcome this problem. Thus,
bioinformatics-based oligonucleotide probe design is
complex and requires optimisation of probe length and
number of probes per gene (15). Microarray probe
specificity in various species, including the chicken and
cow, can be evaluated with ProbeLynx (www.
Filter arrays require tiny amounts of RNA for radioactive
target labelling, use widely available phosphorimager
instrumentation to read and are relatively inexpensive to
produce and use. The radioactive detection method
requires parallel hybridisations to duplicate filters in order
to compare gene expression between samples. Expression
analyses using glass slide microarrays are performed with
fluorescently labelled target samples. An expression analysis
can be performed by competitive hybridisation of two
samples, each labelled with a specific fluorescent dye, such
as Cy3 or Cy5 (two-colour arrays), or by hybridisation of a
single sample (a one-colour array). Experiments with two-
colour arrays may be conducted as individual paired
comparisons or by comparing each sample against all
others. With increasing numbers of samples, the latter
approach becomes impractical, both in terms of the
number of arrays and the amount of RNA required (1, 58).
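Why comparing each sample against all others becomes impractical can be shown with simple arithmetic. The sketch below assumes one two-colour array per pair of samples and no dye-swap replicates; both function names are hypothetical.

```python
def arrays_all_pairs(n_samples):
    """Number of two-colour arrays needed to compare every pair of samples once."""
    return n_samples * (n_samples - 1) // 2


def hybridisations_per_sample(n_samples):
    """Number of arrays on which each individual sample must be hybridised."""
    return n_samples - 1


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, arrays_all_pairs(n), hybridisations_per_sample(n))
```

The number of arrays grows quadratically with the number of samples, and the amount of RNA required from each sample grows linearly, since every sample has to be labelled once for each of its partners.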
Direct sample labelling is inefficient and causes a reduction
in sensitivity, in addition to potential dye bias. Indirect
labelling protocols that incorporate the dye after reverse
transcription have been developed. However, dye bias may
still pose a problem in microarray reference designs (25).
Various amplification and high sensitivity techniques have
been designed to reduce the substantial amounts of RNA
(e.g. 20 µg to 70 µg) that are currently needed in the direct
and indirect labelling procedures (79).
The use of different array platforms, experimental designs,
sample preparation protocols and methods of data analysis
prevents a direct and reliable comparison of microarray
data available in the literature. The Microarray Gene
Expression Data Society, an international organisation of
microarray users and developers, has proposed common
guidelines for designing and reporting microarray
experiments (minimum information about a microarray
experiment or MIAME) as a first step towards standardised
microarray-based gene expression data (12). The two main
types of platforms, radioactive labelling and hybridisation
to a cDNA filter array versus fluorescent dye labelling (one-
colour array) and hybridisation to an oligonucleotide array,
were recently directly compared. This study revealed only
moderate overlap in the results of the two array systems.
Only 64% of the genes represented on both platforms
matched in ‘present’ or ‘absent’ calls (81). Another recent
investigation compared six commercially available, high-
density microarray platform types, ranging from a two-
colour spotted cDNA array to short in situ synthesised
oligonucleotide chips, for variability, sensitivity and
correlation. Using significance analysis of microarrays
(SAM), there were significant differences among platform
types in their ability to detect differential expression
between two very different cell types. Oligonucleotide
platforms performed better than cDNA arrays, irrespective
of whether a one-colour or two-colour approach was used,
and there was a remarkable degree of overlap among the
three oligonucleotide systems (141).
Commercial microarray design usually relies on large
amounts of sequence information derived from GenBank or
other resources. Therefore, these platforms are restricted to
organisms with a reasonable amount of sequence data. A
powerful approach to detect differentially expressed genes in
species with less or even no sequence information is based
on microarrayed suppression subtractive hybridisation (SSH)
cDNA libraries (139). The SSH method is founded on a
suppression PCR effect and combines normalisation and
subtraction in a single procedure. The normalisation step
equalises the abundance of cDNA fragments within the target
sample population, while the subtraction step reduces the
number of sequences that are common to the populations
being compared. This dramatically increases the probability
of obtaining low-abundance differentially expressed cDNA
(100). Owing to enrichment of the transcripts of interest, it
is often sufficient to analyse only a few thousand cDNAs of a
library (Fig. 1). The spotted cDNAs are only sequenced after
the detection of differential expression, and can be identified
and assigned to genes by comparison with the sequence data
in public databases (5, 6).
Fig. 1. General strategy for combined suppression subtractive
hybridisation and microarray gene expression analysis (labelled steps include
spotting of cDNA microarrays, sequencing of differentially expressed cDNAs
and bioinformatics). cDNA: complementary deoxyribonucleic acid
The major prerequisite for a high quality microarray
experiment is an appropriate experimental design which is
well adapted to the properties of the biological system
under investigation. This design should account for
biological as well as technical variation. For some
biological systems, pooling samples may be a reasonable
approach to reduce costs and experimental effort, but
information about biological variation is lost. Ideally,
replicates of individual samples are analysed to provide
sufficient data for an appropriate statistical analysis (17). It
is essential that the experimental design is based on all
available information about the biological system under
investigation, to provide optimal samples with respect to
temporal and spatial variation in gene expression.
Unfortunately, microarray results are affected by data
processing. For example, different methods of data
normalisation can have profound effects on the number of
differentially expressed genes that are identified. A recent
study showed that analysing an Affymetrix GeneChip® data
set with four different normalisation methods caused the
number of genes being detected as differentially expressed
to differ by a factor of about three (55). The semi-
quantitative nature of microarray experiments, and the
variables described above, have raised concerns about the
reproducibility of microarray data. Northern blot or
quantitative reverse transcription-PCR analysis of a limited
number of genes on the microarray is, therefore, typically
performed to confirm observed changes. However,
the arguments for corroborative studies have been
Initial results of microarray experiments are mere lists of
differentially expressed genes or ESTs. For further selection
of interesting genes or gene networks, functional
annotation and classification can be helpful. Therefore,
– try to catalogue genes according to their functions
– perform systematic interaction analyses (Reactome)
– assign genes to metabolic or other cellular pathways
Selected database links are presented in Table I. Additional
important resources (gene indices, metabolic pathways)
are available at www.tigr.org/tdb/tgi/.
Bioinformatics and statistical modelling in
Interpreting the output of large-scale microarray
experiments requires statistical tools. Each cDNA or
oligonucleotide tag (probe) on the array represents a
‘variable’ and each array represents an ‘observation’, where
the number of arrays is much smaller (usually between
5 and 100) than the number of probes (thousands or tens
of thousands). Data analyses include two major steps.
Initially, systematic and random variations in microarray
experiments must be removed so that the data may be
reliably interpreted with statistical methods. This step is
denoted as data normalisation and calibration. The second
step is the statistical analysis itself. This may include:
a) the detection of differentially expressed transcripts or
transcripts that are associated with specific outcomes, such
as survival time or phenotypic values
b) the construction of homogeneous groups of similarly expressed transcripts
c) the identification of the major sources of variation
across the arrays
d) the reconstruction of genetic networks.
The most popular software tool for statistical analysis of
microarray data is ‘R’, which includes many up-to-date
methods. These packages are free and publicly available
from the website of the Bioconductor project
(www.bioconductor.org) or from the ‘R’ homepage
Data normalisation and calibration
The normalisation of high-density oligonucleotide arrays
(e.g. Affymetrix) is a specific issue, which has been
reviewed in Bolstad et al. (9). A popular and recognised
approach for cDNA microarray data normalisation and
calibration is the robust, locally weighted regression
normalisation method (140), which aims to perform
location and scale normalisation within each array as well
as between different arrays. This method adjusts for
differences such as those between print-tip-groups and
spatial effects in each array, imbalances between red and
green intensities and differences between arrays. It is
implemented in the ‘R’ package ‘marray’.
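The idea of within-array location normalisation can be sketched in a few lines of Python. This is an illustrative simplification, not the ‘marray’ implementation: it fits a single-pass local linear regression of the log-ratio M on the mean log-intensity A and subtracts the fitted trend. Print-tip groups, robustness iterations and scale normalisation between arrays are omitted, and the function names and the default span of 0.5 are assumptions made for this example.

```python
import math

def tricube(u):
    """Tricube kernel: weight falls smoothly from 1 at u = 0 to 0 at u >= 1."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def local_fit(a_values, m_values, a0, span):
    """Weighted straight-line fit of M on A, evaluated at intensity a0."""
    k = min(len(a_values), max(2, math.ceil(span * len(a_values))))
    distances = sorted(abs(a - a0) for a in a_values)
    bandwidth = distances[k - 1] * 1.0001 + 1e-12  # just beyond the k-th nearest spot
    w = [tricube(abs(a - a0) / bandwidth) for a in a_values]
    sw = sum(w)
    mean_a = sum(wi * a for wi, a in zip(w, a_values)) / sw
    mean_m = sum(wi * m for wi, m in zip(w, m_values)) / sw
    sxx = sum(wi * (a - mean_a) ** 2 for wi, a in zip(w, a_values))
    sxy = sum(wi * (a - mean_a) * (m - mean_m)
              for wi, a, m in zip(w, a_values, m_values))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_m + slope * (a0 - mean_a)

def normalise_log_ratios(red, green, span=0.5):
    """Return log2(red/green) per spot after removing the intensity-dependent trend."""
    m = [math.log2(r / g) for r, g in zip(red, green)]        # log-ratio M
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]  # mean log-intensity A
    return [mi - local_fit(a, m, ai, span) for mi, ai in zip(m, a)]
```

When the red channel carries a dye bias that grows with intensity, the fitted trend absorbs that bias and the normalised log-ratios of unchanged transcripts return to approximately zero.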
Another method, designed for cDNA arrays as well as for
oligonucleotide arrays, is ‘variance stabilisation’ (60). It
transforms the data so that the variance of the intensities
for each probe across the arrays is independent of its mean,
which makes the detection of differential expression more
reliable. The method is implemented in the ‘R’ package
‘vsn’. In the rest of this section, the authors assume that the
data have been normalised.
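The transformation behind variance stabilisation can be illustrated with a short Python sketch. It applies an inverse hyperbolic sine (arsinh) transformation, which behaves like a logarithm at high intensities and remains approximately linear, and defined, near zero. The offset and scale parameters shown here are hypothetical placeholders; in practice they are estimated from the data for each array, a step that this sketch does not attempt.

```python
import math

def stabilise(intensity, offset=0.0, scale=1.0):
    """Arsinh transform of a raw intensity.

    offset and scale are placeholder calibration parameters (assumed here,
    estimated from the data in a real analysis).
    """
    return math.asinh(offset + scale * intensity)

# At high intensity the transform approaches ln(2x), so differences between
# transformed values behave like log-ratios.
high = stabilise(1.0e6)

# Near zero, and for negative background-corrected values, the transform is
# still defined, unlike the logarithm.
low = stabilise(-5.0)
```

The practical consequence is that low-intensity spots are not given the inflated variance that a plain log transformation would produce.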
Detecting differential expression
The identification of transcripts that are differentially
expressed in two distinct groups (e.g. ‘disease’ and ‘no
disease’) is of major interest. Simply computing the
t-statistic and its associated p-value for each gene is
inadequate because the very high number of investigated
transcripts (probes) requires adjustment for multiple
testing, and adjusted p-values are not significant in most
practical applications. The recent method SAM (123)
computes a measure of differential expression for each
transcript and produces a list of significantly differentially
expressed genes, including an estimated false discovery
rate. This method is known to identify differentially
expressed transcripts more accurately than classical
approaches, e.g. p-values of the t-statistic. The SAM
method is implemented in the ‘R’ package, ‘siggenes’.
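The difference between a classical t-statistic and a SAM-type statistic can be shown with a minimal Python sketch. The SAM-type ‘relative difference’ adds a small positive constant s0 to the denominator, so that transcripts with very small variance do not obtain extreme scores by chance. The value of s0 below is an arbitrary assumption for illustration, and the permutation step that SAM uses to estimate the false discovery rate is not shown.

```python
import math
from statistics import mean

def pooled_standard_error(x, y):
    """Standard error of the difference in means, using the pooled variance."""
    n1, n2 = len(x), len(y)
    mx, my = mean(x), mean(y)
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    return math.sqrt((1.0 / n1 + 1.0 / n2) * ss / (n1 + n2 - 2))

def t_statistic(x, y):
    """Classical two-sample t-statistic with pooled variance."""
    return (mean(x) - mean(y)) / pooled_standard_error(x, y)

def relative_difference(x, y, s0):
    """SAM-type statistic: s0 damps scores driven by tiny variances."""
    return (mean(x) - mean(y)) / (pooled_standard_error(x, y) + s0)

# Two hypothetical transcripts measured on three 'disease' and three
# 'no disease' arrays (log scale).
quiet = ([5.00, 5.01, 5.02], [5.03, 5.04, 5.05])   # tiny shift, tiny variance
strong = ([4.0, 5.0, 6.0], [8.0, 9.0, 10.0])       # large shift

t_quiet, t_strong = t_statistic(*quiet), t_statistic(*strong)
d_quiet = relative_difference(*quiet, s0=0.2)
d_strong = relative_difference(*strong, s0=0.2)
```

Both transcripts have t-statistics of similar magnitude, but only the second retains a large score once s0 is added, which is the behaviour that makes the resulting gene list more robust.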
Table I
Functional analyses: selected database links
National Center for Biotechnology Information (NCBI) UniGene: UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location
Gene annotation databases
Entrez Gene: Gene provides a unified query environment for genes defined by sequence and/or in the map viewer of NCBI. Information can be found on names, symbols, publications, Gene Ontology terms, chromosome numbers and many other attributes associated with genes and the products they encode
Gene Ontology Consortium (geneontology.org/): The gene ontology project is a collaborative effort to address the need for consistent descriptions of gene products in different databases
H-invitational database (H-Inv DB) (www.h-invitational.jp/): H-InvDB is a human gene database, with integrative annotation of 41,118 full-length complementary deoxyribonucleic acid (cDNA) clones currently available from six high-throughput cDNA sequencing projects. This database represents 21,037 cDNA clusters, describing their gene structures, functions, novel alternative splicing isoforms, non-coding functional ribonucleic acids, functional domains, sub-cellular localisations, mapping of single nucleotide polymorphisms and microsatellite repeat motifs in relation to orphan diseases, gene expression profiling and comparative results with mouse full-length cDNAs in the context of molecular evolution
Gene expression databases
Gene expression omnibus (GEO): The GEO is a high-throughput gene expression/molecular abundance data repository, as well as a curated, online resource for gene expression data browsing and query
Serial analysis of gene expression (SAGE): SAGE tag to gene mapping
Pathway resource list (PRL): The PRL is a new database that contains information on 156 internet pathway resources. Most of these resources are databases themselves, containing such things as protein-protein interactions or metabolic reactions. In its present form, PRL only provides links to these resources. In the future, PRL will provide additional information such as the amount of data and the organism coverage in each pathway resource
Kyoto encyclopaedia of genes and genomes pathway database: Current knowledge on molecular interaction networks, including metabolic pathways, regulatory pathways and molecular complexes
MetaCyc pathway database: MetaCyc is a database of non-redundant, experimentally elucidated metabolic pathways from more than 240 different organisms. MetaCyc is curated from the scientific literature
Signaling pathway database (SPAD): The SPAD database is an integrated database for genetic information and signal transduction
Kinase pathway database: The Kinase pathway database is an integrated database on completed sequenced major eukaryotes, which contains the classification of protein kinases and their functional conservation and orthologous tables among species, protein-protein interaction data, domain information, structural information and provides graphic pathway images
Reactome, a knowledge base of biological processes: The Reactome project is a collaboration among Cold Spring Harbor Laboratory, the European Bioinformatics Institute and the Gene Ontology Consortium to develop a curated resource of core pathways and reactions in human biology
Cytoscape: Cytoscape is an open source bioinformatics software platform for visualising molecular interaction networks and integrating these interactions with gene expression profiles and other state data
Differential expression may also be examined in the
context of survival analysis or clinical studies to detect
transcripts whose expression levels are associated with
survival time or any numeric clinical outcome. The
statistical methods employed are essentially similar to
those used in the two-group context, except for the
definition of the statistic. For instance, one might use
Pearson’s correlation coefficient instead of the t-statistic.
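For a numeric outcome, the substitution of a correlation coefficient for the t-statistic can be sketched as follows. The Python example ranks probes by the absolute value of Pearson's correlation between expression and outcome. The probe names and values are invented for illustration, and censored survival times would require dedicated survival methods that are not covered by this sketch.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_by_association(expression, outcome):
    """Sort probes by |correlation| with the outcome, strongest first."""
    scores = {probe: pearson(values, outcome)
              for probe, values in expression.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical data: three probes measured on five arrays, one numeric
# clinical outcome per array.
expression = {
    "probe_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "probe_b": [2.0, 1.0, 2.0, 1.0, 2.0],
    "probe_c": [5.0, 4.0, 3.0, 2.0, 1.0],
}
outcome = [10.0, 20.0, 30.0, 40.0, 50.0]
ranking = rank_by_association(expression, outcome)
```

As with the two-group case, the resulting scores still need adjustment for multiple testing before any probe is declared significant.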
Multivariate statistical analysis
Apart from differential expression, other interesting
questions can be addressed using multivariate statistical
methods (117). These methods are difficult to apply but it
is possible to use cluster analyses to look for groups of
similarly expressed transcripts that are as homogeneous as
possible but as different as possible from the other groups.
There are numerous clustering approaches, such as:
– partitioning around medoids
– hierarchical clustering.
Most of these methods are implemented in the ‘R’ package,
‘cluster’. The resulting clusters can often be represented
graphically, for instance, using a dimension reduction
method. However, clustering methods always form a
partition into clusters, even if the data do not have any
underlying cluster structure. Thus, clustering methods
should be used as data mining tools that may, for instance,
support empirical assumptions but should not be
interpreted as ‘statistical evidence’.
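The principle of partitioning around medoids can be sketched in Python as an alternating procedure: each profile is assigned to its nearest medoid, and each medoid is then replaced by the cluster member with the smallest total distance to the other members. This is a simplified variant written for illustration, not the algorithm of the ‘cluster’ package; the farthest-first choice of starting medoids and all function names are assumptions of this example.

```python
import math

def farthest_first(profiles, k):
    """Starting medoids: profile 0, then repeatedly the profile farthest from those chosen."""
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(profiles)),
            key=lambda i: min(math.dist(profiles[i], profiles[m]) for m in medoids)))
    return medoids

def assign(profiles, medoids):
    """Index of the nearest medoid for every profile."""
    return [min(range(len(medoids)),
                key=lambda c: math.dist(p, profiles[medoids[c]]))
            for p in profiles]

def k_medoids(profiles, k, max_iter=50):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = farthest_first(profiles, k)
    for _ in range(max_iter):
        labels = assign(profiles, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c] or [medoids[c]]
            updated.append(min(
                members,
                key=lambda i: sum(math.dist(profiles[i], profiles[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(profiles, medoids)

# Hypothetical expression profiles of six transcripts across two conditions.
profiles = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
            [5.0, 5.1], [5.1, 5.0], [5.2, 5.1]]
medoids, labels = k_medoids(profiles, k=2)
```

Note that the function returns k clusters for any input, including data without structure, which is exactly why the output should be read as an exploratory summary.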
Another topic of interest is classification, where gene
expression data are used to assign a sample to one of two
(or several) previously identified groups (e.g. ‘disease’ and
‘no disease’). Classification methods allow the determination
of diagnostic rules based on the gene expression data, so
that an analysed target sample can be assigned to one of the
groups, provided that it is normalised by the same method
employed for the other arrays. A comparison study of
effective classical classification methods can be found in
Dudoit et al. (28). However, more recent methods, such as
‘prediction analysis for microarrays’ (120), which is
implemented in the ‘R’ package ‘pamr’, and ‘partial least
squares’ (11), seem to perform better because they are
especially appropriate for high-dimensional data.
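A diagnostic rule of this kind can be illustrated with a plain nearest-centroid classifier in Python. This is a deliberately simplified relative of centroid-based methods and omits the shrinkage of centroids that makes such methods suitable for thousands of probes; it is not the ‘pamr’ procedure. The training values and group labels are invented for the example.

```python
def class_centroids(samples, labels):
    """Mean expression profile of each group in the training set."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return {label: [sum(column) / len(rows) for column in zip(*rows)]
            for label, rows in groups.items()}

def classify(sample, centroids):
    """Assign the sample to the group with the nearest centroid (squared Euclidean distance)."""
    return min(centroids,
               key=lambda label: sum((a - b) ** 2
                                     for a, b in zip(sample, centroids[label])))

# Hypothetical training arrays with two probes each, normalised in the same way.
training = [[1.0, 5.0], [1.2, 4.8], [4.0, 1.0], [4.2, 1.1]]
groups = ["no disease", "no disease", "disease", "disease"]
centroids = class_centroids(training, groups)

prediction = classify([3.9, 1.3], centroids)
```

The rule is only meaningful if the new sample has been normalised by the same method as the training arrays, as stated above.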
Dimension reduction is a family of statistical methods for
high-dimensional data. The most widely used dimension
reduction method is principal component analysis (PCA).
It may be used to visualise microarray data graphically in
two or three dimensions. This approach identifies the major
sources of variation in the data. PCA is implemented in ‘R’
in the function ‘prcomp’.
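The computation underlying PCA can be sketched in Python for the first component only. The example centres the data, forms the covariance matrix between probes and finds its leading eigenvector by power iteration. This is an illustration of the principle under simplifying assumptions, not a substitute for ‘prcomp’: a probe-by-probe covariance matrix is impractical for thousands of probes, and the fixed starting vector would fail in the unlikely case that it is orthogonal to the first component.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, variance explained, array scores) for the first component.

    rows: one list of probe intensities per array (observation).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p  # starting vector (an arbitrary choice)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                   for i in range(p))
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, variance, scores

# Four hypothetical arrays, three probes; the third probe does not vary.
arrays = [[1.0, 2.0, 0.5], [2.0, 4.0, 0.5], [3.0, 6.0, 0.5], [4.0, 8.0, 0.5]]
direction, variance, scores = first_principal_component(arrays)
```

Plotting the scores of the first two or three components against each other gives the low-dimensional view of the arrays described above.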
Another important topic is the search for interaction
structures between genes. Although data collected at the
transcription level miss much potentially useful
information, a few data mining approaches deal with the
reconstruction of genetic networks using microarray data
through graphical models. A recent overview of this topic
is presented in Friedman (34). The ‘R’ package ‘GeneTS’
(109) uses a recent statistical method based on Gaussian
graphical models to detect interaction structures in the
form of statistically significant partial correlations between
genes, then produces genetic networks.
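The role of partial correlations can be illustrated for three transcripts with the textbook first-order formula. In the Python sketch below, two hypothetical genes are both driven by a common regulator: their ordinary correlation is high, but it largely disappears once the regulator is accounted for. Methods for thousands of genes estimate the full partial correlation matrix with specialised procedures that are not reproduced here, and the data are invented for the example.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical expression levels: both genes follow the regulator plus
# independent fluctuations.
regulator = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise_a = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
noise_b = [0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5]
gene_a = [r + e for r, e in zip(regulator, noise_a)]
gene_b = [r + e for r, e in zip(regulator, noise_b)]

marginal = pearson(gene_a, gene_b)
conditional = partial_correlation(gene_a, gene_b, regulator)
```

In a graphical model, an edge between the two genes would be drawn only if the partial correlation, not the ordinary correlation, were significantly different from zero.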
The transcriptome is a representation or ‘snapshot’ of the
complete set of RNA transcripts of tissues or cells, formed
by the dynamic equilibrium between DNA transcription
and mRNA degradation. The proteome is the complete set
of proteins in a given tissue, cell or biological fluid at a
given point in time. It is the result of three major
post-transcriptional processes:
a) protein synthesis
b) protein secretion
c) protein degradation.
Moreover, various proteins undergo post-translational
modification (PTM), by protein-protein interaction, which
is crucial for their activity and stability. As a consequence,
quantitative predictions of protein populations based on
transcriptome data are frequently insufficient (75).
Investigations of the proteome are therefore
complementary to transcriptomics and an essential
component in describing biochemical pathways and
protein functions. The identification and quantification of
proteins and their PTMs in a global and holistic approach
is referred to as ‘proteomics’. In principle, proteome
studies have two main objectives:
a) to obtain an inventory of the different proteins and their
modifications within a defined biological state
b) to compare the proteomes derived from different
biological states, e.g. normal in comparison to pathological
cells or treated in comparison to non-treated cells.
A major challenge is the complexity of the protein mixtures
that must be analysed. In humans, for example, with
approximately 30,000 genes, 1.8 million different proteins
and protein isoforms are expected (62). In addition,
proteins show an extraordinary variation in concentration,
which complicates the detection of low-abundance
proteins. The abundance of albumin in blood plasma, for
example, differs by more than 10 orders of magnitude from
the concentration of low-abundance proteins, such as
interleukins (2). For this reason, proteomes are usually
divided into several fractions using pre-fractionation
procedures, such as subcellular fractionation and
chromatographic and/or electrophoretic separation (132).
Each fraction is then investigated separately and the
resulting data are recombined to obtain a more global view
of the proteome under study.
Two-dimensional polyacrylamide gel electrophoresis
The most frequently applied method to separate the
proteins of a proteome or a proteome fraction is
two-dimensional polyacrylamide gel electrophoresis
(2D-PAGE), initially described by O’Farrell and Klose in
1975 (67, 88). The procedure consists of two
electrophoresis steps with different separation criteria. In
the first dimension, proteins are separated according to
their isoelectric point. In the second dimension, they are
separated according to their molecular weight. Following
protein separation, gel-staining techniques (92) generate a
spot pattern (Fig. 2). Ideally, each spot on a 2D gel
represents a different protein or a distinct modification
state of a single protein. Quantitative and qualitative
differences in the protein patterns of samples thus result in
quantitative or qualitative alterations in the spot patterns.
The 2D-PAGE method is powerful enough to separate and
resolve thousands of proteins on a single gel (27). Typically,
several gels are matched and compared using 2D-gel
analysis software packages. Differing spots are then excised
and identified by protein identification techniques (36),
such as MS (Fig. 3). Although 2D-PAGE was developed
almost thirty years ago, it remains a ‘state-of-the-art’
technique and is amenable to further development. An
important enhancement of this technology is the difference
gel electrophoresis (DIGE) technique (122, 127), made
available by GE Healthcare. In principle, DIGE consists of:
a) derivatising two different samples with two different
fluorophores
b) combining the two samples
c) running the two samples on a single 2D gel (Fig. 4).
As the label does not affect the migration properties of
proteins in the gel, the proteins of both samples co-migrate.
Protein detection is performed on a dual laser-scanning
device with different excitation/emission filters.
The images are then matched by a computer-assisted
overlay technique, normalising and quantifying the signals.
Differences in protein expression are identified by
evaluating a pseudo-coloured image and data spreadsheet.
To increase the accuracy of the results, internal
standardisation can be achieved by labelling the standard
with a third fluorophore.
Fig. 2. Example of a silver-stained, two-dimensional gel of 100 µg bovine oviduct epithelial proteins
Fig. 3. General strategy of two-dimensional-gel-based proteomics (cells of state 1 and state 2; solubilisation of proteins; isolation of spots of interest and enzymatic cleavage of the proteins; identification of proteins with mass spectrometry)
Techniques based on mass spectrometry
Despite its enhancements, 2D-gel-based proteomics is still
time-consuming and expensive. For a single proteome
analysis, including pre-fractionation procedures, hundreds
of gels have to be prepared. Gel-free MS protein
quantification techniques have the following advantages:
a) they are faster
b) they can identify and quantify the proteins in
a single run
c) they can be automated easily.
However, MS signals are not quantitative per se, since the
signal strength depends on the sequence of the measured
peptide. The ‘stable isotopic labelling’ (63) technique
allows multiplexed liquid chromatography (LC)
experiments, coupled with MS (LC-MS). In principle,
stable isotope labelling experiments consist of four
sequential steps. First, the protein mixture of sample A is
labelled with an isotopically light form of the labelling
reagent (e.g. reagent with 12C atoms). The protein mixture
of sample B is then derivatised in a similar manner with the
isotopically heavy reagent (e.g. reagent with 13C atoms).
Following derivatisation, the two samples are combined
and enzymatically digested to generate peptide fragments.
Finally, the labelled peptides are analysed by LC-MS
(Fig. 5). As the isotope pattern (e.g. 12C versus 13C) does not
affect their chromatographic characteristics, the differently
labelled peptides co-elute and reach the mass spectrometer
at the same time. Relative quantification is performed by a
comparison of the intensities of corresponding MS-signals
of the light and heavy versions of the labelled peptides.
Observed peak ratios for isotopic analogues are highly
accurate, because there are no chemical differences
between the species, and they are analysed in the same
experiment. More recently, numerous variations of this
approach have been developed (41). These include
isotope-coded affinity tag (ICAT) and cleavable ICAT that
employ modified biotin (2H versus 1H or 13C versus 12C), and
stable isotope labelling by amino acids in cell culture,
which uses labelled amino acids in cell culture media (41,
44, 89). A large number of studies based on stable isotope
labelling have demonstrated the feasibility and efficiency of
this approach.
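The relative quantification step can be sketched in Python as pair finding followed by a ratio calculation. The example searches a peak list for pairs separated by the mass difference between the heavy and light label and reports the heavy-to-light intensity ratio. It assumes singly labelled peptides and peaks already converted to neutral masses; the mass shift of 9.03 Da, the tolerance and the peak list are illustrative assumptions rather than values taken from the studies cited above.

```python
def pair_and_quantify(peaks, mass_shift, tolerance=0.01):
    """Find light/heavy peak pairs and return (light mass, heavy/light ratio).

    peaks: list of (mass, intensity) tuples from one LC-MS spectrum.
    mass_shift: mass difference between heavy and light label (assumed known).
    """
    ratios = []
    for light_mass, light_intensity in peaks:
        for heavy_mass, heavy_intensity in peaks:
            if abs(heavy_mass - light_mass - mass_shift) <= tolerance:
                ratios.append((light_mass, heavy_intensity / light_intensity))
    return ratios

# Hypothetical peak list: two labelled pairs and one unpaired peak.
peaks = [
    (1000.50, 2000.0), (1009.53, 4000.0),   # peptide more abundant in sample B
    (1500.75, 900.0), (1509.78, 300.0),     # peptide less abundant in sample B
    (1800.00, 500.0),                       # no partner: not quantified
]
ratios = pair_and_quantify(peaks, mass_shift=9.03)
```

Because both members of a pair co-elute and are measured in the same spectrum, the ratio is largely independent of the sequence-dependent signal strength mentioned above.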
Fig. 4. General overview of the two-dimensional difference gel electrophoresis
2D-PAGE: two-dimensional polyacrylamide gel electrophoresis
Fig. 5. General strategy of gel-free proteomics using stable isotope labelling (proteins of state 1 and state 2; stable isotope coded label; pair finding and quantitation)
LC-MS: liquid chromatography mass spectrometry
m/z: mass-to-charge ratio
As in transcriptomics, chip-based tools have the potential
to revolutionise the field of proteomics. An antibody-
microarray consists of a set of antibodies spotted
robotically on appropriate solid supports (42). To measure
protein levels, the labelled sample (e.g. with a fluorescent
label) is incubated on the surface of the microarray. Then
the proteins bound to the antibody probes are quantified
using fluorescence array-scanning systems (Fig. 6). The
potential of this technique has been demonstrated with
antibody-arrays targeting very low-abundance proteins,
such as cytokines, which are hardly detectable with
standard proteomics techniques (59). A major challenge is
the production of the thousands of antibodies that are
needed to perform holistic proteome-wide analyses. The
antibodies for array-based experiments must be highly
specific and sensitive, even when similar proteins are
present in a high concentration. For this reason, projects
such as the antibody initiatives of the Human Proteome
Organisation have recently begun to generate high-quality
antibodies systematically against every human
protein (83). Experience gathered with these antibody
initiatives will help to establish highly effective array-based
protein detection techniques for other species, including
farm animals.
Bioinformatics in proteome analyses
A variety of specialised bioinformatics tools are integral
parts of the overall workflow in proteome analyses. The
frequently employed 2D-PAGE strategy requires software
tools for spot detection and quantification on large gel sets.
Software packages are also available to match and compare
corresponding spots on different gels (27). However, these
tools are not fully automated and results often have to be
corrected manually by spot editing and matching. State-of-
the-art LC-MS setups are able to generate up to five spectra
per second, leading to thousands of spectra in a single
experiment. Protein identification is based on comparisons
between experimental MS spectra and in silico-generated
MS spectra of known proteins. Several appropriate search
algorithms (31) have been developed for this purpose. The
huge number of database searches that must be performed
may nevertheless constitute a problem and the most
sophisticated computing facilities are necessary to
overcome this. Thus, computers such as the Proteome
Analysis Under Linux Architecture cluster at the Medical
Proteome Center in Bochum, Germany, with its ability to
search a single spectrum against a protein database with
one million entries in 150 milliseconds, are a prerequisite
for high-throughput analyses. Managing the large data sets
generated in proteome projects is much more difficult
than managing those in transcriptome projects. In
a typical proteome project, samples are split and processed
with completely different approaches to reduce
their complexity, and then investigated in individual
experiments. All the diverse results must be stored in a
manner that allows statistical evaluations and data mining.
Finally, the data sets must be reassembled to provide an
overview of whole proteomes (33).
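The comparison of measured masses with masses generated in silico, described above for protein identification, can be sketched in Python. The example digests each database sequence with the usual trypsin rule (cleavage after K or R, but not before P), computes monoisotopic peptide masses and counts how many observed masses each protein explains. This is a simplified peptide-mass-matching illustration, not one of the search algorithms cited above, which score fragment spectra; the protein names, sequences and tolerance are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565

def tryptic_peptides(sequence):
    """In silico digest: cleave after K or R unless followed by P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER

def count_matches(sequence, observed_masses, tolerance=0.02):
    """Number of observed masses explained by the protein's tryptic peptides."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(obs - theo) <= tolerance for theo in theoretical)
               for obs in observed_masses)

def identify(observed_masses, database, tolerance=0.02):
    """Name of the database entry that explains the most observed masses."""
    return max(database,
               key=lambda name: count_matches(database[name], observed_masses, tolerance))

# Hypothetical two-entry database and a simulated measurement.
database = {"protein_a": "MKWVTFISLLR", "protein_b": "GASPVTCK"}
observed = [peptide_mass("MK"), peptide_mass("WVTFISLLR")]
best = identify(observed, database)
```

Every spectrum requires a scan of this kind across the whole database, which is why search time per spectrum becomes the limiting factor in high-throughput work.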
A proteomics standards initiative (PSI) was founded in
2002 to define community standards for data
representation in proteomics and aid in data comparison,
exchange and verification. This initiative includes
guidelines for the following:
– the minimum information about a proteomics
experiment
– data exchange formats (PSI mark-up language [PSI-ML])
– formats for the presentation of MS data (PSI-MS)
– ontology guidelines (PSI-ONT) (50, 90).
Commercial suppliers are now providing integrated
software platforms, based on relational databases, that
should finally be able to perform the following:
– track samples in the proteomics workflow
– control equipment, such as liquid-handling robots and
mass spectrometers, in a sample-dependent way
– collect, process and store obtained data in central
databases.
Fig. 6. Antibody-array-based quantitative detection of proteins
Strategies for defining gene functions in vivo
Depending on the traits of interest, the mouse is not always
an appropriate model to clarify gene function in livestock.
However, the functional analysis of genes in farm animals
in vivo is a major challenge, since target gene validation
through traditional methods, such as transgenesis by DNA
micro-injection into zygotes, is costly and cumbersome
(137). Recent progress in farm animal transgenic
technology provides a more efficient basis for gene
function studies. Somatic cell nuclear transfer (SCNT) has
been established for several species and, in combination
with random or targeted genetic alteration of nuclear
donor cells, enables functional studies (23). Efficient
generation of transgenic livestock using lentiviral vectors
(56, 57) is another important step towards studying gene
function in these species. Lentiviral vectors can be used for
overexpression and also for gene knock-down studies,
using small interfering RNAs (121). The current status of
transgenic technology in livestock species is reviewed by
Niemann et al. in this issue.
Ribonucleic acid interference
Ribonucleic acid interference (RNAi) holds considerable
promise as a functional genomics tool to study and define
gene function in vivo by gene knock down. This approach
is based on the discovery that double-stranded RNA
(dsRNA) is an important regulator of gene expression in a
wide range of eukaryotes. It triggers different types of gene
silencing that are collectively referred to as RNA silencing
or RNA interference. A central step is the processing of
dsRNAs into short RNA duplexes of 21 to 23 nucleotides
in length, which then guide the recognition and ultimately
the cleavage or translational repression of complementary
single-stranded RNAs, such as messenger RNAs. In
Caenorhabditis elegans (a soil nematode, used extensively in
genetic studies) and mammals, two types of naturally
occurring small RNAs have been described:
– short interfering RNAs (siRNAs)
– microRNAs (miRNAs).
Long dsRNA and miRNA precursors are processed to
siRNA and miRNA duplexes by the RNase-III-like enzyme
Dicer. The short dsRNAs are subsequently unwound and
assembled into effector complexes: RNA-induced silencing
complex (RISC) and miRNA-containing ribonucleoprotein
particles (miRNPs). The RISC mediates mRNA-target
degradation while miRNPs guide translational repression
of target mRNAs (10, 82).
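The recognition of a complementary messenger RNA by a short guide strand can be illustrated with a small Python sketch. It computes the reverse complement of a guide RNA and searches an mRNA for perfectly complementary sites, which corresponds to the siRNA-type cleavage case. Partial complementarity is not modelled, and the sequences are invented for the example.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def find_target_sites(guide, mrna):
    """Start positions (0-based) in the mRNA that are fully complementary to the guide."""
    target = reverse_complement(guide)
    sites, position = [], mrna.find(target)
    while position != -1:
        sites.append(position)
        position = mrna.find(target, position + 1)
    return sites

# Hypothetical 21-nucleotide target site embedded in a short mRNA.
site = "GCUAGCUAGGAUCCAUGCAUG"
mrna = "AUGGCC" + site + "UUAACG"
guide = reverse_complement(site)   # guide strand designed against the site

sites = find_target_sites(guide, mrna)
```

Running the same search against all known transcripts of a species is one simple way to check a candidate siRNA for unintended perfect matches before an experiment.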
Combined with appropriate constructs and transgenic
technology, RNAi has the potential to create animals with
inducible, tissue-specific silencing of almost any gene.
However, RNAi can also be applied directly, without
transgenic approaches, if a suitable system for delivery is
available. Applying RNAi to C. elegans is relatively easy and
can be accomplished by soaking the worms in appropriate
dsRNA solutions (68). Numerous transfection reagents are
commercially available for work with cell lines, but these
do not work in vivo. Thus, the main obstacle to achieving
in vivo gene knock down by RNAi technologies in farm
animals is delivery. Ova and pre-implantation embryos are
therefore particularly amenable to RNAi through micro-
injection or electroporation (45, 93). However, post-natal
tissues have also been targeted by intravenous injection of
siRNA in physiological solution (e.g. saline). This approach
has been used to deliver siRNAs to highly vascularised
mouse tissues, such as the liver and muscle. Knock down
of genes targeted in this manner is transient, but may last
longer than a week with a reduction in gene expression of
30% to 60%. In vivo gene knock down has also been
reported after local, direct administration of siRNAs to
sequestered anatomical sites, demonstrating the potential
for delivery to organs such as the eyes, lungs and central
nervous system. More recently, chemically modified siRNA
was employed to induce massive knock down of
apolipoprotein B mRNA in the liver and jejunum, which
caused significantly decreased plasma levels of
apolipoprotein B and a reduction in total cholesterol in
mice (116). This study further demonstrates the power of
RNAi as a functional genomics tool for target validation
and the study of gene function in vivo. However, care must
be taken in siRNA selection and experimental design,
including appropriate controls, to ensure potency and
specificity in gene knock-down experiments (43). There
are several publications which give practical advice on the
experimental application of RNAi in birds and mammals
(38, 99, 126).
Structural and functional genomics resources
Farm animal resources
Substantial structural genomic resources are already
available for several economically important farm animal
species, such as the following:
– cattle (B. taurus/B. indicus)
– horses (Equus caballus)
– sheep (Ovis aries)
– pigs (S. scrofa)
– chickens (G. gallus)
– channel catfish (Ictalurus punctatus)
– rainbow trout (Oncorhynchus mykiss)
– salmon (S. salar)
– honey bees (A. mellifera).
Moreover, resources for other species are emerging.
Structural genomic resources include the following:
– genetic maps (14, 16, 18, 61, 85, 86, 105, 131)
– physical maps (14, 16, 18, 74, 107, 137)
– EST collections (14, 16, 18, 20, 102, 103, 105, 125)
– SNP collections (14, 49, 125)
– large-insert (e.g. BAC) libraries (16, 18, 73, 96, 105).
For some of these species, the first draft genome sequence
assemblies (see ‘Introduction’, above) are available. Gene
indices have also been constructed or are under
construction for various species (14, 102, 105, 113, 114,
115). Comparative maps were developed concurrently
with genetic maps and are continuously being refined to
link farm animal genome information with the more
advanced structural and functional genomic data of other
species, such as humans and mice (14, 16, 18, 29, 64, 73,
78, 105). Recently developed comparative mapping tools
(47, 64, 78) enable rapid in silico mapping and the
discovery of positional candidate genes in QTL regions.
Microarrays based on cDNA collections have been
developed and tested for various species, including the
honey bee (135), cattle (118) and salmonids (103), and the
first high-density bovine GeneChip® has recently become
commercially available through Affymetrix. Database links
for farm animal structural genomics resources are listed in
Andersson and Georges (3). General functional genomics
resources have already been listed above. Additional
resources for farm animal species are available at the
following websites:
– pbil.univ-lyon1.fr/software/identitag/ (Identitag is a
relational database for SAGE tag identification and
interspecies comparison of SAGE libraries)
– http://titan.biotec.uiuc.edu/compass (for multi-species
comparative mapping in silico, using Compass)
– http://pede.dann.affrc.go.jp/ (Pig EST Data Explorer;
immunologically relevant bovine genes)
– www.livestockgenomics.csiro.au/ibiss/ (an interactive
bovine in silico SNP database).
Microparasite and macroparasite resources
Various parasite genome projects, which aim to collect
complete genome sequences and/or EST resources, have
been initiated, completed or are about to be completed in
the near future. Microparasite projects include a wide
range of viruses, protozoa and bacteria. Examples are
members of the genus Parapoxvirus (22), various species of
Trypanosoma (21) and strains of Listeria monocytogenes (26).
Projects for metazoan parasites include nematodes,
cestodes and fluke, such as Haemonchus contortus,
Echinococcus spp. and Fasciola hepatica (51, 91). The
available resources range from network sites and project
pages at sequencing institutes to databases that integrate
and curate sequence data and associated annotation with
diverse biological data sets. The scope and tools of major
databases (e.g. www.genedb.org/)
have recently been reviewed and a list of sequencing
projects was provided by Hertz-Fowler and Hall (51).
NEMBASE (www.nematodes.org/) is an interesting database,
and valuable links are listed at parasite-genome
The C. elegans genome project (e.g. www.ncbi.nlm.nih.gov/,
www.tigr.org/) has provided a wealth of structural and
functional data that form the basis for work on many
Examples of functional
genomics in farm animal health
Expression profiling is perhaps the most promising
application of functional genomics in farm animal health
and welfare. The expression of thousands of genes in a
given tissue or cell type is measured simultaneously in two
or more biological conditions, such as ‘infected’ and ‘non-
infected’, and compared to identify differentially expressed
transcripts. In a ‘proof of principle’ experiment in
chickens, a limited microarray with only 1,200 genes was
able to reproducibly detect gene expression differences in
peripheral blood lymphocytes of two chicken lines, one
susceptible to Marek’s disease and the other resistant, that
were compared before and after infection. These microarray data
were found to be consistent with previous data in the
literature concerning gene induction and immune
response. Moreover, one of the genes with differential
expression was known to confer resistance to Marek’s
disease, while another gene provided a prime positional
QTL candidate (77). Another experiment used the SAGE
approach for profiling gene expression in peripheral blood
mononuclear cells of a trypanotolerant N’Dama cow,
before and after experimental infection with Trypanosoma
congolense. The identification of more than 180
differentially expressed genes and ESTs involved in
trypanotolerance will allow the establishment of specific
microarray sets for further metabolic and pharmacological
studies and the design of marker sets for marker-assisted
introgression programmes (8). In livestock species such as
cattle, monozygotic twins provide a valuable resource for
this type of experiment, especially when studying fertility
Other examples of QTL projects that will benefit from a
molecular phenotype are mastitis resistance (106) and
studies aimed at identifying the polled gene in cattle (35).
As with any functional genomics analysis, temporal and
spatial target tissue or cell population definition is vital for
success. Laser capture microdissection can obtain relevant
cell populations precisely (7). The recently developed
‘genetical genomics’ approach to QTL mapping combines
classical QTL mapping with molecular phenotypes defined
by expression profiles. This approach promises new
insights into a wide spectrum of traits (108). However, this
type of study requires a highly standardised experimental
setting, since environmentally induced variation in gene
expression can be considerable (98). Such standardisation
might be difficult to achieve in some farm animal species.
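The core calculation of the ‘genetical genomics’ approach can be sketched in Python by treating the expression level of one transcript as a quantitative trait and regressing it on the allele count at each marker. The marker names, genotypes and expression values are hypothetical, and a real analysis would use interval mapping, pedigree information and correction for multiple testing, none of which is shown.

```python
def regress_on_marker(allele_counts, expression):
    """Least-squares slope and R-squared of expression regressed on allele count (0, 1 or 2)."""
    n = len(expression)
    mean_g = sum(allele_counts) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in allele_counts)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e)
              for g, e in zip(allele_counts, expression))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0  # monomorphic marker or constant expression
    return sxy / sxx, sxy * sxy / (sxx * syy)

def best_marker(markers, expression):
    """Marker whose genotype explains the largest share of expression variance."""
    return max(markers,
               key=lambda name: regress_on_marker(markers[name], expression)[1])

# Hypothetical genotypes of six animals at two markers and the expression
# level of one transcript in the same animals.
markers = {
    "marker_1": [0, 0, 1, 1, 2, 2],
    "marker_2": [0, 1, 2, 0, 1, 2],
}
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

slope, r_squared = regress_on_marker(markers["marker_1"], expression)
top = best_marker(markers, expression)
```

Environmentally induced variation enters this model as unexplained residual variance, which is why the standardisation discussed above has such a strong effect on power.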
Another interesting application of gene expression
profiling is in assisted reproduction technologies. Embryo
production by in vitro fertilisation (IVF) procedures and,
even more significantly, by SCNT, is often associated with
complex health and welfare problems, collectively known
as ‘large offspring syndrome’ (LOS) (53). The occurrence
and severity of LOS phenotypes depend on the IVF and
SCNT protocols and media used and are, at present, rather
unpredictable. This has hindered the more widespread
application of embryo technologies in animal breeding and
production. Microarray analysis has recently been
employed to characterise gene expression profiles in IVF
and SCNT embryos, and has identified several genes
implicated in abnormal phenotypes (94). Normalising the
expression pattern of these genes by appropriate embryo
protocols could be a way of improving the health of IVF
and SCNT offspring. A more global approach in this field
aims to establish a ‘gold standard’ of normal gene
expression in embryos by large-scale microarray analyses.
An ‘embryo chip’ with all the relevant genes could then be
used to design and test optimised in vitro protocols and
media at the pre-implantation embryo stage, to allow
optimal development to term (87).
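How such a reference might be used can be sketched as follows, assuming that expression values for a set of genes are available from embryos regarded as normal and from a test embryo produced under a new protocol. The gene names, values and the z-score cut-off are hypothetical; this is an illustrative comparison, not the procedure of the cited studies.

```python
from statistics import mean, stdev


def build_reference(profiles):
    """Summarise normal embryos as gene -> (mean, standard deviation).

    profiles: list of dicts mapping gene name -> expression value,
              one dict per embryo assumed to be normal.
    """
    reference = {}
    for gene in profiles[0]:
        values = [profile[gene] for profile in profiles]
        reference[gene] = (mean(values), stdev(values))
    return reference


def deviating_genes(reference, embryo, z_cutoff=2.0):
    """Return gene -> z-score for genes outside the reference range."""
    flagged = {}
    for gene, (mu, sd) in reference.items():
        if sd == 0:
            continue  # no variation in the reference, z-score undefined
        z = (embryo[gene] - mu) / sd
        if abs(z) > z_cutoff:
            flagged[gene] = z
    return flagged


# Hypothetical expression values for three genes in four normal embryos.
normal_embryos = [
    {"GENE_A": 10.0, "GENE_B": 5.0, "GENE_C": 8.0},
    {"GENE_A": 11.0, "GENE_B": 5.5, "GENE_C": 8.4},
    {"GENE_A": 9.0, "GENE_B": 4.5, "GENE_C": 7.6},
    {"GENE_A": 10.0, "GENE_B": 5.0, "GENE_C": 8.0},
]
# Hypothetical embryo from a protocol under test.
test_embryo = {"GENE_A": 15.0, "GENE_B": 5.2, "GENE_C": 7.9}

reference = build_reference(normal_embryos)
flagged = deviating_genes(reference, test_embryo)
```

Here only GENE_A falls outside the reference range. A protocol that leaves few or no genes flagged would, under these assumptions, be considered closer to the normal pattern.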
Expression profiling in host cells or tissues and parasites is
also an extremely valuable tool for the detailed
characterisation of host-parasite interactions, as well as for
identifying new drug targets or developing new vaccines
(68). A ‘genomic filtering’ approach is particularly
promising in the search for novel antiparasitics. For
infectious disease applications, comparative genomics
filters allow the selection of pathogen-specific gene
products, whereas functional genomics filters, such as
RNAi, allow the selection and targeting of gene products
that are essential for parasite survival (80).
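The logic of combining the two filters can be sketched with simple set operations, assuming that lists of parasite genes, parasite genes with a host homologue, and genes whose knock-down by RNAi is lethal are already available. All gene identifiers below are hypothetical placeholders.

```python
def genomic_filter(parasite_genes, host_homologues, rnai_essential):
    """Return parasite genes that pass both filters, sorted by name.

    parasite_genes:  set of all parasite gene identifiers
    host_homologues: set of parasite genes with a homologue in the host
    rnai_essential:  set of parasite genes found essential by RNAi
    """
    # Comparative genomics filter: keep pathogen-specific genes only.
    pathogen_specific = parasite_genes - host_homologues
    # Functional genomics filter: keep genes essential for survival.
    return sorted(pathogen_specific & rnai_essential)


# Hypothetical identifiers for illustration.
parasite_genes = {"g1", "g2", "g3", "g4"}
host_homologues = {"g2", "g4"}
rnai_essential = {"g1", "g2"}

candidates = genomic_filter(parasite_genes, host_homologues, rnai_essential)
```

In this example only g1 is both absent from the host and essential to the parasite, which makes it the single remaining candidate target.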
Conclusions and prospects
Recent advances in ‘omics’ technologies and
their corresponding bioinformatic tools will greatly aid a
systematic analysis of molecular changes during
physiological and pathological processes. This will provide
insights into the biological mechanisms underlying animal
health and allow better definition of traits with previously
low heritability. Applying the ‘genetical genomics’ approach
to farm animals is a major challenge, principally because
maintaining sufficient numbers of large animals under
standardised environmental conditions presents problems
associated with management, housing and economics.
Identifying QTL that affect molecular profiles through these
‘omics’ technologies may prove valuable when using
marker-assisted selection to improve fertility, health and
longevity in farm animals. In turn, molecular profiles of
relevant tissues may also be useful to predict the
consequences of selection before the animals reach an age
at which classical phenotypic traits can be recorded. This
represents a first step towards a systems biology approach
to complex organisms. Such an approach would aim to
model dynamic gene-protein-metabolite signalling
networks and enable researchers to predict the outcome of
a given perturbation of the system under investigation.
Similar approaches are currently being developed and
tested in simple model organisms, but may also be
applicable to laboratory and farm animals. Thus,
functional genomics is expected to have a major positive
impact on sustainable livestock production.
The functional genomics projects of the authors are funded
by the Deutsche Forschungsgemeinschaft,
Bundesministerium für Bildung und Forschung and the
Functional genomics: tools for improving the health and
welfare of farm animals
S. Hiendleder, S. Bauersachs, A. Boulesteix, H. Blum, G.J. Arnold,
T. Fröhlich & E. Wolf
The first genome sequence assemblies of farm animal species are now
accessible through public domain databases, and other sequencing projects
are progressing rapidly. In addition, a large number of expressed
sequences has been obtained, which will contribute to the construction of
transcript maps (annotated genes) for many economically important
species. Thus, the breeding of farm animals is entering the post-genome
era. Functional genomics, defined as the application of global
experimental methods to assess gene function, using the data and
reagents made available by structural genomics (i.e. mapping and
sequencing), is now the focus of attention.
Combining a holistic view of phenotypes at the molecular level with
genetic marker data appears to be a particularly promising approach for
improving health and welfare traits in farm animals. These traits are
often difficult to define. Because their heritability is low, they are
not associated with genetic gain in traditional breeding and selection
programmes. At the same time, genomic information obtained from
micro-organisms and parasites opens up prospects for new vaccines and
treatments. This article describes the main tools of functional
genomics, lists the genomic resources available for farm animals and
examines the prospects and problems of functional genomics in improving
the health and welfare of farm animals.
Farm animal – Welfare – Animal welfare – Genomics – Functional
genomics – Proteomics – Functional genomics resource – Structural
genomics resource – Health – Transcriptomics.
1. Affara N.A. (2003). – Resource and hardware options for
microarray-based experimentation. Brief. funct. Genomics
Proteomics, 2 (1), 7-20.
2. Anderson N.L. & Anderson N.G. (2002). – The human
plasma proteome: history, character, and diagnostic
prospects. Molec. cell. Proteomics, 1 (11), 845-867. Erratum:
Molec. cell. Proteomics, 2 (1), 50.
3. Andersson L. & Georges M. (2004). – Domestic-animal
genomics: deciphering the genetics of complex traits.
Nat. Rev. Genet., 5 (3), 202-212.
4. Bacon L.D., Hunt H.D. & Cheng H.H. (2001). – Genetic
resistance to Marek’s disease. Curr. Top. Microbiol. Immunol.,
Functional genomics: tools for improving the health and
welfare of livestock
S. Hiendleder, S. Bauersachs, A. Boulesteix, H. Blum, G.J. Arnold,
T. Fröhlich & E. Wolf
The first collections of genome sequences of livestock species, held in
public domain databases, can now be accessed, and several other
sequencing projects are advancing rapidly. Large sets of expressed
sequences have also been obtained, which will help in producing
annotated transcript maps for many economically important species. The
breeding and selection of domestic animals is thus entering the
post-genome era. Functional genomics, defined as the application of
holistic experimental methods to assess gene function using data and
reagents from structural genomics (i.e. mapping and sequencing), is
becoming the area of
Combining genetic markers with a holistic view of phenotypes at the
molecular level appears to be a particularly promising approach for
improving traits linked to the health and welfare of livestock. These
traits are generally difficult to define, since their heritability is
low and they consequently yield little genetic gain in conventional
breeding and selection programmes. At the same time, information on the
genomes of micro-organisms and parasites may be useful for developing
new vaccines and therapeutic products. After describing the main tools
of functional genomics, the authors set out the genomic resources
available for livestock and examine the prospects and difficulties of
functional genomics with a view to improving the health and welfare of
livestock.
Welfare – Animal welfare – Livestock – Genomics – Functional genomics –
Structural genomics resource – Functional genomics resource –
Proteomics – Health –
5. Bauersachs S., Blum H., Mallok S., Wenigerkind H., Rief S.,
Prelle K. & Wolf E. (2003). – Regulation of ipsilateral and
contralateral bovine oviduct epithelial cell function in the
postovulation period: a
Biol. Reprod., 68 (4), 1170-1177.
6. Bauersachs S., Rehfeld S., Ulbrich S.E., Mallok S., Prelle K.,
Wenigerkind H., Einspanier R., Blum H. & Wolf E. (2004). –
Monitoring gene expression changes in bovine oviduct
epithelial cells during the oestrous cycle. J. molec. Endocrinol.,
32 (2), 449-466.
7. Becker A.J., Wiestler O.D. & Blumcke I. (2002). – Functional
genomics in experimental and human temporal lobe epilepsy:
powerful new tools to identify molecular disease mechanisms
of hippocampal damage. Prog. Brain Res., 135, 161-173.
8. Berthier D., Quere R., Thevenon S., Belemsaga D., Piquemal
D., Marti J. & Maillard J.C. (2003). – Serial analysis of gene
expression (SAGE) in bovine trypanotolerance: preliminary
results. Genet. Selec. Evol., 35 (Suppl. 1), S35-47.
9. Bolstad B.M., Irizarry R.A., Astrand M. & Speed T.P. (2003).
– A comparison of normalization methods for high density
oligonucleotide array data based on variance and bias.
Bioinformatics, 19 (2), 185-193.
10. Bonetta L. (2004). – RNAi: silencing never sounded better.
Nature Meth., 1 (1), 79-86.
11. Boulesteix A.L. (2004). – PLS dimension reduction for
classification with microarray data. Stat. Appl. Genet. molec.
Biol., 3 (1), Article 33.
12. Brazma A., Hingamp P., Quackenbush J., Sherlock G.,
Spellman P., Stoeckert C., Aach J., Ansorge W., Ball C.A.,
Causton H.C., Gaasterland T., Glenisson P., Holstege F.C.,
Kim I.F., Markowitz V., Matese J.C., Parkinson H.,
Robinson A., Sarkans U., Schulze-Kremer S., Stewart J.,
Taylor R., Vilo J. & Vingron M. (2001). – Minimum
information about a microarray experiment (MIAME) –
toward standards for microarray data. Nature Genet.,
29 (4), 365-371.
13. Brenner S., Johnson M., Bridgham J., Golda G., Lloyd D.H.,
Johnson D., Luo S., McCurdy S., Foy M., Ewan M., Roth R.,
George D., Eletr S., Albrecht G., Vermaas E., Williams S.R.,
Moon K., Burcham T., Pallas M., DuBridge R.B., Kirchner J.,
Fearon K., Mao J. & Corcoran K. (2000). – Gene expression
analysis by massively parallel signature sequencing (MPSS)
on microbead arrays. Nature Biotechnol., 18 (6), 630-634.
Erratum: Nature Biotechnol., 18 (10), 1021.
14. Burt D.W. (2004). – The chicken genome and the
developmental biologist. Mechanisms Dev., 121 (9), 1129-
15. Chou C.C., Chen C.H., Lee T.T. & Peck K. (2004). –
Optimization of probe length and the number of probes per
gene for optimal microarray analysis of gene expression.
Nucleic Acids Res., 32 (12), e99.
16. Chowdhary B.P. & Bailey E. (2003). – Equine genomics:
galloping to new frontiers. Cytogenet. Genome Res., 102 (1-4),
17. Churchill G.A. (2002). – Fundamentals of experimental
design for cDNA microarrays. Nature Genet., 32 (Suppl.),
18. Cockett N.E. (2003). – Current status of the ovine genome
map. Cytogenet. Genome Res., 102 (1-4), 76-78.
19. Cowman A.F. & Crabb B.S. (2003). – Functional genomics:
identifying drug targets for parasitic diseases.
Trends Parasitol., 19 (11), 538-543.
20. Da Mota A.F., Sonstegard T.S., Van Tassell C.P., Shade L.L.,
Matukumalli L.K., Wood D.L., Capuco A.V., Brito M.A.,
Connor E.E., Martinez M.L. & Coutinho L.L. (2004). –
Characterization of open reading frame-expressed sequence
tags generated from Bos indicus and B. taurus mammary gland
cDNA libraries. Anim. Genet., 35 (3), 213-219.
21. Degrave W.M., Melville S., Ivens A. & Aslett M. (2001). –
Parasite genome initiatives. Int. J. Parasitol., 31 (5-6),
22. Delhon G., Tulman E.R., Afonso C.L., Lu Z.,
de la Concha-Bermejillo A., Lehmkuhl H.D., Piccone M.E.,
Kutish G.F. & Rock D.L. (2004). – Genomes of the
parapoxviruses ORF virus and bovine papular stomatitis
virus. J. Virol., 78 (1), 168-177.
23. Denning C. & Priddle H. (2003). – New frontiers in gene
targeting and cloning: success, application and challenges in
domestic animals and human embryonic stem cells.
Reproduction, 126 (1), 1-11.
24. Diggle M.A. & Clarke S.C. (2004). – Pyrosequencing™:
sequence typing at the speed of light. Molec. Biotechnol.,
28 (2), 129-138.
25. Dombkowski A.A., Thibodeau B.J., Starcevic S.L. &
Novak R.F. (2004). – Gene-specific dye bias in microarray
reference designs. FEBS Lett., 560 (1-3), 120-124.
26. Doumith M., Cazalet C., Simoes N., Frangeul L., Jacquet C.,
Kunst F., Martin P., Cossart P., Glaser P. & Buchrieser C.
(2004). – New aspects regarding evolution and virulence of
Listeria monocytogenes revealed by comparative genomics and
DNA arrays. Infect. Immun., 72 (2), 1072-1083.
27. Dowsey A.W., Dunn M.J. & Yang G.Z. (2003). – The role of
bioinformatics in two-dimensional gel electrophoresis.
Proteomics, 3 (8), 1567-1596.
28. Dudoit S., Fridlyand J. & Speed T.P. (2002). – Comparison of
discrimination methods for the classification of tumors using
gene expression data. J. Am. Stat. Assoc., 97, 77-87.
29. Everts-van der Wind A., Kata S.R., Band M.R., Rebeiz M.,
Larkin D.M., Everts R.E., Green C.A., Liu L., Natarajan S.,
Goldammer T., Lee J.H., McKay S., Womack J.E. &
Lewin H.A. (2004). – A 1463 gene cattle-human comparative
map with anchor points defined by human genome sequence
coordinates. Genome Res., 14 (7), 1424-1437.
30. Fahrenkrug S.C., Freking B.A., Smith T.P., Rohrer G.A. &
Keele J.W. (2002). – Single nucleotide polymorphism (SNP)
discovery in porcine expressed genes. Anim. Genet., 33 (3),
Rev. sci. tech. Off. int. Epiz., 24 (1)
31. Fenyo D. (2000). – Identifying the proteome: software tools.
Curr. Opin. Biotechnol., 11 (4), 391-395.
32. Fitzsimmons C.J., Savolainen P., Amini B., Hjalm G.,
Lundeberg J. & Andersson L. (2004). – Detection of sequence
polymorphisms in red junglefowl and White Leghorn ESTs.
Anim. Genet., 35 (5), 391-396.
33. Foubister V. (2004). – Human liver proteome project.
J. Proteome Res., 3 (2), 164.
34. Friedman N. (2004). – Inferring cellular networks using
probabilistic graphical models. Science, 303 (5659), 799-805.
35. Georges M., Drinkwater R., King T., Mishra A., Moore S.S.,
Nielsen D., Sargeant L.S., Sorensen A., Steele M.R., Zhao X.,
Womack J.E. & Hetzel J. (1993). – Microsatellite mapping of
a gene affecting horn development in Bos taurus. Nature
Genet., 4 (2), 206-210.
36. Gevaert K. & Vandekerckhove J. (2000). – Protein
identification methods in proteomics. Electrophoresis, 21 (6),
37. Gowda M., Jantasuriyarat C., Dean R.A. & Wang G.L. (2004).
– Robust-Long SAGE (RL-SAGE): a substantially improved
Long SAGE method for gene discovery and transcriptome
analysis. Plant Physiol., 134 (3), 890-897.
38. Grabarek J.B. & Zernicka-Goetz M. (2003). – RNA
interference in mammalian systems – a practical approach.
Adv. exp. Med. Biol., 544, 205-216.
39. Griffin T.J., Gygi S.P., Ideker T., Rist B., Eng J., Hood L. &
Aebersold R. (2002). – Complementary profiling of gene
expression at the transcriptome and proteome levels in
Saccharomyces cerevisiae. Molec. cell. Proteomics, 1 (4), 323-
40. Gut I.G. (2004). – DNA analysis by MALDI-TOF mass
spectrometry. Hum. Mutat., 23 (5), 437-441.
41. Gygi S.P., Rist B., Gerber S.A., Turecek F., Gelb M.H. &
Aebersold R. (1999). – Quantitative analysis of complex
protein mixtures using isotope-coded affinity tags. Nature
Biotechnol., 17 (10), 994-999.
42. Haab B.B. (2003). – Methods and applications of antibody
microarrays in cancer research. Proteomics, 3 (11),
43. Hall J. (2004). – Opinion: unravelling the general properties
of siRNAs: strength in numbers and lessons from the past.
Nat. Rev. Genet., 5 (7), 552-557.
44. Hansen K.C., Schmitt-Ulms G., Chalkley R.J., Hirsch J.,
Baldwin M.A. & Burlingame A.L. (2003). – Mass
spectrometric analysis of protein mixtures at low levels using
cleavable 13C-isotope-coded affinity tag and multidimensional
chromatography. Molec. cell. Proteomics, 2 (5), 299-314.
45. Haraguchi S., Saga Y., Naito K., Inoue H. & Seto A. (2004). –
Specific gene silencing in the pre-implantation stage mouse
embryo by an siRNA expression vector system. Molec. Reprod.
Dev., 68 (1), 17-24.
46. Hardenbol P., Baner J., Jain M., Nilsson M., Namsaraev E.A.,
Karlin-Neumann G.A., Fakhrai-Rad H., Ronaghi M.,
Willis T.D., Landegren U. & Davis R.W. (2003). –
Multiplexed genotyping with sequence-tagged molecular
inversion probes. Nature Biotechnol., 21 (6), 673-678.
47. Harhay G.P. & Keele J.W. (2003). – Positional candidate gene
selection from livestock EST databases using gene ontology.
Bioinformatics, 19 (2), 249-255.
48. Hashimoto S., Suzuki Y., Kasai Y., Morohoshi K., Yamada T.,
Sese J., Morishita S., Sugano S. & Matsushima K. (2004). –
5’-end SAGE for the analysis of transcriptional start sites.
Nature Biotechnol., 22 (9), 1146-1149.
49. Hawken R.J., Barris W.C., McWilliam S.M. & Dalrymple B.P.
(2004). – An interactive bovine in silico SNP database (IBISS).
Mamm. Genome, 15 (10), 819-827.
50. Hermjakob H., Montecchi-Palazzi L., Bader G., Wojcik J.,
Salwinski L., Ceol A., Moore S., Orchard S., Sarkans U.,
von Mering C., Roechert B., Poux S., Jung E., Mersch H.,
Kersey P., Lappe M., Li Y., Zeng R., Rana D., Nikolski M.,
Husi H., Brun C., Shanker K., Grant S.G., Sander C., Bork P.,
Zhu W., Pandey A., Brazma A., Jacq B., Vidal M., Sherman D.,
Legrain P., Cesareni G., Xenarios I., Eisenberg D., Steipe B.,
Hogue C. & Apweiler R. (2004). – The HUPO PSI’s molecular
interaction format – a community standard for the
representation of protein interaction data. Nature Biotechnol.,
22 (2), 177-183.
51. Hertz-Fowler C. & Hall N. (2004). – Parasite genome
databases and web-based resources. Meth. molec. Biol.,
52. Hiendleder S., Thomsen H., Reinsch N., Bennewitz J.,
Leyhe-Horn B., Looft C., Xu N., Medjugorac I., Russ I.,
Kuhn C., Brockmann G.A., Blumel J., Brenig B., Reinhardt F.,
Reents R., Averdunk G., Schwerin M., Forster M., Kalm E. &
Erhardt G. (2003). – Mapping of QTL for body conformation
and behavior in cattle. J. Hered., 94 (6), 496-506.
53. Hiendleder S., Mund C., Reichenbach H.D., Wenigerkind H.,
Brem G., Zakhartchenko V., Lyko F. & Wolf E. (2004). –
Tissue-specific elevated genomic cytosine methylation levels
are associated with an overgrowth phenotype of bovine
fetuses derived by in vitro techniques. Biol. Reprod., 71 (1),
54. Hieter P. & Boguski M. (1997). – Functional genomics: it’s all
how you read it. Science, 278 (5338), 601-602.
55. Hoffmann R., Seidl T. & Dugas M. (2002). – Profound effect
of normalization on detection of differentially expressed
genes in oligonucleotide microarray data analysis. Genome
Biol., 3 (7), 33.
56. Hofmann A., Kessler B., Ewerling S., Weppert M., Vogg B.,
Ludwig H., Stojkovic M., Boelhauve M., Brem G., Wolf E. &
Pfeifer A. (2003). – Efficient transgenesis in farm animals by
lentiviral vectors. EMBO Rep., 4 (11), 1054-1060.
57. Hofmann A., Zakhartchenko V., Weppert M., Sebald H.,
Wenigerkind H., Brem G., Wolf E. & Pfeifer A. (2004). –
Generation of transgenic cattle by lentiviral gene transfer into
oocytes. Biol. Reprod., 71 (2), 405-409.
58. Holloway A.J., van Laar R.K., Tothill R.W. & Bowtell D.D.
(2002). – Options available – from start to finish – for
obtaining data from DNA microarrays II. Nature Genet.,
32 (Suppl.), 481-489.
59. Huang R.P. (2003). – Cytokine antibody arrays: a promising
tool to identify molecular targets for drug discovery.
Comb. Chem. high Throughput Screen., 6 (8), 769-775.
60. Huber W., von Heydebreck A., Sultmann H., Poustka A. &
Vingron M. (2002). – Variance stabilization applied to
microarray data calibration and to the quantification of
differential expression. Bioinformatics, 18 (Suppl. 1), S96-
61. Ihara N., Takasuga A., Mizoshita K., Takeda H., Sugimoto M.,
Mizoguchi Y., Hirano T., Itoh T., Watanabe T., Reed K.M.,
Snelling W.M., Kappes S.M., Beattie C.W., Bennett G.L. &
Sugimoto Y. (2004). – A comprehensive genetic map of the
cattle genome based on 3802 microsatellites. Genome Res.,
14 (10A), 1987-1998.
62. Jensen O.N. (2004). – Modification-specific proteomics:
characterization of post-translational modifications by mass
spectrometry. Curr. Opin. chem. Biol., 8 (1), 33-41.
63. Julka S. & Regnier F. (2004). – Quantification in proteomics
through stable isotope coding: a review. J. Proteome Res., 3 (3),
64. Karsenty E., Barillot E., Tosser-Klopp G., Lahbib-Mansais Y.,
Milan D., Hatey F., Cirera S., Sawera M., Jorgensen C.B.,
Chowdhary B., Fredholm M., Wimmers K., Ponsuksili S.,
Davoli R., Fontanesi L., Braglia S., Zambonelli P., Bigi D.,
Neuenschwander S. & Gellin J. (2003). – The GENETPIG
database: a tool for comparative mapping in pig (Sus scrofa).
Nucleic Acids Res., 31 (1), 138-141.
65. Khatkar M.S., Thomson P.C., Tammen I. & Raadsma H.W.
(2004). – Quantitative trait loci mapping in dairy cattle:
review and meta-analysis. Genet. Selec. Evol., 36 (2), 163-190.
66. Kim S., Ruparel H.D., Gilliam T.C. & Ju J. (2003). – Digital
genotyping using molecular affinity and mass spectrometry.
Nat. Rev. Genet., 4 (12), 1001-1008.
67. Klose J. (1975). – Protein mapping by combined isoelectric
focusing and electrophoresis of mouse tissues. A novel
approach to testing for induced point mutations in mammals.
Hum. Genet., 26 (3), 231-243.
68. Knox D.P. (2004). – Technological advances and genomics in
metazoan parasites. Int. J. Parasitol., 34 (2), 139-152.
69. Koumi P., Green H.E., Hartley S., Jordan D., Lahec S.,
Livett R.J., Tsang K.W. & Ward D.M. (2004). – Evaluation
and validation of the ABI 3700, ABI 3100, and the
MegaBACE 1000 capillary array electrophoresis instruments
for use with short tandem repeat microsatellite typing in a
forensic environment. Electrophoresis, 25 (14), 2227-2241.
70. Krebs S., Medugorac I., Seichter D. & Forster M. (2003). –
RNaseCut: a MALDI mass spectrometry-based method for
SNP discovery. Nucleic Acids Res., 31 (7), e37.
71. Kuhn Ch., Bennewitz J., Reinsch N., Xu N., Thomsen H.,
Looft C., Brockmann G.A., Schwerin M., Weimann C.,
Hiendleder S., Erhardt G., Medjugorac I., Forster M.,
Brenig B., Reinhardt F., Reents R., Russ I., Averdunk G.,
Blumel J. & Kalm E. (2003). – Quantitative trait loci mapping
of functional traits in the German Holstein cattle population.
J. Dairy Sci., 86 (1), 360-368.
72. Kwok P.Y. (2002). – SNP genotyping with fluorescence
polarization detection. Hum. Mutat., 19 (4), 315-323.
73. Larkin D.M., Everts-van der Wind A., Rebeiz M., Schweitzer
P.A., Bachman S., Green C., Wright C.L., Campos E.J.,
Benson L.D., Edwards J., Liu L., Osoegawa K., Womack J.E.,
de Jong P.J. & Lewin H.A. (2003). – A cattle-human
comparative map built with cattle BAC-ends and human
genome sequence. Genome Res., 13 (8), 1966-1972.
74. Lash A.E., Tolstoshev C.M., Wagner L., Schuler G.D.,
Strausberg R.L., Riggins G.J. & Altschul S.F. (2000). –
SAGEmap: a public gene expression resource. Genome Res.,
10 (7), 1051-1060.
75. Laub M.T., McAdams H.H., Feldblyum T., Fraser C.M. &
Shapiro L. (2000). – Global analysis of the genetic network
controlling a bacterial cell cycle. Science, 290 (5499), 2144-
76. Lipkin E., Mosig M.O., Darvasi A., Ezra E., Shalom A.,
Friedmann A. & Soller M. (1998). – Quantitative trait locus
mapping in dairy cattle by means of selective milk DNA
pooling using dinucleotide microsatellite markers: analysis of
milk protein percentage. Genetics, 149 (3), 1557-1567.
77. Liu H.C., Cheng H.H., Tirunagaru V., Sofer L. & Burnside J.
(2001). – A strategy to identify positional candidate genes
conferring Marek’s disease resistance by integrating DNA
microarrays and genetic mapping. Anim. Genet., 32 (6), 351-
78. Liu L., Gong G., Liu Y., Natarajan S., Larkin D.M.,
Everts-van der Wind A., Rebeiz M. & Beever J.E. (2004). –
Multi-species comparative mapping in silico using the
COMPASS strategy. Bioinformatics, 20 (2), 148-154.
79. Livesey F.J. (2003). – Strategies for microarray analysis of
limiting amounts of RNA. Brief. funct. Genomics Proteomics,
2 (1), 31-36.
80. McCarter J.P. (2004). – Genomic filtering: an approach to
discovering novel antiparasitics. Trends Parasitol., 20 (10),
81. Mah N., Thelin A., Lu T., Nikolaus S., Kuhbacher T.,
Gurbuz Y., Eickhoff H., Kloppel G., Lehrach H., Mellgard B.,
Costello C.M. & Schreiber S. (2004). – A comparison of
oligonucleotide and cDNA-based microarray systems. Physiol.
Genomics, 16 (3), 361-370.
82. Meister G. & Tuschl T. (2004). – Mechanisms of gene
silencing by double-stranded RNA. Nature, 431 (7006),
83. Merrick B.A. (2003). – The human proteome organization
(HUPO) and environmental health. EHP Toxicogenomics,
111 (1T), 1-5.
84. Moen T., Fjalestad K.T., Munck H. & Gomez-Raya L. (2004).
– A multistage testing strategy for detection of quantitative
trait loci affecting disease resistance in Atlantic salmon.
Genetics, 167 (2), 851-858.
85. Moen T., Hoyheim B., Munck H. & Gomez-Raya L. (2004). –
A linkage map of Atlantic salmon (Salmo salar) reveals an
uncommonly large difference in recombination rate between
the sexes. Anim. Genet., 35 (2), 81-92.
86. Nichols K.M., Young W.P., Danzmann R.G., Robison B.D.,
Rexroad C., Noakes M., Phillips R.B., Bentzen P., Spies I.,
Knudsen K., Allendorf F.W., Cunningham B.M., Brunelli J.,
Zhang H., Ristow S., Drew R., Brown K.H., Wheeler P.A. &
Thorgaard G.H. (2003). – A consolidated linkage map for
rainbow trout (Oncorhynchus mykiss). Anim. Genet., 34 (2),
87. Niemann H. & Wrenzycki C. (2000). – Alterations of
expression of developmentally important genes in
preimplantation bovine embryos by in vitro culture
conditions: implications for subsequent development.
Theriogenology, 53 (1), 21-34.
88. O’Farrell P.H. (1975). – High resolution two-dimensional
electrophoresis of proteins. J. biol. Chem., 250 (10), 4007-
89. Ong S.E., Blagoev B., Kratchmarova I., Kristensen D.B.,
Steen H., Pandey A. & Mann M. (2002). – Stable isotope
labeling by amino acids in cell culture, SILAC, as a simple
and accurate approach to expression proteomics. Molec. cell.
Proteomics, 1 (5), 376-386.
90. Orchard S., Hermjakob H., Julian R.K. Jr, Runte K., Sherman
D., Wojcik J., Zhu W. & Apweiler R. (2004). – Common
interchange standards for proteomics data: public availability
of tools and schema. Proteomics, 4 (2), 490-491.
91. Parkinson J., Whitton C., Schmid R., Thomson M. &
Blaxter M. (2004). – NEMBASE: a resource for parasitic
nematode ESTs. Nucleic Acids Res., 32 (Database issue),
92. Patton W.F. (2002). – Detection technologies in proteome
analysis. J. Chromatogr. B: Analyt. Technol. Biomed. Life Sci.,
771 (1-2), 3-31.
93. Pekarik V., Bourikas D., Miglino N., Joset P., Preiswerk S. &
Stoeckli E.T. (2003). – Screening for gene function in chicken
embryo using RNAi and electroporation. Nature Biotechnol.,
21 (1), 93-96. Epub. 23 Dec. 2002. Erratum: Nature
Biotechnol., 21 (2), 199.
94. Pfister-Genskow M., Myers C., Childs L.A., Lacson J.C.,
Patterson T., Betthauser J.M., Goueleke P.J., Koppang R.W.,
Lange G., Fisher P., Watt S.R., Forsberg E.J., Zheng Y.,
Leno G.H., Schultz R.M., Liu B., Chetia C., Yang X.,
Hoeschele I. & Eilertsen K.J. (2005). – Identification of
differentially expressed genes in individual bovine
preimplantation embryos produced by nuclear transfer:
improper reprogramming of genes required for development.
Biol. Reprod., 72 (3), 546-555.
95. Primmer C.R., Raudsepp T., Chowdhary B.P., Moller A.P. &
Ellegren H. (1997). – Low frequency of microsatellites in
the avian genome. Genome Res., 7 (5), 471-482.
96. Quiniou S.M., Katagiri T., Miller N.W., Wilson M.,
Wolters W.R. & Waldbieser G.C. (2003). – Construction
and characterization of a BAC library from a gynogenetic
channel catfish Ictalurus punctatus. Genet. Selec. Evol., 35 (6),
97. Raadsma H.W., Gray G.D. & Woolaston R.R. (1998). –
Breeding for disease resistance in Merino sheep in Australia.
In Genetic resistance to animal diseases (M. Müller &
G. Brem, eds). Rev. sci. tech. Off. int. Epiz., 17 (1), 315-328.
98. Radich J.P., Mao M., Stepaniants S., Biery M., Castle J.,
Ward T., Schimmack G., Kobayashi S., Carleton M.,
Lampe J. & Linsley P.S. (2004). – Individual-specific
variation of gene expression in peripheral blood leukocytes.
Genomics, 83 (6), 980-988.
99. Rao M., Baraban J.H., Rajaii F. & Sockanathan S. (2004). –
In vivo comparative study of RNAi methodologies by in ovo
electroporation in the chick embryo. Dev. Dyn., 231 (3),
100. Rebrikov D.V., Desai S.M., Siebert P.D. & Lukyanov S.A.
(2004). – Suppression subtractive hybridization.
Meth. molec. Biol., 258, 107-134.
101. Reinartz J., Bruyns E., Lin J.Z., Burcham T., Brenner S.,
Bowen B., Kramer M. & Woychik R. (2002). – Massively
parallel signature sequencing (MPSS) as a tool for in-depth
quantitative gene expression profiling in all organisms.
Brief. funct. Genomics Proteomics, 1 (1), 95-104.
102. Rexroad C.E. III, Lee Y., Keele J.W., Karamycheva S.,
Brown G., Koop B., Gahr S.A., Palti Y. & Quackenbush J.
(2003). – Sequence analysis of a rainbow trout cDNA
library and creation of a gene index. Cytogenet. Genome Res.,
102 (1-4), 347-354.
103. Rise M.L., von Schalburg K.R., Brown G.D., Mawer M.A.,
Devlin R.H., Kuipers N., Busby M., Beetz-Sargent M.,
Alberto R., Gibbs A.R., Hunt P., Shukin R., Zeznik J.A.,
Nelson C., Jones S.R., Smailus D.E., Jones S.J., Schein J.E.,
Marra M.A., Butterfield Y.S., Stott J.M., Ng S.H.,
Davidson W.S. & Koop B.F. (2004). – Development and
application of a salmonid EST database and cDNA
microarray: data mining and interspecific hybridization
characteristics. Genome Res., 14 (3), 478-490. Epub. 12
104. Rockett J.C. & Hellmann G.M. (2004). – Confirming
microarray data – is it really necessary? Genomics, 83 (4),
105. Rothschild M.F. (2004). – Porcine genomics delivers new
tools and results: this little piggy did more than just go to
market. Genet. Res., 83 (1), 1-6.
106. Rupp R. & Boichard D. (2003). – Genetics of resistance to
mastitis in dairy cattle. Vet. Res., 34 (5), 671-688.
107. Saha S., Sparks A.B., Rago C., Akmaev V., Wang C.J.,
Vogelstein B., Kinzler K.W. & Velculescu V.E. (2002). –
Using the transcriptome to annotate the genome. Nature
Biotechnol., 20 (5), 508-512.
108. Schadt E.E., Monks S.A., Drake T.A., Lusis A.J., Che N.,
Colinayo V., Ruff T.G., Milligan S.B., Lamb J.R., Cavet G.,
Linsley P.S., Mao M., Stoughton R.B. & Friend S.H. (2003).
– Genetics of gene expression surveyed in maize, mouse
and man. Nature, 422 (6929), 297-302.
109. Schafer J. & Strimmer K. (2005). – An empirical Bayes
approach to inferring large-scale gene association networks.
Bioinformatics, 21 (6), 754-764.
110. Schena M., Shalon D., Davis R.W. & Brown P.O. (1995). –
Quantitative monitoring of gene expression patterns with a
complementary DNA microarray. Science, 270 (5235), 467-
111. Seichter D., Krebs S. & Forster M. (2004). – Rapid and
accurate characterisation of short tandem repeats by
MALDI-TOF analysis of endonuclease cleaved RNA
transcripts. Nucleic Acids Res., 32 (2), e16.
112. Silva A.P., De Souza J.E., Galante P.A., Riggins G.J.,
De Souza S.J. & Camargo A.A. (2004). – The impact of
SNPs on the interpretation of SAGE and MPSS experimental
data. Nucleic Acids Res., 32 (20), 6104-6110.
113. Smith J., Speed D., Law A.S., Glass E.J. & Burt D.W. (2004).
– In silico identification of chicken immune-related genes.
Immunogenetics, 56 (2), 122-133.
114. Smith T.P., Grosse W.M., Freking B.A., Roberts A.J.,
Stone R.T., Casas E., Wray J.E., White J., Cho J.,
Fahrenkrug S.C., Bennett G.L., Heaton M.P., Laegreid W.W.,
Rohrer G.A., Chitko-McKown C.G., Pertea G., Holt I.,
Karamycheva S., Liang F., Quackenbush J. & Keele J.W.
(2001). – Sequence evaluation of four pooled-tissue
normalized bovine cDNA libraries and construction of a
gene index for cattle. Genome Res., 11 (4), 626-630.
115. Sonstegard T.S., Capuco A.V., White J., Van Tassell C.P.,
Connor E.E., Cho J., Sultana R., Shade L., Wray J.E.,
Wells K.D. & Quackenbush J. (2002). – Analysis of bovine
mammary gland EST and functional annotation of the
gene index. Mamm. Genome, 13
116. Soutschek J., Akinc A., Bramlage B., Charisse K.,
Constien R., Donoghue M., Elbashir S., Geick A.,
Hadwiger P., Harborth J., John M., Kesavan V., Lavine G.,
Pandey R.K., Racie T., Rajeev K.G., Rohl I., Toudjarska I.,
Wang G., Wuschko S., Bumcrot D., Koteliansky V.,
Limmer S., Manoharan M. & Vornlocher H.P. (2004). –
Therapeutic silencing of an endogenous gene by systemic
administration of modified
432 (7014), 173-178.
117. Speed T. (2003). – Statistical analysis of gene expression
microarray data. Chapman & Hall/CRC Press LLC, Boca
Raton, Florida, 222 pp.
118. Suchyta S.P., Sipkovsky S., Kruska R., Jeffers A.,
McNulty A., Coussens M.J., Tempelman R.J., Halgren R.G.,
Saama P.M., Bauman D.E., Boisclair Y.R., Burton J.L.,
Collier R.J., DePeters E.J., Ferris T.A., Lucy M.C.,
McGuire M.A., Medrano J.F., Overton T.R., Smith T.P.,
Smith G.W., Sonstegard T.S., Spain J.N., Spiers D.E., Yao J.
& Coussens P.M. (2003). – Development and testing of a
high-density cDNA microarray resource for cattle. Physiol.
Genomics, 15 (2), 158-164.
119. Thaller G., Dempfle L. & Hoeschele I. (1996). –
Investigation of the inheritance of birth defects in swine by
complex segregation analysis. J. anim. Breed. Genet.,
113 (2), 77-92.
120. Tibshirani R., Hastie T., Narasimhan B. & Chu G. (2002). –
Diagnosis of multiple cancer types by shrunken centroids of
gene expression. Proc. natl Acad. Sci. USA, 99 (10), 6567-
121. Tiscornia G., Tergaonkar V., Galimi F. & Verma I.M. (2004).
– CRE recombinase-inducible RNA interference mediated
by lentiviral vectors. Proc. natl Acad. Sci. USA, 101 (19),
7347-7351. Epub. 30 April 2004.
122. Tonge R., Shaw J., Middleton B., Rowlinson R., Rayner S.,
Young J., Pognan F., Hawkins E., Currie I. & Davison M.
(2001). – Validation and development of fluorescence
two-dimensional differential gel electrophoresis proteomics
technology. Proteomics, 1 (3), 377-396.
123. Tusher V.G., Tibshirani R. & Chu G. (2001). – Significance
analysis of microarrays applied to the ionizing radiation
response. Proc. natl Acad. Sci. USA, 98 (9), 5116-5121.
Epub. 17 April 2001. Erratum: Proc. natl Acad. Sci. USA,
98 (18), 10515.
124. Twyman R.M. & Primrose S.B. (2003). – Techniques patents
for SNP genotyping. Pharmacogenomics, 4 (1), 67-79.
125. Uenishi H., Eguchi T., Suzuki K., Sawazaki T., Toki D.,
Shinkai H., Okumura N., Hamasima N. & Awata T. (2004).
– PEDE (Pig EST Data Explorer): construction of a database
for ESTs derived from porcine full-length cDNA libraries.
Nucleic Acids Res., 32 (Database issue), D484-488.
126. Ui-Tei K., Naito Y., Takahashi F., Haraguchi T.,
Ohki-Hamazaki H., Juni A., Ueda R. & Saigo K. (2004). –
Guidelines for the selection of highly effective siRNA
sequences for mammalian and chick RNA interference.
Nucleic Acids Res., 32 (3), 936-948.
127. Unlu M., Morgan M.E. & Minden J.S. (1997). – Difference
gel electrophoresis: a single gel method for detecting
changes in protein extracts. Electrophoresis, 18 (11),
128. Velculescu V.E., Zhang L., Vogelstein B. & Kinzler K.W.
(1995). – Serial analysis of gene expression. Science,
270 (5235), 484-487.
129. Vencio R.Z., Brentani H., Patrao D.F. & Pereira C.A. (2004).
– Bayesian model accounting for within-class biological
variability in serial analysis of gene expression (SAGE). BMC
Bioinformatics, 5 (1), 119.
130. Vilain C. & Vassart G. (2004). – Small amplified RNA-
SAGE. Meth. molec. Biol., 258, 135-152.
131. Waldbieser G.C., Bosworth B.G., Nonneman D.J. & Wolters
W.R. (2001). – A microsatellite-based genetic linkage map
for channel catfish, Ictalurus punctatus. Genetics, 158 (2),
132. Wang H. & Hanash S. (2005). – Intact-protein based
sample preparation strategies for proteome analysis in
combination with mass spectrometry. Mass Spectrom. Rev.,
24 (3), 413-426.
133. Wei C.L., Ng P., Chiu K.P., Wong C.H., Ang C.C.,
Lipovich L., Liu E.T. & Ruan Y. (2004). – 5’ Long serial
analysis of gene expression (Long SAGE) and 3’ Long SAGE
for transcriptome characterization and genome annotation.
Proc. natl Acad. Sci. USA, 101 (32), 11701-11706.
134. Werner F.A., Durstewitz G., Habermann F.A., Thaller G.,
Kramer W., Kollers S., Buitkamp J., Georges M., Brem G.,
Mosner J. & Fries R. (2004). – Detection and
characterization of SNPs useful for identity control and
parentage testing in major European dairy breeds.
Anim. Genet., 35 (1), 44-49.
135. Wernersson R., Schierup M.H., Jorgensen F.G., Gorodkin J.,
Panitz F., Staerfeldt H.H., Christensen O.F., Mailund T.,
Hornshoj H., Klein A., Wang J., Liu B., Hu S., Dong W.,
Li W., Wong G.K., Yu J., Wang J., Bendixen C.,
Fredholm M., Brunak S., Yang H. & Bolund L. (2005). –
Pigs in sequence space: a 0.66X coverage pig genome survey
based on shotgun sequencing. BMC Genomics, 6 (1), 70.
136. Whitfield C.W., Band M.R., Bonaldo M.F., Kumar C.G.,
Liu L., Pardinas J.R., Robertson H.M., Soares M.B. &
Robinson G.E. (2002). – Annotated expressed sequence
tags and cDNA microarrays for studies of brain and
behavior in the honey bee. Genome Res., 12 (4), 555-566.
137. Wolf E., Schernthaner W., Zakhartchenko V., Prelle K.,
Stojkovic M. & Brem G. (2000). – Transgenic technology in
farm animals – progress and perspectives. Experim. Physiol.,
85 (6), 615-625.
138. Wolf E., Arnold G.J., Bauersachs S., Beier H.M., Blum H.,
Einspanier R., Frohlich T., Herrler A., Hiendleder S.,
Kolle S., Prelle K., Reichenbach H.D., Stojkovic M.,
Wenigerkind H. & Sinowatz F. (2003). – Embryo-maternal
communication in bovine – strategies for deciphering a
complex cross-talk. Reprod. dom. Anim., 38 (4), 276-289.
139. Yang G.P., Ross D.T., Kuang W.W., Brown P.O. & Weigel R.J.
(1999). – Combining SSH and cDNA microarrays for rapid
identification of differentially expressed genes. Nucleic Acids
Res., 27 (6), 1517-1523.
140. Yang Y.H., Dudoit S., Luu P., Lin D.M., Peng V., Ngai J. &
Speed T.P. (2002). – Normalization for cDNA microarray
data: a robust composite method addressing single and
multiple slide systematic variation. Nucleic Acids Res.,
30 (4), e15.
141. Yauk C.L., Berndt M.L., Williams A. & Douglas G.R.
(2004). – Comprehensive comparison of six microarray
technologies. Nucleic Acids Res., 32 (15), e124.