Quantitative Genetics - Science topic

Questions related to Quantitative Genetics
  • asked a question related to Quantitative Genetics
Question
7 answers
Hi, I am having trouble getting a SNP input file to work in GenAlEx. I tried recoding letters as numbers (A=1; C=2; G=3; T=4) in one column, but it does not work. Using numbers in two columns (codominant) does not seem to work either. Can anyone help with that? Attached is an example of my SNP data set. Thanks
Relevant answer
Answer
Severine Roques, did you ever get SPAGeDi to work on SNPs? I can't seem to find papers that accomplished that, although I also don't see why it shouldn't work...
And if yes, how did you make the SPAGeDi input file?
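The recoding described in the question (A=1; C=2; G=3; T=4, two columns per locus) can be sketched in Python. This is a minimal sketch, assuming each genotype is a two-letter string such as "AG" and that missing calls are coded 0; the function names and the input layout are illustrative only and are not taken from the GenAlEx or SPAGeDi documentation.

```python
# Hypothetical helpers: recode two-letter SNP genotypes into two numeric
# columns per locus (codominant layout), using the A=1, C=2, G=3, T=4
# scheme from the question. The missing-data code 0 is an assumption.
BASE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}


def recode_genotype(genotype, missing=0):
    """Return a pair of integer allele codes for a genotype like 'AG'."""
    alleles = genotype.strip().upper()
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles, got {genotype!r}")
    # Any character that is not A/C/G/T (e.g. 'N') becomes the missing code.
    return tuple(BASE_CODES.get(base, missing) for base in alleles)


def recode_sample(genotypes):
    """Flatten one sample's genotypes into allele-1, allele-2 columns per locus."""
    row = []
    for genotype in genotypes:
        row.extend(recode_genotype(genotype))
    return row


if __name__ == "__main__":
    print(recode_sample(["AG", "CC", "NN", "TT"]))
```

Each sample then occupies one row with two numeric columns per SNP, which is the codominant layout the question refers to.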
  • asked a question related to Quantitative Genetics
Question
5 answers
Greetings to all researchers,
As we all know, YVMV in okra is not at all a seed-transmitted viral disease. So what happens if we harvest those infected plants? Breeding for viral disease resistance is not the only trait of concern, so the question arises: can we harvest YVMV-infected plants to advance the generation?
Relevant answer
Answer
To me, a big no: viral diseases are systemic, so harvesting already infected pods could be a source of contamination for the new progenies.
  • asked a question related to Quantitative Genetics
Question
9 answers
Hi, I am working on a theoretical quantitative genetics paper exploring rank distributions of parents and progeny based on genetic (narrow-sense heritability), GxE, and environmental (random, maternal effects) influences. I want to test some data, but public datasets of complete (not summarized by moments or correlations) parent-progeny phenotype data are hard to find.
Ideally, I would like a data set with animal ID, animal sex, sire and dam ID, sire and dam phenotype, and animal phenotype. The best case would be an approximately outbred population, but I accept that a lot of agricultural and lab animals have some inbreeding present, so I can take what I can get as long as it is not a consciously inbred line or one with frequent backcrossing, etc.
Also, the number of progeny per coupling is an important variable so animals with a moderate number of offspring (5-10) per female would be appreciated.
Thanks for any help.
Relevant answer
Answer
Thank you very much
Regards
  • asked a question related to Quantitative Genetics
Question
24 answers
Which software is best for agricultural data analysis?
Experimental design
Plant Breeding Trials
Quantitative Genetics
Relevant answer
Answer
R is free and has high flexibility in most genetic and statistical analyses.
  • asked a question related to Quantitative Genetics
Question
27 answers
Hello everyone, I want to learn a new programming language. The purpose is to write software for animal quantitative genetics. I found that the three most popular software packages in animal quantitative genetics (BLUPF90, DMU, ASReml) are all written in Fortran. But some of my friends suggested that I learn Python.
I am good at the R language. I think there is not much difference between R and Python, and Julia is also one of my languages to consider.
So I am now confused about which language to learn? Which language is more suitable for animal quantitative genetics? Anyone who knows, please give me some suggestions. I still have three years to learn this new language.
Relevant answer
Wish you success,
MATLAB stands for "matrix laboratory", so it was originally designed for working with matrices. Please see the link: https://cimss.ssec.wisc.edu/wxwise/class/aos340/spr00/whatismatlab.htm#:~:text=The%20name%20MATLAB%20stands%20for,with%20input%20from%20many%20users.
Best wishes
  • asked a question related to Quantitative Genetics
Question
13 answers
I have a population of a dioecious species with significant phenotypic variation and want to select individuals from this population. What statistical methods can I use to perform multi-trait selection or dissimilarity studies with data from individual plants? If there is an R package for this, even better.
Relevant answer
Answer
Dear Hugo Gabriel Peres, your question is highly interesting. In my opinion, before performing multi-trait selection you will have to pay attention to whether the traits in question are correlated or uncorrelated, and to whether your population consists of equal numbers of male and female individuals. If the traits are genetically uncorrelated, then to achieve an overall selected proportion of 5% you will have to select the best 37% of individuals for each trait (if you are concentrating on 3 genetically uncorrelated traits, since 0.37^3 ≈ 0.05). If your population consists of equal numbers of male and female individuals, well and good. However, if the numbers of male and female individuals are significantly unequal, then your effective population size will come down drastically! Taking these facts into consideration, you may proceed further as suggested by Maurice Ekpenyong.
Thanks!
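The 37% figure in the answer follows from independent culling on uncorrelated traits: if the same proportion p is retained for each of n traits, the overall retained proportion is p^n, so p = 0.05^(1/3). A minimal sketch, assuming the traits are independent; the function name is hypothetical.

```python
def per_trait_proportion(overall_proportion, n_traits):
    """Proportion to retain per trait so that independent culling on
    n uncorrelated traits leaves the requested overall proportion."""
    if not 0 < overall_proportion <= 1 or n_traits < 1:
        raise ValueError("need 0 < overall_proportion <= 1 and n_traits >= 1")
    # Solve p ** n_traits == overall_proportion for p.
    return overall_proportion ** (1.0 / n_traits)


if __name__ == "__main__":
    p = per_trait_proportion(0.05, 3)
    print(f"retain the best {p:.1%} for each of 3 traits")
```

With correlated traits the per-trait proportion would differ, which is why the answer asks about trait correlations first.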
  • asked a question related to Quantitative Genetics
Question
5 answers
I am having a problem running PHASE. The error I get says that the number of alleles is too large and that I need to increase the KMAX value in constants.h and recompile. The manual does not explain how to do this. I have seen a file called constants.hpp in the folder phase-master\src\phase.2.1.1.source, but changing the KMAX value there does not work.
Could you help me with this issue?
I do not know how to do it in MS-DOS
Thank you for your help
Relevant answer
Answer
Dear Luciana, thank you so much for the suggestion.
I will try it with your way.
  • asked a question related to Quantitative Genetics
Question
4 answers
Hi, can you please suggest any software for analyzing PCR array data as an alternative to Ingenuity Pathway Analysis? I can't afford it due to its cost.
Relevant answer
Answer
thanks
  • asked a question related to Quantitative Genetics
Question
1 answer
I need help finding a biomarker for F1 hybrids and backcrosses of a fish species.
Relevant answer
Answer
can you tell us how you overcame this problem?
  • asked a question related to Quantitative Genetics
Question
13 answers
I have unreplicated mean data for various grain yield traits from two consecutive years. I want to estimate broad-sense heritability from these data. Is that possible? If yes, could you please share an example or a link?
Relevant answer
Answer
Error variance is required to estimate broad-sense heritability. So with unreplicated data, it is not possible to estimate heritability.
  • asked a question related to Quantitative Genetics
Question
2 answers
Dear all,
Apologies if this does not make sense, I am very new to GWAS analysis.
I have recently imputed my data using the Michigan imputation server which has aligned my data to the HRC reference panel and has returned chr.vcf.gz and chr.info files for analysis.
I am now trying to filter the data and convert it to plink bim/bed/fam files. However, my question is this: my data contain multi-allelic variants and so may have duplicate IDs. If I filter on MAF 0.01 using vcftools and one of the alt variants has MAF<0.01, will it also remove the other because of the duplicate ID?
Alternatively, I was thinking of converting to plink format without filtering, however, I believe plink 1.9 default is to keep the most common alt allele and set the other to missing. I don't know what effect that would have on my analysis.
Hope this makes sense.
Any help is appreciated, thanks.
Relevant answer
Answer
In addition to a MAF threshold you need to set an imputation quality or R2 threshold as well for post-imputation QC.
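The two-threshold filter suggested in the answer can be sketched as below. This is a sketch under stated assumptions: it assumes the per-variant table has columns named "SNP", "MAF" and "Rsq" (check the header of your own .info file), and the Rsq cut-off of 0.3 is only a placeholder, not a value recommended in this thread. Each row is judged on its own, so two alternate alleles at the same position are kept or dropped independently in this sketch.

```python
def passes_qc(record, min_maf=0.01, min_rsq=0.3):
    """True if one variant record meets both the MAF and the
    imputation-quality (Rsq) thresholds."""
    return float(record["MAF"]) >= min_maf and float(record["Rsq"]) >= min_rsq


def filter_variants(records, min_maf=0.01, min_rsq=0.3):
    """Keep only the records that pass both thresholds, in input order."""
    return [r for r in records if passes_qc(r, min_maf, min_rsq)]


if __name__ == "__main__":
    # Made-up records; the first two share a position but differ in alt allele.
    example = [
        {"SNP": "1:1000:A:C", "MAF": "0.120", "Rsq": "0.95"},
        {"SNP": "1:1000:A:T", "MAF": "0.004", "Rsq": "0.90"},
        {"SNP": "1:2000:G:A", "MAF": "0.050", "Rsq": "0.10"},
    ]
    for record in filter_variants(example):
        print(record["SNP"])
```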
  • asked a question related to Quantitative Genetics
Question
11 answers
Hello,
I am currently investigating post-harvest data from breeding field trials* and, besides phenotypic correlations among traits, I also receive genetic correlations from the analysis software.
For instance I receive the values
1.08 + for the relationship yield with test weight
and
0.71 ++ for the relationship yield with thousand kernel weight.
My questions are:
1. Why are there correlation coefficients above 1?
2. Is the "+" nomenclature a common nomenclature for significance levels in genotypic correlations?
3. What would I use as an indirect selection trait for yield in my case: test weight or thousand kernel weight?
Thanks in advance!!
*wheat yield trials, analysis with software Plabstat
Relevant answer
Answer
Benedikt
I used this software for all my analyses. The significance level for the genetic correlation here is determined by its standard error. So, your values are correct. I also suggest contacting Prof. Utz. He is a very friendly professor and will answer you and give you more details. Try searching for "genetic correlation standard error" or similar keywords.
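On the first question in this thread: a genetic correlation computed from estimated variance and covariance components is not constrained to lie between -1 and 1, because the genetic covariance and the two genetic variances are each estimated with error. The sketch below uses the usual moment estimators from an ANOVA/ANCOVA with r replications; whether PLABSTAT computes the value exactly this way is an assumption, and all numbers are made up.

```python
import math


def genetic_component(ms_genotype, ms_error, n_reps):
    """Moment estimate of a genetic (co)variance component: (MS_g - MS_e) / r."""
    return (ms_genotype - ms_error) / n_reps


def genetic_correlation(cov_g, var_g_x, var_g_y):
    """r_g = cov_g(x, y) / sqrt(var_g(x) * var_g(y)).
    Not bounded by 1 when the components are estimates."""
    if var_g_x <= 0 or var_g_y <= 0:
        raise ValueError("genetic variance estimates must be positive")
    return cov_g / math.sqrt(var_g_x * var_g_y)


if __name__ == "__main__":
    # Made-up mean squares and mean cross-products, 3 replications.
    var_x = genetic_component(ms_genotype=40.0, ms_error=10.0, n_reps=3)
    var_y = genetic_component(ms_genotype=16.0, ms_error=4.0, n_reps=3)
    cov_xy = genetic_component(ms_genotype=24.0, ms_error=3.0, n_reps=3)
    print(round(genetic_correlation(cov_xy, var_x, var_y), 3))
```

An estimate slightly above 1 is therefore usually read as "close to 1, within sampling error" rather than as a computational mistake.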
  • asked a question related to Quantitative Genetics
Question
4 answers
I have data from randomly selected individuals for which I wish to determine the PI(sib). What is the best software to calculate the PI from sequence data for 9 markers?
Relevant answer
Answer
In R:
If you put your allele frequencies into a matrix pmat (one row per marker, padded with zeros if needed),
e.g.
# allele frequencies: one row per marker, zero-padded to 10 columns
pmat <- rbind(
  c(0.885, 0.115, rep(0, 8)),
  c(0.76, 0.14, 0.1, rep(0, 7)),
  c(0.59, 0.2, 0.1, 0.07, 0.04, rep(0, 5)),
  c(0.39, 0.15, 0.11, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05)
)
(4 test marker systems from Waits et al., Mol Ecol 2001)
then use:
# S2: sum of squared allele frequencies for each marker
S2 <- rowSums(pmat^2)
# per-marker PI(sib), multiplied across markers
PIsib <- prod((1 + 2*S2 + 2*S2*S2 - rowSums(pmat^4)) / 4)
(result for these 4 markers is 0.098)
  • asked a question related to Quantitative Genetics
Question
3 answers
I am testing the hypothesis that being heterozygous at a certain gene locus, say X, increases the chances of having the good allele of X compared to being homozygous. I have data on more than 6000 individuals with their zygosity for gene X. What background distribution of allele frequency could I use to test whether the observed frequency of the good allele in heterozygotes is higher than expected by chance? Thank you so much.
Relevant answer
Answer
Thank you everyone for your answers, have been very useful.
  • asked a question related to Quantitative Genetics
Question
6 answers
The Best Linear Unbiased Estimates (BLUEs) are to be used in a GWAS of resistance to yellow rust in durum wheat landraces. I have data from three environments, generated with an alpha-lattice design in 2 replications. Per replication, I have 10 blocks each containing 30 accessions, for a total of 300 materials. The accessions were sown in two rows 0.5 m long and 0.2 m apart; the distance between two accessions is 0.4 m.
Relevant answer
Answer
Thanks for your input Salvador. 
Cheers, 
Geraldo
  • asked a question related to Quantitative Genetics
Question
4 answers
I have 3 data sets of resistance evaluations from two locations, generated in a field experiment with an alpha-lattice design (2 replications of 300 materials and, per replication, 10 incomplete blocks containing 30 accessions). Two of the data sets are from the same location but different years, and the third is single-year data from another location. So I am thinking of calculating the BLUEs/BLUPs for each location, plus an overall set for the combined data, to be used in the GWAS.
Relevant answer
Answer
I strongly recommend using BLUEs, as you are doing a two-stage analysis. BLUEs give you an 'adjusted mean' for each genotype that accounts for design effects and possibly other covariates in your model. And this is what you want: a more precise mean. This translates into keeping your other model effects as they are (for example, replicate fixed and incomplete blocks and plots random) but fitting your genotype (or clone) effect as FIXED.
If you use BLUPs then you are shrinking your genetic effects. This means that your genetic effects are moved towards the mean according to their information. Yes, they will be your best predictions of those random effects, but they will be adjusted by their sample size and the variance associated with their data. This is what a random effect does, but the issue is that you eliminate part of the genetic signal if the effect is random, and you will often end up with more noise than you want for your GWAS. Hence, you fit it as fixed; then, when you do your GWAS, it will be a random effect, but it will not be shrunk twice.
Good luck
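The shrinkage described above can be illustrated with the simplest textbook case. This is a minimal sketch, assuming balanced data, unrelated genotypes and known variance components: the fixed-effect estimate is the raw deviation of the genotype mean from the overall mean, while the random-effect prediction multiplies that deviation by var_g / (var_g + var_e / n). The function names and numbers are illustrative, and a real alpha-lattice analysis would also include block and replicate terms.

```python
def shrinkage_factor(var_g, var_e, n_obs):
    """Weight applied to a genotype's deviation when it is fitted as random."""
    return var_g / (var_g + var_e / n_obs)


def blue_and_blup(genotype_mean, grand_mean, var_g, var_e, n_obs):
    """Return (fixed-effect deviation, shrunken random-effect prediction)."""
    deviation = genotype_mean - grand_mean
    return deviation, shrinkage_factor(var_g, var_e, n_obs) * deviation


if __name__ == "__main__":
    # Made-up values: 2 replications, error variance twice the genetic variance.
    blue, blup = blue_and_blup(genotype_mean=12.0, grand_mean=10.0,
                               var_g=2.0, var_e=4.0, n_obs=2)
    print(blue, blup)
```

The fewer observations a genotype has, or the noisier the trial, the smaller the factor and the stronger the pull towards the mean.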
  • asked a question related to Quantitative Genetics
Question
4 answers
I need to fine map a 10 cM QTL region to a narrow interval to find candidate genes associated with the phenotype. I did QTL mapping with F2 populations, selected the recombinant F2 plants using the significant (flanking) markers, and selfed them to produce F2:3 seeds. Could someone tell me how to determine the number of F2:3 individuals needed for fine mapping, based on the recombination frequency in the interval?
Thank you for sharing the thoughts in advance
Malli
Relevant answer
Answer
What is the size of the F1 population? If you intend to do map based cloning of the gene, it would be essential to increase the size of the population substantially. If you had a size of below 200 individuals for example, it will be worthwhile to increase to over 1000 individuals. You would need to genotype this population with markers flanking the QTL to identify individuals showing recombination events. You might also need to develop more closely linked markers within the region. Good luck! 
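As a rough guide to the population sizes mentioned above, one generic approximation asks how many informative gametes are needed to see at least one recombination within a target interval: with recombination fraction r per gamete, the probability of at least one recombinant among N gametes is 1 - (1 - r)^N. The sketch below solves this for N. It is a back-of-the-envelope calculation under the assumption of independent gametes, not a design rule specific to F2:3 families, and the function name is hypothetical.

```python
import math


def gametes_needed(recomb_fraction, prob_at_least_one=0.95):
    """Smallest N with 1 - (1 - r)**N >= prob_at_least_one."""
    if not 0 < recomb_fraction < 1 or not 0 < prob_at_least_one < 1:
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - prob_at_least_one)
                     / math.log(1 - recomb_fraction))


if __name__ == "__main__":
    # Target interval of roughly 1 cM (r of about 0.01), 95% chance of a recombinant.
    n = gametes_needed(0.01)
    print(n, "gametes, i.e. about", math.ceil(n / 2), "plants with two informative meioses each")
```

Narrowing the target interval tenfold raises the requirement roughly tenfold, which is consistent with the advice to move from a few hundred to over a thousand individuals.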
  • asked a question related to Quantitative Genetics
Question
2 answers
Can correlation alone be useful? I have seen twin studies interpreted using the correlation coefficient alone. Is that enough to tell the role of genetics in diseases? If not, which other tests can be done? I came across two tests, the heritability estimate and proband concordance, but there are mixed reviews about them. Please suggest how I should proceed.
regards
ANU
Relevant answer
Answer
@wim e crusio Thank you, sir, for your valuable suggestions. I have done both concordance and correlation, but the variables that were significant in the correlation analysis have not shown a good concordance rate and gave a negative heritability estimate. Is it OK if I stick to correlation alone?
  • asked a question related to Quantitative Genetics
Question
2 answers
What I have assumed until now is that environment variance + genotype variance + GxE variance needs to be 100%; however, my editor said it should be E + G + GxE + error variance = 100%. Further, I took the residual mean square as the error. Is that correct?
Relevant answer
Answer
Hi Sall, 
Yes, your SStotal = SSE + SSG + SSGxE + SSError.
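The partition in the answer means the four sums of squares, including the residual (error) term, add up to the total, so the percentages only reach 100% when the error is included. A small sketch with made-up sums of squares; the labels and function name are illustrative.

```python
def percent_of_total(sums_of_squares):
    """Express each sum of squares as a percentage of their total."""
    total = sum(sums_of_squares.values())
    if total <= 0:
        raise ValueError("total sum of squares must be positive")
    return {source: 100.0 * ss / total for source, ss in sums_of_squares.items()}


if __name__ == "__main__":
    # Made-up ANOVA sums of squares; "Error" is the residual term.
    shares = percent_of_total({"E": 600.0, "G": 200.0, "GxE": 120.0, "Error": 80.0})
    for source, pct in shares.items():
        print(f"{source}: {pct:.1f}%")
```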
  • asked a question related to Quantitative Genetics
Question
3 answers
I will soon be analyzing some nanopore data and wanted to see if anybody knows any particularly good papers on the best technical approaches and pipelines for this type of data. Thanks.
Relevant answer
Answer
Absolutely, yes I am wondering if there are good practices for the assembly/mapping and calling in handling nanopore data. You're right, the population genomics analysis I will be doing is downstream, but I wanted to mention it in case some of the inefficiencies of nanopore data have to be considered more strongly in the upstream processing in order for them not to affect the downstream population genomics analysis. If you have any advice, that would be much appreciated! Thank you!
  • asked a question related to Quantitative Genetics
Question
7 answers
I successfully fed in my diploid data (29 samples, 18 primers (loci) and an average of 8 bands/locus) for dominant-marker analysis of the Shannon index, total allele diversity, percentage polymorphism, etc. But at the last step of the analysis, a dialog box showing OUT OF MEMORY was displayed. How do I overcome this problem?
Relevant answer
Answer
"OUT OF MEMORY" always shows up when the data input file does not meet the requirements, in all the cases it has happened to me - it was a space/gap left at the end of the row
  • asked a question related to Quantitative Genetics
Question
3 answers
Is it methodologically correct to normalize grain moisture data in maize to mean zero and unit variance and then calculate variance components?
Since grain moisture in full-season maize hybrids is environmentally limited (in the environments of interest), and dry-down before autumn frost is slower in some years and faster in others, would it be appropriate to perform variance components analysis with data centered to the same mean? In some (rainy) years there is 250-290 g kg-1 water in the grain, while in other, dry years there is no more than 200 g kg-1.
I am aware that this would shrink the environmental variance and give information only about genotype x environment interaction and genotypic variance, but if the goal of selection is only to get faster dry-down than in some parental line, the information should be preserved, since the relations within environments would not change.
Any opinion is much appreciated.
Relevant answer
Answer
Hello Vlatko, I do not like transformations in general, unless strictly necessary. If I understand correctly, I would rather use a model that includes the variables you mention and let the data tell me whether these variables really have an effect on the response.
  • asked a question related to Quantitative Genetics
Question
7 answers
I have 15 genotypes each of corn and sorghum, planted in 2 different irrigation regimes with 3 replications in each regime. I have data for these 6 replications: yield/area, yield/panicle, chlorophyll content, plant height, total plant weight, harvest index, etc. I need to estimate BLUEs and BLUPs, and a prediction and/or estimate table of the effect of each line and of each environment on a major trait, using a mixed model in JMP (Bayesian is also an option, but not REML, because some variance components come out negative).
Relevant answer
Answer
Rahul, this will depend on the model you are building. Is this a split-plot design? To estimate BLUEs you build a model treating genotypes as fixed, and to estimate BLUPs you build another model treating genotypes as random. The reason you are getting negative variances is not REML itself; it is because you have left the "unbounded variance components" option on, which is useful in some cases. But I would recommend running REML. If the components are negative and you leave them bounded, they will be estimated as zero. Hope this helps.
  • asked a question related to Quantitative Genetics
Question
11 answers
We analysed the heritability of body size in a sexually dimorphic species using MCMCglmm. We aim to calculate the inter-sex genetic correlation as sqrt((hFD*hMS)/(hMD*hFS)). We got r > 1. How is that possible?
Relevant answer
Answer
With some software you can model this as two traits: males have a value for one of the traits and a missing value for the other, and the other way round for females. Genetic relationships allow the genetic covariance to be estimated, but there is no information on the residual covariance, which can be fixed at a value that you then ignore. I don't know if MCMCglmm can do this, but it is worth looking at.
  • asked a question related to Quantitative Genetics
Question
17 answers
Making inbreds is the most important step in developing hybrids, and it is not always easy to find genetic variation. So some researchers try to get new inbreds from the same old inbred by selfing plants that differ in color, stature, ear length, etc. Which type of character is better to start with, quantitative or qualitative?
Relevant answer
Answer
Dear Shaheed, thank you for your answer. I agree with you that selecting on a polygenic trait is the best way to improve the grain yield of a new line version. There was someone studying with me at KSU, USA, who finished his PhD in corn breeding; his name is Salim Khan. If you know something about him, please let me know. Thanks in advance.
  • asked a question related to Quantitative Genetics
Question
1 answer
Line=50,
Tester=2,
Environment=5
Without parents
How can I analyze this in R?
Relevant answer
Answer
You can use PPM (Plant Preservative Mixture) at 1 ml/litre of culture media. This is an American product and 100% safe. It has no side effects on the culture of explants.
  • asked a question related to Quantitative Genetics
Question
3 answers
SSR  Marker 
Relevant answer
Answer
There are many papers on the effect of SSR markers on economically important traits. Although their PIC is much higher than that of SNPs, SNPs are much more abundant and amenable to high-throughput technologies. This is the reason we use SNPs for breeding. Here is a good paper: http://www.sciencedirect.com/science/article/pii/S2214024714000100
  • asked a question related to Quantitative Genetics
Question
5 answers
I have 60 eucalyptus seed lots from three different sources, plus 4 control clones. When estimating heritability from the ANOVA table, should the ANOVA be calculated with or without the controls?
Relevant answer
Answer
The commonly applied coefficient of relationship for the first generation eucalypt progeny of 1/2.5 appears to be quite suitable for correcting variance component and heritability estimates.  http://dx.doi.org/10.20886/ijfr.2016.3.2.119-127
  • asked a question related to Quantitative Genetics
Question
8 answers
When would one use QTL mapping to find loci underlying complex traits?
What are the advantages of QTL mapping over GWAS? When would you want to use GWAS instead? 
I've noticed some recent papers that use a combination of both techniques, but I'm wondering what their key differences are.
Relevant answer
Answer
QTL mapping gives low resolution but high statistical power for detecting a QTL. The disadvantage is that you are limited to the genetic diversity present in the parents of your segregating population. You could think of using advanced intercrosses to increase the resolution.
GWAS offers very fine resolution, almost down to the base pair. However, the power to detect a QTL is determined by allele frequencies; for instance, you may lose power for detecting rare alleles. Another disadvantage of GWAS is that it is sensitive to population structure, which may lead to many false positives. There are ways to correct for population structure; however, if the population structure contributes to the variation in your trait, correcting for it may lead to many false negatives.
As you mention, there are several methods that try to combine the advantages of QTL mapping and GWAS. For example, MAGIC and NAM populations have broader genetic diversity and higher resolution than bi-parental populations, and allele frequencies are well enough balanced to reduce the problems of rare alleles. The development of these types of populations is labor- and cost-intensive.
I think the decision between using one or the other depends on several factors.
So the question is, what is the goal of your study?
1- To identify the causal gene or mutation (GWAS is better than QTL mapping), or just to find a linked marker (QTL mapping would probably do the job).
2- Are you interested in the whole genetic architecture of the trait, that is, how many genes and their effects (GWAS is better than QTL mapping), or are you interested in major QTLs for practical purposes (either QTL mapping or GWAS would do the job)?
3- Previous knowledge of the genetic architecture and heritability of the trait can help you make a decision as well.
3.1 If your trait or traits of interest are controlled by a few genes of large effect, then GWAS will be suitable.
3.2 If your trait is governed by rare alleles, then QTL mapping would be better, or you would need large populations to detect them with GWAS.
Here I would like to suggest some literature on the subject:
  • asked a question related to Quantitative Genetics
Question
3 answers
Just as the title says:
Suppose I have a set of genotype data with hundreds of samples. I want to simulate gene expression data for the same samples and at the same time incorporate the genotype information. I know there is a tool like ruvcorr. But can the simulated expression integrate genotype information as well?
Thanks a lot.
Relevant answer
Answer
Hi,
Maybe using C programming. See the link below.
Good luck. 
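One simple way to tie simulated expression to genotype is an additive eQTL-style model: expression = baseline + effect x genotype dosage + Gaussian noise. The sketch below is only an illustration under that assumption; the parameter names and default values are hypothetical, and it does not reproduce ruvcorr or any other published simulator.

```python
import random


def simulate_expression(dosages, effect_size=0.5, baseline=5.0,
                        noise_sd=1.0, seed=None):
    """Simulate one gene's expression for each sample from its genotype
    dosage (0, 1 or 2 copies of the alternative allele)."""
    rng = random.Random(seed)  # local generator so results are reproducible
    return [baseline + effect_size * dosage + rng.gauss(0.0, noise_sd)
            for dosage in dosages]


if __name__ == "__main__":
    genotypes = [0, 1, 2, 1, 0, 2]
    for dosage, value in zip(genotypes, simulate_expression(genotypes, seed=1)):
        print(dosage, round(value, 3))
```

Setting effect_size to 0 gives a gene with no genotype effect, so a mix of both kinds of genes can be simulated from the same genotype data.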
  • asked a question related to Quantitative Genetics
Question
4 answers
I am working on germplasm materials collected from 9 different locations. Several different species were collected from each location. However, the number of species collected from each location is not equal (some locations do not have certain species). Which statistical model should I use to evaluate the genetic variances of this population? Attached is an example of my dataset. Any help is much appreciated.
Relevant answer
Answer
ANOVA test is useful for evaluation of variance
  • asked a question related to Quantitative Genetics
Question
2 answers
I want to test how much of the variation in one particular phenotype is explained by variation in the expression of 10 different genes (the combined effect of the 10 genes).
Relevant answer
Answer
  • asked a question related to Quantitative Genetics
Question
1 answer
I have obtained microarray data for a large cohort (both sexes). I performed an initial GWAS on all SNPs from all chromosomes to check for genetic association with the trait I am interested in. I found some regions, but the most interesting one is on the X chromosome (in my opinion it is not a false signal). However, I am a bit confused because I do not know whether, and how, I can analyse these data: for women there is the standard three-genotype distribution, but for men only two variants are possible, presence or absence of the allele.
- Should I divide the cohort into separate analyses for the male and female subsets?
- What kind of statistics should I use for men, since I think it is impossible to use a simple MAF? And are the PLINK results for the male subset alone reliable?
- Or do you have any other advice?
I would be very grateful for all your help.
Relevant answer
Answer
I guess separate analyses need to be done for men and women, as men have XY chromosomes and would differ from the normal (MAF) analysis.
I think the PLINK software should be able to do what you need.
There is plenty of literature on GWAS in different organisms. The field is exploding. For fruitful guidance in your case, a journal like American Journal of Human Genetics would be very useful to search problems and solutions similar to yours.
  • asked a question related to Quantitative Genetics
Question
3 answers
Question: In a large random mating population, there are initially 91 percent dominants for a monogenic trait.
a. If there would be 20% selection against the dominants, what would be the expected frequency of the three genotypes in the next generation?
b. If selection would be against the recessives at an intensity of 0.10, what would be the frequency of the recessives after one generation? What would be the change in the frequency of the recessive allele?
c. If selection against the recessives would be complete every generation, how many generations would be needed to reduce the frequency of the recessive gene to 5%?
d. If the population would be selfed before completely discarding the recessives, what would be the frequency of the recessives after one generation of selection?
Relevant answer
Answer
Dear Dhananjay
It is a basic gene frequency calculation.
Please read the reference textbook:
Falconer, D. S. and Mackay, T. F. C. 1996. Introduction to Quantitative Genetics. Burnt Mill, England.
I hope it will be helpful for you.
Good Luck
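Parts a to c of the problem can be worked through with the standard one-locus selection recursions. This is a sketch of one textbook-style reading, assuming Hardy-Weinberg proportions before selection, complete dominance, random mating after selection, and "frequency in the next generation" meaning the zygote frequencies after that mating; part d (selfing) is not covered. With 91% dominants, q^2 = 0.09, so the starting recessive allele frequency is q = 0.3. Function names are illustrative.

```python
import math


def next_allele_freq(q, w_dom_hom, w_het, w_rec):
    """Recessive-allele frequency after one generation of selection,
    given relative fitnesses of AA, Aa and aa."""
    p = 1.0 - q
    mean_fitness = p * p * w_dom_hom + 2 * p * q * w_het + q * q * w_rec
    return (p * q * w_het + q * q * w_rec) / mean_fitness


def genotype_freqs(q):
    """Hardy-Weinberg frequencies (AA, Aa, aa) for recessive-allele frequency q."""
    p = 1.0 - q
    return p * p, 2 * p * q, q * q


def generations_to_reach(q0, q_target):
    """Complete selection against recessives: q_t = q0 / (1 + t * q0),
    so t = 1/q_t - 1/q0, rounded up to whole generations."""
    return math.ceil(1.0 / q_target - 1.0 / q0)


if __name__ == "__main__":
    q0 = math.sqrt(1.0 - 0.91)
    # a. 20% selection against dominants: AA and Aa have fitness 0.8.
    print("a:", genotype_freqs(next_allele_freq(q0, 0.8, 0.8, 1.0)))
    # b. Selection of 0.10 against recessives: aa has fitness 0.9.
    q1 = next_allele_freq(q0, 1.0, 1.0, 0.9)
    print("b:", q1 * q1, q1 - q0)
    # c. Complete selection against recessives until q falls to 0.05.
    print("c:", generations_to_reach(q0, 0.05))
```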
  • asked a question related to Quantitative Genetics
Question
2 answers
Dear colleagues,
I have some categorical data that I would like to analyse with GenStat software using a GLMM. But I am not sure how to enter the data into the software: as raw data (please see the attached file) or as a contingency table, since they are not continuous variables?
Thank you for your help 
Relevant answer
Answer
Hi
See the attached files.
Good luck
  • asked a question related to Quantitative Genetics
Question
2 answers
In MLM: genetic variance (σ²G) = ?
In GLM: genetic variance (σ²G) = (MSp - MSe) / r
Environmental variance (σ²A) = MSe
Phenotypic variance (σ²F) = σ²G + σ²A
where: MSp = mean square of populations, MSe = mean square of experimental error, r = number of replications
Relevant answer
Answer
Hi,
See the attached file.
Good luck
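The ANOVA formulas in the question can be written out directly. This is a minimal sketch with made-up mean squares; it covers only the mean-square (GLM) route from the question. In a mixed model the genetic variance is typically obtained directly as the estimated variance component of the random genotype term (for example by REML) rather than derived from mean squares. Function names are illustrative.

```python
def variance_components(ms_populations, ms_error, n_reps):
    """Return (genetic, environmental, phenotypic) variance estimates
    from an ANOVA with n_reps replications."""
    var_g = (ms_populations - ms_error) / n_reps
    var_e = ms_error
    var_p = var_g + var_e
    return var_g, var_e, var_p


def broad_sense_heritability(ms_populations, ms_error, n_reps):
    """Broad-sense heritability on a single-plot basis: var_g / var_p."""
    var_g, _, var_p = variance_components(ms_populations, ms_error, n_reps)
    return var_g / var_p


if __name__ == "__main__":
    # Made-up mean squares with 4 replications.
    print(variance_components(50.0, 10.0, 4))
    print(broad_sense_heritability(50.0, 10.0, 4))
```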
  • asked a question related to Quantitative Genetics
Question
3 answers
Hello everyone! Does anyone know how I can transform the brlmm-p output from the Affymetrix 5.0 genome-wide chip into a phd file?
Format:
probe_id A-100 A-1 A-2 A-4 A-3 A-5 A-6 A-11 A-12 A-13 A-14 A-15 A-16 A-17 
SNP_A-1780520 1 -1 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2
etc
Calling code
-1= no call
0=AA (homozygous for the reference allele)
1=AB
2=BB (homozygous for the alternative allele)
Thank you in advance!
Relevant answer
Answer
Hi
See the attached file
Good luck
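The calling code listed in the question can be translated mechanically. The sketch below assumes the target is a PLINK-style PED layout (one row per sample, two allele columns per SNP, "0 0" for a missing call); that reading of the requested file type is an assumption, as are the placeholder allele letters A and B. It produces only the genotype columns: the six leading PED columns and the accompanying MAP file are not generated here.

```python
# Mapping from brlmm-p call codes (as listed in the question) to allele pairs.
CALL_TO_ALLELES = {
    -1: ("0", "0"),  # no call
    0: ("A", "A"),   # homozygous for the reference allele
    1: ("A", "B"),   # heterozygous
    2: ("B", "B"),   # homozygous for the alternative allele
}


def calls_to_ped_genotypes(sample_ids, snp_rows):
    """Transpose SNP-major call rows into one genotype string per sample.

    snp_rows is a list of (probe_id, calls) with one call per sample,
    in the same order as sample_ids.
    """
    per_sample = {sample_id: [] for sample_id in sample_ids}
    for probe_id, calls in snp_rows:
        if len(calls) != len(sample_ids):
            raise ValueError(f"{probe_id}: expected {len(sample_ids)} calls")
        for sample_id, call in zip(sample_ids, calls):
            per_sample[sample_id].extend(CALL_TO_ALLELES[call])
    return {sample_id: " ".join(alleles)
            for sample_id, alleles in per_sample.items()}


if __name__ == "__main__":
    samples = ["A-1", "A-2"]
    rows = [("SNP_1", [0, 1]), ("SNP_2", [-1, 2])]
    for sample_id, genotypes in calls_to_ped_genotypes(samples, rows).items():
        print(sample_id, genotypes)
```

The length check matters for input like the example row in the question, where the number of calls does not obviously match the number of sample columns.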
  • asked a question related to Quantitative Genetics
Question
7 answers
Hi all,
I'm an R user and would like to perform some quantitative genetics tests. The experimental design is a full-sib/half-sib design (each male mated with two females).
Thanks in advance 
Relevant answer
Answer
ASReml-R is a good package for classical quantitative genetics using AI-REML. MCMCglmm is also a good package. rrBLUP and BGLR are also good packages once you have converted your pedigree to the inverse of the A matrix.
  • asked a question related to Quantitative Genetics
Question
17 answers
Is mixed linear model (MLM) always better than GLM (general linear model) in association mapping? I was told by a researcher that MLM model is always better than GLM in association mapping, because MLM uses not only the population structure but also the kinship to control the false positive, whereas GLM only considers population structure. However, I was also told by others that it is better to choose the right model and MLM is not always better than GLM. Indeed, I read some recent published papers that only use GLM in association mapping. My question is that under what kinds of circumstances, GLM is a better option compared to MLM? How can I determine that? Thank you for your help and answer in advance!
Relevant answer
Answer
MLM always outperforms GLM in controlling false positives, which can be seen in a QQ plot. In this respect, MLM is better than GLM. However, the more stringent control of MLM sometimes introduces false negatives, which is believed to be one reason for the missing heritability. Empirically, we have also met cases where no significant signals were identified in a population for a trait with sufficiently high heritability. If you go back to the GLM results, you may see so many significant associations across the whole genome that it is hard to select the credible ones. In this complex situation, I think neither GLM nor MLM can handle it. There are always other approaches besides GLM and MLM, for example FarmCPU, ADGWAS, etc.
Do not hesitate to try different methods, you will find the right one suitable to your specific data.
  • asked a question related to Quantitative Genetics
Question
6 answers
General and Specific combining ability
Relevant answer
Answer
get in touch with ILRI scientists, Dr Ojango
  • asked a question related to Quantitative Genetics
Question
3 answers
Hi all,
I am not familiar with GenStat programming, so I am just using the Windows version. I would like to know how to calculate broad-sense heritability.
Thanks
Relevant answer
Answer
Thanks a lot, Dr. Filippo, I really appreciate your answer and guidance.
Sincerely,
Djouher
  • asked a question related to Quantitative Genetics
Question
3 answers
The software should take fasta file
Relevant answer
Answer
Please consider that a FASTA file does not include general/proper/standard annotations.
A FASTA file is only a name and a sequence. Some people encode annotations in the name in a non-standard way, but programs cannot rely on them. Normally a FASTA file can be accompanied by a GFF file that annotates it, but this is an external file.
GenBank files, on the other hand, include both the sequences and the annotations.
I am not sure about what you are looking for. Can you provide a link or an image of an example of what you want to do? for example if you have seen what you want in some published article that we can see?
Best regards,
---sram
  • asked a question related to Quantitative Genetics
Question
5 answers
Hi folks - I am trying to run a Joint Scaling test but am having difficulties finding a step-by-step description of how to run one. Does anyone have a great go-to book? (Please note, my university library doesn't have a copy of the Lynch & Walsh (1998) textbook.) Any suggestions of where to look/read, would be much appreciated!
Relevant answer
Answer
More modern mixed models also work to do the analysis ("cross-means analysis","generation means analysis"....just a different name for the same thing):
Piepho HP, Möhring J (2010) Generation means analysis using mixed models. Crop Science 50, 1674-1680.
This publication gave me a good overview:
Cavalli LL (1952) An analysis of linkage in quantitative inheritance. In: Quantitative Inheritance: Papers read at a colloquium held at the Institute of Animal Genetics, Edinburgh University under the auspices of the Agricultural Research Council, April 4th to 6th, 1950 (eds. Reeve ECR, Waddington CH), pp. 135-144. HMSO, London.
  • asked a question related to Quantitative Genetics
Question
4 answers
Hi everyone,
Can I compare the copy numbers of two genes estimated by qPCR? I am quantifying two genes (gene A and gene B) in soil samples. Assume I get copy numbers of 1×10³ and 1×10⁵ for gene A and gene B, respectively. Can I say that there is less gene A than gene B in the soil? Or, quantitatively, that gene A is 100-fold less abundant than gene B?
Additionally, can I compare my copy number for gene A with values from other publications that quantified gene A using a different primer set?
Thank you for your help
Relevant answer
Answer
@Nimala
Thank you for your reply. I understand that copy number varies between genes. If we assume that both genes of interest are single-copy, can we directly compare the qPCR-derived copy numbers of the two genes?
  • asked a question related to Quantitative Genetics
Question
4 answers
I have 145 genotypes and I would like to analyze them in a lattice design. Can I design my experiment with 29 incomplete blocks of 5 genotypes each?
Relevant answer
Answer
Yes, what you want is an alpha lattice.
I recommend the lecture by Jennifer Kling (Oregon State University), linked below.
The 'agricolae' R package can assist you with the design and analysis of your alpha lattice.
You will, of course, still want to replicate the experiment.
29 x 5 may not be the most efficient or effective design, however. You may consider adding some checks to make a better design. 150 entries would give you more options to choose from. Alternatively, if you can drop a genotype, 144 would make a nice square lattice. As Jennifer says, "Use as large a block size as possible while maintaining homogeneity of plots within blocks."
  • asked a question related to Quantitative Genetics
Question
3 answers
Hi 
I have 100 genotypes that were tested in three replications. I need a model in R that lets me test whether there is a significant replication x genotype interaction.
I know the model will be replication + genotype + replication x genotype. How can I test the significance of the replication x genotype term?
Relevant answer
Answer
Thank you for your answer. The experiment is randomized complete block design
  • asked a question related to Quantitative Genetics
Question
7 answers
I'm working on a GWAS project, and running the data in PLINK (with the --assoc and --adjust options) originally gave me a genomic inflation factor of around 1.12. I ran a PCA of the data using GCTA and tried to exclude samples that appeared as outliers, but my genomic inflation factor just seems to keep increasing (1.13 - 1.14) every time I exclude samples. Is there a reason why this is happening?
Edit: Forgot to add that the samples I am using are all cases (i.e. the affected and unaffected groups are both made up of individuals positive for a particular phenotype). Would such sample sets lead to the slight increase in genomic inflation originally seen?
Relevant answer
Answer
These are great steps, Sophie Limou.
At step 5, I would add:
The C. Tian (2008) paper 'Accounting for ancestry: population substructure and genome-wide association studies' shows that the HLA region and large genomic inversions can affect your population stratification analysis. I therefore suggest excluding SNPs in these regions before your PCA, but bringing them back when you run the actual association analysis.
Build 37 base-pairs coordinates for these regions are:
## BP ranges for HLA, 8p23.1 and 17q21.31 regions ##
##remove 8p23.1 [1680460:19329000]
##remove 17q21.31 [40920300:41421700]
##remove HLA [25707300:33850900]
Hope all these steps will bring the inflation factor below 1.1 :)
P.S. It would be interesting to know your rule when defining the PCA outliers, but I would go with 2, 3 or 4 standard deviations from the mean of PC1 and PC2.
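The two filtering steps in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SNPs are given as (id, chromosome, position) tuples, PC scores are already in memory, and the chromosome numbers (6, 8 and 17) are inferred from the region names. The function names and the default cut-off of 3 SD are made up for the example and are not PLINK or GCTA options.

```python
from statistics import mean, stdev

# Build 37 ranges quoted above, keyed by chromosome (inferred from region names).
EXCLUDED_REGIONS = {
    "6": [(25707300, 33850900)],    # HLA
    "8": [(1680460, 19329000)],     # 8p23.1, range as quoted in the answer
    "17": [(40920300, 41421700)],   # 17q21.31
}

def snps_for_pca(snps):
    """Return IDs of SNPs outside the excluded regions.

    snps is a list of (snp_id, chromosome, base_pair_position) tuples.
    """
    kept = []
    for snp_id, chrom, bp in snps:
        ranges = EXCLUDED_REGIONS.get(str(chrom), [])
        if not any(start <= bp <= end for start, end in ranges):
            kept.append(snp_id)
    return kept

def pca_outliers(scores, n_sd=3.0):
    """Flag samples more than n_sd standard deviations from the mean of PC1 or PC2.

    scores maps sample_id -> (pc1, pc2).
    """
    flagged = set()
    for axis in (0, 1):
        values = [pcs[axis] for pcs in scores.values()]
        centre, spread = mean(values), stdev(values)
        for sample, pcs in scores.items():
            if abs(pcs[axis] - centre) > n_sd * spread:
                flagged.add(sample)
    return sorted(flagged)
```

The excluded SNPs are dropped only for the PCA; the association run would still use the full SNP set, as suggested above.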
  • asked a question related to Quantitative Genetics
Question
2 answers
I want to test for association between disease genetic markers and quantitative traits other than the disease phenotype itself.
The disease markers are monogenic with a dominant mode of inheritance, but the quantitative traits I want to test them against are polygenic. Which model serves better for this, the dominant genetic model or the additive one?
Relevant answer
Answer
Hello Doctor,
On this matter I recommend that you ask Dr. Pouya Zamani and Dr. Ahmad Ahmadi in the Department of Animal Science at Bu-Ali Sina University, since they are specialists in animal genetics and are also well versed in statistical work.
  • asked a question related to Quantitative Genetics
Question
4 answers
I intend to obtain a kinship matrix from pedigree data with proc inbreed, to be used as input to proc mixed in order to estimate additive genetic variance. I always get the warning message "Individual clone=I011412 has been previously defined. Observation 27 corresponding to this individual will not be processed." How should I order the individuals to avoid this warning message?
proc inbreed data=pedigree covar outcov=kinmat;
var clone female male;
run;
Relevant answer
Answer
There are many functions (and libraries) in R that can help you sort your data into the proper order. The main rules are: grandparents before parents, and parents before offspring; however, this is sometimes not easy. The key is that any individual with a sire and dam defined in the pedigree file needs to have those parents defined earlier in the file. I recommend you use GenoMatrix for this. This package has a module (Pedigree Sort) that will help you read and sort your pedigree file. Check it out at:
This software also has other routines to clean your molecular data and to generate your kinship matrix (additive, dominance and even epistatic matrices).
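The ordering rule itself (parents before offspring) is a topological sort, and a minimal Python sketch may make it concrete. It assumes each record is a (clone, female, male) tuple as in the `var` statement of the question, that unknown parents are coded as "", "0" or None, and that parents without their own record are founders. The function name is hypothetical; this is not the GenoMatrix routine.

```python
def sort_pedigree(records, unknown=("", "0", None)):
    """Return (clone, female, male) records ordered so parents precede offspring."""
    by_id = {rec[0]: rec for rec in records}
    ordered = []
    state = {}  # 1 = currently being visited, 2 = already placed

    def visit(ind):
        if state.get(ind) == 2:
            return
        if state.get(ind) == 1:
            raise ValueError(f"pedigree loop involving {ind}")
        state[ind] = 1
        _, female, male = by_id[ind]
        for parent in (female, male):
            # Parents with no record of their own are treated as founders.
            if parent not in unknown and parent in by_id:
                visit(parent)
        state[ind] = 2
        ordered.append(by_id[ind])

    for rec in records:
        visit(rec[0])
    return ordered
```

The sketch is recursive, so a very deep pedigree would need an iterative version; the sorted records could then be written back out and passed to proc inbreed.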
  • asked a question related to Quantitative Genetics
Question
3 answers
In plant breeding, we often talk about estimates of random effects, i.e. BLUPs and variance components, for quantitative variables such as yield, plant height, etc. My feeling is that estimating BLUPs or variance components for qualitative traits that are ranked or ordinal is wrong and unreliable. Take, for example, disease severity scored from 1 = no symptoms to 5 = extremely severe symptoms. This is a ranked qualitative trait! Is it appropriate to estimate BLUPs or variance components of random effects for such a trait?
Relevant answer
Answer
Yes, strictly speaking you cannot use normality assumptions for this type of data, and you will need Generalized Linear Mixed Models (GLMM) or another type of methodology (Bayesian?) in order to model the distribution of your data properly.
In practical terms, however, we do use LMMs with normality assumptions for this type of variable on a regular basis. For example, we sometimes have a score from 1 to 5 for straightness of pine trees, where 1 is completely straight and 5 is very curved. This variable is discrete and ordinal, but if certain properties hold it can work quite well. First, you need equal spacing between your classes, meaning that going from 1 to 2 is the same change in score as going from 4 to 5, and this needs to be clearly defined in your measuring protocols. Second, you ideally need many class values in your variable (10 or more), because this gives a closer approximation to normality (central limit theorem) and therefore fewer issues. In any case, you will always have some 'doubts' about your heritability and BLUP estimates, as you are 'approximating' your variable (e.g. the true level of departure from straightness) with something different (e.g. a score).
  • asked a question related to Quantitative Genetics
Question
2 answers
Hi all,
I would be grateful for your opinion.
I am working on archival FFPE prostate samples dating back as far as 1994. The RNA concentration varies between 991 and 14500 pg/ul (Agilent Bioanalyzer), with RIN values of 1.5-2.5 at best, and I have less than 2 ul of each sample. I plan to reverse transcribe the RNA to cDNA and move on to Fluidigm experiments.
What is the best kit for reverse transcription of low-concentration, low-RIN RNA samples?
Kind regards,
Kenneth Hiew
Research Fellow
Relevant answer
Answer
Hi, you can try SMARTer Universal Low Input RNA Kit—cDNA Synthesis for NGS from Degraded Samples. http://www.clontech.com/US/Products/cDNA_Synthesis_and_Library_Construction/Next_Gen_Sequencing_Kits/Total_RNA-Seq/Universal_RNA_Seq_Random_Primed
Good luck,
Lesya
  • asked a question related to Quantitative Genetics
Question
1 answer
The 15 farms have new treatments in common but the local variety which is considered to be a check variety varies from one farm to another. There was NO replication of each treatment within a farm. Therefore, each farm is taken as a block in order to estimate block effect and residual term for comparison. Does it make sense to make a pairwise comparison since the local check is unique from one farm to another? The SAS script for the analysis is as follows
proc mixed data=onfarm covtest;
class farm variety;
model fyld=variety nohav/ddfm=satterth; /* nohav is a covariate */
random farm;
lsmeans variety/diff adjust=tukey;
run;
quit;
Relevant answer
Answer
Assuming that you randomized each block, it sounds like you have a slightly unusual augmented design (unusual because the checks are generally replicated, rather than the new treatments). I'd recommend that you read about augmented designs before continuing with your analysis (links below).
You should be able to make pairwise comparisons, but not with a thing called "check". Since your checks are not the same variety in all blocks, you should not estimate a combined mean for them. The augmented design analysis will allow you to compare each treatment variety with each control variety.
  • asked a question related to Quantitative Genetics
Question
3 answers
I am working on QTL associated with seed yield using SNPs. 18 SNPs have been generated. My concern is: to what extent could this small number affect the reliability of my results? I limited the number because of the cost of adding more SNPs. I am a self-sponsored female PhD student. Please kindly advise me.
Relevant answer
Answer
I agree with Isain: he probably refers to human data, whereas you, Folake, seem to work with plants. I work with animals, where 3k-7k panels are considered low density, 54k medium density and 800k high density.
  • asked a question related to Quantitative Genetics
Question
2 answers
I've been searching the internet but can't find good examples, guides or other tutorial-type documents for INLA (and AnimalINLA) applied to quantitative genetics.
Does anyone have any good material?
Relevant answer
Answer
  • asked a question related to Quantitative Genetics
Question
11 answers
Dear all, can you please tell me which allele should be selected for a QTL with an additive effect in DH populations? I have a DH population; let's say it was generated by crossing cultivar (cv.) A with cv. B, where cv. A is poor for a particular trait (e.g. protein content) and cv. B is superior. In genotyping, parent A is scored as A and parent B as B. In the QTL analysis I identified 5 QTLs, 2 with a negative additive effect and 3 with a positive additive effect. In this case, which type of QTL (with -ve or +ve additive effect) is desirable for increasing protein content, and for the desirable QTL(s), which parental allele(s) should be selected?
Relevant answer
Answer
Your question should be framed differently. Each QTL is important, whether it has an overall negative or positive effect. You have to select the desirable allele of each QTL, whether it comes from parent A (poor) or B (superior). The poor parent may also carry a desirable allele for a QTL. It is not the QTL, but its desirable allele, which needs to be selected. Each QTL will have one desirable and the other undesirable allele. Sometimes the desirable allele of a QTL may be present in poor parent. I hope I have answered your question. 
  • asked a question related to Quantitative Genetics
Question
3 answers
I am very new to the topic of Transposon sequencing. Can anyone suggest a good book for beginners? My question at this moment is :
I come across the term "fitness" very often when reading articles on transposon sequencing. For example: "Tn-seq, a robust and sensitive method for the discovery of quantitative genetic interactions in microorganisms through massively parallel sequencing. The approach does not depend on a pre-existing array of mutants but is instead based on the assembly of a saturated transposon insertion library. After growth of the library under a test condition, the change in frequency of each insertion mutant is determined by sequencing the transposon flanking regions en masse. The change in frequency reflects the effect of the insertion on fitness. Fitness of every insertion in a genome can be determined in this way and is a quantitative measure of the growth rate."
Relevant answer
Answer
The fitness is just how well an insertion mutant grows relative to the "wildtype" (parent), i.e. if the growth rate were halved, the insertion mutant would have a fitness of 0.5. If it grew 10% faster, it would have a fitness of 1.1.
  • asked a question related to Quantitative Genetics
Question
40 answers
I'm looking for a low-cost, high-reliability lab that can process a few thousand HNMR samples for metabolomic analysis. We'll do the data analysis and bioinformatics, but we need someone to acquire the spectra.
Relevant answer
Answer
Since the list appears to be quite active, I'll add our details:
MS-Omics is a service provider of metabolomics and data analysis. 
  • 10 days lead time from receipt of samples to final report for standard analysis by GC-MS or LC-MS/MS
  • Biomarker discovery and identification of unknown metabolites available with state-of-the-art instruments: GCxGC-qTOF and LCxLC-IM-MS/MS
  • We have a data processing service for customer's own data files
  • Advanced multivariate data analyses is our preferred tool to identify biomarkers, elucidate mode-of-action, characterize products and processes etc.
  • asked a question related to Quantitative Genetics
Question
8 answers
Hello,
I'm trying to run a matrix of co-dominant diploid data in POPGENE 1.32. I'm working with 3 populations and 9 loci.
My matrix was coded according to the manual, and I have already obtained some results, but I want to know the gene flow between popA vs popB, popA vs popC, etc.
I have looked for a detailed manual on the web, without success so far.
The results I obtained are shown in the attached image.
Thanks in advance!
Relevant answer
Answer
I also suggest using MIGRATE, which gives more reliable estimations of gene flow than POPGENE and GENALEx, and also allows you to test different migration models.
Hope this helps
  • asked a question related to Quantitative Genetics
Question
9 answers
I am currently interested in several (around 100) genes in fish and would like to investigate their expression levels using publicly available RNA-Seq data. My strategy is to build a set of reference sequences (the genes of interest), index them with Bowtie 2 and then align the publicly available RNA-Seq SRA data (filtered using the SRA Toolkit) against it. The expression level of each gene is then estimated from the resulting SAM file with eXpress, as an FPKM value.
I have several questions about this strategy.
Firstly, when building the functional gene reference, what kind of sequences should I use if no genomic data are available? For example, gene A may have been studied by several groups, whose sequences can be found in the NCBI Nucleotide database but with different lengths. Which one should I choose? Besides, introns may be spliced out during RNA processing. So which sequence should I use, before or after splicing (this matters because the length of the gene affects the final FPKM value), and how can I tell whether a given RNA sequence is spliced or not?
Secondly, is there any problem with the expression levels estimated by this strategy? Will they be over- or underestimated?
Any other suggestions are very welcome!
Relevant answer
Answer
I think Juan's idea of building your own transcript assembly and then mapping to that is a good one. Trinity and Bridger are nice software for that. If you can get access to a large server/cluster, you could probably run transcriptome assemblies in a matter of a few days. If not, I would suggest taking the full-length genes (with introns) and mapping with a splice-aware mapper such as TopHat2 or GSNAP. You could use Bowtie if you're not interested in the splicing. A program like HTSeq lets you get raw read counts for your genes of interest, and these can be used to calculate FPKM or be put into differential expression analysis software (e.g. DESeq). Cufflinks can also be used for quick gene expression analysis.
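As a rough illustration of how raw counts turn into FPKM, here is a minimal Python sketch. The function name and input dictionaries are hypothetical, and the total used for scaling is simply the sum of the counts passed in. When reads are mapped to only about 100 reference genes, that total is not the library-wide number of mapped fragments, so the resulting values would not be comparable with FPKM values computed against a full transcriptome.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments.

    counts     maps gene -> raw fragment count
    lengths_bp maps gene -> transcript length in base pairs
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped fragments")
    # count / (length in kb) / (total in millions) = count * 1e9 / (length_bp * total)
    return {gene: n * 1e9 / (lengths_bp[gene] * total) for gene, n in counts.items()}
```

Because gene length sits in the denominator, using the unspliced genomic length instead of the mature transcript length would lower the FPKM of intron-rich genes.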
Good luck
  • asked a question related to Quantitative Genetics
Question
12 answers
I have calculated genetic gain across two generations using mean values from two generations (F8 and F9) that were grown in two years. How should I write the units in this situation?
Ex: Yield = kg ha⁻¹ generation⁻¹ (gain in kg per hectare per generation)
Do I need to divide by two to get the gain per generation? Or can it be considered one generation, because F8 and F9 are genetically almost fixed and our genetic gain calculations were based on the mean value across the two generations?
Relevant answer
Answer
Thank you all for your suggestions.
Sorry for the late reply.
  • asked a question related to Quantitative Genetics
Question
4 answers
Dear all,
I am preparing a quantitative genetics post-doc project on age-related dynamics in vertebrates (probably only birds in a first instance) and there is a strong bias in the suitable datasets toward temperate and polar species. I am thus wondering whether anyone would have suitable datasets on tropical species. The main criteria would be a good pedigree (even if only a social pedigree) for the population and a time span sufficient to have individuals of known age measured in their late life.
I would be happy to describe the aims and methodology of the project in more detail and to discuss a collaboration with interested people.
Thanks for your help!
Relevant answer
Answer
It sounds to me like you need to be searching for long-term studies on your species of interest, and that there might be more of what you're looking for in the gray literature than in the primary literature.  Have you tried direct queries to field stations and NGOs in your region of interest?  How about banding/ringing databases?  Folks at the North American Banding Council (http://www.nabanding.net/) might have insights, as might people you can access via the Ornithology Exchange (http://ornithologyexchange.org/).  You can find references for life history information of birds worldwide here: http://www.hbw.com/.  Lastly, this reference was just made public, but it's North American birds: http://www.vitalratesofnorthamericanlandbirds.org/.
Good luck!
~tim
  • asked a question related to Quantitative Genetics
Question
4 answers
Genotype  Cases  Controls
CC        33     36
CT        58     48
TT         9     16
With the above genotype data, how shall I arrive at a haplotype and its frequency?
Relevant answer
Answer
Hi  Raghunath
My apologies, I missed the haplotype part of your question. The calculation above was for a standard case-control association. You will need at least 2 polymorphisms within an appropriate distance of one another, e.g. within the same gene, to perform haplotype analysis. Haploview is quite an easy program to use. Alternatively, if you're familiar with scripting, you can use haplo.stats in the R programming environment.
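For the single-SNP part, allele counts and a simple allelic chi-square test can be computed directly from the genotype table in the question. The Python sketch below is an illustration only, with hypothetical function names. It uses a 1-df Pearson chi-square on the 2x2 allele table without continuity correction, which may differ from the calculation referred to in the earlier answer. A haplotype cannot be derived from this one SNP.

```python
from math import erfc, sqrt

def allele_counts(n_cc, n_ct, n_tt):
    """Return (C allele count, T allele count) from genotype counts."""
    return 2 * n_cc + n_ct, 2 * n_tt + n_ct

def allelic_chi2(cases, controls):
    """Pearson chi-square (1 df) on the 2x2 table of alleles by status.

    cases and controls are (CC, CT, TT) genotype counts.
    """
    a, b = allele_counts(*cases)      # C and T alleles in cases
    c, d = allele_counts(*controls)   # C and T alleles in controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))    # upper tail of chi-square with 1 df
    return chi2, p_value

chi2, p_value = allelic_chi2((33, 58, 9), (36, 48, 16))
```

With these counts the C allele frequency is 124/200 in cases and 120/200 in controls.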
  • asked a question related to Quantitative Genetics
Question
10 answers
I want to calculate pooled genetic gain across 5 environments, but the trials in the 5 environments used different designs and different numbers of replications, so in practice I cannot do a combined analysis. I have therefore analyzed each environment individually. I would still like to see the genetic gain across all five environments. Can I use the mean values from each environment to get a pooled mean across the 5 environments, and use those pooled means to estimate genetic gain? If not, what is the best way to estimate pooled genetic gain?
Bharath
Relevant answer
Answer
Thank you Dr. Nusret Zencirci,
Is there a way to combine an augmented design study with an alpha-lattice design for a population of 100 entries? Some people treat each environment as a block and analyze it as an RCBD for breeding purposes, but I think the block size is too big to account for field variability. Is there a better way to do a combined analysis under these circumstances?
  • asked a question related to Quantitative Genetics
Question
6 answers
Sometimes we need to analyze diallel data using Hayman's approach along with Griffing's approach. Wr-Vr graph is one of the outputs of Hayman's approach. I know 'Dial98' can do the same. But I am looking for a script or package in R.
Relevant answer
Answer
I have written a SAS macro program for Hayman's approach and have sent it to the journal TAG; I hope it will be accepted. Once it is accepted, I will make it freely available to other researchers.
  • asked a question related to Quantitative Genetics
Question
2 answers
Can anyone suggest the best way to present qPCR data, including the ±SE? In many cases the fold change in expression is shown as the usual 2^-ddCt or its log2, but an error bar is plotted for the control as well. Where does the deviation shown for the control come from? The ddCt calculation makes the control 1, and the log transformation makes it 0. So when the data are normalized to the control, how does the control get an error bar?
Relevant answer
Answer
Showing data that is normalized to a control AND showing the control (normalized to itself!) with error bars is at best mixing different concepts of variation (which is not a sound idea) and at worst simply unnecessary and silly.
The information from the controls is used up in calculating the ddCt values of the other groups. It makes no sense to explicitly normalize the control group to itself (this provides no information, so it is an absolutely pointless thing to do).
The information that is left over is:
"all other (ddCt) values refer to 0 as the reference"
This means it makes perfect sense here (with ddCt) to draw the horizontal axis at y=0, so that one immediately sees that positive values indicate induction (relative to the controls) and negative values indicate repression (relative to the controls). If data are normalized to a control, then there is no control group to be shown in the plot. Instead, the reference line should be drawn to indicate the (constant!) reference value. So a question like "what is the error bar of the control group?" does not actually arise; it makes no sense.
This is different for dCt values. When dCt values are shown, then there is either NO reference line at all, or the mean dCt of the control group can serve as the reference line. Either the horizontal axis, as the reference line, should be placed at y=mean(dCt[controls]), or all the dCt values should be shifted by -mean(dCt[controls]) and the horizontal axis placed at y=0. This way, all the groups (including the controls) are represented on the plot, but the actual change in expression must be inferred implicitly by visually comparing the different dCt values; the plot carries no information about the uncertainty of the estimates of these differences (i.e. the ddCt values).
NB1: if dCt values are to be shown, I'd prefer to show all individual points rather than summary statistics like mean and SE. Only if there are really many groups or genes, or a hell of a lot of individual values, so that such a plot would become "too busy", would I use a boxplot instead.
NB2: if ddCt values are shown (which are available only as means), then I'd prefer to show the confidence interval rather than the SE as a measure of the uncertainty or precision of this estimate.
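The point that ddCt exists only as a difference of group means, and so carries one uncertainty of its own, can be sketched in Python. Several things here are assumptions made for the example: dCt is taken as Ct(reference) - Ct(target), so that positive ddCt means induction as in the answer above; the two groups are unpaired; and the interval uses a critical value supplied by the caller, with 1.96 as a large-sample default. With only a few replicates a t quantile should be passed instead.

```python
from math import sqrt
from statistics import mean, variance

def ddct_summary(dct_treated, dct_control, crit=1.96):
    """Difference of mean dCt between groups, with an approximate interval.

    Assumes dCt = Ct(reference gene) - Ct(target gene), so ddCt > 0 is induction.
    crit is the critical value for the interval (pass a t quantile for small n).
    """
    ddct = mean(dct_treated) - mean(dct_control)
    # Standard error of a difference between two independent group means.
    se = sqrt(variance(dct_treated) / len(dct_treated)
              + variance(dct_control) / len(dct_control))
    return {
        "ddct": ddct,
        "se": se,
        "ci": (ddct - crit * se, ddct + crit * se),
        "fold_change": 2 ** ddct,
    }
```

The control group contributes to the width of the interval but gets no bar of its own; it is represented by the reference line at 0.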
  • asked a question related to Quantitative Genetics
Question
5 answers
Hi,
To calculate broad sense heritability, we normally take genetic variance and divide it by total variances (for example, Genetic variance, GxE variance, error variance).
Proc Mixed (Method=Type3, or REML by default) is often used to obtain the variance estimates, with a model followed by a number of random terms.
However, if the analysis produces a negative covariance estimate, what should we do with this negative number?
Thanks
Relevant answer
Answer
By 'negative covariance' you probably meant to say 'negative variance'? A commonly used approach to deal with this is to treat it as zero. You may refit the model without the corresponding random effect in the model.
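A small Python sketch of the calculation described in the question, with negative variance estimates treated as zero as suggested here. The function name is hypothetical. The optional `n_env` and `n_rep` arguments are an addition for the common entry-mean version, in which the GxE variance is divided by the number of environments and the error variance by environments times replications; with the defaults of 1 the function reduces to the plain ratio in the question.

```python
def broad_sense_h2(var_g, var_ge, var_error, n_env=1, n_rep=1):
    """Broad-sense heritability from variance component estimates.

    Negative estimates are set to zero before the ratio is formed.
    """
    var_g, var_ge, var_error = (max(v, 0.0) for v in (var_g, var_ge, var_error))
    phenotypic = var_g + var_ge / n_env + var_error / (n_env * n_rep)
    if phenotypic == 0:
        raise ValueError("all variance components are zero")
    return var_g / phenotypic
```

Refitting the model without the offending random effect, as the answer suggests, may change the other estimates slightly, so the clamped ratio is only a quick approximation.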
  • asked a question related to Quantitative Genetics
Question
4 answers
In 2014, I planted 210 lines and 3 checks (1 repeated check and 2 random checks) in an augmented design with 1 replication at 1 location (2 locations were planted, but 1 was lost to late freeze damage). These 210 lines come from 12 populations (the family structure is complex: I have 7 wild relatives backcrossed to 2 elite parents). In 2015, 93 of the 210 lines were advanced to the next generation based on tillering ability (alpha lattice, 2 replications, 4 locations). I have calculated BLUPs and BLUEs for 2015 using META and obtained the heritability for 2015. I have done a moving mean analysis for 2014 using Agrobase Gen II. In the end, I have to calculate the genetic gain we have achieved for grain yield through indirect selection for tillering ability. Please guide me step by step.
Relevant answer
Answer
Thank you sir,
I spoke to many scientists on campus, and they think it's not possible. It would be great if you could help me. If you have time, I can call you or we can communicate by email. Please let me know your ID. My ID: breddy1121@gmail.com
Thank you in advance 
Regards,
Bharath 
  • asked a question related to Quantitative Genetics
Question
10 answers
Is it possible to calculate heritability estimates from data with unrelated sires?
Relevant answer
Answer
When your data consist of many unrelated sires and their offspring (paired data), you can calculate heritability by regressing offspring on sire (heritability = 2b).
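The regression can be sketched in plain Python. This is an illustration only: it assumes one offspring value (or one offspring mean) per sire, and the function name is made up for the example. The slope b is the ordinary least-squares slope of offspring on sire, and the estimate is 2b because a single parent shares half of its additive genetic value with its offspring.

```python
from statistics import mean

def h2_from_sire_regression(sire_values, offspring_values):
    """Heritability estimate as twice the slope of offspring on sire."""
    if len(sire_values) != len(offspring_values):
        raise ValueError("need one offspring value per sire")
    sire_mean = mean(sire_values)
    offspring_mean = mean(offspring_values)
    sxx = sum((x - sire_mean) ** 2 for x in sire_values)
    sxy = sum((x - sire_mean) * (y - offspring_mean)
              for x, y in zip(sire_values, offspring_values))
    slope = sxy / sxx  # regression coefficient b
    return 2 * slope
```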
  • asked a question related to Quantitative Genetics
Question
6 answers
The observed frequencies of genotypes (AA, Aa, aa) and genotypes (BB, Bb, bb) in the controls were (160, 181, 79) and (182, 132, 6). Both demonstrated a significant departure from HWE (both p=0.03, goodness-of-fit χ² test).
Possible explanations:
1.    Genotyping errors? (The study followed well-described genotyping methods; 5% random samples were assayed twice, concordance >98%; The ’a’ and ‘b’ allele frequencies in controls are very close to that in similar populations from previous reports)
2.    The influence of study design? (Case-control design was used in the study.  Each case-control pair was matched on gender, 5-yr age group, and study site).
3.     Chance?
4.     Any other possible explanations?
Possible impact on risk estimates?
Thank you for your attention and contribution in advance.
Relevant answer
Answer
Dear Ping,
the difference between observed and expected is not really large:
Genotype  Observed  Expected
AA        38%       36%
Aa        43%       48%
aa        19%       16%
Sum       100%      100%

BB        57%       60%
Bb        41%       35%
bb        2%        5%
Sum       100%      100%
Could this be due to chance alone, given the small sample numbers?
Kind regards
Burkhard
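The expected proportions in the table can be reproduced with a short Python sketch of the usual goodness-of-fit test for Hardy-Weinberg equilibrium. The function name is hypothetical, and the test uses 1 degree of freedom without continuity correction, which is an assumption about how the p-values in the question were obtained.

```python
from math import erfc, sqrt

def hwe_chi2(n_hom1, n_het, n_hom2):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Returns (expected genotype counts, chi-square statistic, p-value).
    """
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)   # frequency of the first allele
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("locus is monomorphic")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom1, n_het, n_hom2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))       # upper tail of chi-square with 1 df
    return expected, chi2, p_value

expected_a, chi2_a, p_a = hwe_chi2(160, 181, 79)
```

The statistic depends on the counts, not just the percentages, so the same percentage deviations would be less significant in a smaller sample.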
  • asked a question related to Quantitative Genetics
Question
4 answers
Hi science folks! I am trying to explain the distribution of mutations based on mutagenesis experiments. I calculate the mutation frequency as the number of mutants obtained out of the total number of cells grown, but to fit the distributions I have seen an old formula used for the mutation rate: -(ln(m0)/n), where "m0" is the number of experiments with no mutants found (the zero fraction) and "n" is the average bacteria/ml. However, the mutation rate should be a measure of mutations per unit time, so this does not seem entirely appropriate to me, or perhaps I do not understand the maths behind it. Has anyone used this or other parameters who can help me understand it? Thanks a lot!
Relevant answer
Answer
 Lea-Coulson Model? (Bear in mind the assumptions however)
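The formula in the question looks like the p0 method of Luria and Delbrück, which rests on the same assumptions as the Lea-Coulson model. If mutations arise independently, the number of mutational events per culture is Poisson distributed with mean m, so the fraction of cultures with no mutants is p0 = exp(-m) and m = -ln(p0). Dividing m by the number of cells per culture gives a rate per cell per division rather than per unit time. The Python sketch below assumes this reading of "m0" (a fraction of cultures) and "n" (final cells per culture); the function name is made up for the example.

```python
from math import log

def p0_mutation_rate(cultures_without_mutants, total_cultures, cells_per_culture):
    """Mutation rate per cell per division by the p0 method.

    p0 = fraction of parallel cultures with no mutants
    m  = -ln(p0), expected number of mutational events per culture
    """
    p0 = cultures_without_mutants / total_cultures
    if not 0 < p0 < 1:
        raise ValueError("p0 must lie strictly between 0 and 1")
    m = -log(p0)
    return m / cells_per_culture
```

For example, 11 mutant-free cultures out of 30 with 2e8 cells each gives m of about 1.0 and a rate of about 5e-9 per cell per division.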
  • asked a question related to Quantitative Genetics
Question
8 answers
We are having trouble determining the qPCR threshold line on an ABI 7500. We have always used the automatic settings to determine the threshold line, but recently the software has been setting the threshold below the reaction background. Do you have any idea what is causing this problem and how we can fix it? Attached is an amplification plot showing the wrong threshold. Thank you.
Relevant answer
Answer
If you still need the data from such a run, 0.1 manual dRn threshold could be used to apprehend reasonable Cq values here.  If your FAM is being counter-balanced by TAMRA on the 3' end - you may have higher initial unquenched signal (TAMRA doesn't totally FRET-quench FAM).  MGB-NFQ on the 3' end is much quieter. BHQ is often better as well.  If this is not a calibration issue, it seems like you should not be afraid to use manual threshold settings (memorize them for each different target from same-tissue-same-isolation-method-same-master-mix protocols/work-flows). In your image, the machine is being fooled by the orange line intersecting the thresh at ~7 cycles.  The thresh line needs to be gently nudged out of its confusion by the user here ... to about 0.1 dRn.
  • asked a question related to Quantitative Genetics
Question
6 answers
Hi everybody.
I am planning to screen a group of postmenopausal women for the presence of any genetic variation in genes involved in antioxidant defence
Relevant answer
Answer
Let $\hat{p}$ be the expected population proportion of the class of interest (here $\hat{p} = 0.52$), $Z_{\alpha/2}$ the standard normal critical value for the chosen confidence level, and $e$ the maximum error allowed, say 0.03. The sample size for estimating a population proportion is given by:
$n = Z_{\alpha/2}^2 \, \hat{p}(1 - \hat{p}) / e^2$
If $Z_{\alpha/2} = 1.96$ (95% confidence), $\hat{p} = 0.52$ and $e = 0.03$, then the sample size is:
$n = (1.96)^2 (0.52)(0.48) / (0.03)^2$
$n = 3.84 \times 0.2496 / 0.0009$
$n = 0.9585 / 0.0009$
$n \approx 1065$
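The same calculation as a small Python function, in case you want to try other proportions or error margins. The function name is hypothetical. Carried out without intermediate rounding, the formula gives about 1065.4, so rounding up to the next whole participant yields 1066 rather than 1065.

```python
from math import ceil

def sample_size_for_proportion(p_hat, max_error, z=1.96):
    """Sample size needed to estimate a proportion to within max_error.

    z is the standard normal critical value (1.96 for 95% confidence).
    """
    n = z ** 2 * p_hat * (1 - p_hat) / max_error ** 2
    return ceil(n)  # round up so the error bound is still met

n_required = sample_size_for_proportion(0.52, 0.03)
```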
  • asked a question related to Quantitative Genetics
Question
5 answers
I have been dealing with some data from PLINK. Actually, from those stored in the GWAS catalogue, which are GWAS performed by many groups. I want to understand the meaning of the p-values and the possible correlation with odds ratios.
I understand that a p-value refers to the statistical significance of the association with a particular SNP, and that an odds ratio refers to the increased risk associated with carrying that particular SNP among individuals of a given population. PLINK presents beta values instead of odds ratios, which are basically the log of the odds ratios. But is it possible to predict or calculate a potential odds ratio from the p-value for a given SNP?
Thank you!
Relevant answer
Answer
An odds ratio is an estimated parameter, like a mean or the slope of a regression line. A p-value is not a parameter but the probability of obtaining results at least as extreme as yours under the null hypothesis. If the p-value is very small, you can feel confident rejecting the null hypothesis. In the case of an odds ratio, a small p-value means that your data would be very unlikely if the odds ratio were 1 (no difference between the two odds). Mathematically equivalent is the confidence interval, which is really more informative than the p-value as it gives you an upper and lower limit on the odds ratio estimate. If the odds ratio CI covers 1, then the data are consistent with the odds being the same.
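To make the relation concrete, here is a rough Python sketch. It assumes the reported beta is the natural-log odds ratio from a logistic model with a Wald test, and the numbers are invented for illustration. A p-value alone does not determine the odds ratio, but beta together with either its standard error or its p-value does.

```python
import math
from statistics import NormalDist

def summarize_association(beta, se, confidence=0.95):
    """Convert a log-odds estimate and its standard error to OR, CI and p."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    odds_ratio = math.exp(beta)
    ci = (math.exp(beta - z_crit * se), math.exp(beta + z_crit * se))
    # Two-sided Wald test of beta = 0, i.e. odds ratio = 1
    p_value = 2 * (1 - NormalDist().cdf(abs(beta / se)))
    return odds_ratio, ci, p_value

def se_from_p(beta, p_value):
    """Recover the standard error when only beta and a two-sided p are reported."""
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return abs(beta) / z

# Hypothetical SNP: beta = 0.2 with standard error 0.05
odds_ratio, ci, p_value = summarize_association(0.2, 0.05)
print(odds_ratio, ci, p_value)
print(se_from_p(0.2, p_value))
```

The same p-value can come from a large effect estimated imprecisely or a small effect estimated precisely, which is why the effect size has to be supplied separately.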
  • asked a question related to Quantitative Genetics
Question
4 answers
I calculate GLMs where the response variable is the presence/absence of each allele. There are 6 alleles in total, the organism is diploid, and I suppose that using a Bonferroni correction with adjusted alpha = 0.05/6 may be too conservative, as there are two alleles per individual. Unfortunately, I cannot assign the alleles to loci. Has any of you dealt with a similar problem?
Relevant answer
Answer
Thanks for the link, the paper is very useful.
  • asked a question related to Quantitative Genetics
Question
10 answers
I have recently come across a clinical study that reported gene expression in the following way: "RNA results were then reported as 40-DeltaCt values, which would correlate proportionally to the mRNA expression level of the target gene." (Where delta Ct was the difference between the Ct values of the gene of interest and a reference gene. In this case 40 cycles were used for amplification.) In what type of experiments is it useful to apply this (40 - delta Ct) calculation? How does this relate to the more frequently applied 2^(deltaCt) method?
Relevant answer
Answer
The number 40 has no meaning. You can substitute any other arbitrary value for it without changing the interpretation.
The interpretation is not wrong, but the approach is unfortunate and, in my opinion, rather silly.
I think that, as Edward already pointed out, the dct is calculated as ct[target gene] - ct[reference gene]. This is in fact a counter-intuitive way to calculate a log ratio, since this way the expression signal of the reference gene is actually normalized to the expression signal of the target gene. Students are blamed for less stupid things.
When the dct is calculated this (unfortunate or silly) way, higher dct values indicate a lower normalized target gene expression. Some authors try to put this right by presenting -dct as a measure of log normalized expression. This is fine, but would be unnecessary if the expression of the target gene had been normalized to the expression of the reference gene right from the start (i.e. dct = ct[ref] - ct[target]).
It makes no difference for the interpretation whether I give just -dct, 40-dct, or x-dct with x being any arbitrary value. dct values are relative measures. They have no meaning on their own, but they can be compared between groups (e.g. treated vs. control). The comparison is typically (and sensibly) done by calculating the difference of dct values, called the ddct value (which is a log fold-change, or log ratio, of the normalized expressions in the groups). You see that whatever value x is, it will cancel out anyway.
In my opinion, calculating dct as ct[target] - ct[reference] is unnatural, counter-intuitive, and misleading. Although it is not wrong (as long as it is not interpreted wrongly), it is clearly bad scientific practice and should be avoided.
The reasons for calculating it this way are:
1) people do not understand where this formula comes from and
2) people just repeat the unfortunate mistake of Livak et al., who first reported the use of dct and ddct values for quantification (instead of recognizing the mistake and improving on it).
There is no harm in stating in the methods section how the dct is calculated. When it is written that dct = ct[ref] - ct[target] and that ddct = dct[treated] - dct[control], everything is clear and no one needs to be confused about signs and directions.
People doing real-time PCR should know and understand that
Ft = p * N0 * E^ct
where:
Ft : Fluorescence at the threshold
p:  (unknown) proportionality factor (depending on the assay)
N0: (unknown) starting amount of the amplicon sequence
E: amplification efficiency (may be assumed to be 2.0)*
ct: the (measured) ct-value
* for the ddct-method it is only required that E is identical for all assays.
This is the fundamental equation for real-time PCR. I am very sad that only very few people working with real-time PCR know this and are able to understand this.
What does a ct-value mean then?
Take the logarithm (to the base E): log(Ft) = log(p) + log(N0) + ct
Solve for ct: ct = log(Ft) - log(p) - log(N0)
You can see that the ct-value decreases linearly with log(N0). The lower the ct, the larger N0 is.
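A small numerical illustration of this relation in Python, as a sketch only: the values of Ft and p below are made up (they are unknown in a real assay), and E = 2.0 is assumed.

```python
import math

def ct_value(n0, efficiency=2.0, ft=1.0, p=1e-12):
    """Threshold cycle from Ft = p * N0 * E^ct, solved for ct.

    ft and p are hypothetical placeholders; only differences between
    ct values are meaningful, and those do not depend on them.
    """
    return math.log(ft / (p * n0), efficiency)

# A 10-fold lower starting amount raises ct by log_E(10) cycles
shift = ct_value(100) - ct_value(1000)
print(shift, math.log2(10))
```

Whatever values are chosen for Ft and p, the shift per 10-fold dilution stays the same (about 3.32 cycles for E = 2), which is the sense in which ct is only a relative measure.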
The dct for two genes/assays A and B is
dct = ct[A] - ct[B]
dct =( log(Ft[A]) - log(p[A]) - log(N0[A]) ) - ( log(Ft[B]) - log(p[B]) - log(N0[B]) )
dct =( log(Ft[A]) - log(Ft[B]) - log(p[A]) + log(p[B]) ) + ( log(N0[B]) - log(N0[A]) )
dct = Z + log(N0[B] / N0[A])
So the dct is a log ratio of B/A. This means that the value of B is normalized to the value of A. So A serves as the "reference".
This log ratio is shifted by a constant Z. The value of Z is unknown, but it is constant for the combination of assays and it will cancel out in the calculation of ddct values. This unknown part of the dct value makes it a "relative" measure. We do not know relative to what it is. So a dct just remains a number. But different dct values from the same combination of assays are comparable, because they are relative to the same (still unknown) thing.
This means: knowing that a dct value is, say, -6.3 does not tell me anything. And I do not get any more information from 40-6.3, or from -5-6.3. But if I have two dct values, one for treated and another for controls, then I can well interpret the difference between these dct values.
ddct = dct[trt] - dct[ctl]
ddct = ( Z + log(N0[B,trt] / N0[A,trt]) ) - ( Z + log(N0[B,ctl] / N0[A,ctl]) ) 
ddct = log(N0[B,trt] / N0[A,trt]) - log(N0[B,ctl] / N0[A,ctl])
Z cancelled out!
ddct = log( (N0[B,trt] / N0[A,trt]) / (N0[B,ctl] / N0[A,ctl]) )
Here you see that ddct is the log ratio (treated vs. control) of the amounts of B, each normalized to the amounts of A. It makes perfect sense when A is a reference gene and B is a target gene. Therefore I say that the dct and ddct should be calculated as
dct = ct[reference gene] - ct[target gene]
ddct = dct[treated] - dct[control]
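As a quick check of the argument, here is a minimal Python sketch with invented ct values. The function names and the `offset` argument are mine, added only to show that a constant such as 40 cancels; E is assumed to be 2.0 and identical for both assays.

```python
def delta_ct(ct_reference, ct_target):
    """dct = ct[reference gene] - ct[target gene]."""
    return ct_reference - ct_target

def ddct_and_fold_change(ct_ref_trt, ct_tgt_trt, ct_ref_ctl, ct_tgt_ctl,
                         efficiency=2.0, offset=0.0):
    """Return ddct and the fold change E^ddct (treated vs. control)."""
    # offset mimics reporting "40 - dct" style values; it cancels in ddct
    dct_trt = offset + delta_ct(ct_ref_trt, ct_tgt_trt)
    dct_ctl = offset + delta_ct(ct_ref_ctl, ct_tgt_ctl)
    ddct = dct_trt - dct_ctl
    return ddct, efficiency ** ddct

# Hypothetical ct values: reference and target gene, treated and control
print(ddct_and_fold_change(18.0, 22.0, 18.5, 25.5))
print(ddct_and_fold_change(18.0, 22.0, 18.5, 25.5, offset=40.0))
```

Both calls give ddct = 3.0 and a fold change of 8.0, so the target is 8-fold higher in the treated group in this made-up example, with or without the constant.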
  • asked a question related to Quantitative Genetics
Question
7 answers
The qPCR standard curve for B. bifidum DNA, from 20ng down to 0.00002ng, is giving strange Ct values, even though I repeated the qPCR three times in case I had mixed up the serial dilution tubes. The Ct value for the most concentrated DNA is higher than that of the less concentrated ones! What could be the reason? Note that the efficiency of the reaction is 90.2%. Below are the average Ct values I got.
20ng---avg Ct=20.37
2ng----avg Ct=16.06
0.2ng----avg Ct=19.02
0.002ng----avg Ct=23.45
0.0002ng---avg Ct=31.29
0.0000ng----avg Ct=35.56
Relevant answer
Answer
Hi,
Very probably you have some inhibitors in your sample. When you dilute the template, these substances no longer interfere with your Taq polymerase. But if you dilute your template too much, there is not enough template left to start the real-time PCR.
How do you extract your template?
I hope this is clear. Best. Olga
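One way to look for this pattern in the numbers is sketched below in Python. The function names are my own. It uses the usual standard-curve relation, efficiency = 10^(-1/slope) - 1, and the check for inhibition is only a crude heuristic. The last row of the question is left out because its amount is truncated there.

```python
import math

def fit_standard_curve(amounts_ng, ct_values):
    """Least-squares fit of Ct against log10(amount)."""
    xs = [math.log10(a) for a in amounts_ng]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ct_values) / len(ct_values)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    efficiency = 10 ** (-1 / slope) - 1  # 1.0 corresponds to 100 %
    return slope, intercept, efficiency

def flag_possible_inhibition(amounts_ng, ct_values):
    """Amounts whose Ct is not lower than that of the next, more dilute point."""
    pairs = sorted(zip(amounts_ng, ct_values), reverse=True)
    return [conc[0] for conc, dil in zip(pairs, pairs[1:]) if conc[1] >= dil[1]]

# Average Ct values as posted in the question (truncated last row omitted)
amounts = [20, 2, 0.2, 0.002, 0.0002]
cts = [20.37, 16.06, 19.02, 23.45, 31.29]

suspect = flag_possible_inhibition(amounts, cts)
print(suspect)

# Refit without the flagged points to see what the remaining curve looks like
kept = [(a, c) for a, c in zip(amounts, cts) if a not in suspect]
print(fit_standard_curve([a for a, _ in kept], [c for _, c in kept]))
```

Here only the 20ng point is flagged, which fits the idea that inhibitors matter most in the least diluted sample. Excluding a point this way is a diagnostic step, not a substitute for cleaning up the extraction.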
  • asked a question related to Quantitative Genetics
Question
6 answers
I have measured the sugar content of some potato landraces. These data are not normally distributed, I think mainly because I have a few individuals with extreme phenotypes. I have read in association mapping articles that many authors use normalized phenotypic data to perform association analysis. The rationale for using transformed data in these analyses is not clear to me. I would be very grateful to hear your opinion on this matter.
Relevant answer
Answer
Extreme values may inflate the variance and covariance in one direction, which may lead to a wrong interpretation of the results. If only a few genotypes are extreme, it would be good to remove them and then proceed with the analysis.