Discussion
Started 2 August 2019

How to calculate expected heterozygosity of a haploid organism?

How do I calculate expected heterozygosity for a haploid organism, and what will the observed heterozygote frequency be if we collect data for multiple markers?

Most recent answer

Arafat Rahman
Oregon State University
You can do it in R using the hierfstat package. Check out this tutorial: https://arftrhmn.net/fst-for-haploid-data-in-r/

All replies (6)

Sonti Roy
National Institute of Animal Biotechnology
@Muriel Gros-Balthazard
You can calculate expected heterozygosity for a haploid organism in the GenAlEx program.
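
For anyone who wants to check the numbers by hand: in a haploid, each individual carries a single allele per locus, so observed heterozygosity is zero by definition, and "expected heterozygosity" is read as gene diversity, the probability that two randomly sampled individuals carry different alleles. A minimal Python sketch of that calculation follows, assuming one allele call per individual per locus and `None` for missing data. The function names are hypothetical, not part of GenAlEx or hierfstat. The biased form is h = 1 - sum(p_i^2), and the small-sample correction multiplies by n/(n-1).

```python
from collections import Counter


def haploid_gene_diversity(alleles, unbiased=True):
    """Gene diversity (expected heterozygosity) at one locus from haploid calls.

    alleles: one allele call per individual; None marks missing data.
    Hypothetical helper for illustration only.
    """
    calls = [a for a in alleles if a is not None]
    n = len(calls)
    if n < 2:
        return float("nan")  # undefined with fewer than two sampled individuals
    freqs = [count / n for count in Counter(calls).values()]
    h = 1.0 - sum(p * p for p in freqs)  # biased estimate: 1 - sum(p_i^2)
    return h * n / (n - 1) if unbiased else h  # n/(n-1) small-sample correction


def mean_gene_diversity(loci, unbiased=True):
    """Average the per-locus values across markers, skipping undefined loci."""
    values = [haploid_gene_diversity(locus, unbiased) for locus in loci]
    values = [v for v in values if v == v]  # NaN is the only value not equal to itself
    return sum(values) / len(values) if values else float("nan")


# Two markers scored in four haploid individuals (made-up data)
loci = [
    ["A", "A", "B", "B"],  # two alleles at equal frequency
    ["A", "A", "A", "A"],  # monomorphic marker
]
per_locus = [haploid_gene_diversity(locus) for locus in loci]
overall = mean_gene_diversity(loci)
```

With multiple markers, the per-locus values are simply averaged, and monomorphic markers contribute zero to the mean.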
Dušan Sadiković
Swedish University of Agricultural Sciences
Sonti Roy, do you have the rest of this document?

