Discussion
Started 2nd Aug, 2019

How to calculate the expected heterozygosity of a haploid organism?

How do I calculate the expected heterozygosity of a haploid organism, and what will the observed heterozygote frequency be if we collect data for multiple markers?

All replies (6)

19th Aug, 2019
Sonti Roy
National Institute of Animal Biotechnology
@Muriel Gros-Balthazard
In the GenAlEx program, you can calculate expected heterozygosity for haploid organisms.
18th Jun, 2020
Sonti Roy
National Institute of Animal Biotechnology
17th Aug, 2020
Dušan Sadiković
Swedish University of Agricultural Sciences
Sonti Roy, do you have the rest of this document?
18th Aug, 2020
Sonti Roy
National Institute of Animal Biotechnology
8th May, 2022
Arafat Rahman
University of California, Riverside
You can do it in R using the hierfstat package. Check out this tutorial: https://arftrhmn.net/fst-for-haploid-data-in-r/
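
For readers who want to see what these tools compute: in a haploid organism each individual carries a single allele per locus, so observed heterozygosity is zero by definition. The "expected heterozygosity" reported for haploid data is normally Nei's gene diversity, h = 1 - sum(p_i^2), often multiplied by the small-sample correction n/(n-1). The sketch below is plain Python, not the hierfstat or GenAlEx implementation; the function names and the toy allele calls are made up for illustration.

```python
from collections import Counter


def gene_diversity(alleles, unbiased=True):
    """Nei's gene diversity at one locus from haploid allele calls.

    `alleles` holds one call per individual; None marks missing data.
    """
    calls = [a for a in alleles if a is not None]
    n = len(calls)
    if n < 2:
        raise ValueError("need at least two non-missing allele calls")
    counts = Counter(calls)
    # h = 1 - sum of squared allele frequencies
    h = 1.0 - sum((c / n) ** 2 for c in counts.values())
    # Optional small-sample correction n / (n - 1)
    return h * n / (n - 1) if unbiased else h


def mean_gene_diversity(loci, unbiased=True):
    """Average the per-locus gene diversity over several markers."""
    return sum(gene_diversity(locus, unbiased) for locus in loci) / len(loci)


# Hypothetical data: four haploid individuals typed at two markers.
locus_a = ["A", "A", "B", "B"]
locus_b = ["1", "1", "1", "2"]

print(gene_diversity(locus_a, unbiased=False))   # uncorrected h for marker a
print(gene_diversity(locus_b, unbiased=False))   # uncorrected h for marker b
print(mean_gene_diversity([locus_a, locus_b]))   # corrected mean over markers
```

With multiple markers, the per-locus values are usually averaged as in `mean_gene_diversity`; whether to apply the n/(n-1) correction depends on what you want to compare against.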

Similar questions and discussions

Adegenet function "df2genind" problem: dataframe read as one individual?
Question
2 answers
  • Matthew Penney
Hello,
I'm trying to create a genind object of an SNP dataset from Whelk to examine allelic richness between sites for some COI_16S sequences (combined sequences from the same individual).
My dataframe is in long format, with individual SNPs coded as nucleotides (A, T, C, or G) and uninformative sites coded as "NA". The initial dataframe had two columns identifying specimens and populations; these were extracted as a character vector "ind" and a factor "popchar", respectively, and removed from the dataframe before converting. An additional character vector of locus names, "locichar", was also created. The final dataframe contained the SNPs alone and was called "Whelk_SNP_file_dataonly".
Here is the code for the function df2genind():
df2genind(Whelk_SNP_file_dataonly, sep=" ", ploidy=1, ncode=1, ind.names=ind, loc.names=locichar, NA.char="NA", pop=popchar)
Each time I do this and look at the object using head(), I get this:
// 1 individual; 33 loci; 66 alleles; size: 15 Kb
// Basic content
@tab: 1 x 66 matrix of allele counts
@loc.n.all: number of alleles per locus (range: 2-2)
@loc.fac: locus factor for the 66 columns of @tab
@all.names: list of allele names for each locus
@ploidy: ploidy of each individual (range: 1-1)
@type: codom
@call: .local(x = x, i = i, j = j, drop = drop)
// Optional content
@pop: population of each individual (group size range: 1-1)
The file includes data from 242 individuals, not one. I cannot figure out why the function is reading my data as one individual. Does anyone have an idea of why adegenet is doing this and how I might fix it?

How do you create a SNP alignment (FASTA) from a combined VCF without gaps where there is no VCF info?
Question
5 answers
  • Ana Valero Rello
Hi,
I'm creating a fasta alignment of concatenated SNPs from a combined vcf file, but I'm having some trouble with the gaps.
I'm using the useful script vcf_tab_to_fasta_alignment.pl (Christina Bergey, 2012, url = "http://code.google.com/p/vcf-tab-to-fasta"). From an input.vcf.gz I obtain all SNPs aligned to the reference in FASTA format (see below). However, the SNP coordinates differ between isolates, so loci with no SNP information are filled with gaps (-).
I get:
>REF ATCCTTGCA
>ID1 -CT-A-CT-
>ID2 CG---CAA-
Is it possible to fill in those gaps using the nucleotide in the reference genome? I imagine that it can be achieved by using a simple script but I'm not a programmer.
I would like:
>REF ATCCTTGCA
>ID1 ACTCATCTA
>ID2 CGCCTCAAA
Any help would be very much appreciated! :D
Many thanks,
A
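
One way the gap filling asked about above could be scripted is sketched here in Python. It assumes the alignment is ordinary FASTA (header line, then sequence lines), that every record has the same length as the reference, and that the reference record is named REF as in the example; the function names are illustrative. It also assumes that a gap really means "same as the reference", which is not true where a site was simply not covered or not called, so treat the filled bases with care.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of name -> sequence (insertion ordered)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def fill_gaps(reference, sequence, gap="-"):
    """Replace each gap in `sequence` with the reference base at that position."""
    if len(reference) != len(sequence):
        raise ValueError("sequence and reference must have the same length")
    return "".join(r if s == gap else s for r, s in zip(reference, sequence))


def fill_alignment(records, ref_name="REF"):
    """Fill gaps in every record using the record named `ref_name`."""
    ref = records[ref_name]
    return {name: fill_gaps(ref, seq) for name, seq in records.items()}


# The example alignment from the question, written as regular FASTA.
alignment = ">REF\nATCCTTGCA\n>ID1\n-CT-A-CT-\n>ID2\nCG---CAA-\n"

filled = fill_alignment(parse_fasta(alignment))
for name, seq in filled.items():
    print(f">{name}\n{seq}")
```

For the example in the question this yields ACTCATCTA for ID1 and CGCCTCAAA for ID2, matching the desired output.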

Help with the adegenet R package?
Question
3 answers
  • Adelina Larsen
Hello everyone,
I have a matrix of 170 genotypes and 3745 loci. I have added population origin to the matrix as extra information for use with the adegenet package.
So, the first column holds the genotypes (one per row), the second column is "pop", the third column is locus1, the fourth column is locus2, and so on. I have read the adegenet tutorials.
I saved the data as a CSV file and loaded it into R without problems.
The problem comes after I convert the data frame into a genind object. (I have 10 geographical origins in my data. The data frame is called adeg1):
h <- df2genind(adeg1, ploidy=2, pop=1:10, sep="/")
adegenet converts the data frame into a genind object called "h"
When I see my genind object:
h
/// GENIND OBJECT /////////
// 170 individuals; 3,747 loci; 7,670 alleles; size: 6.8 Mb
// Basic content
@tab: 170 x 7670 matrix of allele counts
@loc.n.all: number of alleles per locus (range: 2-170)
@loc.fac: locus factor for the 7670 columns of @tab
@all.names: list of allele names for each locus
@ploidy: ploidy of each individual (range: 2-2)
@type: codom
@call: df2genind(X = adeg1, sep = "/", pop = 1:10, ploidy = 2)
// Optional content
@pop: population of each individual (group size range: 1-1)
The number of loci is not 3745; it seems to be counting the other two columns (genotype and pop) as well.
And, most important, when I try to use the function:
grp <- find.clusters(h, max.n.clust=40)
R returns this:
pop is given but has invalid length
Error in validObject(x) : invalid class “genind” object: FALSE
I guess there is a problem with "pop", because when I used the original matrix without pop, the find.clusters call worked.
Can anyone help me?
Thanks in advance,
Adelina.
