SNP Genotyping - Science method
Questions related to SNP Genotyping
Hi everyone,
I am working on case-control genetic data. I have data for a SNP with genotypes AA, AG and GG. I want to check the effect of these individual genotypes on the disease outcome, which in my case is diabetes. Now I want to calculate unadjusted and adjusted (for age and gender) odds ratios in SPSS. I calculated the unadjusted odds ratio using multinomial regression (is this suitable for my data?), but I do not know how to calculate the odds ratio after adjusting for age and gender.
Secondly, how can I apply dominant, recessive and co-dominant models to check which fits best?
I would be highly obliged if you could provide the sequence of menu commands, like
Spss--> analyze--> regression--> .........
Thanks,
Misbah
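A minimal sketch of how the three genotypes could be recoded before fitting a binary logistic regression (disease yes/no) with age and gender as covariates. Treating G as the risk allele is an assumption made only for illustration, and `encode_genotype` is a hypothetical helper, not an SPSS function. The co-dominant model is not shown as a single number because it keeps the genotype as a three-level categorical variable (two dummy variables against a reference genotype).

```python
# Hypothetical helper: encode one genotype under three common genetic models.
# `risk` is the allele assumed (for illustration only) to carry the effect.

def encode_genotype(genotype: str, risk: str) -> dict:
    n = genotype.count(risk)  # number of risk alleles: 0, 1 or 2
    return {
        "additive": n,             # 0 / 1 / 2 copies of the risk allele
        "dominant": int(n >= 1),   # carriers vs non-carriers
        "recessive": int(n == 2),  # risk homozygotes vs everyone else
    }

for g in ("AA", "AG", "GG"):
    print(g, encode_genotype(g, risk="G"))
```

Each coded column can then be entered as the predictor in a separate model, and the fits compared (for example by AIC or likelihood).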
Two genes are PNPLA3 and TM6SF2
Hi. We have some trio pedigrees and performed whole exome sequencing. We want to confirm the relationship between the parents and the child. Are there any recommended programs for haplotype phasing and paternity testing using WES data? Thanks!
Hi Everyone,
I am working with SNP data and want to do a logistic regression analysis.
In multinomial logistic regression, is it compulsory to choose the most common genotype as the reference, or can I choose any genotype as the reference?
For one of my SNPs (genotypes: II, ID, DD), when I choose the most common genotype, II, as the reference, the odds ratio comes out as 0.57, but when I choose the ID genotype the odds ratio changes to 1.67 with p < 0.05. Is it fine to choose the heterozygous genotype as the reference?
Thanks,
I need the CEL1 enzyme for my laboratory work to complete my PhD thesis. Unfortunately, I have not found it anywhere. I would appreciate the names of any companies where it is available, or of anyone who could sell it to me.
thank you.
Hi,
Other than randomForest, how do you go about running a GWAS on SNP genotyping data with categorical phenotypes (say, host species for a pathogen)?
Any pointers would be great!
-Marcin
Hi!
I'm currently looking into setting up a protocol for in situ mutation detection by rolling circle amplification (RCA) in our lab.
I've read lots of publications that have successfully done this in either DNA or transcripts. My doubt arises when designing the padlock probes around my SNP of interest.
Most articles don't specify how this is done, so I assume this is done manually based on the target sequence?
I have downloaded the ProbeMaker java application, which was developed for the design of padlock probes. However, the software hasn't been updated in about a decade, has limited tutorials and I haven't found it very intuitive. Since I'll only be working with a few probes to start with I guess manual design may be a better option?
I'm aware of previous work (Larsson et al, 2004. Nat Methods) that has shown the optimal length for binding regions and the design of cDNA primers (if working from mRNA), but I would appreciate it if anyone has any technical recommendations for padlock probe design.
In your experience, has manual design worked out? What are some things to look out for in the design? (I assume standard observations in primer design, like similar Tm, avoiding crossmatching, etc).
Thanks!
Dear community,
I am planning to conduct a GWAS analysis with two groups of patients differing in a binary characteristic. As this cohort is naturally very rare, our sample size is limited to a total of approximately 1500 participants (a low number for GWAS). Therefore, we were thinking of studying associations with pre-selected genes that might be phenotypically relevant to our outcome. As no prior data/arrays exist that studied similar outcomes in a different patient cohort, we need to identify regions of interest bioinformatically.
1) Do you know any tools that might help me harvest genetic information for known pathways involved in relevant cell-functions and allow me to downscale my number of SNPs whilst still preserving the exploratory character of the study design? e.g. overall thrombocyte function, endothelial cell function, immune function etc.
2) Alternatively: are there bioinformatic ways (AI etc.) that circumvent the problem of multiple testing in GWAS studies and would allow me to robustly explore my dataset for associations even at lower sample sizes (n < 1500 participants)?
Thank you very much in advance!
Kind regards,
Michael Eigenschink
What is the meaning of the string (*) in CYP3A4*22, knowing that this SNP has the ID rs35599367?
Does anyone have recommendations (design and/or protocols) for carrying out SNP genotyping with HRM on a CFX96 without the precision melt analysis package? Is it possible?
My understanding is the sensitivity is gated by the machine (and possibly the calibration kit), so I'm not sure why the software is even needed when open source analysis packages are available.
Thanks in advance!
I have genotyped my lines using the Infinium 6k SNP chip and have received the variants in HapMap format. I want to convert this into VCF format so that I can annotate the variants using online tools.
We wish to explore pairwise relatedness in haplodiploids in a way that allows asymmetrical coefficients (eg. relatedness of mothers to sons is not the same as sons to mothers, ditto for brother-sister relationships). Queller & Goodnight's Relatedness program will do this but won't accept large SNP genotype datasets. Any recommendations for software that will do this would be fantastic.
How would one associate a single phenotype factor (binary presence / absence of trait) to SNPs in Tassel (v5.2.58) GUI?
I generated VCF file using GATK and am able to import the VCF and phenotype data into Tassel. The phenotype data is just two columns the 'Taxa' and the 'Factor' (as Y and N; where Y = has the phenotype and N = does not).
The desired end result is a Manhattan plot with any SNPs associated with the trait, and ideally, an output file which contains the SNP locations.
I need to find restriction enzymes and design primers for SNP genotyping, but I am unable to access the SNP Cutter and SNP-RFLPing sites. Please suggest a solution.
I'm doing SNP genotyping in a chickpea association panel and have fetched 1 kb of upstream and downstream sequence flanking the SNPs. I'm trying to find restriction sites at those particular positions, i.e. restriction sites that overlap the SNPs. Please suggest tools other than NEBcutter.
Hello people,
I did the SNP genotyping of two different plant species through the TASSEL5-GBS pipeline. In one of the species I am getting a significantly high number of missing genotype calls (N) (attached Figure 1), but in the other species I am not (Figure 2).
What could be the reason for the high number of missing genotype calls (N), and how should I filter them before performing downstream analyses like GWAS, LD and so on?
Any suggestions?


I genotyped 92 patients and then calculated the allele frequencies. I get chi2 = 29, which is high, so I have to reject the null hypothesis and conclude that my population is out of Hardy-Weinberg equilibrium. Is this normal? If not, what can I do?
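For reference, a minimal sketch of the Hardy-Weinberg chi-square test computed from genotype counts, which is the usual input (rather than allele frequencies alone). The counts in the example call are made up for illustration and are not the poster's data; the p-value uses the 1-degree-of-freedom chi-square survival function written with `math.erfc`.

```python
import math

def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test for Hardy-Weinberg equilibrium at a biallelic locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Assumes both alleles are present, so no expected count is zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function, 1 df
    return chi2, p_value

# Illustrative counts only (hypothetical data).
print(hwe_chi_square(30, 40, 30))
```

With small samples or rare genotypes an exact test is often preferred over the chi-square approximation.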
Hi,
We obtained our SNP genotyping data from an external lab. We found that there are letters "D" and "I" in some positions. Do you know what these mean? They also appear in the reference fields.
example:
1 13901895 chr1_13901894 D I
1 13903334 chr1_13903333 I D
1 13903422 chr1_13903421 I .
Thank you very much for your help.
"The development and validation of a medium density SNP genotyping assay in Shrimp" is a research proposal I'm currently working on. Given the restricted budget allotted (9,600 USD) to the project, I'd like to know ahead of time how much it might probably cost me.
I have performed high-resolution melt analysis using KAPA HRM FAST mastermix; PCR conditions were the same for both experiments - Image 1 and Image 2. However, there is a lot of variability in the two consecutive experiments. I haven't changed any parameter between the two experiments. (Image 2 contains additional samples).
Can anyone point out the possible reasons for the variation in results?
Thank you.


Hello!
I recently received many *.CEL files from a recent UK Biobank genotyping. According to the SNPolisher guide, I have to compute certain metrics on these SNPs to keep just the ones fulfilling essential criteria. It is not clear (at least to me, as a first-time user) how these CEL files are used in the SNPolisher inputs. The first input is: Ps_Metrics(posteriorFile, callFile, output.metricsFile, pidFile).
Is there a previous step where these CEL files are converted into the posterior or call files? Both should be in *.txt format, but all I have are *.CEL files.
Hope for some guidance from any expert!
Kind regards,
B
I have cDNA samples extracted from 90+ individuals which I intend to utilise in a series of qPCR experiments to assess the influence of a specific SNP genotype/s on target gene/s expression. I was unaware at the time that no-RT controls were needed, and it is only now as I prepare to perform qPCR that I have realised that this is recommended.
I am able to synthesise these controls; however, I was hoping someone could clarify:
(1) if a no-RT control is recommended for each cDNA sample prepared?, and if so
(2) whether a no-RT control is recommended for every sample in the qPCR run (i.e., three technical replicates + no-RT control)? I'm assuming no-RT controls would not be required in replicates?
Other QC parameters have been applied: All RNA samples were purified prior to cDNA synthesis, and NTCs were included in every batch. Primers have been designed across two exons, etc.
Many thanks in advance!
Why is my genotyping result for a SNP T>C when NCBI reports A>G? How should I describe this issue in the article?
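One common explanation, assumed here because the assay details are not given, is that the assay reads the opposite DNA strand from the one used in the database record. A small sketch showing that T>C on one strand corresponds to A>G on the complementary strand (`flip_strand` is a hypothetical helper name):

```python
# Base-pairing complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def flip_strand(alleles: str) -> str:
    """Return the alleles as read on the opposite strand; non-base characters are kept."""
    return "".join(COMPLEMENT.get(ch, ch) for ch in alleles)

print(flip_strand("T>C"))  # A>G
```

If this is the cause, reporting the rs ID together with the strand used is usually enough to make the description unambiguous.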
I have data for a SNP with genotypes TT, TC and CC. I want to check the effect of these individual genotypes on the disease outcome, which is iron deficiency. How can I apply dominant, recessive and additive binary logistic regression models in SPSS to check which fits best?
Hello. I'm a beginner in bioinformatics.
I'm handling Illumina SNP microarray data and got genotyping results of several SNPs. But in some SNP sites, for example, in the case of the SNP site of one individual is "BICF2P1141058", the genotyping result is [A/T] and the other SNP site "BICF2G630601486", the genotyping result is [T/A].
I really do not know the difference between [A/T] and [T/A]. For getting more information, I searched the Illumina SNP genotyping technical note (https://www.illumina.com/documents/products/technotes/technote_topbot.pdf) and found some information that to provide accurate SNP strand and orientation, Illumina offers strand information based on SNP genotype and nucleotide sequences surrounding the SNP.
However, I still do not clearly understand why the strand information is needed. The SNP genotype result is already determined by a specific nucleotide pair. What additional information does the strand information provide? In the case of my example [A/T] and [T/A], aren't the SNP sites just composed of A and T? Could you please tell me how considering strand information helps in the interpretation of SNP array data?
Thank you!
I have a database of SNP genotypes generated on an Affymetrix platform that I want to analyze against a reference genome. However, I find obvious differences between Btau8 / UMD3.1.1 and ARS-UCD1.2. Must I work with the latter, which has been updated, or can I stay with UMD3.1.1?
Good day. I am collecting larval samples of a saturniid moth (Imbrasia belina), which I would like to use for DNA barcoding and SNP genotyping. How do I prepare the specimens to avoid contamination from the gut contents? Do I have to degut the larvae before preserving them in absolute ethanol in order to extract good-quality DNA?
Many thanks in advance..
I am using TaqMan probes for SNP genotyping on a LightCycler 480. I am having difficulty resolving heterozygous genotypes: the software cannot automatically identify them as "both alleles". I am using Universal TaqMan master mix with UNG.
Can anyone tell me whether including a UNG activation step in the cycling conditions will make any difference to the fluorescence call of the FAM or VIC dye?
Other suggestions?
Hello All,
Greetings,
I have 2 cohort datasets with a sample size of 500 each and 200 SNPs, stored as genotype files (example1.gen and example2.gen). From these .gen input files I am trying to generate other formats: BGEN (example1.bgen and example2.bgen), sample files with sample ID and phenotype for each cohort (example1.sample and example2.sample), and VCF (example1.vcf and example2.vcf). I am currently using SNPTEST (latest version) for the GWAS analysis. Is there any way to do this?
Thanks in advance
We are using AppliedBiosystems TaqMan® SNP Genotyping Assays. Is it ok not to run the samples in duplicate to increase sample size?
To perform HRM analysis for genotyping on CFX96 machine for the first time, BioRad recommends to use HRM calibration kit along with Precision melt analysis software. I am just wondering whether simply using any good quality DNA binding dye along with standard PCR reagents will work or not?
I am working with Amaranth accessions which are green to red in color and contain high levels of phenolic compounds. Using the CTAB method with slight modifications, I am trying to get high-quality DNA so that it can be used for SNP genotyping.
Can anyone help me solve this problem and obtain the high-quality DNA required for the SNP chip?
Hi all,
I am trying to calculate homozygosity by loci from my SNP data according to Aparicio et al., 2006 https://pubmed.ncbi.nlm.nih.gov/17107491/. I have a VCF file of 38,140 SNPs for 60 individuals. Does it sound sensible to use Cernicalin? It seems that the current Cernicalin version can only handle 30 loci, 200 alleles per locus, and 1200 individuals. There are also two R packages developed for calculating homozygosity indexes, rhh (Alho et al., 2010. https://onlinelibrary.wiley.com/doi/full/10.1111/j.1755-0998.2010.02830.x) and Genhet (Coulon 2010. https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1755-0998.2009.02731.x). However, it's not clear how the input files were prepared. The genotype information looks different in different datasets. Can the genotype information be represented in the GT format extracted directly from the VCF file?
Thanks,
Jia
Hi, I have two data sets from the illumina omniexpress snp array platform. The first data set was mapped using the GRCh37 build and the second one was more recently read using the GRCh38 build. Not surprisingly when I've tried to merge the files in PLINK for a larger analysis it comes up with the warning snp rs... is in a different genetic position. Is there any way to update the build of the first data set? Or suggestions for how best to proceed, I haven't done much genetic analysis before so any help would be welcome :)
Best,
Mari
Dears,
I need help understanding why I obtain no results in real-time PCR when I use the TaqMan SNP genotyping assay ID: rs 22756913 (IL17A) from Thermo Fisher Scientific.
The final reaction volume was 10 microliters:
TaqMan SNP genotyping assay (20X), working: 0.3 microliter
DNA sample: 3 microliters, with a concentration of 0.6 ng
master mix: 5 microliters
nuclease-free water added to bring the total volume to 10 microliters
However, when I use another SNP, ID: rs 40401, from the same company (Thermo Fisher) with the same concentrations and volumes, I do obtain results in real-time PCR.
Regards.
How can I find a specific SNP in a FASTA sequence for a gene?
I am trying to look for positive selection in the genome of a non-model species. I therefore do not have a chromosome structure, just scaffolds and the positions of SNPs in a VCF file.
I used different commands to create the input file, but the result is the same error message “Please specify a chromosome name. Conversion stopped”.
Could someone please suggest how to do this?
Thanks in advance.
Can anyone suggest a reason why the association between my SNPs (90 thousand) and two quantitative traits (BMI and age) has a FDR-adjusted P-value over 0.9 for all my SNPs? I think it is very unlikely that not a single SNP is associated... They come from ichip genotyping and I did an imputation with the Michigan Imputation Server.
I had my files in PLINK vcf format and I transformed them to HapMap using TASSEL, and my phenotypes are in a tab separated file just as in the GAPIT manual.
My code was:
myG_tAPIT <- GAPIT(
Y=myY_t,
G=myG_t,
PCA.total=3,
model=c("CMLM")
)
Any ideas?
Thanks!
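For context on why every adjusted value can end up near 1, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It is an illustration of the procedure only, not a claim about how GAPIT computes its FDR column. When raw p-values are spread roughly uniformly between 0 and 1, as expected when nothing is associated or power is low, all adjusted values come out close to 1.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Uniformly spread p-values (hypothetical): every adjusted value is about 1.
uniform = [(k + 1) / 5 for k in range(5)]
print(benjamini_hochberg(uniform))
```

With 90 thousand tests, the smallest raw p-value has to be far below 0.05 before its adjusted value drops noticeably, so inspecting the raw p-value distribution (for example a QQ plot) is a useful first check.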
Hi,
I am doing a GWAS analysis of European bison and dog SNPs. Right now I am doing several data clean-up steps, following previous studies and what I have learned.
I was wondering whether it is OK to test for HWE and discard data that do not fit if I am working with animals where mating is controlled and the inbreeding coefficient is high?
And what about tests for population structure etc.? It is obvious that there is going to be some structure in reintroduced and domestic animals. I don't want to lose any important data and I want to be sure that I have done everything correctly.
Thank you for your help.
I want to design BaseScope probes for human samples, targeting a certain gene in its wild-type and mutated forms (single-nucleotide point mutation). To do that I need to provide stable cell lines that overexpress the WT or mutated form. Does anyone know a way to avoid the hassle of generating a stable cell line from scratch? Is there any database of commercially available cell lines with SNP information?
thanks!
I will be using the Axiom Genome-wide Human Origins 1 array. Is there a way to tell whether a sample was contaminated with DNA from another sample during the extraction stage (for example, from excessive heterozygous calls)? Is there a way to eliminate the contaminant's alleles from the data? What is the minimum proportion of contaminating DNA that can be detected by most SNP arrays?
I would like to separately perform Genotyping-by-Sequencing (GBS) on individuals from different non-model species. These species have no phylogenetically close species with a completely sequenced genome. Using GBS, I will therefore have a set of SNP genotypes from genomic DNA for each species. Is there currently a method that would allow me to detect and isolate only those loci, among the ones I have retrieved, that are shared and genotyped in both species?
Many thanks!
Paolo
Hi,
I am carrying out a project on association of DC SIGN CD209 polymorphism (rs4804803) with severe malaria. I need journal publications with specification on PCR-RFLP cycling condition and incubation temperature (for the restriction enzyme) for the SNP genotyping.
thank you.
Vincent.
What would be the best model/algorithm for calculating a genetic risk score (for risk prediction) based on genotype information from multiple SNPs?
For example, I have identified 10 SNPs that are reported to increase the risk of atrial fibrillation in several GWAS studies. I've curated data, like the OR and risk allele frequency, for these 10 SNPs. Now suppose I get a sample from a random person in whom all 10 SNPs are genotyped simultaneously. I would like to calculate this person's cumulative risk score for atrial fibrillation using the genotype information from all 10 SNPs (e.g. with an additive model). Would it be feasible to calculate such a risk score? If yes, can anyone suggest or share views/algorithms?
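One widely used additive approach is a weighted score: the sum over SNPs of the number of risk alleles carried multiplied by the natural log of the published odds ratio. A minimal sketch follows; the SNP names, odds ratios and genotype counts are hypothetical placeholders, not curated values for atrial fibrillation.

```python
import math

def genetic_risk_score(risk_allele_counts: dict, odds_ratios: dict) -> float:
    """Weighted additive score: sum of (risk allele count) * ln(odds ratio)."""
    return sum(
        count * math.log(odds_ratios[snp])
        for snp, count in risk_allele_counts.items()
    )

# Hypothetical inputs for illustration only.
odds_ratios = {"snp1": 1.2, "snp2": 1.5, "snp3": 1.1}
person = {"snp1": 2, "snp2": 0, "snp3": 1}  # risk alleles carried: 0, 1 or 2
print(genetic_risk_score(person, odds_ratios))
```

The score is only meaningful relative to a reference distribution (for example, scores in a control population), and it assumes the SNPs act independently and are not in strong linkage disequilibrium with each other.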
I have a DNA sample which was run in duplicate in a real-time qPCR experiment for the purposes of SNP genotyping. Bearing in mind that only one of its two technical replicates amplified, can I consider the genotyping result valid, or do I need to omit it from any further analysis, e.g. qPCR call rate and deviation from Hardy-Weinberg equilibrium?
I'm trying to understand epigenetic variability, but the study I'm performing involves measuring gene expression in a given region according to the SNP alleles associated with increased expression of that gene. Epigenetics refers to heritable variations that affect the phenotype without affecting the DNA sequence, so can SNPs be used in this study?
I'm new in the genetics field and really confused in this matter, any help would be appreciated! Thank you
Which of the following SNP typing chemistries is best suited for large-scale genotyping of plants?
Thermo - TaqMan®
Fluidigm - SNP Type
rhAmp SNP Genotyping
LGC - KASP
rhPCR
high resolution melt (HRM)
DASH-2
ARMS PCR
cleaved amplified polymorphic sequence (CAPS)
SNP Array
Qbead system
Can anybody provide insights into the above assays in terms of cost, applicability, advantages and disadvantages for crop breeding?
Hi everyone,
I have a computing background without any biological knowledge. For one part of my PhD I have been given 44 samples of data (I attach part of one sample to this question). I know it is a VCF file, but I want to know the exact type of data (SNPs, RNA-Seq, ...) so that I can find suitable tools to work on them. I need to calculate the similarity between each pair of samples (two patients) and build a similarity matrix from these values to pass to a classifier.
I want to know which similarity measure is used for this type of data (for example Jaccard), and whether there are any tools to calculate it.
Your help and biological knowledge are highly appreciated.
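As one possible starting point, here is a sketch of a Jaccard similarity matrix where each patient is represented by the set of variants they carry. Representing a variant as a (chromosome, position, ref, alt) tuple is an assumption for illustration, and the two tiny patient sets are made up; parsing the VCF into such sets is not shown.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Shared variants divided by all variants seen in either sample."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def similarity_matrix(samples: dict) -> dict:
    """Nested dict of pairwise Jaccard similarities between samples."""
    names = sorted(samples)
    matrix = {name: {name: 1.0} for name in names}
    for x, y in combinations(names, 2):
        matrix[x][y] = matrix[y][x] = jaccard(samples[x], samples[y])
    return matrix

# Hypothetical variant sets: (chromosome, position, ref, alt).
samples = {
    "patient1": {("1", 100, "A", "G"), ("1", 250, "C", "T"), ("2", 40, "G", "A")},
    "patient2": {("1", 250, "C", "T"), ("2", 40, "G", "A"), ("3", 75, "T", "C")},
}
print(similarity_matrix(samples))
```

Jaccard ignores whether a variant is heterozygous or homozygous; if zygosity matters, an allele-sharing (identity-by-state) measure may be a better fit.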

Dear professors and colleagues,
I am preparing a review on the use of biotechnology and molecular markers in poultry breeding.
I would appreciate your help.
I can't seem to identify conclusively if these are homo or heterozygous. How should I best do this? How high should the second peak be for me to take it as a heterozygous SNP? These are from human samples.


I am planning to standardize a TaqMan duplex PCR assay for allelic discrimination. I intend to use FAM and TAMRA for one probe. Which dye and quencher would be best for the second probe that can work efficiently in combination with FAM?
I have data for which differential gene expression was found. Now I am interested in finding SNPs only in the genes which are differentially expressed (gene name is known).
Could someone suggest how to predict SNPs for a specific gene alone?
I am working on building a linkage map based on SNP genotypes from a single full sib family. Most other studies I've seen have used JoinMap. I wanted to know if anyone had used JoinMap or any good alternatives (maybe in R?) that could still be used with full sib data. Is there a large computational bottleneck with any of these softwares?
Which type of PCR techniques is the best in your opinion?
There are several different types of PCR techniques. Which technique is best in terms of reliability, diagnostic accuracy, speed and cost, in your opinion? Why?
Please, Share your experience in this field.
All appreciation for all contributions.
Hi all,
My research relates to identifying SNP combinations associated with disease. I found that there are many methods to measure the association between a SNP combination and a disease. Could you suggest some good measures for this?
Thank you in advance!
Best wishes,
Hi,
I have just a list of SNPs located in a specific gene, and I want to select tag SNPs for genotyping. It used to be possible to do this with Haploview, but the HapMap data are no longer updated. It is possible to do it with PLINK, but I don't have a .ped file because I have not genotyped any samples yet. So I really don't know how to select tag SNPs with only a list of SNP IDs (rs numbers). Does anyone know a way to do it?
Thanks in advance,
I will be using tetra-primer ARMS PCR for genotyping a specific SNP in the adiponectin gene. I have taken the 4 primers designed in a previous paper, but I noticed that the paper used one annealing temperature for all 4 primers in one PCR reaction. I don't know whether this will give an accurate result. Is it better to use 2 annealing temperatures (one for the outer primers and one for the inner primers)? If so, I don't know how many cycles to run at the Ta of the outer primers and how many at the Ta of the inner primers.
We are using the ARMS PCR method for SNP genotyping, so I want to know the principle behind this PCR and how it differs from normal PCR.
I'm attempting to use a Biomeme two3 portable qPCR machine for SNP genotyping as part of an MSc project. I am using DNA from lizard blood samples purified with the Biomeme M1 purification kit and PrimeTime primers and probes designed and manufactured by IDT for our SNP of interest.
The amplification curves we've obtained so far look extremely strange. Our most recent runs consistently show a small, early increase in fluorescence around cycle 10, which almost immediately levels off (after 2-3 cycles). Two previous qPCR runs (with the same reagents) show patterns of increasing and decreasing fluorescence which seem almost random - at the time we assumed this represented background noise, but the relative changes in fluorescence were actually much higher than for more recent runs, suggesting that perhaps something more was going on?
Any help interpreting this somewhat perplexing data would be much appreciated!



Generally, my data will be falling into a 3 x 3 contingency tables.
AA AB BB
Low
Mid
High
We are looking at adrenal suppression by measuring post-metyrapone test ACTH values which are grouped as Low-range(<106pg/ml), Mid-range(106-319pg/ml) and High-range(> 319pg/ml). We have 5 SNPs data from TaqMan Assay. We want to look at which SNP genotypes/alleles are associated with our grouped ranges. One of the SNPs did not have the BB genotype observed in our data.
So we have:
AA AB
Low 33 3
Mid 24 5
High 26 5
1. Could this locus be informative without the BB genotypes?
2. Can I still do HWE analysis on these?
3. How do I calculate the risk of A or B?
4. How do I check the association of these genotypes with the Phenotypes (what will be the measure of association)?
5. What association test will be appropriate for the general 3x 3 data?
6. Which software or tool or publication can be worthy of use or reading?
I will very much appreciate your kind responses.
Regards
Wisdom
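A minimal sketch of a Pearson chi-square test of independence applied to the 3 x 2 table above (Low/Mid/High by AA/AB), in plain Python. The closed-form p-value `exp(-chi2 / 2)` is valid only for 2 degrees of freedom, which is what a 3 x 2 table gives; the function name is hypothetical.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Counts from the question: rows Low, Mid, High; columns AA, AB.
table = [[33, 3], [24, 5], [26, 5]]
chi2, df = chi_square_independence(table)
p_value = math.exp(-chi2 / 2)  # chi-square survival function, valid for df == 2 only
print(chi2, df, p_value)
```

Several expected counts in the AB column are below 5, so an exact test (for example Fisher's exact test) may be more appropriate than the chi-square approximation here.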
I purchased Custom Taqman SNP genotype assay mix from ThermoFisher around April 2018. All the assay mix was aliquoted into small volumes upon arrival. So I use up an entire tube per experiment instead of freeze-thaw, and they are light sensitive so I cover them with aluminum foil. The basic principle of the assay mix works like this: Primer 1 has base A and is attached to VIC fluorophore, primer 2 has base G and is attached to FAM fluorophore. Depending on which SNP is in my DNA samples, the primer will be conjugated and the corresponding fluorophore will emit fluorescence at a certain wavelength. The signal amplifies as rxn goes.
I recently extracted rat genomic DNA using the Qiagen DNeasy Blood and Tissue kit. The concentrations and the 260/280 and 260/230 ratios were measured using a nanophotometer. The concentrations were above 200 ng/ul and the ratios are all above 1.8. I further confirmed the integrity by running a DNA gel to make sure genomic DNA is present.
However, when I tried to genotype the SNP in my genomic DNA, the reaction doesn't work. I've tried with 10ul rxn/ well instead of the suggested 25ul because I'm just testing, I did this by dividing every component in the well by 2.5. So in each reaction I have: 5ul of EagleTaq Universal Mastermix, 0.5ul of 10X or 20X SNP assay mix, and 4.5ul of genomic DNA at 5ng/ul, 10ng/ul and 20ng/ul. None of these works.
The program was adapted from the Taqman SNP genotype manual. In the protocol, I added a 50C 2min pre-incubation according to one of the troubleshooting technicians. Then it's 10min of 90C activation, then 40 cycles of 15secs of 90C and 1min of 60C. Then just a 1 minute 40C cooling step. The result is that there is no amplification of fluorescence.
Can someone please help me and suggest possible reasons for why there is no fluorescence amplification? Could it be not enough SNP assay mix? mastermix? or DNA? Please help!


I have 50 patients genotyped using Illumina Infinium Global Screening Array and I wonder whether it is possible to type the HLA allele in each patient using the SNP data I have. GSA data sheet say it has 400+ bio markers in HLA genes. Do I need to impute the SNPs or can I directly identify whether the Tag SNPs are present in SNP genotype data and infer the HLA allele quickly?
Your help is much appreciated.
I've extracted DNA from animal tissue using Qiagen DNeasy Blood and Tissue kits. The tissues were frozen at -80C before I took pieces from them and extracted DNA. The 260/280 and 260/230 ratios are all between 1.8 and 2.2. The concentrations vary somewhat, between 100 and 700 ng/ul. I used Qiagen elution buffer to elute, but used water to dilute the DNA because I know that EDTA will affect downstream SNP genotyping. The DNA samples are stored at -20C unless I take them out for dilutions. However, when run on a 1% agarose gel at 75V for 1.5 hr, with DNA at 15-40 ng/ul loaded at 5 ul per well, there is visible degradation. I am planning to do SNP genotyping after extraction and am not sure whether this will be successful with partially degraded DNA. Ideally, undegraded genomic DNA is needed. Can someone please suggest what else I can do to improve or rescue this? Thank you!

I have been working with SNP markers and I want to calculate Jaccard distance using NTSYSpc 2.2. I tried to code my SNP data as 0 and 2 for homozygotes and 1 for heterozygotes, but it did not work. Could anyone tell me how to format the input data? Any suggestions would be appreciated.
I am on the periphery of several projects looking at developing SNP genotyping arrays for relatively small numbers of markers, and to be run on relatively small numbers of individuals: potentially hundreds of individuals total, but with runs capable of doing few individuals at a time.
The SNP discovery phase has already been done through GBS.
I am wondering if anyone has any opinions on the best method to do this?
The questions to be answered with the data are things like: 1) matching an individual predator to predation marks, 2) relatedness of invading individuals, 3) landscape genetics.
I am guessing something like SNaPshot would be most cost effective for when just tens of markers are required? But what about when more markers are needed?
Obviously just doing more GBS would be best when thousands of markers are needed - probably landscape genetics questions would be best served by doing this and waiting for individuals to fill the runs. But is there a cost effective option somewhere in the middle in terms of marker number?
Hello, I would like to design probes for SNP genotyping. Has anyone designed such probes who can give me a few tips on how to go about it?
Thank you,
Joanna
I am doing SNP genotyping on a LightCycler 480 using TaqMan probes and TaqMan Universal Master Mix with UNG. Sometimes the software gets a low signal for heterozygous samples and is not able to call them as "both alleles". Will it make any difference if I include the UNG activation step in my protocol? Is there any way to improve the heterozygous calls for FAM and VIC?
Hello,
I would like your comments/suggestions for my strategy.
I have F0 samples with two different phenotypes.
I have F2 samples with unknown phenotypes.
I would like to create a library of homozygous F0 genotypes.
Then I would like to genotype my F2 samples using this previously created library.
I already pre-processed BAM files (I have all raw data if required) : bowtie2 and samtools.
STRATEGY:
1) Create genotype library with F0 samples:
- GATK HaplotypeCaller for both F0 phenotype samples : java -Xmx30g -jar GenomeAnalysisTK_3-8.jar -nct 16 -T HaplotypeCaller -R GENOME --emitRefConfidence GVCF -I INPUT.bam -o OUTPUT.g.vcf
- Merge the results: java -Xmx16g -jar GenomeAnalysisTK_3-8.jar -nt 16 -T GenotypeGVCFs -R GENOME --variant F0Variant1.g.vcf --variant F0Variant2.g.vcf -o Results_Merge_F0.vcf
- Then I used a homemade script to select only positions that are homozygous in both F0 samples and differ between them (for example 1/1 in one F0 sample and 0/0 in the other): Results_Merge_F0_filtered.vcf
2) Genotype F2 sample with the library:
- GATK HaplotypeCaller : java -Xmx30g -jar GenomeAnalysisTK_3-8.jar -nct 16 -T HaplotypeCaller -R GENOME --emitRefConfidence GVCF -I INPUT.bam -o OUTPUT.g.vcf -L Results_Merge_F0_filtered.vcf
- Then I used a homemade script to assign each genotype to one or the other F0 phenotype.
But ...
At this last step I mostly got homozygous SNPs for my F2 samples.
I should get around 25% phenotype 1, 25% phenotype 2 and 50% phenotype 1/2.
I am missing something, but I don't know what.
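One way to quantify how far the F2 calls at a marker depart from the expected 1:2:1 ratio is a chi-square goodness-of-fit test. The sketch below is only an illustration: the function name and the example counts are hypothetical, and 5.991 is the standard critical value for 2 degrees of freedom at alpha = 0.05.

```python
# Chi-square goodness-of-fit test against the 1:2:1 F2 expectation
# (homozygous parent 1 : heterozygous : homozygous parent 2).

CRITICAL_VALUE_DF2 = 5.991  # chi-square, 2 degrees of freedom, alpha = 0.05


def chi_square_121(hom1, het, hom2):
    """Return the chi-square statistic for observed F2 genotype counts."""
    total = hom1 + het + hom2
    expected = (total * 0.25, total * 0.5, total * 0.25)
    observed = (hom1, het, hom2)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts with a strong deficit of heterozygotes.
stat = chi_square_121(45, 10, 45)
print(stat, stat > CRITICAL_VALUE_DF2)
```

A large statistic driven by missing heterozygotes across many markers would point to a systematic calling problem rather than true segregation distortion.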
I have used the following procedure but still get strong primer-dimer bands, while the desired product is absent.
The PCR cycling conditions were 5 min at 95 °C for activation, followed by 35 cycles of 95 °C for 30 s (denaturation), 55 °C for 30 s (annealing) and 72 °C for 30 s (elongation), with a final elongation at 72 °C for 10 min.
The primers used are:
F: CAGCCATACAGGGCATCCAG
R: ACAGATGGGTGTGTGGGGAT
The expected PCR product size is 250 bp.
Please, I need help working out what is wrong.
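As a quick first check on the primers, GC content and a rough melting temperature can be computed directly from the sequences. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a coarse estimate for oligos of this length; a nearest-neighbour calculator would be more reliable. The function names are made up for this example.

```python
# Rough primer checks: GC content and Wallace-rule melting temperature.
# The Wallace rule is an approximation intended for short oligos.


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 100 * gc / len(primer)


def wallace_tm(primer):
    """Wallace-rule Tm in degrees Celsius: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


for name, seq in [("F", "CAGCCATACAGGGCATCCAG"), ("R", "ACAGATGGGTGTGTGGGGAT")]:
    print(name, len(seq), gc_percent(seq), wallace_tm(seq))
```

By this rough estimate both primers melt several degrees above the 55 °C annealing step used, which is one of several factors worth checking alongside primer concentration and template quality.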
I want to get data about the different SNPs I am working on, especially their role in changing the structure and function of genes and regulatory regions. What tools are available for this type of analysis?
Could anybody please explain how to convert the population membership output from GENELAND for use in the program DISTRUCT, or point me to software or R code that does the conversion?
I am reporting some cross-phenotype genetic correlations (LDSC method) that are statistically significant after multiple test correction. Peer reviewers have asked us to report top SNPs that might contribute to these correlations. I understand that genetic correlations reflect a genome-wide effect, but I want to satisfy reviewers/readers. Has the field arrived at a best practice for identifying/prioritizing SNPs/regions that mediate a significant genetic correlation? I have a few ideas below, but hope to get feedback from the community of experts.
1) It would be simple to threshold the 2 sets of GWAS results at some p-value (arbitrary threshold?) and compare the overlap of the two results. The surviving results could be clumped to LD-independent markers and nearby genomic features could be reported. Simple, but arbitrary selection.
2) Meta-analysis has been a common approach for combining data across phenotypes. We might see some signals persist or become more significant if a SNP is associated with both phenotypes. We also might see some associations driven by only one phenotype, so this method isn't exactly specific to SNPs contributing to correlation. Overall, this approach doesn't seem to answer the exact question I'm asking. From my cursory understanding, an MTAG analysis also wouldn’t exactly answer my question.
3) pHESS is a new method that seems promising for addressing my question. https://www.biorxiv.org/content/biorxiv/early/2016/12/08/092668.full.pdf; however, this approach does not control for ancestry stratification as nicely as the LDSC method.
As an extension on this question: Have we arrived at methods for assessing whether the SNPs/regions mediating a genetic correlation are significantly enriched for biological pathways/functions? The approach laid out here seems relevant, but I haven't had a chance to read the article (https://www.biorxiv.org/content/early/2017/03/07/114561)
Hi! We've recently purchased Fisher TaqMan SNP APOE Genotyping Assays and have been running them on brain samples with the BioRad CFX96 thermocycler. Test brain samples are supposed to be homozygous E3/E3 and we don't seem to be getting that. Has anyone had experience with this assay and have any tips on analyzing these results in terms of CTs or data interpretation? The protocol was less than helpful, and we're having difficulty finding help online. Thanks!
I want to use HRM analysis for SNP genotyping, and I need to amplify just the fragment of the target gene that includes these two SNPs.
Hello,
I am doing SNP genotyping by the PCR-RFLP method. When I enter the FASTA sequence of my target region in NEBcutter to find the relevant restriction enzyme cut sites, it reports "cleavage affected by CpG methylation" or "cleavage affected by other methylation" for some enzymes. Should I use those enzymes for genotyping my SNP locus? I think that if the DNA is methylated, the restriction enzymes will not cut at that site. How can I then determine the polymorphic allele? Is there any in silico method to check DNA methylation before doing RFLP? Please advise.
In 23andMe files for personal genetics services, I found some genetic variants with names like i6033918 and the genotype II. What does this mean?
I am looking at genotypes formed by allele A and allele G. The minor allele is G, with a frequency of 0.01. Consequently, I did not find any minor-allele homozygotes when genotyping the patient samples. So, is it necessary to test whether these genotypes are in HWE?
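To see why no GG homozygotes turn up, it helps to compute the genotype counts expected under Hardy-Weinberg equilibrium for a minor allele frequency of 0.01. The sketch below is a minimal illustration; the sample size of 500 is a made-up example and the function name is hypothetical.

```python
# Expected genotype counts under Hardy-Weinberg equilibrium for a
# biallelic SNP with alleles A (major) and G (minor).


def hwe_expected_counts(minor_allele_freq, n_samples):
    """Return expected counts of AA, AG and GG under HWE."""
    q = minor_allele_freq
    p = 1.0 - q
    return {
        "AA": p * p * n_samples,
        "AG": 2 * p * q * n_samples,
        "GG": q * q * n_samples,
    }


# Hypothetical sample size of 500 individuals.
print(hwe_expected_counts(0.01, 500))
```

With q = 0.01, the expected GG frequency is q² = 0.0001, so far fewer than one GG individual is expected in a sample of a few hundred. Because the expected count is so small, an exact HWE test is generally preferred over the chi-square test here.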
I am going to inspect positive selection for a list of SNPs (more accurately loci which contain my SNPs). I am trying to use the rehh package in R for working on 1000 genomes project phased data.
rehh requires as input :
- Standard haplotype format. Each line represents a haplotype (the first element being the haplotype ID), with SNP genotypes in columns.
- A SNP information file. Each line gives the SNP name, its chromosome of origin, its position on the chromosome, and its ancestral and derived alleles (as coded in the haplotype input file).
I am a novice in this field and really need your help to make these input files.
Any suggestion would be appreciated greatly.
Anyone using BFDstar and trying to get their SNP alignment in binary format?
They have a pretty good tutorial, but I am having trouble using the R package phrynomics to convert my data.
I keep getting this message:
> TranslateBases(snpdata, translateMissing = FALSE, ordered = FALSE)
Error in `[<-.data.frame`(`*tmp*`, j, , value = c("B", "E", "G", "I", :
replacement has 12 items, need 37
And since I am not an R expert (far from it), I cannot figure out what that means, even after a lot of googling. Nobody is answering me in the phrynomics Google group either.
Can anyone help? That would be much appreciated.
Thank you!
Dear all,
I am trying to design primers for an AS-PCR and I am facing some problems with one of my target SNPs. As for the others, I have been considering the following when designing the AS primer:
- The target SNP [A/G] will be the base at the 3’end of the primer
- Introduced an extra mismatch at the 2nd (or 3rd) base closest to the 3’end of the primer
It worked great for two of my target SNPs. For the third one, the method kept on failing. I had only a couple of sequences available, and thus I have been sequencing the possible fragment around this target SNP (amplicon of around 100 bp only) to be able to check for other sources of variability. And yes, they are there:
GCAGCAAATCGGRMGMGCAGTG[A/G]GCGGATCAYTYYTTCACCTGCC
Flanking my target SNP, I have identified 3 other SNPs. My idea was to design degenerate primers that would still be allele-specific. I made a clumsy attempt, without success. Does anyone have tips or suggestions on how to work this out? Thank you for your feedback.
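When planning degenerate primers, it can help to list every concrete sequence that the IUPAC codes in the flanking region stand for. The sketch below expands the upstream flank from the post using the standard IUPAC nucleotide codes; the function name is made up for this example.

```python
# Expand a sequence containing IUPAC ambiguity codes into all the
# concrete sequences it represents.
from itertools import product

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(seq):
    """Return every unambiguous sequence encoded by an IUPAC string."""
    options = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*options)]


upstream = "GCAGCAAATCGGRMGMGCAGTG"
variants = expand_degenerate(upstream)
print(len(variants))
for v in variants:
    print(v)
```

The number of variants doubles with each two-fold degenerate position, so a degenerate primer covering all of them dilutes the concentration of any single matching primer.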
Hi Everyone,
I am working with SNP data and want to do a logistic regression analysis.
In multinomial logistic regression, is it compulsory to choose the most common genotype as the reference, or can I choose any genotype as the reference?
For one of my SNPs (genotypes II, ID, DD), when I choose the most common genotype, II, as the reference, the odds ratio comes out as 0.57, but when I choose the ID genotype, the odds ratio changes to 1.67 with p < 0.05. Is it fine to choose the heterozygous genotype as the reference?
Thanks,
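For unadjusted odds ratios, swapping the reference genotype simply inverts the odds ratio between the two genotypes involved, so the underlying association does not change. The sketch below illustrates this with made-up case/control counts (not the poster's data); the function name is hypothetical.

```python
# Unadjusted odds ratio of a genotype relative to a chosen reference
# genotype, from case/control counts.
# The counts below are hypothetical and chosen only for illustration.

counts = {
    # genotype: (cases, controls)
    "II": (60, 40),
    "ID": (30, 35),
    "DD": (10, 25),
}


def odds_ratio(table, genotype, reference):
    """Odds of being a case for `genotype` divided by the odds for `reference`."""
    cases_g, controls_g = table[genotype]
    cases_r, controls_r = table[reference]
    return (cases_g / controls_g) / (cases_r / controls_r)


print(round(odds_ratio(counts, "ID", "II"), 2))  # ID vs II as reference
print(round(odds_ratio(counts, "II", "ID"), 2))  # II vs ID as reference
```

The two values are reciprocals of each other. Adjusted models behave the same way for a given pair of genotypes, so the choice of reference mainly affects how the result is read.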
I am working with several immortalized cell lines and need a protocol to uniquely identify each one, perhaps by genotype or SNP array. Does anyone have a good SOP for this, or know of a company that provides such a service?
Thank you in advance!
I have to do a SNP genotyping assay, so I am planning to use genotyping by sequencing (GBS). However, my problem is that I have a reference genome for one parent but not for the other two parents, and moreover my material is allotetraploid. With GBS I might run into trouble assembling the reads of the samples that lack a reference genome, because their polyploid nature will impose hurdles. Please suggest something.
I have done real-time PCR SNP genotyping with TaqMan genotyping assays for 1000 patients. In this method, the SNP alleles are detected with the VIC and FAM dyes of the TaqMan probes. The results are displayed in green and yellow.
Now, how do we distinguish whether VIC or FAM detected the SNP?
Thank you.
I want to genotype 3 specific SNPs from a single gene in about 150 human patients. I want the cheapest method; I am a university student and will be paying for it myself.
Thanks for your time and help.
In our research, we have three groups of genotypes: 1) Resistant, 2) Susceptible and 3) Unknown (maybe resistant, maybe susceptible). We performed NBS (Nucleotide Binding Site) profiling in all genotypes and the results are ready as 0/1 scores. Thus, we want to find a way to make a correlation between the presence/absence of bands and the resistance/susceptibility of genotypes to guess the resistance status in unknown genotypes. I would be grateful if you could kindly share your experiences in this context.
In my study I used a set of 18 microsatellite markers to assess genetic diversity, but I am not sure whether this number is sufficient for my research.
How can I calculate haplotype frequencies from the genotype frequencies of two polymorphisms in the same gene?
For example:
Variant 1   Cases   Controls
CC          20      40
CT          50      40
TT          30      20
Variant 2   Cases   Controls
GG          22      30
GA          28      40
AA          50      30
I suggest the haplotypes are:
CG
CA
TG
TA
Thanks
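Separate genotype counts for each variant do not determine haplotype frequencies on their own; that needs the joint genotypes of each individual, usually with an EM or phasing method. What the counts do give is allele frequencies, and if one is willing to assume linkage equilibrium (a strong assumption that may not hold for two variants in the same gene), the haplotype frequencies are just their products. The sketch below does this for the control counts from the post; the function names are made up for this example.

```python
# Allele frequencies from genotype counts, and haplotype frequencies
# under the ASSUMPTION of linkage equilibrium (no LD between variants).


def allele_freq(hom_first, het, hom_second):
    """Frequency of the first allele from counts of the three genotypes."""
    n = hom_first + het + hom_second
    return (2 * hom_first + het) / (2 * n)


def haplotypes_under_le(freq_a1, freq_b1, a=("C", "T"), b=("G", "A")):
    """Products of allele frequencies; only valid if the variants are not in LD."""
    fa = {a[0]: freq_a1, a[1]: 1 - freq_a1}
    fb = {b[0]: freq_b1, b[1]: 1 - freq_b1}
    return {x + y: fa[x] * fb[y] for x in a for y in b}


# Control counts from the post: CC/CT/TT = 40/40/20, GG/GA/AA = 30/40/30.
freq_c = allele_freq(40, 40, 20)
freq_g = allele_freq(30, 40, 30)
print(freq_c, freq_g)
print(haplotypes_under_le(freq_c, freq_g))
```

If individual-level genotypes for both variants are available, estimating haplotypes from those would be preferable to this approximation.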
A project in our group is investigating the association between specific SNPs and drug resistance. For one of these SNPs, it has been reported that the wild-type genotype GG is altered to TT. Using sequencing, we found that the wild-type GG was altered to CC instead. How could we explain this, setting aside ethnic variation?
Thank you in advance.
Can anybody suggest user-friendly software for the analysis of LD and population structure with large SNP data sets in polyploids?
The help page of Genepop online explains how to format the input data:
genepop.curtin.edu. /help_input.html
There, the allele sizes at each locus are entered as in the attached file.
My question is about a homozygote for allele size 90, which is written as 090090. Does this mean it has only one band, at 90 bp?
Another genotype is written as 000230, which means it has one band at 230 bp. Why was it not written as 230230? Is this not considered homozygous?
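A small parser makes the two codes easier to compare. The sketch below assumes the three-digit allele coding shown in the post and assumes that 000 is the missing-allele code, which is my understanding of the Genepop format and should be checked against the help page; the function name is made up for this example.

```python
# Parse a six-digit Genepop genotype (two three-digit alleles).
# ASSUMPTION: an allele coded 000 means "missing", not a band of size 0.


def parse_genotype(code):
    """Split a six-digit genotype into two alleles; 000 becomes None."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected six digits, got {code!r}")
    alleles = []
    for part in (code[:3], code[3:]):
        value = int(part)
        alleles.append(value if value != 0 else None)
    return tuple(alleles)


for code in ("090090", "000230"):
    first, second = parse_genotype(code)
    homozygous = first is not None and first == second
    print(code, (first, second), "homozygous" if homozygous else "not scored as homozygous")
```

Under that assumption, 090090 is a scored homozygote for allele 90, while 000230 records only one allele and leaves the second one unknown.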

FBAT provides several different models for family-based association tests, such as dominant, recessive and additive. However, in PLINK I can only select --tdt, --dfam or --qfam-total for a family-based association test, and I don't know which genetic model PLINK applies. I looked at the documentation and instructions, but I can't find the answer. Does anyone know?
Thanks
Daniel
I have a list of selected SNPs that I am interested in genotyping. The company doing the genotyping needs the flanking sequence of each SNP. How do I get this, or how can I download it from NCBI Gene (which links me directly to the whole gene)?
I am looking forward to your answers.
The materials are durum wheat accessions (allotetraploids with A and B genomes; 85% landraces + 15% improved varieties), and the software used to analyse the data is Axiom Analysis Suite v2.
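Once the reference sequence of the region is in hand (for example from a downloaded FASTA record), the flanking sequence can be cut out with a few lines of code. The sketch below assumes a plain sequence string and a 1-based SNP position within it; the function name, the toy sequence and the LEFT[REF/ALT]RIGHT output layout are illustrative choices, so check which format the company actually wants.

```python
# Extract the flanking sequence around a SNP from a reference sequence.
# ASSUMPTIONS: `sequence` is a plain string of bases and `position` is the
# 1-based coordinate of the SNP within that string.


def flanking_sequence(sequence, position, ref, alt, flank=50):
    """Return 'LEFT[REF/ALT]RIGHT' with up to `flank` bases on each side."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position is outside the sequence")
    if sequence[index].upper() != ref.upper():
        raise ValueError("reference allele does not match the sequence")
    left = sequence[max(0, index - flank):index]
    right = sequence[index + 1:index + 1 + flank]
    return f"{left}[{ref}/{alt}]{right}"


# Toy example: SNP at position 5 of a 10-base sequence, 3 bases of flank.
print(flanking_sequence("ACGTACGTAC", 5, "A", "G", flank=3))
```

The reference-allele check catches off-by-one coordinate errors, which are common when mixing 0-based and 1-based positions.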
Hi everybody,
I would like to describe my figure showing pairwise genetic differentiation between Vipera berus individuals against the logarithm of distance.
More precisely, I would like to know how to interpret pairs with a high Rousset's distance, but I cannot find any explanation of what counts as a high or low value. If this is explained in Rousset 2000, I did not understand it.
So, is there a consensus on what constitutes a high or low value of Rousset's distance â?
Thanks in advance