Question

# What does the warning message "1: In is.euclid(d) : Zero distance(s)" mean?

I would like to estimate genetic distance using the simple matching coefficient of Sokal & Michener (1958).
My data object is named snp, and I ran the following commands:
snp
for (i in 1:10) {
d <- dist.binary(snp, method = 2)
cat(attr(d, "2"), is.euclid(d), "\n")}
and the results were
> for (i in 1:10) {
+ d <- dist.binary(snp, method = 2)
+ cat(attr(d, "2"), is.euclid(d), "\n")}
TRUE
TRUE
TRUE
TRUE
TRUE
TRUE
TRUE
TRUE
TRUE
TRUE
Warning messages:
1: In is.euclid(d) : Zero distance(s)
2: In is.euclid(d) : Zero distance(s)
3: In is.euclid(d) : Zero distance(s)
4: In is.euclid(d) : Zero distance(s)
5: In is.euclid(d) : Zero distance(s)
6: In is.euclid(d) : Zero distance(s)
7: In is.euclid(d) : Zero distance(s)
8: In is.euclid(d) : Zero distance(s)
9: In is.euclid(d) : Zero distance(s)
10: In is.euclid(d) : Zero distance(s)
I would like to ask:
1- What does the warning message mean?
2- How can I deal with missing values? I would like to be sure that I am handling them correctly.

The warning message means that there is at least one zero distance in your distance matrix d. This is usually caused by at least two identical rows in your snp matrix, so the distance between them is 0. The warning is issued by the is.euclid() function and is only there to inform you that you have identical rows (which you may want to collapse into one).
This is an example:
# ALL ROWS DIFFERENT - No warning
> snp <- matrix(c(1,0,0,0,0,0,1,1,1,1,0,1), nrow=3)
> snp
     [,1] [,2] [,3] [,4]
[1,]    1    0    1    1
[2,]    0    0    1    0
[3,]    0    0    1    1
> dist.binary(snp, method=2)
          1         2
2 0.7071068
3 0.5000000 0.5000000
> is.euclid(dist.binary(snp, method=2))
[1] TRUE
# ROWS 1 AND 3 IDENTICAL - Warning
> snp <- matrix(c(0,0,0,0,0,0,1,1,1,1,0,1), nrow=3)
> snp
     [,1] [,2] [,3] [,4]
[1,]    0    0    1    1
[2,]    0    0    1    0
[3,]    0    0    1    1
> dist.binary(snp, method=2)
    1   2
2 0.5
3 0.0 0.5
> is.euclid(dist.binary(snp, method=2))
[1] TRUE
Warning message:
In is.euclid(dist.binary(snp, method = 2)) : Zero distance(s)
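If you want to know which rows are responsible for the zero distances, a minimal sketch in base R (no ade4 needed) could look like the following. It reuses the toy matrix from the example; the variable names dup_rows, row_key and identical_sets are only illustrative.

```r
# Toy matrix from the example: rows 1 and 3 are identical.
snp <- matrix(c(0,0,0,0,0,0,1,1,1,1,0,1), nrow = 3)

# duplicated() flags every row that repeats an earlier row.
dup_rows <- which(duplicated(snp))

# Build one text key per row, then group row indices that share a key.
row_key <- apply(snp, 1, paste, collapse = ",")
groups <- split(seq_len(nrow(snp)), row_key)

# Keep only the groups with more than one member (sets of identical rows).
identical_sets <- groups[lengths(groups) > 1]

print(dup_rows)
print(identical_sets)
```

If the duplicates are not informative for your analysis, unique(snp) returns the matrix with repeated rows dropped, after which the warning should no longer appear.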
On the other hand, dist.binary does not allow missing values: it only accepts binary data coded as 0 (FALSE) or any positive integer (TRUE). If the data contain an NA (missing value), it stops with an error:
> snp <- matrix(c(NA,0,0,0,0,0,1,1,1,1,0,1), nrow=3)
> snp
     [,1] [,2] [,3] [,4]
[1,]   NA    0    1    1
[2,]    0    0    1    0
[3,]    0    0    1    1
> dist.binary(snp, method=2)
Error in if (any(df < 0)) stop("non negative value expected in df") :
  missing value where TRUE/FALSE needed
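Since dist.binary stops on NA, one possible workaround is to compute the simple matching distance yourself and, for each pair of rows, compare only the columns observed in both. The sketch below is an assumption about how you might want to treat missing data, not something ade4 provides; sm_dist_pairwise is a hypothetical helper name. It uses the same transformation seen in the outputs above, d = sqrt(1 - S), where S is the proportion of matching positions.

```r
# Hypothetical helper (not part of ade4): simple matching distance that,
# for each pair of rows, ignores columns where either value is NA.
sm_dist_pairwise <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(n >= 2)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # Columns observed in both rows
      ok <- !is.na(x[i, ]) & !is.na(x[j, ])
      # S = share of matching positions; NA if nothing is comparable
      s <- if (any(ok)) mean(x[i, ok] == x[j, ok]) else NA_real_
      d[i, j] <- d[j, i] <- sqrt(1 - s)
    }
  }
  as.dist(d)
}

# Toy matrix with one missing value, as in the error example
snp <- matrix(c(NA,0,0,0,0,0,1,1,1,1,0,1), nrow = 3)
d <- sm_dist_pairwise(snp)
print(d)
```

Note that each pair is then compared over a possibly different number of columns, so the resulting matrix is not guaranteed to be Euclidean. A stricter alternative is to drop every column that contains an NA, for example snp[, colSums(is.na(snp)) == 0, drop = FALSE], and pass the result to dist.binary.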
Please note that your example code does exactly the same thing in all 10 iterations of the for loop. I think the code was adapted from the example in the dist.binary documentation, where the loop is intended to show the outcome for each of the 10 methods. So, if you want to compare all the methods, you should use method=i instead of method=2:
d <- dist.binary(snp, method = i)

5th Feb, 2014
Lindsay Virginia Clark
University of Illinois, Urbana-Champaign
I am not sure why you are running a "for" loop since it looks like you are simply performing the same calculation ten times. I am guessing the error meant that something went wrong at the dist.binary stage and produced a dist object d that is empty. Have you looked at the contents of the R object "snp" to make sure your data was imported correctly? Have you looked at the contents of the object "d" to make sure that dist.binary did what was expected?
