My data set is a large gene count table with 500,000 genes as rows and 6 samples as columns (columns 1:3 are control, 4:6 are disease). Although some of my samples have a few 0 values, no column contains only 0 values.
I am trying to run a PCoA plot following Bray-Curtis dissimilarity calculation.
Here is my code:

    gctab <- read.csv("final.gene.count.table.nonzero.csv", row.names=1)
    DF = data.frame(id=colnames(gctab),type=rep(c("ctrl","disease"),each=3))
    dds = DESeqDataSetFromMatrix(gctab,DF,~type)
    vsd <- vst(dds, blind=TRUE)
    vegDistOut=vegdist(t(assay(vsd)),"bray")
    vegDistOut=vegdist(t(assay(vsd) + min(assay(vsd))),"bray") ### this does not work either.
I cannot proceed with making the PCoA because of the message I get:

    In vegdist(t(assay(vsd)), "bray") : results may be meaningless because data have negative entries in method “bray”
I am not too sure how to overcome this error. Please could anybody advise?
Edit: vegdist is part of the vegan package in R. The problem arises because vst produces negative values, and Bray-Curtis dissimilarity can only be calculated on non-negative values. Would anybody suggest using the zinbwave package for preprocessing rather than vst?
Thank you
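For illustration, here is a minimal sketch of why adding min(assay(vsd)) cannot remove the negative entries when that minimum is itself negative, and what subtracting it does instead. The matrix `mat` is a made-up stand-in for assay(vsd), and `bray()` is a hypothetical hand-written helper so the sketch runs in base R without vegan; with vegan loaded, vegdist(x, "bray") would take its place. Note that any shift is an arbitrary choice that changes the Bray-Curtis values, so this shows the mechanics only and is not a recommendation.

```r
# Toy stand-in for assay(vsd): 5 genes x 6 samples with some negative values (made-up numbers).
set.seed(1)
mat <- matrix(rnorm(30, mean = 1, sd = 1.5), nrow = 5,
              dimnames = list(paste0("gene", 1:5), paste0("s", 1:6)))

# Adding a negative minimum pushes every value further below zero.
shifted_wrong <- mat + min(mat)

# Subtracting the minimum makes the smallest entry exactly zero.
shifted_ok <- mat - min(mat)

# Hypothetical helper: Bray-Curtis between the rows of x, written by hand.
bray <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (j in seq_len(n)) {
    for (k in seq_len(n)) {
      d[j, k] <- sum(abs(x[j, ] - x[k, ])) / sum(x[j, ] + x[k, ])
    }
  }
  as.dist(d)
}

# Samples must be rows, hence the transpose (as in the original vegdist call).
bc <- bray(t(shifted_ok))

# Classical multidimensional scaling on a distance object is PCoA.
pcoa <- cmdscale(bc, k = 2)
```

The first two columns of `pcoa` can then be plotted and coloured by sample type.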
It's not clear what package vegdist comes from, but you should add that as a tag so the maintainer will get an email. I don't think it's a function in DESeq2, so Mike Love is the only one who is getting an email.