I imported my data, which has gene names in the first column and per-sample gene counts in the other 21 columns, using
>d=read.csv("path/xyz.csv", as.is=T)
Then I defined the groups using the c function
>group=c("control", "test", "n")
>library(edgeR)
>dge=DGEList(d,group=group,remove.zeros=TRUE)
At this step, I get the following error message:
"Error in .isAllZero(counts) :
count matrix must be integer or double-precision"
You read your data in using read.csv, which returns a data.frame with the first column being gene names. This is neither a matrix, nor does it contain (only) read counts. If you look at the help for DGEList, it specifically says the 'counts' object should be a matrix of read counts.
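As a minimal sketch of what that conversion might look like (the toy data, gene names, sample names and group labels below are made up for illustration, not taken from the original file): move the gene names into the row names, keep only the count columns as a matrix, and supply one group label per sample column.

```r
# Toy stand-in for the CSV: first column holds gene names, the rest hold counts.
# (Gene and sample names here are hypothetical.)
d <- read.csv(text = "X,S1,S2,S3,S4
geneA,10,0,3,7
geneB,0,0,0,0
geneC,5,12,8,1", as.is = TRUE)

# Move the gene names into the row names and keep only the count columns.
counts <- as.matrix(d[, -1])
rownames(counts) <- d[[1]]

# One group label per sample column (hypothetical design).
group <- factor(c("control", "control", "test", "test"))
stopifnot(is.numeric(counts), length(group) == ncol(counts))

# Build the DGEList only if edgeR is installed.
if (requireNamespace("edgeR", quietly = TRUE)) {
  dge <- edgeR::DGEList(counts = counts, group = group, remove.zeros = TRUE)
}
```

Reading the file with read.csv("path/xyz.csv", row.names=1) should achieve the same thing in one step, assuming the gene names are unique. Note also that the group vector needs as many entries as there are sample columns.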
It would be better to provide an answer rather than just a comment. Now I am getting another error: "Error in colSums(counts): "x" must be numeric"
@James W. MacDonald
I would disagree. You might want me to just answer your questions, but that is only better in a very narrow sense. The errors you are getting are completely self-explanatory, if you were to just read and think. Which is what I am trying to get you to do.
You have now got two(!) errors saying essentially the same thing - that your data need to be numeric. And instead of, like, checking to see if your data are numeric, you post a question here.
You will never get anywhere with R, nor Bioconductor if you aren't able to self-diagnose simple problems. So I will help by pointing out the obvious. R is telling you that the data - which the help page for DGEList clearly states should be a matrix of counts - are not numeric. Why is that? Did you look at the data? Are there columns that contain non-numeric values? Which ones? Did you read the help for read.csv? Do you know how to subset matrices (have you read 'An Introduction to R')?
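One way such a check could be done, as a rough sketch in base R (the small data frame here is a hypothetical stand-in for the imported file):

```r
# Hypothetical data frame standing in for the imported file.
d <- data.frame(X = c("geneA", "geneB"),
                S1 = c(10L, 0L),
                S2 = c(3L, 7L),
                stringsAsFactors = FALSE)

# Class of every column; anything other than integer or numeric
# cannot go into a count matrix.
col_classes <- sapply(d, class)
print(col_classes)

# Names of the columns that are not numeric.
non_numeric <- names(d)[!sapply(d, is.numeric)]
print(non_numeric)
```

Any column reported as non-numeric (here only the gene name column) has to be dropped or moved into the row names before the data are passed on as counts.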
Of course, these days everyone googles or tries to find the answer themselves first. I did check the data in different ways.
I saved the file in tab-delimited format (txt) and ran the command
> sapply(file.txt, class)
X B2_015 B2_016 B2_017 B2F_015 B2F_016 B2F_017 B3_003
"factor" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
B3_009 B3_010 B3F_003 B3F_009 B3F_010 C_005 C_008 C_012
"integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
The first column "X" is gene names.
Unfortunately, I am neither a statistician nor a bioinformatician.
Thanks