Hi,
I am using topTags to write DE genes to a text file. I have two datasets: with the same function call, one works fine, while the other returns NA rows for all non-significant accessions instead of filtering them out. Please see the two scenarios below:
Dataset 1
> summary(de <- decideTestsDGE(et, adjust.method="BH", p=0.05))
TuMV-Mock
Down 90
NotSig 18696
Up 87
> out <- topTags(et, n="Inf", adjust.method = "BH", sort.by = "PValue", p.value = 0.05)
> dim(out)
[1] 177 4
Dataset 2
> summary(de_Aphid <- decideTestsDGE(et_Aphid, adjust.method="BH", p=.05))
TuMV_Aphid-Mock_Aphid
Down 2573
NotSig 13432
Up 2400
>
> out_Aphid <- topTags(et_Aphid, n="Inf", adjust.method = "BH", sort.by = "PValue", p.value = 0.05)
> dim(out_Aphid)
[1] 18405 4
When I export it to a file, the end of the file looks like this (for the 13432 NotSig genes):
AT5G51140 -0.508549995231026 5.13973842760081 0.0134929828738446 0.0499373315489865
NA NA NA NA NA
NA.1 NA NA NA NA
NA.2 NA NA NA NA
NA.3 NA NA NA NA
What am I doing wrong here? :| Can someone help?
It's soo strange! :( This is the full code.
I don't see any problems with your code. Your code is very old-fashioned, like edgeR code from >10 years ago, but not wrong.
I do not see any way that the code you give could produce the output that you show in your question. The values you show even have "NA" for row names, and edgeR does not allow that in any circumstances.
The output you show does occur if you attempt to access rows of the top table that don't actually exist. That is an R phenomenon and is external to edgeR. For example:
The NAs are not part of the edgeR object, but are shown by R because I have attempted to access rows of the top table that don't exist. The top table itself has only 20 rows but I have attempted to access rows 21:24, so R adds in NAs to represent the non-existent rows.
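The same R behaviour can be reproduced without edgeR. The following is a minimal sketch in base R, using a made-up 20-row data frame (tab, with hypothetical gene names) as a stand-in for a top table. It is not the example from the original reply, only an illustration of how indexing past the last row produces NA rows.

```r
# Toy stand-in for a top table: 20 rows with made-up gene names.
tab <- data.frame(
  logFC  = seq(-1, 1, length.out = 20),
  PValue = seq(0.001, 0.02, length.out = 20),
  row.names = paste0("gene", 1:20)
)

# Rows 19:20 exist, rows 21:24 do not, so R pads the result with NA rows.
beyond <- tab[19:24, ]
print(beyond)

# The padded rows are given the row names "NA", "NA.1", "NA.2", "NA.3".
print(rownames(beyond))
```

The padded row names follow the same NA, NA.1, NA.2, ... pattern as the exported file shown in the question.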
The dim(out_Aphid) output that you show in your question also seems impossible. Since you set p.value=0.05 in the topTags call, out_Aphid should have only 4973 rows (2573 down plus 2400 up), not 18405 as shown in your output.
I have to ask what version of R and what version of edgeR you are using. The latest version of edgeR is 3.38.0.
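As a quick arithmetic check on the row counts, the numbers reported in the question can be compared in base R. This is only a sketch: the counts are copied from the decideTestsDGE summaries and dim() outputs in the question, and the variable names (counts, expected_rows, total_genes, reported_rows) are introduced here for illustration.

```r
# Counts copied from the decideTestsDGE summaries shown in the question.
counts <- rbind(
  dataset1 = c(Down = 90,   NotSig = 18696, Up = 87),
  dataset2 = c(Down = 2573, NotSig = 13432, Up = 2400)
)

# Rows expected from topTags with p.value = 0.05: significant genes only.
expected_rows <- counts[, "Down"] + counts[, "Up"]

# Total number of genes tested in each dataset.
total_genes <- rowSums(counts)

# Row counts reported by dim() in the question.
reported_rows <- c(dataset1 = 177, dataset2 = 18405)

print(cbind(expected_rows, total_genes, reported_rows))
```

For Dataset 1 the reported row count matches the number of significant genes. For Dataset 2 it matches the total number of genes instead, which suggests that no p-value filtering was applied to that table.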
Thank you for the reply, Gordon. I adopted code from Griffith Lab (https://rnabio.org/) given I am still very new to R. Any advice (or learning material) to improve it will be greatly appreciated. I'm away from the office for the weekend. On Monday morning, I'll post the R and edgeR version information.
edgeR comes with a 120 page pdf User's Guide, which would be the natural place to start. If you have installed edgeR, then you already have the User's Guide on your own computer. Just type edgeRUsersGuide().
Thank you. It is very comprehensive. I have already started using it. Your F1000 methods article "A guide to creating design matrices for gene expression experiments" is also extremely useful.
Hi Gordon Smyth, I was using edgeR version 3.36.0 and R version 4.1.3.