error: NA counts not permitted
1
0
Entering edit mode
clarisbaby • 0
@clarisbaby-14317
Last seen 6.6 years ago
hey everyone,
I am trying to run the following code for C. elegans paired end data using the following code:


library("biomartr", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.4")
library(BSgenome.Celegans.UCSC.ce11)
export(Celegans, "dataset/ce11.fasta", compress=FALSE)
getGFF(db = "ensembl", organism = "Caenorhabditis elegans", reference = TRUE, path = file.path("dataset"))
system("gunzip dataset/Caenorhabditis_elegans.WBcel235.91_ensembl.gff3.gz")
library(systemPipeR) 
args <- systemArgs(sysma="./param/tophat.param", mytargets="targetsPE.txt") 
system("bowtie2-build ./dataset/ce11.fasta ./dataset/ce11.fasta")
bampaths <- runCommandline(args=args)
read_statsDF <- alignStats(args)
write.table(read_statsDF, "results/alignStats.xls", row.names=FALSE, quote=FALSE, sep="\t")
txdb <- makeTxDbFromGFF(file="dataset/Caenorhabditis_elegans.WBcel235.91_ensembl.gff3.gz", format="gff")
saveDb(txdb, file="dataset/database.sqlite")
txdb <- loadDb("./dataset/database.sqlite")
eByg <- exonsBy(txdb, by=c("tx"))
countDFeByg <- assays(counteByg)$counts
countDFeByg <- as.data.frame(countDFeByg)
countDFeByg[is.na(countDFeByg)] <- 0
write.table(countDFeByg, "results/countDFeByg.xls", col.names=NA, quote=FALSE, sep="\t")
countDF <- read.table("results/countDFeByg.xls")
cmp <- readComp(args, format="matrix", delim="-")
edgeDF <- run_edgeR(countDF=countDF, targets=targetsin(args), cmp=cmp[[1]], independent=FALSE, mdsplot="")
write.table(edgeDF, "results/edgeRcomp.xls", quote=FALSE, sep="\t", col.names = NA)
edgeDF <- read.table("./results/edgeRcomp.xls")
DEG_list <- filterDEGs(degDF=edgeDF, filter, plot = FALSE)
while running the following code i am facing the error as below
     Error in calcNormFactors.default(object = object$counts, lib.size = object$samples$lib.size,  : 
     NA counts not permitted
     Called from: calcNormFactors.default(object = object$counts, lib.size = object$samples$lib.size,
     method = method, refColumn = refColumn, logratioTrim = logratioTrim, sumTrim = sumTrim, 
     doWeighting = doWeighting, Acutoff = Acutoff, p = p)

I have already tried doing 

countDFeByg[is.na(countDFeByg)] <- 0

which is said to remove the NA.

It would be very kind if someone may help me to solve the error or point out where I am going wrong

Thanks in advance.

Claris

 

systempiper edger rna-seq • 2.0k views
ADD COMMENT
0
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 6 hours ago
The city by the bay

Obviously, whatever matrix is being passed to calcNormFactors has NA counts. Removing NAs in countDFeByg won't help if the NAs are being introduced afterwards.

  • Are the NAs present in countDF? What does anyis.na(countDF)) say?
  • Do any of the libraries have zero counts? This will result in NAs if run_edgeR() computes CPMs.

Otherwise, it's something happening inside run_edgeR() (I assume this is where the error is being raised), and you'll have to do debug(run_edgeR) to see how the NAs are being introduced.

ADD COMMENT
0
Entering edit mode

anyNA(countDF) says false...

countDFeByg has all the values as zero. When i started looking around i found that it may be due to BAM and GFF files that use different chromosome names.

how can i solve that..

my gff file looks as below

##gff-version 3
##sequence-region   I 1 15072434
##sequence-region   II 1 15279421
##sequence-region   III 1 13783801
##sequence-region   IV 1 17493829
##sequence-region   MtDNA 1 13794
##sequence-region   V 1 20924180
##sequence-region   X 1 17718942
#!genome-build WormBase WBcel235

where as the alignment file looks as

@HD VN:1.0 SO:coordinate

@SQ SN:chrI LN:15072434

@SQ SN:chrII LN:15279421

@SQ SN:chrIII LN:13783801

@SQ SN:chrIV LN:17493829

@SQ SN:chrM LN:13794

@SQ SN:chrV LN:20924180

@SQ SN:chrX LN:17718942

@PG ID:TopHat VN:2.1.0

 

Thanking you in advance.

ADD REPLY
0
Entering edit mode

Well, obviously, that's going to be a problem. The solution is simple. Fix your GFF file so that the chromosome names have chr in front of them (or in the case of MtDNA, replaced with chrM). rtracklayer has a few commands for importing GFF files; you should be able to edit the sequence names, and then export it back to another file.

ADD REPLY

Login before adding your answer.

Traffic: 948 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6