cqn package : Error in if (any(lengths <= 0)) stop("argument 'lengths' need to be greater than zero") : missing value where TRUE/FALSE needed
Sally • 0
@e79b98fb

Hello,

I am new to R and to programming in general. I am currently working on mouse RNA-seq data. I quantified my dataset with salmon, used tximport to create a count matrix, and then ran DESeq2. I am trying to run the cqn package, but I keep getting errors. Here is my code:

txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion=TRUE)
ddsTxi <- DESeqDataSetFromTximport(txi, colData = samples, design = ~ condition)
dds <- DESeq(ddsTxi)

head(dds)
class: DESeqDataSet 
dim: 6 31 
metadata(1): version
assays(8): counts avgTxLength ... replaceCounts replaceCooks
rownames(6): ENSMUSG00000000001 ENSMUSG00000000003 ... ENSMUSG00000000037
  ENSMUSG00000000049
rowData names(23): baseMean baseVar ... maxCooks replace
colnames: NULL
colData names(3): SampleID condition replaceable

dds2[is.na(dds2)] <- 0  #I started to remove na from dds then got my gc content
countsdds2 <- counts(dds2)
GC_content <- getGeneLengthAndGCContent(rownames(countsdds2), "mm10", mode="org.db")
head(GC_content)
                   length        gc
ENSMUSG00000000001   3262 0.4421179
ENSMUSG00000000028   2252 0.5006543
ENSMUSG00000000031   2460 0.5560708
ENSMUSG00000000037   6397 0.4864495
ENSMUSG00000000049   1594 0.5017579
ENSMUSG00000000056   4806 0.4936730

mcols(dds2)$gc <-  GC_content[,2]
mcols(dds2)$len <-  GC_content[,1]
fit <- cqn(countsdds2, mcols(dds2)$gc, mcols(dds2)$len)

Error in if (any(lengths <= 0)) stop("argument 'lengths' need to be greater than zero") : 
  missing value where TRUE/FALSE needed

If I remove the NA values from gc and len, I get a different error, because the lengths and x arguments no longer have the same number of rows as counts.
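Both symptoms described above can be reproduced with toy data in base R. The sketch below is only an illustration: the gene names, values, and object names (`anno`, `counts`, `keep`) are made up and are not taken from the real dataset. It shows why an NA in the lengths makes the `if (any(lengths <= 0))` check fail, and how dropping incomplete genes from the annotation and the counts with one shared index keeps the row counts equal.

```r
# Toy annotation: geneB has no length/GC, e.g. because its ID failed to map.
anno <- cbind(length = c(3262, NA, 2252), gc = c(0.44, NA, 0.50))
rownames(anno) <- c("geneA", "geneB", "geneC")

# Toy count matrix with the same genes as rows and two samples as columns.
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(rownames(anno), c("s1", "s2")))

# any() returns NA when no element is TRUE but at least one is NA.
# Passing that NA to if() raises "missing value where TRUE/FALSE needed".
check <- any(anno[, "length"] <= 0)
is.na(check)

# Drop incomplete genes from the annotation AND the counts with one index,
# so both objects keep the same rows in the same order.
keep <- complete.cases(anno)
anno_ok <- anno[keep, , drop = FALSE]
counts_ok <- counts[keep, , drop = FALSE]
stopifnot(identical(rownames(anno_ok), rownames(counts_ok)))
```

Filtering only the annotation vectors, and not the count matrix, is what produces the second error about mismatched row numbers.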

If anyone can help me, I will be very grateful.

Thank you,

Normalization R cqn • 1.0k views
@james-w-macdonald-5106

If you have Ensembl IDs, you should use mode = "biomart" for getGeneLengthAndGCContent. The OrgDb packages are all natively based on NCBI Gene IDs, so you are (under the hood) mapping from Ensembl to NCBI Gene IDs. That mapping is not one-to-one, and many genes don't map at all. Using biomaRt, which queries an Ensembl-based database, eliminates that mapping issue.
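A minimal sketch of that suggestion follows. It assumes the EDASeq and cqn packages are installed and that Ensembl BioMart is reachable over the network. The wrapper name `fit_cqn_biomart` is hypothetical, and the organism code "mmu" (mouse) is an assumption about what biomart mode expects. Matching the annotation to the counts by row name and dropping any gene that still lacks a length or GC value are defensive additions, not part of the answer itself.

```r
# Hypothetical helper: fetch length/GC via BioMart, align to counts, run cqn.
# Requires Bioconductor packages EDASeq and cqn plus network access.
fit_cqn_biomart <- function(counts, org = "mmu") {
  # Query Ensembl directly, so Ensembl gene IDs need no NCBI mapping.
  anno <- EDASeq::getGeneLengthAndGCContent(rownames(counts),
                                            org = org, mode = "biomart")

  # Align annotation rows to the count matrix by gene ID (assumed safeguard);
  # IDs absent from the annotation become all-NA rows.
  anno <- anno[match(rownames(counts), rownames(anno)), , drop = FALSE]

  # Keep only genes with both a length and a GC value, using one index
  # for the counts and the annotation.
  keep <- complete.cases(anno)

  cqn::cqn(counts[keep, , drop = FALSE],
           x = anno[keep, "gc"],
           lengths = anno[keep, "length"])
}
```

With the objects from the question, the call would look like `fit <- fit_cqn_biomart(countsdds2)`.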


Thank you! It actually worked out!!
