Hi everyone,
I used DESeq2 for DE analysis of featureCounts data from RNAseq of 2 treatments (A & B) with 2 replicates each.
I get the same error as in the thread "Error in designAndArgChecker(object, betaPrior) : variables in the design formula cannot have NA values":
Error in designAndArgChecker(object, betaPrior) :
variables in the design formula cannot have NA values
But there is no NA in my count data!
Here are the code lines:
> cts = as.matrix(read.delim("countdata.txt", header=TRUE, sep="\t", row.names="Geneid"))
> head(cts)
sampl1 sample2 sample3 sample4
ENSG00000223972 0.62 4.92 6.01 5.63
ENSG00000227232 2592.98 2453.12 2891.61 2821.74
ENSG00000278267 53.61 44.98 13.35 12.96
ENSG00000243485 4.93 6.39 2.67 3.71
ENSG00000284332 0.39 0.62 0.12 0.37
ENSG00000237613 1.00 0.40 0.00 0.25
> colnames(cts)
[1] "sample1" "sample2" "sample3" "sample4"
> coldata = read.table("Design.txt", header=TRUE, sep="\t", row.names=1)
> head(coldata)
treatment replicate
sample1 A R1
sample2 B R1
sample3 A R2
sample4 B R2
> colnames(coldata)
[1] "treatment" "replicate"
> all(rownames(coldata) == colnames(cts))
[1] TRUE
> dds = DESeqDataSetFromMatrix(countData = round(cts), colData = coldata, design = ~ treatment)
converting counts to integer mode
> keep <- rowSums(counts(dds)) >= 10
> dds <- dds[keep,]
> dds$treatment <- factor(dds$treatment, levels = c("B", "A"))
> dds = DESeq(dds)
Error in designAndArgChecker(object, betaPrior) :
variables in the design formula cannot have NA values
This script works with another featureCounts data set that has a similar experimental design (2 treatments with 2 replicates), so I don't understand why it fails with this particular data set.
I'd really appreciate it if you could point out what the problem might be.
Thank you very much for your help!
Hi Michael,
Thanks a lot for your very quick reply.
I get this error when running the function you suggested:
> anyis.na(dds$treatment)
Error in anyis.na(dds$treatment) : could not find function "anyis.na"
I don't know why the markup swallowed the paren. It should be:
any( is.na( dds$treatment ) )
> any( is.na( dds$treatment ) )
[1] TRUE
This is my Design file:
Sample_ID treatment replicate
sample1 A R1
sample2 B R1
sample3 A R2
sample4 B R2
So something about how you are processing the file is giving you an NA before you run DESeq(), and DESeq() can’t handle NA in the covariates.
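For anyone landing here with the same error, here is a minimal base-R sketch of that check. It uses a stand-in data frame, not the poster's actual files: the `coldata` contents and the NA are made up for illustration. It reports which design variables and which samples carry an NA.

```r
# Stand-in sample table; the NA is introduced on purpose for illustration.
coldata <- data.frame(
  treatment = c("A", "B", NA, "B"),
  replicate = c("R1", "R1", "R2", "R2"),
  row.names = c("sample1", "sample2", "sample3", "sample4")
)

# Count NA values per column of the sample table.
na_per_variable <- colSums(is.na(coldata))

# Names of the samples that have an NA in any column.
samples_with_na <- rownames(coldata)[!complete.cases(coldata)]

print(na_per_variable)
print(samples_with_na)
```

On a real object, the same two lines could presumably be applied to `as.data.frame(colData(dds))` before calling DESeq().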
I figured it out. It was a typo in one sample name in the Design file.
Thank you very much again, Michael, and sorry for the false alarm!
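A small sketch of how a misspelled sample name can be spotted, and how it can turn into an NA when samples are looked up by name. The names below (including the misspelling `sampel1`) are hypothetical and are not taken from the actual Design file; this only illustrates one possible way such an NA arises.

```r
# Hypothetical names: count matrix columns and a design table with one typo.
count_samples <- c("sample1", "sample2", "sample3", "sample4")
coldata <- data.frame(
  treatment = c("A", "B", "A", "B"),
  row.names = c("sampel1", "sample2", "sample3", "sample4")  # typo in first name
)

# Names present in one source but not in the other.
missing_from_design <- setdiff(count_samples, rownames(coldata))
missing_from_counts <- setdiff(rownames(coldata), count_samples)

# Looking samples up by name yields NA for the name that has no match.
position  <- match(count_samples, rownames(coldata))
treatment <- coldata$treatment[position]

print(missing_from_design)
print(missing_from_counts)
print(anyNA(treatment))
```

Running both `setdiff()` calls on the real sample names before building the DESeqDataSet is a cheap way to catch this kind of mismatch early.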