Question

NA values in DESeq2 p-values (reopened)

Johan Largo • 0

@8e20af93

Last seen 3.4 years ago

Colombia

Hello, how are you? I am reopening this question because of the following:

I am doing a differential expression exercise using the HISAT2, StringTie and DESeq2 workflow. As the last step, I use the Python script prepDE.py, recommended in the StringTie manual, to extract the counts.

So far so good: I have a matrix with genes as rows and samples (controls and patients) as columns, holding the read counts. However, when I test for differential expression in DESeq2 with nbinomWaldTest, some p-values come back as NA. I read in the forums why these NA entries appear, and the explanation given is:

If within a row, all samples have zero counts, the baseMean column will be zero, and the log2 fold change estimates, p-value, and adjusted p-value will be set to NA.
If a row contains a sample with an extreme count outlier, the p-value and the adjusted p-value will be set to NA. These outliers are detected by Cook's distance.
If a row is filtered by independent automatic filtering, having a low mean normalized count, only the adjusted p-value will be set to NA.
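To see which of the three cases applies to each gene, the rules above can be sketched as a small base R check. This is only an illustration: `classify_na` is a hypothetical helper, not a DESeq2 function, and the `toy` table uses made-up values standing in for a real results table with `baseMean`, `pvalue` and `padj` columns.

```r
# Hypothetical helper: label each row with the likely reason for its NA,
# following the three rules quoted above.
classify_na <- function(res) {
  reason <- rep("none", nrow(res))
  # Only padj is missing: independent filtering (low mean normalized count)
  reason[!is.na(res$pvalue) & is.na(res$padj)] <- "independent filtering"
  # pvalue is missing although the gene has counts: Cook's distance outlier
  reason[is.na(res$pvalue) & res$baseMean > 0] <- "count outlier"
  # baseMean of zero: every sample has zero counts
  reason[res$baseMean == 0] <- "all-zero counts"
  reason
}

# Toy table standing in for a real results table; the values are made up
toy <- data.frame(
  baseMean = c(0, 850.2, 3.1, 412.7),
  pvalue   = c(NA, NA, 0.40, 0.001),
  padj     = c(NA, NA, NA, 0.01)
)
toy$reason <- classify_na(toy)
print(table(toy$reason))
```

On a real analysis the same function could be applied to `as.data.frame(res)`, assuming `res` comes from `results(dds)` with the default filters still switched on.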

The suggestion is to deactivate these filters as follows:

res <- results(dds, cooksCutoff = FALSE, independentFiltering = FALSE)

However, after doing so I still get entries with NA. I really don't know what I'm doing wrong, and I hope someone can help me.

Here is the script I used.

library("DESeq2")
setwd("C:/Users/ADMIN/Desktop/tvt/")
expression_data <- read.table("C:/Users/ADMIN/Desktop/tvt/gene_count_matrixv2.csv", row.names = "gene_id", header = TRUE, sep = ";", stringsAsFactors = FALSE)
expression_data$X <- NULL
dim(expression_data)
summary(expression_data)
apply(expression_data, 2, sum)
mx = apply( expression_data, 1, max )
expression_data = expression_data[ mx > 227, ]
condition <-factor(c("control","control","paciente","paciente","paciente","paciente","paciente","paciente","paciente","paciente","paciente","paciente"),c("control","paciente"))
col_data = data.frame(condition)
dds = DESeqDataSetFromMatrix(expression_data, col_data, ~condition)
dds = estimateSizeFactors(dds)
dds = nbinomWaldTest(dds)
dds <- DESeq(dds, minReplicatesForReplace=Inf)
res <- results(dds, cooksCutoff=FALSE, independentFiltering =FALSE)
res = results(dds)
head(res)
res$padj = ifelse(is.na(res$padj), 0.1, res$padj)

pvalue NA R DESeq2 • 1.8k views

Updated 3.4 years ago by ATpoint ★ 4.8k • written 3.4 years ago by Johan Largo • 0

score 0 · Answer 1 · 2021-12-01

Johan Largo • 0

I think my mistake is due to this ... (O.o)

dds = nbinomWaldTest(dds)
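For reference, a minimal sketch of how the script could be simplified, offered as an assumption-laden illustration rather than a confirmed fix. DESeq() already runs estimateSizeFactors, estimateDispersions and nbinomWaldTest in sequence, so the standalone nbinomWaldTest call (which expects dispersion estimates to be present) is not needed. Note also that in the original script the plain results(dds) call overwrites the res object that had both filters switched off. The simulated data from makeExampleDESeqDataSet stands in for the real count matrix.

```r
library("DESeq2")

# Simulated counts stand in for gene_count_matrixv2.csv (assumption: 12 samples)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# DESeq() runs estimateSizeFactors, estimateDispersions and nbinomWaldTest,
# so no separate nbinomWaldTest() call is made beforehand
dds <- DESeq(dds, minReplicatesForReplace = Inf)

# Keep this object; a later plain results(dds) would turn both filters back on
res <- results(dds, cooksCutoff = FALSE, independentFiltering = FALSE)

# With both filters off, NA p-values should remain only for all-zero genes
na_rows <- is.na(res$pvalue)
print(sum(na_rows))
print(all(res$baseMean[na_rows] == 0))
```

If NA values remain on real data after this, checking whether those genes have a baseMean of zero is a reasonable next step.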

Written 3.4 years ago by Johan Largo • 0