Hello
I'm analyzing some TCGA data (tumor vs. normal) with DESeq2, but some genes come back with extremely large log2 fold changes and p-values. For example:
gene | baseMean | log2FoldChange | lfcSE | stat | pvalue | padj
ENSG00000121691.4 | 278.024.793.462.181 | -147192720471788 | 0.170612654907734 | -862.730.379.240.558 | 6,28E-04 | 2,62E-02
ENSG00000250722.4 | 438.211.180.060.727 | -103006615299621 | 0.160160306303726 | -643.146.967.415.759 | 1,26E+04 | 2,05E+05
Is this OK, or did I do something wrong? I used HTSeq count data.
Here is how I performed the analysis.
The input data were constructed like this:
Count matrix: rows are gene IDs, columns are sample names.
Metadata: rows are the same sample names in the same order; columns are condition (Target or Control) and subject (1, 1, 2, 2, and so on, pairing the two samples from each patient).
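# Paired design: subject accounts for the patient, condition is the effect of interest
dds <- DESeqDataSetFromMatrix(countData = rawCountTable, colData = sampleInfo, design = ~ subject + condition)
# Drop genes with at most one read in total across all samples
dds <- dds[rowSums(counts(dds)) > 1, ]
# Make Control the reference level
dds$condition <- relevel(dds$condition, ref = "Control")
# Note: DESeq() estimates size factors and dispersions itself, so these two calls are redundant
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- DESeq(dds)
# log2 fold change of Target over Control, with an FDR cutoff of 0.05
res <- results(dds, contrast = c("condition", "Target", "Control"), alpha = 0.05)
res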
Hi, I didn't generate the counts myself; I downloaded the files from TCGA. They provide HTSeq counts and HTSeq FPKM for each patient, and I downloaded the HTSeq counts. I contacted GDC portal support and they said: "There is no further transformation after the HTSeq-Counts data are acquired."
Maybe you can figure out what's going on with a bit of exploration. Is it really the case that the gene you pasted above has a mean count of 278 trillion reads? Take a look at the counts for that gene across samples with counts(dds, normalized=TRUE).
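As a rough sketch of that kind of exploration: the snippet below uses a simulated dataset from makeExampleDESeqDataSet() as a stand-in for the real dds, so the object name dds_demo and the gene name "gene1" are assumptions for illustration only. With the real data you would substitute your own dds and a gene ID such as "ENSG00000121691.4".

```r
library(DESeq2)

# Simulated stand-in for the thread's `dds` (assumption: the real object
# would come from DESeqDataSetFromMatrix() as in the question).
set.seed(1)
dds_demo <- makeExampleDESeqDataSet(n = 200, m = 12)
dds_demo <- estimateSizeFactors(dds_demo)

# Normalized counts: raw counts divided by each sample's size factor.
norm_counts <- counts(dds_demo, normalized = TRUE)

# Look at one gene across all samples ("gene1" is a row name of the
# simulated data, not a gene from the thread).
gene_id <- "gene1"
gene_counts <- norm_counts[gene_id, ]
print(round(gene_counts, 1))

# baseMean is the mean of the normalized counts over all samples,
# so it can never be larger than the biggest per-sample value.
gene_mean <- mean(gene_counts)
cat("mean:", gene_mean, " max:", max(gene_counts), "\n")
```

If the per-sample values are in the hundreds or thousands, a baseMean in the trillions cannot have come from DESeq2 itself, which points to a problem downstream of the analysis.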
(Again showing just part of the data, but the mean was calculated with all of the data.)
TCGA-BC-A10Q-T TCGA-BC-A10Q-N TCGA-BC-A10R-T TCGA-BC-A10R-N
3584.647 50295.225 11458.843 53618.712
TCGA-BC-A10T-T TCGA-BC-A10T-N TCGA-BC-A10U-T TCGA-BC-A10U-N
52236.447 35884.749 6243.326 39861.797
TCGA-BC-A10W-T TCGA-BC-A10W-N TCGA-BC-A10X-T TCGA-BC-A10X-N
36771.806 6421.304 26461.536 51103.550
I exported the results as .csv, so maybe the problem is that Excel is reading the numbers wrong, and not my counts!
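One way to test that idea without leaving R is sketched below. It takes the two rows pasted in the question and assumes the decimal point belongs where the Wald relation stat = log2FoldChange / lfcSE implies it does (this placement is an assumption, not something confirmed in the thread). The names res_df, csv_path and back are made up for the example. It then writes the table with write.csv2(), which uses "," as the decimal mark and ";" as the field separator, the convention Excel expects in many non-English locales.

```r
# Two rows from the question, with the decimal point placed by assumption.
res_df <- data.frame(
  gene           = c("ENSG00000121691.4", "ENSG00000250722.4"),
  baseMean       = c(278.024793462181, 438.211180060727),
  log2FoldChange = c(-1.47192720471788, -1.03006615299621),
  lfcSE          = c(0.170612654907734, 0.160160306303726),
  stat           = c(-8.62730379240558, -6.43146967415759)
)

# Consistency check: for the Wald test, stat equals log2FoldChange / lfcSE.
wald_ok <- all(abs(res_df$log2FoldChange / res_df$lfcSE - res_df$stat) < 1e-6)
print(wald_ok)

# write.csv() always writes "." decimals and "," separators;
# write.csv2() writes "," decimals and ";" separators instead.
csv_path <- tempfile(fileext = ".csv")
write.csv2(res_df, csv_path, row.names = FALSE)

# Read the file back with the matching reader to confirm a clean round trip.
back <- read.csv2(csv_path)
round_trip_ok <- isTRUE(all.equal(back$baseMean, res_df$baseMean))
print(round_trip_ok)
```

If the values look sensible when printed in R but not in the spreadsheet, the spreadsheet's locale settings are the likely culprit rather than the counts or the DESeq2 run.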
How about that?
Ok so everything is now solved?