Hanneke van Deutekom
Hi,
My experiment consists of control and diseased samples, looking like
this:
colData <- DataFrame(
  condition = factor(c("A","A","B","B","C","C","D","D",
                       "E","E","F","F","H","H","I","I")),
  type = factor(c("plasma","plasma","plasma","plasma",
                  "plasma","plasma","plasma","plasma",
                  "tissue","tissue","tissue","tissue",
                  "tissue","tissue","tissue","tissue")),
  diseased = factor(c("no","no","no","no","yes","yes","yes","yes",
                      "no","no","no","no","yes","yes","yes","yes")))
To search for differential expression I compare
AvsB, AvsC, BvsD and CvsD,
then
EvsF, EvsH, FvsI and HvsI.
So I use the results(dds,contrast=c("condition","H","I")) option in
DESeq2.
These are the results I do not understand;
look at this example:
geneid  comparison  log2foldchange    padj                countA1  countA2  countC1  countC2
xxx     AC          2.40588491829641  0.281844992240883   0        0        0        0
yyy     AB          3.98976903585264  0.0455960023534269  0        0        0        0
(counts are the normalized values)
So a log2 fold change of 2.4 has been found between A and C, and of
almost 4 in AvsB, while the counts of all of these samples are zero.
I noticed that in all cases where the counts and the normalized counts
of the two conditions that I tested were all zero, I do get a fold
change, even up to (log2) 5. Often I see a padj of "NA". So I was
satisfied with that, as pseudocounts could result in some kind of fold
change. Leaving those with padj "NA" out would be good then.
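The check described above (all counts zero in both compared conditions) can be sketched in base R. This is only an illustration: the matrix `norm_counts` holds made-up toy values, and the helper `all_zero_in_pair` is a hypothetical name, not part of DESeq2. With a real analysis the matrix would presumably come from something like counts(dds, normalized=TRUE).

```r
# Toy normalized counts (hypothetical values); columns named like the samples
norm_counts <- rbind(
  xxx = c(A1 = 0, A2 = 0, B1 = 3, B2 = 1, C1 = 0, C2 = 0),
  yyy = c(A1 = 0, A2 = 0, B1 = 0, B2 = 0, C1 = 5, C2 = 2)
)

# Hypothetical helper: TRUE for genes whose counts are zero in every
# sample of both compared groups
all_zero_in_pair <- function(counts, group_a, group_b) {
  cols <- c(group_a, group_b)
  rowSums(counts[, cols, drop = FALSE] != 0) == 0
}

flag_AB <- all_zero_in_pair(norm_counts, c("A1", "A2"), c("B1", "B2"))
flag_AC <- all_zero_in_pair(norm_counts, c("A1", "A2"), c("C1", "C2"))
```

Genes flagged this way for a given comparison could then be set aside before looking at the fold changes and padj values of that comparison.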
But as in the example above, I sometimes get a padj, which is even
significant in the AB example above. This probably has something to do
with all the other values (A1, A2, B1, B2, ..., I1, I2):
yyy 0 0 0 0 0 0 0 11 4897697 4167458 4337956 4139125 3731919 9711412 4043504 3796036
(i.e. massively present in tissue samples but not in blood; because of
the discrepancies between blood and tissue I am not comparing them
directly)
I would like to understand why there is a significance here between A
and B.
Or should I 'just' split my data into tissue and plasma? But then I
will still find the AC example above, as B and D have counts in that
example, however small (the E-I counts are also small).
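For reference, splitting the samples by type could look roughly like the sketch below. It uses a plain data.frame instead of DataFrame so that it runs in base R, and `counts` is a hypothetical toy matrix standing in for the real count data.

```r
# Sample table with the same layout as above, as a base R data.frame
colData <- data.frame(
  condition = factor(rep(c("A","B","C","D","E","F","H","I"), each = 2)),
  type      = factor(rep(c("plasma", "tissue"), each = 8)),
  diseased  = factor(rep(rep(c("no", "yes"), each = 4), times = 2))
)

# Hypothetical toy count matrix: 3 genes x 16 samples (A1, A2, ..., I1, I2)
set.seed(1)
counts <- matrix(rpois(3 * 16, lambda = 10), nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"),
                                 paste0(colData$condition, 1:2)))

# Split counts and sample table by type; droplevels() removes the
# condition levels that no longer occur in each subset
is_plasma      <- colData$type == "plasma"
plasma_counts  <- counts[, is_plasma, drop = FALSE]
plasma_colData <- droplevels(colData[is_plasma, ])
tissue_counts  <- counts[, !is_plasma, drop = FALSE]
tissue_colData <- droplevels(colData[!is_plasma, ])
```

Each subset would then be analysed on its own, for example by building one DESeqDataSet per type with design ~ condition.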
Hanneke
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] DESeq2_1.2.2 RcppArmadillo_0.3.910.0 Rcpp_0.10.4
[4] GenomicRanges_1.14.3 XVector_0.2.0 IRanges_1.20.4
[7] BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] annotate_1.40.0 AnnotationDbi_1.24.0 Biobase_2.22.0
[4] DBI_0.2-7 genefilter_1.44.0 grid_3.0.2
[7] lattice_0.20-23 locfit_1.5-9.1 RColorBrewer_1.0-5
[10] RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2
[13] survival_2.37-4 XML_3.98-1.1 xtable_1.7-1