Appropriate selection of DE list in limma before and after implementation of arrayWeights in conjunction with duplicateCorrelation
svlachavas ▴ 830
@svlachavas-7225
Germany/Heidelberg/German Cancer Resear…

Dear Bioconductor Community,

After re-evaluating and discussing my previous analyses for one of my current projects with my lab coordinators (https://support.bioconductor.org/p/71730/#71863), I decided to go back and perform some additional requested statistical comparisons on my ExpressionSet. Besides the first general paired comparison of cancer versus adjacent control samples, and the anatomic tumor location comparison (which I also performed with the essential feedback and help of Gordon, Aaron and other members of the group), I proceeded to perform two "separate" analyses: one comparing only the primary cancer samples with their adjacent controls, and one comparing only the "metastatic" samples (primary colorectal cancers that also had synchronous metastases) with their respective controls. The goal is to look for common functional enrichment modules and overlapping genes at the end of the analysis. Because Meta_factor is defined per patient (i.e. the value 1 is assigned to both the "metastatic" sample and the control sample of a patient, so it is a between-subject factor), whereas Disease is a within-subject factor, I used duplicateCorrelation to be able to perform the comparisons below:

               Disease  Meta_factor
St_1_WL57.CEL  Normal   0
St_2_WL57.CEL  Cancer   0
St_N_EC59.CEL  Normal   0
St_T_EC59.CEL  Cancer   0
St_N_EJ58.CEL  Normal   0
St_T_EJ58.CEL  Cancer   0
.....

(The above is just an illustration of the phenotype object.)

In my first approach, I continued with:

> condition <- factor(eset.2$Disease, levels=c("Normal","Cancer"))
> pairs <- factor(rep(1:30, each = 2))
> metastatic <- factor(eset.2$Meta_factor)
> f <- paste(condition, metastatic, sep=".")
> f <- factor(f)
> design1 <- model.matrix(~0 +f)
> colnames(design1) <- levels(f)
> dupcor <- duplicateCorrelation(eset.2, design1, block=pairs)
> fit <- lmFit(eset.2, design1, block=pairs, correlation=dupcor$consensus)
> cm <- makeContrasts(Meta_Cancer=Cancer.1-Normal.1 , Cancer= Cancer.0-Normal.0, levels=design1)
> colnames(design1)
[1] "Cancer.0" "Cancer.1" "Normal.0" "Normal.1"
> fit2 <- contrasts.fit(fit, cm)
> fit3 <- eBayes(fit2, trend=TRUE)...

Inspecting the number of DE genes from the comparisons above (with two cutoffs), I get 1133 genes for the "non-metastatic" comparison, whereas for the "metastatic" comparison I get no DE genes (no adjusted p-value below 0.05).
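For reference, here is a minimal sketch of how such per-contrast DE counts could be tabulated. It runs on simulated data rather than on eset.2, and the two cutoffs (adjusted p-value < 0.05 and |logFC| > 1) are assumptions, since the exact thresholds are not stated above. The helper name count_de and the 6-patient layout are hypothetical.

```r
library(limma)

set.seed(1)

# Simulated stand-in for eset.2 (hypothetical): 500 genes, 6 patients x 2 samples
n_genes    <- 500
patient    <- factor(rep(1:6, each = 2))
condition  <- factor(rep(c("Normal", "Cancer"), times = 6),
                     levels = c("Normal", "Cancer"))
metastatic <- factor(rep(c(0, 0, 0, 0, 1, 1), each = 2))

y <- matrix(rnorm(n_genes * 12), nrow = n_genes,
            dimnames = list(paste0("gene", seq_len(n_genes)),
                            paste0("array", 1:12)))
# Add a per-gene patient effect so that paired samples are correlated
pe <- matrix(rnorm(n_genes * 6, sd = 0.5), nrow = n_genes)
y  <- y + pe[, as.integer(patient)]
# Spike in a cancer effect for the first 20 genes
y[1:20, condition == "Cancer"] <- y[1:20, condition == "Cancer"] + 4

f <- factor(paste(condition, metastatic, sep = "."))
design <- model.matrix(~ 0 + f)
colnames(design) <- levels(f)

dupcor <- duplicateCorrelation(y, design, block = patient)
fit <- lmFit(y, design, block = patient,
             correlation = dupcor$consensus.correlation)
cm <- makeContrasts(Meta_Cancer = Cancer.1 - Normal.1,
                    Cancer      = Cancer.0 - Normal.0,
                    levels = design)
fit2 <- eBayes(contrasts.fit(fit, cm), trend = TRUE)

# Count the genes passing both (assumed) cutoffs for one contrast
count_de <- function(fit, coef, p = 0.05, lfc = 1) {
  nrow(topTable(fit, coef = coef, number = Inf,
                adjust.method = "BH", p.value = p, lfc = lfc))
}
n_de <- sapply(colnames(cm), function(k) count_de(fit2, k))
print(n_de)
```

Counting each contrast separately with topTable applies the BH adjustment per contrast, which matches the default "separate" method of decideTests.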

I then ran arrayWeights (as my samples are tissue specimens), which indeed shows a noticeable variation in array quality (plot linked below), so I combined arrayWeights with duplicateCorrelation:

https://www.dropbox.com/s/yr0zzvebqe7s2nm/arrayWeights_new_design.jpeg?dl=0

> aw <- arrayWeights(eset.2, design1)
> w <- asMatrixWeights(aw, dim(eset.2))
> dupcor <- duplicateCorrelation(eset.2, design1, block=pairs, weights=w)
> fit <- lmFit(eset.2, design1, block=pairs, correlation=dupcor$consensus, weights=w)
> cm <- makeContrasts(Meta_Cancer=Cancer.1-Normal.1 , Cancer= Cancer.0-Normal.0, levels=design1)......

With this approach, my metastatic comparison returns 861 DE genes, and the number of DE genes for the non-metastatic comparison also increases, to 1506.
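To illustrate what the arrayWeights step does, here is a small sketch on simulated data (not eset.2) in which one array is made deliberately noisy. The group layout and the choice of array 2 as the noisy one are assumptions made only for this illustration.

```r
library(limma)

set.seed(2)

# Hypothetical layout: 6 patients x 2 samples, same four groups as in design1
n_genes    <- 500
condition  <- factor(rep(c("Normal", "Cancer"), times = 6),
                     levels = c("Normal", "Cancer"))
metastatic <- factor(rep(c(0, 0, 0, 0, 1, 1), each = 2))
f <- factor(paste(condition, metastatic, sep = "."))
design <- model.matrix(~ 0 + f)
colnames(design) <- levels(f)

y <- matrix(rnorm(n_genes * 12), nrow = n_genes)
# Make array 2 (a Cancer.0 sample) much noisier than the others
y[, 2] <- y[, 2] + rnorm(n_genes, sd = 4)

# One relative quality weight per array; lower means noisier
aw <- arrayWeights(y, design)
print(round(aw, 2))

# Expand to a genes x arrays matrix, as passed to duplicateCorrelation and lmFit
w <- asMatrixWeights(aw, dim(y))
print(dim(w))
```

In this sketch the noisy array sits in a group with four arrays. In a group with only two arrays the residuals of both arrays mirror each other, so the weights cannot tell which of the two is the noisy one.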

I can "naively" assume that, possibly because of the smaller number of samples in the "metastatic" comparison (6 vs 6, versus 24 vs 24 in the non-metastatic one) and for various other reasons (e.g. sample quality), arrayWeights is especially beneficial in the metastatic context. For the other comparison, however, I already obtained DE genes before using arrayWeights. So, should I keep my second implementation, with arrayWeights applied to both comparisons, rather than build a separate contrast matrix with arrayWeights only for the metastatic contrast? Even the "moderate" increase from 1133 to 1506 might include genes that are interesting for the pathophysiology of my system and that could not be detected before using arrayWeights.
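One way to check which genes are actually gained (or lost) with arrayWeights is to compare the two DE lists directly. The sketch below uses base R only; the probe IDs are made-up placeholders and compare_de is a hypothetical helper, not a limma function.

```r
# Hypothetical probe IDs standing in for the two DE lists of one contrast
de_unweighted <- c("g1", "g2", "g3", "g4")
de_weighted   <- c("g2", "g3", "g4", "g5", "g6")

# Split the two lists into shared, newly gained and lost genes
compare_de <- function(before, after) {
  list(shared = intersect(before, after),
       gained = setdiff(after, before),
       lost   = setdiff(before, after))
}

cmp <- compare_de(de_unweighted, de_weighted)
print(lengths(cmp))
```

A larger list after weighting is not necessarily a superset of the earlier one, so the "lost" set is worth inspecting as well as the "gained" set.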

Thank you for your consideration on this matter!

Any opinions on this subject?

Efstathios

limma microarray multifactorial design duplicatecorrelation arrayweights

Any ideas or suggestions for this specific topic?
