Question

Extreme weighting values using arrayWeights

0

Entering edit mode

james perkins ▴ 300

@james-perkins-2675

Last seen 10.2 years ago

Hi all, I am comparing 2 phenotypes, with 4 biological replicates for each phenotype, using standard affy rat arrays (rat2302 chip). The difference between phenotypes is quite low and the chips do not cluster well using hierarchical clustering. When I compare the phenotypes using limma I get high fdr values. However when I use array weights I get fdr values orders of magnitude lower (although generally a similar order of genes) in my toptable genelist output. I am however a little sceptical, mainly because the array weights seem quite extreme in terms of how different they are. The code is as follows: ############################ ################# # with weighting expM <- exprs(eset_RMA) ts <- c(rep('Sham',4),rep('VZV',4)) ts <- factor(ts,levels=c('Sham','VZV')) design <- model.matrix(~0+ts) colnames(design) <- levels(ts) arrayw = arrayWeights(eset_RMA,design=design) cat("Unfiltered arrayweights:",arrayw,"\n") fit <- lmFit(expM,design,weights=arrayw) cont.matrix <- makeContrasts( ShamvsVZV = Sham-VZV, levels=design) fit2 <- contrasts.fit(fit, cont.matrix) fit2 <- eBayes(fit2) genelist_weight <- topTable(fit2, coef = 1 , number = featureNumber, adjust="fdr") write.table(genelist_weight,file='RMA_weight_Res.txt',sep="\t", quote=FALSE, row.names=FALSE) write.table(genelist_weight$ID,file='onlyID-Res.txt',sep="\t", quote=FALSE, row.names=FALSE, col.names=FALSE) ################### # Without weighting expM <- exprs(eset_RMA) ts <- c(rep('Sham',4),rep('VZV',4)) ts <- factor(ts,levels=c('Sham','VZV')) design <- model.matrix(~0+ts) colnames(design) <- levels(ts) fit <- lmFit(expM,design) cont.matrix <- makeContrasts( ShamvsVZV = Sham-VZV, levels=design) fit2 <- contrasts.fit(fit, cont.matrix) fit2 <- eBayes(fit2) genelist <- topTable(fit2, coef = 1 , number = featureNumber, adjust="fdr") write.table(genelist,file='RMA_Res.txt',sep="\t", quote=FALSE, row.names=FALSE) write.table(genelist$ID,file='onlyID-Res.txt',sep="\t", quote=FALSE, row.names=FALSE, col.names=FALSE) #################### and the array weights are: 0.5663663 1.011729 1.043341 3.517868 0.5113972 0.3253149 6.825461 0.418734 so for one phenotype the results are: 0.5113972 0.3253149 6.825461 0.418734 Intuitively it seems strange that one array is much more precise than the other 3, which are way below <1. This is leading to my skepticism about the results... it seems almost too good to be true. It also makes me wonder how many discarded arrays and whole experiments may have benefitted enormously from array weights in the past. My question is simply... what are your thoughts on such extreme values? Does it sound like I've done something wrong in my code? Or perhaps I should stop complaining and use the weightings. A small example of my toptable genelists are: With weights: ID logFC AveExpr t P.Value adj.P.Val B 1389589_at 0.77457288247666 7.74045139913265 11.0892719288942 4.34668043426807e-07 0.00503577435866138 6.5 94696641953 1369415_at 0.93679260752035 6.76922751086505 10.8782990394042 5.23040979395345e-07 0.00503577435866138 6.4 4372245012463 1368885_at 0.649670459155076 7.92097004624013 10.5687769995562 6.90073898637616e-07 0.00503577435866138 6.2 1525281704988 1368914_at -0.729385569512816 7.87901231082336 -10.4928754755850 7.39376095726585e-07 0.00503577435866138 6.1 5792530874207 1387998_at 0.898658658686192 6.99457560109567 10.3291078255848 8.59321711388731e-07 0.00503577435866138 6.0 3244169393093 1397317_at 1.08780675280058 8.233543987132 10.0026932390152 1.16657933646706e-06 0.00503577435866138 5.774832301 45392 1373771_at 0.86174024105482 6.15057655743681 9.87485379533585 1.31792335557147e-06 0.00503577435866138 5.6 7113815136158 1392736_at 1.02143967541956 7.2540657002723 9.70827346419623 1.548011169664e-06 0.00503577435866138 5.533584205 92699 1370097_a_at 0.694423466771426 7.01711652256997 9.67148255458559 1.60450494629841e-06 0.00503577435866138 5.5 0282664866487 Without weights: ID logFC AveExpr t P.Value adj.P.Val B 1389589_at 0.763976292801905 7.74045139913265 7.21599644492725 3.96753405748173e-05 0.244892618065609 1.8 8355105071826 1368914_at -0.768154900805746 7.87901231082336 -7.198918687719 4.04430256120508e-05 0.244892618065609 1.870778910 19229 1392736_at 1.00243181553505 7.2540657002723 7.07052168431582 4.67604561943007e-05 0.244892618065609 1.773433205 30607 1390561_at 0.585940649960035 9.77772749089806 7.00089565509309 5.06295373942562e-05 0.244892618065609 1.7 1965842262294 1373416_at 0.748380638598739 7.37833535955067 6.91865859283585 5.56541633075852e-05 0.244892618065609 1.6 5523467357811 1376731_at 0.660349895702276 6.8931169032963 6.69369593102449 7.23913965899781e-05 0.244892618065609 1.473859334 09197 1380128_at 0.669565213058863 8.81473490828032 6.69004109956142 7.27049388279457e-05 0.244892618065609 1.4 7084936117864 1370097_a_at 0.754022396779006 7.01711652256997 6.59976057461934 8.0938008803152e-05 0.244892618065609 1.3 9584414168989 1390398_at 0.761298085052086 6.79062341144723 6.50221189827325 9.09874101750845e-05 0.244892618065609 1.3 1337309014455 1368030_at 1.05560381977689 6.19823184930056 6.45542913940574 9.62809381553364e-05 0.244892618065609 1.2 7328857900046 Any thoughts would be much appreciated. Kindest regards, James Perkins

Clustering affy limma Clustering affy limma • 1.2k views

ADD COMMENT • link updated 16.1 years ago by Matt Ritchie ▴ 50 • written 16.1 years ago by james perkins ▴ 300

score 0 · Answer 1 · 2008-10-24

Dear James, The code in your email looks fine to me. The improved results you observe (i.e. more power to detect differential expression) is also a good indicator that you have indeed down-weighted the less reliable arrays in your weighted analysis (if things went the other way I would be a bit worried). Best wishes, Matt >Hi all, > >I am comparing 2 phenotypes, with 4 biological replicates for each >phenotype, using standard affy rat arrays (rat2302 chip). The difference >between phenotypes is quite low and the chips do not cluster well using >hierarchical clustering. > >When I compare the phenotypes using limma I get high fdr values. However >when I use array weights I get fdr values orders of magnitude lower >(although generally a similar order of genes) in my toptable genelist >output. I am however a little sceptical, mainly because the array >weights seem quite extreme in terms of how different they are. > >The code is as follows: > >############################ > >################# ># with weighting > >expM <- exprs(eset_RMA) >ts <- c(rep('Sham',4),rep('VZV',4)) >ts <- factor(ts,levels=c('Sham','VZV')) >design <- model.matrix(~0+ts) >colnames(design) <- levels(ts) > >arrayw = arrayWeights(eset_RMA,design=design) >cat("Unfiltered arrayweights:",arrayw,"\n") > >fit <- lmFit(expM,design,weights=arrayw) >cont.matrix <- makeContrasts( > ShamvsVZV = Sham-VZV, > levels=design) >fit2 <- contrasts.fit(fit, cont.matrix) >fit2 <- eBayes(fit2) >genelist_weight <- topTable(fit2, coef = 1 , number = featureNumber, >adjust="fdr") > >write.table(genelist_weight,file='RMA_weight_Res.txt',sep="\t", >quote=FALSE, row.names=FALSE) >write.table(genelist_weight$ID,file='onlyID-Res.txt',sep="\t", >quote=FALSE, row.names=FALSE, col.names=FALSE) > >################### ># Without weighting > >expM <- exprs(eset_RMA) >ts <- c(rep('Sham',4),rep('VZV',4)) >ts <- factor(ts,levels=c('Sham','VZV')) >design <- model.matrix(~0+ts) >colnames(design) <- levels(ts) >fit <- lmFit(expM,design) >cont.matrix <- makeContrasts( > ShamvsVZV = Sham-VZV, > levels=design) >fit2 <- contrasts.fit(fit, cont.matrix) >fit2 <- eBayes(fit2) >genelist <- topTable(fit2, coef = 1 , number = featureNumber, adjust="fdr") > >write.table(genelist,file='RMA_Res.txt',sep="\t", quote=FALSE, >row.names=FALSE) >write.table(genelist$ID,file='onlyID-Res.txt',sep="\t", quote=FALSE, >row.names=FALSE, col.names=FALSE) > >#################### > >and the array weights are: >0.5663663 1.011729 1.043341 3.517868 0.5113972 0.3253149 6.825461 0.418734 > >so for one phenotype the results are: 0.5113972 0.3253149 6.825461 0.418734 > >Intuitively it seems strange that one array is much more precise than >the other 3, which are way below <1. > >This is leading to my skepticism about the results... it seems almost >too good to be true. It also makes me wonder how many discarded arrays >and whole experiments may have benefitted enormously from array weights >in the past. > >My question is simply... what are your thoughts on such extreme values? >Does it sound like I've done something wrong in my code? Or perhaps I >should stop complaining and use the weightings. > > >A small example of my toptable genelists are: > >With weights: > >ID logFC AveExpr t P.Value adj.P.Val B >1389589_at 0.77457288247666 7.74045139913265 >11.0892719288942 4.34668043426807e-07 0.00503577435866138 6.5 >94696641953 >1369415_at 0.93679260752035 6.76922751086505 >10.8782990394042 5.23040979395345e-07 0.00503577435866138 6.4 >4372245012463 >1368885_at 0.649670459155076 7.92097004624013 >10.5687769995562 6.90073898637616e-07 0.00503577435866138 6.2 >1525281704988 >1368914_at -0.729385569512816 7.87901231082336 >-10.4928754755850 7.39376095726585e-07 0.00503577435866138 6.1 >5792530874207 >1387998_at 0.898658658686192 6.99457560109567 >10.3291078255848 8.59321711388731e-07 0.00503577435866138 6.0 >3244169393093 >1397317_at 1.08780675280058 8.233543987132 >10.0026932390152 1.16657933646706e-06 0.00503577435866138 >5.774832301 >45392 >1373771_at 0.86174024105482 6.15057655743681 >9.87485379533585 1.31792335557147e-06 0.00503577435866138 5.6 >7113815136158 >1392736_at 1.02143967541956 7.2540657002723 >9.70827346419623 1.548011169664e-06 0.00503577435866138 >5.533584205 >92699 >1370097_a_at 0.694423466771426 7.01711652256997 >9.67148255458559 1.60450494629841e-06 0.00503577435866138 5.5 >0282664866487 > >Without weights: > >ID logFC AveExpr t P.Value adj.P.Val B >1389589_at 0.763976292801905 7.74045139913265 >7.21599644492725 3.96753405748173e-05 0.244892618065609 1.8 >8355105071826 >1368914_at -0.768154900805746 7.87901231082336 >-7.198918687719 4.04430256120508e-05 0.244892618065609 1.870778910 >19229 >1392736_at 1.00243181553505 7.2540657002723 >7.07052168431582 4.67604561943007e-05 0.244892618065609 >1.773433205 >30607 >1390561_at 0.585940649960035 9.77772749089806 >7.00089565509309 5.06295373942562e-05 0.244892618065609 1.7 >1965842262294 >1373416_at 0.748380638598739 7.37833535955067 >6.91865859283585 5.56541633075852e-05 0.244892618065609 1.6 >5523467357811 >1376731_at 0.660349895702276 6.8931169032963 >6.69369593102449 7.23913965899781e-05 0.244892618065609 >1.473859334 >09197 >1380128_at 0.669565213058863 8.81473490828032 >6.69004109956142 7.27049388279457e-05 0.244892618065609 1.4 >7084936117864 >1370097_a_at 0.754022396779006 7.01711652256997 >6.59976057461934 8.0938008803152e-05 0.244892618065609 1.3 >9584414168989 >1390398_at 0.761298085052086 6.79062341144723 >6.50221189827325 9.09874101750845e-05 0.244892618065609 1.3 >1337309014455 >1368030_at 1.05560381977689 6.19823184930056 >6.45542913940574 9.62809381553364e-05 0.244892618065609 1.2 >7328857900046 > > >Any thoughts would be much appreciated. > >Kindest regards, > >James Perkins