XPS: new quality control features (e.g. residual images)
0
0
Entering edit mode
cstrato ★ 3.9k
@cstrato-908
Last seen 6.2 years ago
Austria
Dear all, Let me first thank Dan Tenenbaum and Herve Page for their continuous support of the special requirements of xps. It is their credit that the daily multiple platform build/check works so seamlessly. Now I am pleased to announce that the new release of "xps" does finally support many quality control features, asked for by many users for quite some time: These quality control features are based on the new S4 class "QualTreeSet", which adds quality control features for the raw intensities, background-corrected and normalized intensities, as well as to the summarization. Concretely, the following QC methods are new: - borderplot(): boxplot of positive and negative border elements of arrays. - coiplot(): "Center of Intensity" plots of positive and negative border elements. - corplot(): a heat map of the array-array Spearman rank correlation coefficients. - madplot(): an array-array expression level distance plot (heat map). - nuseplot(): NUSE boxplots for raw data, background adjusted and normalized intensities. - pcaplot(): a PCA plot of the first two principle components of expression levels or their correlations. - rleplot(): RLE boxplots for raw, background adjusted and normalized expression levels. - image(): a heat map of residual images (both residuals and weights). - YES, you can draw residual images for Exon ST arrays, Gene ST arrays and plate arrays! - plotAffyRNAdeg(): Affymetrix RNA degradation plots. - YES, you can draw RNA degradation plots for Exon ST arrays, Gene ST arrays and plate arrays! - Both, residual images and RNA degradation plots can be created for the raw intensities, background corrected intensities and normalized intensities (see parameter 'qualopt'). Furthermore, improvements in the handling of data, especially of large datasets, have been made in the C++ code: - treeInfo(): many tree data such as min/max intensities, quantiles, percent P/M/A calls are now preprocessed during creation of the root trees and are stored in the trees as tree info ('userinfo'); this information allows to draw boxplots, callplots, etc much faster, and is accessible by calling method treeInfo(). The following QC methods have been improved: - boxplot(): it is now possible to draw boxplots of raw data w/o the need to attachInten() first. - callplot(): barplots of percent P/M/A calls now access the preprocessd tree info. - image(): it is now possible to draw images of intensities w/o the need to attachInten() first. The following functions are improved replacements of similar existing functions: - plotBoxplot(): allows to easily save boxplots as png, jpeg, etc files (will replace function boxplot.dev()). - plotImage(): allows to easily save one or more images as png, jpeg, etc files (will replace function image.dev()). The following QC methods have not been changed: - hist(): density plots of raw data and expression levels still depend on a non-empty slot 'data', thus for large datasets it is still necessary to use function root.density(). - mvaplot(): scatter plot of M vs A values. - pmplot(): barplot of PM and MM intensities still depends on a non-empty slot 'data'. Please note that S4 class "QualTreeSet" is still work in progress, there are still missing features, missing documentation, and the current features are not tested extensively. For this reason there may be still some hidden problems/bugs. If you experience any problems, please report them as follows: - run R from the console, i.e. RTerm, xterm - set "verbose=TRUE" in all functions - send me the complete output of the console - add your sessionInfo() - add the ROOT version that you use (should be root_v5.27.04) As an example you can find below some demo code for the HuGene ST array. Best regards Christian _._._._._._._._._._._._._._._._._._ C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a e.m.a.i.l: cstrato at aon.at _._._._._._._._._._._._._._._._._._ #--------------------------------------------------------------------- ---------# # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Import raw data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### new R session: load library xps_1.11.12 library(xps) ### define directories: scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes/na31" celdir <- "/Volumes/GigaDrive/ChipData/Exon/HuGene" datdir <- getwd() ### import raw data (e.g. Affymetrix human tissue dataset for HuGene ST arrays) scheme.gene <- root.scheme(file.path(scmdir, "hugene10stv1.root")) celfiles <- c("TisMap_Breast_01_v1_WTGene1.CEL","TisMap_Breast_02_v1_WTGene1.CEL", "TisMap_Breast_03_v1_WTGene1.CEL","TisMap_Prostate_01_v1_WTGene1.CEL", "TisMap_Prostate_02_v1_WTGene1.CEL","TisMap_Prostate_03_v1_WTGene1.CEL ") celnames <- c("Breast01","Breast02","Breast03","Prostate01","Prostate02","Prostate 03") data.gene <- import.data(scheme.gene, "TisMapData", filedir=datdir, celdir=celdir, celfiles=celfiles, celnames=celnames) str(data.gene) ### show data stored in "userinfo" of ROOT trees, see "?treeInfo" info <- treeInfo(data.gene, treetype="cel", varlist="*") ### boxplot of raw data (taking advantage of preprocessed quantiles in "userinfo") boxplot(data.gene, which="userinfo:fIntenQuant") ### draw whiskers at 10%, 90% and show min/max values as outliers boxplot(data.gene, which="userinfo:fIntenQuant", range=1.5) ### images of raw data (no longer necessary to attachInten()) names <- unlist(treeNames(data.gene)) image(data.gene, names=names[4], add.legend=TRUE) ### save image as png-file outfile <- paste("Image_", names[4], sep="") plotImage(data.gene, type="intensity", names=names[4], dev="png", outfile=outfile) ### density plot outfile <- paste("DensityPlot_", names[4], sep="") root.density(data.gene, canvasname=outfile, save.as="png") ### the following plots still require attachInten() data.gene <- attachInten(data.gene) # density plot hist(data.gene, which="core") # PM/MM plot pmplot(data.gene) data.gene <- removeInten(data.gene) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Quality control # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### new R session library(xps) ### qualification - RLM rlm.all <- fitRLM(data.gene, "TisMapRLM_allcore", filedir=datdir, tmpdir="", background="antigenomic", normalize=TRUE, qualopt="all", option="transcript", exonlevel="core+affx", add.data=FALSE) ### show data stored in "userinfo" of ROOT trees info <- treeInfo(rlm.all, treetype="rlm", varlist="*") ### boxplot of positive/negative border elements borderplot(rlm.all, transfo=NULL) borderplot(rlm.all, type="pos", transfo=NULL) borderplot(rlm.all, type="neg", transfo=NULL, ylim=c(0,1000)) ### Center-of-Intensity plots for positive/negative border elements coiplot(rlm.all) coiplot(rlm.all, type="pos", ) coiplot(rlm.all, type="neg") ### NUSE plot nuseplot(rlm.all, names="namepart") # for raw data only (usual case) nuseplot(rlm.all, names="namepart:raw") ### RLE plot rleplot(rlm.all, names="namepart") # for normalized data only (usual case) rleplot(rlm.all, names="namepart:normalized") ### residual images of raw data resname <- getTreeNames(rootFile(rlm.all), treetype="res") image(rlm.all, type="resids", names=resname[4], add.legend=TRUE) image(rlm.all, type="pos.resids", names=resname[4]) image(rlm.all, type="neg.resids", names=resname[4]) image(rlm.all, type="sign.resids", names=resname[4]) image(rlm.all, type="weights", names=resname[4], add.legend=TRUE) ### save residual image as png-file outfile <- paste("ResidualImage_", resname[4], sep="") plotImage(rlm.all, type="weights", names=resname[4], dev="png", outfile=outfile) ### residual images of background adjusted data image(rlm.all, type="weights", qualopt="adjusted", names=resname[10], add.legend=TRUE) ### residual images of normalized data image(rlm.all, type="weights", qualopt="normalized", names=resname[16], add.legend=TRUE) ### get RNA degradation rnadeg <- AffyRNAdeg(rlm.all) result <- summaryAffyRNAdeg(rnadeg) ### plot RNA degradation plotAffyRNAdeg(rnadeg, add.legend=TRUE) ### plot slope of RNA degradation plotAffyRNAdeg(rnadeg, summary=TRUE) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Preprocessing: RMA # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ### new R session library(xps) ### RMA data.rma <- rma(data.gene, "TisMapRMAcore", filedir=datdir, tmpdir="", background="antigenomic", normalize=TRUE, option="transcript", exonlevel="core+affx") ### show data stored in "userinfo" of ROOT trees info <- treeInfo(data.rma, treetype="mdp", varlist="*") ### boxplot of normalized expression levels boxplot(data.rma, which="userinfo:fLevelQuant") ### density plot hist(data.rma) ### RLE plot rleplot(data.rma) mboxplot(data.rma, ylim=c(-3,3)) ### M vs A plots mvaplot(data.rma, pch=20, ylim=c(-4,4)) ### Detection Above Background Call (DABG) call.dabg <- dabg.call(data.gene,"TisMapDABGcore",filedir=datdir, exonlevel="core+affx") # note: alpha1 and alpha2 need to be adjusted to get usable P/M/A calls ### show data stored in "userinfo" of ROOT trees info <- treeInfo(call.dabg, treetype="dab", varlist="*") ### detection call plots callplot(call.dabg) callplot(call.dabg, beside=FALSE) #--------------------------------------------------------------------- #
xps xps • 1.1k views
ADD COMMENT

Login before adding your answer.

Traffic: 821 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6