Entering edit mode
Guest User
★
13k
@guest-user-4897
Last seen 10.3 years ago
Dear List & Gordon,
Firstly thanks to you and your team for the wonderful limma package
and its well dcoumented manual.
I have used it earlier for analysing both one colored and two colored
data. But with one of the datasets, I get the following warning with
lmFit command in limma package.
> fit <- lmFit(gset38932, design38932)
Warning message:
Partial NA coefficients for 28837 probe(s)
Background of experiment:
It is a two color experiment with common reference.
Paired samples (cancerous and adjacent noncancerous tissues were
obtained from 12 patients)
I have followed the instructions under section 9.4.1 "Paired Samples"
of limma manual (last revised on 17 june 2014) and also goggled for
this warning but I am unable to understand it. My another concern is
that if I ignore the warning and proceed with limma analysis, I get
only 84 differentially expressed genes (genes with adj.P.Value <
0.05), which appears too small a number for a microarray experiment.
Complete code:
> library(Biobase)
> library(GEOquery)
> library(limma)
> gset38932 <- getGEO("GSE38932", GSEMatrix =TRUE)
> GPLid <- levels((gset38932[[1]])$platform_id)
> if (length(gset38932) > 1)
+ {
+ #idx <- grep("GPL5936", attr(gset38932, "names"))
+ idx <- grep(GPLid, attr(gset38932, "names"))
+ } else
+ {
+ idx <- 1
+ }
> gset38932 <- gset38932[[idx]]
> ConventionalMicroarrayData <-
which(pData(gset38932)$characteristics_ch1.10 == "strategy:
conventional microarray")
> ## remove data with "Indirect strategy." and keep ones with "Direct
strategy."
> gset38932 <- gset38932[,ConventionalMicroarrayData]
> # log2 transform
> ex38932 <- exprs(gset38932)
> qx <- as.numeric(quantile(ex38932, c(0., 0.25, 0.5, 0.75, 0.99,
1.0), na.rm=T))
> LogC <- (qx[5] > 100) ||
+ (qx[6]-qx[1] > 50 && qx[2] > 0) ||
+ (qx[2] > 0 && qx[2] < 1 && qx[4] > 1 && qx[4] < 2)
> if (LogC) { ex38932[which(ex38932 <= 0)] <- NaN
+ exprs(gset38932) <- log2(ex38932) }
> PatientInfo <- pData(gset38932)[,c(2,10:20,27)]
> ## remove anything that is not a digit
> PatientID <- as.numeric(sub(pattern="\D+", replacement="",
x=PatientInfo$characteristics_ch1.9,ignore.case=T))
> TissueType <- sub(pattern="tissue type: ", replacement="",
x=PatientInfo$characteristics_ch1.8,ignore.case=T)
> TissueType <- make.names(TissueType)
> Block <- factor(PatientID)
> Treatment <-
factor(TissueType,levels=c("adjacent.non.tumor","tumor")) ## order of
levels is important
> design38932 <- model.matrix(~Block+Treatment)
> fit <- lmFit(gset38932, design38932)
Warning message:
Partial NA coefficients for 28837 probe(s)
> fit2 <- eBayes(fit, proportion=0.01) #proportion: assumed proportion
of genes which are differentially expressed
>
> ## to get the no.of genes in each array
> nrow38932 <- nrow(ex38932)
> LastColumn <- dim(design38932)[2]
> tT <- topTable(fit2, coef=colnames(design38932)[LastColumn],
adjust="fdr", sort.by="B",number=nrow38932)
> dim(subset(tT,adj.P.Val<0.05,))
[1] 84 15
-- output of sessionInfo():
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods base
other attached packages:
[1] limma_3.20.8 GEOquery_2.30.1 Biobase_2.24.0
BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] RCurl_1.95-4.3 tools_3.1.0 XML_3.98-1.1
--
Sent via the guest posting facility at bioconductor.org.