Dear list members,
I have a very naiv question and I think it is really simple - but I
didn't solve the problem :-(
I have 79 Affymetrix chips and performed justRMA() first.
>data
Expression Set (exprSet) with
54675 genes
79 samples
phenoData object with 3 variables and 79 cases
varLabels
cyto: read from file
vivo: read from file
vitro: read from file
Now, I want to filter out those genes showing no significant
expression
change (decrease or increase). I used the genefilter function:
ff1<-ttest(data.frame(data),.001,na.rm=TRUE)
ff2<-filterfun(ff1)
wh2<-genefilter(exprs(eset), ff2)
Unfortunately, I've got the following message: "Error: cannot allocate
vector of size 168723 Kb"
Do I have a chance to filter out my genes that way? Or is it necessary
to split the expression set? Any hint?
Thanks a lot in advance,
Heike
--
Dr. Heike Pospisil
Center for Bioinformatics, University of Hamburg
Bundesstrasse 43, 20146 Hamburg, Germany
phone: +49-40-42838-7303 fax: +49-40-42838-7312
I think justRMA() uses nearly all the memory you have access to, so it
it only able to handle small computations afterwards. What I would
suggest is try saving the exprSet and exit. Then start from a fresh R
session and do your analysis from that. See below.
On Fri, 2005-02-11 at 10:06 +0100, Heike Pospisil wrote:
> Dear list members,
>
> I have a very naiv question and I think it is really simple - but I
> didn't solve the problem :-(
>
> I have 79 Affymetrix chips and performed justRMA() first.
>
> >data
> Expression Set (exprSet) with
> 54675 genes
> 79 samples
> phenoData object with 3 variables and 79 cases
> varLabels
> cyto: read from file
> vivo: read from file
> vitro: read from file
To save the object and exit without saving workspace image, type in
save(data, file="mydata.rda", compress=TRUE)
quit("no")
Start R and type in
load("mydata.rda")
and proceed with your analysis.
> Now, I want to filter out those genes showing no significant
expression
> change (decrease or increase). I used the genefilter function:
>
> ff1<-ttest(data.frame(data),.001,na.rm=TRUE)
> ff2<-filterfun(ff1)
> wh2<-genefilter(exprs(eset), ff2)
>
> Unfortunately, I've got the following message: "Error: cannot
allocate
> vector of size 168723 Kb"
>
> Do I have a chance to filter out my genes that way? Or is it
necessary
> to split the expression set? Any hint?
>
> Thanks a lot in advance,
> Heike
>
Hello Adaikalavan
>I think justRMA() uses nearly all the memory you have access to, so
it
>it only able to handle small computations afterwards. What I would
>suggest is try saving the exprSet and exit. Then start from a fresh R
>session and do your analysis from that. See below.
>
>
Thanks for your suggestion. Saving and loading the exprSet work and
help. But, unfortunately, my filter function do not work.
ff1<-ttest(data,.001,na.rm=TRUE)
ff2<-filterfun(ff1)
wh2<-genefilter(exprs(data), ff2)
No idea :-(
Best wishes.
Heike
--
Dr. Heike Pospisil
Center for Bioinformatics, University of Hamburg
Bundesstrasse 43, 20146 Hamburg, Germany
phone: +49-40-42838-7303 fax: +49-40-42838-7312
Hi,
You really need to give us more details,
"my xxx did not work", does not let anyone give you good advice.
As someone just said on the R help list,
"Imagine that the rest of us are not sitting looking at your
computer
screen" :-)
Specific errors/warnings, and the commands issued (cut and paste, not
what you think you did, but what you did)
Robert
On Feb 11, 2005, at 5:32 AM, Heike Pospisil wrote:
> Hello Adaikalavan
>
>> I think justRMA() uses nearly all the memory you have access to, so
it
>> it only able to handle small computations afterwards. What I would
>> suggest is try saving the exprSet and exit. Then start from a fresh
R
>> session and do your analysis from that. See below.
>>
>
> Thanks for your suggestion. Saving and loading the exprSet work and
> help. But, unfortunately, my filter function do not work.
>
> ff1<-ttest(data,.001,na.rm=TRUE)
> ff2<-filterfun(ff1)
> wh2<-genefilter(exprs(data), ff2)
>
> No idea :-(
>
> Best wishes.
> Heike
>
> --
> Dr. Heike Pospisil
> Center for Bioinformatics, University of Hamburg
> Bundesstrasse 43, 20146 Hamburg, Germany
> phone: +49-40-42838-7303 fax: +49-40-42838-7312
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
>
+---------------------------------------------------------------------
--
----------------+
| Robert Gentleman phone: (206) 667-7700
|
| Head, Program in Computational Biology fax: (206) 667-1319 |
| Division of Public Health Sciences office: M2-B865
|
| Fred Hutchinson Cancer Research Center
|
| email: rgentlem@fhcrc.org
|
+---------------------------------------------------------------------
--
----------------+
Heike Pospisil wrote:
> Hello Adaikalavan
>
>> I think justRMA() uses nearly all the memory you have access to, so
it
>> it only able to handle small computations afterwards. What I would
>> suggest is try saving the exprSet and exit. Then start from a fresh
R
>> session and do your analysis from that. See below.
>>
>>
>
> Thanks for your suggestion. Saving and loading the exprSet work and
> help. But, unfortunately, my filter function do not work.
>
> ff1<-ttest(data,.001,na.rm=TRUE)
> ff2<-filterfun(ff1)
> wh2<-genefilter(exprs(data), ff2)
>
> No idea :-(
>
> Best wishes.
> Heike
>
I think you are setting up ff1 incorrectly. As an example, let's say
that your exprSet contains 10 samples, the first 5 are e.g.,
experimental, and the second 5 are control. Then you would set up ff1
like this:
ff1 <- ttest(5, 0.001, na.rm = TRUE)
-or-
cl <- c(rep(1,5), rep(2,5))
ff1 <- ttest(cl, 0.001, na.rm = TRUE)
The second method can be used if the samples are not contiguous (e.g.,
they are ordered exp, cont, exp, cont, etc).
cl <- c(rep(c(1,2), 5)
ff1 <- ttest(cl, 0.001, na.rm = TRUE)
HTH,
Jim
--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
I never used genefilter and filterfun so I would not be able to advice
on this and hope the suggestions below solves your problem.
On a personal note, I just calculate and store the p-values/statistics
directly. This is more efficient as
* I can generate various lists of "differentially expressed" genes at
different p-value cutoffs. This is often required by the biologists
who
might want a small and confident subset for biological validation and
maybe a bigger subset for computation validation (e.g. pathway
analysis)
* Rank genes by p-values
* Adjust p-values for multiple hypothesis
Here is one way how you can do this
mat <- matrix( rnorm(100000), nc=100 )
rownames(
g <- rep(1:2, each=50) # e.g. 50 normal and 50 tumour
stats <- t( apply( mat, 1, function(z) {
x <- z[ which( g==1 ) ]
y <- z[ which( g==2 ) ]
t.p <- t.test(x, y)$p.value
w.p <- wilcox.test(x, y)$p.value
fc <- mean(x, na.rm=T) - mean(y, na.rm=T)
return( c(t.pval=t.p, wilcox.pval=w.p, fold.change=fc)
)
}))
On Fri, 2005-02-11 at 10:08 -0500, James W. MacDonald wrote:
> Heike Pospisil wrote:
> > Hello Adaikalavan
> >
> >> I think justRMA() uses nearly all the memory you have access to,
so it
> >> it only able to handle small computations afterwards. What I
would
> >> suggest is try saving the exprSet and exit. Then start from a
fresh R
> >> session and do your analysis from that. See below.
> >>
> >>
> >
> > Thanks for your suggestion. Saving and loading the exprSet work
and
> > help. But, unfortunately, my filter function do not work.
> >
> > ff1<-ttest(data,.001,na.rm=TRUE)
> > ff2<-filterfun(ff1)
> > wh2<-genefilter(exprs(data), ff2)
> >
> > No idea :-(
> >
> > Best wishes.
> > Heike
> >
> I think you are setting up ff1 incorrectly. As an example, let's say
> that your exprSet contains 10 samples, the first 5 are e.g.,
> experimental, and the second 5 are control. Then you would set up
ff1
> like this:
>
> ff1 <- ttest(5, 0.001, na.rm = TRUE)
>
> -or-
>
> cl <- c(rep(1,5), rep(2,5))
> ff1 <- ttest(cl, 0.001, na.rm = TRUE)
>
> The second method can be used if the samples are not contiguous
(e.g.,
> they are ordered exp, cont, exp, cont, etc).
>
> cl <- c(rep(c(1,2), 5)
> ff1 <- ttest(cl, 0.001, na.rm = TRUE)
>
> HTH,
>
> Jim
>
>
>
[[Ignore the previous mail. I hit the send button too soon]]
I never used genefilter and filterfun so I would not be able to advice
on this and hope the suggestions below solves your problem. On a
personal note, I just calculate and store the p-values/statistics
directly. This may be more efficient for the following reasons
1) to generate various lists of interesting genes at different p-value
cutoffs. This is often required by the biologists who might want a
high
confidence subset for biological validation and maybe a broader subset
for computation validation (e.g. pathway analysis)
2) to rank genes by p-values
3) to adjust p-values for multiple hypothesis
Here is one way how you can do this :
mat <- matrix( rnorm(100000, sd=5), nc=100 )
rownames(mat) <- paste("g", 1:1000, sep="")
# replace 'mat' with your exprs(data)
g <- rep(1:2, each=50)
# Class information e.g. 50 normal and 50 tumour
# again replace this with your own groups
stats <- t( apply( mat, 1, function(z) {
x <- z[ which( g==1 ) ]
y <- z[ which( g==2 ) ]
t.p <- t.test(x, y)$p.value
w.p <- wilcox.test(x, y)$p.value
fc <- mean(x, na.rm=T) - mean(y, na.rm=T)
return( c(t.pval=t.p, wilcox.pval=w.p, fold.change=fc)
)
}))
# You can modify the above to include further tests etc.
Hopefully you can get something compact like the following (Note :
your
results will vary due to random number generation).
t.pval wilcox.pval fold.change
g1 0.2376890 0.2655573 1.0214440
g2 0.1513874 0.2931174 -1.2895703
g3 0.4788188 0.5014898 -0.7349789
g4 0.2021780 0.1302305 1.3201382
g5 0.2537569 0.2655573 -1.1256882
g6 0.5881588 0.7020112 -0.5907285
.. ......... ......... ..........
Now, you can generate various lists such as
list1 <- names( which( stats[ , "t.pval"] < 0.05 ) )
list2 <- names( which( stats[ , "fold.change"] > 1 ) )
intersect( list1, list2 )
I guess this is probably a matter of taste. Hope this helps.
Regards, Adai
On Fri, 2005-02-11 at 10:08 -0500, James W. MacDonald wrote:
> Heike Pospisil wrote:
> > Hello Adaikalavan
> >
> >> I think justRMA() uses nearly all the memory you have access to,
so it
> >> it only able to handle small computations afterwards. What I
would
> >> suggest is try saving the exprSet and exit. Then start from a
fresh R
> >> session and do your analysis from that. See below.
> >>
> >>
> >
> > Thanks for your suggestion. Saving and loading the exprSet work
and
> > help. But, unfortunately, my filter function do not work.
> >
> > ff1<-ttest(data,.001,na.rm=TRUE)
> > ff2<-filterfun(ff1)
> > wh2<-genefilter(exprs(data), ff2)
> >
> > No idea :-(
> >
> > Best wishes.
> > Heike
> >
> I think you are setting up ff1 incorrectly. As an example, let's say
> that your exprSet contains 10 samples, the first 5 are e.g.,
> experimental, and the second 5 are control. Then you would set up
ff1
> like this:
>
> ff1 <- ttest(5, 0.001, na.rm = TRUE)
>
> -or-
>
> cl <- c(rep(1,5), rep(2,5))
> ff1 <- ttest(cl, 0.001, na.rm = TRUE)
>
> The second method can be used if the samples are not contiguous
(e.g.,
> they are ordered exp, cont, exp, cont, etc).
>
> cl <- c(rep(c(1,2), 5)
> ff1 <- ttest(cl, 0.001, na.rm = TRUE)
>
> HTH,
>
> Jim
>
>
>
Hello Adai,
thanks a lot for your help. It works very well!
And once again, sorry for the (first) uncertain information I gave.
Best wishes,
Heike
Adaikalavan Ramasamy wrote:
>[[Ignore the previous mail. I hit the send button too soon]]
>
>I never used genefilter and filterfun so I would not be able to
advice
>on this and hope the suggestions below solves your problem. On a
>personal note, I just calculate and store the p-values/statistics
>directly. This may be more efficient for the following reasons
>
>1) to generate various lists of interesting genes at different
p-value
>cutoffs. This is often required by the biologists who might want a
high
>confidence subset for biological validation and maybe a broader
subset
>for computation validation (e.g. pathway analysis)
>
>2) to rank genes by p-values
>
>3) to adjust p-values for multiple hypothesis
>
>
>Here is one way how you can do this :
>
>mat <- matrix( rnorm(100000, sd=5), nc=100 )
>rownames(mat) <- paste("g", 1:1000, sep="")
># replace 'mat' with your exprs(data)
>
>g <- rep(1:2, each=50)
># Class information e.g. 50 normal and 50 tumour
># again replace this with your own groups
>
>stats <- t( apply( mat, 1, function(z) {
> x <- z[ which( g==1 ) ]
> y <- z[ which( g==2 ) ]
>
> t.p <- t.test(x, y)$p.value
> w.p <- wilcox.test(x, y)$p.value
> fc <- mean(x, na.rm=T) - mean(y, na.rm=T)
> return( c(t.pval=t.p, wilcox.pval=w.p,
fold.change=fc) )
> }))
># You can modify the above to include further tests etc.
>
>
>Hopefully you can get something compact like the following (Note :
your
>results will vary due to random number generation).
>
> t.pval wilcox.pval fold.change
>g1 0.2376890 0.2655573 1.0214440
>g2 0.1513874 0.2931174 -1.2895703
>g3 0.4788188 0.5014898 -0.7349789
>g4 0.2021780 0.1302305 1.3201382
>g5 0.2537569 0.2655573 -1.1256882
>g6 0.5881588 0.7020112 -0.5907285
>.. ......... ......... ..........
>
>
>Now, you can generate various lists such as
>
>list1 <- names( which( stats[ , "t.pval"] < 0.05 ) )
>list2 <- names( which( stats[ , "fold.change"] > 1 ) )
>intersect( list1, list2 )
>
>I guess this is probably a matter of taste. Hope this helps.
>
>Regards, Adai
>
>
>
>On Fri, 2005-02-11 at 10:08 -0500, James W. MacDonald wrote:
>
>
>>Heike Pospisil wrote:
>>
>>
>>>Hello Adaikalavan
>>>
>>>
>>>
>>>>I think justRMA() uses nearly all the memory you have access to,
so it
>>>>it only able to handle small computations afterwards. What I would
>>>>suggest is try saving the exprSet and exit. Then start from a
fresh R
>>>>session and do your analysis from that. See below.
>>>>
>>>>
>>>>
>>>>
>>>Thanks for your suggestion. Saving and loading the exprSet work and
>>>help. But, unfortunately, my filter function do not work.
>>>
>>>ff1<-ttest(data,.001,na.rm=TRUE)
>>>ff2<-filterfun(ff1)
>>>wh2<-genefilter(exprs(data), ff2)
>>>
>>>No idea :-(
>>>
>>>Best wishes.
>>>Heike
>>>
>>>
>>>
>>I think you are setting up ff1 incorrectly. As an example, let's say
>>that your exprSet contains 10 samples, the first 5 are e.g.,
>>experimental, and the second 5 are control. Then you would set up
ff1
>>like this:
>>
>>ff1 <- ttest(5, 0.001, na.rm = TRUE)
>>
>>-or-
>>
>>cl <- c(rep(1,5), rep(2,5))
>>ff1 <- ttest(cl, 0.001, na.rm = TRUE)
>>
>>The second method can be used if the samples are not contiguous
(e.g.,
>>they are ordered exp, cont, exp, cont, etc).
>>
>>cl <- c(rep(c(1,2), 5)
>>ff1 <- ttest(cl, 0.001, na.rm = TRUE)
>>
>>HTH,
>>
>>Jim
>>
>>
>>
>>
>>
>
>
>
>
--
Dr. Heike Pospisil
Center for Bioinformatics, University of Hamburg
Bundesstrasse 43, 20146 Hamburg, Germany
phone: +49-40-42838-7303 fax: +49-40-42838-7312