ExpressionSet Time-series correlation stuff
@forst-christian-5709
Last seen 10.4 years ago
Is there an easier way to do time-series correlation between genes of an ExpressionSet other than using for-loops and cor()? Especially if I want to play with the particular time-series? And I am not really happy with the packages I found so far: bioDist, qpgraph, qvalue.

I have:

    # es: an ExpressionSet
    ts <- c("t1", "t2", "t3", "t4", "t5")   # some time series from es (out of many)

    sp <- matrix(nrow=10, ncol=10)
    for(i in 1:10) {
      sp[i,i] <- 1.
      for(j in i:10) {
        sp[i,j] <- cor(as.vector(exprs(es[i,ts])), as.vector(exprs(es[j,ts])), method="spearman")
        sp[j,i] <- sp[i,j]
      }
    }

And I actually want to do this for all the 40000 genes in es and not 10 as given in the example.

Thanks - Chris
bioDist bioDist • 1.5k views
@james-w-macdonald-5106
Last seen 9 hours ago
United States
Hi Christian,

On 1/22/2013 4:01 PM, Forst, Christian wrote:
> Is there an easier way to do time-series correlation between genes of an ExpressionSet other than using for-loops and cor()? especially if I want to play with the particular time-series?
> [...]
> And I actually want to do this for all the 40000 genes in es and not 10 as given in the example.

If you are just trying to compute the correlation matrix then you are doing things the hard way. Note from ?cor

    cor(x, y = NULL, use = "everything",
        method = c("pearson", "kendall", "spearman"))

    Arguments:

    x: a numeric vector, matrix or data frame.

So you can just use

    sp <- cor(es[,ts])

HOWEVA, this may be slow and may well require more RAM than you have if you are doing all 40K genes (which might be sort of silly - you will have high correlations between genes that never change at any time point; is that interesting?).

There is a faster version of cor() implemented in the WGCNA package that is designed for these larger scale computations.

Best,

Jim

> Thanks - Chris
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
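To put the RAM concern in numbers, here is a minimal sketch. It assumes a dense double-precision result and uses a small random matrix as a stand-in for the expression values (genes in rows, time points in columns); the helper `cor_matrix_gib` and the variability cutoff of 0.1 are hypothetical choices made for illustration, not something proposed in the thread.

```r
# Stand-in for the expression matrix: 200 genes (rows) x 5 time points (columns).
set.seed(1)
n_genes <- 200
x <- matrix(rnorm(n_genes * 5), nrow = n_genes,
            dimnames = list(paste0("g", seq_len(n_genes)), paste0("t", 1:5)))
x[1:20, ] <- 7   # make the first 20 genes completely flat over time

# Size in GiB of a dense n x n matrix of doubles (8 bytes per entry).
# Hypothetical helper, for illustration only.
cor_matrix_gib <- function(n) n^2 * 8 / 2^30
cor_matrix_gib(40000)   # about 11.9 GiB for 40000 genes

# Drop genes that barely change over time before correlating.
# The cutoff 0.1 is an arbitrary illustrative value.
gene_sd <- apply(x, 1, sd)
keep <- gene_sd > 0.1
x_filtered <- x[keep, , drop = FALSE]
```

Filtering first shrinks the result quadratically: halving the number of genes cuts the size of the correlation matrix to a quarter.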
Thanks but it doesn't really do what I want. First I still need to do

    sp <- cor(exprs(es[,ts]))

instead of

    sp <- cor(es[,ts])

Otherwise I get

    Error in cor(es[,ts]) :
      supply both 'x' and 'y' or a matrix-like 'x'

Then, sp is a square matrix over ts and not over the time-correlated genes which I need.

--------------------------------------
And what I really want is doing time-forward/backward correlation. Can this be done elegantly with cor()? Or would I have to go back to my for-loops?

Chris

________________________________________
From: James W. MacDonald [jmacdon@uw.edu]
Sent: Tuesday, January 22, 2013 16:21
To: Forst, Christian
Cc: bioconductor at r-project.org
Subject: Re: [BioC] ExpressionSet Time-series correlation stuff
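The "square matrix over ts" result follows from the fact that cor() correlates the columns of its input, and in an expression matrix the columns are the samples. A minimal sketch of the gene-by-gene version is below; it assumes a plain matrix `x` standing in for `exprs(es)` (genes in rows, time points in columns), and the names `x` and `sp_loop` are made up for this example.

```r
# Stand-in for exprs(es): 10 genes (rows) x 5 time points (columns).
set.seed(42)
ts <- c("t1", "t2", "t3", "t4", "t5")
x <- matrix(rnorm(10 * 5), nrow = 10,
            dimnames = list(paste0("gene", 1:10), ts))

# cor() works column-wise, so transpose to put genes in columns.
# With a real ExpressionSet this would be cor(t(exprs(es)[, ts]), ...).
sp <- cor(t(x[, ts]), method = "spearman")

# The double for-loop from the original question, for comparison.
sp_loop <- matrix(nrow = 10, ncol = 10)
for (i in 1:10) {
  for (j in i:10) {
    sp_loop[i, j] <- cor(x[i, ts], x[j, ts], method = "spearman")
    sp_loop[j, i] <- sp_loop[i, j]
  }
}

# Both approaches give the same 10 x 10 gene-by-gene matrix.
all.equal(unname(sp), sp_loop)
```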
On 1/22/2013 4:42 PM, Forst, Christian wrote:
> Thanks but it doesn't really do what I want.
> [...]
> And what I really want is doing time-forward/backward correlation. Can this be done elegantly with cor()? Or would I have to go back to my for-loops?

I don't know. You gave an example (the for-loop in your first message) that is apparently not what you are trying to do, as that results in a square matrix and will give the same results as simply using cor(). So evidently you are trying to do something else, and I am not sure what that might be.

Maybe you could use the fact that cor() will accept two matrices of compatible dimensions? You can easily reorder columns of a matrix to do whatever you want.

--
James W. MacDonald, M.S.
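The two-matrix form of cor() can be sketched as follows. The thread never defines "time-forward/backward correlation", so this example assumes one possible reading: a lag of one time point, where gene i at t1..t4 is correlated with gene j at t2..t5. The matrix `x` is a stand-in for `exprs(es)`, and the names `earlier`, `later`, `forward` and `backward` are made up for this illustration.

```r
# Stand-in for exprs(es): 6 genes (rows) x 5 time points (columns).
set.seed(7)
x <- matrix(rnorm(6 * 5), nrow = 6,
            dimnames = list(paste0("gene", 1:6), paste0("t", 1:5)))

earlier <- c("t1", "t2", "t3", "t4")
later   <- c("t2", "t3", "t4", "t5")

# cor(x, y) correlates each column of x with each column of y, so both
# matrices are transposed to put genes in columns.
# forward[i, j]: gene i at the earlier times vs gene j one step later.
forward <- cor(t(x[, earlier]), t(x[, later]), method = "spearman")

# The backward direction is the transpose of the same matrix.
backward <- t(forward)

# Spot check against a direct pairwise call.
forward["gene1", "gene2"]
cor(x["gene1", earlier], x["gene2", later], method = "spearman")
```

Unlike the ordinary correlation matrix, `forward` is generally not symmetric, which is why the forward and backward directions carry different information.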
Yes, using the transpose and supplying two matrices to cor() is doing the trick.

Thanks again

Chris

________________________________________
From: James W. MacDonald [jmacdon@uw.edu]
Sent: Tuesday, January 22, 2013 17:02
To: Forst, Christian
Cc: bioconductor at r-project.org
Subject: Re: [BioC] ExpressionSet Time-series correlation stuff