time series analysis with limma package


Xiaokuan Wei ▴ 230

@xiaokuan-wei-4052

Last seen 8.8 years ago

United States

Dear List,

I have been thinking about using the limma package to perform some time series analysis. There is a simple example in limma's manual. However, the analysis in the manual does not seem to account for the repeated-measures effect in time series data. I am wondering whether limma has a method for dealing with such data, or whether I have to add a random-effects term to the model manually, which I really don't know how to do. Could someone, or Gordon, clarify this?

My apologies if this topic has already been discussed intensively. Thank you.

Xiaokuan

limma • 2.6k views

ADD COMMENT • link updated 13.8 years ago by Gordon Smyth 52k • written 13.8 years ago by Xiaokuan Wei ▴ 230


Heidi Dvinge ★ 2.0k

@heidi-dvinge-2195

Last seen 10.7 years ago

On 18 Jul 2011, at 18:23, Xiaokuan Wei wrote:

> I have been thinking with using limma package to perform some time series
> analysis. There is a simple example in limma's manual. However, it seems that
> the analysis in the manual does not consider the repeated measurement effect for
> time series data.

Hi Xiaokuan,

This doesn't answer your question directly, but depending on how many time points you have, you might want to consider using the package "timecourse" instead. It uses similar principles to limma and allows for repeated measurements in one-sample or multi-sample longitudinal data.

Heidi

ADD COMMENT • link 13.8 years ago Heidi Dvinge ★ 2.0k


Gordon Smyth 52k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

Dear Xiaokuan,

You are correct that the time course example in the limma User's Guide assumes that all the samples are independent.

When the time course is of a repeated-measures nature, you can estimate the correlation between the repeated measures using the duplicateCorrelation() function in limma, with the block argument indicating each time course replicate. The correlation is then passed to the lmFit() function and carried through the rest of the analysis. This was done, for example, in the following paper:

Peart, M.J., Smyth, G.K., van Laar, R.K., Richon, V.M., Holloway, A.J., and Johnstone, R.W. (2005). Identification and functional significance of genes regulated by structurally diverse histone deacetylase inhibitors. Proceedings of the National Academy of Sciences of the United States of America 102, 3697-3702.

Best wishes
Gordon
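The workflow described above can be sketched roughly as follows. This is an illustration only, not code from the thread or from the cited paper: the data are simulated, and the names subject, time, y, corfit and the contrast definitions are made up for the example. It assumes a design in which each subject (one time course replicate) is measured at every time point.

```r
library(limma)  # duplicateCorrelation() also relies on the statmod package

set.seed(1)

# Hypothetical layout: 4 subjects, each measured at 3 time points (12 arrays)
subject <- factor(rep(1:4, each = 3))
time    <- factor(rep(c("t0", "t1", "t2"), times = 4))

# Simulated log-expression values for 100 genes, with a per-subject offset
# so that arrays from the same subject are positively correlated
noise    <- matrix(rnorm(100 * 12), nrow = 100)
subj_eff <- matrix(rnorm(100 * 4), nrow = 100)[, as.integer(subject)]
y <- noise + subj_eff

# One coefficient per time point
design <- model.matrix(~ 0 + time)
colnames(design) <- levels(time)

# Estimate the correlation between arrays from the same subject
corfit <- duplicateCorrelation(y, design, block = subject)
corfit$consensus.correlation

# Pass the blocking factor and the estimated correlation to lmFit()
fit <- lmFit(y, design, block = subject,
             correlation = corfit$consensus.correlation)

# Compare the later time points with the baseline
cont <- makeContrasts(t1 - t0, t2 - t0, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))
topTable(fit2, coef = 1)
```

The block factor here identifies which arrays belong to the same subject; the single consensus correlation is shared across all genes, which is what allows it to be estimated from small experiments.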

ADD COMMENT • link 13.8 years ago Gordon Smyth 52k


Gordon,

Thank you for your prompt response and clear explanations. This is exactly what I am looking for.

-Xiaokuan

________________________________
From: Gordon K Smyth <smyth@wehi.edu.au>
Cc: Bioconductor mailing list <bioconductor@r-project.org>
Sent: Tue, July 19, 2011 6:52:23 PM
Subject: time series analysis with limma package

ADD REPLY • link 13.8 years ago Xiaokuan Wei ▴ 230


Hi All,

I want countGenomicOverlaps to output a count of uniquely mapping reads within a genomic feature. Will setting the resolution parameter to 'none' allow countGenomicOverlaps to ignore reads which map to multiple locations in the genome? If so, countGenomicOverlaps doesn't behave the way I expect it to. I am using the Bioconductor GenomicRanges package version 1.4.6.

Example:

    library(GenomicRanges)
    subj = GRangesList(feature1=GRanges(seq='1', IRanges(10,30), strand='+'))
    qry = GRangesList(read1=GRanges(seq='1', IRanges(c(10,60,100),c(20,70,110)), strand='+'))
    countGenomicOverlaps(qry, subj, resolution='none')

I would have expected the hit count to be 0, but instead it is reported as 1/3. Am I using this function correctly?

Thanks,
Mete

ADD REPLY • link 13.8 years ago Mete Civelek ▴ 180


On 07/22/11 11:55, Mete Civelek wrote:

> I would have expected the hit count to be 0 but instead it reports it as
> 1/3. Am I using this function correctly?

Hi Mete,

In the context of countGenomicOverlaps, a GRangesList is used to represent reads with gaps in the CIGAR. The top level of a GRangesList represents a single read, and the list elements represent the multiple segments of the read. Taking your qry object as an example, this list would represent a single read from 10 to 110 that is broken into three portions by gaps from 20-60 and 70-100. Only one of the three segments overlaps with the subject, hence the score of 1/3.

If I understand correctly, you have a GRangesList where each top level is a read and the list elements are the multiple ranges where the read aligns to the genome. In this example you have a read of length 10 that aligned to 3 different locations. If you want to identify when all list elements map to the subject, you could do something like

    query <- GRangesList(read1=GRanges(seq='1', IRanges(c(10,60,100), c(20,70,110))),
                         read2=GRanges(seq='2', IRanges(c(150,170), c(160,180))))
    sub <- GRangesList(feature1=GRanges(seq='1', IRanges(10,30)),
                       feature2=GRanges(seq='2', IRanges(145,195)))
    olap <- countOverlaps(unlist(query), sub)
    elements <- elementLengths(query)
    lst <- split(olap, rep(seq_len(length(query)), elements))
    counts <- sapply(lst, sum)
    uniqueMap <- counts == elements

Valerie
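For readers on a current Bioconductor release, where (as far as I am aware) elementLengths() has been replaced by elementNROWS(), the same idea can be sketched as below. This is a variation rather than Valerie's original code: it treats a range as mapped if it overlaps at least one feature, instead of summing overlap counts, and the names hits_per_range, n_ranges, read_id, mapped_per_read and all_mapped are hypothetical.

```r
library(GenomicRanges)

# Each list element is one read; its ranges are the places the read aligned
query <- GRangesList(
  read1 = GRanges("1", IRanges(c(10, 60, 100), c(20, 70, 110))),
  read2 = GRanges("2", IRanges(c(150, 170), c(160, 180)))
)
sub <- GRangesList(
  feature1 = GRanges("1", IRanges(10, 30)),
  feature2 = GRanges("2", IRanges(145, 195))
)

# Number of features overlapped by each individual range
hits_per_range <- countOverlaps(unlist(query, use.names = FALSE), sub)

# Number of ranges per read, and a label tying each range back to its read
n_ranges <- elementNROWS(query)
read_id  <- factor(rep(names(query), n_ranges), levels = names(query))

# A read qualifies when every one of its ranges overlaps some feature
mapped_per_read <- tapply(hits_per_range > 0, read_id, sum)
all_mapped <- mapped_per_read == n_ranges
all_mapped
```

Using `hits_per_range > 0` rather than the raw counts avoids a read being flagged by accident when one range overlaps two features while another overlaps none.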

ADD REPLY • link 13.8 years ago Valerie Obenchain ★ 6.8k
