Fwd: Paired Data Analysis in baySeq package


Thomas J Hardcastle ▴ 180

@thomas-j-hardcastle-3860

Last seen 7.2 years ago

United Kingdom

Hi Shilin,

The confusion is in the groups. Data columns 1-4 are paired with columns 5-8 respectively, as you have identified. However, there is a further structure to the data: the first two paired samples are biologically distinct from the second two paired samples. Replicate group 1 is thus 159:73, 44:24, and replicate group 2 is 0:49, 0:68 - the groups are defined on the *paired* data. This situation might arise, for example, if the first two pairings consist of normal and tumour tissue from patients responding to treatment, and the second two consist of normal and tumour tissue from non-responding patients. The replicate structure 'c(1,1,2,2)' and the DE group are designed to describe this structure - the first two pairings are distinct from the second two pairings.

The DE analysis (correctly) identifies that there is a large and consistent difference in ratio between the two groups. In the first group, the data are 159:73 and 44:24, so the first member of each pair is approximately twice that of the second member; in the second group, the data are 0:49 and 0:68, so the first member of each pair is substantially less than that of the second member.

If the question you would like to answer is 'where are there consistent differences between the paired data', rather than 'where are there consistent differences in the ratios of paired data between replicate groups', you can use the nullProps = 0.5 option as given in the vignette, and look at the topCounts table for the non-differentially expressed (between replicates) data (page 9 of the vignette).

Hope that helps,

Tom

-------- Original Message --------
Subject: [BioC] Paired Data Analysis in baySeq package
Date: Wed, 12 Feb 2014 00:14:44 -0600
From: zhao shilin <zhaoshilin@gmail.com>
To: bioconductor@r-project.org

> Hi all,
>
> Is there anybody who is familiar with the Paired Data Analysis in the baySeq
> package? I'm following the instructions in its vignette. I used:
>
> library(baySeq)
> data(pairData)
> pairCD <- new("pairedData", data = pairData[,1:4], pairData = pairData[,5:8],
>               replicates = c(1,1,2,2),
>               groups = list(NDE = c(1,1,1,1), DE = c(1,1,2,2)))
>
> As indicated, the first four columns in these data are paired with the
> second four columns. So I think Sample 1-Sample 4 is group 1 and Sample
> 5-Sample 8 is group 2, with Sample 1 paired with Sample 5, Sample 2
> paired with Sample 6, and so on.
> In the result, the most significant gene is the 5th row. The result is:
>
>    rowID  X1.1    X1.2   X2.1  X2.2  Likelihood  DE   FDR.DE
> 1  5      159:73  44:24  0:49  0:68  0.9974276   1>2  0.002572417
>
> Its expression is:
> 159 44 0 0 73 24 49 68
>
> It seems that the software takes the 3rd and 4th samples (0, 0) as
> group 2 and the 7th and 8th samples (49, 68) as group 1, which is not correct.
> So I am not clear about replicates = c(1,1,2,2) and DE = c(1,1,2,2).
> What do they mean here? What is the correct method to do paired
> data analysis in the baySeq package?
>
> Thank you!
>
> Best,
> Shilin

--
Dr. Thomas J. Hardcastle
Department of Plant Sciences
University of Cambridge
Downing Street
Cambridge, CB2 3EA
United Kingdom
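The ratio argument in the reply above can be checked by hand. The following is a minimal base R sketch using only the row 5 counts quoted in the thread; it is plain arithmetic for illustration, not baySeq's own model, and the variable names are made up for this example.

```r
# Counts for row 5 as quoted in the thread: columns 1-4, then their pairs in columns 5-8.
first  <- c(159, 44, 0, 0)
second <- c(73, 24, 49, 68)

# Replicate structure defined on the pairs, as in replicates = c(1,1,2,2).
replicates <- c(1, 1, 2, 2)

# Proportion of each pair's total that comes from the first member.
prop_first <- first / (first + second)
print(round(prop_first, 2))
# 0.69 0.65 0.00 0.00

# Mean proportion within each replicate group: the two groups differ sharply,
# which is the difference in ratio that the DE model picks up.
group_means <- tapply(prop_first, replicates, mean)
print(group_means)
```

The first two pairs sit at roughly two thirds, the last two at zero, so it is the paired ratios that differ between replicate groups 1 and 2, not columns 1-4 versus columns 5-8.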



zhao shilin ▴ 60

@zhao-shilin-5674

Last seen 10.3 years ago

Thanks Tom.

I think I misunderstood the experimental design for the paired data here. As you said, the experimental design was:

Sample  Patient  Tissue  Response
1       1        Normal  Y
2       2        Normal  Y
3       3        Normal  N
4       4        Normal  N
5       1        Tumor   Y
6       2        Tumor   Y
7       3        Tumor   N
8       4        Tumor   N

Now I understand what you said about this situation. The result of

topCounts(pairCD, group = 2)

focuses on the difference between Response Y and Response N. And if I want to compare the difference between Normal and Tumor, I need to use:

topCounts(pairCD, group = 1)

But in my experiment there is no "Response". How can I write the new("pairedData") command if I just want to compare the difference between Normal and Tumor?

Thank you!

Best,
Shilin


On 12/02/14 16:52, zhao shilin wrote:

> But in my experiment there is no "Response". How can I write the
> new("pairedData") command if I just want to compare the difference
> between Normal and Tumor?

In this case, all the pairs would be replicates; the replicate structure would be c(1,1,1,1,...), the groups structure would be list(c(1,1,1,1,...)) - and then use 'nullProps = 0.5' in the getLikelihoods.BB function.

Best wishes,

Tom
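As a rough sketch of that setup, the base R code below builds the single-group replicate and groups structures for an arbitrary number of pairs. The count matrix, its values, and the names counts, normal and tumour are hypothetical; the group name NDE is borrowed from the original question and is not given in the reply. The baySeq calls are left as comments because their exact form depends on the installed baySeq version.

```r
# Hypothetical count matrix: 4 normal samples followed by the 4 matching tumour
# samples, in the same patient order (values are made up for illustration).
counts <- matrix(c(10, 12,  9, 11, 30, 28, 33, 29,
                    5,  7,  6,  4,  6,  5,  7,  5),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB"),
                                 c(paste0("N", 1:4), paste0("T", 1:4))))

n_pairs <- ncol(counts) / 2

# Split into the two halves of each pair, keeping matrix shape.
normal <- counts[, seq_len(n_pairs), drop = FALSE]
tumour <- counts[, n_pairs + seq_len(n_pairs), drop = FALSE]

# Every pair is a replicate of the same condition, so there is a single group.
replicates <- rep(1, n_pairs)          # c(1, 1, 1, 1)
groups <- list(NDE = rep(1, n_pairs))  # list(c(1, 1, 1, 1)); the name is an assumption

# With baySeq loaded, these would go into the constructor in the same way as
# in the original question (not run here):
# pairCD <- new("pairedData", data = normal, pairData = tumour,
#               replicates = replicates, groups = groups)
# followed by the prior estimation described in the vignette and
# getLikelihoods.BB(..., nullProps = 0.5).
```

Using rep() keeps the two structures in step with the number of pairs, so the same code covers the "c(1,1,1,1,...)" case for any even number of columns.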
