edgeR contrast

Guest User ★ 13k

@guest-user-4897

Last seen 10.6 years ago

I am trying different contrasts with edgeR to get a feeling for my data and to find the best contrast for my question. I would like to know which genes are enriched in IP.treat compared to IP.control, both adjusted for unspecific IgG binding. So it seems like contrast 1 is the best: (IP.treat - IgG.treat) - (IP.control - IgG.control). However, it seems like IgG.control is added to IP.treat, as - and - is +. I then tried contrasts 3 and 4, and got totally different results, with only genes with FC > 1.

My question: Is it allowed to have more levels with -1 than with +1, or how can it be explained that contrasts 3 and 4 look very similar to each other but totally different from contrast 1 (and IP)?

Levels       IP  contrast1  contrast2  contrast3  contrast4  contrast5
IP.treat      1          1          1          1          1          0
IP.control   -1         -1         -1         -1         -1          0
IgG.treat     0         -1          0         -1         -1          1
IgG.control   0          1          0         -1          0         -1

Thanks for any help.

Julia

edgeR • 1.6k views

ADD COMMENT • link updated 10.6 years ago by James W. MacDonald 68k • written 10.6 years ago by Guest User ★ 13k

Pickl, Julia ▴ 60

@pickl-julia-6722

Last seen 10.6 years ago

Hi all,

I am trying different contrasts to get a feeling for my data and to find the best contrast for my question. I would like to know which genes are enriched in IP.treat compared to IP.control, both adjusted for unspecific IgG binding. So it seems like contrast 1 is the best: (IP.treat - IgG.treat) - (IP.control - IgG.control). However, it seems like IgG.control is added to IP.treat, as - and - is +. I then tried contrasts 3 and 4, and got totally different results, with only genes with FC > 1.

My question: Is it allowed to have more levels with -1 than with +1, or how can it be explained that contrasts 3 and 4 look very similar to each other but totally different from contrast 1 (and IP)?

Levels       IP  contrast1  contrast2  contrast3  contrast4  contrast5
IP.treat      1          1          1          1          1          0
IP.control   -1         -1         -1         -1         -1          0
IgG.treat     0         -1          0         -1         -1          1
IgG.control   0          1          0         -1          0         -1
subj          0          0          0          0          0          0

Thanks for any help.

Julia

ADD COMMENT • link 10.6 years ago Pickl, Julia ▴ 60

I'm a bit late to the party, but if this is a ChIP-seq analysis, you might consider giving the csaw package a try. This performs a de novo differential binding analysis, by counting reads into sliding windows and analyzing them with edgeR. For sharp binding, this may be more appropriate than counting over genes, which is what you seem to be doing.

ADD REPLY • link 10.4 years ago Aaron Lun ★ 28k

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 2 days ago

United States

Hi Julia,

This appears to be a ChIP-Seq experiment, in which case I wouldn't analyze it this way. Instead, I would use something like MACS to call peaks, using the IgG fractions as the 'input' fraction. In other words, the IgG fraction is used to help distinguish real IP regions from those regions that have high sequencing depth due to technical factors. You would then use edgeR to compare IP.treat versus IP.control. This is not a trivial analysis, and you should look at this paper (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4066778/) for more information on how you should normalize your counts. But maybe I completely misunderstand the experiment.

In that case, back to your question. Contrasts 3 and 4 aren't valid contrasts, so you should just ignore those results. Contrast 1 is an interaction contrast, and is testing for genes that have different amounts of IP binding between treatment and control, after adjusting for non-specific binding. If this isn't ChIP-Seq, but instead is some transcript binding experiment, then this is likely the contrast you want.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
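
As a rough illustration of what contrast 1 computes, the sketch below writes it as a coefficient vector in base R and applies it to made-up log2 values for one gene. Only the group labels come from the question; the numbers and the object names (`contrast1`, `log2_means`, `interaction_lfc`) are hypothetical.

```r
# Contrast 1 from the question, one coefficient per group:
# (IP.treat - IgG.treat) - (IP.control - IgG.control)
contrast1 <- c(IP.treat = 1, IP.control = -1, IgG.treat = -1, IgG.control = 1)

# Hypothetical log2 abundances for a single gene (invented for illustration)
log2_means <- c(IP.treat = 8, IP.control = 6, IgG.treat = 3, IgG.control = 3)

# IgG-adjusted enrichment within each condition
adj_treat   <- log2_means[["IP.treat"]]   - log2_means[["IgG.treat"]]
adj_control <- log2_means[["IP.control"]] - log2_means[["IgG.control"]]

# Applying the contrast gives the difference of the two adjusted enrichments
interaction_lfc <- sum(contrast1 * log2_means[names(contrast1)])
stopifnot(isTRUE(all.equal(interaction_lfc, adj_treat - adj_control)))
print(interaction_lfc)
```

With a fitted edgeR GLM whose design has one column per group in this order, a vector like `contrast1` is the kind of object that would be passed as the `contrast` argument of `glmLRT`. That step is left out here because the thread does not show the design matrix.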

ADD COMMENT • link 10.6 years ago James W. MacDonald 68k

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 2 days ago

United States

Hi Julia,

Please don't take conversations off list (e.g., use Reply-All to respond).

On Wed, Sep 3, 2014 at 2:18 AM, Pickl, Julia wrote:

> Hi Jim,
>
> could you please tell me why contrasts 3 and 4 are not valid contrasts? I do not understand it completely. Is it because the same number of factors should have +1 and -1 in the contrast matrix?

They aren't valid contrasts because the coefficients don't add up to zero. You use the contrast to form a t-statistic, which you then use to test a null hypothesis versus an alternative hypothesis. In general, the null hypothesis is that the numerator of the t-statistic is equal to zero, and the alternative hypothesis is that the numerator is not equal to zero (depending on the alternative you can also test that the numerator is greater or less than zero). Because of this, the coefficients of the contrast have to add up to zero (or else you aren't testing the null that the numerator equals zero).

So if we look at your contrast 4 (contrast 3 is the same with IgG.control subtracted as well), you have

IP.treat - IP.control - IgG.treat

Now remember, ANOVA is simply algebra. You could be hypothesizing that IP.treat - IP.control - IgG.treat = 0, which would imply that IP.control and IgG.treat somehow sum to be equal to IP.treat. But that is a weird sort of null hypothesis (and from a biological perspective, why would you think that would be true?). One would usually assume that under the null, there are no differences between any of those three groups. In that case IP.treat - IP.control - IgG.treat equals the negative of the common group value (the coefficients sum to -1), not zero. Testing against that value is certainly a valid thing to do, I suppose, but what would it mean to reject that null hypothesis? There are any number of ways that those three coefficients could add up to something different from it, so it isn't clear what you are testing here.
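
The sum-to-zero rule can be checked directly against the table from the question. The following is a minimal base R sketch; the object names (`contrast_table`, `coef_sums`, `not_valid`) are made up for illustration, and the coefficients are copied from the posted table.

```r
groups <- c("IP.treat", "IP.control", "IgG.treat", "IgG.control")

# Columns copied from the table in the question, rows in the order of `groups`
contrast_table <- cbind(
  IP        = c(1, -1,  0,  0),
  contrast1 = c(1, -1, -1,  1),
  contrast2 = c(1, -1,  0,  0),
  contrast3 = c(1, -1, -1, -1),
  contrast4 = c(1, -1, -1,  0),
  contrast5 = c(0,  0,  1, -1)
)
rownames(contrast_table) <- groups

# A contrast in the sense used above has coefficients that sum to zero
coef_sums <- colSums(contrast_table)
not_valid <- names(coef_sums)[coef_sums != 0]

print(coef_sums)
print(not_valid)
```

Only the columns whose coefficients do not sum to zero are flagged; the remaining columns pass this particular check, which says nothing about whether they answer the biological question.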
> From a biological point of view I still have problems with contrast 1,
>
> (IP.treat - IgG.treat) - (IP.control - IgG.control),
>
> as it is also
>
> IP.treat - IgG.treat - IP.control + IgG.control
>
> And this looks like the counts of IP.treat plus IgG.control are compared to IgG.treat and IP.control.

And that is another interpretation for that contrast. This is why the associative law is useful; you can move things around in such a way to make interpretation of the result easier (or harder, if you so desire).

There are two things to consider. First, you want to set up both your coefficients and any contrasts in such a way that you can most easily interpret the results. In this case, setting up the contrast as (and thinking of the contrast in terms of)

(IP.treat - IP.control) - (IgG.treat - IgG.control)

is easiest. This is because you can then formulate the null hypothesis as

firstpart - secondpart = 0

or alternatively

firstpart = secondpart

which means that the difference between IP.treat and IP.control is equal to the difference between IgG.treat and IgG.control, which you can then interpret as meaning that the IP results are indistinguishable from the IgG results. And since that is a useful null hypothesis, given the experiment, it is best to interpret the contrast that way.

The second issue has to do with rejecting the null hypothesis, and what that means. For a simple contrast, interpreting a rejected null hypothesis is simple. Say you tested IP.treat - IP.control and you reject the null with a p < 0.05, and the t-statistic has a value of 13.4. It's easy then to say that there appears to be a difference between those two samples, and it is also easy to see that the treatment results in way more of the given gene being pulled down by the IP step (because the t-statistic has a positive sign, implying a positive fold change, which can only come about if the IP.treat coefficient is larger than the IP.control coefficient).
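
To see that the two ways of grouping contrast 1 are the same contrast, one can expand both into coefficient vectors. This is a small base R sketch; the helper `unit()` and the vector names are hypothetical and exist only for this illustration.

```r
groups <- c("IP.treat", "IP.control", "IgG.treat", "IgG.control")

# Indicator vector for one group: 1 at that group's position, 0 elsewhere
unit <- function(g) as.numeric(groups == g)

# Grouped by antibody: (IP.treat - IgG.treat) - (IP.control - IgG.control)
by_antibody <- (unit("IP.treat") - unit("IgG.treat")) -
  (unit("IP.control") - unit("IgG.control"))

# Grouped by treatment: (IP.treat - IP.control) - (IgG.treat - IgG.control)
by_treatment <- (unit("IP.treat") - unit("IP.control")) -
  (unit("IgG.treat") - unit("IgG.control"))

names(by_antibody) <- groups
names(by_treatment) <- groups

# Both groupings expand to the same coefficients, so they test the same thing
stopifnot(identical(by_antibody, by_treatment))
print(by_treatment)
```

Since the coefficient vectors are identical, any software given either form fits the same test; the grouping only changes how a person reads the result.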
But if you get a p < 0.05 and a t-statistic of 13.4 for the interaction term (IP.treat - IP.control - IgG.treat + IgG.control), then how do you interpret that result? With just the t-statistic (or even the log fold change) all you can say is that there is a difference between treatment and control that is dependent on whether or not you used the IP antibody or non-specific IgG. But this result can arise in any number of ways, and you need to explore the data further to see exactly what is going on, by e.g. plotting the logCPM values by group.

Best,

Jim
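
The point that the same interaction estimate can arise in different ways can be made concrete with two invented genes. The values below are hypothetical log2 group means, not data from the thread, and `effect()` is a helper defined only for this sketch.

```r
contrast1 <- c(IP.treat = 1, IP.control = -1, IgG.treat = -1, IgG.control = 1)

# Gene A: treatment raises the IP signal, IgG background is unchanged
gene_a <- c(IP.treat = 8, IP.control = 6, IgG.treat = 3, IgG.control = 3)

# Gene B: IP signal is unchanged, IgG background drops under treatment
gene_b <- c(IP.treat = 6, IP.control = 6, IgG.treat = 1, IgG.control = 3)

# Apply the interaction contrast to a vector of group means
effect <- function(x) sum(contrast1 * x[names(contrast1)])

effects <- c(gene_a = effect(gene_a), gene_b = effect(gene_b))
print(effects)

# Looking at the per-group values is what separates the two cases
print(rbind(gene_a, gene_b))
```

Both genes give the same interaction value although the underlying patterns differ, which is why inspecting the per-group values (for example the logCPM by group) is needed before interpreting a significant interaction.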

ADD COMMENT • link 10.6 years ago James W. MacDonald 68k

Thank you very much for that very good explanation!!!

Best wishes

Julia

ADD REPLY • link 10.6 years ago Pickl, Julia ▴ 60
