Question

effect of polyploidy level on RUVs normalisation

thomas.wolfe • 0

@thomaswolfe-9344

Last seen 7.4 years ago

Dear all,

I am currently working with transcriptomic data obtained for diploid parental genomes and their polyploid progeny.

When comparing polyploids to diploids for differential gene expression, what is the effect of polyploidy level on RUVs normalisation?

If anyone has a clear and possibly detailed explanation that would be much appreciated :D

Thanks in advance,

Thomas

normalization ruvseq ruvnormalize ruvs eda • 1.9k views

ADD COMMENT • link 9.0 years ago thomas.wolfe • 0

Dear Davide, thanks for the quick answer. I will try to elaborate on the experimental design.

We have:

species	ploidy	individuals	individuals with technical replicates
D. fuchsii	diploid	4	1
D. incarnata	diploid	5	0
D. majalis	polyploid	8	4
D. traunsteineri	polyploid	11	6
total	-	28	11

I am doing pairwise comparisons between these species. I observe a clear batch effect between the polyploid species:

Image: http://imgur.com/OPowmEU

I believe (at least for the polyploid comparisons) that the unwanted batch effect (the unwanted factor W) and the biological variance of interest (the difference between species) are not collinear: I have samples of both species in both batches.

When I compare the polyploids to the diploids, on the other hand, none of the diploid samples are in either of the batches for which I observe a batch effect in the polyploids. I still use RUVs to correct the batch effect in the polyploids. After finding DE genes, I get a strange MA plot:

Image: http://imgur.com/DnFb5Pd
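
One way to make the confounding concern in this post concrete is to cross-tabulate species against batch and ask which batches are shared by the two sides of each comparison. The sketch below is only illustrative: the batch labels (`batch_A`, `batch_B`, `batch_C`) and the per-batch sample counts are invented, not taken from the real sample sheet, and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical sample sheet as (species, batch) pairs.
# Batch labels and counts are invented for illustration only.
samples = [
    ("D. majalis", "batch_A"), ("D. majalis", "batch_A"),
    ("D. majalis", "batch_B"),
    ("D. traunsteineri", "batch_A"),
    ("D. traunsteineri", "batch_B"), ("D. traunsteineri", "batch_B"),
    ("D. fuchsii", "batch_C"), ("D. fuchsii", "batch_C"),
    ("D. incarnata", "batch_C"),
]


def crosstab(pairs):
    """Count samples in each (species, batch) cell."""
    return Counter(pairs)


def shared_batches(pairs, group_1, group_2):
    """Return the batches that contain samples from both species groups."""
    batches_1 = {batch for species, batch in pairs if species in group_1}
    batches_2 = {batch for species, batch in pairs if species in group_2}
    return batches_1 & batches_2


polyploids = {"D. majalis", "D. traunsteineri"}
diploids = {"D. fuchsii", "D. incarnata"}

table = crosstab(samples)
poly_vs_poly = shared_batches(samples, {"D. majalis"}, {"D. traunsteineri"})
diplo_vs_poly = shared_batches(samples, diploids, polyploids)

print(sorted(poly_vs_poly))   # batches containing both polyploid species
print(sorted(diplo_vs_poly))  # no shared batch: ploidy and batch are confounded
```

With a layout like this toy one, the polyploid-versus-polyploid comparison has shared batches, so a batch factor can in principle be separated from the species effect. The diploid-versus-polyploid comparison has none, so any batch correction there rests on assumptions rather than on the data.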

I hope the "design" is a bit clearer.

I am interested in differences between species (which happen to have different polyploidy levels). Could RUV be affected by these levels of polyploidy? For example, if I look at the overlap of DE genes between the diploid vs. polyploid and polyploid vs. polyploid comparisons, are these groups comparable after normalisation? Should I consider removing a different number of factors of unwanted variation (k) for the diploid vs. polyploid, polyploid vs. polyploid and diploid vs. diploid comparisons? I am not sure.

Thanks again for your help, cheers

ADD COMMENT • link 9.0 years ago thomas.wolfe • 0

Hi Thomas,

thanks for adding these details, they are really useful.

Is the first plot color-coded by date? (It looks like it from the legend.) In that case, I'm tempted to say that the date of the experiment might be more important than polyploidy in explaining the batch effects. Is date confounded with species? If not, I think the easiest thing to try would be to include the date variable in the design matrix; see for instance chapter 3.4 of the edgeR guide:

http://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf

If that doesn't work, you can use RUVg or RUVs. I think it is a better idea to define the groups by date rather than by polyploidy, provided species and date are not confounded.

ADD REPLY • link 9.0 years ago davide risso ▴ 980

Thanks for your answer,

I tried including the batch (as dates) in the edgeR model. My main problem is that my "definition" of batch is arbitrary: for some dates there is only one sample preparation (in which case I use the year), for others there are two or more. As I do not really have any a priori knowledge of the batches, I went for RUVs (polyploidy is never treated as a batch).

ADD REPLY • link 9.0 years ago thomas.wolfe • 0

OK, sorry, I misunderstood how you were using RUVs. How are you defining your groups? Have you considered RUVg instead, provided that you have a list of genes that you don't expect to change much between species?

Also, it would be useful to look at the RLE and PCA plots after RUV correction to see if things are better or worse.
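
For readers unfamiliar with RLE (relative log expression), the quantity behind the plot can be sketched in a few lines. This is a minimal illustration, not the plotting code of any package: the toy count matrix is invented, and the pseudocount of 1 inside the log is an assumption made here to avoid taking the log of zero.

```python
import math
import statistics


def rle(counts, pseudocount=1.0):
    """Relative log expression for a genes x samples count matrix.

    For each gene (row), take log(count + pseudocount) and subtract the
    gene's median log value across samples.
    """
    result = []
    for gene_row in counts:
        logs = [math.log(c + pseudocount) for c in gene_row]
        gene_median = statistics.median(logs)
        result.append([value - gene_median for value in logs])
    return result


def sample_medians(rle_matrix):
    """Median RLE per sample (column). Values far from 0 flag a shifted sample."""
    n_samples = len(rle_matrix[0])
    return [
        statistics.median(row[j] for row in rle_matrix)
        for j in range(n_samples)
    ]


# Invented toy data: 4 genes x 3 samples; the third sample has doubled counts.
toy_counts = [
    [10, 10, 20],
    [100, 100, 200],
    [50, 50, 100],
    [5, 5, 10],
]

medians = sample_medians(rle(toy_counts))
print(medians)  # first two samples sit at 0, the third is shifted upwards
```

If a correction has worked, the per-sample RLE distributions should be centred near zero with similar spread. Computing the same summary before and after the correction gives a quick numerical companion to the plots.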

ADD REPLY • link 9.0 years ago davide risso ▴ 980

score 2 · Accepted Answer · 2016-04-25

Hi Thomas,

I am not sure that I completely understand your question. More details on the experimental design would make it easier to say whether RUV is appropriate in this setting.

In general, the RUV model can be described as a (log-)linear model with two terms, X beta + W alpha.

X beta represents the "wanted effects" and W alpha the "unwanted effects". Wanted effects are what you are interested in, while unwanted effects are confounders that can be either technical or biological in nature.
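
Written out, the two-term model above looks roughly as follows. The dimension symbols (n samples, J genes, p wanted covariates, k unwanted factors) are notation introduced here for illustration; the published RUV formulation can also carry an offset term, which is omitted in this sketch.

```latex
% Y: n x J matrix of read counts (n samples, J genes)
% X: n x p design matrix of wanted covariates, \beta: p x J coefficients
% W: n x k matrix of unwanted factors,        \alpha: k x J coefficients
\log \mathrm{E}\left[\, Y \mid W, X \,\right] = X\beta + W\alpha
```

X is known from the experimental design, whereas W is generally unobserved and has to be estimated, for example from control genes or from replicate samples.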

It is not clear to me if in your case you are thinking of polyploidy as a wanted or unwanted factor. If you can elaborate on the goal of the experiment, I can possibly give you a better answer.

Note that one thing to keep in mind is confounding: if X and W are collinear, it is not possible to use this model. A simple example would be if you have two batches and you analyze all your treated samples in one batch and all the controls in another batch: there is no way to separate the effects of treatment and batch with RUV (or any other model). Depending on your experimental design, this could be true for you too. There may simply be no way to distinguish between diploid vs. polyploid and parental vs. progeny effects.
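
The two-batch example above can be checked numerically: if the combined design matrix [intercept, treatment, batch] is rank deficient, the treatment and batch effects cannot be separated. The sketch below is a toy check under the assumption that the batch label is known (in RUV, W is estimated rather than observed); the function names and indicator vectors are made up for illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(value) for value in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next(
            (r for r in range(rank, len(m)) if m[r][col] != 0), None
        )
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the column entries below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def is_confounded(treatment, batch):
    """True if the design [intercept, treatment, batch] is rank deficient."""
    design = [[1, t, b] for t, b in zip(treatment, batch)]
    return matrix_rank(design) < 3


# All treated samples in one batch, all controls in the other.
confounded = is_confounded(treatment=[1, 1, 0, 0], batch=[1, 1, 0, 0])

# Treated and control samples present in both batches.
balanced = is_confounded(treatment=[1, 1, 0, 0], batch=[1, 0, 1, 0])

print(confounded, balanced)
```

In the first layout the treatment and batch columns are identical, so the rank drops to 2 and no model can tell the two effects apart. In the second, the three columns are linearly independent and both effects are estimable.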

But I could be wrong; it's hard to say without more details on your question and experimental design.