Hello,
First of all, I would like to apologise if this is a straightforward question or similar to another thread.
I have been given some RNA-seq data for my dissertation (I say this as there is no possibility of more sequencing!), and have assembled a de novo transcriptome. Now I want to look at differentially expressed genes.
Unfortunately, there are no replicates, and there is nestedness within the design.
There are 6 individuals: normal male, infected male (intersex1), infected male (intersex2), normal female, normal female infected, infected female (intersex3). From this I want to look at the differences between males/females, infected/uninfected and normal males/females/intersex1/2/3. However, I also have 4 different tissue types from each of these individuals, which I also want to compare.
So my question is, what would be the best way to approach such analysis?
I can't really remove any explanatory factors, as they are all important, and even then I still probably would not have enough replicates. I am not entirely sure what a 'reasonable' dispersion value would be, as this is not a controlled experiment (environmental samples). So perhaps the method that appeals to me most would be to identify the non-DE genes and calculate the dispersion value from those?
Any advice or suggestions would be greatly appreciated!
Many thanks,
Hazel
The suffering part is an unfortunate consequence of the (sometimes perverse) power dynamic in academia, where graduate students often feel they have little power to assert their views/will "up the chain".
Thank you for the quick response!
I see what you mean, thank you. The only libraries I can think of that would probably be the most similar, or least influenced by infection or sex, are the muscle and 'head' (I know, not technically a tissue!), although I cannot say this for sure. Would it work if I used either of those to estimate the dispersion and then applied it to the full design matrix? So I would ignore sex and infection for the estimation, and then just look at the other tissues and sexes for my main analysis (hepatopancreas, ovary, testes and muscle or head).
Yes, ignoring sex and/or infection seems like a good place to start (tissues will have too much DE between them to be sensibly ignored). In effect, you treat the individuals as replicates of each other, allowing you to block on the individual and tissue in your design matrix for dispersion estimation. You then use the full design matrix to do your contrasts; presumably this has 6*4 = 24 coefficients for all the different combinations of individual/tissue.
That's great, thank you so much!
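For anyone following the thread, here is a minimal sketch of the design-matrix arithmetic behind the suggestion above. It is plain Python rather than any particular DE package, and it assumes each of the 6 individuals contributes exactly 4 libraries; the labels `ind1`..`ind6` and `tissue1`..`tissue4` are hypothetical placeholders, not the real sample names. It only counts coefficients and residual degrees of freedom, which is what determines whether a dispersion can be estimated at all.

```python
from fractions import Fraction
from itertools import product

# Hypothetical labels: 6 individuals x 4 tissues = 24 libraries (assumed layout).
individuals = [f"ind{i}" for i in range(1, 7)]
tissues = [f"tissue{j}" for j in range(1, 5)]
libraries = list(product(individuals, tissues))


def additive_row(ind, tis):
    """Row of the reduced design: intercept + individual + tissue blocking terms."""
    return (
        [1]
        + [int(ind == x) for x in individuals[1:]]  # first individual is the baseline
        + [int(tis == x) for x in tissues[1:]]      # first tissue is the baseline
    )


def full_row(ind, tis):
    """Row of the full design: one coefficient per individual/tissue combination."""
    return [int((ind, tis) == lib) for lib in libraries]


def rank(matrix):
    """Matrix rank by exact Gaussian elimination (Fractions avoid rounding issues)."""
    rows = [[Fraction(v) for v in r] for r in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


additive = [additive_row(*lib) for lib in libraries]
full = [full_row(*lib) for lib in libraries]

additive_rank = rank(additive)
full_rank = rank(full)
residual_df_additive = len(libraries) - additive_rank
residual_df_full = len(libraries) - full_rank

print("additive design:", additive_rank, "coefficients,", residual_df_additive, "residual df")
print("full design:", full_rank, "coefficients,", residual_df_full, "residual df")
```

Under these assumptions the additive design uses 9 coefficients and leaves 15 residual degrees of freedom for dispersion estimation, whereas the full 24-coefficient design leaves none. That is why the dispersion has to come from the reduced design before the contrasts are run on the full one.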