Hi,
I am helping a colleague with a differential expression analysis of RNA-seq data, but I have some concerns about the expression levels it reports. According to the experimental design, my colleague's hypothesis is that protein A controls the stability of protein B: when prot A is reduced, prot B increases.
At the bench, my colleague used shRNAs against prot A and prot B (independently) and saw a significant reduction in expression of each target, by both western blot and qRT-PCR. As I understand it, the shRNA acts on the mRNA that encodes the protein. In short, the knockdowns were validated at the bench.
She then repeated the same experiment and sent the samples for RNA sequencing. The samples were rRNA-depleted before library preparation. When the data came back, I aligned the reads with STAR, processed them with Rsubread and DESeq2, and checked the padj values for significance. I found two strange results: (1) shRNA A significantly reduced prot A, but prot B was also slightly reduced (though not significantly), and (2) shRNA B did not significantly reduce prot B.
I checked the PCA plots and they looked fine: the patterns were consistent, and samples separated clearly by batch and by treatment.
Here are my questions. Is it common for an shRNA to give a significant reduction at the bench, while RNA-seq fails to detect the difference? Is it then acceptable to take the results as they are and use them for publication? Our concern is that reviewers will ask why the data should be accepted when an shRNA was used but no significant reduction appears in the RNA-seq datasets. Would we have to repeat the experiment to get proper readouts? Is there a way for me to check, in a genome browser (or any other program), where the RNA-seq datasets have gone wrong? Our understanding is that a typical RNA-seq run yields about 30 million reads. Would 30 million reads be enough to cover the whole library?
Dear Dr Michael,
My colleague and I didn't use plotCounts(), but we checked the counts using counts['gene',].
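For anyone following along, here is a minimal sketch of how the two checks compare. It is not our actual analysis: it uses simulated data from DESeq2's makeExampleDESeqDataSet(), and "gene1" is only a placeholder for the real target gene ID. The conditions "A" and "B" are the simulated group labels, not our prot A and prot B.

```r
library(DESeq2)

# Simulated data stands in for the real experiment (assumption, not our data).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)  # design is ~ condition
dds <- DESeq(dds)

gene <- "gene1"  # hypothetical placeholder for the shRNA target's gene ID

# Raw counts: not directly comparable between samples of different depth.
raw_counts <- counts(dds)[gene, ]

# Size-factor-normalized counts: comparable between samples.
norm_counts <- counts(dds, normalized = TRUE)[gene, ]

# plotCounts() uses normalized counts and, by default, adds a pseudocount
# of 0.5; returnData = TRUE returns the values instead of drawing a plot.
d <- plotCounts(dds, gene = gene, intgroup = "condition", returnData = TRUE)
print(d)

# Estimated log2 fold change and adjusted p-value for the same gene.
res <- results(dds, contrast = c("condition", "B", "A"))
print(res[gene, c("baseMean", "log2FoldChange", "padj")])
```

The point of the comparison is that raw counts depend on each sample's sequencing depth, whereas the normalized counts and the log2 fold change are what the padj value refers to.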
So what you mean is that, as far as you can tell, the data are genuine and the problem is probably not in the code? I do have some suspicions about the bench work, though. I appreciate the advice you have given me so far.
Yes, I plan to post there soon, and I will check with my colleague whether the RNA levels were validated before the samples were sent for RNA sequencing.
Thank you once again.
Regards,
Johann