Dear Bioconductor community,

I am facing a puzzling question. I can't find a specific answer on Biostars or other websites, and I can't figure it out myself.

I have been working on RNA from olfactory epithelium. I had to extract the full sample provided, because olfactory receptors can be locally distributed and I had to be sure I captured all of them. Some tissues were too "big", so I extracted them in two batches and prepared a separate library for each batch. In total I have 27 libraries, and for three samples I have two replicates.

I am not sure whether these are real technical replicates. I saw one post where it was not recommended to use duplicateCorrelation when there are only a few replicates. If I average the expression, I might lose power for some genes, because the second extraction may capture genes that were not in the first. The overall library sizes are quite similar, though, so hopefully it should not make much difference except for a few genes. Still, I wonder whether I should sum the counts after normalization (to account for library size differences between replicates) or average them.

I hope my question makes sense. Any suggestions would be appreciated!

Best regards,
Alice
My preference is to align and run featureCounts on each library separately, then make an MDS plot to check that the replicate samples really do appear to be technical replicates. If so, then sum the counts using edgeR::sumTechReps.

Many thanks James and Gordon,

They indeed cluster well together, so I will sum the counts as suggested.

Best regards,
Alice
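
For anyone landing on this thread later, the workflow suggested above might look roughly like the sketch below. It is only an illustration under stated assumptions: the counts are simulated, the library names and the sample_id vector are hypothetical, and five libraries stand in for the real 27. Note that the raw counts are summed first and normalization factors are computed afterwards, rather than summing already-normalized values.

```r
library(edgeR)

# Simulated count matrix: 200 genes x 5 libraries (made-up data, not the real experiment).
set.seed(1)
counts <- matrix(
  rnbinom(200 * 5, mu = 50, size = 5),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200),
                  c("S1_a", "S1_b", "S2", "S3_a", "S3_b"))
)

# Hypothetical sample IDs: libraries sharing an ID were extracted from the same tissue.
sample_id <- c("S1", "S1", "S2", "S3", "S3")

y <- DGEList(counts = counts)

# Check that the split extractions sit close together before treating them
# as technical replicates.
plotMDS(y, labels = sample_id)

# Sum the raw counts of libraries that share an ID, then normalize the
# summed libraries.
y_sum <- sumTechReps(y, ID = sample_id)
y_sum <- calcNormFactors(y_sum)
```

After this step y_sum has one column per tissue sample, and the usual edgeR or limma-voom analysis can proceed on it.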