I am going to use the RNA-seq counts provided by the SEQC package to benchmark my method, but I am a bit confused about the final counts for each replicate. In each dataset, each group's replicate has libraries from different lanes and flow cells. My question is: should I add up the counts from all lanes and flow cells to obtain the final count for that replicate? For example, is the count for replicate A_1 the sum of all A_1_L_X, where L is the lane number and X is in {flow-cell1, flow-cell2}?
I believe I answered this question in the email you sent me a few days ago, but I will paste my answer here:
----------------------------
Dear Siamak,
Yes, the counts for A_1 should be the sum of the counts from all of its technical replicates. For differential expression analysis, you will need biological replicates to estimate biological variation. Technical replicates are not useful for this purpose, so their counts should be added up.
Best wishes,
--------------------
Wei Shi, PhD
Laboratory Head
The Walter and Eliza Hall Institute of Medical Research
Melbourne, Australia
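
For anyone reading later, the summing step described above could be sketched in Python roughly as follows. This is only a minimal illustration, not code from the SEQC package: the library names (`A_1_L01_FC1` and so on), the toy counts, and the helper `sum_technical_replicates` are all hypothetical, and the sketch assumes that the first two underscore-separated fields of a library name identify the replicate.

```python
def sum_technical_replicates(counts, n_key_fields=2, sep="_"):
    """Collapse library-level counts into replicate-level counts.

    counts: dict mapping a library name (e.g. "A_1_L01_FC1") to a list of
            per-gene counts, with genes in the same order for every library.
    n_key_fields: how many leading name fields identify the replicate
            (assumption: group and replicate number, e.g. "A" and "1").
    """
    summed = {}
    for library, gene_counts in counts.items():
        # "A_1_L01_FC1" -> "A_1": drop the lane and flow-cell fields.
        replicate = sep.join(library.split(sep)[:n_key_fields])
        if replicate not in summed:
            summed[replicate] = list(gene_counts)
            continue
        if len(gene_counts) != len(summed[replicate]):
            raise ValueError(f"{library} has a different number of genes")
        # Add this technical replicate's counts gene by gene.
        summed[replicate] = [a + b for a, b in zip(summed[replicate], gene_counts)]
    return summed


# Toy data: three genes, hypothetical lane/flow-cell library names.
libraries = {
    "A_1_L01_FC1": [10, 0, 5],
    "A_1_L02_FC1": [12, 1, 4],
    "A_1_L01_FC2": [9, 0, 6],
    "A_2_L01_FC1": [20, 2, 7],
    "A_2_L01_FC2": [18, 3, 8],
}

replicate_counts = sum_technical_replicates(libraries)
print(replicate_counts)
```

With real data the library names will follow whatever convention the count table uses, so the rule that extracts the replicate label would need to be adapted accordingly.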