Hi,
Thank you for taking my question into consideration.
I have an RNA-seq dataset from a time series experiment: 3 time points with 8 replicates each. I want to check the variation in gene expression across time, while taking the surgeon effect into account (2 surgeons performed the surgery on the mice).
I will run the LRT, but I'm not sure what the reduced model should be: reduced~surgeon or reduced~1? As I understand it, the variable of interest, here time, is the one that should be removed.
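One way to reason about the two candidates: an LRT compares nested models, so the coefficients being tested are the ones present in the full design but absent from the reduced one. The base R sketch below uses a made-up sample table (the column names `surgeon` and `time`, the level labels, and the balanced layout are assumptions for illustration) to list which coefficients each reduced formula would remove. In DESeq2, testing time while keeping surgeon in both models would correspond to a full design of `~ surgeon + time` and a call like `DESeq(dds, test = "LRT", reduced = ~ surgeon)`.

```r
# Hypothetical sample table: 3 time points x 8 replicates, 2 surgeons.
# The labels and the balanced layout are assumptions, not the real data.
coldata <- data.frame(
  surgeon = factor(rep(c("Bruno", "Lionel"), times = 12)),
  time    = factor(rep(c("t0", "t1", "t2"), each = 8))
)

# Design matrices for the full model and the two candidate reduced models.
full      <- model.matrix(~ surgeon + time, coldata)
reduced_s <- model.matrix(~ surgeon, coldata)
reduced_1 <- model.matrix(~ 1, coldata)

# Coefficients removed by each reduced model, i.e. what the LRT would test.
tested_vs_surgeon   <- setdiff(colnames(full), colnames(reduced_s))
tested_vs_intercept <- setdiff(colnames(full), colnames(reduced_1))

print(tested_vs_surgeon)    # only the time coefficients
print(tested_vs_intercept)  # surgeon and time coefficients together
```

With `~ 1` as the reduced model, a small p-value could come from the surgeon term as well as from time, which is why the reduced formula usually keeps every nuisance term and drops only the variable of interest.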
Hello again Michael,
I wanted to include a batch effect in my design: the 32 samples were distributed among 13 flow cell pools during RNA sequencing, so I assigned each sample to the pool it was placed in, from p1 to p13. With this term in the design I got only about half as many genes with padj < 0.05 as in the run without the batch effect. In resultsNames, every pool is compared against p1 (output below). I'm not sure this should be happening, since p1 is not a control!
resultsNames(dds)
[1] "Intercept" "surgeon_Lionel_vs_Bruno" "replicate"
[4] "batch_p10_vs_p1" "batch_p11_vs_p1" "batch_p12_vs_p1"
[7] "batch_p13_vs_p1" "batch_p2_vs_p1" "batch_p3_vs_p1"
[10] "batch_p4_vs_p1" "batch_p5_vs_p1" "batch_p6_vs_p1"
[13] "batch_p7_vs_p1" "batch_p8_vs_p1" "batch_p9_vs_p1"
[16] "time_t1_vs_t0" "time_t2_vs_t0" "time_t3_vs_t0"
Is there anything I'm missing?
Thank you,
Sally
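For context on the `_vs_p1` names: R orders factor levels alphabetically unless told otherwise, and the default treatment coding compares every other level with the first one, so `p1` acts as a reference only because of its position, not because it was declared a control. The base R sketch below uses made-up pool labels (the `batch` and `replicate` vectors are illustrative, not the real sample table). It also shows that a numeric column yields a single coefficient, which may be why `replicate` appears on its own in the output above.

```r
# Illustrative pool labels p1..p13 (not the real sample table).
batch <- factor(paste0("p", 1:13))

# Levels are sorted alphabetically, so "p1" comes first and becomes the reference.
first_level <- levels(batch)[1]
default_cols <- colnames(model.matrix(~ batch))

# Choosing another reference only changes which pool the others are compared to.
batch_p5 <- relevel(batch, ref = "p5")
releveled_cols <- colnames(model.matrix(~ batch_p5))

# A numeric covariate gives one slope coefficient instead of one per level.
replicate <- rep(1:8, times = 4)
numeric_cols <- colnames(model.matrix(~ replicate))

print(first_level)
print(default_cols)
print(releveled_cols)
print(numeric_cols)
```

Changing the reference level renames the batch coefficients but does not change the fit of the model, so a test on the time term is unaffected by which pool is listed first.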
Can you make a PCA plot where you color by the flow cell pools? In my experience, the sequencing batch doesn't influence the measurements that much (library preparation batch, in contrast, can have a large influence). If you don't see much grouping by flow cell pool, you could leave it out of the design. If you are still concerned about unwanted variation, you could instead use a method like svaseq or RUVSeq to estimate, say, 1-3 factors of unwanted variation, and then include these in the design. See here:
http://www.bioconductor.org/help/workflows/rnaseqGene/#removing-hidden-batch-effects
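As a rough sketch of the PCA check, the base R code below runs a PCA on log-transformed counts and colors the samples by pool. The counts are simulated and the pool assignment is made up, so everything here is a placeholder for the real data. With a DESeq2 object, the more usual route would be something like `vsd <- vst(dds, blind = FALSE)` followed by `plotPCA(vsd, intgroup = "batch")`, assuming the pool column is named `batch`.

```r
set.seed(1)

# Simulated stand-in data: 200 genes x 32 samples, spread over 13 pools.
n_genes   <- 200
n_samples <- 32
pool   <- factor(paste0("p", rep(1:13, length.out = n_samples)))
counts <- matrix(rnbinom(n_genes * n_samples, size = 10, mu = 100),
                 nrow = n_genes)

# Simple log transform; vst() or rlog() from DESeq2 would be preferable on real data.
log_counts <- log2(counts + 1)

# PCA with samples as rows.
pca <- prcomp(t(log_counts))
pct_var <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# Draw the first two components, one color per pool (only in an interactive session).
if (interactive()) {
  plot(pca$x[, 1], pca$x[, 2],
       col = rainbow(nlevels(pool))[as.integer(pool)], pch = 19,
       xlab = paste0("PC1 (", pct_var[1], "%)"),
       ylab = paste0("PC2 (", pct_var[2], "%)"))
}
```

If samples from the same pool do not cluster together on the first few components, that would be consistent with the flow cell pool having little effect.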