Dear all,
I am using ballgown to quantify differential gene expression in a data set after running the hisat2 + stringtie pipeline. I have read other questions about technical replicates in DE analyses, but I still have doubts about how to handle them and about the purpose of the adjustvars option in ballgown.
My phenotype file is as follows:
"ids", "hpi", "exp", "rep"
"CoffeR1C24", 24, "C", 1
"CoffeR1C48", 48, "C", 2
"CoffeR1C72", 72, "C", 3
"CoffeR1Q24", 24, "Q", 4
"CoffeR1Q48", 48, "Q", 5
"CoffeR1Q72", 72, "Q", 6
"CoffeR2C24", 24, "C", 1
"CoffeR2C48", 48, "C", 2
"CoffeR2C72", 72, "C", 3
"CoffeR2Q24", 24, "Q", 4
"CoffeR2Q48", 48, "Q", 5
"CoffeR2Q72", 72, "Q", 6
"CoffeS1C24", 24, "C", 7
"CoffeS1C48", 48, "C", 8
"CoffeS1C72", 72, "C", 9
"CoffeS1Q24", 24, "Q", 10
"CoffeS1Q48", 48, "Q", 11
"CoffeS1Q72", 72, "Q", 12
"CoffeS2C24", 24, "C", 7
"CoffeS2C48", 48, "C", 8
"CoffeS2C72", 72, "C", 9
"CoffeS2Q24", 24, "Q", 10
"CoffeS2Q48", 48, "Q", 11
"CoffeS2Q72", 72, "Q", 12
I want to test for differentially expressed genes between the two "exp" conditions ('C' and 'Q'). Each sample has a technical replicate (samples sharing the same value in the "rep" column), and the experimental conditions were assessed at multiple time points ("hpi" column).
From what I have gathered from other questions, the best way to deal with technical replicates is to average the expression measurements across them. Is that correct?
And since each condition was measured at 24, 48 and 72 h, should "hpi" be treated as a potential confounding factor when testing between the two experimental conditions, and therefore be specified in the adjustvars option? Or have I misunderstood what adjustvars is for?
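To make clear what I mean by averaging, here is a minimal sketch in plain Python (not ballgown itself). It uses four rows of my phenotype table; the gene name and the expression values are made up purely for illustration, and `average_replicates` is a hypothetical helper, not a ballgown function.

```python
from collections import defaultdict
from statistics import mean

# Four rows of the phenotype table above: sample id -> "rep" value.
rep_of = {
    "CoffeR1C24": 1, "CoffeR2C24": 1,
    "CoffeR1Q24": 4, "CoffeR2Q24": 4,
}

# HYPOTHETICAL expression values (e.g. FPKM) for one made-up gene; not real data.
expr = {
    "geneA": {"CoffeR1C24": 10.0, "CoffeR2C24": 12.0,
              "CoffeR1Q24": 30.0, "CoffeR2Q24": 34.0},
}

def average_replicates(expr, rep_of):
    """Collapse samples sharing a "rep" value into their mean, gene by gene."""
    averaged = {}
    for gene, values in expr.items():
        by_rep = defaultdict(list)
        for sample, value in values.items():
            by_rep[rep_of[sample]].append(value)
        averaged[gene] = {rep: mean(vals) for rep, vals in by_rep.items()}
    return averaged

averaged = average_replicates(expr, rep_of)
print(averaged)  # {'geneA': {1: 11.0, 4: 32.0}}
```

After this step there would be one column per "rep" value (12 in my case) instead of one per sequenced library (24).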
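In case it helps with the second question, here is a quick cross-tabulation of "exp" against "hpi" for my design, again in plain Python just to count samples. The sample list is rebuilt from the naming pattern in my table (the variable name `prefix` is only my label for the "R1"/"R2"/"S1"/"S2" part of the ids).

```python
from collections import Counter
from itertools import product

# Rebuild the 24 samples from the id pattern in the phenotype table,
# e.g. "CoffeR1C24" -> prefix "R1", exp "C", hpi 24.
design = list(product(("R1", "R2", "S1", "S2"), ("C", "Q"), (24, 48, 72)))

# Count how many samples fall in each (exp, hpi) cell.
counts = Counter((exp, hpi) for _prefix, exp, hpi in design)

# The design is balanced if every cell holds the same number of samples.
balanced = len(set(counts.values())) == 1

for exp in ("C", "Q"):
    print(exp, [counts[(exp, hpi)] for hpi in (24, 48, 72)])
print("balanced:", balanced)
```

If I read this correctly, each "exp" level has the same number of samples at every time point, so "hpi" is not tangled up with "exp" in the design itself. What I am unsure about is whether it should still go into adjustvars to account for expression changes over time.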
Thank you very much for your time and attention.
Cheers,
Diogo