Hi,
I have host and pathogen RNA-seq data generated from the same individuals, and I have performed differential gene expression analysis (DGEA) on both datasets to identify the DE genes between two disease groups. I would now like to find out which host DE genes are correlated with which pathogen DE genes. I thought a simple Pearson correlation would suffice, but I would like to adjust for several covariates of interest (e.g. age).
I'm thinking a straightforward way is to do as follows:
    design <- model.matrix(~1+pathogen_gene_exp+Age)
    disp <- estimateDisp(y, design, robust = TRUE)
    fit <- glmFit(disp, design, robust = TRUE)
    lrt <- glmLRT(fit, coef = 2)
    topTags(lrt)
This would test the expression of one pathogen gene against all host genes. However, I can't think of a good way to represent the expression of the pathogen gene. Under edgeR, would it be acceptable in this scenario to represent the pathogen gene's expression as logCPM? Or is there another (better or more straightforward) way to perform co-expression analysis with adjustment for covariates? Thank you.
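To make the question concrete, here is a minimal sketch of what I mean by using logCPM as the covariate. The counts and ages are simulated placeholders, and the names host_counts, path_counts and path1 are hypothetical. I am assuming that cpm(log = TRUE) on a separately normalised pathogen DGEList is a reasonable representation, which is exactly the part I am unsure about.

```r
library(edgeR)

set.seed(1)
n_samples <- 12

# Simulated stand-ins for the real count matrices (hypothetical data)
host_counts <- matrix(rnbinom(200 * n_samples, mu = 100, size = 5),
                      nrow = 200,
                      dimnames = list(paste0("host", 1:200),
                                      paste0("S", 1:n_samples)))
path_counts <- matrix(rnbinom(50 * n_samples, mu = 50, size = 5),
                      nrow = 50,
                      dimnames = list(paste0("path", 1:50),
                                      colnames(host_counts)))
Age <- round(runif(n_samples, 20, 70))

# Pathogen side: normalise within the pathogen libraries, then take logCPM
# (assumption: log2 CPM with a small prior count is an acceptable covariate)
yp <- DGEList(path_counts)
yp <- calcNormFactors(yp)
path_logcpm <- cpm(yp, log = TRUE, prior.count = 2)
pathogen_gene_exp <- path_logcpm["path1", ]  # one pathogen gene at a time

# Host side: same model as above, with the pathogen logCPM as a numeric covariate
y <- DGEList(host_counts)
y <- calcNormFactors(y)
design <- model.matrix(~ 1 + pathogen_gene_exp + Age)
y <- estimateDisp(y, design, robust = TRUE)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = 2)  # coefficient 2 is the pathogen gene
topTags(lrt)
```

If I understand the model correctly, the logFC reported for coefficient 2 would then be the log2 change in host expression per unit of pathogen logCPM (i.e. per doubling of the pathogen gene's CPM), adjusted for age. Repeating this for each pathogen DE gene would require a multiple-testing correction across all of the fits.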
Thank you for your answer. I tried your approach and it works quite well.