I have been using voom+limma+sva to analyse my leukaemia RNA-Seq data. Based on some clustering analysis and prior knowledge of the data set, I am now trying to compare one group of samples against the rest of the data set. I used fgseaMultilevel() from the fgsea package to run a pre-ranked GSEA, with log(fold change) as the ranking metric.
However, as this is RNA-Seq data, I would actually prefer to use SeqGSEA. The problem is that there are no clear instructions on how to include the surrogate variables in that pipeline. runDESeq(), which performs DESeq::estimateSizeFactors and DESeq::estimateDispersions, is the step where I think the sva results could go in, but I am not sure how to do it. runDESeq() takes an argument called label, which gives the control vs condition labels, and I do not see where additional covariates could be passed. So even if I just run DESeq() myself, I don't know how to integrate the outcome into the pipeline.
So my questions are:
1. How do I include sva results in SeqGSEA?
2. Is there a way to use the voom transformation instead? I know its output is compatible with most microarray pipelines. Does that mean I can just take the voom output, calculate the S2N (signal-to-noise) ranking and use pre-ranked GSEA?
3. Are there other suggestions for gene set analysis/pathway analysis of RNA-Seq data?
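To be explicit about what I mean in question 2: by S2N I mean the signal-to-noise ranking metric used in the original GSEA software, which, as far as I understand it, is computed per gene g as

$$
\mathrm{S2N}(g) = \frac{\mu_A(g) - \mu_B(g)}{\sigma_A(g) + \sigma_B(g)}
$$

where \mu and \sigma would be the mean and standard deviation of the voom-transformed expression of gene g in the group of interest (A) and in the remaining samples (B). The labels A and B are just my own notation here. Note that this formula on its own does not adjust for the surrogate variables, which is part of why I am unsure whether this approach is appropriate.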
I would be very grateful if anyone could offer some advice. Thanks a lot.