Hi
I would like to ask how to run swish with a design similar to the blocking setup in limma.
# For example, in the design below, the test needed is Disease vs Normal in Condition,
# but multiple samples were taken from the same individual.
# I thought I could treat Patient as a secondary covariate,
# but it has more than two levels, so swish will not run it.
   Patient Condition
1        1   Disease
2        1   Disease
3        1   Disease
4        2   Disease
5        2   Disease
6        2   Disease
7        3   Disease
8        3   Disease
9        3   Disease
10       4    Normal
11       4    Normal
12       4    Normal
13       5    Normal
14       5    Normal
15       5    Normal
16       6    Normal
17       6    Normal
18       6    Normal
Thank you very much
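As a side note on why this design calls for blocking rather than an extra covariate: each patient appears in only one condition, so Patient is nested within Condition. A minimal base R sketch of the design above (the object names `coldata` and `design` are just illustrative) shows that a fixed-effects model with both terms is not full rank, whatever tool is used:

```r
# Rebuild the sample table from the post: 6 patients, 3 samples each.
coldata <- data.frame(
  Patient   = factor(rep(1:6, each = 3)),
  Condition = factor(rep(c("Disease", "Normal"), each = 9))
)

# Each patient falls in exactly one condition (nested design).
table(coldata$Patient, coldata$Condition)

# A fixed-effects design with both terms has more columns than its rank,
# because the Condition column is a sum of Patient indicator columns.
design <- model.matrix(~ Patient + Condition, data = coldata)
ncol(design)     # number of coefficients requested
qr(design)$rank  # number that can actually be estimated
```

Since the Condition effect cannot be separated from the per-patient fixed effects, the patient has to enter the model as a random effect or block instead.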
Thank you very much for the reply. I ended up feeding multiple fastq files from the same patient to salmon and using a large number of Gibbs samples (200). I am not sure how that compares with limma's duplicateCorrelation, but I will run both and compare.
There's an important modeling point here: are samples 1-3 biological replicates (different samples from the same patient), or just 3 sequencing runs of the same library?
They are different samples collected from the same patient, hence different libraries. Reading your swish paper, I was impressed by the large performance advantage of swish over limma in terms of sensitivity and FDR control. I am curious whether limma would catch up with swish once duplicateCorrelation is used. It might depend on the variance among the samples from the same patient. If they are very similar, is it a good idea to treat them as technical replicates and use swish? Thank you again for the comments.
Modeling the random effect is very important and would take priority in my opinion, which argues for duplicateCorrelation. That holds if these samples are substantially different from each other, and not just Poisson variation that you could sum to create a single new sample per patient.
Are you doing isoform-level analysis? We found that Swish had strong control of FDR across the range of uncertainty that is present with isoforms, and we could do this without filtering out lowly expressed features.
For the simulations in that paper, I think the gain in sensitivity for all the other methods (EBSeq, SAMseq, sleuth, DESeq2, swish) over limma was due to feature filtering.
If you are doing gene-level analysis, then this extra work to model uncertainty is less important.
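For reference, the "sum to a single sample per patient" option mentioned above would only apply if the three runs were technical replicates of one library, which is not the case in this thread. A minimal base R sketch with made-up Poisson counts (the names `runs`, `counts` and `summed` are hypothetical):

```r
set.seed(1)

# Hypothetical layout: 18 sequencing runs, 3 per patient.
runs <- data.frame(Patient = factor(rep(1:6, each = 3)))

# Toy count matrix (5 transcripts x 18 runs), for illustration only.
counts <- matrix(
  rpois(5 * 18, lambda = 50), nrow = 5,
  dimnames = list(paste0("tx", 1:5), paste0("run", 1:18))
)

# rowsum() adds up rows by group, so transpose to sum runs within patient,
# then transpose back to transcripts x patients.
summed <- t(rowsum(t(counts), group = runs$Patient))
dim(summed)
```

This sketch only handles a plain count matrix. Inferential replicates from salmon are not covered here, and merging the fastq files before quantification (as described earlier in the thread) sidesteps that.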
OK, thanks for the insight about the filtering for isoforms and about the paper. That is really helpful. I do plan to do transcript-level analysis, both DTE and DTU.
As a pointer for future analysis, we did develop a DTU analysis within Swish, by converting the counts to isoform proportions:
https://mikelove.github.io/fishpond/articles/swish.html#differential-transcript-usage
But for this analysis, I think you'd need to use duplicateCorrelation().
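A rough sketch of what the duplicateCorrelation() route could look like for this design, using limma's voom() on simulated counts. Everything about the data here is an assumption for illustration (the simulated patient effect, `n_genes`, the object names), and duplicateCorrelation() needs the statmod package to be installed. Running voom and duplicateCorrelation twice is a commonly used pattern, not something specified in this thread:

```r
library(limma)

set.seed(1)
coldata <- data.frame(
  Patient   = factor(rep(1:6, each = 3)),
  Condition = factor(rep(c("Disease", "Normal"), each = 9),
                     levels = c("Normal", "Disease"))
)

# Simulated counts with a per-patient random effect (illustration only).
n_genes <- 500
base_mu <- 2^runif(n_genes, 4, 10)
patient_eff <- matrix(rnorm(n_genes * 6, sd = 0.5), nrow = n_genes)
mu <- base_mu * exp(patient_eff[, as.integer(coldata$Patient)])
counts <- matrix(rnbinom(n_genes * 18, mu = mu, size = 20), nrow = n_genes,
                 dimnames = list(paste0("tx", seq_len(n_genes)),
                                 paste0("s", 1:18)))

# Only Condition goes in the design; Patient enters as the block.
design <- model.matrix(~ Condition, data = coldata)

# First pass: voom weights, then estimate the within-patient correlation.
v <- voom(counts, design)
corfit <- duplicateCorrelation(v, design, block = coldata$Patient)

# Second pass: recompute weights and correlation given the blocking.
v <- voom(counts, design, block = coldata$Patient,
          correlation = corfit$consensus.correlation)
corfit <- duplicateCorrelation(v, design, block = coldata$Patient)

fit <- lmFit(v, design, block = coldata$Patient,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
topTable(fit, coef = "ConditionDisease", number = 5)
```

With real data, `counts` would be replaced by the transcript-level count matrix, usually after library size normalization and filtering.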
Understood.