Question

Salmon, RUVSeq and DESeq2

0

reganhayward • 0

@reganhayward-10620

Last seen 3.8 years ago

New Zealand

I'm trying to identify differentially expressed genes. I quantified with Salmon and imported the results with tximport as outlined here. It's a simple comparison of infected versus non-infected samples, and I've tried three differential expression packages (edgeR, DESeq2 and limma).

However, in one of the conditions (there are only 2 reps), one rep's gene expression is unfortunately a bit different from the other's. As a result, fewer genes are called DE, which makes any biological interpretation challenging. To help with this, I'd like to use RUVSeq (RUVs), but I'm not entirely sure how to use the offsets correctly.

I've looked at the offsets section of the EDASeq vignette, which is slightly different but related, and from what I can see I should in theory be able to use the offsets as input to RUVSeq. I just don't want to calculate this incorrectly and end up misinterpreting the data.

Any ideas or suggestions would be greatly appreciated!

Salmon DESeq2 DEG RUVSeq • 5.8k views

ADD COMMENT • link 3.9 years ago reganhayward • 0

score 2 · Accepted Answer · 2021-06-16

2

Michael Love 43k

@mikelove

Last seen 9 days ago

United States

You could use this paradigm:

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#using-ruv-with-deseq2

Here you estimate factors of unwanted variation and supply these to the design as nuisance covariates.
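A minimal sketch of that paradigm using RUVs, under some loud assumptions: the data are simulated with `makeExampleDESeqDataSet()` as a stand-in for a tximport-derived `dds`, all genes are used as controls, and `k = 1` is an arbitrary choice. None of this comes from the original post, so treat it as an illustration rather than the recommended settings.

```r
library(DESeq2)
library(RUVSeq)  # also attaches EDASeq

set.seed(1)
# Assumption: simulated data (conditions A and B, 3 reps each) standing in
# for a DESeqDataSet built from Salmon/tximport output
dds <- makeExampleDESeqDataSet(n = 2000, m = 6)

# Raw counts go into a SeqExpressionSet; drop low-count genes
set <- newSeqExpressionSet(counts(dds))
keep <- rowSums(counts(set) > 5) >= 2
set <- set[keep, ]

# Upper-quartile normalization to account for sequencing depth
set <- betweenLaneNormalization(set, which = "upper")

# Replicate groups: one row per condition, columns are sample indices
groups <- makeGroups(dds$condition)

# Assumption: every retained gene is used as a control, and k = 1
set_ruv <- RUVs(set, cIdx = rownames(set), k = 1, scIdx = groups)

# Supply the estimated factor to the design as a nuisance covariate
ddsruv <- dds
ddsruv$W1 <- set_ruv$W_1
design(ddsruv) <- ~ W1 + condition

ddsruv <- DESeq(ddsruv)
res <- results(ddsruv, contrast = c("condition", "B", "A"))
```

The factor is estimated on the filtered set but added to the full `dds`, which works because the samples (columns) are unchanged by gene filtering.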

ADD COMMENT • link 3.9 years ago Michael Love 43k

0

Thanks for the reply above, Mike - that's a cool idea, which I'll use.

However, for a different experimental design, this probably isn't the best way, right? For example:

No. tissue      infection       rep condition
1   Bladder     Bacteria_A      1   Bladder_bact_A
2   Bladder     Bacteria_A      2   Bladder_bact_A
3   Bladder     Bacteria_B      1   Bladder_bact_B
4   Bladder     Bacteria_B      2   Bladder_bact_B
5   Bladder     Mock-infected   1   Bladder_mock
6   Bladder     Mock-infected   2   Bladder_mock
7   Kidney      Bacteria_A      1   Kidney_bact_A
8   Kidney      Bacteria_A      2   Kidney_bact_A
9   Kidney      Bacteria_B      1   Kidney_bact_B
10  Kidney      Bacteria_B      2   Kidney_bact_B
11  Kidney      Mock-infected   1   Kidney_mock
12  Kidney      Mock-infected   2   Kidney_mock

How does this look if I were to apply RUVs?

library(DESeq2)
library(RUVSeq)

# One row per condition; the columns hold the sample indices of its replicates
differences <- makeGroups(c("A","A","B","B","C","C","D","D","E","E","F","F"))
differences

set <- newSeqExpressionSet(counts(dds))
# set <- newSeqExpressionSet(counts(dds, normalized = TRUE)) #Does this line make more sense to use?

idx  <- rowSums(counts(set) > 5) >= 2
set  <- set[idx, ]

# 'genes' (the control genes passed to RUVs) is not defined in this snippet
set3 <- RUVs(set, genes, k=1, differences)

ddsruv <- dds
ddsruv$W1 <- set3$W_1 # RUVs on a SeqExpressionSet stores the factor in pData as W_1
design(ddsruv) <- ~ W1 + condition

dds2 <- DESeq(ddsruv)
res <- results(dds2, contrast=c("condition","Bladder_bact_A","Bladder_mock")) #this doesn't incorporate the RUV factors right

So, I'd need to create a model.matrix (m1 below) as outlined here and run something like this?

dds2 <- DESeq(ddsruv, test="LRT", betaPrior=FALSE, full=m1, reduced=~0 + W1 + condition)
res1 <- results(dds2, contrast=c("condition","Bladder_bact_A","Bladder_mock")) 
res2 <- results(dds2, contrast=c("condition","Bladder_bact_B","Bladder_mock"))

Thanks in advance, and apologies for all the nested questions.

ADD REPLY • link 3.9 years ago reganhayward • 0

0

Why do you say the DESeq() and results() lines don't incorporate RUV factors? They do. Putting a covariate into the design is "controlling for" that covariate when estimating LFC for others.
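To illustrate the point, here is a hedged sketch with a six-level `condition` like the design in the question. The counts are simulated and `W1` is just random noise standing in for an RUV factor (both are assumptions made for illustration). With `~ W1 + condition`, `W1` gets its own coefficient, and each `condition` contrast is estimated with `W1` in the model.

```r
library(DESeq2)

set.seed(2)
# Assumption: simulated counts for 12 samples (6 conditions x 2 reps)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds$condition <- factor(rep(c("Bladder_bact_A", "Bladder_bact_B", "Bladder_mock",
                              "Kidney_bact_A", "Kidney_bact_B", "Kidney_mock"),
                            each = 2))

# Hypothetical nuisance factor; in practice this would come from RUVs
dds$W1 <- rnorm(12)

design(dds) <- ~ W1 + condition
dds <- DESeq(dds)

# W1 is listed among the fitted coefficients
resultsNames(dds)

# LFCs for these contrasts are estimated while controlling for W1
res_A <- results(dds, contrast = c("condition", "Bladder_bact_A", "Bladder_mock"))
res_B <- results(dds, contrast = c("condition", "Bladder_bact_B", "Bladder_mock"))
```

No separate model matrix or likelihood ratio test is needed for this: the default Wald test with a `contrast` already uses the full design, including `W1`.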

ADD REPLY • link 3.9 years ago Michael Love 43k

0

Ahh, that was me misunderstanding how the RUV factors were being incorporated into the design - clearly they are. Thanks for clarifying!

One last bit of clarification: which of these would you suggest as input to RUVs? I ask because I saw in the link you supplied that normalised values were passed when using svaseq.

set <- newSeqExpressionSet(counts(dds)) 
set <- newSeqExpressionSet(counts(dds, normalized = TRUE))

ADD REPLY • link 3.9 years ago reganhayward • 0

0

The SVA workflow doesn't have any steps to account for sequencing depth, hence we recommend supplying the scaled counts to SVA.

However, RUVSeq does have steps to deal with sequencing depth, see vignette section 2.1, the fourth code chunk. So if you are following the RUVSeq workflow including that step you can provide original counts, not scaled counts.
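As a rough sketch of what that depth-handling step does, assuming the step referred to is upper-quartile normalization via `betweenLaneNormalization()` (that reading is an assumption, as are the simulated counts and sample names below):

```r
library(RUVSeq)  # also attaches EDASeq

set.seed(3)
# Hypothetical raw counts: 1000 genes x 4 samples with unequal depth
depth <- c(1, 2, 0.5, 1.5)
raw <- sapply(depth, function(d) rnbinom(1000, mu = 100 * d, size = 5))
dimnames(raw) <- list(paste0("gene", 1:1000), paste0("sample", 1:4))

# Original (unscaled) counts go in
set <- newSeqExpressionSet(raw)

# Upper-quartile normalization; the raw counts are kept and the
# depth-adjusted values are stored alongside them
set <- betweenLaneNormalization(set, which = "upper")

# Per-sample upper quartiles before and after normalization
uq <- function(m) apply(m, 2, quantile, probs = 0.75)
uq(counts(set))
uq(normCounts(set))
```

Because the normalization happens inside the RUVSeq/EDASeq workflow, passing `counts(dds, normalized = TRUE)` as well would scale for depth twice.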