Question

RUV: removing unwanted variation in RNA-seq

sergio.espeso-gil • 0

@sergioespeso-gil-6997

Last seen 5.0 years ago

New York

Hi,

I am getting started with RUVSeq and have a few questions, plus some errors I would like to understand and solve.

First (sorry if this is a really naive question), I would like to know whether it is possible to introduce ERCC spike-ins a posteriori in our data. I guess the answer is no, but maybe you have tried it or have some suggestions. We have mouse RNA-seq data.

I am trying the other two alternatives and get the following error messages:

For RUVs:

> differences <- matrix(data=c(1:2, 3:4), byrow=TRUE, nrow=2)
> differences
     [,1] [,2]
[1,]    1    2
[2,]    3    4
> set3 <- RUVs(set, genes, k=1, differences)
Error in solve.default(a[, cIdx, drop = FALSE] %*% t(a[, cIdx, drop = FALSE]), :
  no right-hand side in 'b'

I guess the problem is the number of replicates (only 2), but I don't know how to set it up correctly.

For RUVr I think I have the same problem:

> set4 <- RUVr(set, genes, k=1, res)
Error in svd(E[, cIdx]) : a dimension is zero

I have been trying to set "cIdx" by reading the reference manual, but I couldn't get it to work.

Thanks in advance. Best,

S.
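
For readers setting up RUVs: the `differences` matrix above has one row per condition, and each row lists the column indices of that condition's replicates. A minimal base R sketch of building it from a condition vector follows. The helper name `replicate_matrix` and the condition labels are hypothetical, and padding unequal groups with -1 is an assumption modelled on RUVSeq's own `makeGroups()` helper, which is the function to prefer in practice.

```r
# Hypothetical helper: one row per condition, entries are sample (column) indices.
replicate_matrix <- function(conditions) {
  # Keep conditions in order of first appearance.
  conditions <- factor(conditions, levels = unique(conditions))
  idx <- split(seq_along(conditions), conditions)
  width <- max(lengths(idx))
  # Assumption: pad shorter groups with -1 when group sizes differ.
  out <- matrix(-1, nrow = length(idx), ncol = width)
  for (i in seq_along(idx)) {
    out[i, seq_along(idx[[i]])] <- idx[[i]]
  }
  out
}

# Assumed layout: two conditions with two replicates each.
x <- c("WNN", "WNN", "WEN", "WEN")
differences <- replicate_matrix(x)
differences
```

With two replicates per condition this gives the same 2 x 2 matrix as the hand-written one in the question, which suggests the replicate layout itself is not what triggers the error.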

RUV ruvseq rnaseq • 3.1k views

ADD COMMENT • link 9.8 years ago sergio.espeso-gil • 0

score 0 · Answer 1 · 2015-05-07

Hi,

I recently started using RUVSeq too.

I'm not sure I understand your first question correctly, but the ERCC spike-ins have to be added to the RNA sample before sequencing.

You can try using housekeeping genes as controls.

As for your error messages, could you tell us how you defined your 'set' and 'genes' objects?

score 0 · Answer 2 · 2015-05-07

Hi Marie,

Yep, I know that the spike-ins need to be added beforehand, but since we didn't, I was wondering whether it is possible to introduce them afterwards, taking sequencing depth into account. Maybe it is a slightly crazy idea, but if housekeeping genes do not change much across samples, I guess it would not be too difficult to predict their counts. It would certainly be quite artificial, but I was wondering whether anyone has tried it. Sorry if it sounds a bit crazy.

'set' and 'genes' are defined as:

set<-newSeqExpressionSet(as.matrix(filtered),
phenoData=data.frame(x,row.names=colnames(filtered)))

genes<-rownames(filtered)[grep("ˆENS",rownames(filtered))]

Thanks a lot Marie,
Sergio

ADD COMMENT • link 9.8 years ago sergio.espeso-gil • 0

Even though you solved it in a different way, I just want to add one comment, as I think many people run into this.

When copying and pasting from the vignette's PDF (at least on a Mac, but I suspect Windows behaves similarly), the "^" symbol in "^ENS" does not come through as the ASCII caret that regular expressions use to mean "starts with" (I guess it is a Unicode look-alike). Hence, the pattern won't match in R. If you manually retype the "^" symbol, your code should work.
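
To illustrate the point about the pasted caret, here is a small base R sketch. It assumes the character that arrives from the PDF is U+02C6 (modifier letter circumflex), which is what the pasted code in this thread appears to contain; other PDF viewers may substitute a different look-alike. The gene IDs are copied from this thread.

```r
# Row names like those in the thread's count table.
ids <- c("ENSMUSG00000051951", "ENSMUSG00000025902", "ENSMUSG00000033845")

# Assumption: the pasted character is U+02C6, a look-alike, not the ASCII caret.
pasted_pattern <- "\u02c6ENS"
typed_pattern  <- "^ENS"

# The look-alike is matched as a literal character, so no ID matches.
bad <- grep(pasted_pattern, ids)
# The ASCII caret anchors the match at the start of each string.
good <- grep(typed_pattern, ids)

length(bad)   # 0
length(good)  # 3

# Comparing code points makes the difference visible.
utf8ToInt(substr(pasted_pattern, 1, 1))  # 710
utf8ToInt("^")                           # 94
```

Checking `length(genes) > 0` right after the `grep()` call is a cheap way to catch this before the control set is passed on.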

ADD REPLY • link 9.8 years ago davide risso ▴ 980

Oh! Thanks, Davide!

I didn't know that! Yep, I am using a Mac. I will check.

ADD REPLY • link 9.8 years ago sergio.espeso-gil • 0

score 0 · Answer 3 · 2015-05-07

Marie • 0

@marie-7720

Last seen 9.8 years ago

France

Can you show the result of "head(genes)", please?

Are you sure all your gene names start with "ENS"?

ADD COMMENT • link 9.8 years ago Marie • 0

score 0 · Answer 4 · 2015-05-07

Ok, I see the problem...

> genes<-rownames(filtered)[grep("ˆENS",rownames(filtered))]
> head(genes)
character(0)
> head(filtered)
                   WNN1 WNN2 WEN1 WEN2
ENSMUSG00000051951 2649 1290 2772 1660
ENSMUSG00000025902  262  555  275  498
ENSMUSG00000033845 1393 1388 1374 1616
ENSMUSG00000025903 1826 2029 1954 2231
ENSMUSG00000033813 4532 4441 4666 5300
ENSMUSG00000002459 1734 1817 1870 2332
> tail(filtered)
                   WNN1 WNN2 WEN1 WEN2
ENSMUSG00000081137  419  574  493  571
ENSMUSG00000035299  953 1550 1297 1687
ENSMUSG00000072844  559  689  597  977
ENSMUSG00000087263   83  120   68  153
ENSMUSG00000086695   10   21   14   35
ENSMUSG00000069053   53   46   46   39

How can I solve it? I built the count table with featureCounts instead of HTSeq, and then ran:

EE <- read.table("counts.txt", row.names=1, header=TRUE)
colnames(EE) <- c("Chr", "Start", "End", "Strand", "Length", "WNN1", "WNN2", "WEN1", "WEN2")
EEb <- EE[c('WNN1','WNN2','WEN1','WEN2')]

> head(EEb)
                   WNN1 WNN2 WEN1 WEN2
ENSMUSG00000090025    0    0    0    0
ENSMUSG00000064842    0    0    0    0
ENSMUSG00000051951 2649 1290 2772 1660
ENSMUSG00000089699    0    0    0    0
ENSMUSG00000088390    2    0    0    1
ENSMUSG00000089420    0    0    0    0

Any ideas? Thanks a lot!

Sergio

score 0 · Answer 5 · 2015-05-07

OK, I have now done:

> genes <- rownames(filtered)
> head(genes)
[1] "ENSMUSG00000051951" "ENSMUSG00000025902" "ENSMUSG00000033845"
[4] "ENSMUSG00000025903" "ENSMUSG00000033813" "ENSMUSG00000002459"

Now both RUVr and RUVs work, but I need to check whether the results make sense. I didn't realise that all my gene names already start with ENS.

Thanks a lot Marie!!! Yeah!!!

Sergio
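
The RUVr error earlier in the thread is consistent with an empty control set: when `genes` is `character(0)`, subsetting by it leaves a matrix with a zero dimension, and base R's `svd()` refuses that. Below is a minimal base R sketch of that failure mode plus a simple guard. The matrix `E` is a made-up stand-in and `check_controls` is a hypothetical helper; none of this is RUVSeq's internal code.

```r
# Hypothetical stand-in: 4 samples (rows) by 3 genes (columns).
E <- matrix(1:12, nrow = 4,
            dimnames = list(paste0("S", 1:4),
                            c("ENSMUSG00000051951", "ENSMUSG00000025902",
                              "ENSMUSG00000033845")))

# What grep() returned when the pattern contained the wrong caret.
empty_controls <- character(0)

# Subsetting by an empty index keeps the rows but leaves zero columns.
dim(E[, empty_controls, drop = FALSE])

# svd() stops on a matrix with a zero dimension; capture the error message.
msg <- tryCatch(
  { svd(E[, empty_controls, drop = FALSE]); NA_character_ },
  error = function(e) conditionMessage(e)
)
msg

# A simple guard to run before passing control genes to an RUV function.
check_controls <- function(controls, all_ids) {
  if (length(controls) == 0L) stop("control gene set is empty")
  if (!all(controls %in% all_ids)) stop("some control genes are not in the data")
  invisible(TRUE)
}

check_controls(colnames(E), colnames(E))
```

Running a check like this right after defining `genes` would have pointed at the empty `grep()` result rather than at the number of replicates.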

score 0 · Answer 6 · 2015-05-07

OK, I think you did the right thing!

I'm glad it works now.

Best

ADD COMMENT • link 9.8 years ago Marie • 0

score 0 · Answer 7 · 2015-05-07

Yeah, it worked! I have checked the edgeR results and they look good. Concerning my first naive question (XD), I hadn't realised that a similar strategy is explained in the documentation (sorry, always in a rush): see Section 2.4, "Empirical control genes".

I am happy :D

Thanks a lot Marie!

Sergio

ADD COMMENT • link 9.8 years ago sergio.espeso-gil • 0