Question

Duplication of same sample as if two different samples

A ▴ 40

@a-14337

Last seen 13 months ago

United Kingdom

Hi all,

I really need help with a problem I have come across since receiving a second batch of RNA-seq data, which I have combined with the first batch. Both batches contain the same organs but different biological replicates: for example, replicates 1 and 2 of the lungs are in batch 1, and the third and fourth replicates of the lungs are in batch 2.

The metadata and replicate information are in place, and the column order of the counts table matches the row order of the metadata sheet. All samples are modelled well by DESeq2 with regard to replicates etc., apart from one organ: the small intestine. The small intestine is treated as two separate groups rather than as replicates of the same organ across different ages. The small intestine samples are split according to batch, so that in a PCA plot, for example, one set of small intestine samples gets one colour and the other set gets another, as if they were two different organs.

This is not happening with the other organs: each of them is correctly recognised as a single organ containing different replicates at different ages.

Is this a known issue/bug? Could this result from mistakes in the metadata sheet? I have checked this over and unless I am missing something really obvious, I cannot see any inconsistencies in the metadata table... Any help would be greatly appreciated..

I am also happy to provide code although I don't know what to add as the steps are a standard DESeq2 pipeline!

Many thanks!

deseq2 • 962 views

ADD COMMENT • link updated 6.3 years ago by Michael Love 43k • written 6.3 years ago by A ▴ 40

score 0 · Answer 1 · 2018-09-25

Michael Love 43k

@mikelove

Last seen 1 day ago

United States

I don’t follow what the perceived problem is. Is the problem that the points are separated in the PCA plot?

ADD COMMENT • link 6.3 years ago Michael Love 43k

Hi Michael,

Thank you for the quick response.

No, the points are not separated; they cluster together. The problem is that two colours are assigned for organ, as if there were two separate groups even though they are the same organ. I have 24 samples in total for the small intestine, and they are split into two groups: about 20 named small intestine and another 5 samples also called small intestine but treated as a separate group.

I am really sorry if this is unclear. I am not sure how to share the PCA plot through a link, as I don't know where to upload it.

ADD REPLY • link 6.3 years ago A ▴ 40

Check table(dds$organ) and make sure that there isn’t a typo in the levels. Recent releases of DESeq2 check that there aren’t spaces or stray punctuation (typos) potentially affecting factor levels, but you may have an older version of DESeq2.

ADD REPLY • link 6.3 years ago Michael Love 43k

To be more clear, R won’t tolerate any changes in the exact characters. It doesn’t do any kind of fuzzy clumping of characters into levels. “small intestine” is different than “small intestine ” is different than “small.intestine” etc
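As a rough illustration of the check described above, the sketch below uses a made-up organ vector (an assumption standing in for dds$organ, not the poster's actual metadata) to show how a trailing space creates an extra factor level and how bracketing the levels or counting characters makes it visible:

```r
# Hypothetical labels standing in for dds$organ; the second entry has a trailing space.
organ <- factor(c("Small_Intestine", "Small_Intestine ", "Lung", "Small_Intestine"))

# table() lists both spellings as separate levels, although they look alike when printed.
print(table(organ))

# Wrapping each level in brackets and counting characters exposes the hidden space.
level_report <- data.frame(
  level = sprintf("[%s]", levels(organ)),
  n_chars = nchar(levels(organ)),
  stringsAsFactors = FALSE
)
print(level_report)

# Levels that change when surrounding whitespace is trimmed are the suspects.
suspect_levels <- levels(organ)[levels(organ) != trimws(levels(organ))]
print(suspect_levels)
```

With a real object, the same lines could be run after replacing the made-up vector with the organ column of the sample sheet.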

ADD REPLY • link 6.3 years ago Michael Love 43k

Thank you Michael,

I get the following result:

Small_Intestine...12

Small_Intestine...8

I can only imagine there may be an alteration in the apostrophe, although there is none entered in the metadata sheet...

ADD REPLY • link 6.3 years ago A ▴ 40

There is a small difference somewhere, which you can’t see by eye... R doesn’t make mistakes in comparisons. Just recode from scratch.

Like I said earlier, not sure what version of DESeq2 you are using, but if you used the ones from the past few years, they check if extra spaces are present in the coding of variables and warn the user.
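One possible way to recode, sketched here on a hypothetical sample sheet (the coldata name, its columns and its values are assumptions for illustration), is to trim whitespace from the labels and rebuild the factor before constructing the DESeq2 object:

```r
# Hypothetical sample sheet; in a real analysis this would be the metadata given to DESeq2.
coldata <- data.frame(
  organ = c("Small_Intestine", "Small_Intestine ", "Lung", "Small_Intestine"),
  batch = c("1", "2", "1", "2"),
  stringsAsFactors = FALSE
)

# Strip leading/trailing whitespace, then rebuild the factor so the stray level disappears.
coldata$organ <- factor(trimws(as.character(coldata$organ)))

# The small intestine samples now fall under a single level.
print(table(coldata$organ))
```

Note that trimws() by default only strips spaces, tabs, newlines and carriage returns, so other invisible characters (for example a non-breaking space pasted from a spreadsheet) would need separate handling, or the labels can simply be retyped.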

ADD REPLY • link 6.3 years ago Michael Love 43k

Thanks Michael, I will edit from scratch as you suggest and update DESeq2!

Also, I just realised that 'small intestine' is different to 'small intestine ' with a trailing space! I will edit all of this and report back if it fixes things, for anyone else who might have this problem!

Many thanks

ADD REPLY • link 6.3 years ago A ▴ 40

Solved! Indeed an invisible space in some of the samples! Thanks again!

ADD REPLY • link 6.3 years ago A ▴ 40