Can I run DESeq2 on two different RSE objects downloaded from the recount3 database?
David ▴ 20

Hello everyone in the recount3 and DESeq2 community,

Before asking this question, I read a previous post about DESeq2 and recount3 objects (Counts from recount3 for DESeq2 analysis), but I still have no idea how to do this, or whether it is possible at all.

My question is: is it possible to combine two RSE objects downloaded from recount3 and run a DESeq2 analysis across the two datasets, for example cancer versus GTEx normal tissue, and if so, how? Also, if it is possible, what code tells DESeq2 which two conditions I want to compare? I have not been able to translate what I learnt from the DESeq2 tutorial (http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#summarizedexperiment-input) to what I am doing.

I apologize if my question is not a good one; as a biologist, my coding background is weak. I have attached some details of the datasets below for reference. Thank you all for your help.

Best Regards, David LIU



# Details of the RSE for normal tissue (GTEx) from recount3:

> rse_GTEx_Brain_tissuex
class: RangedSummarizedExperiment 
dim: 63856 224 
metadata(8): time_created recount3_version ... annotation recount3_url
assays(4): raw_counts counts RPKM TPM
rownames(63856): ENSG00000278704.1 ENSG00000277400.1 ...
  ENSG00000182484.15_PAR_Y ENSG00000227159.8_PAR_Y
rowData names(10): source type ... havana_gene tag
colnames(224): GTEX-14LZ3-0011-R10b-SM-6AJA9.1
  GTEX-17HII-0011-R10a-SM-79OMY.1 ... GTEX-1GTWX-0011-R10b-SM-CJI2X.1
  GTEX-1HCU6-0011-R10a-SM-CKZP8.1
colData names(198): rail_id external_id ... recount_seq_qc.errq BigWigURL

# Details of the RSE for a cancer dataset (TCGA) from recount3:

> rse_TCGA_CANCER
class: RangedSummarizedExperiment 
dim: 63856 175 
metadata(8): time_created recount3_version ... annotation recount3_url
assays(3): raw_counts counts TPM
rownames(63856): ENSG00000278704.1 ENSG00000277400.1 ...
  ENSG00000182484.15_PAR_Y ENSG00000227159.8_PAR_Y
rowData names(10): source type ... havana_gene tag
colnames(175): 3b7f80e6-74a7-4da1-ac4b-c260b6be880a
  97a8e2b1-059b-4527-8fb4-116f718ffd16 ...
  2dab6792-f9af-4565-bdd5-c9802f3132f4 c33edb64-c7b9-49a1-ae73-1d293f8522f6
colData names(937): rail_id external_id ... recount_seq_qc.errq BigWigURL

# In the DESeq2 tutorial, as I understand it, there are a few ways to tell DESeq2 which conditions to compare, such as:

library("airway")
data("airway")
se <- airway

library("DESeq2")
ddsSE <- DESeqDataSet(se, design = ~ cell + dex)
ddsSE

#OR

dds$condition <- factor(dds$condition, levels = c("untreated","treated"))

#OR

res <- results(dds, name="condition_treated_vs_untreated")
res <- results(dds, contrast=c("condition","treated","untreated"))

# Is it possible to translate this to a DESeq2 analysis of the recount3 datasets, and if so, how?


ATpoint ★ 4.6k

Technically (in terms of "the code") yes, you could do that. But since these are independent datasets, the batch effect is completely confounded with the condition you test (batch 1 = cancer, batch 2 = normal) and cannot be corrected. It is therefore a largely pointless analysis, because you cannot discriminate between DEGs caused by technical batch differences and real ones. This (combining TCGA and GTEx) has been asked many times before in several fora.
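For the "in terms of the code" part, here is a minimal sketch of one way it could be done. It is not a recommendation to run this comparison, and it does nothing about the confounding described above. The objects `rse_normal` and `rse_cancer` are simulated toy stand-ins (the helper `make_toy_rse()` is hypothetical); with the real data you would substitute `rse_GTEx_Brain_tissuex` and `rse_TCGA_CANCER`. The sketch assumes the `counts` assay of each object holds integer-like read counts suitable for DESeq2. It combines plain count matrices rather than the RSE objects themselves, because the two objects in the question carry different assays and different `colData` columns.

```r
library(SummarizedExperiment)
library(DESeq2)

set.seed(1)

# Hypothetical helper: simulate a small SummarizedExperiment with a "counts"
# assay, standing in for an RSE downloaded from recount3.
make_toy_rse <- function(n_genes, n_samples, prefix) {
  gene_means <- 2^runif(n_genes, min = 4, max = 10)
  counts <- matrix(
    rnbinom(n_genes * n_samples, mu = gene_means, size = 5),
    nrow = n_genes,
    dimnames = list(paste0("gene", seq_len(n_genes)),
                    paste0(prefix, seq_len(n_samples)))
  )
  SummarizedExperiment(assays = list(counts = counts))
}

rse_normal <- make_toy_rse(200, 6, "normal_")  # stand-in for the GTEx object
rse_cancer <- make_toy_rse(200, 5, "cancer_")  # stand-in for the TCGA object

# Keep genes present in both objects, in the same order, and join the counts.
shared_genes <- intersect(rownames(rse_normal), rownames(rse_cancer))
count_matrix <- cbind(
  assay(rse_normal[shared_genes, ], "counts"),
  assay(rse_cancer[shared_genes, ], "counts")
)
storage.mode(count_matrix) <- "integer"

# One row per sample; the first factor level ("normal") is the reference.
sample_info <- data.frame(
  condition = factor(
    rep(c("normal", "cancer"), times = c(ncol(rse_normal), ncol(rse_cancer))),
    levels = c("normal", "cancer")
  ),
  row.names = colnames(count_matrix)
)

dds <- DESeqDataSetFromMatrix(
  countData = count_matrix,
  colData = sample_info,
  design = ~ condition
)
dds <- DESeq(dds)

# Log2 fold changes are reported as cancer relative to normal.
res <- results(dds, contrast = c("condition", "cancer", "normal"))
head(res)
```

The `condition` column in `sample_info` is what tells DESeq2 which samples belong to which group, and the `contrast` argument picks the direction of the comparison.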


Thank you very much for the comment. I had also noticed this problem, and you make a very good point. If I had both normal and cancer samples on hand to work with, I would of course go that route, but normal tissue samples and large tumours are not easy to obtain in reality. Since recount3 processes raw data from different sources through a unified pipeline, using it to get at least some inspiration seems one of the best bets for comparing against observations and results from experimental models such as cell lines.


You will get thousands of DEGs, and I do not see how you would tell which of them are technical artifacts and which are not.
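To illustrate why the two cannot be separated, here is a small base-R sketch with made-up sample labels (four samples per group; all names are illustrative). When every normal sample comes from one batch and every cancer sample from the other, the batch and condition columns of the design matrix are identical, so a model with both terms is not full rank. As far as I know, DESeq2 refuses such a design with a "model matrix is not full rank" error.

```r
# Confounded layout: every normal sample is GTEx, every cancer sample is TCGA.
batch <- factor(rep(c("GTEx", "TCGA"), each = 4))
condition <- factor(rep(c("normal", "cancer"), each = 4),
                    levels = c("normal", "cancer"))
table(batch, condition)

confounded <- model.matrix(~ batch + condition)
rank_confounded <- qr(confounded)$rank
# 3 columns (intercept, batch, condition) but rank 2:
# the batch and condition columns are the same vector.
c(columns = ncol(confounded), rank = rank_confounded)

# Hypothetical crossed layout for contrast: both batches contain both
# conditions, so the batch effect can be estimated separately.
batch_crossed <- factor(rep(c("A", "B"), times = 4))
crossed <- model.matrix(~ batch_crossed + condition)
rank_crossed <- qr(crossed)$rank
c(columns = ncol(crossed), rank = rank_crossed)
```

Dropping `batch` from the design makes the model fit, but the condition coefficient then absorbs every difference between the two projects, technical or biological.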
