Mapping DESeq2 output back to chromosomal positions
rbronste ▴ 60
@rbronste-12189

I exported a count matrix from DiffBind to use as input to DESeq2. I removed the chromosome positions from the count matrix, leaving only the first column as a reference key, followed by the counts for all replicates:

dds <- DESeqDataSetFromMatrix(countData = countData,
                              colData = colData,
                              design = ~ sex + treatment)

Following this, I ran DESeq2 and wrote out the results as follows:

write.csv(as.data.frame(results), file="results.csv")

This list is of course filtered by LFC and adjusted p-value, so I was wondering how I can map it back to the original matrix output from DiffBind to recover the chromosome positions, since the results contain only the key.

Thanks!

 


I guess another way to look at it is: how do I recover the associated countData from the following results?

head(results)
log2 fold change (MLE): +1,-0.333333333333333,-0.333333333333333,-0.333333333333333 
Wald test p-value: +1,-0.333333333333333,-0.333333333333333,-0.333333333333333 
DataFrame with 6 rows and 6 columns
        baseMean log2FoldChange     lfcSE      stat    pvalue      padj
       <numeric>      <numeric> <numeric> <numeric> <numeric> <numeric>
440385  24.01924      -2.625855  1.659993 0.0000000        NA        NA
440386  30.58690       4.390651  1.499186 1.5946323        NA        NA
440387  32.86232       2.876218  1.407257 0.6226421 0.2667599         1
440388  19.38288      -1.482686  1.485418 0.0000000 0.9904758         1
440389  80.59038       1.277523  1.458940 0.0000000 0.6897729         1
440390  41.60001       0.122999  1.434733 0.0000000 0.9046071         1
@mikelove

You can match by the rownames, right? Here you would just use base R functionality, e.g. ordering by a character vector of rownames with square brackets or using the match() function.
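As a minimal sketch of the match() approach, assuming the original DiffBind export is available as a data frame with a key column plus coordinates. The objects `peaks` and `res_df`, the column names `key`, `CHR`, `START`, `END`, and all values below are made up for illustration and are not from the original post:

```r
# Hypothetical DiffBind export: key column plus chromosome coordinates.
peaks <- data.frame(
  key   = c("440385", "440386", "440387", "440388"),
  CHR   = c("chr1", "chr1", "chr2", "chr3"),
  START = c(1000L, 5000L, 200L, 7500L),
  END   = c(1500L, 5600L, 900L, 8100L),
  stringsAsFactors = FALSE
)

# Hypothetical filtered results table; the rownames are the keys.
res_df <- data.frame(
  log2FoldChange = c(2.88, -2.63),
  padj           = c(0.01, 0.03),
  row.names      = c("440387", "440385")
)

# match() returns, for each result row, its position in the peak table.
idx <- match(rownames(res_df), peaks$key)
stopifnot(!anyNA(idx))  # every key must be found in the peak table

# Put the coordinates next to the statistics, keeping the result order.
annotated <- cbind(peaks[idx, c("CHR", "START", "END")], res_df)
rownames(annotated) <- rownames(res_df)
annotated
```

The `stopifnot()` check guards against keys that were altered on the way (for example, numeric keys converted to a different string format), which would otherwise silently produce rows of NA.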

Note that DESeqDataSet is actually built on RangedSummarizedExperiment objects, and so results() can output *ranges* in this case, see ?results. This would be my preferred option, to give DESeq2 ranged data to begin with, and then the package will keep track of this for you.


Another way I considered is adding the (CHR, START, STOP) columns as metadata in the DESeqDataSetFromMatrix call, since they would be there initially from DiffBind if I did not strip them out. This would leave them in the results, if I understand correctly?

Is there a good way to go about this given the info above? Thanks!


I’d recommend making a RangedSummarizedExperiment object (it’s the same amount of work as what you are suggesting) then use DESeqDataSet(rse, design). Then everything will “just work”, and you can get a ranged results table.
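A rough sketch of that workflow follows, assuming DESeq2 and its Bioconductor dependencies are installed. The counts are simulated with makeExampleDESeqDataSet() as a stand-in for the DiffBind matrix, and the peak coordinates, the keys, and the sex/treatment assignments are invented for illustration:

```r
library(DESeq2)  # attaches SummarizedExperiment, GenomicRanges, S4Vectors

set.seed(1)
# Simulated stand-in for the DiffBind count matrix (200 peaks, 8 samples).
countData <- counts(makeExampleDESeqDataSet(n = 200, m = 8))
rownames(countData) <- as.character(440385 + seq_len(nrow(countData)) - 1)

# Hypothetical sample table matching the design in the question.
colData <- DataFrame(
  sex       = factor(rep(c("F", "M"), each = 4)),
  treatment = factor(rep(c("ctrl", "trt"), times = 4)),
  row.names = colnames(countData)
)

# Hypothetical peak coordinates, one range per row of the count matrix.
starts <- seq(1, by = 1000, length.out = nrow(countData))
peaks <- GRanges(seqnames = "chr1",
                 ranges = IRanges(start = starts, width = 500))
names(peaks) <- rownames(countData)

# Supplying rowRanges makes this a RangedSummarizedExperiment.
rse <- SummarizedExperiment(assays = list(counts = countData),
                            rowRanges = peaks,
                            colData = colData)

dds <- DESeqDataSet(rse, design = ~ sex + treatment)
dds <- DESeq(dds)

# Ask for the results as ranges; the statistics become metadata columns.
res <- results(dds, format = "GRanges")
res
```

Calling as.data.frame(res) on the ranged result gives a table with the seqnames, start, and end columns alongside the statistics, which can then be passed to write.csv() as in the original question.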


Thanks for the advice! 

I think the following function (from DiffBind) should do the job of outputting a RangedSummarizedExperiment object that I can use in DESeq2:

dba.peakset