Recount2 Bigwigs for TCGA
@alexvnesta-23044

Hi,

I would like to obtain bigwig RNA-Seq coverage files for specific TCGA samples. Additionally, it would be great if I could download only a specific locus from those bigwigs, and even average the coverage across multiple samples.

I believe all of this can be done with the recount package (recount2) and the associated packages mentioned in its quick start guide.

I simply cannot figure out how to download the TCGA bigwig files.

The download_study function in the recount quick start guide is only shown downloading data by SRA ID, and I don't think TCGA has SRA IDs. If someone can help me get over the hump of importing the TCGA data as a RangedSummarizedExperiment simply by providing the TCGA ID, that would be great.

http://bioconductor.org/packages/release/bioc/vignettes/recount/inst/doc/recount-quickstart.html#11downloadallthedata

Thanks, Alex

R TCGA recount2 recount • 1.7k views

I found out how to download all of the TCGA bigwig files, but I only want to download the bigwigs for certain TCGA IDs.

Here is what I have tried so far:

sapply(unique("TCGA"), download_study, type = 'samples')

Is there a way to specify TCGA IDs in the project argument?

@lcolladotor

Hi,

I don't know why I never got an email about your question, even though you used the recount tag. In any case, the answer lies in the recount_url data.frame provided by the recount package. Subset it to the TCGA project, keep only the bigwig files (file names ending in .bw), and then use the resulting URLs to download the samples you want. Note that the subset also includes the mean_TCGA.bw file, which you might or might not want to download.

> library(recount)
> head(subset(recount_url, grepl('\\.bw$', file_name) & project == 'TCGA'))
                                                                                            path
83847                              /dcl01/leek/data/recount-website/mean/means_tcga/mean_TCGA.bw
83848 /dcl01/leek/data/tcga/v1/batch_29/coverage_bigwigs/3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw
83849 /dcl01/leek/data/tcga/v1/batch_27/coverage_bigwigs/B1E54366-42B9-463C-8615-B34D52BD14DC.bw
83850 /dcl01/leek/data/tcga/v1/batch_14/coverage_bigwigs/473713F7-EB41-4F20-A37F-ACD209E3CB75.bw
83851 /dcl01/leek/data/tcga/v1/batch_22/coverage_bigwigs/11F18F54-9B33-4C33-BDF9-0F093F4F3336.bw
83852 /dcl01/leek/data/tcga/v1/batch_26/coverage_bigwigs/136B7576-1108-4FA3-8254-6069F0CA879A.bw
                                    file_name project version1 version2
83847                            mean_TCGA.bw    TCGA     TRUE    FALSE
83848 3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw    TCGA     TRUE    FALSE
83849 B1E54366-42B9-463C-8615-B34D52BD14DC.bw    TCGA     TRUE    FALSE
83850 473713F7-EB41-4F20-A37F-ACD209E3CB75.bw    TCGA     TRUE    FALSE
83851 11F18F54-9B33-4C33-BDF9-0F093F4F3336.bw    TCGA     TRUE    FALSE
83852 136B7576-1108-4FA3-8254-6069F0CA879A.bw    TCGA     TRUE    FALSE
                                                                                 url
83847                            http://duffel.rail.bio/recount/TCGA/bw/mean_TCGA.bw
83848 http://duffel.rail.bio/recount/TCGA/bw/3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw
83849 http://duffel.rail.bio/recount/TCGA/bw/B1E54366-42B9-463C-8615-B34D52BD14DC.bw
83850 http://duffel.rail.bio/recount/TCGA/bw/473713F7-EB41-4F20-A37F-ACD209E3CB75.bw
83851 http://duffel.rail.bio/recount/TCGA/bw/11F18F54-9B33-4C33-BDF9-0F093F4F3336.bw
83852 http://duffel.rail.bio/recount/TCGA/bw/136B7576-1108-4FA3-8254-6069F0CA879A.bw
> dim(subset(recount_url, grepl('\\.bw$', file_name) & project == 'TCGA'))
[1] 11285     6
> packageVersion('recount')
[1] ‘1.12.1’
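
To go from that table to only the files you care about, something along these lines might work. It is a minimal sketch, not part of recount: tcga_bigwig_urls() is a hypothetical helper, and the toy table with example.org URLs only stands in for recount_url so the code runs without the package. It also assumes you already know the bigwig file names of your samples, since the file names in the output above are UUID-like rather than TCGA barcodes, and how to map one to the other is not shown here.

```r
# Hypothetical helper (not part of recount): return the download URLs for a
# chosen set of TCGA bigWig files. `url_table` is assumed to have the same
# `file_name`, `project` and `url` columns as recount_url shown above.
tcga_bigwig_urls <- function(url_table, file_names) {
    is_tcga_bw <- url_table$project == "TCGA" &
        grepl("\\.bw$", url_table$file_name)
    # which() drops any NA comparisons instead of returning NA rows
    keep <- which(is_tcga_bw & url_table$file_name %in% file_names)
    hits <- url_table[keep, , drop = FALSE]

    not_found <- setdiff(file_names, hits$file_name)
    if (length(not_found) > 0) {
        warning("No TCGA bigWig found for: ",
                paste(not_found, collapse = ", "))
    }
    # Named character vector: names are file names, values are URLs
    setNames(as.character(hits$url), as.character(hits$file_name))
}

# Toy stand-in for recount_url (made-up file names and URLs)
toy <- data.frame(
    file_name = c("mean_TCGA.bw", "AAA.bw", "BBB.bw", "AAA.bw"),
    project = c("TCGA", "TCGA", "TCGA", "SRP000001"),
    url = c(
        "http://example.org/TCGA/mean_TCGA.bw",
        "http://example.org/TCGA/AAA.bw",
        "http://example.org/TCGA/BBB.bw",
        "http://example.org/SRP000001/AAA.bw"
    ),
    stringsAsFactors = FALSE
)
urls <- tcga_bigwig_urls(toy, c("AAA.bw", "BBB.bw"))
print(urls)

# With the real table (requires recount; the files are large):
# library(recount)
# wanted <- c("3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw",
#             "B1E54366-42B9-463C-8615-B34D52BD14DC.bw")
# urls <- tcga_bigwig_urls(recount_url, wanted)
# for (f in names(urls)) download.file(urls[[f]], destfile = f, mode = "wb")
```

Once the files are downloaded, rtracklayer::import() accepts a `which` argument for bigwig files, so you should be able to read just the locus you need from each sample and average the coverage yourself.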

Best, Leonardo
