How to obtain clinical data from TCGA via Bioconductor GenomicDataCommons
Hashirama ▴ 10

Dear community,

I am new to TCGA and Bioconductor and unsure how to obtain additional data (e.g. survival information, gender, RNA-seq read counts) for a set of cases I have. For every "patient" I have:

gdc_file_uuid (e.g. 52F6329C-CDC6-4196-A4A0-58952332905C)
filename (e.g. UNCID_1552290.d6b7779f-a245-48ee-b9a8-2570c023a531.sorted_genome_alignments.bam)
case_uuid (e.g. 2be42cc2-9b97-4821-afc2-d1e42eb3932d)

How can I use these identifiers with the R package GenomicDataCommons to retrieve more clinical data?

I would be grateful for any help!

Kind regards, Hashirama

TCGA GenomicDataCommons • 1.4k views
@sean-davis-490

The GenomicDataCommons package can take a set of case UUIDs and return quite a bit of clinical detail. See available_expand(cases()) for the groups of related records that can be expanded into the results. Here is some code to get you started:

library(GenomicDataCommons)

# Query the cases endpoint, expand the nested clinical records,
# and restrict the query to the case UUID(s) of interest.
cases() %>%
  expand(c('diagnoses','demographic','diagnoses.pathology_details')) %>%
  GenomicDataCommons::filter(case_id %in% c("2be42cc2-9b97-4821-afc2-d1e42eb3932d")) %>%
  results() %>%
  tibble::as_tibble() %>%
  dplyr::glimpse()

Results:

Rows: 1
Columns: 22
$ id                      <chr> "2be42cc2-9b97-4821-afc2-d1e42eb3932d"
$ slide_ids               <named list> <"9a182c4a-6085-4829-a3d0-c46114f0875b", "4236…
$ submitter_slide_ids     <named list> <"TCGA-HZ-7926-01Z-00-DX1", "TCGA-HZ-79…
$ disease_type            <chr> "Ductal and Lobular Neoplasms"
$ analyte_ids             <named list> <"05fce9a0-fa4d-4a30-ad33-a4f04bf84abf"…
$ submitter_id            <chr> "TCGA-HZ-7926"
$ submitter_analyte_ids   <named list> <"TCGA-HZ-7926-01A-11R", "TCGA-HZ-7926-10A-01W…
$ aliquot_ids             <named list> <"1925e7c2-1730-48a4-8257-772fc4448d9b"…
$ submitter_aliquot_ids   <named list> <"TCGA-HZ-7926-10A-01D-2153-01", "TCGA-HZ-7926…
$ diagnoses               <named list> [<data.frame[1 x 28]>]
$ diagnosis_ids           <named list> "f172c483-6888-5e06-9e5c-0b2bb4be64dd"
$ created_datetime        <lgl> NA
$ sample_ids              <named list> <"8b7bd592-74f0-48e3-9e21-8005ab8d419e"…
$ demographic             <df[,14]> <data.frame[1 x 14]>
$ submitter_sample_ids    <named list> <"TCGA-HZ-7926-01A", "TCGA-HZ-7926-10A"…
$ submitter_diagnosis_ids <named list> "TCGA-HZ-7926_diagnosis"
$ primary_site            <chr> "Pancreas"
$ updated_datetime        <chr> "2019-08-06T14:42:37.317113-05:00"
$ case_id                 <chr> "2be42cc2-9b97-4821-afc2-d1e42eb3932d"
$ portion_ids             <named list> <"de913076-84e6-4ed7-8f2f-16cdd2a7f7b0"…
$ state                   <chr> "released"
$ submitter_portion_ids   <named list> <"TCGA-HZ-7926-01A-11", "TCGA-HZ-7926-1…
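
The diagnoses and demographic columns above are nested, which is awkward for survival analysis. As a rough sketch, the package's gdc_clinical() helper can return the same information as a list of flat data frames. The names normalize_uuid and fetch_clinical below are hypothetical helpers introduced only for this illustration. Lowercasing the UUIDs is a precaution I am assuming is useful, because the IDs in the results above are all lowercase.

```r
# Hypothetical helper: trim whitespace and lowercase UUIDs before querying.
# Assumption: lowercase IDs are the safe form, as in the results shown above.
normalize_uuid <- function(x) tolower(trimws(x))

# Hypothetical wrapper around GenomicDataCommons::gdc_clinical(), which
# returns a list of data frames (main, diagnoses, exposures, demographic).
fetch_clinical <- function(case_ids) {
  GenomicDataCommons::gdc_clinical(normalize_uuid(case_ids))
}

# Needs network access and the GenomicDataCommons package, so the query
# only runs in an interactive session.
if (interactive()) {
  clin <- fetch_clinical("2be42cc2-9b97-4821-afc2-d1e42eb3932d")
  names(clin)                       # list the tables that came back
  dplyr::glimpse(clin$demographic)  # inspect the available columns
  dplyr::glimpse(clin$diagnoses)
}
```

Which columns hold gender, vital status and follow-up times depends on the current GDC data model, so inspect the glimpse() output before building a survival table rather than assuming column names.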