Matching TCGA Aliquot ID to UUID or Barcode
Dario Strbenac ★ 1.5k
@dario-strbenac-5916

Genomic Data Commons hosts a gene-wise copy number summary for each cancer, which has genes as rows and samples as columns. The column headings are aliquot UUIDs. How may these be matched to other data types, such as a MAF file of SNVs which contains TCGA barcodes as the sample identifier?

TCGAutils TCGAbiolinks GenomicDataCommons • 4.6k views
@marcel-ramos-7325

Hi Dario, Thanks for your question. I've added support for this in TCGAutils 1.5.5.

library(TCGAutils)
UUIDtoBarcode("d85d8a17-8aea-49d3-8a03-8f13141c163b", "aliquot_ids")
#>            analytes.aliquots.aliquot_id analytes.aliquots.submitter_id
#> 13 d85d8a17-8aea-49d3-8a03-8f13141c163b   TCGA-CV-5443-01A-01D-1510-01

Created on 2019-07-17 by the reprex package (v0.3.0)
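To apply a mapping like the one above to a whole gene-wise copy number matrix, something along these lines might work. This is only a sketch: `renameColumns` and `sampleKey` are hypothetical helpers, the second mapping row and the MAF barcode are made-up placeholders, and the mapping is built by hand so the example runs offline. In practice it would presumably come from calling `UUIDtoBarcode()` on the matrix's column names. Because a MAF file usually refers to a different aliquot of the same sample, the comparison is done on the first 15 characters of the barcode (project, site, participant, sample type).

```r
# Mapping in the two-column shape shown above (UUID first, barcode second).
# Row 1 is taken from the reprex above; row 2 is a made-up placeholder.
# With network access this would presumably be:
#   mapping <- TCGAutils::UUIDtoBarcode(colnames(cn), from_type = "aliquot_ids")
mapping <- data.frame(
    aliquot_id = c("d85d8a17-8aea-49d3-8a03-8f13141c163b",
                   "00000000-0000-0000-0000-000000000000"),
    barcode = c("TCGA-CV-5443-01A-01D-1510-01",
                "TCGA-XX-0000-01A-01D-0000-01"),
    stringsAsFactors = FALSE
)

# Toy gene-by-aliquot copy number matrix with UUID column headings,
# deliberately in a different order from the mapping.
cn <- matrix(
    c(0, 1, -1, 0), nrow = 2,
    dimnames = list(c("GENE1", "GENE2"), rev(mapping$aliquot_id))
)

# Hypothetical helper: replace UUID column names with barcodes,
# failing loudly if any UUID has no entry in the mapping.
renameColumns <- function(mat, map) {
    idx <- match(colnames(mat), map[[1]])
    if (anyNA(idx)) stop("some column UUIDs are missing from the mapping")
    colnames(mat) <- map[[2]][idx]
    mat
}

cn <- renameColumns(cn, mapping)

# Hypothetical helper: sample-level key, e.g. "TCGA-CV-5443-01".
sampleKey <- function(barcode) substr(barcode, 1, 15)

# Made-up MAF aliquot barcode from the same sample as the first mapping row.
mafBarcodes <- "TCGA-CV-5443-01A-01D-1512-08"

# Samples present in both data types.
shared <- intersect(sampleKey(colnames(cn)), sampleKey(mafBarcodes))
```

Matching on `match()` rather than assuming the mapping comes back in input order avoids silently mislabelling columns.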

Hi, thank you very much for this library.

I seem to have noticed mismatches for some UUIDs when converting from file_id. For example, '56467ebd-af89-4413-84b5-1e00699a2744' returns 'TCGA-2L-AAQM-01A-11D-A396-01', but the GDC portal returns 'TCGA-IB-A5SO' instead. I am converting the masked copy number segment data, and I noticed that a number of these mismatches come from cases with multiple aliquots. Could you confirm this, or have I perhaps done something wrong?

My code is simply: UUIDtoBarcode('56467ebd-af89-4413-84b5-1e00699a2744', from_type = "file_id")

Thank you in advance.

Hi e0338272, Thank you for your report. I will look into this today. It seems like the function should be returning multiple identifiers. I'll check the package's tests. Follow this issue for updates: https://github.com/waldronlab/TCGAutils/issues/24 Best, Marcel

It seems the UUID you have, '56467ebd-af89-4413-84b5-1e00699a2744', is a file ID whose file contains multiple barcodes (https://imgshare.io/image/vYlHe)
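Since one file ID can correspond to several barcodes, it may be worth checking a file-to-barcode mapping for one-to-many entries before using it to relabel data. A minimal sketch follows, using a toy table whose column names (`file_id`, `barcode`) and values are placeholders, not real GDC identifiers or the actual column names returned by any package.

```r
# Toy file-to-barcode mapping; all IDs and barcodes are placeholders.
fileMap <- data.frame(
    file_id = c("file-A", "file-A", "file-B"),
    barcode = c("TCGA-XX-0001-01A", "TCGA-XX-0001-10A", "TCGA-XX-0002-01A"),
    stringsAsFactors = FALSE
)

# Group barcodes by file ID, then flag files with more than one barcode.
perFile <- split(fileMap$barcode, fileMap$file_id)
ambiguous <- names(perFile)[lengths(perFile) > 1]
```

Files listed in `ambiguous` need a rule for choosing among their barcodes (for example, keeping only tumor samples) rather than taking whichever one is returned first.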

Thanks, this has been fixed.
-Marcel

@tiago-chedraoui-silva-8877

In TCGAbiolinks, when reading the copy number data we use the GDC API to map the aliquot id to barcode: https://github.com/BioinformaticsFMRP/TCGAbiolinks/blob/master/R/prepare.R#L1182-L1211

It looks like the function takes a barcode as input and returns the aliquot ID. What about converting an aliquot ID to a barcode?

Hi Dario, I just tested Marcel's code and it is working fine. I think the easiest way would be to use TCGAutils.

To do this with my code, you would need to change the filter from barcode to aliquot ID.

https://github.com/BioinformaticsFMRP/TCGAbiolinks/blob/master/R/prepare.R#L1190 -> change cases.submitter_id to samples.portions.analytes.aliquots.aliquot_id. But this would give you all aliquots/barcodes for a patient.

So Marcel's code is a better solution.
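For reference, the filter change described above could look roughly like this when building a GDC API request by hand. This is a sketch only, not the TCGAbiolinks implementation: `gdcInFilter` is a hypothetical helper, and the choice of the `cases` endpoint and of the `fields` value are assumptions. The code only builds the request URL and does not send it.

```r
# Hypothetical helper: build a GDC "in" filter as a JSON string.
gdcInFilter <- function(field, values) {
    quoted <- paste0('"', values, '"', collapse = ",")
    sprintf('{"op":"in","content":{"field":"%s","value":[%s]}}', field, quoted)
}

# Aliquot UUID taken from the example earlier in this thread.
aliquot <- "d85d8a17-8aea-49d3-8a03-8f13141c163b"

# Filter on aliquot ID instead of the case barcode.
flt <- gdcInFilter("samples.portions.analytes.aliquots.aliquot_id", aliquot)

# Assumed endpoint and fields; the filter must be URL-encoded.
url <- paste0(
    "https://api.gdc.cancer.gov/cases?filters=",
    utils::URLencode(flt, reserved = TRUE),
    "&fields=samples.portions.analytes.aliquots.submitter_id",
    "&format=TSV"
)
```

As noted above, a query at the case level returns every aliquot barcode for the matching patient, so the result would still need to be narrowed to the one aliquot of interest.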
