Hi,
I would like to add a matrix to a DGEList in the same way that assays can be added to a SummarizedExperiment.
I calculated the log-transformed cpm values of a DGEList and would like to store them in the DGEList, so that subsetting the object also subsets the log-transformed cpm values.
Here is my code:
library(TCGAbiolinks)  # GDCquery, GDCdownload, GDCprepare
library(edgeR)         # DGEList, cpm, calcNormFactors, estimateCommonDisp

query <- GDCquery(project = "TCGA-BLCA",                          # project code for bladder cancer
                  data.category = "Transcriptome Profiling",      # category for RNA-Seq data
                  data.type = "Gene Expression Quantification",   # raw reads or aligned data
                  workflow.type = "HTSeq - Counts")               # FPKM / HTSeq-Count / etc.
GDCdownload(query,                  # the query object created above
            method = "api",         # needs to be set in order to download from the API
            files.per.chunk = 10,   # this should minimise the probability of corruption
            directory = "GDCdata")  # save files to a separate directory
TCGA_BLCA <- GDCprepare(query,
                        summarizedExperiment = TRUE)  # also downloads clinical data
TCGA_BLCA_DGEList <- DGEList(TCGA_BLCA)
keep <- rowSums(cpm(TCGA_BLCA_DGEList) > 0.25) >= 217
TCGA_BLCA_DGEList <- TCGA_BLCA_DGEList[keep, ,
                                       keep.lib.sizes = FALSE]
TCGA_BLCA_DGEList <- calcNormFactors(TCGA_BLCA_DGEList,
                                     method = "upperquartile",
                                     p = 0.75)
TCGA_BLCA_DGEList <- estimateCommonDisp(TCGA_BLCA_DGEList,
                                        verbose = TRUE)
TCGA_BLCA_DGEList$log2cpm <- cpm(TCGA_BLCA_DGEList$pseudo.counts,
                                 log = TRUE)
When I now subset my DGEList, log2cpm is not subsetted:
tumor_tcga <- TCGA_BLCA_DGEList[,
                                TCGA_BLCA_DGEList$samples$shortLetterCode == "TP"]
> dim(tumor_tcga$samples)
[1] 414 234
> dim(tumor_tcga$log2cpm)
[1] 18476 433
> dim(tumor_tcga$counts)
[1] 18476 414
I hope you can help me with that.
Thanks and best regards, Aljosch Leusmann
If you want to do it, you'll have to do it manually, e.g.,
Of course, this doesn't account for all the other fields that may be of interest, e.g., aveLogCPM, dispersions.
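A minimal sketch of what manual subsetting could look like. This is not the example from the answer above, only an illustration on toy data. The counts, the sample labels, and the `is_tp` index are made up for the example; `log2cpm` is the custom component name used in the question. The idea is to build the column index once, subset the DGEList with it, and then apply the same index to the extra matrix, because `[` on a DGEList leaves components it does not recognise untouched.

```r
library(edgeR)

# Toy data standing in for the TCGA counts (assumption: 10 genes, 6 samples).
set.seed(1)
counts <- matrix(rnbinom(60, mu = 50, size = 5), nrow = 10,
                 dimnames = list(paste0("gene", 1:10), paste0("s", 1:6)))
y <- DGEList(counts = counts)
y$samples$shortLetterCode <- rep(c("TP", "NT"), each = 3)

# Custom component: stored in the list, but unknown to the DGEList subsetting method.
y$log2cpm <- cpm(y, log = TRUE)

# Build the sample index once and reuse it for both the object and the extra matrix.
is_tp <- y$samples$shortLetterCode == "TP"
tumor <- y[, is_tp]
tumor$log2cpm <- y$log2cpm[, is_tp, drop = FALSE]

# The extra matrix now has the same dimensions as the counts.
stopifnot(identical(dim(tumor$counts), dim(tumor$log2cpm)))
```

An alternative would be to recompute the values from the subsetted object with `cpm(tumor, log = TRUE)`. Note that this is not guaranteed to reproduce the stored values exactly, since the log transformation depends on the library sizes of the samples that are present.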
Okay, then I'll have to do it manually. Thank you for the hints!