When I use summarizeVariants with a GRangesList of genes, the genes are then reordered in the RangedSummarizedExperiment object and the assay(counts) values are incorrect. Is this a bug or can I control this reordering?
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene genes <- genes(txdb) # get the gene symbol for corresponding ENTREZID gene.map <- select(org.Hs.eg.db, keys=elementMetadata(genes)$gene_id, keytype="ENTREZID", column="SYMBOL") genes.df <- merge(genes, gene.map, by.x="gene_id", by.y="ENTREZID") elementMetadata(genes)$SYMBOL <- genes.df$SYMBOL # get the gene symbols GRanges and push into a GRangesList grl <- split(genes, elementMetadata(genes)$SYMBOL) # subset by a few genes grl <- grl[c('STK11','TP53','GCLM','SLC6A3')] print(grl) GRangesList object of length 4: $STK11 GRanges object with 1 range and 2 metadata columns: seqnames ranges strand | gene_id SYMBOL <Rle> <IRanges> <Rle> | <character> <character> 6794 chr19 [1205798, 1228434] + | 6794 STK11 $TP53 GRanges object with 1 range and 2 metadata columns: seqnames ranges strand | gene_id SYMBOL 7157 chr17 [7565097, 7590868] - | 7157 TP53 $GCLM GRanges object with 1 range and 2 metadata columns: seqnames ranges strand | gene_id SYMBOL 2730 chr1 [94352590, 94375012] - | 2730 GCLM ... <1 more element> ------- # read in the vcf file print("Read filtered vcf...") vcf <- readVcf(vcf.file, "hg19", row.names=TRUE) # change sample headers header(vcf)@samples <- sub("(.*)_S.*", "\\1", header(vcf)@samples) rownames(colData(vcf)) <- sub("(.*)_S.*", "\\1", rownames(colData(vcf))) # summarize variant counts for coding regions (this will take a long time) print("Summarize variant counts...") x <- summarizeVariants(grl, vcf, CodingVariants()) print(x) seqinfo: 93 sequences (1 circular) from hg19 genome [1] "Summarize variant counts..." class: RangedSummarizedExperiment dim: 4 118 metadata(1): header assays(1): counts rownames(4): GCLM TP53 STK11 SLC6A3 rowData names(0): colnames(118): D02914TB04 D06699TB04_2nd ... D31681TB02 D33818TB02 colData names(1): Samples print(assays(x)$counts)