I'd like to remove the genes on the X and Y chromosomes from my human RNA-seq data before doing differential analysis using DESeq2. I've looked through the RNA-seq Workflow and DESeq2 manuals but didn't see this as an option. Any help in performing this step and still using the DESeq2 or RNA-seq Workflow pipeline would be much appreciated. Thanks.
Thank you for your response. I followed the instructions but got an error message, and I'm not sure how to fix it. I used UCSC hg19 to make the count matrix with GenomicAlignments. I've copied the code and the error message below. (By the way, I named my dds "dds1".) Thanks.
csvfile1 <- "table1.csv"
(sampleTable1 <- read.csv(csvfile1, row.names=1))
filenames1 <- file.path(paste0(sampleTable1$Run, ".bam"))
library("Rsamtools")
bamfiles1 <- BamFileList(filenames1, yieldSize=2000000)
source("http://bioconductor.org/biocLite.R")
biocLite("TxDb.Hsapiens.UCSC.hg19.knownGene")
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
(genes <- transcriptsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by="gene"))
library("GenomicAlignments")
se1 <- summarizeOverlaps(features=genes, reads=bamfiles1, mode="Union", singleEnd=FALSE, ignore.strand=TRUE, fragments=TRUE)
library("DESeq2")
library("BiocParallel")
register(MulticoreParam(4))
dds1 <- DESeqDataSet(se1, design = ~ gender + agecat3 + sfrace2 + RNAbatch + Site + PTSD_1mo_k)
dds1$PTSD_1mo_k <- relevel(dds1$PTSD_1mo_k, "control")
## subsetting dds1
seqnames(rowData(dds1))
dds1.sub <- dds1[ ! seqnames(rowData(dds1)) %in% c("chrX", "chrY"), ]
Error in x[i, , drop = FALSE] : invalid subscript type 'list'
I forgot, I was thinking GRanges but you have a GRangesList.
[edit] see Martin's answer above.
Hi Mike,
I tried to do the same but it gave me this error message:
dds.sub <- dds[ ! seqnames(rowRanges(dds)) %in% c("X","Y"), ]
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘NSBS’ for signature ‘"CompressedRleList"’
See Martin's post below.
Thanks very much, Mike. I tried it already, but it doesn't seem to work: the number of DE genes is still the same as if the X and Y genes had not been removed. I don't know what to do to get this right.
Thanks.
Are you using all()? I don’t see it in your code. Martin’s code shows how to properly subset if you have a GRangesList.
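For anyone landing here later, a minimal sketch of the all() idea on a toy GRangesList may help. The gene names and coordinates below are made up for illustration and the example assumes UCSC-style names ("chrX", "chrY"); it is not Martin's exact code. With a GRangesList, the %in% test gives one logical vector per gene, and all() collapses each of those into a single TRUE/FALSE that can be used as a row subscript.

```r
library(GenomicRanges)

# Toy ranges standing in for transcripts; names and positions are hypothetical.
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chrX", "chrX", "chrY"),
  ranges   = IRanges(start = c(1, 50, 5, 80, 100), width = 10)
)

# Group the ranges by gene, giving a GRangesList like the one from transcriptsBy().
grl <- split(gr, c("geneA", "geneA", "geneB", "geneB", "geneC"))

# One logical value per gene: TRUE when none of its ranges lie on chrX or chrY.
keep <- all(!seqnames(grl) %in% c("chrX", "chrY"))

# Only the autosomal gene remains after subsetting.
grl.sub <- grl[keep]
names(grl.sub)
```

On a real object the same logical vector would presumably be built from rowRanges(dds1) (rowData(dds1) in older versions) and used as dds1[keep, ]. If the annotation uses Ensembl-style names, the chromosomes would be "X" and "Y" rather than "chrX" and "chrY".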