Hello dare,
I am a beginner in using Bioconductor and DESeq2. I have a dataset of RNAseq, which gets from RSEM tool, for different samples (no replicate). I want to use DESeq2 for the normalization of the gene counts. Here is the RNAseq dataset:
> setwd("./rsem_export_dataset")
> RSEM_output_gene_files <- file.path(getwd(), paste0(samples$sample_ID, ".genes.results"))
> names(RSEM_output_gene_files) <- samples$sample_ID
> txi.rsem <- tximport(RSEM_output_gene_files, type="rsem", txIn=FALSE, txOut=FALSE)
> txi.rsem$length[txi.rsem$length == 0] <- 1
> dds <- DESeqDataSetFromTximport(txi.rsem, samples, ~sample_ID)
> head(assay(dds),5)[,1:3]
s_00077E7AC5 s_00077E7B8A s_00077E7B99
ENSRNOG00000000001_AABR07013255.1 0 1 0
ENSRNOG00000000007_Gad1 5780 633 4914
ENSRNOG00000000008_Alx4 0 1 25
ENSRNOG00000000009_Tmco5b 0 0 0
ENSRNOG00000000010_Cbln1 231 219 160
My samples table has some information about 53 samples, like Age, Sex, Batch_ID, and IOP. IOP is a disease feature, which I want to calculate the correlation of IOP with some limit genes per each sample.
> head (samples,2)
sample_ID Sex Batch AgeInDays Class_IOP Avg_IOP Class_Age
s_00077E7AC5 s_00077E7AC5 F B14 151 High 24 A2_Adult
s_00077E7B8A s_00077E7B8A F B12 228 Normal 12 A4_Aged
The goal is to calculate the correlation between AvgIOP and some genes like ENSRNOG00000000007Gad1 for all 53 samples. The question is I have no replicates, and I need to put ~sample_ID as a design formula. However, when I run the rlog function, as below, I have an error:
> rld <- rlog(dds, blind=FALSE)
using 'avgTxLength' from assays(dds), correcting for library size
Error in estimateDispersionsGeneEst(object, quiet = TRUE) :
the number of samples and the number of model coefficients are equal,
i.e., there are no replicates to estimate the dispersion.
use an alternate design formula
To solve this problem, I used blind=TRUE, but as I understood it is not a correct solution. Could you please guide me. Thank you so much in advance for your time.
Sincerely, Hadi