Deseq to output continous sequencing data
0
0
Entering edit mode
Neha • 0
@aeb6d1e5
Last seen 4 days ago
United States

Hi,

I am new to DESeq and am wondering if I can input my raw RNAseq count data, perform all the normalizations and pre-processing as per DeSeq, but not run the negative-binomial GLM. I'm interested in instead outputting the continuous RNAseq data (that accounts for e.g., the library size, composition differences, dispersion etc) and running a different regression model that's not a GLM. I have come across numerous posts on using VST in context of DeSeq-WGCNA which are a similar idea but I have a few questions:

1. Would the following steps be sound and sufficient to get normalized, pre-processed, continuous RNAseq data?

# Step 1. Save model in deseq
 dds_adj <- DESeqDataSetFromMatrix(countData = RNA,
                                  colData = design_formula,
                                  design = ~ covar1 + as.character(covar2) + condition)  

# Step 2. Pre-filter low count RNAs
dds_adj <- estimateSizeFactors(dds_adj)
smallestGroupSize <- ncol(dds_adj)*0.10                                        # 10% of total sample size (N)
keep_1 <- rowSums(counts(dds_adj, normalized=TRUE) >= 1 ) >= smallestGroupSize # cut off is 1 cpm here
dds_adj <- dds_adj[keep_1,]

# Step 3. VST the data to take care of 1. estimating and normalizing for size factor (library composition) and 2. dispersion
vst<-  varianceStabilizingTransformation(dds_adj, blind = FALSE)
norm.data <- assay(vst)
  1. I would like to explore different conditions (e.g., continous and categorical age) but when I consider them in Step1, each leads a slightly different norm.data. I'd like a single, final norm.data. By reading the WGCNA issues, my understanding is that setting blind = T in vst() or setting the design = ~1 could help but that comes with repurcussions. Would it be appropriate to design = ~ covar1 + as.character(covar2) and ignoring the main condition of interest altogether and why? Also, by using a design that includes covariates, it doesn't mean that the RNAseq data is now adjusted for these covariates and I still do need to control for them downstream, right?

Thank you so much!

DESeq2 • 65 views
ADD COMMENT

Login before adding your answer.

Traffic: 912 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6