Question

RNA-seq analysis without replicates

0

4214811 • 0

@559345f9

Last seen 19 months ago

United States

We have RNA-seq data for 12 samples covering 12 conditions. Unfortunately, we have no replicates: each sample corresponds to exactly one condition. For differential gene expression analysis, I would need at least 3 replicates (or patients) per condition to compare gene expression, which I don't have. I am interested in how particular genes change across conditions. What can I do in this case? For example, are there strategies for normalizing the counts and comparing the expression of particular genes?

Another thing I tried was creating artificial technical replicates using RESEQ, but apparently the method is no longer maintained. I would appreciate any advice on this.

DESeq2 RESEQ RNASeqData • 2.1k views

updated 19 months ago by Cynthia • 0 • written 19 months ago by 4214811 • 0

2

Simply pick a reasonable dispersion value, based on your experience with similar data, and use it for edgeR's exactTest or glmFit. Typical values for the common BCV (the square root of the dispersion) for datasets arising from well-controlled experiments are 0.4 for human data, 0.1 for data on genetically identical model organisms, or 0.01 for technical replicates. Here is a toy example with simulated data:

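library(edgeR)
bcv <- 0.2  # assumed common BCV; the dispersion is bcv^2
# Simulate negative binomial counts for 20 genes in 2 unreplicated samples
counts <- matrix(rnbinom(40, size = 1/bcv^2, mu = 10), 20, 2)
y <- DGEList(counts = counts, group = 1:2)
# No replicates, so the dispersion is supplied rather than estimated
et <- exactTest(y, dispersion = bcv^2)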
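
To give a feel for what the BCV values quoted above imply, here is a small base R sketch (no edgeR needed) that converts each BCV to a dispersion and to the standard deviation under the negative binomial model, var = mu + dispersion * mu^2. The mean count of 100 and the row labels are assumptions chosen only for illustration.

```r
# Typical common BCV values quoted above (labels are illustrative)
bcv <- c(human = 0.4, model_organism = 0.1, technical = 0.01)

# The dispersion is the squared BCV
dispersion <- bcv^2

# Negative binomial variance at an assumed mean count:
# var = mu + dispersion * mu^2
mu <- 100
nb_sd <- sqrt(mu + dispersion * mu^2)

# Compare with the Poisson standard deviation sqrt(mu) (dispersion = 0)
summary_table <- data.frame(bcv = bcv, dispersion = dispersion,
                            nb_sd = nb_sd, poisson_sd = sqrt(mu))
print(summary_table)
```

The larger the assumed BCV, the further the count noise sits above the Poisson level, so the resulting p-values depend directly on the value you pick. It is worth trying more than one plausible BCV to see how stable the gene ranking is.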

19 months ago zzzsssyyy1995 ▴ 20

score 0 · Answer 1 · 2023-04-06

0

ATpoint ★ 4.5k

@atpoint-13662

Last seen 1 hour ago

Germany

Cross-posted and answered: https://www.biostars.org/p/9559745/

19 months ago ATpoint ★ 4.5k

score 0 · Answer 2 · 2023-04-06

Hi

You can try bootstrap resampling, which I think is the simplest method for creating synthetic data. A detailed explanation of how to do this is available here: Bootstrapping RNA-seq Data ( https://bootcamp.biostars.io/archives/2016/day3/docs/BootstrapRNAseq.html )

An alternative is to search public RNA-Seq datasets to complete your analysis. For example, the NCBI SRA (Sequence Read Archive) hosts one of the largest collections of RNA-Seq data, much of it sequenced on Illumina platforms. You can search it directly through the Advanced search - SRA - NCBI ( https://www.ncbi.nlm.nih.gov/sra/advanced ), which lets you specify your organism of interest, layout, Mbases, platform, etc. Hopefully you can find some datasets useful for your analysis. Below are the links:

https://ddbj.nig.ac.jp/DRASearch and https://www.ncbi.nlm.nih.gov/sra for direct search in the SRA dataset collection

You can also search with SRAdb, an R package on Bioconductor that helps you retrieve the metadata associated with samples in SRA. It can also help you look up URLs for files you might want to retrieve.
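
For the bootstrap idea at the start of this answer, a minimal base R sketch could look like the following. It is an assumption-laden illustration, not the procedure from the linked tutorial: the gene names and counts are made up, and the reads of a single sample are simply resampled with a multinomial draw. Pseudo-replicates made this way only reflect sampling noise within that one library, not biological variability between samples.

```r
set.seed(1)  # for reproducibility of this illustration

# Hypothetical counts for one unreplicated sample (5 made-up genes)
counts <- c(geneA = 120, geneB = 30, geneC = 0, geneD = 560, geneE = 15)

# Draw pseudo-replicates by multinomial resampling of the reads:
# each replicate keeps the original library size, with gene
# probabilities proportional to the observed counts
n_boot <- 3
boot <- rmultinom(n_boot, size = sum(counts), prob = counts)
rownames(boot) <- names(counts)
colnames(boot) <- paste0("boot", seq_len(n_boot))
boot
```

Each column of boot is one pseudo-replicate with the same total read count as the original sample. A gene with zero observed reads stays at zero in every replicate.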

CSC