Regressing effect of treatment on RNA-seq expected count from rsem.
1
0
Entering edit mode
hrishi27n ▴ 20
@hrishi27n-11821
Last seen 3.4 years ago
United States

Hello All,

I have RNA-seq data collected by sequencing around 30 individuals(similar phenotypes) and my goal is to  group/cluster these patients based upon their expected counts obtained from rsem.  Most of these patients are on a few different medications and some are untreated, my PCA shows a clear separation between these patients based upon their medication types. I was wondering if there was some way of regressing the medication effect, so that both medicated and unmedicated individuals look relatively consistent. Any input or suggestion is highly appreciated. 

Thanks. 

RNA rna-seq science bioinformatics • 2.2k views
ADD COMMENT
3
Entering edit mode
@james-w-macdonald-5106
Last seen 12 hours ago
United States

You will want to first convert those data to something more amenable (than counts) for clustering. Alternatives include voom or cpm in edgeR, or rlog or vst in DESeq2. I won't go into arguments for or against any of those choices other than to say that they exist.

Once you have converted using the tool of your choice, you could use either removeBatchEffect from limma, or you could use ComBat from sva to regress out the medication effect.

ADD COMMENT
0
Entering edit mode

Thank you James. 

ADD REPLY
0
Entering edit mode

James, so I used both removeBatchEffect and ComBat separately to see what worked better for me, it seems that removeBatchEffect regressed most of the medication effect.  Just to be sure and to see if we can improve this, I am pasting my code snippet below.  For removeBatchEffect is it necessary to provide a design matrix? considering the fact that I don't have any grouping and any other condition worth regressing?            

 myDGElist <-DGEList(counts=CountFrame) # CountFrame is my rsem dataFrame
 myNormalized <- calcNormFactors(myDGElist)
 design <- model.matrix(~1, data=myPheno) # myPheno includes treatment and other information
  v <-voom(myNormalized)
  treatment <- myPheno$Medications
  combatProcess <- ComBat(dat=v$E,batch=treatment,design,par.prior=TRUE, prior.plots=FALSE)
   usinglimma <- removeBatchEffect(v$E, treatment)

 

ADD REPLY
0
Entering edit mode

From ?removeBatchEffect:

 design: optional design matrix relating to treatment conditions to be
          preserved
ADD REPLY
0
Entering edit mode

James, thank you for responding. 

My medication vector is something like below, after running removeBatchEffect it seems from the PCA that the medication effect is gone but the untreated points have also switched a little bit which I think should not have happened. Is there a way to prevent this from happening? (Also, does removeBatchEffect protects the biological variability?)

meds <-c("medication1","medication2",...."untreated","untreated","medicationX","untreated")

ADD REPLY
0
Entering edit mode

All removeBatchEffect does is regress out the mean expression for each of the batches you specify. It cannot 'protect' anything, as it doesn't know what should or should not be protected. If you have a poorly designed experiment, then removeBatchEffect may do things you don't like, which is why it is important to have a well designed experiment.

You can use a design matrix to attempt to preserve treatment conditions, but if the treatment conditions are confounded with the batch effects, then you have problems that no amount of statistical wizardry will be able to fix.
 

ADD REPLY

Login before adding your answer.

Traffic: 468 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6