Hi Guan,
I think the correct thing to do in your case is to recode Base1 and
Base2 as Base. However, if you are only interested in comparing
'Post7-Base1' and 'During4-Base2', then doing the adjustment in two
separate batches seems fine as well. So it is up to you.
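
As a rough illustration of why the original coding fails and why pooling the baselines helps, here is a minimal base-R sketch. It only checks whether a design matrix built from batch and Status is full rank. The batch and Status values are transcribed from the pheno table quoted below; the helper `is_full_rank` and the column `StatusPooled` are names made up for this example, and the design ComBat builds internally may differ in detail.

```r
# batch and Status transcribed from the pheno table in the quoted post.
pheno <- data.frame(
  batch  = factor(c(rep(1, 12), rep(2, 10), rep(3, 2))),
  Status = c(rep(c("Base1", "Post7"), 6),
             "Base2", "During4", "Base2", "During4", "Base2", "During4",
             "During4", "Base2", "During4", "Base2", "During4", "Base2")
)

# Hypothetical helper: TRUE if the design matrix has full column rank,
# i.e. t(design) %*% design is invertible.
is_full_rank <- function(formula, data) {
  design <- model.matrix(formula, data = data)
  qr(design)$rank == ncol(design)
}

# Cross-tabulation: Base1/Post7 occur only in batch 1,
# Base2/During4 only in batches 2 and 3.
print(table(pheno$batch, pheno$Status))

# Original coding: batch and Status are confounded (rank deficient).
confounded_ok <- is_full_rank(~ batch + Status, pheno)

# Option 1: pool Base1 and Base2 into a single "Base" level.
pheno$StatusPooled <- sub("^Base[12]$", "Base", pheno$Status)
pooled_ok <- is_full_rank(~ batch + StatusPooled, pheno)

# Option 2 (second half only): samples from batches 2 and 3 analysed alone.
second <- droplevels(subset(pheno, batch != "1"))
second_ok <- is_full_rank(~ batch + Status, second)

print(c(confounded = confounded_ok, pooled = pooled_ok, second = second_ok))
```

With the original labels the check fails, while both the pooled coding and the batch 2/3 subset pass. The first 12 samples all sit in batch 1, so there is no batch term to adjust for in that subset.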
Hope this helps.
Evan
On May 5, 2014, at 5:26 AM, Guan Wang <guan.wang at glasgow.ac.uk> wrote:
> Dear Amit and Evan,
>
> Sorry to write to you out of the blue. I read your post
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/49978
> regarding a ComBat error message, as I have had the same problem.
>
> Your post helped me to understand the reason for the error. I have
> several other questions about the analysis strategy given this error.
> I posted them to the Bioconductor mailing list a few days ago but
> have not received any replies. Could you take a few minutes to look
> at the post below? Many thanks for your time and any suggestions you
> may have.
>
> Post from bioconductor attached below. Thanks.
>
> Hi All,
>
> I understood from the previous post "[BioC] ComBat_ Error in
> solve.default(t(design) %*% design): Lapack routine dgesv: system is
> exactly singular: U[4, 4] = 0" that this error is caused by batch
> being confounded with covariate status. I got the same ComBat error
> when running surrogate variable analysis (SVA) and have several
> related questions. I hope you could have a look. Many thanks for any
> opinions/suggestions.
>
> Data set: 24 samples from 6 subjects (4 time points/subject: 2
baseline samples collected on different days, 1 during drug treatment,
1 after drug treatment). Experiments were done with Affymetrix
GeneChip 3.0 for miRNA expression profiling.
>
> Initial data analysis: "oligo" is used to handle the Affy CEL files
> and "rma()" is used for data normalization. After this, PC1 on the
> PCA plot still seems to correlate with some batch effect (one I'm not
> aware of, i.e. it does not come from different scan dates). The "sva"
> package is then used to estimate the surrogate variables, followed by
> "ComBat()".
>
> Now to the ComBat error, which appeared when I specified the
> contrasts as (Base2-Base1, During-Base1, Post-Base1). The pheno input
> is attached below:
>
>                             sample  batch  Status
> GW2miRNA1_(miRNA-3_0).CEL        1      1  Base1
> GW2miRNA2_(miRNA-3_0).CEL        1      1  Post7
> GW2miRNA3_(miRNA-3_0).CEL        2      1  Base1
> GW2miRNA4_(miRNA-3_0).CEL        2      1  Post7
> GW2miRNA5_(miRNA-3_0).CEL        3      1  Base1
> GW2miRNA6_(miRNA-3_0).CEL        3      1  Post7
> GW2miRNA7_(miRNA-3_0).CEL        4      1  Base1
> GW2miRNA8_(miRNA-3_0).CEL        4      1  Post7
> GW2miRNA9_(miRNA-3_0).CEL        5      1  Base1
> GW2miRNA10_(miRNA-3_0).CEL       5      1  Post7
> GW2miRNA11_(miRNA-3_0).CEL       6      1  Base1
> GW2miRNA12_(miRNA-3_0).CEL       6      1  Post7
> GW1miRNA13_(miRNA-3_0).CEL       6      2  Base2
> GW1miRNA14_(miRNA-3_0).CEL       6      2  During4
> GW1miRNA15_(miRNA-3_0).CEL       4      2  Base2
> GW1miRNA16_(miRNA-3_0).CEL       1      2  During4
> GW1miRNA17_(miRNA-3_0).CEL       5      2  Base2
> GW1miRNA18_(miRNA-3_0).CEL       5      2  During4
> GW1miRNA19_(miRNA-3_0).CEL       4      2  During4
> GW1miRNA20_(miRNA-3_0).CEL       3      2  Base2
> GW1miRNA21_(miRNA-3_0).CEL       3      2  During4
> GW1miRNA22_(miRNA-3_0).CEL       1      2  Base2
> GW1miRNA23_(miRNA-3_0).CEL       2      3  During4
> GW1miRNA24_(miRNA-3_0).CEL       2      3  Base2
>
> I understand that batch is confounded with status, as you can see in
> the phenotype file above. The two baseline samples are from the same
> subjects but were collected on different days before the drug was
> injected. I'm wondering whether it makes sense to classify
> "Base1 + Base2" as "Base" and make contrasts for "During - Base" and
> "Post - Base", keeping the other columns in the pheno file above the
> same, and then re-run "sva". Or is it more appropriate to do two
> separate "sva" analyses, i.e. "Post7 - Base1" for the first 12
> samples, which were hybridized and scanned at the same time, and
> "During4 - Base2" for the last 12 samples, which were processed as
> one batch (but scanned at two different times, which is why they are
> labelled as batch 2 and 3 in the "batch" column)?
>
> I hope I've described this clearly. Any suggestions/opinions would be
> much appreciated.
>
> Regards
> Guan