ComBat_ Error in solve.default(t(design) %*% design): Lapack routine dgesv: system is exactly singular: U[4, 4] = 0
@w-evan-johnson-5447
Amit,

The "singularity" error you are getting occurs when your covariates are confounded with batch (or with each other). In the example you are trying, is there a batch that contains only one covariate level, and is that covariate level exclusive to the batch? If this does not make sense, post your 'pheno' variable in a reply and I will be happy to help you figure out the problem.

Evan

On Aug 19, 2013, at 6:00 AM, <bioconductor-request at r-project.org> wrote:

> Date: Sun, 18 Aug 2013 19:58:35 +0530
> From: amit kumar subudhi <amit4help at gmail.com>
> To: bioconductor at r-project.org
> Subject: [BioC] ComBat_ Error in solve.default(t(design) %*% design) :
>   Lapack routine dgesv: system is exactly singular: U[4, 4] = 0
>
> Hello to all ComBat users,
>
> I am trying to remove the batch effects from some of my microarray data, but
> in the end I get an error message which reads:
>
> Found 3 batches
> Found 1 categorical covariate(s)
> Standardizing Data across genes
> Error in solve.default(t(design) %*% design) :
>   Lapack routine dgesv: system is exactly singular: U[4,4] = 0
>
> The head(edata) looks like this:
>
>                                   AL        AO        AP        AQ        CF
> GT_pfalci_specific_0000001 16.053898 16.080540 16.101114 16.046898 16.087206
> GT_pfalci_specific_0000002 10.051407 10.477143  8.369233 10.657850 13.312936
> GT_pfalci_specific_0000003  8.910620  8.683393  7.812817  8.496099 10.920685
> GT_pfalci_specific_0000004  6.603195  8.993232  6.476777  6.792369  3.319346
> GT_pfalci_specific_0000005  9.813562 11.084574  9.055613 11.568550 12.977261
> GT_pfalci_specific_0000006 15.989252 15.993513 15.963054 16.000675 15.983985
>                                   CL        CU        CV     GA_UC     GB_UC
> GT_pfalci_specific_0000001 16.082037 16.071299 16.090370 15.971335 15.994304
> GT_pfalci_specific_0000002 12.653076  9.703247  8.827624  5.697412  8.060719
> GT_pfalci_specific_0000003 11.470758 10.548943 10.718349  6.132614  8.007271
> GT_pfalci_specific_0000004  5.328515  8.398546  6.351136  3.045112  3.891578
> GT_pfalci_specific_0000005  8.520699 11.791610 11.535907  6.791468  9.930246
> GT_pfalci_specific_0000006 15.980660 15.984256 15.970124 13.353012 13.740395
>                                GC_UC     GE_UC     GR_UC
> GT_pfalci_specific_0000001 15.855644 16.090246 16.086956
> GT_pfalci_specific_0000002  9.026398  8.015609  7.814614
> GT_pfalci_specific_0000003  5.341252  8.658231  5.788790
> GT_pfalci_specific_0000004  4.191565  3.040515  3.517175
> GT_pfalci_specific_0000005  5.446910 11.982848  5.477334
> GT_pfalci_specific_0000006 11.872469 13.675290 13.117105
>
> and the head(pheno) looks like this:
>
>    sample batch malaria
> AL      1     1  severe
> AO      2     1  severe
> AP      3     1  severe
> AQ      4     1  severe
> CF      5     2  severe
> CL      6     2  severe
>
> The commands that I have used for ComBat are:
>
> mod = model.matrix(~as.factor(malaria), data=pheno)
> combat_edata = ComBat(dat=edata, batch=batch, mod=mod, numCovs=NULL,
>                       par.prior=TRUE, prior.plots=FALSE)
>
> head(mod) looks like this:
>
>    (Intercept) as.factor(malaria)uncomplicated
> AL           1                               0
> AO           1                               0
> AP           1                               0
> AQ           1                               0
> CF           1                               0
> CL           1                               0
>
> Why am I getting this error message? Please help me out. When I use the
> larger sample set (n=33) I am able to remove the batch effects, but
> a subset of those samples gives me the above problem.
>
> --
> Amit Kumar Subudhi
> Research Scholar,
> CSIR-Senior Research Fellow,
> Molecular Parasitology and Systems Biology Lab,
> Department of Biological Sciences,
> FD III, BITS, Pilani,
> Rajasthan- 333031
@amit-kumar-subudhi-6098
Hello Dr. Evan,

Thanks for the prompt reply. Below is the complete pheno table:

      sample batch malaria
AL         1     1 Severe
AO         2     1 Severe
AQ         3     1 Severe
AP         4     1 Severe
CF         5     2 Severe
CL         6     2 Severe
CU         7     2 Severe
CV         8     2 Severe
GA_UC      9     3 uncomplicated
GB_UC     10     3 uncomplicated
GC_UC     11     3 uncomplicated
GE_UC     12     3 uncomplicated
GR_UC     13     3 uncomplicated
@amit-kumar-subudhi-6098
Hello Dr. Evan,

Thanks for the prompt reply. Below is the whole pheno table. Looking at the whole table might give you an idea about the probable cause of the error. Batches 1 and 2 contain only severe malaria samples, whereas batch 3 contains only uncomplicated malaria samples.

      sample batch malaria
AL         1     1 Severe
AO         2     1 Severe
AQ         3     1 Severe
AP         4     1 Severe
CF         5     2 Severe
CL         6     2 Severe
CU         7     2 Severe
CV         8     2 Severe
GA_UC      9     3 uncomplicated
GB_UC     10     3 uncomplicated
GC_UC     11     3 uncomplicated
GE_UC     12     3 uncomplicated
GR_UC     13     3 uncomplicated

With best regards
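The check Evan asked about (does a batch contain only one covariate level, and is that level exclusive to the batch?) can be done with a cross-tabulation. The following is a minimal sketch in base R that re-enters the 13-sample pheno table posted above; the variable names `tab`, `levels_in_one_batch`, `batches_with_one_level`, and `exclusive_level_batches` are made up for illustration and are not part of ComBat.

```r
# Phenotype table as posted in this thread (13 samples, 3 batches).
pheno <- data.frame(
  sample  = 1:13,
  batch   = c(rep(1, 4), rep(2, 4), rep(3, 5)),
  malaria = c(rep("Severe", 8), rep("uncomplicated", 5)),
  row.names = c("AL", "AO", "AQ", "AP", "CF", "CL", "CU", "CV",
                "GA_UC", "GB_UC", "GC_UC", "GE_UC", "GR_UC")
)

# Cross-tabulate batch (rows) against the covariate of interest (columns).
tab <- table(batch = pheno$batch, malaria = pheno$malaria)
print(tab)

# A covariate level is exclusive to one batch if its column has a single
# non-zero cell.
levels_in_one_batch <- colSums(tab > 0) == 1

# A batch holds a single covariate level if its row has a single non-zero cell.
batches_with_one_level <- rowSums(tab > 0) == 1

# Batches that hold one level only, where that level appears nowhere else:
# this is the pattern described as confounding in the reply above.
has_exclusive_level <- rowSums(tab[, levels_in_one_batch, drop = FALSE] > 0) > 0
exclusive_level_batches <- rownames(tab)[batches_with_one_level & has_exclusive_level]
print(exclusive_level_batches)
```

With this table, every "uncomplicated" sample sits in batch 3 and batch 3 contains nothing else, so batch 3 is the one flagged.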
Okay, yes this is clear now. Your batch and covariate status are completely confounded. In other words, if you see a difference between "severe" and "uncomplicated" you won't know if this is really due to a covariate effect or if this is due to a batch (batch 3) effect. In short, this is really an experimental design issue and ComBat cannot help you.

If you were to remove the "malaria" covariate, then ComBat would work, but it would also take out all malaria covariate effects as well. How bad are the batch effects between batches 1 and 2? Do you expect batch 3 to have a similar level of batch differences? You could combine batches 1 and 2, and then look for differences with batch 3, but you wouldn't know whether the differential expression is due to the treatment or due to batch, hence the confounding.

Sorry I couldn't be of more help, but like I said, the issue here is due to experimental design.

Evan
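To see why complete confounding produces exactly this error, it helps to look at the rank of the design matrix. The sketch below uses base R only and assumes that the design is built from one indicator column per batch plus the covariate columns without the intercept; that construction is an assumption made for illustration, not a quote of the ComBat source, although it is consistent with the error referring to a 4 x 4 system.

```r
# Batch and covariate for the 13 samples posted in this thread.
batch   <- factor(c(rep(1, 4), rep(2, 4), rep(3, 5)))
malaria <- factor(c(rep("Severe", 8), rep("uncomplicated", 5)))

# One indicator column per batch, no intercept.
batchmod <- model.matrix(~ -1 + batch)

# Covariate model, equivalent to the one in the original post.
mod <- model.matrix(~ malaria)

# Assumed combined design: batch indicators plus covariate columns
# (the intercept is dropped because the batch indicators already span it).
design <- cbind(batchmod, mod[, -1, drop = FALSE])

ncol(design)      # number of columns in the design
qr(design)$rank   # numerical rank; lower than ncol(design) means confounding

# The "uncomplicated" indicator is identical to the batch 3 indicator.
same_column <- all(design[, "batch3"] == design[, "malariauncomplicated"])

# Inverting t(design) %*% design therefore fails; capture the message
# instead of stopping the script.
result <- tryCatch(
  solve(t(design) %*% design),
  error = function(e) conditionMessage(e)
)
print(result)
```

Here the design has four columns but only rank three, because two of the columns carry the same information. No software can separate the batch 3 effect from the malaria effect in that situation.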
Thanks again for the reply, Dr. Evan.

This set of samples is a subset of a larger set that contains many more samples in each batch. When I ran ComBat on the larger dataset, I was able to remove the batch effects to some extent. For your information, the known batch effect here is the date of hybridization: a simple hierarchical clustering analysis showed that most of the samples cluster by date of hybridization, which is why I tried ComBat to remove the batch effects. The third batch contains most of the uncomplicated malaria samples. The subset of samples that I posted here has specific symptoms pertaining to severe malaria and was therefore selected for comparison with the uncomplicated malaria samples.

Question: As mentioned above, I have applied ComBat to remove the batch effects from the larger dataset. Can I take the smaller set of samples from the larger dataset to find differentially regulated genes? An answer to this question would be really helpful.

With best regards,
Amit
Yes, it should be fine to remove batch effects on the larger dataset and then use a smaller subset to do your comparisons. In fact, this approach might be preferred even if it were possible to adjust for batch in the smaller subset.
This reply solved my problem. Thanks again, Dr. Evan, for your kind and prompt reply and suggestions.

Regards,
Amit

On Mon, Aug 19, 2013 at 7:08 PM, Johnson, William Evan <wej@bu.edu> wrote:

> Yes, it should be fine to remove batch effects on the larger dataset and then use a smaller subset to do your comparisons. In fact, this approach might be preferred even if it were possible to adjust for batch in the smaller subset.
>
> On Aug 19, 2013, at 9:34 AM, amit kumar subudhi wrote:
>
> Thanks again for the reply, Dr. Evan.
>
> This set of samples is a subset of a larger set that contains many more samples in each batch. When I ran ComBat on the larger dataset, I was able to remove the batch effects to some extent. For your information, the known batch effect here is the date of hybridization: a simple hierarchical clustering analysis showed that most of the samples cluster by hybridization date, so I tried ComBat to remove the batch effects. The third batch contains most of the uncomplicated malaria samples. The subset of samples that I posted here has specific symptoms pertaining to severe malaria and was therefore selected for comparison with the uncomplicated malaria samples.
>
> Question: as mentioned above, I have applied ComBat to remove the batch effects from the larger dataset. Can I take the smaller set of samples from the larger dataset to find differentially regulated genes? An answer to this question would be really helpful.
>
> With best regards,
> Amit
>
> On Mon, Aug 19, 2013 at 6:31 PM, Johnson, William Evan <wej@bu.edu> wrote:
>
>> Okay, yes, this is clear now. Your batch and covariate status are completely confounded. In other words, if you see a difference between "severe" and "uncomplicated", you won't know whether it is really due to a covariate effect or to a batch (batch 3) effect. In short, this is really an experimental design issue and ComBat cannot help you.
>>
>> If you were to remove the "malaria" covariate, then ComBat would work, but it would also take out all malaria covariate effects. How bad are the batch effects between batches 1 and 2? Do you expect batch 3 to have a similar level of batch differences? You could combine batches 1 and 2 and then look for differences with batch 3, but you wouldn't know whether the differential expression is due to the treatment or to batch; hence the confounding.
>>
>> Sorry I couldn't be of more help, but as I said, the issue here is due to experimental design.
>>
>> Evan
>>
>> On Aug 19, 2013, at 8:55 AM, amit kumar subudhi wrote:
>>
>> Hello Dr. Evan,
>>
>> Thanks for the prompt reply. Below is the whole pheno table. Looking at the whole table might give you an idea about the probable cause of the error. Batches 1 and 2 contain only severe malaria samples, whereas batch 3 contains the uncomplicated malaria samples.
>>
>>        sample batch malaria
>> AL          1     1 Severe
>> AO          2     1 Severe
>> AQ          3     1 Severe
>> AP          4     1 Severe
>> CF          5     2 Severe
>> CL          6     2 Severe
>> CU          7     2 Severe
>> CV          8     2 Severe
>> GA_UC       9     3 uncomplicated
>> GB_UC      10     3 uncomplicated
>> GC_UC      11     3 uncomplicated
>> GE_UC      12     3 uncomplicated
>> GR_UC      13     3 uncomplicated
>>
>> With best regards
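For anyone who hits the same error: the confounding discussed in this thread can be checked before calling ComBat. The following is only a rough sketch in base R, not ComBat's own code. It assumes that the batch indicator columns and the covariate columns are placed side by side in one design matrix, and the names `batch_mod`, `cov_mod`, `design` and `solve_msg` are made up for illustration. The phenotype values are the 13 samples posted in the thread, with the malaria labels written in lower case.

```r
# Phenotype table as posted in the thread (13 samples, 3 batches).
pheno <- data.frame(
  sample  = 1:13,
  batch   = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  malaria = c(rep("severe", 8), rep("uncomplicated", 5)),
  row.names = c("AL", "AO", "AQ", "AP", "CF", "CL", "CU", "CV",
                "GA_UC", "GB_UC", "GC_UC", "GE_UC", "GR_UC")
)

# Cross-tabulate batch against the covariate. A covariate level that
# occurs in exactly one batch, and fills that batch, signals confounding.
confound_table <- table(batch = pheno$batch, malaria = pheno$malaria)
print(confound_table)

# One indicator column per batch (no intercept).
batch_mod <- model.matrix(~ 0 + factor(batch), data = pheno)

# Covariate model as in the original post (intercept + malaria level).
cov_mod <- model.matrix(~ factor(malaria), data = pheno)

# Assumed layout: batch indicators next to the covariate columns,
# with the intercept dropped because the batch columns already sum to 1.
design <- cbind(batch_mod, cov_mod[, -1, drop = FALSE])

# A design with fewer independent columns than columns cannot be inverted.
design_rank <- qr(design)$rank
is_singular <- design_rank < ncol(design)
print(c(rank = design_rank, columns = ncol(design)))

# Trying to invert t(design) %*% design; capture the error text if it fails.
solve_msg <- tryCatch({
  solve(t(design) %*% design)
  "inverted without error"
}, error = function(e) conditionMessage(e))
print(solve_msg)
```

With this table every uncomplicated sample is in batch 3 and every batch 3 sample is uncomplicated, so the covariate column is identical to the batch 3 indicator. The design then has four columns but only three independent ones, and the inversion fails. Running the same check on the full set of samples before subsetting should show whether the larger design is free of this problem.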
@w-evan-johnson-5447
Last seen 5 months ago
United States
ComBat should be done after normalization, and only if there are clear signs of batch effects after normalization (through significance testing, clustering, or principal component analysis).

On Aug 21, 2013, at 12:33 AM, amit kumar subudhi wrote:

> Hello Dr. Evan,
>
> One more question, which I hope you can answer. Should the required normalization be carried out on the data before running ComBat, or can the normalization step be done after ComBat? This particular question is confusing me.
>
> With best regards,
> Amit
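As a rough illustration of checking for batch effects after normalization, the sketch below uses base R only. The expression matrix is simulated and entirely hypothetical (200 genes, 12 samples, two batches with an artificial shift added to the second batch); it is not the data from this thread, and the object names are made up. It shows two of the checks mentioned above, principal component analysis and clustering, and is not a substitute for a proper significance test.

```r
set.seed(1)

# Hypothetical, already-normalized log2 expression matrix:
# 200 genes (rows) x 12 samples (columns), 6 samples per batch.
n_genes <- 200
batch <- factor(rep(c("b1", "b2"), each = 6))
edata <- matrix(rnorm(n_genes * length(batch), mean = 10, sd = 1),
                nrow = n_genes,
                dimnames = list(paste0("gene", seq_len(n_genes)),
                                paste0("s", seq_along(batch))))

# Simulated batch effect: each gene in batch b2 is shifted by a
# gene-specific amount (the vector is recycled down each column).
shift <- rnorm(n_genes, mean = 0, sd = 2)
edata[, batch == "b2"] <- edata[, batch == "b2"] + shift

# PCA on samples; prcomp() expects observations in rows, hence t().
pca <- prcomp(t(edata), center = TRUE, scale. = FALSE)

# Fraction of the variation in PC1 scores explained by batch membership.
pc1_r2 <- summary(lm(pca$x[, 1] ~ batch))$r.squared
print(pc1_r2)

# Hierarchical clustering of samples, cut into two groups and
# compared with the known batch labels.
hc <- hclust(dist(t(edata)))
cluster_vs_batch <- table(cluster = cutree(hc, k = 2), batch = batch)
print(cluster_vs_batch)
```

If the first principal component or the top split of the dendrogram lines up with batch rather than with the biological groups, that is the kind of sign that would justify running ComBat on the normalized data.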
