Hi Tam,
Sorry about the confusion. Two items:
1. Adding a column of numbers (row numbers) is a known "harmless" bug
in the ComBat function. It basically adds a column of row numbers
to your dataset, which can easily just be deleted in Excel (and the
data can be shifted over). However, I recognize that your case is a
little different, hence:
2. On line 3154, your gene description has some character formatting
that is causing issues with R's "read.table" function, which is used
by ComBat. I think it's reading the apostrophe as the start of quoted
text, so it then concatenates everything after that as text until the
next apostrophe. Anyway, here is how you fix it: open up your dataset
('12arraysCombatImputed_2.txt') in Excel, then save it as a .csv. Then
run ComBat using the option: type='csv'. Alternatively, you can remove
the second column from your dataset for ComBat and then add it back
in after adjustment.
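For what it's worth, here is a minimal sketch of the quoting problem in R. It uses a made-up four-line file rather than your actual data (the gene names and the temp file are assumptions for illustration only). By default read.table treats both " and ' as quote characters, whereas read.csv only treats " as one, which is probably why the .csv route helps. Passing quote="" switches quote handling off altogether:

```r
# Hypothetical tab-delimited file: two descriptions contain an apostrophe.
lines <- c("CLID\tNAME\tarray1",
           "g1\t5' nucleotidase\t0.1",
           "g2\tkinase\t0.2",
           "g3\t3' UTR binding protein\t0.3")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Fields per line using read.table's default quote characters (" and ').
# Lines that fall inside an open quoted string are reported as NA.
fields_default <- count.fields(tmp, sep = "\t", quote = "\"'")

# Fields per line with quote handling switched off.
fields_noquote <- count.fields(tmp, sep = "\t", quote = "")

# Reading with quote = "" keeps every apostrophe as ordinary text.
good <- read.table(tmp, header = TRUE, sep = "\t", quote = "",
                   stringsAsFactors = FALSE)

print(fields_default)
print(fields_noquote)
print(good)
```

Comparing the two count.fields results on your own file is a quick way to find lines like 3154 before running ComBat.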
I just did this myself on your data and it worked. However, you still
need to delete the column of row numbers from item #1 before the data
are ready to go!
Also, I noticed that your variances are not well-behaved (see the
plot that comes up), so I'd recommend that you use a non-parametric
prior (par.prior=F). Note that this may take an hour or so to run, so
make sure that the parametric prior is working before you try the
non-parametric one.
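Putting the two suggestions together, the calls might look roughly like the sketch below. The .csv file name is an assumption (whatever you name the file when you re-save it from Excel), and the guard simply skips the calls if the ComBat script has not been sourced or the files are not in the working directory. The arguments are only the ones already mentioned in this thread:

```r
# Assumed name of the dataset after re-saving it from Excel as .csv.
expr_file <- "12arraysCombatImputed_2.csv"
info_file <- "sample_info_file_mouse.txt"

# Only run if ComBat has been loaded and both files are present.
if (exists("ComBat") && file.exists(expr_file) && file.exists(info_file)) {
  # First confirm that the (faster) parametric prior runs cleanly.
  ComBat(expr_file, info_file, type = "csv", skip = 2, write = T)

  # Then rerun with the non-parametric prior; this is the slow step.
  ComBat(expr_file, info_file, type = "csv", skip = 2, write = T,
         par.prior = F)
}
```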
Thanks!
Evan
On Oct 31, 2012, at 10:33 AM, SSc Array Core wrote:
> I am running Combat on the attached files. 2 channel array (with
reference). Problem is the adjusted file returns an added column of
numbers where CLID should be. This column then stops delivering said
numbers around line 3154, returning to CLID, shifting all the
information and data to the left. I am stumped as to why this is
happening. Please advise.
>
> thanks,
>
> tam
>
> Reading Sample Information File
> Reading Expression Data File
> Found 2 batches
> Found 1 covariate(s)
> Found 260 Missing Data Values
> Standardizing Data across genes
> Fitting L/S model and finding priors
> Finding parametric adjustments
> Adjusting the Data
> Adjusted data saved in file: Adjusted_12arraysCombatImputed_2.txt_.xls
> > ComBat('12arraysCombatImputed_2.txt','sample_info_file_mouse.txt',skip=2,write=T)