Dear Roman, all,
Recently we tried your version of the annotation files for the Gene 1.0 ST array that your team built from the PLANdbAffy DB. I encountered some problems, so I hope you can help.
You provide nice CDF and Affymetrix PGF/CLF files, but the PGF/CLF files were not usable with the Bioconductor packages for Affymetrix Exon/Gene type arrays, namely oligo and XPS, because these also require an annotation file in CSV format. I tried the annotation CSV file from Affymetrix and then the one from PLANdbAffy DB. The PLANdbAffy CSV file is very different from the Affymetrix one, so import is not possible. (In fact, the CSV file on the website is tab-delimited rather than comma-delimited, so the problem already starts there and the file needs reformatting.)
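For anyone hitting the same delimiter problem, here is a minimal Python sketch of the reformatting step. It only rewrites a tab-delimited file as comma-delimited; it does not make the PLANdbAffy columns match the Affymetrix ones. The column names and values in the demonstration are made up for illustration and are not taken from the real file.

```python
import csv
import io

def tsv_to_csv(src, dst):
    """Copy tab-delimited rows from src to dst as comma-delimited rows."""
    reader = csv.reader(src, delimiter="\t")
    # csv.writer quotes any field that itself contains a comma.
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(row)

# Tiny in-memory demonstration (hypothetical header and values).
sample = io.StringIO("Probe\tProbe_Sets\n123\t7892501\n")
converted = io.StringIO()
tsv_to_csv(sample, converted)
print(converted.getvalue())
```

For a real file, open the input and output with `open(path, newline="")` and pass the two file objects to `tsv_to_csv` in place of the `StringIO` objects.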
Christian from XPS was kind enough to inform me that:
>... PLANdbAffy annotation columns have nothing to do with the Affymetrix annotation columns. Thus xps will not read these annotation files. Alternative annotation files must contain exactly the same columns as the Affymetrix annotation files.
>For whole genome and exon arrays it is not possible to use only the PGF-files w/o the annotation files, since I extract most of the important information from the probeset-annotation file first, so this file is absolutely essential. For example, column "level" contains the information Core/Extended/Full, see the corresponding annotation README files for an explanation of all columns.
>xps error you get simply says that their PGF-file does not contain the AFFX controls, so maybe adding the AFFX controls to their PGF-file might help. However, as you mention, they use their own Probesetids, which will not match the Probesetids of the Affymetrix annotation files, thus it may not work anyhow.
>It is not quite clear to me why they created their own PGF-file. The Affymetrix PGF-file contains only 1-4 probes for each probeset, where each exon consists of one or more probesets, thus the probability that a probe within a probeset is not correct should be pretty small. However, a probeset could be mapped to a wrong exon/gene or no gene at all, so it should be sufficient to correct the Affymetrix annotation files.
Tools like RMAExpress, EC., and Aroma.affymetrix can work with a CDF only. So after running RMAExpress (in command-line mode) I did get an expression matrix, but I could not link the 19532 probeset IDs to the PLANdbAffy annotation CSV file to collect basic gene information. What I did was, first, load the full (unfiltered) annotation file from PLANdbAffy:
http://affymetrix2.bioinf.fbb.msu.ru/files.html
Then I searched the 2nd column (Probe_Sets) for the IDs obtained after RMA and found 0 matches. I then tried the 1st column (the Probes) and found 8664, but I would have expected the opposite situation?
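The column check described above can be reproduced with a short Python sketch, in case someone wants to repeat it. It counts, for each column of a delimited annotation file, how many distinct IDs from a given list occur in that column. It assumes the file has a header row and one ID per cell; the function name, header names, and values in the demonstration are hypothetical and do not come from the real PLANdbAffy file.

```python
import csv
import io

def count_matches_per_column(annotation, ids, delimiter="\t"):
    """Return {column name: number of distinct ids found in that column}."""
    reader = csv.reader(annotation, delimiter=delimiter)
    header = next(reader)
    wanted = {i.strip() for i in ids}
    found = [set() for _ in header]
    for row in reader:
        # Ignore any cells beyond the header width.
        for col, cell in enumerate(row[:len(header)]):
            value = cell.strip()
            if value in wanted:
                found[col].add(value)
    return {name: len(hits) for name, hits in zip(header, found)}

# Tiny in-memory demonstration (made-up header and IDs).
annotation = io.StringIO(
    "Probe\tProbe_Sets\n"
    "7892501\tPS_1\n"
    "7892502\tPS_1\n"
    "999\tPS_2\n"
)
rma_ids = ["7892501", "7892502", "7892503"]
counts = count_matches_per_column(annotation, rma_ids)
print(counts)
```

A result where the probe column matches and the probeset column does not would mirror the situation described above, and would suggest that the IDs in the expression matrix live in a different ID space than the Probe_Sets column.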
So Roman, can you please:
1) advise how to get the real IDs after an RMAExpress run?
2) say whether you plan to build an annotation CSV file in the same format as Affymetrix does, so that other Bioconductor software such as oligo and XPS can use it?
3) comment on Christian's feedback.
Btw. Christian, how come RMAExpress, EC., and Aroma.affymetrix can work with CDFs only, while oligo and XPS require extra annotation? From what I gather (after peeking into the CDF and PGF files), they show which probes belong to which probe_set. So for probe_set level analysis (or more exon-like analysis) the PGF/CLF files alone seem to be enough?
For the bioc list, just to bring attention to this article and DB:
PLANdbAffy: probe-level annotation database for Affymetrix expression microarrays, Ramil N. Nurtdinov et al.
http://nar.oxfordjournals.org/content/38/suppl_1/D726.full
http://affymetrix2.bioinf.fbb.msu.ru/
Maybe some of the BioC experts have comments about it?
Best,
Branko
--------------------------
Branislav Misovic,
Department of Toxicogenetics
Leiden University Medical Center
Einthovenweg 20, 2333 ZC Leiden
PO.box 9600, Building2,Room:T3-11
2300 RC Leiden
The Netherlands
Phone: +31 71 526 9636
Mob: 0653135855
E-mail:
b.misovic@lumc.nl
braniti@gmail.com