Joao Sollari Lopes
Hi Jim,
Following on from the discussion on annotation of Affymetrix Gene ST
arrays, I wonder whether there is a standard way to deal with multiple
mRNAs (from different genes) that are assigned to the same transcript
cluster. Is it generally accepted to follow the naive approach of
picking the first mRNA in the list?
I know that the mRNA assignments are listed in ranked order, so is it
safe simply to rely on the ranking already performed by Affymetrix?
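To make the naive approach concrete, here is a minimal sketch of it in base R. It assumes the mrna_assignment field of the transcript csv separates assignments with " /// " and the fields within one assignment with " // ", with the accession first. The helper name first_mrna and the example value are made up for illustration; this only takes the first entry and says nothing about whether the Affymetrix ranking is biologically appropriate.

```r
# Hypothetical helper: return the top-ranked accession from a single
# mrna_assignment field.
# Assumed layout: assignments separated by " /// ", fields within an
# assignment separated by " // ", accession in the first field.
first_mrna <- function(assignment) {
  # "---" is assumed to mark an empty annotation field
  if (is.na(assignment) || assignment == "---") return(NA_character_)
  entries <- strsplit(assignment, " /// ", fixed = TRUE)[[1]]
  fields <- strsplit(entries[1], " // ", fixed = TRUE)[[1]]
  trimws(fields[1])
}

# Made-up field value with two assignments, for illustration only
x <- "NM_131001 // RefSeq // gene A /// ENSDART00000012345 // ENSEMBL // gene B"
first_mrna(x)  # "NM_131001"

# Applied to a whole column (here a toy vector standing in for it)
assignments <- c(x, "---")
top <- vapply(assignments, first_mrna, character(1), USE.NAMES = FALSE)
```

Counting the entries per field (length of the " /// " split) would show how many transcript clusters are actually affected before deciding whether the first entry is good enough.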
Joao
On 08/29/2013 04:22 PM, James W. MacDonald wrote:
> Hi Joao,
>
> Unfortunately there are no readily available packages for annotating
> all the new model organism arrays from Affy. However, the functions
> to create your own annotation package do exist. If you look at the
> AnnotationForge package, specifically the SQLForge vignette
> (http://www.bioconductor.org/packages/release/bioc/vignettes/AnnotationForge/inst/doc/SQLForge.pdf),
> it is pretty straightforward to make your own annotation package.
>
> I am assuming you are summarizing at the transcript level, so would
> want to make a zebgene11sttranscriptcluster.db package. For this you
> need the transcript csv file from Affy
> (http://www.affymetrix.com/Auth/analysis/downloads/na33/wtgene-33_3/ZebGene-1_1-st-v1.na33.3.zv9.transcript.csv.zip).
> From this you want to generate a two-column file with the probeset
> ID in the first column, and then GenBank or RefSeq IDs in the second.
>
> This is the tough part, as the annotation files need to be parsed to
> create this file.
>
> I wrote an Rscript to parse these files that you could use. It is
> pretty naive, but seems to do a relatively reasonable job. You will
> obviously need to change the first line to point to the correct
> directory, and will have to have the org.Dr.eg.db package installed,
> but this should
>
> <copy from below>
>
> #!/data/programs/lib64/R/bin/Rscript
> args <- commandArgs(TRUE)
> if(length(args) < 3) stop(paste("Usage: parseAffyTranscripts.R
> <transcript.csv> <organism.db package>