Hi An,
Our custom CDF annotation package has only the gene name for each
probeset, because we designed it that way.
A probeset's probes can match different locations or even different
chromosomes, and some probes have no match on the genome at all, but
they belong to the probeset because they all match the gene's sequence
perfectly. So it is difficult to assign a single genome location to
the probeset.
But we do have Map/Group files for the probes' genome locations. They
show that for most probesets the probes have adjacent genome locations,
but for some they don't. If you are using version 8 of the custom CDF,
those files are at
http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download_v8.asp
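As a rough sketch of the kind of check I mean (this is not our actual
pipeline; the table layout, the column names probeset/chromosome/start
and all values below are made up for illustration, and the real
Map/Group files may look different), you could summarise the probe
locations per probeset in base R like this:

```r
# Hypothetical probe-level table. Column names and values are
# illustrative assumptions, not the real Map/Group file format.
probe_map <- data.frame(
  probeset   = c("1_at", "1_at", "1_at", "10_at", "10_at", "100_at", "100_at"),
  chromosome = c("19", "19", "19", "8", "2", "20", NA),  # NA = no genome match
  start      = c(1000, 1200, 1500, 3000, 7000, 500, NA),
  stringsAsFactors = FALSE
)

# Summarise one probeset: which chromosomes its mapped probes hit, how
# many probes have no genome match, and the span covered when all
# mapped probes sit on a single chromosome (NA otherwise).
summarise_probeset <- function(d) {
  mapped <- d[!is.na(d$chromosome), ]
  chroms <- unique(mapped$chromosome)
  single <- length(chroms) == 1
  data.frame(
    probeset    = d$probeset[1],
    n_probes    = nrow(d),
    n_unmapped  = sum(is.na(d$chromosome)),
    chromosomes = paste(chroms, collapse = ","),
    span        = if (single) max(mapped$start) - min(mapped$start) else NA_real_,
    stringsAsFactors = FALSE
  )
}

# Apply the summary to each probeset and stack the one-row results.
probeset_summary <- do.call(
  rbind,
  lapply(split(probe_map, probe_map$probeset), summarise_probeset)
)
print(probeset_summary)
```

Probesets with more than one chromosome, or with unmapped probes, are
the ones where a single CHRLOC-style entry would be misleading.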
For more detail, please google 'custom cdf' or just drop me a
message.
Best,
Manhong Dai
> Message: 5
> Date: Fri, 13 Oct 2006 09:36:20 -0400
> From: "James W. MacDonald" <jmacdon at med.umich.edu>
> Subject: Re: [BioC] hs133phsentrezg metadata
> To: "De Bondt, An-7114 [PRDBE]" <adbondt at prdbe.jnj.com>,
> BioConductor_list <bioconductor at stat.math.ethz.ch>
> Message-ID: <452F9654.6000902 at med.umich.edu>
> Content-Type: text/plain; charset="utf-8"; format=flowed
>
> Hi An,
>
> You should not respond just to me. The goal is to keep these
> conversations on the list so others can benefit as well.
>
> De Bondt, An-7114 [PRDBE] wrote:
> > Dear Jim,
> >
> > Indeed, this is the info I was looking for, thanks!
> > Could you also give me guidance on how I can get this CHRLOC info into a
> > metadata package like e.g. hs133phsentrezg? I guess I would have to create
> > a .CDF file first but I do not know how this file needs to be set up...
> > Probably a tab delimited file with as many rows as gene identifiers on
> > the chip and with the following columns:
> > gene identifiers on the chip
> > gene name
> > chromosome
> > chromosome_start of the identifier on the chip
> > chromosome_end of the identifier on the chip
> >
> > Is this right or should I post this on the mailing list?
>
> Well, trying to reverse-engineer a metaData package is probably more
> trouble than it is worth. Why exactly do you need this data to be in a
> package? The rationale for the metaData packages is to supply end users
> with a single package that has a relatively simple interface to the
> data, but once you have the data in your working environment, it is
> there for you to use.
>
> Anyway, if you really want the data in an annotation package, you can
> use AnnBuilder to make one yourself. There are a couple of vignettes in
> that package that show how to do things, and if you have problems, there
> are plenty of threads on the list that you can search for common answers.
>
> I guess the only compelling reason I can think one might want a package
> is if the goal is to use annaffy to output annotated tables with your
> data. Is this the case? If so, you can do the same sort of thing using
> biomaRt and htmlpage() in the annotate package. There is a vignette in
> biomaRt that shows how to do that. I have also written some functions
> for affycoretools that automate the process, but they currently don't
> include the chromosomal location, mainly because I don't find that
> information very useful for say, an HTML table. However, if there is
> interest, I am willing to add that capability.
>
> Best,
>
> Jim
>
>
> >
> > Thanks,
> > An
> >
> >
> > -----Original Message-----
> > From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
> > Sent: Thursday, 12 October 2006 17:04
> > To: De Bondt, An-7114 [PRDBE]
> > Cc: 'bioconductor at stat.math.ethz.ch'
> > Subject: Re: [BioC] hs133phsentrezg metadata
> >
> >
> > De Bondt, An-7114 [PRDBE] wrote:
> >
> >>Dear useR,
> >>
> >>The 'hs133phsentrezg' metadata have only 'hs133phsentrezgGENENAME' mapping
> >>info. The 'hgu133plus2' metadata has also 'hgu133plus2CHRLOC' info (besides
> >>lots of other info). How can I find 'hs133phsentrezgCHRLOC' info?
> >
> >
> > I hadn't realized how sparse the information in these annotation
> > packages really is. I think your best bet is to use biomaRt to get the
> > annotation you want.
> >
> > Something like
> >
> > > mart <- useMart("ensembl","hsapiens_gene_ensembl")
> > Checking attributes and filters ... ok
> > > a <- getBM("chromosome_location", "entrezgene", sub("_at", "",
> > ls(hs133phsentrezgGENENAME)[1:10]), mart=mart, output="list")
> > > sapply(a[[1]], length)
> > 1 10 100 1000 10000 10001 10002 10003 10004 10005
> > 62 157 233 1457 907 105 92 371 80 123
> > > a[[1]][[1]]
> > [1] 63544227 63545175 63546378 63546412 63547557
> > [6] 63547599 63547672 63548610 63548624 63548943
> > [11] 63549372 63549373 63549374 63549543 63550044
> > [16] 63550148 63556679 63556692 63556702 63556866
> > [21] 63556880 63556894 63556903 63557399 63557422
> > [26] 63557669 63558246 63558375 63559327 63559846
> > [31] 63560064 63560292 63560992 63561327 63561328
> > [36] 63561647 63561650 63550747 63553556 63556291
> > [41] 63556303 63550430 63550488 63550864 63550878
> > [46] 63551634 63552081 63552199 63552253 63552624
> > [51] 63552827 63553507 63554072 63554973 63554974
> > [56] 63554975 63554981 63554982 63554984 63555253
> > [61] 63555261 63555962
> >
> > Should do the trick.
> >
> > HTH,
> >
> > Jim
> >
> >
> >
> >>Thanks in advance,
> >>An De Bondt
> >>
> >>
> >>
> >> [[alternative HTML version deleted]]
> >>
> >>_______________________________________________
> >>Bioconductor mailing list
> >>Bioconductor at stat.math.ethz.ch
> >>https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>Search the archives:
> >>http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
> >
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
>
>