Hi,
I have written some functions for building your own cdf environment for
one of my colleagues who is interested in this. One of the functions takes
a list as input. Each element of this list has to be a gene, and each of
these elements must consist of a vector of the perfect match (PM) IDs you
are interested in. So this list should look something like this:
$Gene1
[1] 12123 1412414 12231 4421233
$Gene2
[1] 342352 12312 1234112 412211
and so on, where 12123, 1412414, ... are the PM IDs. Using this list as the
argument, you get a cdf environment that contains only the probe sets
named in this list and, within those probe sets, only the probe pairs
that correspond to the listed PM IDs (you get both the PM and the MM
corresponding to each PM ID).
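As a rough sketch of how a list of this shape could be put together in
base R: the data frame below and its column names (gene, pm_id, position)
are made up for illustration, and it assumes that a smaller position value
means the probe lies nearer the 5-prime end. It keeps the 4 PM IDs with
the smallest position for each gene.

```r
# Toy probe table. The column names gene, pm_id and position are
# hypothetical; a real table would come from your own probe annotation.
probes <- data.frame(
  gene     = rep(c("Gene1", "Gene2"), each = 6),
  pm_id    = c(12123, 1412414, 12231, 4421233, 55001, 55002,
               342352, 12312, 1234112, 412211, 66001, 66002),
  position = c(10, 40, 25, 60, 90, 120,
               15, 35, 55, 75, 95, 115),
  stringsAsFactors = FALSE
)

# Number of probe pairs to keep per gene.
n_keep <- 4

# Sort by gene, then by position (assumed: smaller = nearer the 5-prime end).
probes <- probes[order(probes$gene, probes$position), ]

# One list element per gene, each holding the first n_keep PM IDs.
pm_list <- lapply(split(probes$pm_id, probes$gene), head, n = n_keep)

print(pm_list)
```

The result is a named list with one vector of PM IDs per gene, which is
the shape described above.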
Would this function solve your problem?
Best,
Holger
> Thanks Laurent for the tip, but I ran into other problems once I created
> enough identifiers. When I created one unique identifier for each probe
> pair I want inside the new cdf, I got 99112 probe *sets*, because
> > length(unique(ind))
> [1] 99112
>
> This isn't exactly what I have in mind. If, say, I want 4 probe pairs
> (those nearest to the 5-prime end) from each set, how can I proceed to
> create this new cdf?
>
> What I realised from what I've done below is that I get the one probe
> pair that is furthest from the 5-prime end for each set, because the
> furthest pair is at the *bottom* of the probe set. The probe table is
> arranged in increasing order, so it seems to me that it updates itself
> and does not keep the earlier ones.
>
> Please advise, and thanks for the help.
>
> Cheers
> sw
>
> -----Original Message-----
> From: Laurent Gautier [mailto:lgautier@altern.org]
> Sent: Mon 08-Nov-04 1:10 PM
> To: Hee Siew Wan
> Cc: bioconductor@stat.math.ethz.ch
> Subject: Re: [BioC] altcdfenvs
>
>
>
> Hee Siew Wan wrote:
> > Dear All
> >
> > I was trying to use a trial data set (Dilution) to create a new cdf
> > using "altcdfenvs". Instead of using "matchprobes", I created the "m":
>
> ...let's see how 'the "m"' was made then...
>
> > ind <- c(seq(1,199084,by=11), seq(1,199084,by=10), seq(1,199084,by=9),
> >          seq(1,199084,by=8), seq(1,199084,by=7), seq(1,199084,by=6))
> >
> > m.dil <- new.env()
> > m.dil$match <- list(ind[1])
> > m.dil$match <- c(m.dil$match, ind[2:length(ind)])
> > m.dil <- as.list(m.dil)
> > length(m.dil$match) # [1] 146637
> >
> > id.dil <- hgu95av2probe$Probe.Set.Name[ind]
> >
> > dil.cdf <- buildCdfEnv.matchprobes(m.dil, id.dil, nrow.chip=640,
> >     ncol.chip=640, chiptype="HG-U95Av2", probes.pack="hgu95av2probe")
> >
> > new.dil <- Dilution[,1:2]
> > validAffyBatch(new.dil, dil.cdf) # [1] TRUE
> > new.dil.cdfenv <- dil.cdf@envir
> > new.dil@cdfName <- "new.dil.cdfenv"
> >
> >
> > > new.dil
> >
> > AffyBatch object
> > size of arrays=640x640 features (6405 kb)
> > cdf=new.dil.cdfenv (12453 affyids)
> > number of samples=2
> > number of genes=12453
> > annotation=hgu95av2
> >
> >
> > > length(pm(new.dil[,1]))
> >
> > [1] 12453
> >
> > As noted above, I have 12453 probe sets with my new cdf, but I also have
> > only 12453 probe pairs when in fact I want 146637 probe pairs. The new
> > cdf returns only 1 probe pair per set. Is there a way to get all 146637
> > probe pairs?
>
> ...then you may want to actually provide enough _identifiers_ (i.e.,
> unique strings) to achieve this. On my side, having made the variable
> 'id.dil' the way you did, I have:
> > length(unique(id.dil))
> [1] 12453
>
> (I did not anticipate this could be a 'gotcha'; a warning will be added
> to 'buildCdfEnv.matchprobes')
>
>
> > I tried doing the same thing for the ath1121501 array. For this case, I
> > created a data.frame from "ath1121501probe" with the following columns:
> >
> > > names(newath)
> >
> > [1] "sequence" "probe" "X" "Y" "position"
> >
> > However, when I run
> >
> > m <- matchprobes(newath$sequence, ath1121501probe$sequence)
> >
> > I found out that for some sequences, I have more than 1 match. For
> > example,
> >
> >
> > > ath1121501probe$sequence[16023]
> >
> > [1] "GAGTATGCAGTCGAGTGGTGTGATG"
> >
> > > ath1121501probe$sequence[16012]
> >
> > [1] "GAGTATGCAGTCGAGTGGTGTGATG"
> >
> > Hence, the probe that I'm interested in may not be matched to the
> > correct one.
>
>
> ...I am not certain I completely follow what you mean...
>
>
> > The versions I'm using:
> > R: 1.9.0
> > altcdfenvs: 1.0.0
> > affy: 1.4.31
> > ath1121501probe: 1.01
>
> You may want to upgrade to a more recent version of R and of the
> packages.
>
>
>
> Hoping it helps,
>
>
> L.
>
>
> > on Windows XP Professional Version 2002.
> >
> > Did I do something wrong along the way for both methods? I'd appreciate
> > any help or advice regarding how to get the selected probe pairs for
> > analysis. Also, how do I cite the package "altcdfenvs"? Thank you.
> >
> > Regards
> > Hee, Siew Wan
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor@stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >