question regarding MAS5 normalization with reduced probes
@james-anderson-1641
Last seen 10.2 years ago
Hi,

I am trying to use MAS5 to normalize some CEL files using a reduced set of probes (probes whose PM is not significantly higher than MM have been filtered out). Does anyone know how to do this? Does it require creating a new CDF file?

Thanks a bunch,
-James
cdf cdf • 1.2k views
@james-w-macdonald-5106
Last seen 10 hours ago
United States
Hi James,

Have you tried running mas5() from the affy package? Having never tried, I don't know, but it seems a simple enough test.

If you do need to create a new CDF, you will want to use the affxparser package.

Best,
Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan Department of Human Genetics
Hi Jim,

Thanks for your email. I've run mas5() before, but only with the default settings. From the help page, there does not seem to be a way to specify a reduced set of probes. The usage of the function is:

    mas5(object, normalize = TRUE, sc = 500, analysis = "absolute", ...)

It looks like what matters is whether "object" was read using a reduced set of probes (I believe that if it was, mas5() will do the job), so this may have more to do with ReadAffy(). But ReadAffy() does not appear to have an option for specifying a reduced set of probes either, unless we use an alternative CDF file.

Thanks,
-James
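The idea in the post above (read the CEL files against a reduced CDF, then call mas5() as usual) could be sketched roughly as follows. This is a minimal sketch rather than a tested recipe: `run_mas5_reduced`, `cel_dir`, and `cdf_name` are hypothetical names, and it assumes that the affy package is installed and that a reduced CDF already exists and can be loaded under the name given in `cdf_name`.

```r
# Sketch only: assumes the affy package is installed and that a reduced CDF
# is already available under the (hypothetical) name passed as cdf_name.
run_mas5_reduced <- function(cel_dir, cdf_name) {
  # Read every CEL file in cel_dir, mapping probes with the alternative CDF
  abatch <- affy::ReadAffy(celfile.path = cel_dir, cdfname = cdf_name)
  # Summarize the probe sets as that CDF defines them, with the defaults
  # shown in the mas5() usage quoted in the thread
  affy::mas5(abatch, normalize = TRUE, sc = 500, analysis = "absolute")
}
```

A call might look like `eset <- run_mas5_reduced("path/to/cel/files", "reducedcdf")`, where both arguments are placeholders.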
Hi James,

I misunderstood your question. I thought you already had a reduced set of probes you wanted to run mas5() on.

So yeah, if you want to use a reduced set of probes you could use some code written by Ariel Chernomoretz (and modified by Jenny Drnevich) that has been posted and referenced many times on this list:

https://stat.ethz.ch/pipermail/bioconductor/2006-September/014242.html

Alternatively, you could play with the affxparser package, which has the capability (IIRC) to do the same.

Best,
Jim
Hi Jim,

Thanks a lot for the link. I've tried the code there, and it works without any problem when I take whole probe sets out. However, I run into a problem when I need to take out not only some probe sets but also some individual probes (not the whole probe set), maybe because I did not provide the probes in the correct format. (I assume you are familiar with the script provided in the link.)

If I randomly take out 2000 probe sets from U133A:

    maskedprobeSets = rownames(MAS5_matrix)[sample(1:22283,2000)]
    RemoveProbes(listOutProbes=NULL, listOutProbeSets=maskedprobeSets, cleancdf)

it works fine, and any AffyBatch object read using the cleancdf has a reduced dimension.

However, if I do:

    maskedprobeSets = rownames(MAS5_matrix)[sample(1:22283,2000)]
    maskedprobes = rownames(pm(A))[1:2000]
    RemoveProbes(listOutProbes=maskedprobes, listOutProbeSets=maskedprobeSets, cleancdf)

the error message is:

    Error in get(pset[i], env = get(cdfpackagename)) :
      object '315997at' not found

Do you know the correct input format for the probes (not probe sets) to be taken out?

Thanks a lot,
-James
Hi James,

On 8/31/2010 12:17 PM, James Anderson wrote:
> However, if I do
>
> maskedprobeSets = rownames(MAS5_matrix)[sample(1:22283,2000)]
> maskedprobes = rownames(pm(A))[1:2000]

Assuming that 'A' is an AffyBatch, what you will get back from that call to rownames is a bunch of numbers in character format.

An example using the Dilution dataset:

    > rownames(pm(Dilution))[1:10]
     [1] "175218" "356689" "227696" "237919" "275173" "203444" "357984" "368524"
     [9] "285352" "304510"

Which you can see is not very useful. What you want are the probeset IDs, along with an appended number (which is equal to the position of the probe in the probeset).

Now, say we are concerned about the "100_g_at" probeset in the Dilution dataset:

    > pm(Dilution, "100_g_at")
                  20A   20B    10A   10B
    100_g_at1   221.3 146.3  192.0 116.0
    100_g_at2   685.0 479.0  493.0 328.3
    100_g_at3  1126.3 724.3  849.0 498.3
    100_g_at4   205.0 126.5  136.0  97.0
    100_g_at5   580.8 341.8  374.0 226.0
    100_g_at6   161.3 109.5  139.0  92.3
    100_g_at7  1645.3 992.3 1006.8 670.0
    100_g_at8   624.0 348.0  336.3 224.5
    100_g_at9   274.0 156.0  203.8 119.0
    100_g_at10  240.0 156.3  223.0 122.0
    100_g_at11  438.0 278.3  362.5 198.0
    100_g_at12  554.0 334.8  421.5 220.0
    100_g_at13  235.0 148.0  151.0 107.5
    100_g_at14  571.3 415.0  508.0 271.0
    100_g_at15  904.0 562.0  689.0 330.0
    100_g_at16  141.0  93.0  113.5  75.5

And we don't like the third and seventh probes. We could use

    > rownames(pm(Dilution, "100_g_at"))[c(3,7)]
    [1] "100_g_at3" "100_g_at7"

And feed that into RemoveProbes(), which will then work.

Best,
Jim
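The naming rule Jim describes (probe set ID followed by the probe's position within the set) can be illustrated without any Bioconductor packages. The sketch below uses a made-up toy list shaped like a CDF environment, with one matrix per probe set holding `pm` and `mm` cell indices. The names `toy_cdf`, `probe_names`, and `lookup`, and all index values, are assumptions for illustration only.

```r
# Toy stand-in for a CDF environment: one matrix per probe set,
# with PM and MM cell indices in columns (values are made up).
toy_cdf <- list(
  "100_g_at" = cbind(pm = c(11L, 25L, 37L), mm = c(12L, 26L, 38L)),
  "1000_at"  = cbind(pm = c(5L, 48L),       mm = c(6L, 49L))
)

# Build "<probe set ID><position>" names, e.g. "100_g_at3"
probe_names <- function(cdf) {
  unlist(lapply(names(cdf), function(id) {
    paste0(id, seq_len(nrow(cdf[[id]])))
  }))
}

# Map each PM cell index (as a character string) to its probe name
pm_index <- unlist(lapply(toy_cdf, function(m) m[, "pm"]), use.names = FALSE)
lookup <- setNames(probe_names(toy_cdf), pm_index)

# Translate cell-index style row names into probe-name style
lookup[c("37", "5")]
```

With this toy data, the cell index "37" maps to "100_g_at3" and "5" maps to "1000_at1", which is the form of name that the thread passes as listOutProbes.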
Hi Jim,

Thanks a bunch for your help on this; it works. Sorry to bother you again, but is there a function to convert the probe indices into the probe names you described? For example, for U133A the probe indices are 1:247965. Is there a function to convert them to ProbeSet1_1, ProbeSet1_2, ..., ProbeSet1_11, ProbeSet2_1, ProbeSet2_2, ..., ProbeSet2_11, ..., ProbeSet22283_1, ProbeSet22283_2, ..., ProbeSet22283_11?

Thanks again,
-James
@james-w-macdonald-5106
Last seen 10 hours ago
United States
Hi James,

So you want *all* of the probes? Not sure where you are heading with this, but it isn't difficult to get them. I don't have the hgu133acdf installed, so for an example I will use the hgu95av2cdf:

    > library(hgu95av2cdf)
    > x <- as.list(hgu95av2cdf)
    > y <- sort(unlist(sapply(x, function(q) q[,1])))
    > head(y)
    31483_g_at16    33941_at4    33941_at5
             646          647          648
       31977_at2    32448_at1   38227_at12
             649          650          651
    > head(names(y))
    [1] "31483_g_at16" "33941_at4"    "33941_at5"
    [4] "31977_at2"    "32448_at1"    "38227_at12"
    > length(y)
    [1] 201800

Note that there are actually 403600 usable probe positions on this particular chip, but the other 201800 are MM probes, and have the same exact name, so we don't need those.

Also note that there are 6000 probes on this chip that we ignore (there are actually 409600 rows in the exprs slot of the AffyBatch). These extra probes are the oligo-B2 probes that are on the outside of the chip, used by the scanner to align to the chip.

Best,
Jim
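To connect this back to the original question, probe names of this form can be used to list the probes whose PM signal does not exceed MM. The sketch below is only an illustration on made-up numbers. The rule it uses (flag a probe when PM <= MM in at least half of the arrays) is an assumption standing in for whatever significance test is actually intended, and `pm_mat`, `mm_mat`, and `flag_weak_probes` are hypothetical names.

```r
# Made-up PM and MM intensities: rows are probes, columns are arrays.
probe_ids <- c("100_g_at1", "100_g_at2", "100_g_at3", "1000_at1")
pm_mat <- matrix(c(221, 685, 90, 300,
                   146, 479, 80, 120),
                 ncol = 2, dimnames = list(probe_ids, c("A", "B")))
mm_mat <- matrix(c(100, 200, 95, 310,
                   90, 150, 85, 100),
                 ncol = 2, dimnames = list(probe_ids, c("A", "B")))

# Assumed rule: flag a probe if PM <= MM in at least `min_frac` of arrays.
flag_weak_probes <- function(pm, mm, min_frac = 0.5) {
  stopifnot(identical(dim(pm), dim(mm)))
  frac_weak <- rowMeans(pm <= mm)  # share of arrays where PM does not exceed MM
  rownames(pm)[frac_weak >= min_frac]
}

maskedprobes <- flag_weak_probes(pm_mat, mm_mat)
maskedprobes
```

On this toy data the flagged probes are "100_g_at3" and "1000_at1". A character vector like this has the same form as the names passed as listOutProbes earlier in the thread.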