gcrma problem while processing HuGene-1_0-st-v1 genechip from Affymetrix
4
0
Entering edit mode
suparna mitra ▴ 290
@suparna-mitra-5328
Last seen 10.3 years ago
Hi, I am very new to biocondunctor and microaray. I have limited experience with R. I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 microarray data. I selectected different normalization method (rma, gcrma and mas5). For my data rma worked but for for gcrma and mas5 both I have problem. For gcrma it gives me a error like: Computing affinitiesError: length(prlen) == 1 is not TRUE And for mas 5 it seems working but I get only a whole list of NA. Here is what I have done. > mydata <- ReadAffy() > mydata AffyBatch object size of arrays=1050x1050 features (16 kb) cdf=HuGene-1_0-st-v1 (32321 affyids) number of samples=18 number of genes=32321 annotation=hugene10stv1 > eset <- rma(mydata) Background correcting Normalizing Calculating Expression > eset_justrma=justRMA() > eset_mas5 <- mas5(mydata) background correction: mas PM/MM correction : mas expression values: mas background correcting...done. 32321 ids to be processed | | |####################| > eset_gcrma <- gcrma(mydata) Adjusting for optical effect..................Done. Computing affinitiesError: length(prlen) == 1 is not TRUE Here is the error > eset_justrma # this worked fine ExpressionSet (storageMode: lockedEnvironment) assayData: 32321 features, 18 samples element names: exprs, se.exprs protocolData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: sample varMetadata: labelDescription featureData: none experimentData: use 'experimentData(object)' Annotation: hugene10stv1 > eset_mas5 # this seems worked fine but resulted all NA ExpressionSet (storageMode: lockedEnvironment) assayData: 32321 features, 18 samples element names: exprs, se.exprs protocolData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: sample varMetadata: labelDescription featureData: none experimentData: use 'experimentData(object)' Annotation: hugene10stv1 > write.exprs(eset_justrma,file="eset_justrma.csv") > write.exprs(eset_mas5,file="eset_mas5.csv") > write.exprs(eset,file="eset.csv") Any help in this will be really great. Being a novice, I am very sorry if I am doing any silly mistake. Thanks a lot, Suparna. -- Dr. Suparna Mitra Wolfson Centre for Personalised Medicine Department of Molecular and Clinical Pharmacology Institute of Translational Medicine University of Liverpool Block A: Waterhouse Buildings, L69 3GL Liverpool Tel. +44 (0)151 795 5414, Internal ext: 55414 M: +44 (0) 7523228621 Email id: smitra@liverpool.ac.uk Alternative Email id: suparna.mitra.sm@gmail.com [[alternative HTML version deleted]]
Normalization gcrma Normalization gcrma • 3.0k views
ADD COMMENT
0
Entering edit mode
@richard-friedman-513
Last seen 10.3 years ago
Dear Suparna, Both GCRMA and MAS 5 require mismatch probes which are not on HuGene-1_0-st-v1, so they cannot be used on this chip. Best wishes, Rich ------------------------------------------------------------ Richard A. Friedman, PhD Associate Research Scientist, Biomedical Informatics Shared Resource Herbert Irving Comprehensive Cancer Center (HICCC) Lecturer, Department of Biomedical Informatics (DBMI) Educational Coordinator, Center for Computational Biology and Bioinformatics (C2B2)/ National Center for Multiscale Analysis of Genomic Networks (MAGNet) Room 824 Irving Cancer Research Center Columbia University 1130 St. Nicholas Ave New York, NY 10032 (212)851-4765 (voice) friedman at cancercenter.columbia.edu http://cancercenter.columbia.edu/~friedman/ "School is an evil plot to suppress my individuality" Rose Friedman, age15 On Jun 11, 2012, at 11:00 AM, suparna mitra wrote: > Hi, > I am very new to biocondunctor and microaray. I have limited > experience > with R. > I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 > microarray > data. I selectected different normalization method (rma, gcrma and > mas5). > For my data rma worked but for for gcrma and mas5 both I have problem. > For gcrma it gives me a error like: Computing affinitiesError: > length(prlen) == 1 is not TRUE > > And for mas 5 it seems working but I get only a whole list of NA. > > Here is what I have done. > >> mydata <- ReadAffy() >> mydata > AffyBatch object > size of arrays=1050x1050 features (16 kb) > cdf=HuGene-1_0-st-v1 (32321 affyids) > number of samples=18 > number of genes=32321 > annotation=hugene10stv1 > >> eset <- rma(mydata) > Background correcting > Normalizing > Calculating Expression >> eset_justrma=justRMA() >> eset_mas5 <- mas5(mydata) > background correction: mas > PM/MM correction : mas > expression values: mas > background correcting...done. > 32321 ids to be processed > | | > |####################| >> eset_gcrma <- gcrma(mydata) > Adjusting for optical effect..................Done. > Computing affinitiesError: length(prlen) == 1 is not TRUE Here is > the > error > >> eset_justrma # this worked fine > ExpressionSet (storageMode: lockedEnvironment) > assayData: 32321 features, 18 samples > element names: exprs, se.exprs > protocolData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st- > v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: ScanDate > varMetadata: labelDescription > phenoData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st- > v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: sample > varMetadata: labelDescription > featureData: none > experimentData: use 'experimentData(object)' > Annotation: hugene10stv1 >> eset_mas5 # this seems worked fine but resulted all NA > ExpressionSet (storageMode: lockedEnvironment) > assayData: 32321 features, 18 samples > element names: exprs, se.exprs > protocolData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st- > v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: ScanDate > varMetadata: labelDescription > phenoData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st- > v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: sample > varMetadata: labelDescription > featureData: none > experimentData: use 'experimentData(object)' > Annotation: hugene10stv1 >> write.exprs(eset_justrma,file="eset_justrma.csv") >> write.exprs(eset_mas5,file="eset_mas5.csv") >> write.exprs(eset,file="eset.csv") > > Any help in this will be really great. Being a novice, I am very > sorry if I > am doing any silly mistake. > Thanks a lot, > Suparna. > > -- > Dr. Suparna Mitra > Wolfson Centre for Personalised Medicine > Department of Molecular and Clinical Pharmacology > Institute of Translational Medicine University of Liverpool > Block A: Waterhouse Buildings, L69 3GL Liverpool > > Tel. +44 (0)151 795 5414, Internal ext: 55414 > M: +44 (0) 7523228621 > Email id: smitra at liverpool.ac.uk > Alternative Email id: suparna.mitra.sm at gmail.com > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
Dear Richard, Thank you very much for the quick response. Being new in this area I was bit confused with the options. Can you suggest any quick tutorial which can help be in self learning these options? hanks a lot, Suparna. On 11 June 2012 16:03, Richard Friedman <friedman@cancercenter.columbia.edu>wrote: > Dear Suparna, > > Both GCRMA and MAS 5 require mismatch probes which are not > on HuGene-1_0-st-v1, so they cannot be used on this chip. > > Best wishes, > Rich > ------------------------------**------------------------------ > Richard A. Friedman, PhD > Associate Research Scientist, > Biomedical Informatics Shared Resource > Herbert Irving Comprehensive Cancer Center (HICCC) > Lecturer, > Department of Biomedical Informatics (DBMI) > Educational Coordinator, > Center for Computational Biology and Bioinformatics (C2B2)/ > National Center for Multiscale Analysis of Genomic Networks (MAGNet) > Room 824 > Irving Cancer Research Center > Columbia University > 1130 St. Nicholas Ave > New York, NY 10032 > (212)851-4765 (voice) > friedman@cancercenter.**columbia.edu <friedman@cancercenter.columbia.edu> > http://cancercenter.columbia.**edu/~friedman/<http: cancercenter.co="" lumbia.edu="" ~friedman=""/> > > "School is an evil plot to suppress my individuality" > > Rose Friedman, age15 > > > > > > > > > > > > On Jun 11, 2012, at 11:00 AM, suparna mitra wrote: > > Hi, >> I am very new to biocondunctor and microaray. I have limited experience >> with R. >> I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 microarray >> data. I selectected different normalization method (rma, gcrma and mas5). >> For my data rma worked but for for gcrma and mas5 both I have problem. >> For gcrma it gives me a error like: Computing affinitiesError: >> length(prlen) == 1 is not TRUE >> >> And for mas 5 it seems working but I get only a whole list of NA. >> >> Here is what I have done. >> >> mydata <- ReadAffy() >>> mydata >>> >> AffyBatch object >> size of arrays=1050x1050 features (16 kb) >> cdf=HuGene-1_0-st-v1 (32321 affyids) >> number of samples=18 >> number of genes=32321 >> annotation=hugene10stv1 >> >> eset <- rma(mydata) >>> >> Background correcting >> Normalizing >> Calculating Expression >> >>> eset_justrma=justRMA() >>> eset_mas5 <- mas5(mydata) >>> >> background correction: mas >> PM/MM correction : mas >> expression values: mas >> background correcting...done. >> 32321 ids to be processed >> | | >> |####################| >> >>> eset_gcrma <- gcrma(mydata) >>> >> Adjusting for optical effect..................Done. >> Computing affinitiesError: length(prlen) == 1 is not TRUE Here is the >> error >> >> eset_justrma # this worked fine >>> >> ExpressionSet (storageMode: lockedEnvironment) >> assayData: 32321 features, 18 samples >> element names: exprs, se.exprs >> protocolData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: ScanDate >> varMetadata: labelDescription >> phenoData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: sample >> varMetadata: labelDescription >> featureData: none >> experimentData: use 'experimentData(object)' >> Annotation: hugene10stv1 >> >>> eset_mas5 # this seems worked fine but resulted all NA >>> >> ExpressionSet (storageMode: lockedEnvironment) >> assayData: 32321 features, 18 samples >> element names: exprs, se.exprs >> protocolData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: ScanDate >> varMetadata: labelDescription >> phenoData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: sample >> varMetadata: labelDescription >> featureData: none >> experimentData: use 'experimentData(object)' >> Annotation: hugene10stv1 >> >>> write.exprs(eset_justrma,file=**"eset_justrma.csv") >>> write.exprs(eset_mas5,file="**eset_mas5.csv") >>> write.exprs(eset,file="eset.**csv") >>> >> >> Any help in this will be really great. Being a novice, I am very sorry if >> I >> am doing any silly mistake. >> Thanks a lot, >> Suparna. >> >> -- >> Dr. Suparna Mitra >> Wolfson Centre for Personalised Medicine >> Department of Molecular and Clinical Pharmacology >> Institute of Translational Medicine University of Liverpool >> Block A: Waterhouse Buildings, L69 3GL Liverpool >> >> Tel. +44 (0)151 795 5414, Internal ext: 55414 >> M: +44 (0) 7523228621 >> Email id: smitra@liverpool.ac.uk >> Alternative Email id: suparna.mitra.sm@gmail.com >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> > > -- Dr. Suparna Mitra Wolfson Centre for Personalised Medicine Department of Molecular and Clinical Pharmacology Institute of Translational Medicine University of Liverpool Block A: Waterhouse Buildings, L69 3GL Liverpool Tel. +44 (0)151 795 5414, Internal ext: 55414 M: +44 (0) 7523228621 Email id: smitra@liverpool.ac.uk Alternative Email id: suparna.mitra.sm@gmail.com [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
@james-w-macdonald-5106
Last seen 1 day ago
United States
Hi Suparna, On 6/11/2012 11:00 AM, suparna mitra wrote: > Hi, > I am very new to biocondunctor and microaray. I have limited experience > with R. > I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 microarray > data. I selectected different normalization method (rma, gcrma and mas5). > For my data rma worked but for for gcrma and mas5 both I have problem. This is to be expected. The HuGene array is a PM-only design, so mas5() won't work (because the mas5 algorithm requires subtracting MM from PM, and there are no MM probes). In addition, the default for gcrma() is to estimate the background for probes based on the GC content, using bins of MM probes. Again, without any MM probes, this won't work. Note however that gcrma() has an 'NCprobe' argument that you can use to specify an index of negative control probes. This is a non-trivial thing to do, and may be beyond your abilities if you are very new to R and BioC. To get the index of these probes, you will need to decide which probes are negative control probes, and then you can use the indexProbes() function, passing a character vector of the negative control probes to the genenames argument. This will return a list of indices for each probeset that you can unlist() prior to feeding in to gcrma(). Or you could just stick with rma(), like the vast majority of people do. Best, Jim > For gcrma it gives me a error like: Computing affinitiesError: > length(prlen) == 1 is not TRUE > > And for mas 5 it seems working but I get only a whole list of NA. > > Here is what I have done. > >> mydata<- ReadAffy() >> mydata > AffyBatch object > size of arrays=1050x1050 features (16 kb) > cdf=HuGene-1_0-st-v1 (32321 affyids) > number of samples=18 > number of genes=32321 > annotation=hugene10stv1 > >> eset<- rma(mydata) > Background correcting > Normalizing > Calculating Expression >> eset_justrma=justRMA() >> eset_mas5<- mas5(mydata) > background correction: mas > PM/MM correction : mas > expression values: mas > background correcting...done. > 32321 ids to be processed > | | > |####################| >> eset_gcrma<- gcrma(mydata) > Adjusting for optical effect..................Done. > Computing affinitiesError: length(prlen) == 1 is not TRUE Here is the > error > >> eset_justrma # this worked fine > ExpressionSet (storageMode: lockedEnvironment) > assayData: 32321 features, 18 samples > element names: exprs, se.exprs > protocolData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: ScanDate > varMetadata: labelDescription > phenoData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: sample > varMetadata: labelDescription > featureData: none > experimentData: use 'experimentData(object)' > Annotation: hugene10stv1 >> eset_mas5 # this seems worked fine but resulted all NA > ExpressionSet (storageMode: lockedEnvironment) > assayData: 32321 features, 18 samples > element names: exprs, se.exprs > protocolData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: ScanDate > varMetadata: labelDescription > phenoData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: sample > varMetadata: labelDescription > featureData: none > experimentData: use 'experimentData(object)' > Annotation: hugene10stv1 >> write.exprs(eset_justrma,file="eset_justrma.csv") >> write.exprs(eset_mas5,file="eset_mas5.csv") >> write.exprs(eset,file="eset.csv") > Any help in this will be really great. Being a novice, I am very sorry if I > am doing any silly mistake. > Thanks a lot, > Suparna. > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD COMMENT
0
Entering edit mode
Jim and list, Thank you for bringing the NCprobe option to my attention. I did not know that it had been implemented. Does anyone out there have a list of negative control probes more Human st 1.0 and for Mouse st 1.0 ? Thanks and best wishes, Rich On Jun 11, 2012, at 11:18 AM, James W. MacDonald wrote: > Hi Suparna, > > On 6/11/2012 11:00 AM, suparna mitra wrote: >> Hi, >> I am very new to biocondunctor and microaray. I have limited >> experience >> with R. >> I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 >> microarray >> data. I selectected different normalization method (rma, gcrma and >> mas5). >> For my data rma worked but for for gcrma and mas5 both I have >> problem. > > This is to be expected. The HuGene array is a PM-only design, so > mas5() won't work (because the mas5 algorithm requires subtracting > MM from PM, and there are no MM probes). In addition, the default > for gcrma() is to estimate the background for probes based on the GC > content, using bins of MM probes. Again, without any MM probes, this > won't work. > > Note however that gcrma() has an 'NCprobe' argument that you can use > to specify an index of negative control probes. This is a non- > trivial thing to do, and may be beyond your abilities if you are > very new to R and BioC. > > To get the index of these probes, you will need to decide which > probes are negative control probes, and then you can use the > indexProbes() function, passing a character vector of the negative > control probes to the genenames argument. This will return a list of > indices for each probeset that you can unlist() prior to feeding in > to gcrma(). > > Or you could just stick with rma(), like the vast majority of people > do. > > Best, > > Jim > > >> For gcrma it gives me a error like: Computing affinitiesError: >> length(prlen) == 1 is not TRUE >> >> And for mas 5 it seems working but I get only a whole list of NA. >> >> Here is what I have done. >> >>> mydata<- ReadAffy() >>> mydata >> AffyBatch object >> size of arrays=1050x1050 features (16 kb) >> cdf=HuGene-1_0-st-v1 (32321 affyids) >> number of samples=18 >> number of genes=32321 >> annotation=hugene10stv1 >> >>> eset<- rma(mydata) >> Background correcting >> Normalizing >> Calculating Expression >>> eset_justrma=justRMA() >>> eset_mas5<- mas5(mydata) >> background correction: mas >> PM/MM correction : mas >> expression values: mas >> background correcting...done. >> 32321 ids to be processed >> | | >> |####################| >>> eset_gcrma<- gcrma(mydata) >> Adjusting for optical effect..................Done. >> Computing affinitiesError: length(prlen) == 1 is not TRUE Here is >> the >> error >> >>> eset_justrma # this worked fine >> ExpressionSet (storageMode: lockedEnvironment) >> assayData: 32321 features, 18 samples >> element names: exprs, se.exprs >> protocolData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st- >> v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: ScanDate >> varMetadata: labelDescription >> phenoData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st- >> v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: sample >> varMetadata: labelDescription >> featureData: none >> experimentData: use 'experimentData(object)' >> Annotation: hugene10stv1 >>> eset_mas5 # this seems worked fine but resulted all NA >> ExpressionSet (storageMode: lockedEnvironment) >> assayData: 32321 features, 18 samples >> element names: exprs, se.exprs >> protocolData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st- >> v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: ScanDate >> varMetadata: labelDescription >> phenoData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st- >> v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: sample >> varMetadata: labelDescription >> featureData: none >> experimentData: use 'experimentData(object)' >> Annotation: hugene10stv1 >>> write.exprs(eset_justrma,file="eset_justrma.csv") >>> write.exprs(eset_mas5,file="eset_mas5.csv") >>> write.exprs(eset,file="eset.csv") >> Any help in this will be really great. Being a novice, I am very >> sorry if I >> am doing any silly mistake. >> Thanks a lot, >> Suparna. >> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
Hi Rich, On 6/11/2012 11:24 AM, Richard Friedman wrote: > Jim and list, > > Thank you for bringing the NCprobe option to my attention. > I did not know that it had been implemented. > > Does anyone out there have a list of negative control probes more > Human st 1.0 and for Mouse st 1.0 ? No, but it's easy enough to get. > library(pd.hugene.1.0.st.v1) > con <- db(pd.hugene.1.0.st.v1) > dbListTables(con) [1] "chrom_dict" "core_mps" "featureSet" "level_dict" "pmfeature" [6] "table_info" "type_dict" > dbGetQuery(con, "select * from type_dict;") type type_id 1 1 main 2 2 control->affx 3 3 control->chip 4 4 control->bgp->antigenomic 5 5 control->bgp->genomic 6 6 normgene->exon 7 7 normgene->intron 8 8 rescue->FLmRNA->unmapped So let's say we want to call just bgp probes background. Now I happen to know we want the featureSet table, but you can look to see what is in each table using dbListFields() > dbListFields(con, "featureSet") [1] "fsetid" "strand" "start" [4] "stop" "transcript_cluster_id" "exon_id" [7] "crosshyb_type" "level" "chrom" [10] "type" You can also do something like dbGetQuery(con, "select * from featureSet limit 10;") to get an idea what is in a given table. So to get the probesets we want, > x <-dbGetQuery(con, "select fsetid from featureSet where type in ('4','5');") > head(x) fsetid 1 7892601 2 7892698 3 7892756 4 7892815 5 7892916 6 7892943 Now there may be a further complication that I don't have the time or desire to check out. These are probeset level IDs, and most people do things at the transcript level (and if you are doing gcrma() this is all you can do). So I don't know if you need to convert these fsetids to meta_fsetids, which are transcript level probesets. If so, you can map from fsetid to meta_fsetid using the core_mps table. I leave doing that mapping up to you, grasshopper. As a further test of your SQL awesomeness, you could figure out how to use an inner join statement so you can get the meta_fsetids in one database query. Knowing how to do that sort of thing can come in really handy - you can even link different .db packages to do cross-database queries, which can make your life much better if you need to do some complex mappings. There are some examples in one of the AnnotationDbi vignettes. Best, Jim > > Thanks and best wishes, > Rich > > > On Jun 11, 2012, at 11:18 AM, James W. MacDonald wrote: > >> Hi Suparna, >> >> On 6/11/2012 11:00 AM, suparna mitra wrote: >>> Hi, >>> I am very new to biocondunctor and microaray. I have limited >>> experience >>> with R. >>> I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 >>> microarray >>> data. I selectected different normalization method (rma, gcrma and >>> mas5). >>> For my data rma worked but for for gcrma and mas5 both I have problem. >> >> This is to be expected. The HuGene array is a PM-only design, so >> mas5() won't work (because the mas5 algorithm requires subtracting MM >> from PM, and there are no MM probes). In addition, the default for >> gcrma() is to estimate the background for probes based on the GC >> content, using bins of MM probes. Again, without any MM probes, this >> won't work. >> >> Note however that gcrma() has an 'NCprobe' argument that you can use >> to specify an index of negative control probes. This is a non- trivial >> thing to do, and may be beyond your abilities if you are very new to >> R and BioC. >> >> To get the index of these probes, you will need to decide which >> probes are negative control probes, and then you can use the >> indexProbes() function, passing a character vector of the negative >> control probes to the genenames argument. This will return a list of >> indices for each probeset that you can unlist() prior to feeding in >> to gcrma(). >> >> Or you could just stick with rma(), like the vast majority of people do. >> >> Best, >> >> Jim >> >> >>> For gcrma it gives me a error like: Computing affinitiesError: >>> length(prlen) == 1 is not TRUE >>> >>> And for mas 5 it seems working but I get only a whole list of NA. >>> >>> Here is what I have done. >>> >>>> mydata<- ReadAffy() >>>> mydata >>> AffyBatch object >>> size of arrays=1050x1050 features (16 kb) >>> cdf=HuGene-1_0-st-v1 (32321 affyids) >>> number of samples=18 >>> number of genes=32321 >>> annotation=hugene10stv1 >>> >>>> eset<- rma(mydata) >>> Background correcting >>> Normalizing >>> Calculating Expression >>>> eset_justrma=justRMA() >>>> eset_mas5<- mas5(mydata) >>> background correction: mas >>> PM/MM correction : mas >>> expression values: mas >>> background correcting...done. >>> 32321 ids to be processed >>> | | >>> |####################| >>>> eset_gcrma<- gcrma(mydata) >>> Adjusting for optical effect..................Done. >>> Computing affinitiesError: length(prlen) == 1 is not TRUE Here is the >>> error >>> >>>> eset_justrma # this worked fine >>> ExpressionSet (storageMode: lockedEnvironment) >>> assayData: 32321 features, 18 samples >>> element names: exprs, se.exprs >>> protocolData >>> sampleNames: MC1_(HuGene-1_0-st-v1).CEL >>> MC10_(HuGene-1_0-st-v1).CEL ... >>> MC9_(HuGene-1_0-st-v1).CEL (18 total) >>> varLabels: ScanDate >>> varMetadata: labelDescription >>> phenoData >>> sampleNames: MC1_(HuGene-1_0-st-v1).CEL >>> MC10_(HuGene-1_0-st-v1).CEL ... >>> MC9_(HuGene-1_0-st-v1).CEL (18 total) >>> varLabels: sample >>> varMetadata: labelDescription >>> featureData: none >>> experimentData: use 'experimentData(object)' >>> Annotation: hugene10stv1 >>>> eset_mas5 # this seems worked fine but resulted all NA >>> ExpressionSet (storageMode: lockedEnvironment) >>> assayData: 32321 features, 18 samples >>> element names: exprs, se.exprs >>> protocolData >>> sampleNames: MC1_(HuGene-1_0-st-v1).CEL >>> MC10_(HuGene-1_0-st-v1).CEL ... >>> MC9_(HuGene-1_0-st-v1).CEL (18 total) >>> varLabels: ScanDate >>> varMetadata: labelDescription >>> phenoData >>> sampleNames: MC1_(HuGene-1_0-st-v1).CEL >>> MC10_(HuGene-1_0-st-v1).CEL ... >>> MC9_(HuGene-1_0-st-v1).CEL (18 total) >>> varLabels: sample >>> varMetadata: labelDescription >>> featureData: none >>> experimentData: use 'experimentData(object)' >>> Annotation: hugene10stv1 >>>> write.exprs(eset_justrma,file="eset_justrma.csv") >>>> write.exprs(eset_mas5,file="eset_mas5.csv") >>>> write.exprs(eset,file="eset.csv") >>> Any help in this will be really great. Being a novice, I am very >>> sorry if I >>> am doing any silly mistake. >>> Thanks a lot, >>> Suparna. >>> >> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD REPLY
0
Entering edit mode
Dear James, On 11 June 2012 16:18, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi Suparna, > > > On 6/11/2012 11:00 AM, suparna mitra wrote: > >> Hi, >> I am very new to biocondunctor and microaray. I have limited experience >> with R. >> I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 microarray >> data. I selectected different normalization method (rma, gcrma and mas5). >> For my data rma worked but for for gcrma and mas5 both I have problem. >> > > This is to be expected. The HuGene array is a PM-only design, so mas5() > won't work (because the mas5 algorithm requires subtracting MM from PM, and > there are no MM probes). In addition, the default for gcrma() is to > estimate the background for probes based on the GC content, using bins of > MM probes. Again, without any MM probes, this won't work. > > Note however that gcrma() has an 'NCprobe' argument that you can use to > specify an index of negative control probes. This is a non-trivial thing to > do, and may be beyond your abilities if you are very new to R and BioC. > Thank you for this information I will try for this. I have some knowledge of R but for BioC I am very new. > > To get the index of these probes, you will need to decide which probes are > negative control probes, and then you can use the indexProbes() function, > passing a character vector of the negative control probes to the genenames > argument. This will return a list of indices for each probeset that you can > unlist() prior to feeding in to gcrma(). > > Or you could just stick with rma(), like the vast majority of people do. > Thanks a lot for your help, best wishes, Suparna. > > Best, > > Jim > > > > For gcrma it gives me a error like: Computing affinitiesError: >> length(prlen) == 1 is not TRUE >> >> And for mas 5 it seems working but I get only a whole list of NA. >> >> Here is what I have done. >> >> mydata<- ReadAffy() >>> mydata >>> >> AffyBatch object >> size of arrays=1050x1050 features (16 kb) >> cdf=HuGene-1_0-st-v1 (32321 affyids) >> number of samples=18 >> number of genes=32321 >> annotation=hugene10stv1 >> >> eset<- rma(mydata) >>> >> Background correcting >> Normalizing >> Calculating Expression >> >>> eset_justrma=justRMA() >>> eset_mas5<- mas5(mydata) >>> >> background correction: mas >> PM/MM correction : mas >> expression values: mas >> background correcting...done. >> 32321 ids to be processed >> | | >> |####################| >> >>> eset_gcrma<- gcrma(mydata) >>> >> Adjusting for optical effect..................Done. >> Computing affinitiesError: length(prlen) == 1 is not TRUE Here is the >> error >> >> eset_justrma # this worked fine >>> >> ExpressionSet (storageMode: lockedEnvironment) >> assayData: 32321 features, 18 samples >> element names: exprs, se.exprs >> protocolData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: ScanDate >> varMetadata: labelDescription >> phenoData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: sample >> varMetadata: labelDescription >> featureData: none >> experimentData: use 'experimentData(object)' >> Annotation: hugene10stv1 >> >>> eset_mas5 # this seems worked fine but resulted all NA >>> >> ExpressionSet (storageMode: lockedEnvironment) >> assayData: 32321 features, 18 samples >> element names: exprs, se.exprs >> protocolData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: ScanDate >> varMetadata: labelDescription >> phenoData >> sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... >> MC9_(HuGene-1_0-st-v1).CEL (18 total) >> varLabels: sample >> varMetadata: labelDescription >> featureData: none >> experimentData: use 'experimentData(object)' >> Annotation: hugene10stv1 >> >>> write.exprs(eset_justrma,file=**"eset_justrma.csv") >>> write.exprs(eset_mas5,file="**eset_mas5.csv") >>> write.exprs(eset,file="eset.**csv") >>> >> Any help in this will be really great. Being a novice, I am very sorry if >> I >> am doing any silly mistake. >> Thanks a lot, >> Suparna. >> >> > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- Dr. Suparna Mitra Wolfson Centre for Personalised Medicine Department of Molecular and Clinical Pharmacology Institute of Translational Medicine University of Liverpool Block A: Waterhouse Buildings, L69 3GL Liverpool Tel. +44 (0)151 795 5414, Internal ext: 55414 M: +44 (0) 7523228621 Email id: smitra@liverpool.ac.uk Alternative Email id: suparna.mitra.sm@gmail.com [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
suparna mitra ▴ 290
@suparna-mitra-5328
Last seen 10.3 years ago
Hi, I am very new to biocondunctor and microaray. I have limited experience with R. I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 microarray data. I selectected different normalization method (rma, gcrma and mas5). For my data rma worked but for for gcrma and mas5 both I have problem. For gcrma it gives me a error like: Computing affinitiesError: length(prlen) == 1 is not TRUE And for mas 5 it seems working but I get only a whole list of NA. Here is what I have done. > mydata <- ReadAffy() > mydata AffyBatch object size of arrays=1050x1050 features (16 kb) cdf=HuGene-1_0-st-v1 (32321 affyids) number of samples=18 number of genes=32321 annotation=hugene10stv1 > eset <- rma(mydata) Background correcting Normalizing Calculating Expression > eset_justrma=justRMA() > eset_mas5 <- mas5(mydata) background correction: mas PM/MM correction : mas expression values: mas background correcting...done. 32321 ids to be processed | | |####################| > eset_gcrma <- gcrma(mydata) Adjusting for optical effect..................Done. Computing affinitiesError: length(prlen) == 1 is not TRUE Here is the error > eset_justrma # this worked fine ExpressionSet (storageMode: lockedEnvironment) assayData: 32321 features, 18 samples element names: exprs, se.exprs protocolData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: sample varMetadata: labelDescription featureData: none experimentData: use 'experimentData(object)' Annotation: hugene10stv1 > eset_mas5 # this seems worked fine but resulted all NA ExpressionSet (storageMode: lockedEnvironment) assayData: 32321 features, 18 samples element names: exprs, se.exprs protocolData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: sample varMetadata: labelDescription featureData: none experimentData: use 'experimentData(object)' Annotation: hugene10stv1 > write.exprs(eset_justrma,file="eset_justrma.csv") > write.exprs(eset_mas5,file="eset_mas5.csv") > write.exprs(eset,file="eset.csv") Any help in this will be really great. Being a novice, I am very sorry if I am doing any silly mistake. Thanks a lot, Suparna. -- Dr. Suparna Mitra Wolfson Centre for Personalised Medicine Department of Molecular and Clinical Pharmacology Institute of Translational Medicine University of Liverpool Block A: Waterhouse Buildings, L69 3GL Liverpool Tel. +44 (0)151 795 5414, Internal ext: 55414 M: +44 (0) 7523228621 Email id: smitra@liverpool.ac.uk Alternative Email id: suparna.mitra.sm@gmail.com [[alternative HTML version deleted]]
ADD COMMENT
0
Entering edit mode
Hi, I am very new to biocondunctor and microaray. I have limited experience with R. I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 microarray data. I selectected different normalization method (rma, gcrma and mas5). For my data rma worked but for for gcrma and mas5 both I have problem. For gcrma it gives me a error like: Computing affinitiesError: length(prlen) == 1 is not TRUE And for mas 5 it seems working but I get only a whole list of NA. Here is what I have done. > mydata <- ReadAffy() > mydata AffyBatch object size of arrays=1050x1050 features (16 kb) cdf=HuGene-1_0-st-v1 (32321 affyids) number of samples=18 number of genes=32321 annotation=hugene10stv1 > eset <- rma(mydata) Background correcting Normalizing Calculating Expression > eset_justrma=justRMA() > eset_mas5 <- mas5(mydata) background correction: mas PM/MM correction : mas expression values: mas background correcting...done. 32321 ids to be processed | | |####################| > eset_gcrma <- gcrma(mydata) Adjusting for optical effect..................Done. Computing affinitiesError: length(prlen) == 1 is not TRUE Here is the error > eset_justrma # this worked fine ExpressionSet (storageMode: lockedEnvironment) assayData: 32321 features, 18 samples element names: exprs, se.exprs protocolData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: sample varMetadata: labelDescription featureData: none experimentData: use 'experimentData(object)' Annotation: hugene10stv1 > eset_mas5 # this seems worked fine but resulted all NA ExpressionSet (storageMode: lockedEnvironment) assayData: 32321 features, 18 samples element names: exprs, se.exprs protocolData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene-1_0-st-v1).CEL ... MC9_(HuGene-1_0-st-v1).CEL (18 total) varLabels: sample varMetadata: labelDescription featureData: none experimentData: use 'experimentData(object)' Annotation: hugene10stv1 > write.exprs(eset_justrma,file="eset_justrma.csv") > write.exprs(eset_mas5,file="eset_mas5.csv") > write.exprs(eset,file="eset.csv") Any help in this will be really great. Being a novice, I am very sorry if I am doing any silly mistake. Thanks a lot, Suparna. -- Dr. Suparna Mitra Wolfson Centre for Personalised Medicine Department of Molecular and Clinical Pharmacology Institute of Translational Medicine University of Liverpool Block A: Waterhouse Buildings, L69 3GL Liverpool Tel. +44 (0)151 795 5414, Internal ext: 55414 M: +44 (0) 7523228621 Email id: smitra@liverpool.ac.uk Alternative Email id: suparna.mitra.sm@gmail.com [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
cstrato ★ 3.9k
@cstrato-908
Last seen 6.2 years ago
Austria
Dear Suparna, In principle you could use package xps to run mas5 for HuGene arrays, however, I would suggest to stay with rma (which xps also supports). Best regards Christian _._._._._._._._._._._._._._._._._._ C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a e.m.a.i.l: cstrato at aon.at _._._._._._._._._._._._._._._._._._ On 6/11/12 5:00 PM, suparna mitra wrote: > Hi, > I am very new to biocondunctor and microaray. I have limited experience > with R. > I am trying to use biocondunctor for analyzing HuGene-1_0-st-v1 microarray > data. I selectected different normalization method (rma, gcrma and mas5). > For my data rma worked but for for gcrma and mas5 both I have problem. > For gcrma it gives me a error like: Computing affinitiesError: > length(prlen) == 1 is not TRUE > > And for mas 5 it seems working but I get only a whole list of NA. > > Here is what I have done. > >> mydata<- ReadAffy() >> mydata > AffyBatch object > size of arrays=1050x1050 features (16 kb) > cdf=HuGene-1_0-st-v1 (32321 affyids) > number of samples=18 > number of genes=32321 > annotation=hugene10stv1 > >> eset<- rma(mydata) > Background correcting > Normalizing > Calculating Expression >> eset_justrma=justRMA() >> eset_mas5<- mas5(mydata) > background correction: mas > PM/MM correction : mas > expression values: mas > background correcting...done. > 32321 ids to be processed > | | > |####################| >> eset_gcrma<- gcrma(mydata) > Adjusting for optical effect..................Done. > Computing affinitiesError: length(prlen) == 1 is not TRUE Here is the > error > >> eset_justrma # this worked fine > ExpressionSet (storageMode: lockedEnvironment) > assayData: 32321 features, 18 samples > element names: exprs, se.exprs > protocolData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: ScanDate > varMetadata: labelDescription > phenoData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: sample > varMetadata: labelDescription > featureData: none > experimentData: use 'experimentData(object)' > Annotation: hugene10stv1 >> eset_mas5 # this seems worked fine but resulted all NA > ExpressionSet (storageMode: lockedEnvironment) > assayData: 32321 features, 18 samples > element names: exprs, se.exprs > protocolData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: ScanDate > varMetadata: labelDescription > phenoData > sampleNames: MC1_(HuGene-1_0-st-v1).CEL MC10_(HuGene- 1_0-st-v1).CEL ... > MC9_(HuGene-1_0-st-v1).CEL (18 total) > varLabels: sample > varMetadata: labelDescription > featureData: none > experimentData: use 'experimentData(object)' > Annotation: hugene10stv1 >> write.exprs(eset_justrma,file="eset_justrma.csv") >> write.exprs(eset_mas5,file="eset_mas5.csv") >> write.exprs(eset,file="eset.csv") > > Any help in this will be really great. Being a novice, I am very sorry if I > am doing any silly mistake. > Thanks a lot, > Suparna. >
ADD COMMENT

Login before adding your answer.

Traffic: 593 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6