Unable to Generate QC Report for mogene10stv1
Rick Frausto ▴ 110
@rick-frausto-4392
Last seen 10.2 years ago
Hi James,

Below is the information that you requested - traceback() and sessionInfo(). Doesn't seem like much to me, but perhaps you can help. As you answer a lot of e-mails, I thought I'd remind you that this is in regard to the "some row.names duplicated" error.

Hope your holidays were good!

-Rick

[R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0]

[Workspace restored from /Users/rickfrausto/.RData]
[History restored from /Users/rickfrausto/.Rapp.history]

> library(affy)
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'openVignette()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation(pkgname)'.

> mydata <- ReadAffy()
> eset <- rma(mydata)
Background correcting
Normalizing
Calculating Expression
> write.exprs(eset, file="mydata.txt")
> mypm <- pm(mydata)
> mymm <- mm(mydata)
> myaffyids <- probeNames(mydata)
> result <- data.frame(myaffyids, mypm, mymm)
> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf")
Loading required package: lattice
Warning message:
In data.row.names(row.names, rowsi, i) :
  some row.names duplicated:
4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,54,
58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,103,
104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,147,
148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,173,
175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,210,
219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,252,
253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,296,
297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,338,
339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,382,
383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,407,
409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,449,
450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,495,
496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... truncated]
Error in plot(qc(object)) :
  error in evaluating the argument 'x' in selecting a method for function 'plot'
> traceback()
2: plot(qc(object))
1: QCReport(mydata, file = "ExampleQC.pdf")
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] affyQCReport_1.28.1   lattice_0.19-13       mogene10stv1cdf_2.7.0
[4] affy_1.28.0           Biobase_2.10.0

loaded via a namespace (and not attached):
 [1] affyio_1.18.0         affyPLM_1.26.0        annotate_1.28.0
 [4] AnnotationDbi_1.12.0  Biostrings_2.18.2     DBI_0.2-5
 [7] gcrma_2.22.0          genefilter_1.32.0     grid_2.12.0
[10] IRanges_1.8.7         preprocessCore_1.12.0 RColorBrewer_1.0-2
[13] RSQLite_0.9-4         simpleaffy_2.26.1     splines_2.12.0
[16] survival_2.36-2       tools_2.12.0          xtable_1.5-6
>

On 20/12/10 6:33 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:

> Hi Rick,
>
> On 12/17/2010 9:24 PM, Rick Frausto wrote:
>> Hey Jim,
>>
>> Ok, I will give that a go. The only problem is an ExpressionSet contains all of the necessary information for further analysis (e.g. phenodata, featuredata and annotation, etc - including, treatment type, cell type, time points, replicates). I am still learning how to include all of these for a complete ExpressionSet. As a starting point I've loaded a txt file containing some of this information (gene abbrev, ontology, probeset ID) which I created using Affymetrix's Expression Console software, without replicate, time point and cell type info. Doing this I've gotten as far as creating a minimal ExpressionSet, which I guess the functions you mention below do just that but with the information contained in the CEL file only.
>>
>> In any case, since as you say, the functions in the online manual create a proper ExpressionSet why would I get the issue of duplication?
>
> Oh yeah, the original question ;-D. Try running QCReport() again, and when it errors out run traceback() and send the output. Also include the output of sessionInfo().
>
> Jim
>
>>
>> In regards to the 64-bit discussion. It may have very well made enough of a difference as it did not come up with the memory error the last time I tried it. Going to upgrade to 8GB RAM anyways, can't hurt.
>>
>> Cheers,
>> Rick
>>
>> On 17/12/10 7:20 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>
>>> Hi Rick,
>>>
>>> On 12/16/2010 4:13 PM, Rick Frausto wrote:
>>>> Hi Jim,
>>>>
>>>> How do I run an RMA analysis without a proper ExpressionSet? Honest answer, I don't know, I just put in a command line from a manual I found online and it spit out some result - see #3 Affy packages in following link (http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#biocon_intro).
>>>
>>> You are mistaken. All of the functions mentioned there result in a proper ExpressionSet. And if you just do
>>>
>>> abatch <- ReadAffy()
>>> eset <- rma(abatch)
>>>
>>> Then you will 100% surely get an ExpressionSet.
>>>
>>>> Perhaps you don't need an ExpressionSet until after the preprocessing, at least that is what I get from the "An Introduction to Bioconductor's ExpressionSet Class" written by Seth Falcon, Martin Morgan and Robert Gentleman. Everything seemed to be going smoothly until I tried to get a QC Report.
>>>>
>>>> Now, the answer for why I would want to do such a thing is easy. Simply that I don't know any better :) Just started working with R a few days ago, but I'm learning.
>>>>
>>>> Apparently Snow Leopard running on 32bit can only utilize about 3.2GB of RAM, whereas 64bit can make use of all 4GB. I'll switch to the 64 bit OS and see if it makes a difference.
>>>
>>> Well, it won't be much different. The reason a 32-bit OS can only use about 3.2 Gb of RAM is that the OS needs some to run. The 64-bit OS also needs to use some RAM, so you won't get all 4 Gb there either. The issue is how much RAM can be allocated to a single process, and on a 64-bit OS that gets bumped up significantly.
>>>
>>> Best,
>>>
>>> Jim
>>>
>>>> Thanks for your insight!
>>>>
>>>> Cheers,
>>>> Rick
>>>>
>>>> On 16/12/10 11:31 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>>>
>>>>> Hi Rick,
>>>>>
>>>>> On 12/16/2010 12:57 PM, Rick Frausto wrote:
>>>>>> Thanks Jim! How much memory would I need, I currently have 4GB, but have quite a few other programs running in the background...I'll see if closing them helps. Perhaps setting up an "ExpressionSet" would solve the problem. I just started reading up on how to set one of these up yesterday. Will do this and see if the duplicates will go away.
>>>>>>
>>>>>> The "mydata" originates from CEL files and then I run the RMA analysis on it, but I didn't actually set up a proper ExpressionSet. I'm guessing that doing this might reduce the QCReport PDF file size quite considerably since I won't have any duplication and will make further analysis easier.
>>>>>
>>>>> How do you run an RMA analysis without setting up a proper ExpressionSet? The default behavior is to create one. In addition, why would you want to do such a thing? The ExpressionSet class is specifically designed to contain these sorts of data.
>>>>>
>>>>>> I'm running Snow Leopard OSX which can be set up as 64bit. Would running as 64bit still necessitate more RAM?
>>>>>
>>>>> Probably. The difference isn't efficiency, but the ability to address more RAM. A 32-bit OS can still address all the available memory that you will have with just 4 Gb RAM, so you need to bump that up if you want to do all the chips together. As for how much, I don't know. Since RAM isn't that expensive these days, you might look at maxing your box out.
>>>>>
>>>>> Best,
>>>>>
>>>>> Jim
>>>>>
>>>>>> Thanks again,
>>>>>> Rick
>>>>>>
>>>>>> On 15/12/10 7:45 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>>>>>
>>>>>>> Hi Rick,
>>>>>>>
>>>>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote:
>>>>>>>> Dear All,
>>>>>>>>
>>>>>>>> I have recently entered the world of R. Through some trial and error I'm becoming more familiar with R and the relevant Bioconductor Affy packages. I'm a molecular and cell biologist with rudimentary statistical knowledge and even less knowledge with respect to R.
>>>>>>>>
>>>>>>>> When I enter the following:
>>>>>>>>
>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf")
>>>>>>>>
>>>>>>>> I get some errors in return.
>>>>>>>>
>>>>>>>> Loading required package: lattice
>>>>>>>> Error: cannot allocate vector of size 437.4 Mb
>>>>>>>
>>>>>>> This indicates that you need more RAM, as you are running out of memory.
>>>>>>>
>>>>>>>> In addition: Warning message:
>>>>>>>> In data.row.names(row.names, rowsi, i) :
>>>>>>>>   some row.names duplicated:
>>>>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,54,
>>>>>>>> 58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,103,
>>>>>>>> 104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,147,
>>>>>>>> 148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,173,
>>>>>>>> 175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,210,
>>>>>>>> 219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,252,
>>>>>>>> 253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,296,
>>>>>>>> 297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,338,
>>>>>>>> 339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,382,
>>>>>>>> 383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,407,
>>>>>>>> 409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,449,
>>>>>>>> 450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,495,
>>>>>>>> 496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... truncated]
>>>>>>>
>>>>>>> What exactly is 'mydata', and how did you generate it? The above error indicates that you have duplicate row names, which IIRC isn't possible to do with an ExpressionSet.
>>>>>>>
>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>>>>> *** error: can't allocate region
>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>>>>> *** error: can't allocate region
>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>
>>>>>>> More lack of memory errors.
>>>>>>>
>>>>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) :
>>>>>>>>   unused argument(s) (htmlhelp = TRUE)
>>>>>>>> In addition: Warning messages:
>>>>>>>> 1: In data(package = .packages(all.available = TRUE)) :
>>>>>>>>   datasets have been moved from package 'base' to package 'datasets'
>>>>>>>> 2: In data(package = .packages(all.available = TRUE)) :
>>>>>>>>   datasets have been moved from package 'stats' to package 'datasets'
>>>>>>>> starting httpd help server ... done
>>>>>>>>
>>>>>>>> Would someone be able to diagnose the problem and suggest a solution?
>>>>>>>
>>>>>>> First, get more RAM. Second, you will be better off using a 64-bit OS. Depending on your hardware, you might be able to just install a 64-bit version of R.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Jim
>>>>>>>
>>>>>>>> If it is useful, I am using the following R software: R for Mac OS X GUI 1.35-dev Leopard build 32-bit. If there is any other info that would be useful please let me know.
>>>>>>>>
>>>>>>>> I had a read of the AffyQCReport Package pdf and I have added the following line: QCReport(ReadAffy(widget=TRUE)). Then I tried library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") again. It now seems to be doing something, in other words it doesn't go to the error, yet, but it's been processing for about 10 minutes. I am analyzing 35 chips.
>>>>>>>>
>>>>>>>> Perhaps it would work if I tried to generate each QCReport page separately rather than as a whole.
>>>>>>>>
>>>>>>>> Cordially,
>>>>>>>> Rick
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bioconductor mailing list
>>>>>>>> Bioconductor at r-project.org
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Rick Frausto
PhD Candidate
The University of Sydney
School of Molecular Bioscience G08
Camperdown, NSW 2006
AUSTRALIA
ricardo.frausto at sydney.edu.au
Phone: 61 2 9036 5354
Lab of Iain L. Campbell
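For anyone following along: the index list in the "some row.names duplicated" warning looks like the positions of repeated names, which is what `which(duplicated(x))` returns in base R. That reading is an assumption on my part and is not confirmed anywhere in this thread. A minimal sketch with a made-up vector standing in for `probeNames(mydata)` (the IDs below are hypothetical):

```r
# Toy stand-in for probeNames(mydata): one probe-set ID per probe, so IDs repeat.
# These IDs are invented for illustration; they are not taken from the thread.
probe_ids <- c("ps1", "ps2", "ps3", "ps1", "ps4", "ps2", "ps2")

# duplicated() flags every repeat after the first occurrence of a value
dup_flags <- duplicated(probe_ids)

# Positions of the repeats, formatted as a comma-separated list
dup_index <- which(dup_flags)
dup_list  <- paste(dup_index, collapse = ",")

# Number of distinct IDs versus total number of entries
n_unique <- length(unique(probe_ids))
n_total  <- length(probe_ids)

print(dup_list)
```

Running the same two calls on the real probe names would show whether the positions match the ones in the warning. If they do, the duplicates come from having several probes per probe set rather than from a damaged object.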
Annotation GO GUI affy PROcess affyQCReport • 1.6k views
@james-w-macdonald-5106
Last seen 16 hours ago
United States
Hi Rick,

What happens if you load the simpleaffy package first?

Best,

Jim

On 1/7/2011 2:14 PM, Rick Frausto wrote:
> Hi James,
>
> Below is the information that you requested - traceback() and sessioninfo(). Doesn't seem like much to me, but perhaps you can help. As you answer to a lot of e-mails, thought I'd remind you that this is in regards to the "some row.names duplicated" error.

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
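A note on why load order might matter. The sessionInfo() output in this thread lists simpleaffy under "loaded via a namespace (and not attached)", meaning its namespace is loaded but its exported functions are not on the search path. Whether that explains the `plot(qc(object))` error is an assumption and is not confirmed here. The sketch below only illustrates the loaded-versus-attached distinction in base R, using `stats4` as a stand-in because it ships with R and is not attached in a default session:

```r
# stats4 is used purely as a stand-in for simpleaffy (illustrative assumption).
pkg <- "stats4"

# requireNamespace() loads the namespace without putting it on the search path,
# which corresponds to "loaded via a namespace (and not attached)".
requireNamespace(pkg, quietly = TRUE)
loaded_only   <- pkg %in% loadedNamespaces()
attached_then <- paste0("package:", pkg) %in% search()

# library() attaches the package, so its exports become visible at top level.
library(pkg, character.only = TRUE)
attached_now <- paste0("package:", pkg) %in% search()

print(c(loaded = loaded_only, attached_before = attached_then,
        attached_after = attached_now))
```

In a default fresh session `attached_before` is expected to be FALSE. The equivalent check for this thread would be whether "package:simpleaffy" appears in `search()` before `QCReport()` is called.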
Hi Jim, Ok, so after doing a bit of reading and re-reading I was eventually able to generate each page in a quartz window that the "QCReport" function should also generate. I found which ones give me the errors. So, there should be 6 pages in total. Page 2 gives me the duplication error and page 3 gives me the error in evaluating the argument x. The other pages are ok and are generated as expected. In brief, page 2 is suppose to be generated with the "signalDist(mydata)" command. Page 3 is suppose to generated with the "plot(qc(mydata))" command. So, I guess there must be particular requirements for these commands that I'm missing.I've included the session below along with traceback() and sessionInfo(). R version 2.12.0 (2010-10-15) Copyright (C) 2010 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0] [Workspace restored from /Users/rickfrausto/.RData] [History restored from /Users/rickfrausto/.Rapp.history] > library(simpleaffy) Loading required package: affy Loading required package: Biobase Welcome to Bioconductor Vignettes contain introductory material. To view, type 'openVignette()'. To cite Bioconductor, see 'citation("Biobase")' and for packages 'citation(pkgname)'. 
Loading required package: genefilter Loading required package: gcrma Attaching package: 'simpleaffy' The following object(s) are masked _by_ '.GlobalEnv': getBioC > library(affy) > mydata <- ReadAffy() > eset <- rma(mydata) Background correcting Normalizing Calculating Expression > library(affycoretools); affystart(plot=T, express="rma") Loading required package: GO.db Loading required package: AnnotationDbi Loading required package: DBI Loading required package: KEGG.db Background correcting Normalizing Calculating Expression Please give the x-coordinate for a legend.30 Please give the y-coordinate for a legend.80 ExpressionSet (storageMode: lockedEnvironment) assayData: 34760 features, 35 samples element names: exprs protocolData sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... ZI_ST1KO_HIL6_12hr.CEL (35 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... ZI_ST1KO_HIL6_12hr.CEL (35 total) varLabels: sample varMetadata: labelDescription featureData: none experimentData: use 'experimentData(object)' Annotation: mogene10stv1 > write.exprs(eset, file="mydata.txt") > x <- data.frame(exprs(eset), exprs(eset_PMA), assayDataElement(eset_PMA, "se.exprs")); x <- x[,sort(names(x))]; write.table(x, file="mydata_PMA.xls", quote=F, col.names = NA, sep="\t") Error in exprs(eset_PMA) : error in evaluating the argument 'object' in selecting a method for function 'exprs' > mypm <- pm(mydata) > mymm <- mm(mydata) > myaffyids <- probeNames(mydata) > result <- data.frame(myaffyids, mypm, mymm) > eset; pData(eset) ExpressionSet (storageMode: lockedEnvironment) assayData: 34760 features, 35 samples element names: exprs protocolData sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... ZI_ST1KO_HIL6_12hr.CEL (35 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... 
    ZI_ST1KO_HIL6_12hr.CEL (35 total)
  varLabels: sample
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation: mogene10stv1
                       sample
A_WT1_NT_2hr.CEL            1
B_WT1_NT_2hr.CEL            2
C_WT1_NT_12hr.CEL           3
D_WT1_NT_12hr.CEL           4
E_WT1_HIL6_2hr.CEL          5
F_WT1_HIL6_2hr.CEL          6
G_WT1_HIL6_12hr.CEL         7
H_WT1_HIL6_12hr.CEL         8
I_FF_NT_2hr.CEL             9
J_FF_NT_2hr.CEL            10
K_FF_NT_12hr.CEL           11
L_FF_NT_12hr.CEL           12
M_FF_HIL6_2hr.CEL          13
N_FF_HIL6_2hr.CEL          14
O_FF_HIL6_12hr.CEL         15
P_FF_HIL6_12hr.CEL         16
Q_WT2_NT_2hr.CEL           17
R_WT2_NT_2hr.CEL           18
S_WT2_NT_12hr.CEL          19
T_WT2_NT_12hr.CEL          20
U_WT2_HIL6_2hr.CEL         21
V_WT2_HIL6_2hr.CEL         22
W_WT2_HIL6_12hr.CEL        23
X_WT2_HIL6_12hr.CEL        24
Y_DD_NT_2hr.CEL            25
Z_DD_NT_2hr.CEL            26
ZA_DD_NT_12hr.CEL          27
ZB_DD_NT_12hr.CEL          28
ZC_DD_HIL6_2hr.CEL         29
ZD_DD_HIL6_2hr.CEL         30
ZE_DD_HIL6_12hr.CEL        31
ZF_DD_HIL6_12hr.CEL        32
ZG_ST1KO_NT_2hr.CEL        33
ZH_ST1KO_HIL6_2hr.CEL      34
ZI_ST1KO_HIL6_12hr.CEL     35
> data.frame(eset)
                 X10338001 X10338003 X10338004 X10338017 X10338025
A_WT1_NT_2hr.CEL  11.71717 10.183620  9.440631  12.79412  8.823529
B_WT1_NT_2hr.CEL  11.78778 10.027760  9.489226  12.98544  8.843002
                 X10338026 X10338029 X10338035 X10338036 X10338037
A_WT1_NT_2hr.CEL  13.22585  9.405038  8.853564  9.379031  3.661987
B_WT1_NT_2hr.CEL  13.29043  9.575309  8.772872  9.513050  3.514885
                 X10338041 X10338042 X10338044 X10338047 X10338056
A_WT1_NT_2hr.CEL  10.94638 10.116516  11.88296  8.872839  3.133222
B_WT1_NT_2hr.CEL  11.23276 10.134084  12.03381  7.568584  3.088548
                 X10338059 X10338060 X10338063 X10338064 X10338065

JIM, I TRUNCATED THIS LIST, BUT THOUGHT IT MIGHT BE USEFUL IN DIAGNOSING THE PROBLEMS I'M HAVING. SESSION IS CONTINUED BELOW.

> library(affyQCReport)
Loading required package: lattice
> titlePage(mydata)
[1] TRUE
> signalDist(mydata)
Warning message:
In data.row.names(row.names, rowsi, i) :
  some row.names duplicated:
4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,54,
58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,103,
104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,147,
148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,173,
175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,210,
219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,252,
253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,296,
297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,338,
339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,382,
383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,407,
409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,449,
450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,495,
496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... truncated]
> plot(qc(mydata))
Error in plot(qc(mydata)) :
  error in evaluating the argument 'x' in selecting a method for function 'plot'
> borderQC1(mydata)
[1] TRUE
> borderQC2(mydata)
[1] TRUE
> correlationPlot(mydata)
[1] TRUE
> titlePage(mydata)
[1] TRUE
> titlePage(mydata)
Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) :
  plot.new has not been called yet
> correlationPlot(mydata)
[1] TRUE
> titlePage(mydata)
Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) :
  plot.new has not been called yet
In addition: Warning message:
Display list redraw incomplete
> borderQC1(mydata)
[1] TRUE
> titlePage(mydata)
[1] TRUE
> titlePage(mydata)
Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) :
  plot.new has not been called yet
> traceback()
2: polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05))
1: titlePage(mydata)
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] affyQCReport_1.28.1   lattice_0.19-13       affycoretools_1.22.0
 [4] KEGG.db_2.4.5         GO.db_2.4.5           RSQLite_0.9-4
 [7] DBI_0.2-5             AnnotationDbi_1.12.0  mogene10stv1cdf_2.7.0
[10] simpleaffy_2.26.1     gcrma_2.22.0          genefilter_1.32.0
[13] affy_1.28.0           Biobase_2.10.0

loaded via a namespace (and not attached):
 [1] affyio_1.18.0         affyPLM_1.26.0        annaffy_1.22.0
 [4] annotate_1.28.0       biomaRt_2.6.0         Biostrings_2.18.2
 [7] Category_2.16.0       GOstats_2.16.0        graph_1.28.0
[10] grid_2.12.0           GSEABase_1.12.2       IRanges_1.8.7
[13] limma_3.6.9           preprocessCore_1.12.0 RBGL_1.26.0
[16] RColorBrewer_1.0-2    RCurl_1.4-3           splines_2.12.0
[19] survival_2.36-2       tools_2.12.0          XML_3.2-0
[22] xtable_1.5-6
>

On 7/01/11 12:47 PM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:

> Hi Rick,
>
> What happens if you load the simpleaffy package first?
> > Best, > > Jim > > On 1/7/2011 2:14 PM, Rick Frausto wrote: >> Hi James, >> >> Below is the information that you requested - traceback() and sessioninfo(). >> Doesn't seem like much to me, but perhaps you can help. As you answer to a >> lot of e-mails, thought I'd remind you that this is in regards to the "some >> row.names duplicated" error. >> >> Hope your holidays were good! >> >> -Rick >> >> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0] >> >> [Workspace restored from /Users/rickfrausto/.RData] >> [History restored from /Users/rickfrausto/.Rapp.history] >> >>> library(affy) >> Loading required package: Biobase >> >> Welcome to Bioconductor >> >> Vignettes contain introductory material. To view, type >> 'openVignette()'. To cite Bioconductor, see >> 'citation("Biobase")' and for packages 'citation(pkgname)'. >> >>> mydata<- ReadAffy() >>> eset<- rma(mydata) >> Background correcting >> Normalizing >> Calculating Expression >>> write.exprs(eset, file="mydata.txt") >>> mypm<- pm(mydata) >>> mymm<- mm(mydata) >>> myaffyids<- probeNames(mydata) >>> result<- data.frame(myaffyids, mypm, mymm) >>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") >> Loading required package: lattice >> Warning message: >> In data.row.names(row.names, rowsi, i) : >> some row.names duplicated: >> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,5 1,52,53,5 >> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98 ,99,102,1 >> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139, 141,142,1 >> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169, 170,171,1 >> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202, 206,207,2 >> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249, 250,251,2 >> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290, 291,292,2 >> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324, 334,337,3 >> 
38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373, 376,378,3 >> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403, 405,406,4 >> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443, 445,447,4 >> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492, 493,494,4 >> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... >> truncated] >> Error in plot(qc(object)) : >> error in evaluating the argument 'x' in selecting a method for function >> 'plot' >>> traceback() >> 2: plot(qc(object)) >> 1: QCReport(mydata, file = "ExampleQC.pdf") >>> sessionInfo() >> R version 2.12.0 (2010-10-15) >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) >> >> locale: >> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] affyQCReport_1.28.1 latptice_0.19-13 mogene10stv1cdf_2.7.0 >> [4] affy_1.28.0 Biobase_2.10.0 >> >> loaded via a namespace (and not attached): >> [1] affyio_1.18.0 affyPLM_1.26.0 annotate_1.28.0 >> [4] AnnotationDbi_1.12.0 Biostrings_2.18.2 DBI_0.2-5 >> [7] gcrma_2.22.0 genefilter_1.32.0 grid_2.12.0 >> [10] IRanges_1.8.7 preprocessCore_1.12.0 RColorBrewer_1.0-2 >> [13] RSQLite_0.9-4 simpleaffy_2.26.1 splines_2.12.0 >> [16] survival_2.36-2 tools_2.12.0 xtable_1.5-6 >>> >> >> >> >> >> On 20/12/10 6:33 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> wrote: >> >>> Hi Rick, >>> >>> On 12/17/2010 9:24 PM, Rick Frausto wrote: >>>> Hey Jim, >>>> >>>> Ok, I will give that a go. The only problem is an ExpressionSet contains >>>> all >>>> of the necessary information for further analysis (e.g. phenodata, >>>> featuredata and annotation, etc - including, treatment type, cell type, >>>> time >>>> points, replicates). I am still learning how to include all of these for a >>>> complete ExpressionSet. 
As a starting point I've loaded a txt file >>>> containing some of this information (gene abbrev, ontology, probeset ID) >>>> which I created using Affymetrix's Expression Console software, without >>>> replicate, time point and cell type info. Doing this I've gotten as far as >>>> creating a minimal ExpressionSet, which I guess the functions you mention >>>> below do just that but with the information contained in the CEL file only. >>>> >>>> In any case, since as you say, the functions in the online manual create a >>>> proper ExpressionSet why would I get the issue of duplication? >>> >>> Oh yeah, the original question ;-D. Try running QCreport() again, and >>> when it errors out run traceback() and send the output. Also include the >>> output of sessionInfo(). >>> >>> Jim >>> >>> >>>> >>>> In regards to the 64-bit discussion. It may have very well made enough of a >>>> difference as it did not come up with the memory error the last time I >>>> tried >>>> it. Going to upgrade to 8GB RAM anyways, can't hurt. >>>> >>>> Cheers, >>>> Rick >>>> >>>> >>>> On 17/12/10 7:20 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> wrote: >>>> >>>>> Hi Rick, >>>>> >>>>> On 12/16/2010 4:13 PM, Rick Frausto wrote: >>>>>> Hi Jim, >>>>>> >>>>>> How do I run an RMA analysis without a proper ExpresionSet? Honest >>>>>> answer, >>>>>> I >>>>>> don't know, I just put in a command line from a manual I found online and >>>>>> it >>>>>> spit out some result- see #3 Affy packages in following link ( >>>>>> http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#bioc on_intro). >>>>> >>>>> You are mistaken. All of the functions mentioned there result in a >>>>> proper ExpressionSet. And if you just do >>>>> >>>>> abatch<- ReadAffy() >>>>> eset<- rma(abatch) >>>>> >>>>> Then you will 100% surely get an ExpressionSet. 
>>>>> >>>>>> >>>>>> Perhaps you don't need an ExpressionSet until after the preprocessing, at >>>>>> least that is what I get from the "An Introduction to Bioconductor's >>>>>> ExpressionSet Class" written by Seth Falcon, Martin Morgan and Robert >>>>>> Gentleman. Everything seemed to be going smoothly until I tried to get a >>>>>> QC >>>>>> Report. >>>>>> >>>>>> Now, the answer for why I would want to do such a thing is easy. Simply >>>>>> that >>>>>> I don't know any better :) Just started working with R a few days ago, >>>>>> but >>>>>> I'm learning. >>>>>> >>>>>> >>>>>> Apparently Snow Leopard running on 32bit can only utilize about 3.2GB of >>>>>> RAM, whereas 64bit can make use of all 4GB. I'll switch to the 64 bit OS >>>>>> and >>>>>> see if it makes a difference. >>>>> >>>>> Well, it won't be much different. The reason a 32-bit OS can only use >>>>> about 3.2 Gb of RAM is that the OS needs some to run. The 64-bit OS also >>>>> needs to use some RAM, so you won't get all 4 Gb there either. The issue >>>>> is how much RAM can be allocated to a single process, and on a 64-bit OS >>>>> that gets bumped up significantly. >>>>> >>>>> Best, >>>>> >>>>> Jim >>>>> >>>>> >>>>> >>>>>> >>>>>> Thanks for your insight! >>>>>> >>>>>> Cheers, >>>>>> Rick >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 16/12/10 11:31 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>> wrote: >>>>>> >>>>>>> Hi Rick, >>>>>>> >>>>>>> On 12/16/2010 12:57 PM, Rick Frausto wrote: >>>>>>>> Thanks Jim! How much memory would I need, I currently have 4GB, but >>>>>>>> have >>>>>>>> quite a few other programs running in the background...I'll see if >>>>>>>> closing >>>>>>>> them helps. Perhaps setting up an "ExpressionSet" would solve the >>>>>>>> problem. >>>>>>>> I >>>>>>>> just started reading up on how to set one of these up yesterday. Will >>>>>>>> do >>>>>>>> this and see if the duplicates will go away. 
>>>>>>>> >>>>>>>> The "mydata" originates from CEL files and then I run the RMA analysis >>>>>>>> on >>>>>>>> it, but I didn't actually set up a proper ExpressionSet. I'm guessing >>>>>>>> that >>>>>>>> doing this might reduce the QCReport PDF file size quite considerably >>>>>>>> since >>>>>>>> I won't have any duplication and will make further analysis easier. >>>>>>> >>>>>>> How do you run an RMA analysis without setting up a proper >>>>>>> ExpressionSet? The default behavior is to create one. In addition, why >>>>>>> would you want to do such a thing? The ExpressionSet class is >>>>>>> specifically designed to contain these sorts of data. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> I'm running Snow Leopard OSX which can be set up as 64bit. Would >>>>>>>> running >>>>>>>> as >>>>>>>> 64bit still necessitate more RAM? >>>>>>> >>>>>>> Probably. The difference isn't efficiency, but the ability to address >>>>>>> more RAM. A 32-bit OS can still address all the available memory that >>>>>>> you will have with just 4 Gb RAM, so you need to bump that up if you >>>>>>> want to do all the chips together. As for how much, I don't know. Since >>>>>>> RAM isn't that expensive these days, you might look at maxing your box >>>>>>> out. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Jim >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks again, >>>>>>>> Rick >>>>>>>> >>>>>>>> >>>>>>>> On 15/12/10 7:45 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Rick, >>>>>>>>> >>>>>>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote: >>>>>>>>>> Dear All, >>>>>>>>>> >>>>>>>>>> I have recently entered the world of R. Through some trial and error >>>>>>>>>> I'm >>>>>>>>>> becoming more familiar with R and the relevant Bioconductor Affy >>>>>>>>>> packages. >>>>>>>>>> I?m a molecular and cell biologist with rudimentary statistical >>>>>>>>>> knowledge >>>>>>>>>> and even less knowledge with respect to R. 
>>>>>>>>>> >>>>>>>>>> When I enter the following: >>>>>>>>>> >>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") >>>>>>>>>> >>>>>>>>>> I get some errors in return. >>>>>>>>>> >>>>>>>>>> Loading required package: lattice >>>>>>>>>> Error: cannot allocate vector of size 437.4 Mb >>>>>>>>> >>>>>>>>> This indicates that you need more RAM, as you are running out of >>>>>>>>> memory. >>>>>>>>> >>>>>>>>>> In addition: Warning message: >>>>>>>>>> In data.row.names(row.names, rowsi, i) : >>>>>>>>>> some row.names duplicated: >>>>>>>>>> >>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49, 50,51,52,5 >>>>>> 3, >>>>>>>> >>>>>>>> >>>>>> 5 >>>>>>>>>> >>>>>> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,9 7,98,99,10 >>>>>> 2, >>>>>>>> >>>>>>>> >>>>>> 1 >>>>>>>>>> >>>>>> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138, 139,141,14 >>>>>> 2, >>>>>>>> >>>>>>>> >>>>>> 1 >>>>>>>>>> >>>>>> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168, 169,170,17 >>>>>> 1, >>>>>>>> >>>>>>>> >>>>>> 1 >>>>>>>>>> >>>>>> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200, 202,206,20 >>>>>> 7, >>>>>>>> >>>>>>>> >>>>>> 2 >>>>>>>>>> >>>>>> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248, 249,250,25 >>>>>> 1, >>>>>>>> >>>>>>>> >>>>>> 2 >>>>>>>>>> >>>>>> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289, 290,291,29 >>>>>> 2, >>>>>>>> >>>>>>>> >>>>>> 2 >>>>>>>>>> >>>>>> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322, 324,334,33 >>>>>> 7, >>>>>>>> >>>>>>>> >>>>>> 3 >>>>>>>>>> >>>>>> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371, 373,376,37 >>>>>> 8, >>>>>>>> >>>>>>>> >>>>>> 3 >>>>>>>>>> >>>>>> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402, 403,405,40 >>>>>> 6, >>>>>>>> >>>>>>>> >>>>>> 4 >>>>>>>>>> >>>>>> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441, 443,445,44 >>>>>> 7, >>>>>>>> >>>>>>>> >>>>>> 4 >>>>>>>>>> >>>>>> 
49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491, 492,493,49 >>>>>> 4, >>>>>>>> >>>>>>>> >>>>>> 4 >>>>>>>>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... >>>>>>>>>> truncated] >>>>>>>>> >>>>>>>>> What exactly is 'mydata', and how did you generate it? The above error >>>>>>>>> indicates that you have duplicate row names, which IIRC isn't possible >>>>>>>>> to do with an expressionSet. >>>>>>>>> >>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error >>>>>>>>>> code=12) >>>>>>>>>> *** error: can't allocate region >>>>>>>>>> *** set a breakpoint in malloc_error_break to debug >>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error >>>>>>>>>> code=12) >>>>>>>>>> *** error: can't allocate region >>>>>>>>>> *** set a breakpoint in malloc_error_break to debug >>>>>>>>> >>>>>>>>> More lack of memory errors. >>>>>>>>> >>>>>>>>> >>>>>>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) : >>>>>>>>>> unused argument(s) (htmlhelp = TRUE) >>>>>>>>>> In addition: Warning messages: >>>>>>>>>> 1: In data(package = .packages(all.available = TRUE)) : >>>>>>>>>> datasets have been moved from package 'base' to package >>>>>>>>>> 'datasets' >>>>>>>>>> 2: In data(package = .packages(all.available = TRUE)) : >>>>>>>>>> datasets have been moved from package 'stats' to package >>>>>>>>>> 'datasets' >>>>>>>>>> starting httpd help server ... done >>>>>>>>>> >>>>>>>>>> Would someone be able to diagnose the problem and suggest a solution? >>>>>>>>> >>>>>>>>> First, get more RAM. Second, you will be better off using a 64-bit OS. >>>>>>>>> Depending on your hardware, you might be able to just install a 64-bit >>>>>>>>> version of R. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> Jim >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> If it is useful, I am using the following R software: R for Mac OS X >>>>>>>>>> GUI >>>>>>>>>> 1.35-dev Leopard build 32-bit. 
If there is any other info that would >>>>>>>>>> be >>>>>>>>>> useful please let me know. >>>>>>>>>> >>>>>>>>>> I had a read of the AffyQCReport Package pdf and I have added the >>>>>>>>>> following >>>>>>>>>> line: QCReport(ReadAffy(widget=TRUE)). Then I tried >>>>>>>>>> library(affyQCReport); >>>>>>>>>> QCReport(mydata, file="ExampleQC.pdf") again. It now seems to be >>>>>>>>>> doing >>>>>>>>>> something, in other words it doesn?t go to the error, yet, but it?s >>>>>>>>>> been >>>>>>>>>> processing for about 10 minutes. I am analyzing 35 chips. >>>>>>>>>> >>>>>>>>>> Perhaps it would work if I tried to generate each QCReport page >>>>>>>>>> separately >>>>>>>>>> rather than as a whole. >>>>>>>>>> >>>>>>>>>> Cordially, >>>>>>>>>> Rick >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Bioconductor mailing list >>>>>>>>>> Bioconductor at r-project.org >>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>>>>>> Search the archives: >>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>>>>> >>>>>> >>>> >> -- Rick Frausto PhD Candidate The University of Sydney School of Molecular Bioscience G08 Camperdown, NSW 2006 AUSTRALIA ricardo.frausto at sydney.edu.au Phone: 61 2 9036 5354 Lab of Iain L. Campbell
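The page-by-page check Rick describes in his latest message can be sketched in base R. This is only a minimal illustration, not code from affyQCReport: `run_pages` is a hypothetical helper, and the three toy functions stand in for calls such as `titlePage(mydata)`, `signalDist(mydata)` and `plot(qc(mydata))`, which need the CEL data and are not run here.

```r
# Hypothetical helper: call each page function in turn and record whether it
# succeeded, warned, or failed, instead of stopping at the first error.
run_pages <- function(pages) {
  vapply(pages, function(page) {
    status <- "ok"
    tryCatch(
      withCallingHandlers(
        page(),
        warning = function(w) {
          status <<- "warning"            # remember the warning ...
          invokeRestart("muffleWarning")  # ... and let the page finish
        }
      ),
      error = function(e) status <<- "error"
    )
    status
  }, character(1))
}

# Toy stand-ins for the real page functions (assumptions for illustration only)
pages <- list(
  title      = function() TRUE,
  signalDist = function() { warning("some row.names duplicated"); TRUE },
  qcPlot     = function() stop("error in evaluating the argument 'x'")
)

page_status <- run_pages(pages)
print(page_status)
```

With the real functions wrapped the same way (for example `function() signalDist(mydata)`), the returned vector would show at a glance which pages warn and which fail, without retyping each call at the prompt.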
Hi Rick,

After all that, the reason is really simple. You are trying to use affyQCReport on a PM-only chip, which isn't going to work out so well. I don't have any mogene data around to play with (and don't have the time to go searching), so I will have to make some educated guesses.

Internally, signalDist() calls boxplot() and hist() on your AffyBatch, and the default for both functions is to use both PM and MM probes. I'm betting that

any(duplicated(unlist(indexProbes(mydata, "both"))))

returns TRUE, indicating that indexProbes() doesn't work correctly on a PM-only chip, which is fair enough, as it was never designed to do so.

And plot(qc(mydata)) will never work, as it relies on computing a Wilcoxon signed-rank test between the PM and MM probes, and since you don't have MM probes, well, you get the picture...

Best,

Jim
>>>>>>>>> I >>>>>>>>> just started reading up on how to set one of these up yesterday. Will >>>>>>>>> do >>>>>>>>> this and see if the duplicates will go away. >>>>>>>>> >>>>>>>>> The "mydata" originates from CEL files and then I run the RMA analysis >>>>>>>>> on >>>>>>>>> it, but I didn't actually set up a proper ExpressionSet. I'm guessing >>>>>>>>> that >>>>>>>>> doing this might reduce the QCReport PDF file size quite considerably >>>>>>>>> since >>>>>>>>> I won't have any duplication and will make further analysis easier. >>>>>>>> >>>>>>>> How do you run an RMA analysis without setting up a proper >>>>>>>> ExpressionSet? The default behavior is to create one. In addition, why >>>>>>>> would you want to do such a thing? The ExpressionSet class is >>>>>>>> specifically designed to contain these sorts of data. >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> I'm running Snow Leopard OSX which can be set up as 64bit. Would >>>>>>>>> running >>>>>>>>> as >>>>>>>>> 64bit still necessitate more RAM? >>>>>>>> >>>>>>>> Probably. The difference isn't efficiency, but the ability to address >>>>>>>> more RAM. A 32-bit OS can still address all the available memory that >>>>>>>> you will have with just 4 Gb RAM, so you need to bump that up if you >>>>>>>> want to do all the chips together. As for how much, I don't know. Since >>>>>>>> RAM isn't that expensive these days, you might look at maxing your box >>>>>>>> out. >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> Jim >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks again, >>>>>>>>> Rick >>>>>>>>> >>>>>>>>> >>>>>>>>> On 15/12/10 7:45 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi Rick, >>>>>>>>>> >>>>>>>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote: >>>>>>>>>>> Dear All, >>>>>>>>>>> >>>>>>>>>>> I have recently entered the world of R. Through some trial and error >>>>>>>>>>> I'm >>>>>>>>>>> becoming more familiar with R and the relevant Bioconductor Affy >>>>>>>>>>> packages. 
>>>>>>>>>>> I?m a molecular and cell biologist with rudimentary statistical >>>>>>>>>>> knowledge >>>>>>>>>>> and even less knowledge with respect to R. >>>>>>>>>>> >>>>>>>>>>> When I enter the following: >>>>>>>>>>> >>>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") >>>>>>>>>>> >>>>>>>>>>> I get some errors in return. >>>>>>>>>>> >>>>>>>>>>> Loading required package: lattice >>>>>>>>>>> Error: cannot allocate vector of size 437.4 Mb >>>>>>>>>> >>>>>>>>>> This indicates that you need more RAM, as you are running out of >>>>>>>>>> memory. >>>>>>>>>> >>>>>>>>>>> In addition: Warning message: >>>>>>>>>>> In data.row.names(row.names, rowsi, i) : >>>>>>>>>>> some row.names duplicated: >>>>>>>>>>> >>>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49 ,50,51,52,5 >>>>>>> 3, >>>>>>>>> >>>>>>>>> >>>>>>> 5 >>>>>>>>>>> >>>>>>> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96, 97,98,99,10 >>>>>>> 2, >>>>>>>>> >>>>>>>>> >>>>>>> 1 >>>>>>>>>>> >>>>>>> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138 ,139,141,14 >>>>>>> 2, >>>>>>>>> >>>>>>>>> >>>>>>> 1 >>>>>>>>>>> >>>>>>> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168 ,169,170,17 >>>>>>> 1, >>>>>>>>> >>>>>>>>> >>>>>>> 1 >>>>>>>>>>> >>>>>>> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200 ,202,206,20 >>>>>>> 7, >>>>>>>>> >>>>>>>>> >>>>>>> 2 >>>>>>>>>>> >>>>>>> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248 ,249,250,25 >>>>>>> 1, >>>>>>>>> >>>>>>>>> >>>>>>> 2 >>>>>>>>>>> >>>>>>> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289 ,290,291,29 >>>>>>> 2, >>>>>>>>> >>>>>>>>> >>>>>>> 2 >>>>>>>>>>> >>>>>>> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322 ,324,334,33 >>>>>>> 7, >>>>>>>>> >>>>>>>>> >>>>>>> 3 >>>>>>>>>>> >>>>>>> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371 ,373,376,37 >>>>>>> 8, >>>>>>>>> >>>>>>>>> >>>>>>> 3 >>>>>>>>>>> >>>>>>> 
82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402 ,403,405,40 >>>>>>> 6, >>>>>>>>> >>>>>>>>> >>>>>>> 4 >>>>>>>>>>> >>>>>>> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441 ,443,445,44 >>>>>>> 7, >>>>>>>>> >>>>>>>>> >>>>>>> 4 >>>>>>>>>>> >>>>>>> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491 ,492,493,49 >>>>>>> 4, >>>>>>>>> >>>>>>>>> >>>>>>> 4 >>>>>>>>>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... >>>>>>>>>>> truncated] >>>>>>>>>> >>>>>>>>>> What exactly is 'mydata', and how did you generate it? The above error >>>>>>>>>> indicates that you have duplicate row names, which IIRC isn't possible >>>>>>>>>> to do with an expressionSet. >>>>>>>>>> >>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error >>>>>>>>>>> code=12) >>>>>>>>>>> *** error: can't allocate region >>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug >>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error >>>>>>>>>>> code=12) >>>>>>>>>>> *** error: can't allocate region >>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug >>>>>>>>>> >>>>>>>>>> More lack of memory errors. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) : >>>>>>>>>>> unused argument(s) (htmlhelp = TRUE) >>>>>>>>>>> In addition: Warning messages: >>>>>>>>>>> 1: In data(package = .packages(all.available = TRUE)) : >>>>>>>>>>> datasets have been moved from package 'base' to package >>>>>>>>>>> 'datasets' >>>>>>>>>>> 2: In data(package = .packages(all.available = TRUE)) : >>>>>>>>>>> datasets have been moved from package 'stats' to package >>>>>>>>>>> 'datasets' >>>>>>>>>>> starting httpd help server ... done >>>>>>>>>>> >>>>>>>>>>> Would someone be able to diagnose the problem and suggest a solution? >>>>>>>>>> >>>>>>>>>> First, get more RAM. Second, you will be better off using a 64-bit OS. 
>>>>>>>>>> Depending on your hardware, you might be able to just install a 64-bit >>>>>>>>>> version of R. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> >>>>>>>>>> Jim >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> If it is useful, I am using the following R software: R for Mac OS X >>>>>>>>>>> GUI >>>>>>>>>>> 1.35-dev Leopard build 32-bit. If there is any other info that would >>>>>>>>>>> be >>>>>>>>>>> useful please let me know. >>>>>>>>>>> >>>>>>>>>>> I had a read of the AffyQCReport Package pdf and I have added the >>>>>>>>>>> following >>>>>>>>>>> line: QCReport(ReadAffy(widget=TRUE)). Then I tried >>>>>>>>>>> library(affyQCReport); >>>>>>>>>>> QCReport(mydata, file="ExampleQC.pdf") again. It now seems to be >>>>>>>>>>> doing >>>>>>>>>>> something, in other words it doesn?t go to the error, yet, but it?s >>>>>>>>>>> been >>>>>>>>>>> processing for about 10 minutes. I am analyzing 35 chips. >>>>>>>>>>> >>>>>>>>>>> Perhaps it would work if I tried to generate each QCReport page >>>>>>>>>>> separately >>>>>>>>>>> rather than as a whole. >>>>>>>>>>> >>>>>>>>>>> Cordially, >>>>>>>>>>> Rick >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Bioconductor mailing list >>>>>>>>>>> Bioconductor at r-project.org >>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>>>>>>> Search the archives: >>>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>>>>>> >>>>>>> >>>>> >>> > -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
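For anyone who wants to try the probe-index check from this thread on their own data, here is a minimal sketch in base R. The helper name `has_duplicated_probe_indices` and the two toy lists are made up for illustration. Only `indexProbes()` (from affy) and the object name `mydata` come from the thread. The toy data just shows what the check reports and says nothing about why a particular CDF produces repeated indices.

```r
# Hypothetical helper (not part of affy): given a list of probe-index
# vectors, such as the list returned by affy::indexProbes(), report
# whether any index occurs more than once across all probesets.
has_duplicated_probe_indices <- function(index_list) {
  any(duplicated(unlist(index_list, use.names = FALSE)))
}

# Toy stand-ins for indexProbes() output (illustrative values only).
toy_repeated <- list(ps1 = c(1, 2, 3, 1, 2, 3), ps2 = c(4, 5, 4, 5))
toy_unique   <- list(ps1 = c(1, 2, 3), ps2 = c(4, 5))

print(has_duplicated_probe_indices(toy_repeated))
print(has_duplicated_probe_indices(toy_unique))

# With a real AffyBatch (called mydata in this thread), the same helper
# could be applied to each probe type in turn:
# library(affy)
# has_duplicated_probe_indices(indexProbes(mydata, "both"))
# has_duplicated_probe_indices(indexProbes(mydata, "pm"))
```

Comparing the result for `"both"` against the result for `"pm"` is one way to see whether the repeats only appear once MM probes are requested.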
Hi Jim,

You're right...

> any(duplicated(unlist(indexProbes(mydata, "both"))))
[1] TRUE
>

Figured it would be something simple, almost always is. Guess since the MM values are only really necessary for calculating a "real" PM value I should generally still be ok with using R Bioconductor packages for downstream analysis of these chips?? For example, using eset<-rma() to normalize my data should still be ok.

By the way, the documentation on the AffyQCReport function regarding signalDist() states that "The first is a boxplot plot of the all pm intensities and the second plot consists of kernel density estimates of these intensities." From this it would seem to a novice like me that it only uses PM values, clearly I'm not correct. I guess these are PM values adjusted for the MM signal.

Thanks for figuring this out for me. Let me know if these and other related questions would be better served as standalone e-mails.

Cheers,
Rick

On 10/01/11 7:04 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:

> Hi Rick,
>
> After all that, the reason is really simple. You are trying to use
> affyQCReport on a PM-only chip, which isn't going to work out so well. I
> don't have any mogene data around to play with (and don't have the time
> to go searching), so I will have to make some educated guesses.
>
> Internally in signalDist() you are calling boxplot() and hist() on your
> AffyBatch. And the default for both functions is to use both PM and MM
> probes. I'm betting that
>
> any(duplicated(unlist(indexProbes(mydata, "both"))))
>
> returns TRUE, indicating that indexProbes doesn't work correctly on a
> PM-only chip, which is fair enough, as it was never designed to do so.
>
> And plot(qc(mydata)) will never work, as it relies on computing a
> Wilcoxon signed-rank between the PM and MM probes, and since you don't
> have MM probes, well you get the picture...
> > Best, > > Jim > > > > On 1/7/2011 6:56 PM, Rick Frausto wrote: >> Hi Jim, >> >> Ok, so after doing a bit of reading and re-reading I was eventually able to >> generate each page in a quartz window that the "QCReport" function should >> also generate. I found which ones give me the errors. So, there should be 6 >> pages in total. Page 2 gives me the duplication error and page 3 gives me >> the error in evaluating the argument x. The other pages are ok and are >> generated as expected. >> >> In brief, page 2 is suppose to be generated with the "signalDist(mydata)" >> command. Page 3 is suppose to generated with the "plot(qc(mydata))" command. >> >> So, I guess there must be particular requirements for these commands that >> I'm missing.I've included the session below along with traceback() and >> sessionInfo(). >> >> >> R version 2.12.0 (2010-10-15) >> Copyright (C) 2010 The R Foundation for Statistical Computing >> ISBN 3-900051-07-0 >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) >> >> R is free software and comes with ABSOLUTELY NO WARRANTY. >> You are welcome to redistribute it under certain conditions. >> Type 'license()' or 'licence()' for distribution details. >> >> Natural language support but running in an English locale >> >> R is a collaborative project with many contributors. >> Type 'contributors()' for more information and >> 'citation()' on how to cite R or R packages in publications. >> >> Type 'demo()' for some demos, 'help()' for on-line help, or >> 'help.start()' for an HTML browser interface to help. >> Type 'q()' to quit R. >> >> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0] >> >> [Workspace restored from /Users/rickfrausto/.RData] >> [History restored from /Users/rickfrausto/.Rapp.history] >> >>> library(simpleaffy) >> Loading required package: affy >> Loading required package: Biobase >> >> Welcome to Bioconductor >> >> Vignettes contain introductory material. To view, type >> 'openVignette()'. 
To cite Bioconductor, see >> 'citation("Biobase")' and for packages 'citation(pkgname)'. >> >> Loading required package: genefilter >> Loading required package: gcrma >> >> Attaching package: 'simpleaffy' >> >> The following object(s) are masked _by_ '.GlobalEnv': >> >> getBioC >> >>> library(affy) >>> mydata<- ReadAffy() >>> eset<- rma(mydata) >> Background correcting >> Normalizing >> Calculating Expression >>> library(affycoretools); affystart(plot=T, express="rma") >> Loading required package: GO.db >> Loading required package: AnnotationDbi >> Loading required package: DBI >> Loading required package: KEGG.db >> Background correcting >> Normalizing >> Calculating Expression >> Please give the x-coordinate for a legend.30 >> Please give the y-coordinate for a legend.80 >> ExpressionSet (storageMode: lockedEnvironment) >> assayData: 34760 features, 35 samples >> element names: exprs >> protocolData >> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... >> ZI_ST1KO_HIL6_12hr.CEL (35 total) >> varLabels: ScanDate >> varMetadata: labelDescription >> phenoData >> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... 
>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >> varLabels: sample >> varMetadata: labelDescription >> featureData: none >> experimentData: use 'experimentData(object)' >> Annotation: mogene10stv1 >>> write.exprs(eset, file="mydata.txt") >>> x<- data.frame(exprs(eset), exprs(eset_PMA), assayDataElement(eset_PMA, >> "se.exprs")); x<- x[,sort(names(x))]; write.table(x, file="mydata_PMA.xls", >> quote=F, col.names = NA, sep="\t") >> Error in exprs(eset_PMA) : >> error in evaluating the argument 'object' in selecting a method for >> function 'exprs' >>> mypm<- pm(mydata) >>> mymm<- mm(mydata) >>> myaffyids<- probeNames(mydata) >>> result<- data.frame(myaffyids, mypm, mymm) >>> eset; pData(eset) >> ExpressionSet (storageMode: lockedEnvironment) >> assayData: 34760 features, 35 samples >> element names: exprs >> protocolData >> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... >> ZI_ST1KO_HIL6_12hr.CEL (35 total) >> varLabels: ScanDate >> varMetadata: labelDescription >> phenoData >> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... 
>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >> varLabels: sample >> varMetadata: labelDescription >> featureData: none >> experimentData: use 'experimentData(object)' >> Annotation: mogene10stv1 >> sample >> A_WT1_NT_2hr.CEL 1 >> B_WT1_NT_2hr.CEL 2 >> C_WT1_NT_12hr.CEL 3 >> D_WT1_NT_12hr.CEL 4 >> E_WT1_HIL6_2hr.CEL 5 >> F_WT1_HIL6_2hr.CEL 6 >> G_WT1_HIL6_12hr.CEL 7 >> H_WT1_HIL6_12hr.CEL 8 >> I_FF_NT_2hr.CEL 9 >> J_FF_NT_2hr.CEL 10 >> K_FF_NT_12hr.CEL 11 >> L_FF_NT_12hr.CEL 12 >> M_FF_HIL6_2hr.CEL 13 >> N_FF_HIL6_2hr.CEL 14 >> O_FF_HIL6_12hr.CEL 15 >> P_FF_HIL6_12hr.CEL 16 >> Q_WT2_NT_2hr.CEL 17 >> R_WT2_NT_2hr.CEL 18 >> S_WT2_NT_12hr.CEL 19 >> T_WT2_NT_12hr.CEL 20 >> U_WT2_HIL6_2hr.CEL 21 >> V_WT2_HIL6_2hr.CEL 22 >> W_WT2_HIL6_12hr.CEL 23 >> X_WT2_HIL6_12hr.CEL 24 >> Y_DD_NT_2hr.CEL 25 >> Z_DD_NT_2hr.CEL 26 >> ZA_DD_NT_12hr.CEL 27 >> ZB_DD_NT_12hr.CEL 28 >> ZC_DD_HIL6_2hr.CEL 29 >> ZD_DD_HIL6_2hr.CEL 30 >> ZE_DD_HIL6_12hr.CEL 31 >> ZF_DD_HIL6_12hr.CEL 32 >> ZG_ST1KO_NT_2hr.CEL 33 >> ZH_ST1KO_HIL6_2hr.CEL 34 >> ZI_ST1KO_HIL6_12hr.CEL 35 >>> data.frame(eset) >> X10338001 X10338003 X10338004 X10338017 X10338025 >> A_WT1_NT_2hr.CEL 11.71717 10.183620 9.440631 12.79412 8.823529 >> B_WT1_NT_2hr.CEL 11.78778 10.027760 9.489226 12.98544 8.843002 >> X10338026 X10338029 X10338035 X10338036 X10338037 >> A_WT1_NT_2hr.CEL 13.22585 9.405038 8.853564 9.379031 3.661987 >> B_WT1_NT_2hr.CEL 13.29043 9.575309 8.772872 9.513050 3.514885 >> X10338041 X10338042 X10338044 X10338047 X10338056 >> A_WT1_NT_2hr.CEL 10.94638 10.116516 11.88296 8.872839 3.133222 >> B_WT1_NT_2hr.CEL 11.23276 10.134084 12.03381 7.568584 3.088548 >> X10338059 X10338060 X10338063 X10338064 X10338065 >> >> JIM, I TRUNCATED THIS LIST, BUT THOUGHT IT MIGHT BE USEFUL IN DIAGNOSING THE >> PROBLEMS I'M HAVING. SESSION IS CONTINUED BELOW. 
>> >>> library(affyQCReport) >> Loading required package: lattice >>> titlePage(mydata) >> [1] TRUE >>> signalDist(mydata) >> Warning message: >> In data.row.names(row.names, rowsi, i) : >> some row.names duplicated: >> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,5 1,52,53,5 >> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98 ,99,102,1 >> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139, 141,142,1 >> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169, 170,171,1 >> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202, 206,207,2 >> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249, 250,251,2 >> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290, 291,292,2 >> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324, 334,337,3 >> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373, 376,378,3 >> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403, 405,406,4 >> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443, 445,447,4 >> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492, 493,494,4 >> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... 
>> truncated] >>> plot(qc(mydata)) >> Error in plot(qc(mydata)) : >> error in evaluating the argument 'x' in selecting a method for function >> 'plot' >>> borderQC1(mydata) >> [1] TRUE >>> borderQC2(mydata) >> [1] TRUE >>> correlationPlot(mydata) >> [1] TRUE >>> titlePage(mydata) >> [1] TRUE >>> titlePage(mydata) >> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >> plot.new has not been called yet >>> correlationPlot(mydata) >> [1] TRUE >>> titlePage(mydata) >> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >> plot.new has not been called yet >> In addition: Warning message: >> Display list redraw incomplete >>> borderQC1(mydata) >> [1] TRUE >>> titlePage(mydata) >> [1] TRUE >>> titlePage(mydata) >> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >> plot.new has not been called yet >>> traceback() >> 2: polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) >> 1: titlePage(mydata) >>> sessionInfo() >> R version 2.12.0 (2010-10-15) >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) >> >> locale: >> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] affyQCReport_1.28.1 lattice_0.19-13 affycoretools_1.22.0 >> [4] KEGG.db_2.4.5 GO.db_2.4.5 RSQLite_0.9-4 >> [7] DBI_0.2-5 AnnotationDbi_1.12.0 mogene10stv1cdf_2.7.0 >> [10] simpleaffy_2.26.1 gcrma_2.22.0 genefilter_1.32.0 >> [13] affy_1.28.0 Biobase_2.10.0 >> >> loaded via a namespace (and not attached): >> [1] affyio_1.18.0 affyPLM_1.26.0 annaffy_1.22.0 >> [4] annotate_1.28.0 biomaRt_2.6.0 Biostrings_2.18.2 >> [7] Category_2.16.0 GOstats_2.16.0 graph_1.28.0 >> [10] grid_2.12.0 GSEABase_1.12.2 IRanges_1.8.7 >> [13] limma_3.6.9 preprocessCore_1.12.0 RBGL_1.26.0 >> [16] RColorBrewer_1.0-2 RCurl_1.4-3 splines_2.12.0 >> [19] survival_2.36-2 tools_2.12.0 XML_3.2-0 >> [22] xtable_1.5-6 >>> >> >> On 7/01/11 
12:47 PM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> wrote: >> >>> Hi Rick, >>> >>> What happens if you load the simpleaffy package first? >>> >>> Best, >>> >>> Jim >>> >>> On 1/7/2011 2:14 PM, Rick Frausto wrote: >>>> Hi James, >>>> >>>> Below is the information that you requested - traceback() and >>>> sessioninfo(). >>>> Doesn't seem like much to me, but perhaps you can help. As you answer to a >>>> lot of e-mails, thought I'd remind you that this is in regards to the "some >>>> row.names duplicated" error. >>>> >>>> Hope your holidays were good! >>>> >>>> -Rick >>>> >>>> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0] >>>> >>>> [Workspace restored from /Users/rickfrausto/.RData] >>>> [History restored from /Users/rickfrausto/.Rapp.history] >>>> >>>>> library(affy) >>>> Loading required package: Biobase >>>> >>>> Welcome to Bioconductor >>>> >>>> Vignettes contain introductory material. To view, type >>>> 'openVignette()'. To cite Bioconductor, see >>>> 'citation("Biobase")' and for packages 'citation(pkgname)'. 
>>>> >>>>> mydata<- ReadAffy() >>>>> eset<- rma(mydata) >>>> Background correcting >>>> Normalizing >>>> Calculating Expression >>>>> write.exprs(eset, file="mydata.txt") >>>>> mypm<- pm(mydata) >>>>> mymm<- mm(mydata) >>>>> myaffyids<- probeNames(mydata) >>>>> result<- data.frame(myaffyids, mypm, mymm) >>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") >>>> Loading required package: lattice >>>> Warning message: >>>> In data.row.names(row.names, rowsi, i) : >>>> some row.names duplicated: >>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,5 2,53,>>>> 5 >>>> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99 ,102,>>>> 1 >>>> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141 ,142,>>>> 1 >>>> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170 ,171,>>>> 1 >>>> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206 ,207,>>>> 2 >>>> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250 ,251,>>>> 2 >>>> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291 ,292,>>>> 2 >>>> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334 ,337,>>>> 3 >>>> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376 ,378,>>>> 3 >>>> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405 ,406,>>>> 4 >>>> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445 ,447,>>>> 4 >>>> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493 ,494,>>>> 4 >>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... 
>>>> truncated] >>>> Error in plot(qc(object)) : >>>> error in evaluating the argument 'x' in selecting a method for function >>>> 'plot' >>>>> traceback() >>>> 2: plot(qc(object)) >>>> 1: QCReport(mydata, file = "ExampleQC.pdf") >>>>> sessionInfo() >>>> R version 2.12.0 (2010-10-15) >>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) >>>> >>>> locale: >>>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8 >>>> >>>> attached base packages: >>>> [1] stats graphics grDevices utils datasets methods base >>>> >>>> other attached packages: >>>> [1] affyQCReport_1.28.1 latptice_0.19-13 mogene10stv1cdf_2.7.0 >>>> [4] affy_1.28.0 Biobase_2.10.0 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] affyio_1.18.0 affyPLM_1.26.0 annotate_1.28.0 >>>> [4] AnnotationDbi_1.12.0 Biostrings_2.18.2 DBI_0.2-5 >>>> [7] gcrma_2.22.0 genefilter_1.32.0 grid_2.12.0 >>>> [10] IRanges_1.8.7 preprocessCore_1.12.0 RColorBrewer_1.0-2 >>>> [13] RSQLite_0.9-4 simpleaffy_2.26.1 splines_2.12.0 >>>> [16] survival_2.36-2 tools_2.12.0 xtable_1.5-6 >>>>> >>>> >>>> >>>> >>>> >>>> On 20/12/10 6:33 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> wrote: >>>> >>>>> Hi Rick, >>>>> >>>>> On 12/17/2010 9:24 PM, Rick Frausto wrote: >>>>>> Hey Jim, >>>>>> >>>>>> Ok, I will give that a go. The only problem is an ExpressionSet contains >>>>>> all >>>>>> of the necessary information for further analysis (e.g. phenodata, >>>>>> featuredata and annotation, etc - including, treatment type, cell type, >>>>>> time >>>>>> points, replicates). I am still learning how to include all of these for >>>>>> a >>>>>> complete ExpressionSet. As a starting point I've loaded a txt file >>>>>> containing some of this information (gene abbrev, ontology, probeset ID) >>>>>> which I created using Affymetrix's Expression Console software, without >>>>>> replicate, time point and cell type info. 
Doing this I've gotten as far >>>>>> as >>>>>> creating a minimal ExpressionSet, which I guess the functions you mention >>>>>> below do just that but with the information contained in the CEL file >>>>>> only. >>>>>> >>>>>> In any case, since as you say, the functions in the online manual create >>>>>> a >>>>>> proper ExpressionSet why would I get the issue of duplication? >>>>> >>>>> Oh yeah, the original question ;-D. Try running QCreport() again, and >>>>> when it errors out run traceback() and send the output. Also include the >>>>> output of sessionInfo(). >>>>> >>>>> Jim >>>>> >>>>> >>>>>> >>>>>> In regards to the 64-bit discussion. It may have very well made enough of >>>>>> a >>>>>> difference as it did not come up with the memory error the last time I >>>>>> tried >>>>>> it. Going to upgrade to 8GB RAM anyways, can't hurt. >>>>>> >>>>>> Cheers, >>>>>> Rick >>>>>> >>>>>> >>>>>> On 17/12/10 7:20 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>> wrote: >>>>>> >>>>>>> Hi Rick, >>>>>>> >>>>>>> On 12/16/2010 4:13 PM, Rick Frausto wrote: >>>>>>>> Hi Jim, >>>>>>>> >>>>>>>> How do I run an RMA analysis without a proper ExpresionSet? Honest >>>>>>>> answer, >>>>>>>> I >>>>>>>> don't know, I just put in a command line from a manual I found online >>>>>>>> and >>>>>>>> it >>>>>>>> spit out some result- see #3 Affy packages in following link ( >>>>>>>> http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#bi ocon_intro >>>>>>>> ). >>>>>>> >>>>>>> You are mistaken. All of the functions mentioned there result in a >>>>>>> proper ExpressionSet. And if you just do >>>>>>> >>>>>>> abatch<- ReadAffy() >>>>>>> eset<- rma(abatch) >>>>>>> >>>>>>> Then you will 100% surely get an ExpressionSet. 
>>>>>>> >>>>>>>> >>>>>>>> Perhaps you don't need an ExpressionSet until after the preprocessing, >>>>>>>> at >>>>>>>> least that is what I get from the "An Introduction to Bioconductor's >>>>>>>> ExpressionSet Class" written by Seth Falcon, Martin Morgan and Robert >>>>>>>> Gentleman. Everything seemed to be going smoothly until I tried to get >>>>>>>> a >>>>>>>> QC >>>>>>>> Report. >>>>>>>> >>>>>>>> Now, the answer for why I would want to do such a thing is easy. Simply >>>>>>>> that >>>>>>>> I don't know any better :) Just started working with R a few days ago, >>>>>>>> but >>>>>>>> I'm learning. >>>>>>>> >>>>>>>> >>>>>>>> Apparently Snow Leopard running on 32bit can only utilize about 3.2GB >>>>>>>> of >>>>>>>> RAM, whereas 64bit can make use of all 4GB. I'll switch to the 64 bit >>>>>>>> OS >>>>>>>> and >>>>>>>> see if it makes a difference. >>>>>>> >>>>>>> Well, it won't be much different. The reason a 32-bit OS can only use >>>>>>> about 3.2 Gb of RAM is that the OS needs some to run. The 64-bit OS also >>>>>>> needs to use some RAM, so you won't get all 4 Gb there either. The issue >>>>>>> is how much RAM can be allocated to a single process, and on a 64-bit OS >>>>>>> that gets bumped up significantly. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Jim >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks for your insight! >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Rick >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 16/12/10 11:31 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Rick, >>>>>>>>> >>>>>>>>> On 12/16/2010 12:57 PM, Rick Frausto wrote: >>>>>>>>>> Thanks Jim! How much memory would I need, I currently have 4GB, but >>>>>>>>>> have >>>>>>>>>> quite a few other programs running in the background...I'll see if >>>>>>>>>> closing >>>>>>>>>> them helps. Perhaps setting up an "ExpressionSet" would solve the >>>>>>>>>> problem. >>>>>>>>>> I >>>>>>>>>> just started reading up on how to set one of these up yesterday. 
Will >>>>>>>>>> do >>>>>>>>>> this and see if the duplicates will go away. >>>>>>>>>> >>>>>>>>>> The "mydata" originates from CEL files and then I run the RMA >>>>>>>>>> analysis >>>>>>>>>> on >>>>>>>>>> it, but I didn't actually set up a proper ExpressionSet. I'm guessing >>>>>>>>>> that >>>>>>>>>> doing this might reduce the QCReport PDF file size quite considerably >>>>>>>>>> since >>>>>>>>>> I won't have any duplication and will make further analysis easier. >>>>>>>>> >>>>>>>>> How do you run an RMA analysis without setting up a proper >>>>>>>>> ExpressionSet? The default behavior is to create one. In addition, why >>>>>>>>> would you want to do such a thing? The ExpressionSet class is >>>>>>>>> specifically designed to contain these sorts of data. >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'm running Snow Leopard OSX which can be set up as 64bit. Would >>>>>>>>>> running >>>>>>>>>> as >>>>>>>>>> 64bit still necessitate more RAM? >>>>>>>>> >>>>>>>>> Probably. The difference isn't efficiency, but the ability to address >>>>>>>>> more RAM. A 32-bit OS can still address all the available memory that >>>>>>>>> you will have with just 4 Gb RAM, so you need to bump that up if you >>>>>>>>> want to do all the chips together. As for how much, I don't know. >>>>>>>>> Since >>>>>>>>> RAM isn't that expensive these days, you might look at maxing your box >>>>>>>>> out. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> Jim >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks again, >>>>>>>>>> Rick >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 15/12/10 7:45 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Rick, >>>>>>>>>>> >>>>>>>>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote: >>>>>>>>>>>> Dear All, >>>>>>>>>>>> >>>>>>>>>>>> I have recently entered the world of R. Through some trial and >>>>>>>>>>>> error >>>>>>>>>>>> I'm >>>>>>>>>>>> becoming more familiar with R and the relevant Bioconductor Affy >>>>>>>>>>>> packages. 
>>>>>>>>>>>> I?m a molecular and cell biologist with rudimentary statistical >>>>>>>>>>>> knowledge >>>>>>>>>>>> and even less knowledge with respect to R. >>>>>>>>>>>> >>>>>>>>>>>> When I enter the following: >>>>>>>>>>>> >>>>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") >>>>>>>>>>>> >>>>>>>>>>>> I get some errors in return. >>>>>>>>>>>> >>>>>>>>>>>> Loading required package: lattice >>>>>>>>>>>> Error: cannot allocate vector of size 437.4 Mb >>>>>>>>>>> >>>>>>>>>>> This indicates that you need more RAM, as you are running out of >>>>>>>>>>> memory. >>>>>>>>>>> >>>>>>>>>>>> In addition: Warning message: >>>>>>>>>>>> In data.row.names(row.names, rowsi, i) : >>>>>>>>>>>> some row.names duplicated: >>>>>>>>>>>> >>>>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,4 9,50,51,52 >>>>>>>> ,5 >>>>>>>> 3, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 5 >>>>>>>>>>>> >>>>>>>> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96 ,97,98,99, >>>>>>>> 10 >>>>>>>> 2, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 1 >>>>>>>>>>>> >>>>>>>> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,13 8,139,141, >>>>>>>> 14 >>>>>>>> 2, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 1 >>>>>>>>>>>> >>>>>>>> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,16 8,169,170, >>>>>>>> 17 >>>>>>>> 1, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 1 >>>>>>>>>>>> >>>>>>>> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,20 0,202,206, >>>>>>>> 20 >>>>>>>> 7, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 2 >>>>>>>>>>>> >>>>>>>> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,24 8,249,250, >>>>>>>> 25 >>>>>>>> 1, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 2 >>>>>>>>>>>> >>>>>>>> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,28 9,290,291, >>>>>>>> 29 >>>>>>>> 2, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 2 >>>>>>>>>>>> >>>>>>>> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,32 2,324,334, >>>>>>>> 33 >>>>>>>> 7, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 3 >>>>>>>>>>>> >>>>>>>> 
38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,37 1,373,376, >>>>>>>> 37 >>>>>>>> 8, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 3 >>>>>>>>>>>> >>>>>>>> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,40 2,403,405, >>>>>>>> 40 >>>>>>>> 6, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 4 >>>>>>>>>>>> >>>>>>>> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,44 1,443,445, >>>>>>>> 44 >>>>>>>> 7, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 4 >>>>>>>>>>>> >>>>>>>> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,49 1,492,493, >>>>>>>> 49 >>>>>>>> 4, >>>>>>>>>> >>>>>>>>>> >>>>>>>> 4 >>>>>>>>>>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... >>>>>>>>>>>> truncated] >>>>>>>>>>> >>>>>>>>>>> What exactly is 'mydata', and how did you generate it? The above >>>>>>>>>>> error >>>>>>>>>>> indicates that you have duplicate row names, which IIRC isn't >>>>>>>>>>> possible >>>>>>>>>>> to do with an expressionSet. >>>>>>>>>>> >>>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error >>>>>>>>>>>> code=12) >>>>>>>>>>>> *** error: can't allocate region >>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug >>>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error >>>>>>>>>>>> code=12) >>>>>>>>>>>> *** error: can't allocate region >>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug >>>>>>>>>>> >>>>>>>>>>> More lack of memory errors. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) : >>>>>>>>>>>> unused argument(s) (htmlhelp = TRUE) >>>>>>>>>>>> In addition: Warning messages: >>>>>>>>>>>> 1: In data(package = .packages(all.available = TRUE)) : >>>>>>>>>>>> datasets have been moved from package 'base' to package >>>>>>>>>>>> 'datasets' >>>>>>>>>>>> 2: In data(package = .packages(all.available = TRUE)) : >>>>>>>>>>>> datasets have been moved from package 'stats' to package >>>>>>>>>>>> 'datasets' >>>>>>>>>>>> starting httpd help server ... 
done >>>>>>>>>>>> >>>>>>>>>>>> Would someone be able to diagnose the problem and suggest a >>>>>>>>>>>> solution? >>>>>>>>>>> >>>>>>>>>>> First, get more RAM. Second, you will be better off using a 64-bit >>>>>>>>>>> OS. >>>>>>>>>>> Depending on your hardware, you might be able to just install a >>>>>>>>>>> 64-bit >>>>>>>>>>> version of R. >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> >>>>>>>>>>> Jim >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> If it is useful, I am using the following R software: R for Mac OS >>>>>>>>>>>> X >>>>>>>>>>>> GUI >>>>>>>>>>>> 1.35-dev Leopard build 32-bit. If there is any other info that >>>>>>>>>>>> would >>>>>>>>>>>> be >>>>>>>>>>>> useful please let me know. >>>>>>>>>>>> >>>>>>>>>>>> I had a read of the AffyQCReport Package pdf and I have added the >>>>>>>>>>>> following >>>>>>>>>>>> line: QCReport(ReadAffy(widget=TRUE)). Then I tried >>>>>>>>>>>> library(affyQCReport); >>>>>>>>>>>> QCReport(mydata, file="ExampleQC.pdf") again. It now seems to be >>>>>>>>>>>> doing >>>>>>>>>>>> something, in other words it doesn?t go to the error, yet, but it?s >>>>>>>>>>>> been >>>>>>>>>>>> processing for about 10 minutes. I am analyzing 35 chips. >>>>>>>>>>>> >>>>>>>>>>>> Perhaps it would work if I tried to generate each QCReport page >>>>>>>>>>>> separately >>>>>>>>>>>> rather than as a whole. >>>>>>>>>>>> >>>>>>>>>>>> Cordially, >>>>>>>>>>>> Rick >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> Bioconductor mailing list >>>>>>>>>>>> Bioconductor at r-project.org >>>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>>>>>>>> Search the archives: >>>>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>>>>>>> >>>>>>>> >>>>>> >>>> >> -- Rick Frausto PhD Candidate The University of Sydney School of Molecular Bioscience G08 Camperdown, NSW 2006 AUSTRALIA ricardo.frausto at sydney.edu.au Phone: 61 2 9036 5354 Lab of Iain L. 
Campbell
Hi Rick,

On 1/10/2011 4:57 PM, Rick Frausto wrote:
> Hi Jim,
>
> You're right...
>
>> any(duplicated(unlist(indexProbes(mydata, "both"))))
> [1] TRUE
>>
>
> Figured it would be something simple, almost always is. Guess since the MM
> values are only really necessary for calculating a "real" PM value I should
> generally still be ok with using R Bioconductor packages for downstream
> analysis of these chips?? For example, using eset<-rma() to normalize my
> data should still be ok.

Yep. RMA only uses PM values, so this will be fine. You only get into trouble when trying to use mas5 based methods.

>
> By the way, the documentation on the AffyQCReport function regarding
> signalDist() states that "The first is a boxplot plot of the all pm
> intensities and the second plot consists of kernel density estimates of
> these intensities." From this it would seem to a novice like me that it only
> uses PM values, clearly I'm not correct. I guess these are PM values
> adjusted for the MM signal.

Nope, they aren't adjusted for MM, they just include the MM values as well. Here is a little primer on how to see what is going on. If you load the affyQCReport package and then type signalDist at the R prompt, you will get this:

> signalDist
function (object)
{
    par(mfrow = c(2, 1))
    ArrayIndex = as.character(1:length(sampleNames(object)))
    boxplot(object, names = ArrayIndex, ylab = "Log2(Intensity)",
        xlab = "Array Index")
    hist(x = object, lt = 1:length(ArrayIndex), col = 1:length(ArrayIndex),
        which = "both")
    temppar <- par()
    legend(((temppar$xaxp[2] - temppar$xaxp[1])/temppar$xaxp[3]) *
        (temppar$xaxp[3] - 1) + temppar$xaxp[1], temppar$yaxp[2],
        as.character(ArrayIndex), lt = 1:length(ArrayIndex),
        col = 1:length(ArrayIndex), cex = 0.5)
}
<environment: namespace:affyQCReport>

So you can see that we are calling boxplot() as well as hist() on the 'object', which is an AffyBatch. Let's see what boxplot() and hist() do.
> boxplot
standardGeneric for "boxplot" defined from package "graphics"

function (x, ...)
standardGeneric("boxplot")
<environment: 0x184ea378>
Methods may be defined for arguments: x
Use showMethods("boxplot") for currently available ones.

So this is an S4 method, and the methods are slightly harder to get to, but let's follow the prescription on the last line.

> showMethods(boxplot, class = "AffyBatch", includeDefs = TRUE)
Function: boxplot (package graphics)
x="AffyBatch"
function (x, ...)
{
    .local <- function (x, which = "both", range = 0, main, ...)
    {
        tmp <- description(x)
        if (missing(main) && (is(tmp, "MIAME")))
            main <- tmp@title
        tmp <- unlist(indexProbes(x, which))
        tmp <- tmp[seq(1, length(tmp), len = 5000)]
        boxplot(data.frame(log2(intensity(x)[tmp, ])), main = main,
            range = range, ...)
    }
    .local(x, ...)
}

Note two things here. I added in class = "AffyBatch", because there may be other boxplot methods for other objects, and we really don't care about them. Additionally, I included includeDefs = TRUE, which will cause the function to be output.

The .local function has a default of which = 'both', and you see that argument is used for the call to indexProbes (also note that there is a '...' argument to .local, that could be used to pass in a which = "pm" in signalDist() to override the default, but it is not, so the help page is incorrect).

If you look at ?indexProbes, you will see this in the methods section:

    indexProbes 'signature(object = "AffyBatch", which = "character")':
        returns a list with locations of the probes in each probe set.
        The affyID corresponding to the probe set to retrieve can be
        specified in an optional parameter 'genenames'. By default,
        all the affyIDs are retrieved. The names of the elements in
        the list returned are the affyIDs. 'which' can be "pm", "mm",
        or "both". If "both" then perfect match locations are given
        followed by mismatch locations.
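For anyone who wants to try the showMethods() step without an AffyBatch at hand, here is a minimal sketch using only the methods package that ships with R. The class ToyBatch and the generic probeSummary are made-up names for illustration and are not part of affy or affyQCReport; the toy method simply mimics the shape of the boxplot method above (a generic with (x, ...) and a method that adds which = "both").

```r
library(methods)

# Hypothetical toy class standing in for an AffyBatch (not from affy).
setClass("ToyBatch", representation(pm = "numeric", mm = "numeric"))

# A generic whose signature is only (x, ...), like the boxplot generic.
setGeneric("probeSummary", function(x, ...) standardGeneric("probeSummary"))

# The method adds a 'which' argument with a default that the generic lacks.
setMethod("probeSummary", "ToyBatch", function(x, which = "both", ...) {
  vals <- switch(which,
                 pm = x@pm,
                 mm = x@mm,
                 both = c(x@pm, x@mm),
                 stop("'which' must be \"pm\", \"mm\" or \"both\""))
  length(vals)
})

toy <- new("ToyBatch", pm = c(7.1, 8.4, 9.0), mm = c(6.2, NA, NA))

# Printing the generic shows only the dispatcher, not the method body.
print(probeSummary)

# Restricting to one class and asking for definitions prints the method.
showMethods("probeSummary", classes = "ToyBatch", includeDefs = TRUE)

# getMethod() returns the same definition as an object you can inspect.
m <- getMethod("probeSummary", "ToyBatch")

n_both <- probeSummary(toy)                # default which = "both"
n_pm   <- probeSummary(toy, which = "pm")  # override passed through '...'
```

Because the toy method has an argument the generic does not, the definition printed by showMethods() should show the same kind of .local wrapper as the AffyBatch method. The last line illustrates the point about '...': a caller can override the default by naming which explicitly, which is what signalDist() does not do for boxplot().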
The warning you get comes from here:

    tmp <- unlist(indexProbes(x, which))
    tmp <- tmp[seq(1, length(tmp), len = 5000)]
    boxplot(data.frame(log2(intensity(x)[tmp, ])), main = main,
        range = range, ...)

Which is basically getting a subset of 5000 probes to create the boxplot. Since half of your indices from indexProbes() will be NA, a bunch of the tmp variable will be NAs as well. We can re-create the warning you get below with a little example:

> x <- matrix(rnorm(100), ncol = 10)
> row.names(x) <- letters[1:10]
> z <- data.frame(x[c(1,2,3,NA,4,5,NA),])
Warning message:
In data.row.names(row.names, rowsi, i) :
  some row.names duplicated: 7 --> row.names NOT used

Best,

Jim

>
> Thanks for figuring this out for me. Let me know if these and other related
> questions would be better served as standalone e-mails.
>
> Cheers,
> Rick
>
>
>
> On 10/01/11 7:04 AM, "James W. MacDonald"<jmacdon at med.umich.edu> wrote:
>
>> Hi Rick,
>>
>> After all that, the reason is really simple. You are trying to use
>> affyQCReport on a PM-only chip, which isn't going to work out so well. I
>> don't have any mogene data around to play with (and don't have the time
>> to go searching), so I will have to make some educated guesses.
>>
>> Internally in signalDist() you are calling boxplot() and hist() on your
>> AffyBatch. And the default for both functions is to use both PM and MM
>> probes. I'm betting that
>>
>> any(duplicated(unlist(indexProbes(mydata, "both"))))
>>
>> returns TRUE, indicating that indexProbes doesn't work correctly on a
>> PM-only chip, which is fair enough, as it was never designed to do so.
>>
>> And plot(qc(mydata)) will never work, as it relies on computing a
>> Wilcoxon signed-rank between the PM and MM probes, and since you don't
>> have MM probes, well you get the picture...
>> >> Best, >> >> Jim >> >> >> >> On 1/7/2011 6:56 PM, Rick Frausto wrote: >>> Hi Jim, >>> >>> Ok, so after doing a bit of reading and re-reading I was eventually able to >>> generate each page in a quartz window that the "QCReport" function should >>> also generate. I found which ones give me the errors. So, there should be 6 >>> pages in total. Page 2 gives me the duplication error and page 3 gives me >>> the error in evaluating the argument x. The other pages are ok and are >>> generated as expected. >>> >>> In brief, page 2 is suppose to be generated with the "signalDist(mydata)" >>> command. Page 3 is suppose to generated with the "plot(qc(mydata))" command. >>> >>> So, I guess there must be particular requirements for these commands that >>> I'm missing.I've included the session below along with traceback() and >>> sessionInfo(). >>> >>> >>> R version 2.12.0 (2010-10-15) >>> Copyright (C) 2010 The R Foundation for Statistical Computing >>> ISBN 3-900051-07-0 >>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) >>> >>> R is free software and comes with ABSOLUTELY NO WARRANTY. >>> You are welcome to redistribute it under certain conditions. >>> Type 'license()' or 'licence()' for distribution details. >>> >>> Natural language support but running in an English locale >>> >>> R is a collaborative project with many contributors. >>> Type 'contributors()' for more information and >>> 'citation()' on how to cite R or R packages in publications. >>> >>> Type 'demo()' for some demos, 'help()' for on-line help, or >>> 'help.start()' for an HTML browser interface to help. >>> Type 'q()' to quit R. >>> >>> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0] >>> >>> [Workspace restored from /Users/rickfrausto/.RData] >>> [History restored from /Users/rickfrausto/.Rapp.history] >>> >>>> library(simpleaffy) >>> Loading required package: affy >>> Loading required package: Biobase >>> >>> Welcome to Bioconductor >>> >>> Vignettes contain introductory material. 
To view, type >>> 'openVignette()'. To cite Bioconductor, see >>> 'citation("Biobase")' and for packages 'citation(pkgname)'. >>> >>> Loading required package: genefilter >>> Loading required package: gcrma >>> >>> Attaching package: 'simpleaffy' >>> >>> The following object(s) are masked _by_ '.GlobalEnv': >>> >>> getBioC >>> >>>> library(affy) >>>> mydata<- ReadAffy() >>>> eset<- rma(mydata) >>> Background correcting >>> Normalizing >>> Calculating Expression >>>> library(affycoretools); affystart(plot=T, express="rma") >>> Loading required package: GO.db >>> Loading required package: AnnotationDbi >>> Loading required package: DBI >>> Loading required package: KEGG.db >>> Background correcting >>> Normalizing >>> Calculating Expression >>> Please give the x-coordinate for a legend.30 >>> Please give the y-coordinate for a legend.80 >>> ExpressionSet (storageMode: lockedEnvironment) >>> assayData: 34760 features, 35 samples >>> element names: exprs >>> protocolData >>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... >>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >>> varLabels: ScanDate >>> varMetadata: labelDescription >>> phenoData >>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... 
>>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >>> varLabels: sample >>> varMetadata: labelDescription >>> featureData: none >>> experimentData: use 'experimentData(object)' >>> Annotation: mogene10stv1 >>>> write.exprs(eset, file="mydata.txt") >>>> x<- data.frame(exprs(eset), exprs(eset_PMA), assayDataElement(eset_PMA, >>> "se.exprs")); x<- x[,sort(names(x))]; write.table(x, file="mydata_PMA.xls", >>> quote=F, col.names = NA, sep="\t") >>> Error in exprs(eset_PMA) : >>> error in evaluating the argument 'object' in selecting a method for >>> function 'exprs' >>>> mypm<- pm(mydata) >>>> mymm<- mm(mydata) >>>> myaffyids<- probeNames(mydata) >>>> result<- data.frame(myaffyids, mypm, mymm) >>>> eset; pData(eset) >>> ExpressionSet (storageMode: lockedEnvironment) >>> assayData: 34760 features, 35 samples >>> element names: exprs >>> protocolData >>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... >>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >>> varLabels: ScanDate >>> varMetadata: labelDescription >>> phenoData >>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... 
>>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >>> varLabels: sample >>> varMetadata: labelDescription >>> featureData: none >>> experimentData: use 'experimentData(object)' >>> Annotation: mogene10stv1 >>> sample >>> A_WT1_NT_2hr.CEL 1 >>> B_WT1_NT_2hr.CEL 2 >>> C_WT1_NT_12hr.CEL 3 >>> D_WT1_NT_12hr.CEL 4 >>> E_WT1_HIL6_2hr.CEL 5 >>> F_WT1_HIL6_2hr.CEL 6 >>> G_WT1_HIL6_12hr.CEL 7 >>> H_WT1_HIL6_12hr.CEL 8 >>> I_FF_NT_2hr.CEL 9 >>> J_FF_NT_2hr.CEL 10 >>> K_FF_NT_12hr.CEL 11 >>> L_FF_NT_12hr.CEL 12 >>> M_FF_HIL6_2hr.CEL 13 >>> N_FF_HIL6_2hr.CEL 14 >>> O_FF_HIL6_12hr.CEL 15 >>> P_FF_HIL6_12hr.CEL 16 >>> Q_WT2_NT_2hr.CEL 17 >>> R_WT2_NT_2hr.CEL 18 >>> S_WT2_NT_12hr.CEL 19 >>> T_WT2_NT_12hr.CEL 20 >>> U_WT2_HIL6_2hr.CEL 21 >>> V_WT2_HIL6_2hr.CEL 22 >>> W_WT2_HIL6_12hr.CEL 23 >>> X_WT2_HIL6_12hr.CEL 24 >>> Y_DD_NT_2hr.CEL 25 >>> Z_DD_NT_2hr.CEL 26 >>> ZA_DD_NT_12hr.CEL 27 >>> ZB_DD_NT_12hr.CEL 28 >>> ZC_DD_HIL6_2hr.CEL 29 >>> ZD_DD_HIL6_2hr.CEL 30 >>> ZE_DD_HIL6_12hr.CEL 31 >>> ZF_DD_HIL6_12hr.CEL 32 >>> ZG_ST1KO_NT_2hr.CEL 33 >>> ZH_ST1KO_HIL6_2hr.CEL 34 >>> ZI_ST1KO_HIL6_12hr.CEL 35 >>>> data.frame(eset) >>> X10338001 X10338003 X10338004 X10338017 X10338025 >>> A_WT1_NT_2hr.CEL 11.71717 10.183620 9.440631 12.79412 8.823529 >>> B_WT1_NT_2hr.CEL 11.78778 10.027760 9.489226 12.98544 8.843002 >>> X10338026 X10338029 X10338035 X10338036 X10338037 >>> A_WT1_NT_2hr.CEL 13.22585 9.405038 8.853564 9.379031 3.661987 >>> B_WT1_NT_2hr.CEL 13.29043 9.575309 8.772872 9.513050 3.514885 >>> X10338041 X10338042 X10338044 X10338047 X10338056 >>> A_WT1_NT_2hr.CEL 10.94638 10.116516 11.88296 8.872839 3.133222 >>> B_WT1_NT_2hr.CEL 11.23276 10.134084 12.03381 7.568584 3.088548 >>> X10338059 X10338060 X10338063 X10338064 X10338065 >>> >>> JIM, I TRUNCATED THIS LIST, BUT THOUGHT IT MIGHT BE USEFUL IN DIAGNOSING THE >>> PROBLEMS I'M HAVING. SESSION IS CONTINUED BELOW. 
>>> >>>> library(affyQCReport) >>> Loading required package: lattice >>>> titlePage(mydata) >>> [1] TRUE >>>> signalDist(mydata) >>> Warning message: >>> In data.row.names(row.names, rowsi, i) : >>> some row.names duplicated: >>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50, 51,52,53,5 >>> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,9 8,99,102,1 >>> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139 ,141,142,1 >>> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169 ,170,171,1 >>> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202 ,206,207,2 >>> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249 ,250,251,2 >>> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290 ,291,292,2 >>> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324 ,334,337,3 >>> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373 ,376,378,3 >>> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403 ,405,406,4 >>> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443 ,445,447,4 >>> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492 ,493,494,4 >>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... 
>>> truncated] >>>> plot(qc(mydata)) >>> Error in plot(qc(mydata)) : >>> error in evaluating the argument 'x' in selecting a method for function >>> 'plot' >>>> borderQC1(mydata) >>> [1] TRUE >>>> borderQC2(mydata) >>> [1] TRUE >>>> correlationPlot(mydata) >>> [1] TRUE >>>> titlePage(mydata) >>> [1] TRUE >>>> titlePage(mydata) >>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >>> plot.new has not been called yet >>>> correlationPlot(mydata) >>> [1] TRUE >>>> titlePage(mydata) >>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >>> plot.new has not been called yet >>> In addition: Warning message: >>> Display list redraw incomplete >>>> borderQC1(mydata) >>> [1] TRUE >>>> titlePage(mydata) >>> [1] TRUE >>>> titlePage(mydata) >>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >>> plot.new has not been called yet >>>> traceback() >>> 2: polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) >>> 1: titlePage(mydata) >>>> sessionInfo() >>> R version 2.12.0 (2010-10-15) >>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) >>> >>> locale: >>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8 >>> >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> >>> other attached packages: >>> [1] affyQCReport_1.28.1 lattice_0.19-13 affycoretools_1.22.0 >>> [4] KEGG.db_2.4.5 GO.db_2.4.5 RSQLite_0.9-4 >>> [7] DBI_0.2-5 AnnotationDbi_1.12.0 mogene10stv1cdf_2.7.0 >>> [10] simpleaffy_2.26.1 gcrma_2.22.0 genefilter_1.32.0 >>> [13] affy_1.28.0 Biobase_2.10.0 >>> >>> loaded via a namespace (and not attached): >>> [1] affyio_1.18.0 affyPLM_1.26.0 annaffy_1.22.0 >>> [4] annotate_1.28.0 biomaRt_2.6.0 Biostrings_2.18.2 >>> [7] Category_2.16.0 GOstats_2.16.0 graph_1.28.0 >>> [10] grid_2.12.0 GSEABase_1.12.2 IRanges_1.8.7 >>> [13] limma_3.6.9 preprocessCore_1.12.0 RBGL_1.26.0 >>> [16] RColorBrewer_1.0-2 RCurl_1.4-3 splines_2.12.0 >>> [19] survival_2.36-2 
tools_2.12.0 XML_3.2-0
>>> [22] xtable_1.5-6
>>>>
>>>
>>> On 7/01/11 12:47 PM, "James W. MacDonald"<jmacdon at med.umich.edu> wrote:
>>>
>>>> Hi Rick,
>>>>
>>>> What happens if you load the simpleaffy package first?
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>> On 1/7/2011 2:14 PM, Rick Frausto wrote:
>>>>> Hi James,
>>>>>
>>>>> Below is the information that you requested - traceback() and
>>>>> sessioninfo().
>>>>> Doesn't seem like much to me, but perhaps you can help. As you answer to a
>>>>> lot of e-mails, thought I'd remind you that this is in regards to the "some
>>>>> row.names duplicated" error.
>>>>>
>>>>> Hope your holidays were good!
>>>>>
>>>>> -Rick
>>>>>
>>>>> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0]
>>>>>
>>>>> [Workspace restored from /Users/rickfrausto/.RData]
>>>>> [History restored from /Users/rickfrausto/.Rapp.history]
>>>>>
>>>>>> library(affy)
>>>>> Loading required package: Biobase
>>>>>
>>>>> Welcome to Bioconductor
>>>>>
>>>>> Vignettes contain introductory material. To view, type
>>>>> 'openVignette()'. To cite Bioconductor, see
>>>>> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>>>>> >>>>>> mydata<- ReadAffy() >>>>>> eset<- rma(mydata) >>>>> Background correcting >>>>> Normalizing >>>>> Calculating Expression >>>>>> write.exprs(eset, file="mydata.txt") >>>>>> mypm<- pm(mydata) >>>>>> mymm<- mm(mydata) >>>>>> myaffyids<- probeNames(mydata) >>>>>> result<- data.frame(myaffyids, mypm, mymm) >>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") >>>>> Loading required package: lattice >>>>> Warning message: >>>>> In data.row.names(row.names, rowsi, i) : >>>>> some row.names duplicated: >>>>> > 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51 ,52,53,>>>> > 5 >>>>> > 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98, 99,102,>>>> > 1 >>>>> > 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,1 41,142,>>>> > 1 >>>>> > 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,1 70,171,>>>> > 1 >>>>> > 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,2 06,207,>>>> > 2 >>>>> > 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,2 50,251,>>>> > 2 >>>>> > 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,2 91,292,>>>> > 2 >>>>> > 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,3 34,337,>>>> > 3 >>>>> > 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,3 76,378,>>>> > 3 >>>>> > 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,4 05,406,>>>> > 4 >>>>> > 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,4 45,447,>>>> > 4 >>>>> > 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,4 93,494,>>>> > 4 >>>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... 
>>>>> truncated] >>>>> Error in plot(qc(object)) : >>>>> error in evaluating the argument 'x' in selecting a method for function >>>>> 'plot' >>>>>> traceback() >>>>> 2: plot(qc(object)) >>>>> 1: QCReport(mydata, file = "ExampleQC.pdf") >>>>>> sessionInfo() >>>>> R version 2.12.0 (2010-10-15) >>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) >>>>> >>>>> locale: >>>>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8 >>>>> >>>>> attached base packages: >>>>> [1] stats graphics grDevices utils datasets methods base >>>>> >>>>> other attached packages: >>>>> [1] affyQCReport_1.28.1 latptice_0.19-13 mogene10stv1cdf_2.7.0 >>>>> [4] affy_1.28.0 Biobase_2.10.0 >>>>> >>>>> loaded via a namespace (and not attached): >>>>> [1] affyio_1.18.0 affyPLM_1.26.0 annotate_1.28.0 >>>>> [4] AnnotationDbi_1.12.0 Biostrings_2.18.2 DBI_0.2-5 >>>>> [7] gcrma_2.22.0 genefilter_1.32.0 grid_2.12.0 >>>>> [10] IRanges_1.8.7 preprocessCore_1.12.0 RColorBrewer_1.0-2 >>>>> [13] RSQLite_0.9-4 simpleaffy_2.26.1 splines_2.12.0 >>>>> [16] survival_2.36-2 tools_2.12.0 xtable_1.5-6 >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On 20/12/10 6:33 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> wrote: >>>>> >>>>>> Hi Rick, >>>>>> >>>>>> On 12/17/2010 9:24 PM, Rick Frausto wrote: >>>>>>> Hey Jim, >>>>>>> >>>>>>> Ok, I will give that a go. The only problem is an ExpressionSet contains >>>>>>> all >>>>>>> of the necessary information for further analysis (e.g. phenodata, >>>>>>> featuredata and annotation, etc - including, treatment type, cell type, >>>>>>> time >>>>>>> points, replicates). I am still learning how to include all of these for >>>>>>> a >>>>>>> complete ExpressionSet. As a starting point I've loaded a txt file >>>>>>> containing some of this information (gene abbrev, ontology, probeset ID) >>>>>>> which I created using Affymetrix's Expression Console software, without >>>>>>> replicate, time point and cell type info. 
Doing this I've gotten as far >>>>>>> as >>>>>>> creating a minimal ExpressionSet, which I guess the functions you mention >>>>>>> below do just that but with the information contained in the CEL file >>>>>>> only. >>>>>>> >>>>>>> In any case, since as you say, the functions in the online manual create >>>>>>> a >>>>>>> proper ExpressionSet why would I get the issue of duplication? >>>>>> >>>>>> Oh yeah, the original question ;-D. Try running QCreport() again, and >>>>>> when it errors out run traceback() and send the output. Also include the >>>>>> output of sessionInfo(). >>>>>> >>>>>> Jim >>>>>> >>>>>> >>>>>>> >>>>>>> In regards to the 64-bit discussion. It may have very well made enough of >>>>>>> a >>>>>>> difference as it did not come up with the memory error the last time I >>>>>>> tried >>>>>>> it. Going to upgrade to 8GB RAM anyways, can't hurt. >>>>>>> >>>>>>> Cheers, >>>>>>> Rick >>>>>>> >>>>>>> >>>>>>> On 17/12/10 7:20 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Rick, >>>>>>>> >>>>>>>> On 12/16/2010 4:13 PM, Rick Frausto wrote: >>>>>>>>> Hi Jim, >>>>>>>>> >>>>>>>>> How do I run an RMA analysis without a proper ExpresionSet? Honest >>>>>>>>> answer, >>>>>>>>> I >>>>>>>>> don't know, I just put in a command line from a manual I found online >>>>>>>>> and >>>>>>>>> it >>>>>>>>> spit out some result- see #3 Affy packages in following link ( >>>>>>>>> http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#b iocon_intro >>>>>>>>> ). >>>>>>>> >>>>>>>> You are mistaken. All of the functions mentioned there result in a >>>>>>>> proper ExpressionSet. And if you just do >>>>>>>> >>>>>>>> abatch<- ReadAffy() >>>>>>>> eset<- rma(abatch) >>>>>>>> >>>>>>>> Then you will 100% surely get an ExpressionSet. 
>>>>>>>> >>>>>>>>> >>>>>>>>> Perhaps you don't need an ExpressionSet until after the preprocessing, >>>>>>>>> at >>>>>>>>> least that is what I get from the "An Introduction to Bioconductor's >>>>>>>>> ExpressionSet Class" written by Seth Falcon, Martin Morgan and Robert >>>>>>>>> Gentleman. Everything seemed to be going smoothly until I tried to get >>>>>>>>> a >>>>>>>>> QC >>>>>>>>> Report. >>>>>>>>> >>>>>>>>> Now, the answer for why I would want to do such a thing is easy. Simply >>>>>>>>> that >>>>>>>>> I don't know any better :) Just started working with R a few days ago, >>>>>>>>> but >>>>>>>>> I'm learning. >>>>>>>>> >>>>>>>>> >>>>>>>>> Apparently Snow Leopard running on 32bit can only utilize about 3.2GB >>>>>>>>> of >>>>>>>>> RAM, whereas 64bit can make use of all 4GB. I'll switch to the 64 bit >>>>>>>>> OS >>>>>>>>> and >>>>>>>>> see if it makes a difference. >>>>>>>> >>>>>>>> Well, it won't be much different. The reason a 32-bit OS can only use >>>>>>>> about 3.2 Gb of RAM is that the OS needs some to run. The 64-bit OS also >>>>>>>> needs to use some RAM, so you won't get all 4 Gb there either. The issue >>>>>>>> is how much RAM can be allocated to a single process, and on a 64-bit OS >>>>>>>> that gets bumped up significantly. >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> Jim >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks for your insight! >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Rick >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 16/12/10 11:31 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi Rick, >>>>>>>>>> >>>>>>>>>> On 12/16/2010 12:57 PM, Rick Frausto wrote: >>>>>>>>>>> Thanks Jim! How much memory would I need, I currently have 4GB, but >>>>>>>>>>> have >>>>>>>>>>> quite a few other programs running in the background...I'll see if >>>>>>>>>>> closing >>>>>>>>>>> them helps. Perhaps setting up an "ExpressionSet" would solve the >>>>>>>>>>> problem. 
>>>>>>>>>>> I >>>>>>>>>>> just started reading up on how to set one of these up yesterday. Will >>>>>>>>>>> do >>>>>>>>>>> this and see if the duplicates will go away. >>>>>>>>>>> >>>>>>>>>>> The "mydata" originates from CEL files and then I run the RMA >>>>>>>>>>> analysis >>>>>>>>>>> on >>>>>>>>>>> it, but I didn't actually set up a proper ExpressionSet. I'm guessing >>>>>>>>>>> that >>>>>>>>>>> doing this might reduce the QCReport PDF file size quite considerably >>>>>>>>>>> since >>>>>>>>>>> I won't have any duplication and will make further analysis easier. >>>>>>>>>> >>>>>>>>>> How do you run an RMA analysis without setting up a proper >>>>>>>>>> ExpressionSet? The default behavior is to create one. In addition, why >>>>>>>>>> would you want to do such a thing? The ExpressionSet class is >>>>>>>>>> specifically designed to contain these sorts of data. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm running Snow Leopard OSX which can be set up as 64bit. Would >>>>>>>>>>> running >>>>>>>>>>> as >>>>>>>>>>> 64bit still necessitate more RAM? >>>>>>>>>> >>>>>>>>>> Probably. The difference isn't efficiency, but the ability to address >>>>>>>>>> more RAM. A 32-bit OS can still address all the available memory that >>>>>>>>>> you will have with just 4 Gb RAM, so you need to bump that up if you >>>>>>>>>> want to do all the chips together. As for how much, I don't know. >>>>>>>>>> Since >>>>>>>>>> RAM isn't that expensive these days, you might look at maxing your box >>>>>>>>>> out. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> >>>>>>>>>> Jim >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks again, >>>>>>>>>>> Rick >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 15/12/10 7:45 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Rick, >>>>>>>>>>>> >>>>>>>>>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote: >>>>>>>>>>>>> Dear All, >>>>>>>>>>>>> >>>>>>>>>>>>> I have recently entered the world of R. 
Through some trial and >>>>>>>>>>>>> error >>>>>>>>>>>>> I'm >>>>>>>>>>>>> becoming more familiar with R and the relevant Bioconductor Affy >>>>>>>>>>>>> packages. >>>>>>>>>>>>> I?m a molecular and cell biologist with rudimentary statistical >>>>>>>>>>>>> knowledge >>>>>>>>>>>>> and even less knowledge with respect to R. >>>>>>>>>>>>> >>>>>>>>>>>>> When I enter the following: >>>>>>>>>>>>> >>>>>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") >>>>>>>>>>>>> >>>>>>>>>>>>> I get some errors in return. >>>>>>>>>>>>> >>>>>>>>>>>>> Loading required package: lattice >>>>>>>>>>>>> Error: cannot allocate vector of size 437.4 Mb >>>>>>>>>>>> >>>>>>>>>>>> This indicates that you need more RAM, as you are running out of >>>>>>>>>>>> memory. >>>>>>>>>>>> >>>>>>>>>>>>> In addition: Warning message: >>>>>>>>>>>>> In data.row.names(row.names, rowsi, i) : >>>>>>>>>>>>> some row.names duplicated: >>>>>>>>>>>>> >>>>>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48, 49,50,51,52 >>>>>>>>> ,5 >>>>>>>>> 3, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 5 >>>>>>>>>>>>> >>>>>>>>> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,9 6,97,98,99, >>>>>>>>> 10 >>>>>>>>> 2, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 1 >>>>>>>>>>>>> >>>>>>>>> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,1 38,139,141, >>>>>>>>> 14 >>>>>>>>> 2, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 1 >>>>>>>>>>>>> >>>>>>>>> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,1 68,169,170, >>>>>>>>> 17 >>>>>>>>> 1, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 1 >>>>>>>>>>>>> >>>>>>>>> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,2 00,202,206, >>>>>>>>> 20 >>>>>>>>> 7, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 2 >>>>>>>>>>>>> >>>>>>>>> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,2 48,249,250, >>>>>>>>> 25 >>>>>>>>> 1, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 2 >>>>>>>>>>>>> >>>>>>>>> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,2 89,290,291, >>>>>>>>> 29 >>>>>>>>> 2, 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 2 >>>>>>>>>>>>> >>>>>>>>> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,3 22,324,334, >>>>>>>>> 33 >>>>>>>>> 7, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 3 >>>>>>>>>>>>> >>>>>>>>> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,3 71,373,376, >>>>>>>>> 37 >>>>>>>>> 8, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 3 >>>>>>>>>>>>> >>>>>>>>> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,4 02,403,405, >>>>>>>>> 40 >>>>>>>>> 6, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 4 >>>>>>>>>>>>> >>>>>>>>> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,4 41,443,445, >>>>>>>>> 44 >>>>>>>>> 7, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 4 >>>>>>>>>>>>> >>>>>>>>> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,4 91,492,493, >>>>>>>>> 49 >>>>>>>>> 4, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> 4 >>>>>>>>>>>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... >>>>>>>>>>>>> truncated] >>>>>>>>>>>> >>>>>>>>>>>> What exactly is 'mydata', and how did you generate it? The above >>>>>>>>>>>> error >>>>>>>>>>>> indicates that you have duplicate row names, which IIRC isn't >>>>>>>>>>>> possible >>>>>>>>>>>> to do with an expressionSet. >>>>>>>>>>>> >>>>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error >>>>>>>>>>>>> code=12) >>>>>>>>>>>>> *** error: can't allocate region >>>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug >>>>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error >>>>>>>>>>>>> code=12) >>>>>>>>>>>>> *** error: can't allocate region >>>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug >>>>>>>>>>>> >>>>>>>>>>>> More lack of memory errors. 
>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) : >>>>>>>>>>>>> unused argument(s) (htmlhelp = TRUE) >>>>>>>>>>>>> In addition: Warning messages: >>>>>>>>>>>>> 1: In data(package = .packages(all.available = TRUE)) : >>>>>>>>>>>>> datasets have been moved from package 'base' to package >>>>>>>>>>>>> 'datasets' >>>>>>>>>>>>> 2: In data(package = .packages(all.available = TRUE)) : >>>>>>>>>>>>> datasets have been moved from package 'stats' to package >>>>>>>>>>>>> 'datasets' >>>>>>>>>>>>> starting httpd help server ... done >>>>>>>>>>>>> >>>>>>>>>>>>> Would someone be able to diagnose the problem and suggest a >>>>>>>>>>>>> solution? >>>>>>>>>>>> >>>>>>>>>>>> First, get more RAM. Second, you will be better off using a 64-bit >>>>>>>>>>>> OS. >>>>>>>>>>>> Depending on your hardware, you might be able to just install a >>>>>>>>>>>> 64-bit >>>>>>>>>>>> version of R. >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> >>>>>>>>>>>> Jim >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> If it is useful, I am using the following R software: R for Mac OS >>>>>>>>>>>>> X >>>>>>>>>>>>> GUI >>>>>>>>>>>>> 1.35-dev Leopard build 32-bit. If there is any other info that >>>>>>>>>>>>> would >>>>>>>>>>>>> be >>>>>>>>>>>>> useful please let me know. >>>>>>>>>>>>> >>>>>>>>>>>>> I had a read of the AffyQCReport Package pdf and I have added the >>>>>>>>>>>>> following >>>>>>>>>>>>> line: QCReport(ReadAffy(widget=TRUE)). Then I tried >>>>>>>>>>>>> library(affyQCReport); >>>>>>>>>>>>> QCReport(mydata, file="ExampleQC.pdf") again. It now seems to be >>>>>>>>>>>>> doing >>>>>>>>>>>>> something, in other words it doesn?t go to the error, yet, but it?s >>>>>>>>>>>>> been >>>>>>>>>>>>> processing for about 10 minutes. I am analyzing 35 chips. >>>>>>>>>>>>> >>>>>>>>>>>>> Perhaps it would work if I tried to generate each QCReport page >>>>>>>>>>>>> separately >>>>>>>>>>>>> rather than as a whole. 
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cordially,
>>>>>>>>>>>>> Rick
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Bioconductor mailing list
>>>>>>>>>>>>> Bioconductor at r-project.org
>>>>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>>>>>>> Search the archives:
>>>>>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
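As a rough cross-check of the memory figures quoted in the message above, the size in the malloc log (458665984 bytes) can be converted to the units R uses in its "cannot allocate vector" error. This is only a sketch in base R; treating the block as a vector of 8-byte doubles is an assumption made for illustration, not something the log states.

```r
# Size reported by malloc in the quoted log, in bytes.
mmap_bytes <- 458665984

# R prints allocation sizes in binary megabytes (1 Mb = 2^20 bytes).
size_mb <- mmap_bytes / 2^20

# Assumption for illustration: if the block held doubles (8 bytes each),
# this is how many values it would contain.
n_doubles <- mmap_bytes / 8

# Upper bound on the address space of a 32-bit process, in GiB.
address_space_gib <- 2^32 / 2^30

cat(sprintf("Requested block: %.1f Mb (%.0f doubles if numeric)\n",
            size_mb, n_doubles))
cat(sprintf("32-bit address space: %.0f GiB at most\n", address_space_gib))
```

The two figures in the log agree to the precision printed, so the malloc failure and the R error describe the same request. The 4 GiB value is only a ceiling; as discussed in the thread, the amount a single process can actually use is lower.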
Thanks for all your help Jim!

On 11/01/11 6:58 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:

> Hi Rick,
>
> On 1/10/2011 4:57 PM, Rick Frausto wrote:
>> Hi Jim,
>>
>> You're right...
>>
>> > any(duplicated(unlist(indexProbes(mydata, "both"))))
>> [1] TRUE
>>
>> Figured it would be something simple, almost always is. Guess since the MM
>> values are only really necessary for calculating a "real" PM value I should
>> generally still be ok with using R Bioconductor packages for downstream
>> analysis of these chips?? For example, using eset<-rma() to normalize my
>> data should still be ok.
>
> Yep. RMA only uses PM values, so this will be fine. You only get into
> trouble when trying to use mas5 based methods.
>
>> By the way, the documentation on the AffyQCReport function regarding
>> signalDist() states that "The first is a boxplot plot of the all pm
>> intensities and the second plot consists of kernel density estimates of
>> these intensities." From this it would seem to a novice like me that it only
>> uses PM values, clearly I'm not correct. I guess these are PM values
>> adjusted for the MM signal.
>
> Nope, they aren't adjusted for MM, they just include the MM values as
> well. Here is a little primer on how to see what is going on.
>
> If you load the affyQCReport package and then type signalDist at the R
> prompt, you will get this:
>
> > signalDist
> function (object)
> {
>     par(mfrow = c(2, 1))
>     ArrayIndex = as.character(1:length(sampleNames(object)))
>     boxplot(object, names = ArrayIndex, ylab = "Log2(Intensity)",
>         xlab = "Array Index")
>     hist(x = object, lt = 1:length(ArrayIndex), col = 1:length(ArrayIndex),
>         which = "both")
>     temppar <- par()
>     legend(((temppar$xaxp[2] - temppar$xaxp[1])/temppar$xaxp[3]) *
>         (temppar$xaxp[3] - 1) + temppar$xaxp[1], temppar$yaxp[2],
>         as.character(ArrayIndex), lt = 1:length(ArrayIndex),
>         col = 1:length(ArrayIndex), cex = 0.5)
> }
> <environment: namespace:affyQCReport>
>
> So you can see that we are calling boxplot() as well as hist() on the
> 'object', which is an AffyBatch. Let's see what boxplot() and hist() do.
>
> > boxplot
> standardGeneric for "boxplot" defined from package "graphics"
>
> function (x, ...)
> standardGeneric("boxplot")
> <environment: 0x184ea378>
> Methods may be defined for arguments: x
> Use showMethods("boxplot") for currently available ones.
>
> So this is an S4 method, and the methods are slightly harder to get to,
> but let's follow the prescription on the last line.
>
> > showMethods(boxplot, class = "AffyBatch", includeDefs = TRUE)
> Function: boxplot (package graphics)
> x="AffyBatch"
> function (x, ...)
> {
>     .local <- function (x, which = "both", range = 0, main, ...)
>     {
>         tmp <- description(x)
>         if (missing(main) && (is(tmp, "MIAME")))
>             main <- tmp@title
>         tmp <- unlist(indexProbes(x, which))
>         tmp <- tmp[seq(1, length(tmp), len = 5000)]
>         boxplot(data.frame(log2(intensity(x)[tmp, ])), main = main,
>             range = range, ...)
>     }
>     .local(x, ...)
> }
>
> Note two things here. I added in class = "AffyBatch", because there may
> be other boxplot methods for other objects, and we really don't care
> about them. Additionally, I included includeDefs = TRUE, which will
> cause the function to be output.
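The same inspection steps can be tried without any Affymetrix data. The sketch below uses only the base `methods` package; the class `ToyBatch` and the generic `describe` are hypothetical stand-ins invented for illustration and are not part of affy or affyQCReport. Note that the full name of the argument is `classes`; the `class =` spelling in the quoted call appears to work through R's partial argument matching.

```r
library(methods)

# Hypothetical toy class standing in for AffyBatch.
setClass("ToyBatch", representation(values = "numeric"))

# Hypothetical generic with the same (x, ...) shape as boxplot().
setGeneric("describe", function(x, ...) standardGeneric("describe"))

# The method adds a 'which' argument with a default that the generic
# does not have, mirroring the which = "both" default in the quoted
# boxplot method for AffyBatch.
setMethod("describe", "ToyBatch", function(x, which = "both", ...) {
  sprintf("%d values, which = %s", length(x@values), which)
})

# Print the method definition for one class only.
showMethods("describe", classes = "ToyBatch", includeDefs = TRUE)

# getMethod() returns the same definition as an object.
m <- getMethod("describe", "ToyBatch")

obj <- new("ToyBatch", values = c(1.5, 2.5, 3.5))
default_call  <- describe(obj)                # default 'which' is used
override_call <- describe(obj, which = "pm")  # passed through '...'
print(default_call)
print(override_call)
```

The second call shows the point made in the thread: a caller can override the method's default through `...`, but only if it actually passes the argument along.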
>
> The .local function has a default of which = 'both', and you see that
> argument is used for the call to indexProbes (also note that there is a
> '...' argument to .local, that could be used to pass in a which = "pm"
> in signalDist() to override the default, but it is not, so the help page
> is incorrect). If you look at ?indexProbes, you will see this in the
> methods section:
>
>     indexProbes 'signature(object = "AffyBatch", which =
>          "character")': returns a list with locations of the probes in
>          each probe set. The affyID corresponding to the probe set to
>          retrieve can be specified in an optional parameter
>          'genenames'. By default, all the affyIDs are retrieved. The
>          names of the elements in the list returned are the affyIDs.
>          'which' can be "pm", "mm", or "both". If "both" then perfect
>          match locations are given followed by mismatch locations.
>
> The warning you get comes from here:
>
>     tmp <- unlist(indexProbes(x, which))
>     tmp <- tmp[seq(1, length(tmp), len = 5000)]
>     boxplot(data.frame(log2(intensity(x)[tmp, ])), main = main,
>         range = range, ...)
>
> Which is basically getting a subset of 5000 probes to create the
> boxplot. Since half of your indices from indexProbes() will be NA, a
> bunch of the tmp variable will be NAs as well. We can re-create the
> warning you get below with a little example:
>
> > x <- matrix(rnorm(100), ncol = 10)
> > row.names(x) <- letters[1:10]
> > z <- data.frame(x[c(1,2,3,NA,4,5,NA),])
> Warning message:
> In data.row.names(row.names, rowsi, i) :
>   some row.names duplicated: 7 --> row.names NOT used
>
> Best,
>
> Jim
>
>> Thanks for figuring this out for me. Let me know if these and other related
>> questions would be better served as standalone e-mails.
>>
>> Cheers,
>> Rick
>>
>> On 10/01/11 7:04 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>
>>> Hi Rick,
>>>
>>> After all that, the reason is really simple.
>>> You are trying to use affyQCReport on a PM-only chip, which isn't going
>>> to work out so well. I don't have any mogene data around to play with
>>> (and don't have the time to go searching), so I will have to make some
>>> educated guesses.
>>>
>>> Internally in signalDist() you are calling boxplot() and hist() on your
>>> AffyBatch. And the default for both functions is to use both PM and MM
>>> probes. I'm betting that
>>>
>>> any(duplicated(unlist(indexProbes(mydata, "both"))))
>>>
>>> returns TRUE, indicating that indexProbes doesn't work correctly on a
>>> PM-only chip, which is fair enough, as it was never designed to do so.
>>>
>>> And plot(qc(mydata)) will never work, as it relies on computing a
>>> Wilcoxon signed-rank test between the PM and MM probes, and since you
>>> don't have MM probes, well you get the picture...
>>>
>>> Best,
>>>
>>> Jim
>>>
>>> On 1/7/2011 6:56 PM, Rick Frausto wrote:
>>>> Hi Jim,
>>>>
>>>> Ok, so after doing a bit of reading and re-reading I was eventually able to
>>>> generate each page in a quartz window that the "QCReport" function should
>>>> also generate. I found which ones give me the errors. So, there should be 6
>>>> pages in total. Page 2 gives me the duplication error and page 3 gives me
>>>> the error in evaluating the argument x. The other pages are ok and are
>>>> generated as expected.
>>>>
>>>> In brief, page 2 is supposed to be generated with the "signalDist(mydata)"
>>>> command. Page 3 is supposed to be generated with the "plot(qc(mydata))"
>>>> command.
>>>>
>>>> So, I guess there must be particular requirements for these commands that
>>>> I'm missing. I've included the session below along with traceback() and
>>>> sessionInfo().
>>>>
>>>> R version 2.12.0 (2010-10-15)
>>>> Copyright (C) 2010 The R Foundation for Statistical Computing
>>>> ISBN 3-900051-07-0
>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>
>>>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>>>> You are welcome to redistribute it under certain conditions. >>>> Type 'license()' or 'licence()' for distribution details. >>>> >>>> Natural language support but running in an English locale >>>> >>>> R is a collaborative project with many contributors. >>>> Type 'contributors()' for more information and >>>> 'citation()' on how to cite R or R packages in publications. >>>> >>>> Type 'demo()' for some demos, 'help()' for on-line help, or >>>> 'help.start()' for an HTML browser interface to help. >>>> Type 'q()' to quit R. >>>> >>>> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0] >>>> >>>> [Workspace restored from /Users/rickfrausto/.RData] >>>> [History restored from /Users/rickfrausto/.Rapp.history] >>>> >>>>> library(simpleaffy) >>>> Loading required package: affy >>>> Loading required package: Biobase >>>> >>>> Welcome to Bioconductor >>>> >>>> Vignettes contain introductory material. To view, type >>>> 'openVignette()'. To cite Bioconductor, see >>>> 'citation("Biobase")' and for packages 'citation(pkgname)'. >>>> >>>> Loading required package: genefilter >>>> Loading required package: gcrma >>>> >>>> Attaching package: 'simpleaffy' >>>> >>>> The following object(s) are masked _by_ '.GlobalEnv': >>>> >>>> getBioC >>>> >>>>> library(affy) >>>>> mydata<- ReadAffy() >>>>> eset<- rma(mydata) >>>> Background correcting >>>> Normalizing >>>> Calculating Expression >>>>> library(affycoretools); affystart(plot=T, express="rma") >>>> Loading required package: GO.db >>>> Loading required package: AnnotationDbi >>>> Loading required package: DBI >>>> Loading required package: KEGG.db >>>> Background correcting >>>> Normalizing >>>> Calculating Expression >>>> Please give the x-coordinate for a legend.30 >>>> Please give the y-coordinate for a legend.80 >>>> ExpressionSet (storageMode: lockedEnvironment) >>>> assayData: 34760 features, 35 samples >>>> element names: exprs >>>> protocolData >>>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... 
>>>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >>>> varLabels: ScanDate >>>> varMetadata: labelDescription >>>> phenoData >>>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... >>>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >>>> varLabels: sample >>>> varMetadata: labelDescription >>>> featureData: none >>>> experimentData: use 'experimentData(object)' >>>> Annotation: mogene10stv1 >>>>> write.exprs(eset, file="mydata.txt") >>>>> x<- data.frame(exprs(eset), exprs(eset_PMA), assayDataElement(eset_PMA, >>>> "se.exprs")); x<- x[,sort(names(x))]; write.table(x, file="mydata_PMA.xls", >>>> quote=F, col.names = NA, sep="\t") >>>> Error in exprs(eset_PMA) : >>>> error in evaluating the argument 'object' in selecting a method for >>>> function 'exprs' >>>>> mypm<- pm(mydata) >>>>> mymm<- mm(mydata) >>>>> myaffyids<- probeNames(mydata) >>>>> result<- data.frame(myaffyids, mypm, mymm) >>>>> eset; pData(eset) >>>> ExpressionSet (storageMode: lockedEnvironment) >>>> assayData: 34760 features, 35 samples >>>> element names: exprs >>>> protocolData >>>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... >>>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >>>> varLabels: ScanDate >>>> varMetadata: labelDescription >>>> phenoData >>>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ... 
>>>> ZI_ST1KO_HIL6_12hr.CEL (35 total) >>>> varLabels: sample >>>> varMetadata: labelDescription >>>> featureData: none >>>> experimentData: use 'experimentData(object)' >>>> Annotation: mogene10stv1 >>>> sample >>>> A_WT1_NT_2hr.CEL 1 >>>> B_WT1_NT_2hr.CEL 2 >>>> C_WT1_NT_12hr.CEL 3 >>>> D_WT1_NT_12hr.CEL 4 >>>> E_WT1_HIL6_2hr.CEL 5 >>>> F_WT1_HIL6_2hr.CEL 6 >>>> G_WT1_HIL6_12hr.CEL 7 >>>> H_WT1_HIL6_12hr.CEL 8 >>>> I_FF_NT_2hr.CEL 9 >>>> J_FF_NT_2hr.CEL 10 >>>> K_FF_NT_12hr.CEL 11 >>>> L_FF_NT_12hr.CEL 12 >>>> M_FF_HIL6_2hr.CEL 13 >>>> N_FF_HIL6_2hr.CEL 14 >>>> O_FF_HIL6_12hr.CEL 15 >>>> P_FF_HIL6_12hr.CEL 16 >>>> Q_WT2_NT_2hr.CEL 17 >>>> R_WT2_NT_2hr.CEL 18 >>>> S_WT2_NT_12hr.CEL 19 >>>> T_WT2_NT_12hr.CEL 20 >>>> U_WT2_HIL6_2hr.CEL 21 >>>> V_WT2_HIL6_2hr.CEL 22 >>>> W_WT2_HIL6_12hr.CEL 23 >>>> X_WT2_HIL6_12hr.CEL 24 >>>> Y_DD_NT_2hr.CEL 25 >>>> Z_DD_NT_2hr.CEL 26 >>>> ZA_DD_NT_12hr.CEL 27 >>>> ZB_DD_NT_12hr.CEL 28 >>>> ZC_DD_HIL6_2hr.CEL 29 >>>> ZD_DD_HIL6_2hr.CEL 30 >>>> ZE_DD_HIL6_12hr.CEL 31 >>>> ZF_DD_HIL6_12hr.CEL 32 >>>> ZG_ST1KO_NT_2hr.CEL 33 >>>> ZH_ST1KO_HIL6_2hr.CEL 34 >>>> ZI_ST1KO_HIL6_12hr.CEL 35 >>>>> data.frame(eset) >>>> X10338001 X10338003 X10338004 X10338017 X10338025 >>>> A_WT1_NT_2hr.CEL 11.71717 10.183620 9.440631 12.79412 8.823529 >>>> B_WT1_NT_2hr.CEL 11.78778 10.027760 9.489226 12.98544 8.843002 >>>> X10338026 X10338029 X10338035 X10338036 X10338037 >>>> A_WT1_NT_2hr.CEL 13.22585 9.405038 8.853564 9.379031 3.661987 >>>> B_WT1_NT_2hr.CEL 13.29043 9.575309 8.772872 9.513050 3.514885 >>>> X10338041 X10338042 X10338044 X10338047 X10338056 >>>> A_WT1_NT_2hr.CEL 10.94638 10.116516 11.88296 8.872839 3.133222 >>>> B_WT1_NT_2hr.CEL 11.23276 10.134084 12.03381 7.568584 3.088548 >>>> X10338059 X10338060 X10338063 X10338064 X10338065 >>>> >>>> JIM, I TRUNCATED THIS LIST, BUT THOUGHT IT MIGHT BE USEFUL IN DIAGNOSING >>>> THE >>>> PROBLEMS I'M HAVING. SESSION IS CONTINUED BELOW. 
>>>> >>>>> library(affyQCReport) >>>> Loading required package: lattice >>>>> titlePage(mydata) >>>> [1] TRUE >>>>> signalDist(mydata) >>>> Warning message: >>>> In data.row.names(row.names, rowsi, i) : >>>> some row.names duplicated: >>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,5 2,53,>>>> 5 >>>> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99 ,102,>>>> 1 >>>> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141 ,142,>>>> 1 >>>> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170 ,171,>>>> 1 >>>> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206 ,207,>>>> 2 >>>> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250 ,251,>>>> 2 >>>> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291 ,292,>>>> 2 >>>> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334 ,337,>>>> 3 >>>> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376 ,378,>>>> 3 >>>> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405 ,406,>>>> 4 >>>> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445 ,447,>>>> 4 >>>> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493 ,494,>>>> 4 >>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... 
>>>> truncated] >>>>> plot(qc(mydata)) >>>> Error in plot(qc(mydata)) : >>>> error in evaluating the argument 'x' in selecting a method for function >>>> 'plot' >>>>> borderQC1(mydata) >>>> [1] TRUE >>>>> borderQC2(mydata) >>>> [1] TRUE >>>>> correlationPlot(mydata) >>>> [1] TRUE >>>>> titlePage(mydata) >>>> [1] TRUE >>>>> titlePage(mydata) >>>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >>>> plot.new has not been called yet >>>>> correlationPlot(mydata) >>>> [1] TRUE >>>>> titlePage(mydata) >>>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >>>> plot.new has not been called yet >>>> In addition: Warning message: >>>> Display list redraw incomplete >>>>> borderQC1(mydata) >>>> [1] TRUE >>>>> titlePage(mydata) >>>> [1] TRUE >>>>> titlePage(mydata) >>>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) : >>>> plot.new has not been called yet >>>>> traceback() >>>> 2: polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) >>>> 1: titlePage(mydata) >>>>> sessionInfo() >>>> R version 2.12.0 (2010-10-15) >>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) >>>> >>>> locale: >>>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8 >>>> >>>> attached base packages: >>>> [1] stats graphics grDevices utils datasets methods base >>>> >>>> other attached packages: >>>> [1] affyQCReport_1.28.1 lattice_0.19-13 affycoretools_1.22.0 >>>> [4] KEGG.db_2.4.5 GO.db_2.4.5 RSQLite_0.9-4 >>>> [7] DBI_0.2-5 AnnotationDbi_1.12.0 mogene10stv1cdf_2.7.0 >>>> [10] simpleaffy_2.26.1 gcrma_2.22.0 genefilter_1.32.0 >>>> [13] affy_1.28.0 Biobase_2.10.0 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] affyio_1.18.0 affyPLM_1.26.0 annaffy_1.22.0 >>>> [4] annotate_1.28.0 biomaRt_2.6.0 Biostrings_2.18.2 >>>> [7] Category_2.16.0 GOstats_2.16.0 graph_1.28.0 >>>> [10] grid_2.12.0 GSEABase_1.12.2 IRanges_1.8.7 >>>> [13] limma_3.6.9 preprocessCore_1.12.0 RBGL_1.26.0 >>>> [16] RColorBrewer_1.0-2 
RCurl_1.4-3 splines_2.12.0 >>>> [19] survival_2.36-2 tools_2.12.0 XML_3.2-0 >>>> [22] xtable_1.5-6 >>>>> >>>> >>>> On 7/01/11 12:47 PM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> wrote: >>>> >>>>> Hi Rick, >>>>> >>>>> What happens if you load the simpleaffy package first? >>>>> >>>>> Best, >>>>> >>>>> Jim >>>>> >>>>> On 1/7/2011 2:14 PM, Rick Frausto wrote: >>>>>> Hi James, >>>>>> >>>>>> Below is the information that you requested - traceback() and >>>>>> sessioninfo(). >>>>>> Doesn't seem like much to me, but perhaps you can help. As you answer to >>>>>> a >>>>>> lot of e-mails, thought I'd remind you that this is in regards to the >>>>>> "some >>>>>> row.names duplicated" error. >>>>>> >>>>>> Hope your holidays were good! >>>>>> >>>>>> -Rick >>>>>> >>>>>> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0] >>>>>> >>>>>> [Workspace restored from /Users/rickfrausto/.RData] >>>>>> [History restored from /Users/rickfrausto/.Rapp.history] >>>>>> >>>>>>> library(affy) >>>>>> Loading required package: Biobase >>>>>> >>>>>> Welcome to Bioconductor >>>>>> >>>>>> Vignettes contain introductory material. To view, type >>>>>> 'openVignette()'. To cite Bioconductor, see >>>>>> 'citation("Biobase")' and for packages 'citation(pkgname)'. 
>>>>>> >>>>>>> mydata<- ReadAffy() >>>>>>> eset<- rma(mydata) >>>>>> Background correcting >>>>>> Normalizing >>>>>> Calculating Expression >>>>>>> write.exprs(eset, file="mydata.txt") >>>>>>> mypm<- pm(mydata) >>>>>>> mymm<- mm(mydata) >>>>>>> myaffyids<- probeNames(mydata) >>>>>>> result<- data.frame(myaffyids, mypm, mymm) >>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") >>>>>> Loading required package: lattice >>>>>> Warning message: >>>>>> In data.row.names(row.names, rowsi, i) : >>>>>> some row.names duplicated: >>>>>> >> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,5 1,52,53,>> >> >> >> 5 >>>>>> >> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98 ,99,102,>> >> >> >> 1 >>>>>> >> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139, 141,142,>> >> >> >> 1 >>>>>> >> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169, 170,171,>> >> >> >> 1 >>>>>> >> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202, 206,207,>> >> >> >> 2 >>>>>> >> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249, 250,251,>> >> >> >> 2 >>>>>> >> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290, 291,292,>> >> >> >> 2 >>>>>> >> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324, 334,337,>> >> >> >> 3 >>>>>> >> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373, 376,378,>> >> >> >> 3 >>>>>> >> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403, 405,406,>> >> >> >> 4 >>>>>> >> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443, 445,447,>> >> >> >> 4 >>>>>> >> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492, 493,494,>> >> >> >> 4 >>>>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... 
>>>>>> truncated]
>>>>>> Error in plot(qc(object)) :
>>>>>>   error in evaluating the argument 'x' in selecting a method for
>>>>>>   function 'plot'
>>>>>> > traceback()
>>>>>> 2: plot(qc(object))
>>>>>> 1: QCReport(mydata, file = "ExampleQC.pdf")
>>>>>> > sessionInfo()
>>>>>> R version 2.12.0 (2010-10-15)
>>>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>>>
>>>>>> locale:
>>>>>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>
>>>>>> other attached packages:
>>>>>> [1] affyQCReport_1.28.1   lattice_0.19-13       mogene10stv1cdf_2.7.0
>>>>>> [4] affy_1.28.0           Biobase_2.10.0
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>>  [1] affyio_1.18.0         affyPLM_1.26.0        annotate_1.28.0
>>>>>>  [4] AnnotationDbi_1.12.0  Biostrings_2.18.2     DBI_0.2-5
>>>>>>  [7] gcrma_2.22.0          genefilter_1.32.0     grid_2.12.0
>>>>>> [10] IRanges_1.8.7         preprocessCore_1.12.0 RColorBrewer_1.0-2
>>>>>> [13] RSQLite_0.9-4         simpleaffy_2.26.1     splines_2.12.0
>>>>>> [16] survival_2.36-2       tools_2.12.0          xtable_1.5-6
>>>>>>
>>>>>> On 20/12/10 6:33 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>>>>>
>>>>>>> Hi Rick,
>>>>>>>
>>>>>>> On 12/17/2010 9:24 PM, Rick Frausto wrote:
>>>>>>>> Hey Jim,
>>>>>>>>
>>>>>>>> Ok, I will give that a go. The only problem is an ExpressionSet contains
>>>>>>>> all of the necessary information for further analysis (e.g. phenodata,
>>>>>>>> featuredata and annotation, etc - including, treatment type, cell type,
>>>>>>>> time points, replicates). I am still learning how to include all of
>>>>>>>> these for a complete ExpressionSet.
As a starting point I've loaded a txt file >>>>>>>> containing some of this information (gene abbrev, ontology, probeset >>>>>>>> ID) >>>>>>>> which I created using Affymetrix's Expression Console software, without >>>>>>>> replicate, time point and cell type info. Doing this I've gotten as far >>>>>>>> as >>>>>>>> creating a minimal ExpressionSet, which I guess the functions you >>>>>>>> mention >>>>>>>> below do just that but with the information contained in the CEL file >>>>>>>> only. >>>>>>>> >>>>>>>> In any case, since as you say, the functions in the online manual >>>>>>>> create >>>>>>>> a >>>>>>>> proper ExpressionSet why would I get the issue of duplication? >>>>>>> >>>>>>> Oh yeah, the original question ;-D. Try running QCreport() again, and >>>>>>> when it errors out run traceback() and send the output. Also include the >>>>>>> output of sessionInfo(). >>>>>>> >>>>>>> Jim >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> In regards to the 64-bit discussion. It may have very well made enough >>>>>>>> of >>>>>>>> a >>>>>>>> difference as it did not come up with the memory error the last time I >>>>>>>> tried >>>>>>>> it. Going to upgrade to 8GB RAM anyways, can't hurt. >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Rick >>>>>>>> >>>>>>>> >>>>>>>> On 17/12/10 7:20 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Rick, >>>>>>>>> >>>>>>>>> On 12/16/2010 4:13 PM, Rick Frausto wrote: >>>>>>>>>> Hi Jim, >>>>>>>>>> >>>>>>>>>> How do I run an RMA analysis without a proper ExpresionSet? Honest >>>>>>>>>> answer, >>>>>>>>>> I >>>>>>>>>> don't know, I just put in a command line from a manual I found online >>>>>>>>>> and >>>>>>>>>> it >>>>>>>>>> spit out some result- see #3 Affy packages in following link ( >>>>>>>>>> http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#biocon_int >>>>>>>>>> ro >>>>>>>>>> ). >>>>>>>>> >>>>>>>>> You are mistaken. All of the functions mentioned there result in a >>>>>>>>> proper ExpressionSet. 
And if you just do >>>>>>>>> >>>>>>>>> abatch<- ReadAffy() >>>>>>>>> eset<- rma(abatch) >>>>>>>>> >>>>>>>>> Then you will 100% surely get an ExpressionSet. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Perhaps you don't need an ExpressionSet until after the >>>>>>>>>> preprocessing, >>>>>>>>>> at >>>>>>>>>> least that is what I get from the "An Introduction to Bioconductor's >>>>>>>>>> ExpressionSet Class" written by Seth Falcon, Martin Morgan and Robert >>>>>>>>>> Gentleman. Everything seemed to be going smoothly until I tried to >>>>>>>>>> get >>>>>>>>>> a >>>>>>>>>> QC >>>>>>>>>> Report. >>>>>>>>>> >>>>>>>>>> Now, the answer for why I would want to do such a thing is easy. >>>>>>>>>> Simply >>>>>>>>>> that >>>>>>>>>> I don't know any better :) Just started working with R a few days >>>>>>>>>> ago, >>>>>>>>>> but >>>>>>>>>> I'm learning. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Apparently Snow Leopard running on 32bit can only utilize about 3.2GB >>>>>>>>>> of >>>>>>>>>> RAM, whereas 64bit can make use of all 4GB. I'll switch to the 64 bit >>>>>>>>>> OS >>>>>>>>>> and >>>>>>>>>> see if it makes a difference. >>>>>>>>> >>>>>>>>> Well, it won't be much different. The reason a 32-bit OS can only use >>>>>>>>> about 3.2 Gb of RAM is that the OS needs some to run. The 64-bit OS >>>>>>>>> also >>>>>>>>> needs to use some RAM, so you won't get all 4 Gb there either. The >>>>>>>>> issue >>>>>>>>> is how much RAM can be allocated to a single process, and on a 64-bit >>>>>>>>> OS >>>>>>>>> that gets bumped up significantly. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> Jim >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks for your insight! >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Rick >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 16/12/10 11:31 AM, "James W. MacDonald"<jmacdon at="" med.umich.edu=""> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Rick, >>>>>>>>>>> >>>>>>>>>>> On 12/16/2010 12:57 PM, Rick Frausto wrote: >>>>>>>>>>>> Thanks Jim! 
How much memory would I need, I currently have 4GB, but have quite a few
>>>>>>>>>>>> other programs running in the background...I'll see if closing them
>>>>>>>>>>>> helps. Perhaps setting up an "ExpressionSet" would solve the problem. I
>>>>>>>>>>>> just started reading up on how to set one of these up yesterday. Will do
>>>>>>>>>>>> this and see if the duplicates will go away.
>>>>>>>>>>>>
>>>>>>>>>>>> The "mydata" originates from CEL files and then I run the RMA analysis on
>>>>>>>>>>>> it, but I didn't actually set up a proper ExpressionSet. I'm guessing that
>>>>>>>>>>>> doing this might reduce the QCReport PDF file size quite considerably since
>>>>>>>>>>>> I won't have any duplication and will make further analysis easier.
>>>>>>>>>>>
>>>>>>>>>>> How do you run an RMA analysis without setting up a proper ExpressionSet?
>>>>>>>>>>> The default behavior is to create one. In addition, why would you want to
>>>>>>>>>>> do such a thing? The ExpressionSet class is specifically designed to
>>>>>>>>>>> contain these sorts of data.
>>>>>>>>>>>
>>>>>>>>>>>> I'm running Snow Leopard OSX which can be set up as 64bit. Would running
>>>>>>>>>>>> as 64bit still necessitate more RAM?
>>>>>>>>>>>
>>>>>>>>>>> Probably. The difference isn't efficiency, but the ability to address more
>>>>>>>>>>> RAM. A 32-bit OS can still address all the available memory that you will
>>>>>>>>>>> have with just 4 Gb RAM, so you need to bump that up if you want to do all
>>>>>>>>>>> the chips together. As for how much, I don't know. Since RAM isn't that
>>>>>>>>>>> expensive these days, you might look at maxing your box out.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Jim
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>> Rick
>>>>>>>>>>>>
>>>>>>>>>>>> On 15/12/10 7:45 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Rick,
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote:
>>>>>>>>>>>> Dear All,
>>>>>>>>>>>>
>>>>>>>>>>>> I have recently entered the world of R. Through some trial and error I'm
>>>>>>>>>>>> becoming more familiar with R and the relevant Bioconductor Affy packages.
>>>>>>>>>>>> I'm a molecular and cell biologist with rudimentary statistical knowledge
>>>>>>>>>>>> and even less knowledge with respect to R.
>>>>>>>>>>>>
>>>>>>>>>>>> When I enter the following:
>>>>>>>>>>>>
>>>>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf")
>>>>>>>>>>>>
>>>>>>>>>>>> I get some errors in return.
>>>>>>>>>>>>
>>>>>>>>>>>> Loading required package: lattice
>>>>>>>>>>>> Error: cannot allocate vector of size 437.4 Mb
>>>>>>>>>>>>
>>>>>>>>>>>> This indicates that you need more RAM, as you are running out of memory.
>>>>>>>>>>>>
>>>>>>>>>>>> In addition: Warning message:
>>>>>>>>>>>> In data.row.names(row.names, rowsi, i) :
>>>>>>>>>>>> some row.names duplicated:
>>>>>>>>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,
>>>>>>>>>>>> 54,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,
>>>>>>>>>>>> 102,103,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,
>>>>>>>>>>>> 141,142,147,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,
>>>>>>>>>>>> 169,170,171,173,175,176,179,180,183,184,185,186,191,192,195,197,198,199,
>>>>>>>>>>>> 200,202,206,207,210,219,220,227,228,229,230,233,234,235,240,241,243,245,
>>>>>>>>>>>> 246,248,249,250,251,252,253,257,259,260,266,271,272,276,277,280,281,284,
>>>>>>>>>>>> 286,287,289,290,291,292,296,297,298,302,304,305,306,310,311,312,313,317,
>>>>>>>>>>>> 318,319,321,322,324,334,337,338,339,340,341,345,346,350,351,356,359,362,
>>>>>>>>>>>> 364,366,367,370,371,373,376,378,382,383,384,385,386,387,388,389,391,394,
>>>>>>>>>>>> 395,397,398,399,400,402,403,405,406,407,409,410,411,415,416,418,419,425,
>>>>>>>>>>>> 431,432,433,434,435,440,441,443,445,447,449,450,452,454,455,456,461,464,
>>>>>>>>>>>> 466,470,472,473,481,487,488,491,492,493,494,495,496,497,498,499,501,502,
>>>>>>>>>>>> 504,506,507,509,511,513,515,516,51 [... truncated]
>>>>>>>>>>>>
>>>>>>>>>>>> What exactly is 'mydata', and how did you generate it? The above error
>>>>>>>>>>>> indicates that you have duplicate row names, which IIRC isn't possible
>>>>>>>>>>>> to do with an expressionSet.
>>>>>>>>>>>>
>>>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>>>>>>>>> *** error: can't allocate region
>>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>>>>>>>>> *** error: can't allocate region
>>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>>>>>>
>>>>>>>>>>>> More lack of memory errors.
>>>>>>>>>>>>
>>>>>>>>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) :
>>>>>>>>>>>> unused argument(s) (htmlhelp = TRUE)
>>>>>>>>>>>> In addition: Warning messages:
>>>>>>>>>>>> 1: In data(package = .packages(all.available = TRUE)) :
>>>>>>>>>>>> datasets have been moved from package 'base' to package 'datasets'
>>>>>>>>>>>> 2: In data(package = .packages(all.available = TRUE)) :
>>>>>>>>>>>> datasets have been moved from package 'stats' to package 'datasets'
>>>>>>>>>>>> starting httpd help server ... done
>>>>>>>>>>>>
>>>>>>>>>>>> Would someone be able to diagnose the problem and suggest a solution?
>>>>>>>>>>>>
>>>>>>>>>>>> First, get more RAM. Second, you will be better off using a 64-bit OS.
>>>>>>>>>>>> Depending on your hardware, you might be able to just install a 64-bit
>>>>>>>>>>>> version of R.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>>
>>>>>>>>>>>> Jim
>>>>>>>>>>>>
>>>>>>>>>>>> If it is useful, I am using the following R software: R for Mac OS X GUI
>>>>>>>>>>>> 1.35-dev Leopard build 32-bit. If there is any other info that would be
>>>>>>>>>>>> useful please let me know.
>>>>>>>>>>>>
>>>>>>>>>>>> I had a read of the AffyQCReport Package pdf and I have added the following
>>>>>>>>>>>> line: QCReport(ReadAffy(widget=TRUE)). Then I tried library(affyQCReport);
>>>>>>>>>>>> QCReport(mydata, file="ExampleQC.pdf") again. It now seems to be doing
>>>>>>>>>>>> something, in other words it doesn't go to the error, yet, but it's been
>>>>>>>>>>>> processing for about 10 minutes. I am analyzing 35 chips.
>>>>>>>>>>>>
>>>>>>>>>>>> Perhaps it would work if I tried to generate each QCReport page separately
>>>>>>>>>>>> rather than as a whole.
>>>>>>>>>>>>
>>>>>>>>>>>> Cordially,
>>>>>>>>>>>> Rick
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Bioconductor mailing list
>>>>>>>>>>>> Bioconductor at r-project.org
>>>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>>>>>> Search the archives:
>>>>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Rick Frausto
PhD Candidate
The University of Sydney
School of Molecular Bioscience G08
Camperdown, NSW 2006 AUSTRALIA
ricardo.frausto at sydney.edu.au
Phone: 61 2 9036 5354
Lab of Iain L. Campbell