heatmap.2 - change column & row locations; angle / rotate


k. brand ▴ 420

@k-brand-1874

Last seen 10.5 years ago

[Reposting from r-help at r-project.org]

Esteemed BioC users,

I'm struggling to achieve some details of a heatmap using heatmap.2():

1. Change label locations, for both rows and columns, from the default right and bottom to left and top. Can this be done within heatmap.2()? Or do I need to suppress this default behavior (how?) and call a new function to relabel (which?), specifying locations?

2. Change the angle of the labels. By default, column labels are rotated 90 degrees anti-clockwise from horizontal. How can I bring them back to horizontal? Or better, rotate them 45 degrees clockwise from horizontal (i.e., 135 degrees clockwise from the default)?

Any suggestions or pointers to helpful resources greatly appreciated,

Karl

--
Karl Brand
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
T +31 (0)10 704 3457 | F +31 (0)10 704 4743 | M +31 (0)642 777 268

• 5.1k views

updated 14.6 years ago by Amos Folarin ▴ 20 • written 14.6 years ago by k. brand ▴ 420


Amos Folarin ▴ 20

@amos-folarin-4180

Last seen 10.5 years ago

Hi Karl,

The only way I know to rotate the labels is pretty crude: suppress the default labels and redraw them with the text() function. The caveat is that you'll have to play around to get the positions right. Try something like this:

    library(gplots)
    x <- matrix(rnorm(25), 5)
    heatmap.2(x, labRow = "", labCol = "")  # remove the default labels
    # Plot the text; perhaps someone can think of a smarter way of
    # getting the labels in position...
    xaxp <- par("xaxp")
    text(seq(xaxp[1] + xaxp[2] / xaxp[3], xaxp[2], by = 0.8 * (xaxp[2] / xaxp[3])),
         par("usr")[3],
         labels = c("first", "second", "third", "fourth", "fifth"),
         srt = 45, pos = 1, xpd = TRUE)

Unfortunately, heatmap.2() lays the figure out as a 2x2 matrix, with the dendrograms and key in the first three cells and the heatmap in the bottom right. I'm not sure whether it is possible to access the axes of that element independently. If it were, positioning the labels for the heatmap part of the plot might be simple.

Amos
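For anyone landing on this thread later, the text() idea above can be tried out in isolation, away from the 2x2 layout that heatmap.2() sets up. The following is only a sketch under stated assumptions: it uses base graphics (image(), axis(), text()) rather than gplots, it does no clustering or reordering, and draw_labelled_image() is a hypothetical helper name made up for this illustration. It puts the row labels on the left and the column labels on top at a 45 degree angle.

```r
# Sketch only: base-graphics heat image with row labels on the left and
# rotated column labels on top. draw_labelled_image() is a hypothetical
# helper for illustration; it is not part of gplots and does no clustering.
draw_labelled_image <- function(x,
                                row_labels = rownames(x),
                                col_labels = colnames(x),
                                srt = 45) {
  nr <- nrow(x)
  nc <- ncol(x)

  # Widen the left and top margins so the labels have room.
  op <- par(mar = c(2, 6, 6, 2))
  on.exit(par(op))

  # image() expects z with one row per x position, hence t(x):
  # columns of x run along the x-axis, rows of x along the y-axis.
  image(seq_len(nc), seq_len(nr), t(x), axes = FALSE, xlab = "", ylab = "")
  usr <- par("usr")

  # Row labels on the left (side = 2), written horizontally (las = 1).
  axis(side = 2, at = seq_len(nr), labels = row_labels, las = 1, tick = FALSE)

  # Column labels just above the top edge. srt is the rotation in degrees,
  # measured counter-clockwise from horizontal; xpd = TRUE lets text()
  # draw outside the plot region, in the margin.
  text(x = seq_len(nc),
       y = usr[4] + 0.02 * (usr[4] - usr[3]),
       labels = col_labels, srt = srt, adj = c(0, 0), xpd = TRUE)

  # Return the label anchor positions so they can be inspected or reused.
  invisible(list(col_x = seq_len(nc), row_y = seq_len(nr)))
}

# Example with made-up data and labels.
x <- matrix(rnorm(25), 5,
            dimnames = list(paste0("gene", 1:5),
                            c("first", "second", "third", "fourth", "fifth")))
draw_labelled_image(x)
```

A negative srt (for example srt = -45) tilts the labels the other way. Also worth checking ?heatmap.2 in your installed gplots: as far as I know, releases newer than the one discussed in this thread document srtRow/srtCol (and related adjRow/adjCol) arguments for label rotation, which may make the manual text() step unnecessary.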
Re: exonmap/xmapcore error (Crispin Miller) 5. Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! (Elmer Fern?ndez) 6. Re: exonmap/xmapcore error (Crispin Miller) 7. Re: Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! (Sean Davis) 8. ShortRead QA (Alex Gutteridge) 9. Re: Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! (Bazeley, Peter) 10. Re: Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! (Benjamin Otto) 11. Biostrings - vcountPattern optimization (Erik Wright) 12. Re: Biostrings - vcountPattern optimization (Steve Lianoglou) 13. problem about hgu133plus2 annotation (Gina Liao) 14. Re: Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! (Elmer Fern?ndez) 15. Re: problem about hgu133plus2 annotation (Marc Carlson) 16. Re: problem about hgu133plus2 annotation (James W. MacDonald) 17. Re: Biostrings - vcountPattern optimization (Patrick Aboyoun) 18. Re: feature request - pairwiseAlignment() in Biostrings (Patrick Aboyoun) 19. Re: Biostrings - vcountPattern optimization (Erik Wright) 20. Re: feature request - pairwiseAlignment() in Biostrings (Michael Lawrence) 21. Re: Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! (Steve Lianoglou) 22. Re: Biostrings - vcountPattern optimization (Hervé Pagès) 23. Re: Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! (Elmer Fern?ndez) 24. Re: Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! (Sean Davis) 25. the design matrix again (Gordon K Smyth) 26. Open Postdoc Positions (Thomas Girke) 27. Re: htQPCR (Heidi Dvinge) 28. 
Re: Problem with function limmaCtData in HTqPCR package: "leading minor of order 2 is not positive definite" (Heidi Dvinge) 29. building a refseq-based transcriptDb: warnings of interest? (Vincent Carey) ---------------------------------------------------------------------- Message: 1 Date: Thu, 22 Jul 2010 12:18:16 +0200 From: Karl Brand <k.brand@erasmusmc.nl> To: bioconductor at stat.math.ethz.ch Subject: [BioC] heatmap.2 - change column & row locations; angle / rotate Message-ID: <4C481AE8.7060701 at erasmusmc.nl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed <reposting from="" "r-help="" at="" r-project.org"=""> Esteemed BioC user's, I'm struggling to achieve some details of a heatmap using heatmap.2(): 1. Change label locations, for both rows & columns from the default right & bottom, to left and top. Can this be done within heatmap.2()? Or do i need to suppress this default behavior (how) and call a new function to relabel (what) specifying locations? 2. Change the angle of the labels. By default column labels are 90deg anti-clock-wise from horizontal. How to bring them back to horizontal? Or better, rotate 45deg clock-wise from horizontal (ie., rotate 135deg a.clock.wise from default)? Any suggestions or pointers to helpful resources greatly appreciated, Karl -- Karl Brand Department of Genetics Erasmus MC Dr Molewaterplein 50 3015 GE Rotterdam T +31 (0)10 704 3457 |F +31 (0)10 704 4743 |M +31 (0)642 777 268 ------------------------------ Message: 2 Date: Thu, 22 Jul 2010 13:39:46 +0200 From: Jinyan Huang <jhuang.ceph@gmail.com> To: bioconductor at stat.math.ethz.ch Subject: [BioC] In limma, how to set quility weight for each spot. Message-ID: <aanlktilvavdqrbcp-lbfa8pct7sut2vguxmov5l4dzun at="" mail.gmail.com=""> Content-Type: text/plain; charset=ISO-8859-1 Hi all, My data is from GoldenGate Methylation Cancer Panel I. For each spot, there are a p-value for quility. I want to use limma to analysis the data. 
How can I set the quility weight for each spot? From the manual of limma, it can be set by read.maimages. But my data is not import by read.maimages. Thanks. ------------------------------ Message: 3 Date: Thu, 22 Jul 2010 06:02:28 -0600 From: Sean Davis <sdavis2@mail.nih.gov> To: Jinyan Huang <jhuang.ceph at="" gmail.com=""> Cc: bioconductor at stat.math.ethz.ch Subject: Re: [BioC] In limma, how to set quility weight for each spot. Message-ID: <aanlktin2pna5terltx53tlqiw0za5rzqlnltekidc8hd at="" mail.gmail.com=""> Content-Type: text/plain On Thu, Jul 22, 2010 at 5:39 AM, Jinyan Huang <jhuang.ceph at="" gmail.com=""> wrote: > Hi all, > My data is from GoldenGate Methylation Cancer Panel I. For each spot, > there are a p-value for quility. I want to use limma to analysis the > data. How can I set the quility weight for each spot? From the manual > of limma, it can be set by read.maimages. But my data is not import by > read.maimages. > > Hi, Jinyan. You'll want to read the help for lmFit(). Sean [[alternative HTML version deleted]] ------------------------------ Message: 4 Date: Thu, 22 Jul 2010 13:58:04 +0100 From: "Crispin Miller" <cmiller@picr.man.ac.uk> To: "Bioconductor" <bioconductor at="" stat.math.ethz.ch=""> Subject: Re: [BioC] exonmap/xmapcore error Message-ID: <c86dfeec.cc8d%cmiller at="" picr.man.ac.uk=""> Content-Type: text/plain Dear Anupam, Since we published exonmap, we've released a newer package, xmapcore. This focuses on the core database connectivity and has a significant amount of work done behind the API to make certain bits of it much much quicker. We'll put a note in the exonmap vignette to point people to the new package, since it's obviously causing a bit of confusion. One thing that xmapcore does is use a smaller database that's been optimised for some of the queries that were slower in exonmap than we would have liked - this also means that you no longer have to install Ensembl - the xmapcore database, on it's own, will do the job. 
Have a look at the documentation for the xmapcore package (especially INSTALL.pdf) that provides step-by-step installation instructions. As we mention in the exonmap vignette, there were some basic utility functions to help people load and begin to explore exon array data. As you'll see from the vignette, we've not duplicated these in xmapcore. Crispin On 20/07/2010 17:00, "anupam sinha" <anupam.contact at="" gmail.com=""> wrote: > Dear all, > I have been learning to use exonmap/xmapcore from the > tutorial ""Comprehensive analysis of Affymetrix Exon arrays Using > BioConductor" . > But I have run into some problems. I have installed > "xmapcore_homo_sapiens_58" on my system as per instructions . > Do I also have to install ensemble and old exonmap databases? Can > someone help me out ? Thanks in advance for any suggestions. > > >> > library(xmapcore) >> > library(exonmap) > Loading required package: affy > Loading required package: Biobase > > Welcome to Bioconductor > > Vignettes contain introductory material. To view, type > 'openVignette()'. To cite Bioconductor, see > 'citation("Biobase")' and for packages 'citation(pkgname)'. 
> > > Attaching package: 'Biobase' > > The following object(s) are masked from 'package:IRanges': > > updateObject > > Loading required package: genefilter > Loading required package: RColorBrewer > > Attaching package: 'exonmap' > > The following object(s) are masked from 'package:xmapcore': > > exon.details, exon.to.gene, exon.to.probeset, exon.to.transcript, > exonic, exons.in.range, gene.details, gene.to.exon, > gene.to.probeset, gene.to.transcript, genes.in.range, intergenic, > intronic, is.exonic, is.intergenic, is.intronic, probes.in.range, > probeset.to.exon, probeset.to.gene, probeset.to.probe, > probeset.to.transcript, probesets.in.range, symbol.to.gene, > transcript.details, transcript.to.exon, transcript.to.gene, > transcript.to.probeset, transcripts.in.range > > >> > setwd("/home/aragorn/R_Workspace/ExonarraysMCF7andMCF10Adata_cel/") >> > raw.data<-read.exon() >> > raw.data at cdfName<-"exon.pmcdf" >> > x.rma<-rma(raw.data) > Background correcting > Normalizing > Calculating Expression >> > pc.rma<-pc(x.rma,"group",c("a","b")) >> > keep<-(abs(fc(pc.rma))>1)&tt(pc.rma)< 1e-4 >> > sigs<-featureNames(x.rma)[keep] >> > xmapConnect() > Select a database to connect to: > > 1: Hman ('xmapcore_homo_sapiens_58') > > Selection: 1 > password: > Warning message: > In .xmap.load.config() : > Environment 'R_XMAP_CONF_DIR' not set. Please refer to INSTALL.TXT for > information on how to set this up. > > Trying '.exonmap'. > >> > probeset.to.exon(sigs[1:5]) > *Error in mysqlExecStatement(conn, statement, ...) : > RS-DBI driver: (could not run statement: PROCEDURE > xmapcore_homo_sapiens_58.xmap_probesetToExon does not exist)* >> > xmapConnect() > Select a database to connect to: > > 1: Hman ('xmapcore_homo_sapiens_58') > > Selection: 1 > >> > probeset.to.exon(sigs[1:5]) > Error in mysqlExecStatement(conn, statement, ...) 
: > RS-DBI driver: (could not run statement: PROCEDURE > xmapcore_homo_sapiens_58.xmap_probesetToExon does not exist) > >> > xmap.connect() > password: > Disconnecting from xmapcore_homo_sapiens_58 (localhost) > Connected to xmapcore_homo_sapiens_58 (localhost) > Selected array 'HuEx-1_0' as a default. >> > probeset.to.exon(sigs[1:5]) > *Error in mysqlExecStatement(conn, statement, ...) : > RS-DBI driver: (could not run statement: PROCEDURE > xmapcore_homo_sapiens_58.xmap_probesetToExon does not exist)* >> > sessionInfo() > R version 2.11.0 (2010-04-22) > x86_64-redhat-linux-gnu > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] exon.pmcdf_1.1 exonmap_2.6.0 RColorBrewer_1.0-2 > genefilter_1.30.0 > [5] affy_1.26.1 Biobase_2.8.0 xmapcore_1.2.5 > digest_0.4.2 > [9] IRanges_1.6.8 RMySQL_0.7-4 DBI_0.2-5 > > loaded via a namespace (and not attached): > [1] affyio_1.16.0 annotate_1.26.1 AnnotationDbi_1.10.2 > [4] preprocessCore_1.10.0 RSQLite_0.9-1 splines_2.11.0 > [7] survival_2.35-8 tcltk_2.11.0 tools_2.11.0 > [10] xtable_1.5-6 > > Regards, > > Anupam > -- > Graduate Student, > Center For DNA Fingerprinting And Diagnostics, > 4-1-714 to 725/2, Tuljaguda complex > Mozamzahi Road, Nampally, > Hyderabad-500001 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > -------------------------------------------------------- This email is confidential and intended solely for the u...{{dropped:15}} 
------------------------------ Message: 5 Date: Thu, 22 Jul 2010 10:05:39 -0300 From: Elmer Fern?ndez <elmerfer@gmail.com> To: Bioconductor mailing list <bioconductor at="" stat.math.ethz.ch=""> Subject: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! Message-ID: <aanlktilqksufwajtt9skcscav0dutqie7il2mmwxqdyp at="" mail.gmail.com=""> Content-Type: text/plain Dear Users I'm working with the heatmap.2 function and I realize that if you use the scale input paramenter gives different results than usign the scale function outsie and feed the heatmap.2 fucntion with the scaled matrix. I attached the results of the two approaches and the used data matrix (M.csv). SO, what I'm doing wrong? R Code library(gplots) M=matrix(c(rnorm(10*3,1,2),rnorm(10*2,-0.5,1)),ncol=5) heatmap.2(M,scale="column",trace="none",main="scaled inside") x11();heatmap.2(scale(M),scale="none",trace="none",main="scaled outside") > sessionInfo() R version 2.10.0 (2009-10-26) x86_64-unknown-linux-gnu locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8 [9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8 LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] gplots_2.7.4 caTools_1.10 bitops_1.0-4.1 gdata_2.7.1 gtools_2.6.1 rkward_0.5.1 loaded via a namespace (and not attached): [1] tools_2.10.0 -- Elmer A. Fern?ndez (Bioing. PhD) Investigador Asistente CONICET - Research Assistant CONICET Prof. Inteligencia Artificial -UCC - Prof. 
Artificial Intelligence @ UCC tel: +54-(0)351-4938000 int 145 Fax: +54-(0)351-4938081 web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15 http://sites.google.com/site/biologicaldatamininggroup/Home/ mail address: Camino Alta Gracia Km 7.1/2- C?rdoba-5017-Argentina -- Elmer A. Fern?ndez (Bioing. PhD) Investigador Asistente CONICET - Research Assistant CONICET Prof. Inteligencia Artificial -UCC - Prof. Artificial Intelligence @ UCC tel: +54-(0)351-4938000 int 145 Fax: +54-(0)351-4938081 web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15 http://sites.google.com/site/biologicaldatamininggroup/Home/ mail address: Camino Alta Gracia Km 7.1/2- C?rdoba-5017-Argentina [[alternative HTML version deleted]] ------------------------------ Message: 6 Date: Thu, 22 Jul 2010 14:09:55 +0100 From: "Crispin Miller" <cmiller@picr.man.ac.uk> To: "Bioconductor" <bioconductor at="" stat.math.ethz.ch=""> Subject: Re: [BioC] exonmap/xmapcore error Message-ID: <c86e01b3.cc91%cmiller at="" picr.man.ac.uk=""> Content-Type: text/plain Hi Paul, Hopefully it's simpler now - with xmapcore, you need to install just the xmapcore database into a working MySQL instance (and the package itself, of course). There's also a pretty detailed walk through in the INSTALL.pdf document that forms part of the xmapcore package. Crispin > > Yeah originally, they did a pretty poor job at describing how to do > that, it was the largest impediment to otherwise using a very nice > package. They threw you to the wolves by pointing to a section that > describes how to entire the whole ensemble DB and web interface. I > notice they have the new xmapcore database , are those the ones you are > using?: > > http://xmap.picr.man.ac.uk/download/index#hsxmapcore > > I have NOT used those > > but at least in the beginning of the year , You only need SQL to > install ,you do not need to install ensemble , just the "core" data > base. 
> As I recall you need to go into the SQl and get create the database > then you need to run the script that makes the tables. > Then these are filled (but a second script, cat's recall) > > my notes indicate I also inatall exon.pmcdf: (in above web link) > R CMD INSTALL --clean exon.pmcdf_1.1.tar.gz > > > > you may need to run something like this on the command line first to > start the service: > > mysql -h host_computer -u xmap -pPassword ## where the host_compueter is > where the db is and Password is the password) > > then in R > > xmapConnect("human") > > > ################## > In my home directory there is a .exnmap file with: > a file database.txt attached > > and a subfolder db.local that has > a file starts.core.homo_sapiens_core_56_37a.R a larget 3.7Mb file > > and in bashrc: > export XMAP_BRIDGE_CACHE=/home/pleo/.xmb_cache > ####### > > I think now with the new core database you might be better off using > documentation in the latest exonmap or xmapcore libraries than that original > manuscript. They have made some changes. > > Hope that helps > Paul > > > > -----Original Message----- > From: anupam sinha <anupam.contact at="" gmail.com=""> > To: bioc <bioconductor at="" stat.math.ethz.ch=""> > Subject: [BioC] exonmap/xmapcore error > Date: Tue, 20 Jul 2010 21:30:24 +0530 > > > Dear all, > I have been learning to use exonmap/xmapcore from the > tutorial ""Comprehensive analysis of Affymetrix Exon arrays Using > BioConductor" . > But I have run into some problems. I have installed > "xmapcore_homo_sapiens_58" on my system as per instructions . > Do I also have to install ensemble and old exonmap databases? Can > someone help me out ? Thanks in advance for any suggestions. > > >> > library(xmapcore) >> > library(exonmap) > Loading required package: affy > Loading required package: Biobase > > Welcome to Bioconductor > > Vignettes contain introductory material. To view, type > 'openVignette()'. 
To cite Bioconductor, see > 'citation("Biobase")' and for packages 'citation(pkgname)'. > > > Attaching package: 'Biobase' > > The following object(s) are masked from 'package:IRanges': > > updateObject > > Loading required package: genefilter > Loading required package: RColorBrewer > > Attaching package: 'exonmap' > > The following object(s) are masked from 'package:xmapcore': > > exon.details, exon.to.gene, exon.to.probeset, exon.to.transcript, > exonic, exons.in.range, gene.details, gene.to.exon, > gene.to.probeset, gene.to.transcript, genes.in.range, intergenic, > intronic, is.exonic, is.intergenic, is.intronic, probes.in.range, > probeset.to.exon, probeset.to.gene, probeset.to.probe, > probeset.to.transcript, probesets.in.range, symbol.to.gene, > transcript.details, transcript.to.exon, transcript.to.gene, > transcript.to.probeset, transcripts.in.range > > >> > setwd("/home/aragorn/R_Workspace/ExonarraysMCF7andMCF10Adata_cel/") >> > raw.data<-read.exon() >> > raw.data at cdfName<-"exon.pmcdf" >> > x.rma<-rma(raw.data) > Background correcting > Normalizing > Calculating Expression >> > pc.rma<-pc(x.rma,"group",c("a","b")) >> > keep<-(abs(fc(pc.rma))>1)&tt(pc.rma)< 1e-4 >> > sigs<-featureNames(x.rma)[keep] >> > xmapConnect() > Select a database to connect to: > > 1: Hman ('xmapcore_homo_sapiens_58') > > Selection: 1 > password: > Warning message: > In .xmap.load.config() : > Environment 'R_XMAP_CONF_DIR' not set. Please refer to INSTALL.TXT for > information on how to set this up. > > Trying '.exonmap'. > >> > probeset.to.exon(sigs[1:5]) > *Error in mysqlExecStatement(conn, statement, ...) : > RS-DBI driver: (could not run statement: PROCEDURE > xmapcore_homo_sapiens_58.xmap_probesetToExon does not exist)* >> > xmapConnect() > Select a database to connect to: > > 1: Hman ('xmapcore_homo_sapiens_58') > > Selection: 1 > >> > probeset.to.exon(sigs[1:5]) > Error in mysqlExecStatement(conn, statement, ...) 
: > RS-DBI driver: (could not run statement: PROCEDURE > xmapcore_homo_sapiens_58.xmap_probesetToExon does not exist) > >> > xmap.connect() > password: > Disconnecting from xmapcore_homo_sapiens_58 (localhost) > Connected to xmapcore_homo_sapiens_58 (localhost) > Selected array 'HuEx-1_0' as a default. >> > probeset.to.exon(sigs[1:5]) > *Error in mysqlExecStatement(conn, statement, ...) : > RS-DBI driver: (could not run statement: PROCEDURE > xmapcore_homo_sapiens_58.xmap_probesetToExon does not exist)* >> > sessionInfo() > R version 2.11.0 (2010-04-22) > x86_64-redhat-linux-gnu > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] exon.pmcdf_1.1 exonmap_2.6.0 RColorBrewer_1.0-2 > genefilter_1.30.0 > [5] affy_1.26.1 Biobase_2.8.0 xmapcore_1.2.5 > digest_0.4.2 > [9] IRanges_1.6.8 RMySQL_0.7-4 DBI_0.2-5 > > loaded via a namespace (and not attached): > [1] affyio_1.16.0 annotate_1.26.1 AnnotationDbi_1.10.2 > [4] preprocessCore_1.10.0 RSQLite_0.9-1 splines_2.11.0 > [7] survival_2.35-8 tcltk_2.11.0 tools_2.11.0 > [10] xtable_1.5-6 > > Regards, > > Anupam > -- > Graduate Student, > Center For DNA Fingerprinting And Diagnostics, > 4-1-714 to 725/2, Tuljaguda complex > Mozamzahi Road, Nampally, > Hyderabad-500001 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > -------------------------------------------------------- This email is confidential and intended solely for the u...{{dropped:15}} 
------------------------------ Message: 7 Date: Thu, 22 Jul 2010 08:17:21 -0600 From: Sean Davis <sdavis2@mail.nih.gov> To: Elmer Fern?ndez <elmerfer at="" gmail.com=""> Cc: Bioconductor mailing list <bioconductor at="" stat.math.ethz.ch=""> Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! Message-ID: <aanlktimzp4hrxsuyyokxjgrs7ajwfhgvg1nyrnfazpyd at="" mail.gmail.com=""> Content-Type: text/plain 2010/7/22 Elmer Fern??ndez <elmerfer at="" gmail.com=""> > Dear Users > I'm working with the heatmap.2 function and I realize that if you use the > scale input paramenter gives different results than usign the scale > function > outsie and feed the heatmap.2 fucntion with the scaled matrix. I attached > the results of the two approaches and the used data matrix (M.csv). > SO, what I'm doing wrong? > > Hi, Elmer. The default distance function used by heatmap.2 is dist() which is not going to be invariant under centering and scaling, I don't think. It looks like you are using that default. Sean > R Code > > library(gplots) > M=matrix(c(rnorm(10*3,1,2),rnorm(10*2,-0.5,1)),ncol=5) > heatmap.2(M,scale="column",trace="none",main="scaled inside") > x11();heatmap.2(scale(M),scale="none",trace="none",main="scaled outside") > > > sessionInfo() > R version 2.10.0 (2009-10-26) > x86_64-unknown-linux-gnu > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8 > [9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8 > LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8 > > attached base packages: > [1] grid stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] gplots_2.7.4 caTools_1.10 bitops_1.0-4.1 gdata_2.7.1 > gtools_2.6.1 rkward_0.5.1 > > loaded via a namespace (and not attached): > [1] tools_2.10.0 > > > -- > Elmer A. 
Fern??ndez (Bioing. PhD) > Investigador Asistente CONICET - Research Assistant CONICET > Prof. Inteligencia Artificial -UCC - Prof. Artificial Intelligence @ UCC > tel: +54-(0)351-4938000 int 145 > Fax: +54-(0)351-4938081 > web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15 > http://sites.google.com/site/biologicaldatamininggroup/Home/ > mail address: Camino Alta Gracia Km 7.1/2- C??rdoba-5017-Argentina > > > > -- > Elmer A. Fern??ndez (Bioing. PhD) > Investigador Asistente CONICET - Research Assistant CONICET > Prof. Inteligencia Artificial -UCC - Prof. Artificial Intelligence @ UCC > tel: +54-(0)351-4938000 int 145 > Fax: +54-(0)351-4938081 > web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15 > http://sites.google.com/site/biologicaldatamininggroup/Home/ > mail address: Camino Alta Gracia Km 7.1/2- C??rdoba-5017-Argentina > > [[alternative HTML version deleted]] > > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]] ------------------------------ Message: 8 Date: Thu, 22 Jul 2010 15:26:21 +0100 From: Alex Gutteridge <alexg@ruggedtextile.com> To: <bioconductor at="" stat.math.ethz.ch=""> Subject: [BioC] ShortRead QA Message-ID: <da36088b4e3acb477e837c2e970fd5a9 at="" ruggedtextile.com=""> Content-Type: text/plain; charset=UTF-8 I'm dealing with some Solexa/Illumina data with ShortRead for the first time and had a couple of questions relating to QA: 1. Memory requirements: My data comprises 7 s_N_export.txt files. Each one comprises 10-20 million aligned reads. If I try to run qa() over the whole directory my machine rapidly grinds to a halt. Tackling each file individually keeps my machine running, but takes >1 hour for each one. 
The ShortRead vignette says evaluating a single lane can take 'several minutes', so I'm wondering if anyone can offer any clues as to why I'm struggling so much? The machine in question has 6GB of RAM - do I just need more? 2. Read distribution: The QA results I'm getting for the 'read distribution' section don't quite look like those presented in the example ShortRead Solexa QA report. My interpretation is that this is because my data is actually rather high quality, but I'd appreciate a second opinion. To quote from the ShortRead QA report: 'Ideally, the cumulative proportion of reads will transition sharply from low to high. Portions to the left of the transition might correspond roughly to sequencing or sample processing errors, and correspond to reads that are represented relatively infrequently [...]. Portions to the right of the transition represent reads that are over-represented compared to expectation.' Typically the read distribution plots I'm seeing look like this: http://dl.dropbox.com/u/419878/readOccurences.jpg There is a sharp transition, but no portion to the left. I interpret this as a good sign: most of the reads are seen a small number of times (<10), and there are relatively few over-represented reads. Is there anything there that would worry more experienced heads? -- Alex Gutteridge ------------------------------ Message: 9 Date: Thu, 22 Jul 2010 14:25:54 +0000 From: "Bazeley, Peter" <peter.bazeley@rockets.utoledo.edu> To: Elmer Fern?ndez <elmerfer at="" gmail.com=""> Cc: Sean Davis <sdavis2 at="" mail.nih.gov="">, Bioconductor mailing list <bioconductor at="" stat.math.ethz.ch=""> Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! 
Message-ID: <5C621FDF7E426B4AAE3B2364B7EF07371F654407 at BL2PRD0103MB050.prod.exchangelabs.com> Content-Type: text/plain; charset="iso-8859-1" Hi Elmer, The default scale option in heatmap.2 scales by row, whereas the scale() function scales by column, so this is probably why there is a difference. I think whichever dimension contains unique samples is how you want to scale (if this was expression data, for example). Pete ________________________________________ From: bioconductor-bounces@stat.math.ethz.ch [bioconductor- bounces@stat.math.ethz.ch] on behalf of Sean Davis [sdavis2@mail.nih.gov] Sent: Thursday, July 22, 2010 9:17 AM To: Elmer Fern?ndez Cc: Bioconductor mailing list Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!! 2010/7/22 Elmer Fern?ndez <elmerfer at="" gmail.com=""> > Dear Users > I'm working with the heatmap.2 function and I realize that if you use the > scale input paramenter gives different results than usign the scale > function > outsie and feed the heatmap.2 fucntion with the scaled matrix. I attached > the results of the two approaches and the used data matrix (M.csv). > SO, what I'm doing wrong? > > Hi, Elmer. The default distance function used by heatmap.2 is dist() which is not going to be invariant under centering and scaling, I don't think. It looks like you are using that default. 
Sean

> R Code
>
> library(gplots)
> M=matrix(c(rnorm(10*3,1,2),rnorm(10*2,-0.5,1)),ncol=5)
> heatmap.2(M,scale="column",trace="none",main="scaled inside")
> x11();heatmap.2(scale(M),scale="none",trace="none",main="scaled outside")
>
> > sessionInfo()
> R version 2.10.0 (2009-10-26)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
> [9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8 LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
>
> attached base packages:
> [1] grid stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] gplots_2.7.4 caTools_1.10 bitops_1.0-4.1 gdata_2.7.1 gtools_2.6.1 rkward_0.5.1
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.0
>
> --
> Elmer A. Fernández (Bioing. PhD)
> Investigador Asistente CONICET - Research Assistant CONICET
> Prof. Inteligencia Artificial -UCC - Prof. Artificial Intelligence @ UCC
> tel: +54-(0)351-4938000 int 145
> Fax: +54-(0)351-4938081
> web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15
> http://sites.google.com/site/biologicaldatamininggroup/Home/
> mail address: Camino Alta Gracia Km 7.1/2- Córdoba-5017-Argentina
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor

------------------------------

Message: 10
Date: Thu, 22 Jul 2010 16:38:16 +0200
From: Benjamin Otto <b.otto@uke.uni-hamburg.de>
To: "Bazeley, Peter" <peter.bazeley at rockets.utoledo.edu>
Cc: Sean Davis <sdavis2 at mail.nih.gov>, Bioconductor mailing list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!!
Message-ID: <61679366-2C04-4959-8D3D-997A45BF45F5 at uke.uni-hamburg.de>
Content-Type: text/plain; charset="utf-8"

Hi Guys,

do note that the scale option in heatmap doesn't scale your values until AFTER clustering, and only for visualization purposes! So if you provide already scaled data, you should naturally expect a different result.

cheers

Benjamin

___________________________________________
Benjamin Otto, PhD
University Medical Center Hamburg-Eppendorf
Institute For Clinical Chemistry / Central Laboratories
Campus Forschung N27
Martinistr. 52,
D-20246 Hamburg

Tel.: +49 40 7410 51908
Fax.: +49 40 7410 54971
___________________________________________

------------------------------

Message: 11
Date: Thu, 22 Jul 2010 10:54:28 -0500
From: Erik Wright <eswright@wisc.edu>
To: BioC list <bioconductor at stat.math.ethz.ch>
Subject: [BioC] Biostrings - vcountPattern optimization
Message-ID: <3E19C211-BA75-4C68-88DE-1079FE64CAB0 at wisc.edu>
Content-Type: text/plain; CHARSET=US-ASCII

Hello,

Lately I have been working on counting sequence fragments in larger sets of sequences. I am searching for thousands of fragments of 30 to 130 bases in hundreds of thousands of sequences between 1200 and 1600 bases. Currently I am using the following method to count the number of "hits":

#### start ####
library(Biostrings)
fragments <- DNAStringSet(c("ACTG","AAAA"))
sequence_set <- DNAStringSet(c("TAGACATGAC","ACCAAATGAC"))

for (i in 1:length(fragments)) {
    counts <- vcountPattern(fragments[[i]],
        sequence_set,
        max.mismatch=1)
    hits <- length(which(counts > 0))
    print(hits)
}
#### end ####

This method is taking a long time to complete, so I am wondering if I am doing this in the most efficient manner? Does anyone have a suggestion for how I can accomplish the same task more efficiently?
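To make explicit what the loop above counts, here is a dependency-free sketch in base R. It is not how Biostrings works internally and it is not a faster replacement: the helper names `hamming_hit` and `count_hits` are hypothetical, plain character strings stand in for DNAStringSet objects, only in-bounds windows are considered, and Biostrings' own handling of matches near the sequence ends may differ.

```r
# Hypothetical helper: TRUE if `fragment` occurs somewhere in `sequence` with
# at most `max_mismatch` substitutions (no indels), in-bounds windows only.
hamming_hit <- function(fragment, sequence, max_mismatch = 1) {
  f <- strsplit(fragment, "")[[1]]
  s <- strsplit(sequence, "")[[1]]
  n_windows <- length(s) - length(f) + 1
  if (n_windows < 1) return(FALSE)
  for (start in seq_len(n_windows)) {
    window <- s[start:(start + length(f) - 1)]
    if (sum(window != f) <= max_mismatch) return(TRUE)
  }
  FALSE
}

# For each fragment, the number of sequences containing it at least once
# (the quantity printed as "hits" in the loop above).
count_hits <- function(fragments, sequences, max_mismatch = 1) {
  vapply(fragments, function(f) {
    sum(vapply(sequences, hamming_hit, logical(1),
               fragment = f, max_mismatch = max_mismatch))
  }, integer(1))
}

fragments    <- c("ACTG", "AAAA")
sequence_set <- c("TAGACATGAC", "ACCAAATGAC")
hits <- count_hits(fragments, sequence_set)
print(hits)
```

The sketch collects the per-fragment counts in one vector instead of printing them one by one, which is usually more convenient for later processing.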
Thanks!,
Erik

> sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Biostrings_2.16.0 IRanges_1.6.0

loaded via a namespace (and not attached):
[1] Biobase_2.8.0

------------------------------

Message: 12
Date: Thu, 22 Jul 2010 12:19:21 -0400
From: Steve Lianoglou <mailinglist.honeypot@gmail.com>
To: Erik Wright <eswright at wisc.edu>
Cc: BioC list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Biostrings - vcountPattern optimization
Message-ID: <aanlktil5przsipsdxng8fszyvci5rcdqn5zgahy8rswa at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

On Thu, Jul 22, 2010 at 11:54 AM, Erik Wright <eswright at wisc.edu> wrote:
> Hello,
>
> Lately I have been working on counting sequence fragments in larger sets of sequences. I am searching for thousands of fragments of 30 to 130 bases in hundreds of thousands of sequences between 1200 and 1600 bases. Currently I am using the following method to count the number of "hits":

Would using bowtie as an intermediary be an option?

For instance, you could consider:

(i) making a bowtie-index out of your 1200-1600 bp "references"
(ii) aligning your 30-130bp fragments against it and output to SAM format (give each read a unique id so you can hunt for it later)
(iii) convert SAM -> indexed BAM
(iv) process bam file w/ Rsamtools -- perhaps you could simply do a `table()` on the sequence IDs of each alignment if all you want is a count -- of course now that the sequences are aligned, the data is in "good shape" to do other types of analyses as well (whatever it is that you're doing).

> #### start ####
> library(Biostrings)
> fragments <- DNAStringSet(c("ACTG","AAAA"))
> sequence_set <- DNAStringSet(c("TAGACATGAC","ACCAAATGAC"))
>
> for (i in 1:length(fragments)) {
>     counts <- vcountPattern(fragments[[i]],
>         sequence_set,
>         max.mismatch=1)
>     hits <- length(which(counts > 0))
>     print(hits)
> }
> #### end ####
>
> This method is taking a long time to complete, so I am wondering if I am doing this in the most efficient manner? Does anyone have a suggestion for how I can accomplish the same task more efficiently?

I don't really have any suggestions to make the above R code run faster ... sorry.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

------------------------------

Message: 13
Date: Thu, 22 Jul 2010 17:11:26 +0800
From: Gina Liao <yi713@hotmail.com>
To: <bioconductor at stat.math.ethz.ch>
Subject: [BioC] problem about hgu133plus2 annotation
Message-ID: <bay146-w70ae25532ad94d7bd9116eaa20 at phx.gbl>
Content-Type: text/plain

Dear All,

I have 20 chips, and I used R to standardize the CEL files. Then I got the expression values for all chips. I also downloaded the annotation in CSV format from NetAffx (HG-U133_Plus_2 Annotations, CSV format, Release 30 (22 MB, 11/15/09)).

Here's my code.

########
test = justRMA()
eset.st = standardise(test)
exprs.st = exprs(eset.st)
e.out = exprs.st
dim(e.out) #* 54675 20
########

However, I found out that the order of rownames(e.out) is a little different from the row names of hgu133plus2.csv. The rows from 54630 to 54640 are not in the same order in the two.

They should be the same, right? Is "hgu133plus2cdf" the problem? How could I solve it?

Thanks!!!!!
Best,
Gina

------------------------------

Message: 14
Date: Thu, 22 Jul 2010 13:34:28 -0300
From: Elmer Fernández <elmerfer@gmail.com>
To: Benjamin Otto <b.otto at uke.uni-hamburg.de>
Cc: Sean Davis <sdavis2 at mail.nih.gov>, Bioconductor mailing list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!!
Message-ID: <aanlktindagcqq5capzkk6lteypeu4kr0bymue9sju_jp at mail.gmail.com>
Content-Type: text/plain

Hi Benjamin,

Are you sure about that? If so, I think that it is not correct, right?

best
Elmer

2010/7/22 Benjamin Otto <b.otto at uke.uni-hamburg.de>

> Hi Guys,
>
> do note that the scale() function in heatmap doesn't scale your values till
> AFTER clustering for visualization purpose! So if you provide already scaled
> data, you naturally will expect a different result.
>
> cheers
>
> Benjamin

--
Elmer A. Fernández (Bioing. PhD)
Investigador Asistente CONICET - Research Assistant CONICET
Prof. Inteligencia Artificial -UCC - Prof. Artificial Intelligence @ UCC
tel: +54-(0)351-4938000 int 145
Fax: +54-(0)351-4938081
web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15
http://sites.google.com/site/biologicaldatamininggroup/Home/
mail address: Camino Alta Gracia Km 7.1/2- Córdoba-5017-Argentina

------------------------------

Message: 15
Date: Thu, 22 Jul 2010 09:38:19 -0700
From: Marc Carlson <mcarlson@fhcrc.org>
To: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] problem about hgu133plus2 annotation
Message-ID: <4C4873FB.5030207 at fhcrc.org>
Content-Type: text/plain; charset=ISO-8859-1

Hi Gina,

I am afraid it's a little hard to tell what is going on here. For example, I don't see sessionInfo() so it is hard to tell what you were running. And I only have enough code to wildly speculate about what you were doing. You might want to see our posting guide here:

http://www.bioconductor.org/docs/postingGuide.html

Marc

------------------------------

Message: 16
Date: Thu, 22 Jul 2010 12:41:42 -0400
From: "James W. MacDonald" <jmacdon@med.umich.edu>
To: Gina Liao <yi713 at hotmail.com>
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] problem about hgu133plus2 annotation
Message-ID: <4C4874C6.9090008 at med.umich.edu>
Content-Type: text/plain; charset="iso-8859-1"; format="flowed"

Hi Gina,

On 7/22/2010 5:11 AM, Gina Liao wrote:
> However, i found out that the order of the rownames(e.out) is a little different to the row name of hgu133plus2.csv. The order from 54630 to 54640 is not the same to these two rows.
> They should be the same,right? Is "hgu133plus2cdf" the problem? How could I solve it?
I would recommend you use the annotation packages that are available from Bioconductor rather than downloading the annotation files from Affymetrix. The BioC annotation packages contain the same information, but are designed to be easily used from within R, and you will find the .csv files you can get from Affy are not as user-friendly.

You can get the annotation package using biocLite():

biocLite("hgu133plus2.db")

Note that there is no reason to expect that the order of annotation data will be the same as the order of expression data. Re-ordering things is exceedingly simple in R, so this point is irrelevant.

Using the annotation packages will take some reading on your part, but once you get the hang of things, I think you will like how they work. You might start with

library(hgu133plus2.db)
?hgu133plus2.db

as well as

openVignette()

and choose the AnnotationDbi vignette.

If you are interested in annotating the set of interesting genes from your experiment, you will want to look at the annaffy package, which will allow you to output both HTML and text files with your results and annotations for each gene.

In addition, you might want to look at the affycoretools package, which helps automate some of the steps required to annotate results. This package is also integrated with limma, so you can go straight from your linear model fits to output in one function call.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826

------------------------------

Message: 17
Date: Thu, 22 Jul 2010 10:11:28 -0700
From: Patrick Aboyoun <paboyoun@fhcrc.org>
To: Erik Wright <eswright at wisc.edu>
Cc: BioC list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Biostrings - vcountPattern optimization
Message-ID: <4C487BC0.6010309 at fhcrc.org>
Content-Type: text/plain; charset=windows-1252; format=flowed

Erik,

Have you tried vcountPDict? It will use an Aho-Corasick matching algorithm (http://en.wikipedia.org/wiki/Aho–Corasick_string_matching_algorithm) that is pretty fast, albeit memory intensive.

library(Biostrings)
fragments <- DNAStringSet(c("ACTG","AAAA"))
sequence_set <- DNAStringSet(c("TAGACATGAC","ACCAAATGAC"))
pdict <- PDict(fragments)
counts <- vcountPDict(pdict, sequence_set)

> counts
     [,1] [,2]
[1,]    0    0
[2,]    0    0

> sessionInfo()
R version 2.12.0 Under development (unstable) (2010-07-18 r52554)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Biostrings_2.17.26 IRanges_1.7.13

loaded via a namespace (and not attached):
[1] Biobase_2.9.0 tools_2.12.0

Patrick

------------------------------

Message: 18
Date: Thu, 22 Jul 2010 10:26:48 -0700
From: Patrick Aboyoun <paboyoun@fhcrc.org>
To: "Coghlan, Avril" <a.coghlan at ucc.ie>
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] feature request - pairwiseAlignment() in Biostrings
Message-ID: <4C487F58.1060305 at fhcrc.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Avril,

I won't have time to extend pairwiseAlignment, but you are more than welcome to. It is written mainly in C with an R wrapper. You can grab it via svn at the URL

https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/Biostrings

with username: readonly and password: readonly.
The particular files you'll want to look at are

https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/Biostrings/src/align_pairwiseAlignment.c
https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/Biostrings/R/pairwiseAlignment.R

I can provide you with a code walkthrough if you like. Since I optimized the code for speed and memory usage, you may find it is easier to write your own C level function that will be used instead of the code I have, since I don't keep enough information around to be able to select the top X alignments.

Cheers,
Patrick

On 7/22/10 1:54 AM, Coghlan, Avril wrote:
> Dear Patrick and Steve,
>
> I am wondering whether it would be possible to add an option to the
> pairwiseAlignment() function in Biostrings, so that it could print out:
> (i) all the top-scoring alignments for 2 sequences, if there are more
> than one equally scoring top-scoring alignments ?
> (ii) the top X top-scoring alignments for 2 sequences, where the user
> specifies the number X, and where the X alignments don't have to have
> equal scores, but are ordered by decreasing score ?
>
> I'm not sure if these options are easy to add, but would be very useful
> if you could add them.
>
> If you haven't time to do this, I would be willing to try to help add
> the features to the pairwiseAlignment() function, if you can point me
> towards the code.
>
> Kind Regards,
> Avril
>
> Avril Coghlan
> University College Cork
> Ireland

------------------------------

Message: 19
Date: Thu, 22 Jul 2010 12:32:39 -0500
From: Erik Wright <eswright@wisc.edu>
To: Patrick Aboyoun <paboyoun at fhcrc.org>
Cc: BioC list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Biostrings - vcountPattern optimization
Message-ID: <fbde47f7-a49a-4d50-93bb-0ae8d9097da7 at wisc.edu>
Content-Type: text/plain; charset=windows-1252

Hi Patrick,

Thanks, this looks promising. Three possible complications are:

(1) The fragments are not all the same width. Is this possible with PDict?
(2) I allow a variable number of mismatches based on each individual fragment's width.
(3) The fragments sometimes include ambiguity letters (IUPAC extended letters).

A more accurate example would be:

#### start ####
library(Biostrings)
fragments <- DNAStringSet(c("ACS","NCCAGAA")) # no indels
sequence_set <- DNAStringSet(c("ATAGCATACKACCA","GATTACGTACCADADATTACA")) # variable widths

for (i in 1:length(fragments)) {
    counts <- vcountPattern(fragments[[i]],
        sequence_set,
        max.mismatch=floor(length(fragments[[i]])/5)) # variable mis-matches
    hits <- length(which(counts > 0))
    print(hits)
}
#### end ####

Do you think it is possible to make this work with PDict for a speed improvement?

Thanks again!,
Erik

------------------------------

Message: 20
Date: Thu, 22 Jul 2010 11:10:03 -0700
From: Michael Lawrence <lawrence.michael@gene.com>
To: Patrick Aboyoun <paboyoun at fhcrc.org>
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] feature request - pairwiseAlignment() in Biostrings
Message-ID: <aanlktik92of_5a3jhph8p_bwalpmt9yq2brvscd7b2lz at mail.gmail.com>
Content-Type: text/plain

Toughest question is probably not how to modify the C code, but how the results will be represented and manipulated in R.
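As a purely illustrative sketch of that representation question, one option would be a plain data frame with one row per candidate alignment. Nothing here comes from Biostrings: the column names, the made-up strings and scores, and the helper `top_alignments` are all assumptions for the sake of the example.

```r
# Purely illustrative: one possible way to hold several candidate alignments
# in R. Column names and values are invented for this sketch and are not
# produced by pairwiseAlignment().
alignments <- data.frame(
  pattern = c("ACC-GT", "ACCG-T", "AC-CGT"),
  subject = c("ACCAGT", "ACCGAT", "ACTCGT"),
  score   = c(12, 12, 9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the top `n` alignments, ordered by decreasing score
top_alignments <- function(x, n) {
  ordered <- x[order(x$score, decreasing = TRUE), , drop = FALSE]
  head(ordered, n)
}

# Request (i) in the thread: all alignments tied for the best score
best <- alignments[alignments$score == max(alignments$score), , drop = FALSE]
print(best)

# Request (ii) in the thread: the top X alignments by decreasing score (X = 2)
print(top_alignments(alignments, 2))
```

A dedicated S4 class would fit the rest of Biostrings better than a data frame, but the data frame makes the sorting and filtering operations easy to see.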
Good luck On Thu, Jul 22, 2010 at 10:26 AM, Patrick Aboyoun <paboyoun at="" fhcrc.org="">wrote: > Avril, > I wont have time to extend pairwiseAlignment, but you are more then welcome > to. It is written mainly in C with an R wrapper. You can grab it via svn at > the URL > > https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/Biostrings > > with username: readonly and password: readonly. > > The particular files you'll want to look at are > > > https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/Biostrin gs/src/align_pairwiseAlignment.c > > https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/Biostrin gs/R/pairwiseAlignment.R > > I can provide you with a code walkthrough if you like. Since I optimized > the code for speed and memory usage, you may find it is easier to write your > own C level function that will be used instead of the code I have since I > don't keep enough information around to be able to select the top X > alignments. > > > Cheers, > > Patrick > > > > On 7/22/10 1:54 AM, Coghlan, Avril wrote: > >> Dear Patrick and Steve, >> >> I am wondering whether it would be possible to add an option to the >> pairwiseAlignment() function in Biostrings, so that it could print out: >> (i) all the top-scoring alignments for 2 sequences, if there are more >> than one equally scoring top-scoring alignments ? >> (ii) the top X top-scoring alignments for 2 sequences, where the user >> specifies the number X, and where the X alignments don't have to have >> equal scores, but are ordered by decreasing score ? >> >> I'm not sure if these options are easy to add, but would be very useful >> if you could add them. >> >> If you haven't time to do this, I would be willing to try to help add >> the features to the pairwiseAlignment() function, if you can point me >> towards the code. 
>>
>> Kind Regards,
>> Avril
>>
>> Avril Coghlan
>> University College Cork
>> Ireland

[[alternative HTML version deleted]]

------------------------------

Message: 21
Date: Thu, 22 Jul 2010 16:04:06 -0400
From: Steve Lianoglou <mailinglist.honeypot@gmail.com>
To: Elmer Fernández <elmerfer at gmail.com>
Cc: Sean Davis <sdavis2 at mail.nih.gov>, Bioconductor mailing list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!!
Message-ID: <aanlktikape0f5juvye1tioilcfi_inwnha6vmn5drhok at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

2010/7/22 Elmer Fernández <elmerfer at gmail.com>:
> Hi Benjamin
> Are you sure about that?

Looking at the source code for heatmap.2 (and heatmap, for that matter) it looks as if Benjamin is correct. The scaling is done after the clustering.

> If so, I think that it is not correct, right?

I guess it depends on what you were expecting it to do :-)

Having just realized this myself (yikes -- see what happens when we assume(?)), I think I'd more often rather send in a scaled version of the data and have scale='none' in the heatmap call, to be honest.

-steve

> best
> Elmer
>
> 2010/7/22 Benjamin Otto <b.otto at uke.uni-hamburg.de>
>
>> Hi Guys,
>>
>> do note that the scale() function in heatmap doesn't scale your values till AFTER clustering, for visualization purposes! So if you provide already scaled data, you naturally will expect a different result.
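The point made above (scaling happens only after clustering) can be illustrated without drawing a heatmap at all. The following is a minimal sketch in base R, assuming the default dist()/hclust() pair mentioned elsewhere in this thread; the seed and the variable names d_raw, d_scaled, ord_raw and ord_scaled are illustrative assumptions, not part of anyone's original code.

```r
set.seed(1)  # assumption: fixed seed so the sketch is repeatable

# Same shape as the matrix in Elmer's example: 10 rows, 5 columns.
M <- matrix(c(rnorm(10 * 3, 1, 2), rnorm(10 * 2, -0.5, 1)), ncol = 5)

# Row distances on the raw matrix: what the clustering sees when
# scaling is left to the scale= argument of the heatmap call.
d_raw <- dist(M)

# Row distances after scaling the columns beforehand ("scaled outside").
d_scaled <- dist(scale(M))

# Centering leaves Euclidean distances alone, but dividing each column
# by its standard deviation does not, so the two sets of distances differ.
max(abs(as.vector(d_raw) - as.vector(d_scaled)))

# The resulting row orderings can therefore differ as well.
ord_raw    <- hclust(d_raw)$order
ord_scaled <- hclust(d_scaled)$order
identical(ord_raw, ord_scaled)
```

If the two orderings differ, the two heatmaps show the same scaled colours arranged under different dendrograms, which is consistent with what Elmer reported.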
>>
>> cheers
>>
>> Benjamin
>>
>> On 22.07.2010 at 16:25, Bazeley, Peter wrote:
>>
>> > Hi Elmer,
>> >
>> > The default scale option in heatmap.2 scales by row, whereas the scale() function scales by column, so this is probably why there is a difference. I think whichever dimension contains unique samples is how you want to scale (if this was expression data, for example).
>> >
>> > Pete
>> > ________________________________________
>> > From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] on behalf of Sean Davis [sdavis2 at mail.nih.gov]
>> > Sent: Thursday, July 22, 2010 9:17 AM
>> > To: Elmer Fernández
>> > Cc: Bioconductor mailing list
>> > Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!!
>> >
>> > 2010/7/22 Elmer Fernández <elmerfer at gmail.com>
>> >
>> >> Dear Users
>> >> I'm working with the heatmap.2 function and I realized that using the scale input parameter gives different results than using the scale function outside and feeding heatmap.2 with the scaled matrix. I attached the results of the two approaches and the data matrix used (M.csv).
>> >> So, what am I doing wrong?
>> >>
>> > Hi, Elmer.
>> >
>> > The default distance function used by heatmap.2 is dist() which is not going to be invariant under centering and scaling, I don't think. It looks like you are using that default.
>> >
>> > Sean
>> >
>> >> R Code
>> >>
>> >> library(gplots)
>> >> M=matrix(c(rnorm(10*3,1,2),rnorm(10*2,-0.5,1)),ncol=5)
>> >> heatmap.2(M,scale="column",trace="none",main="scaled inside")
>> >> x11();heatmap.2(scale(M),scale="none",trace="none",main="scaled outside")
>> >>
>> >> > sessionInfo()
>> >> R version 2.10.0 (2009-10-26)
>> >> x86_64-unknown-linux-gnu
>> >>
>> >> locale:
>> >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> >> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
>> >> [9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8 LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
>> >>
>> >> attached base packages:
>> >> [1] grid stats graphics grDevices utils datasets methods base
>> >>
>> >> other attached packages:
>> >> [1] gplots_2.7.4 caTools_1.10 bitops_1.0-4.1 gdata_2.7.1 gtools_2.6.1 rkward_0.5.1
>> >>
>> >> loaded via a namespace (and not attached):
>> >> [1] tools_2.10.0
>> >>
>> >> --
>> >> Elmer A. Fernández (Bioing. PhD)
>> >> Investigador Asistente CONICET - Research Assistant CONICET
>> >> Prof. Inteligencia Artificial -UCC - Prof. Artificial Intelligence @ UCC
>> >> tel: +54-(0)351-4938000 int 145
>> >> Fax: +54-(0)351-4938081
>> >> web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15
>> >> http://sites.google.com/site/biologicaldatamininggroup/Home/
>> >> mail address: Camino Alta Gracia Km 7.1/2- Córdoba-5017-Argentina
>>
>> ___________________________________________
>> Benjamin Otto, PhD
>> University Medical Center Hamburg-Eppendorf
>> Institute For Clinical Chemistry / Central Laboratories
>> Campus Forschung N27
>> Martinistr. 52,
>> D-20246 Hamburg
>>
>> Tel.: +49 40 7410 51908
>> Fax.: +49 40 7410 54971
>> ___________________________________________
>>
>> --
>> Mandatory information pursuant to the Act on Electronic Commercial Registers and Registers of Cooperatives and on the Company Register (EHUG):
>>
>> Universitätsklinikum Hamburg-Eppendorf
>> Public-law corporation
>> Place of jurisdiction: Hamburg
>>
>> Board members:
>> Prof. Dr. Jörg F. Debatin (Chair)
>> Dr. Alexander Kirstein
>> Joachim Prölß
>> Prof. Dr. Dr. Uwe Koch-Gromus
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

------------------------------

Message: 22
Date: Thu, 22 Jul 2010 13:14:34 -0700
From: Hervé Pagès <hpages@fhcrc.org>
To: Erik Wright <eswright at wisc.edu>
Cc: BioC list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Biostrings - vcountPattern optimization
Message-ID: <4C48A6AA.2050407 at fhcrc.org>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Erik,

On 07/22/2010 10:32 AM, Erik Wright wrote:
> Hi Patrick,
>
> Thanks, this looks promising. Three possible complications are:
> (1) The fragments are not all the same width. Is this possible with Pdict?

Yes, but given requirement (2), you need another solution.

> (2) I allow a variable number of mismatches based on each individual fragment's width.

So given (1) and (2), you could group your fragments by equal length, make a PDict object for each group, and use a single number of mismatches for that group (seems like this number only depends on the length of the fragment).

> (3) The fragments sometimes include ambiguity letters (IUPAC extended letters).

Unfortunately ambiguities are supported only in the subject at the moment. But you could still treat them separately with vcountPattern() in a loop.
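The grouping step suggested for (1) and (2) can be sketched in base R. This is only an illustration under stated assumptions: the fragments are made-up plain character strings rather than a DNAStringSet, and the names widths, max_mm, groups and budget are hypothetical. Each group would then be handed to PDict() as in Patrick's example, subject to the mismatch limits discussed in this thread.

```r
# Hypothetical fragments as plain strings (the thread uses DNAStringSet).
fragments <- c("ACTGA", "AAAAC", "ACTGACTGAC", "TTGACCATGA", "ACGTACGTACGTACG")

# Same rule as in Erik's loop: one mismatch allowed per five bases.
widths <- nchar(fragments)
max_mm <- floor(widths / 5)

# One group per fragment width; all fragments in a group share one budget.
groups <- split(fragments, widths)
budget <- tapply(max_mm, widths, unique)

for (w in names(groups)) {
  cat(sprintf("width %s: %d fragment(s), max.mismatch = %d\n",
              w, length(groups[[w]]), budget[[w]]))
}
```

Because the mismatch count depends only on the width, every group ends up with exactly one value, which is what makes a single call per group possible.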
>
> A more accurate example would be:
>
> #### start ####
> fragments <- DNAStringSet(c("ACS","NCCAGAA")) # no indels
> sequence_set <- DNAStringSet(c("ATAGCATACKACCA","GATTACGTACCADADATTACA")) # variable widths
> for (i in 1:length(fragments)) {
>   counts <- vcountPattern(fragments[[i]],
>                           sequence_set,
>                           max.mismatch=floor(length(fragments[[i]])/5)) # variable mis-matches
>   hits <- length(which(counts > 0))
>   print(hits)
> }
> #### end ####
>
> Do you think it is possible to make this work with PDict for a speed improvement?

With max.mismatch being a fifth of the fragment length, it will be between 6 (for 30bp fragments) and 26 (for 130bp fragments). Unfortunately, that's far more mismatches than PDict()/vcountPDict() can handle.

Cheers,
H.

>
> Thanks again!,
> Erik
>
> On Jul 22, 2010, at 12:11 PM, Patrick Aboyoun wrote:
>
>> Erik,
>> Have you tried vcountPDict? It will use an Aho-Corasick matching algorithm (http://en.wikipedia.org/wiki/Aho–Corasick_string_matching_algorithm) that is pretty fast, albeit memory intensive.
>>
>> library(Biostrings)
>> fragments <- DNAStringSet(c("ACTG","AAAA"))
>> sequence_set <- DNAStringSet(c("TAGACATGAC","ACCAAATGAC"))
>> pdict <- PDict(fragments)
>> counts <- vcountPDict(pdict, sequence_set)
>>
>> > counts
>>      [,1] [,2]
>> [1,]    0    0
>> [2,]    0    0
>>
>> > sessionInfo()
>> R version 2.12.0 Under development (unstable) (2010-07-18 r52554)
>> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] Biostrings_2.17.26 IRanges_1.7.13
>>
>> loaded via a namespace (and not attached):
>> [1] Biobase_2.9.0 tools_2.12.0
>>
>> Patrick
>>
>> On 7/22/10 8:54 AM, Erik Wright wrote:
>>> Hello,
>>>
>>> Lately I have been working on counting sequence fragments in larger sets of sequences. I am searching for thousands of fragments of 30 to 130 bases in hundreds of thousands of sequences between 1200 and 1600 bases. Currently I am using the following method to count the number of "hits":
>>>
>>> #### start ####
>>> library(Biostrings)
>>> fragments <- DNAStringSet(c("ACTG","AAAA"))
>>> sequence_set <- DNAStringSet(c("TAGACATGAC","ACCAAATGAC"))
>>>
>>> for (i in 1:length(fragments)) {
>>>   counts <- vcountPattern(fragments[[i]],
>>>                           sequence_set,
>>>                           max.mismatch=1)
>>>   hits <- length(which(counts > 0))
>>>   print(hits)
>>> }
>>> #### end ####
>>>
>>> This method is taking a long time to complete, so I am wondering if I am doing this in the most efficient manner? Does anyone have a suggestion for how I can accomplish the same task more efficiently?
>>>
>>> Thanks!,
>>> Erik

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O.
Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319

------------------------------

Message: 23
Date: Thu, 22 Jul 2010 17:14:42 -0300
From: Elmer Fernández <elmerfer@gmail.com>
To: Steve Lianoglou <mailinglist.honeypot at gmail.com>
Cc: Sean Davis <sdavis2 at mail.nih.gov>, Bioconductor mailing list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!!
Message-ID: <aanlktiklby_bjnudymd7aakezc0yl3wscagrqs7kjnt8 at mail.gmail.com>
Content-Type: text/plain

Dear Steve

You are right when you say that you should scale your data according to what you want to do, but from the help it is not clear when the scaling is done. In most R functions, when a scale parameter is present in the input you assume that the scaling is performed BEFORE the main process. That's why I said that it might not be correct.

Dear guys, THANKS for the discussion!! I really appreciated and enjoyed it.

Best
Elmer

2010/7/22 Steve Lianoglou <mailinglist.honeypot at gmail.com>

> Hi,
>
> 2010/7/22 Elmer Fernández <elmerfer at gmail.com>:
> > Hi Benjamin
> > Are you sure about that?
>
> Looking at the source code for heatmap.2 (and heatmap, for that matter) it looks as if Benjamin is correct. The scaling is done after the clustering.
>
> > If so, I think that it is not correct, right?
>
> I guess it depends on what you were expecting it to do :-)
>
> Having just realized this myself (yikes -- see what happens when we assume(?)), I think I'd more often rather send in a scaled version of the data and have scale='none' in the heatmap call, to be honest.
>
> -steve

--
Elmer A. Fernández (Bioing. PhD)
Investigador Asistente CONICET - Research Assistant CONICET
Prof. Inteligencia Artificial -UCC - Prof. Artificial Intelligence @ UCC
tel: +54-(0)351-4938000 int 145
Fax: +54-(0)351-4938081
web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15
http://sites.google.com/site/biologicaldatamininggroup/Home/
mail address: Camino Alta Gracia Km 7.1/2- Córdoba-5017-Argentina

------------------------------

Message: 24
Date: Thu, 22 Jul 2010 15:00:56 -0600
From: Sean Davis <sdavis2@mail.nih.gov>
To: Elmer Fernández <elmerfer at gmail.com>
Cc: Bioconductor mailing list <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Heatmap.2 scale problems: Sacling inside the function gives different results than scaling outside!!!
Message-ID: <aanlktimx3yubpv2nsyjcrypaxr4zon5wfknzqbs5tr0i at mail.gmail.com>
Content-Type: text/plain

2010/7/22 Elmer Fernández <elmerfer at gmail.com>

> Hi Benjamin
> Are you sure about that?
> If so, I think that it is not correct, right?
> best
> Elmer

Hi, Elmer. My reading of the source code for heatmap.2 suggests that Benjamin is correct.

Sean

------------------------------

Message: 25
Date: Fri, 23 Jul 2010 09:13:56 +1000 (AUS Eastern Standard Time)
From: Gordon K Smyth <smyth@wehi.edu.au>
To: HuW at mskcc.org
Cc: Bioconductor mailing list <bioconductor at stat.math.ethz.ch>
Subject: [BioC] the design matrix again
Message-ID: <pine.wnt.4.64.1007230912030.2728 at pc602.alpha.wehi.edu.au>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

Looks correct.

Gordon

> Date: Tue, 20 Jul 2010 17:44:07 -0400
> From: HuW at mskcc.org
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] the design matrix again
>
> Hi everyone,
>
> I know my question has been answered to some extent on the mailing list, but I still do not feel very confident about my design. I would really appreciate it if anyone could help me with this.
>
> The data set is about patients before and after treatment. I want to find the genes whose expression changed between before and after treatment. For example, with 3 patients, I did this:
>
> > design
>   patient1 patient2 patient3 treatment14
> 1        1        0        0           0
> 2        0        1        0           0
> 3        0        0        1           0
> 4        1        0        0           1
> 5        0        1        0           1
> 6        0        0        1           1
> attr(,"assign")
> [1] 1 1 1 2
> attr(,"contrasts")
> attr(,"contrasts")$patient
> [1] "contr.treatment"
>
> attr(,"contrasts")$treatment
> [1] "contr.treatment"
>
> > eset.rma.fit = lmFit(eset.rma, design);
> > eset.rma.bayes = eBayes(eset.rma.fit);
> > topTable(eset.rma.bayes, coef = "treatment14", adjust = "BH");
>
> thank you very much.
>
> Wenhuo Hu

------------------------------

Message: 26
Date: Thu, 22 Jul 2010 16:23:53 -0700
From: Thomas Girke <thomas.girke@ucr.edu>
To: Bioconductor mailing list <bioconductor at stat.math.ethz.ch>, bioc-sig-sequencing at stat.math.ethz.ch
Subject: [BioC] Open Postdoc Positions
Message-ID: <20100722232353.GA18501 at biocluster.ucr.edu>
Content-Type: text/plain; charset=us-ascii

Dear List Members,

There are currently two open postdoc positions in my group with secured funding for 3-4 years. One position is in the area of next generation sequencing and the other one in the chemical informatics field related to chemical genomics and drug discovery. Both positions will involve a combination of software development and data analysis/mining tasks. Ideal candidates should have a strong background in computer sciences and scientific data analysis, and should be proficient in at least two of the following programming languages: C/C++, Python and R. Experience with web and database programming is also beneficial, especially with Python/Django and MySQL/PostgreSQL, respectively.

To apply, please email your CV with a detailed description of your professional skills to thomas.girke at ucr.edu.
Thomas -- Thomas Girke Associate Professor of Bioinformatics Director, IIGB Bioinformatic Facility Institute for Integrative Genome Biology (IIGB) 1207F Genomics Building University of California Riverside, CA 92521 E-mail: thomas.girke at ucr.edu Personal Site: http://girke.bioinformatics.ucr.edu Ph: 951-905-5232 Fax: 951-827-5155 ------------------------------ Message: 27 Date: Fri, 23 Jul 2010 09:38:50 +0100 From: Heidi Dvinge <heidi@ebi.ac.uk> To: David martin <vilanew at="" gmail.com=""> Cc: bioconductor at stat.math.ethz.ch Subject: Re: [BioC] htQPCR Message-ID: <c49c2983-a056-4db2-b43c-1d35f91a194e at="" ebi.ac.uk=""> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Hello David, Thanks for the feedback on HTqPCR. I've never really thought of filtering out samples during my own analysis, hence no option in filterCtData. The default way is by doing subsetting, such as qPCRset [,c(1:3,5)], or by using sample names as you do in your example. However, I guess a specific filtering option might also be useful in other cases, such as potentially removing samples that have a high proportion of NA values and can therefore be considered failed plates/ samples. I'll put it on the todo list of HTqPCR improvements. CHeers \Heidi On 22 Jul 2010, at 10:47, David martin wrote: > Hello, > I would like to suggest a filtering method based on sample name. > FilterCTdata contains a lot of filtering methods but didn't see any > to filter based on sample names, > > Actually i use the match function do remove samples from the analysis. > > e.g > tofilter=c("sample1","sample2",...) 
> exprs(qpcrObj)[,-match(tofilter,colnames(exprs(qpcrObj)))]
>
> thanks,
> david
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

------------------------------

Message: 28
Date: Fri, 23 Jul 2010 10:11:47 +0100
From: Heidi Dvinge <heidi@ebi.ac.uk>
To: "Bass, Kevin" <bassk1 at email.chop.edu>
Cc: BioC List <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] Problem with function limmaCtData in HTqPCR package: "leading minor of order 2 is not positive definite"
Message-ID: <12D85D00-CC61-4304-9112-7F870CA0A9D9 at ebi.ac.uk>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed

Hello Kevin,

On 21 Jul 2010, at 19:50, Bass, Kevin wrote:

> Hi,
>
> I am having a problem with using the function limmaCtData on a qPCRset
> object created with the package HTqPCR. When I try to execute
> limmaCtData, I get the following error:
>
> "Error in chol.default(V) :
> the leading minor of order 2 is not positive definite"
>

As your traceback() shows, in the first step the error comes from lmFit from the limma package. As I recall, it means that one of the internal design matrices has become singular. I'm afraid I don't know exactly why this is happening, however it can be caused by trying to do too much with too few observations/replicates. Is it possible to use a smaller design matrix? Looking at your design matrix it would appear that you have no replicates of any of the 5 treatments you list there. Based on your description of the experiment, I'm not really sure whether this is the case or not?
By the way, it looks like you have quite a complex plate/sample combination design compared to a standard qPCR analysis - I can see why you end up with an object called "raw_monster2" after all the different rbind and cbind ;)

Cheers
\Heidi

> Below, I will describe the experimental design and the steps taken to
> create my qPCRset object. Then I will paste the commands used, and
> their results, in the steps leading up to running the limmaCtData
> function on my qPCRset object.
>
> We have 21 96-well plates. Each plate contains 5 experimental groups
> and 4 genes--2 target genes, and 2 endogenous controls. Each
> experimental group sampled all 4 genes, and there were 3 biological
> replicates per sample, for a total of 12 wells per experimental group.
>
> Every 7 plates among the total 21 plates constitutes a "set" of
> plates: they each contain the same 14 target genes. This means that
> each gene, in each experimental condition, has 3 samples among the 21
> plates--one sample per experimental condition for each 7-plate set.
>
> The goal is to compare the Ct values for each gene in each
> experimental group, to the Ct values for the same gene in every other
> experimental group.
>
> Using rbind (HTqPCR), I collated 7 of the data files into one file,
> so that all 14 genes could be analyzed simultaneously, at least among
> a single set of plates--once I had figured that part out, I had
> planned on combining the 3 sets.
>
> To give a clear idea what my data looks like--and how it was
> implemented in my qPCRset object--this is the Slot "history" and Slot
> "exprs" of my combined qPCRset object (with the data removed):
>
> Slot "exprs":
>       01_veh+FA 02_low+FA 03_mid+FA 04_high+FA 05_no_treatment
> PGES
> PGES
> PGES
> c-Fos
> c-Fos
> c-Fos
> SPP1
> SPP1
> SPP1
> CD200
> CD200
> CD200
> COX-1
> COX-1
> COX-1
> COX-2
> COX-2
> COX-2
> OX-42
> OX-42
> OX-42
> iBA-1
> iBA-1
> iBA-1
> IL-2
> IL-2
> IL-2
> IL-4
> IL-4
> IL-4
> IL-6
> IL-6
> IL-6
> IL-8
> IL-8
> IL-8
> IL-10
> IL-10
> IL-10
> CD4
> CD4
> CD4
>
> Slot "history":
>    history
> 1  raw8: readCtData(files = "NS398_08b.txt", path = barrPath, n.features = 12,
> 2  flag = NULL, feature = 5, type = 7, position = 2, Ct = 6,
> 3  header = TRUE, n.data = 5)
> 4  raw9: readCtData(files = "NS398_09b.txt", path = barrPath, n.features = 12,
> 5  flag = NULL, feature = 5, type = 7, position = 2, Ct = 6,
> 6  header = TRUE, n.data = 5)
> 7  raw10: readCtData(files = "NS398_10b.txt", path = barrPath, n.features = 12,
> 8  flag = NULL, feature = 5, type = 7, position = 2, Ct = 6,
> 9  header = TRUE, n.data = 5)
> 10 raw11: readCtData(files = "NS398_11b.txt", path = barrPath, n.features = 12,
> 11 flag = NULL, feature = 5, type = 7, position = 2, Ct = 6,
> 12 header = TRUE, n.data = 5)
> 13 raw12: readCtData(files = "NS398_12b.txt", path = barrPath, n.features = 12,
> 14 flag = NULL, feature = 5, type = 7, position = 2, Ct = 6,
> 15 header = TRUE, n.data = 5)
> 16 raw13: readCtData(files = "NS398_13b.txt", path = barrPath, n.features = 12,
> 17 flag = NULL, feature = 5, type = 7, position = 2, Ct = 6,
> 18 header = TRUE, n.data = 5)
> 19 raw14: readCtData(files = "NS398_14b.txt", path = barrPath, n.features = 12,
> 20 flag = NULL, feature = 5, type = 7, position = 2, Ct = 6,
> 21 header = TRUE, n.data = 5)
> 22 rbind(deparse.level, ..1, ..2, ..3, ..4, ..5, ..6, ..7)
> 23 normalizeCtData(q = raw_monster2, norm = "deltaCt", deltaCt.genes = "GAPDH")
> 24 filterCtDataNew(q = d.raw2, remove.type = "Endogenous Control")
> 25 setCategory(q = fd.raw2, Ct.max = 100, Ct.min = 0, quantile = 0.9,
>
> So, then I prepared the matrix for analysis with limma:
>
>> design<-model.matrix(~0+sampleNames(test.d.raw2))
> Warning message:
> In model.matrix.default(~0 + sampleNames(test.d.raw2)) :
> variable 'sampleNames(test.d.raw2)' converted to a factor
>> colnames(design)<-c("VehFA","LowFA","MidFA","HighFA","NoTreat")
>> print(design)
>   VehFA LowFA MidFA HighFA NoTreat
> 1     1     0     0      0       0
> 2     0     1     0      0       0
> 3     0     0     1      0       0
> 4     0     0     0      1       0
> 5     0     0     0      0       1
> attr(,"assign")
> [1] 1 1 1 1 1
> attr(,"contrasts")
> attr(,"contrasts")$`sampleNames(test.d.raw2)`
> [1] "contr.treatment"
>> contrasts<-makeContrasts(VehFA-LowFA, VehFA-MidFA, VehFA-HighFA,
> + VehFA-NoTreat, LowFA-MidFA, LowFA-HighFA, LowFA-NoTreat,
> + MidFA-HighFA, MidFA-NoTreat,HighFA-NoTreat, levels=design)
>> colnames(contrasts)<-c("V-L", "V-M", "V-H", "V-NT", "L-M", "L-H",
> + "L-NT", "M-H", "M-NT", "H-NT")
>> print(contrasts)
>          Contrasts
> Levels    V-L V-M V-H V-NT L-M L-H L-NT M-H M-NT H-NT
>   VehFA     1   1   1    1   0   0    0   0    0    0
>   LowFA    -1   0   0    0   1   1    1   0    0    0
>   MidFA     0  -1   0    0  -1   0    0   1    1    0
>   HighFA    0   0  -1    0   0  -1    0  -1    0    1
>   NoTreat   0   0   0   -1   0   0   -1   0   -1   -1
>> test.d.raw2b<-test.d.raw2[order(featureNames(test.d.raw2)), ]
> =======================================================================
>> qDE.limma <- limmaCtData(test.d.raw2b,design=design,
> + contrasts=contrasts,ndups=3,spacing=1)
> Error in chol.default(V) :
> the leading minor of order 2 is not positive definite
> In addition: Warning message:
> In sqrt(dfitted.values) : NaNs produced
>> traceback()
> 6: .Call("La_chol", as.matrix(x), PACKAGE = "base")
> 5: chol.default(V)
> 4: chol(V)
> 3: gls.series(y$exprs, design = design, ndups = ndups,
> spacing = spacing, block = block, correlation = correlation,
> weights = weights, ...)
> 2: lmFit(data, design = design, ndups = ndups, spacing = spacing,
> correlation = dup.cor$consensus, ...)
> 1: limmaCtData(test.d.raw2b, design = design, contrasts = contrasts,
> ndups = 3, spacing = 1)
>
> Any ideas on why I am getting this error and what I might do to avoid
> it? If there is any other information needed, please let me know.
>
> Thanks,
> Kevin
> bassk1 at email.chop.edu
>
> =====
>
> Kevin Bass, Research Technician
> Barr Lab
> Children's Hospital of Philadelphia
> Abramson Research Center
> 3615 Civic Center Blvd, Suite 714
> Philadelphia PA 19104-4399
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

------------------------------

Message: 29
Date: Fri, 23 Jul 2010 05:50:49 -0400
From: Vincent Carey <stvjc@channing.harvard.edu>
To: bioconductor <bioconductor at stat.math.ethz.ch>
Subject: [BioC] building a refseq-based transcriptDb: warnings of interest?
Message-ID: <aanlktikxjh9dbszeynccwst2hj15sbohmbnd8e46m5_- at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

> hg18r.txdb = makeTranscriptDbFromUCSC(tablename="refGene")
Download the refGene table ... OK
Download the refLink table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TranscriptDb object ... OK
There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
  UCSC data anomaly in transcript NM_017940: the cds cumulative length is not a multiple of 3
2: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
  UCSC data anomaly in transcript NM_001037675: the cds cumulative length is not a multiple of 3
3: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
  UCSC data anomaly in transcript NM_001039703: the cds cumulative length is not a multiple of 3
4: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :

and so on. Does this need to be reported to UCSC?

> sessionInfo()
R version 2.12.0 Under development (unstable) (2010-06-30 r52417)
Platform: x86_64-apple-darwin10.3.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices datasets  tools     utils     methods
[8] base

other attached packages:
[1] GenomicFeatures_1.1.6 GenomicRanges_1.1.15  IRanges_1.7.13
[4] weaver_1.15.0         codetools_0.2-2       digest_0.4.2

loaded via a namespace (and not attached):
[1] BSgenome_1.17.5    Biobase_2.9.0      Biostrings_2.17.26 DBI_0.2-5
[5] RCurl_1.4-2        RSQLite_0.9-1      XML_3.1-0          biomaRt_2.5.1
[9] rtracklayer_1.9.3

------------------------------

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor

End of Bioconductor Digest, Vol 89, Issue 22
********************************************
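On Heidi's remark in message 28 that the design appears to have no replicates: one way to read that point is in terms of residual degrees of freedom. The sketch below uses base R only and rebuilds a design with the same shape as the one Kevin printed (five samples, five groups); the variable names are made up for illustration, and it does not reproduce the `ndups`/`spacing` handling inside limmaCtData, so treat it as a rough illustration rather than a diagnosis of the actual error.

```r
# Five samples, each in its own group, mirroring the printed 5 x 5 design.
groups <- factor(c("VehFA", "LowFA", "MidFA", "HighFA", "NoTreat"),
                 levels = c("VehFA", "LowFA", "MidFA", "HighFA", "NoTreat"))
design <- model.matrix(~ 0 + groups)

n_samples   <- nrow(design)       # number of samples (columns of the data)
n_coef      <- qr(design)$rank    # number of estimable coefficients
residual_df <- n_samples - n_coef # what is left to estimate variability

residual_df
```

With as many coefficients as samples there is nothing left over to estimate the between-sample variance, which is why adding replicate samples per treatment (or fitting fewer coefficients) is the usual way out.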

ADD COMMENT • link 14.6 years ago Amos Folarin ▴ 20


Cheers Amos,

From this response it's clear there's no simple way to do this with the available R heatmap functions. With your example in hand I can plow on and hopefully get a workable solution. Thanks a lot for this!

Karl

On 7/23/2010 3:16 PM, Amos Folarin wrote:
> Hi Karl,
> The only way I know to rotate the labels is pretty crude. You will have to reconstitute the labels using the text() function.
> The caveat here is you'll have to play around to get this right.
>
> Try something like this:
>
> library(gplots)
> x <- matrix(rnorm(25), 5)
> heatmap.2(x, labRow="", labCol="") # remove the labels
> # plot the text, perhaps someone can think of a smarter way of getting the labels in position...
> text(seq(par("xaxp")[1]+par("xaxp")[2]/par("xaxp")[3], par("xaxp")[2], by=0.8*(par("xaxp")[2]/par("xaxp")[3])), par("usr")[3], par("usr")[3] - 0.2, labels = c("first", "second", "third", "fourth", "fifth"), srt = 45, pos = 1, xpd = TRUE)
>
> Unfortunately the heatmap is laid out in a 2x2 matrix with the dendrograms and key in the first 3 cells and the heatmap in the bottom right -- I'm not sure if it is possible to access the axes of this element independently. If one could, then it might make positioning the labels for the heatmap moiety of the plot simple.
>
> Amos

--
Karl Brand
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
T +31 (0)10 704 3457 | F +31 (0)10 704 4743 | M +31 (0)642 777 268

ADD REPLY • link 14.6 years ago k. brand ▴ 420
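
A note for anyone reading this on a newer gplots: as far as I know, later releases of heatmap.2() gained `srtCol` and `srtRow` arguments for rotating the labels, which were not available when this thread was written. Below is a minimal sketch, assuming your installed gplots has them (check `?heatmap.2`); the toy matrix and its label names are made up for illustration. It only covers the rotation question, not moving the labels to the left and top.

```r
library(gplots)

set.seed(1)  # reproducible toy data
x <- matrix(rnorm(25), nrow = 5,
            dimnames = list(paste0("row", 1:5),
                            c("first", "second", "third", "fourth", "fifth")))

# Assumption: srtCol only exists in newer gplots releases, so check first.
stopifnot("srtCol" %in% names(formals(heatmap.2)))

hm <- heatmap.2(x,
                trace   = "none",   # no trace lines over the cells
                srtCol  = 45,       # draw column labels at 45 degrees
                margins = c(6, 6))  # extra room for the rotated labels
```

If the guard fails on your installation, the text() workaround quoted above is still the fallback.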


On Sun, Jul 25, 2010 at 9:04 AM, Karl Brand <k.brand@erasmusmc.nl> wrote:

> Cheers Amos,
>
> From this response it's clear there's no simple way to do this with the
> available R heatmap functions. With your example in hand I can plow on and
> hopefully get a workable solution. Thanks a lot for this!
>
> Karl

For a totally different approach:

library(latticeExtra)
?dendrogramGrob

Sean

ADD REPLY • link 14.6 years ago Sean Davis 21k
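
To make Sean's pointer a bit more concrete, here is a rough sketch along the lines of the example in `?dendrogramGrob`: a lattice levelplot() with dendrograms attached as legends and the x-axis labels rotated through `scales`. The toy matrix, its label names and the 45-degree angle are assumptions for illustration, not something from the thread.

```r
library(lattice)
library(latticeExtra)

set.seed(1)  # reproducible toy data
x <- matrix(rnorm(25), nrow = 5,
            dimnames = list(paste0("row", 1:5),
                            c("first", "second", "third", "fourth", "fifth")))

# Cluster rows and columns separately and keep the leaf orderings.
dd.row  <- as.dendrogram(hclust(dist(x)))
dd.col  <- as.dendrogram(hclust(dist(t(x))))
row.ord <- order.dendrogram(dd.row)
col.ord <- order.dendrogram(dd.col)

# levelplot() draws matrix rows along the x axis and columns along the y axis,
# so the row dendrogram goes on top and the column dendrogram on the right.
p <- levelplot(x[row.ord, col.ord],
               aspect   = "fill",
               xlab     = NULL, ylab = NULL,
               scales   = list(x = list(rot = 45)),  # rotated x-axis labels
               colorkey = list(space = "left"),
               legend   = list(
                 right = list(fun  = dendrogramGrob,
                              args = list(x = dd.col, ord = col.ord,
                                          side = "right", size = 10)),
                 top   = list(fun  = dendrogramGrob,
                              args = list(x = dd.row, ord = row.ord,
                                          side = "top", size = 10))))
print(p)
```

Because the labels here are ordinary lattice axis annotations, their angle is controlled through the `scales` argument rather than by re-drawing text by hand.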