Cluster analysis distance measure
Auer Michael ▴ 250
@auer-michael-953
Last seen 10.2 years ago
I would like to know whether it is possible to cluster genes non-hierarchically while using correlation as the distance measure. K-means, clara, pam, etc. only seem to work with Euclidean metrics. I ask because the number of genes is often too large for hierarchical clustering, and the choice of distance measure strongly influences how the genes are clustered.

Thanks

> Send Bioconductor mailing list submissions to
>     bioconductor@stat.math.ethz.ch
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     https://stat.ethz.ch/mailman/listinfo/bioconductor
> or, via email, send a message with subject or body 'help' to
>     bioconductor-request@stat.math.ethz.ch
>
> You can reach the person managing the list at
>     bioconductor-owner@stat.math.ethz.ch
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Bioconductor digest..."
>
> Today's Topics:
>
>  1. error following cluster example (kfbargad@lg.ehu.es)
>  2. RE: error following cluster example (Claire Wilson)
>  3. Re: error following cluster example (Robert Gentleman)
>  4. Re: AnnBuilder bug // R-2.0.0 // getList4GO (John Zhang)
>  5. comparing different experiments (Julia Engelmann)
>  6. Problems with heatmap on genes... (Giulio Di Giovanni)
>  7. help with limma contrast matrix (Kimpel, Mark W)
>  8. RE: Problems with heatmap on genes... (michael watson (IAH-C))
>  9. Re: Problems with heatmap on genes... (jeffrey rasmussen)
> 10. Re: comparing different experiments (Fangxin Hong)
> 11. affy segmentation fault (Sucheta Tripathy)
> 12. Re: affy segmentation fault (Adaikalavan Ramasamy)
> 13. Re: affy segmentation fault (Ben Bolstad)
> 14. Re: affy segmentation fault (Sucheta Tripathy)
> 15. Re: affy segmentation fault (Adaikalavan Ramasamy)
> 16. RE: Problems with heatmap on genes... (Johan Lindberg)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 9 Nov 2004 12:59:41 +0100 (CET)
> From: kfbargad@lg.ehu.es
> Subject: [BioC] error following cluster example
> To: bioconductor@stat.math.ethz.ch
> Message-ID: <9456297971kfbargad@lg.ehu.es>
> Content-Type: text/plain; charset="ISO-8859-1"
>
> Dear Users,
>
> I am following the example in Lab 5: Cluster analysis (June 2003) with
> my own data.
>
> I have filtered my expression set as shown in the example and I get
> the following:
>
> > sub <- genefilter(X,ffun)
> > sum(sub)
> [1] 1124
>
> I save this subset of genes and then log transform it. But when I type
> the next command I get the following error:
>
> > X <- X[sub,]
> > X <- log2(X)
> > RawDataSub <- Raw.Data[,sub]
> Error in Raw.Data[, sub] : (subscript) logical subscript too long
>
> Why do I get this error?
> Also, if I have stored the subset expression data as X, why is
> Raw.Data[,sub] using [,sub] again? I don't really understand this step;
> could anyone explain its purpose?
>
> I'm running R 1.9.1 on an XP computer.
>
> Thanks a lot for your help
>
> David
>
> ------------------------------
>
> Message: 2
> Date: Tue, 9 Nov 2004 12:29:19 -0000
> From: "Claire Wilson" <clairewilson@picr.man.ac.uk>
> Subject: RE: [BioC] error following cluster example
> To: <kfbargad@lg.ehu.es>, <bioconductor@stat.math.ethz.ch>
> Message-ID:
>     <baa35444b19ad940997ed02a6996aae001de15e7@sanmail.picr.man.ac.uk>
> Content-Type: text/plain; charset="US-ASCII"
>
>> But when I type the next command I get the following error:
>> > X <- X[sub,]
>> > X <- log2(X)
>> > RawDataSub <- Raw.Data[,sub]
>> Error in Raw.Data[, sub] : (subscript) logical subscript too long
>
> It looks like you are trying to select columns, not rows:
>     RawDataSub <- Raw.Data[,sub]   # subsets on columns
> Try:
>     RawDataSub <- Raw.Data[sub,]   # subsets on rows
>
> hth
>
> claire
>
> ------------------------------
>
> Message: 3
> Date: Tue, 9 Nov 2004 08:07:59 -0500
> From: Robert Gentleman <rgentlem@jimmy.harvard.edu>
> Subject: Re: [BioC] error following cluster example
> To: kfbargad@lg.ehu.es
> Cc: bioconductor@stat.math.ethz.ch
> Message-ID: <20041109080759.E29793@jimmy.harvard.edu>
> Content-Type: text/plain; charset=iso-8859-1
>
> On Tue, Nov 09, 2004 at 12:59:41PM +0100, kfbargad@lg.ehu.es wrote:
>> > RawDataSub <- Raw.Data[,sub]
>> Error in Raw.Data[, sub] : (subscript) logical subscript too long
>>
>> Why do I get this error??
>
> Perhaps because the dimensions of X and of Raw.Data are not the
> same? If you are not familiar with R you should spend some time with
> introductory material to learn about the language, as that knowledge
> is essential for debugging.
>
>> Also, if I have stored the subset expression data as X, why is
>> Raw.Data[,sub] using [,sub] again? I don't really understand this
>> step, if anyone could explain its purpose.
>
> Because X and Raw.Data are not the same object. R basically has
> pass-by-value semantics and so (almost) everything is a copy.
>
> --
> Robert Gentleman                 phone:  (617) 632-5250
> Associate Professor              fax:    (617) 632-2444
> Department of Biostatistics      office: M1B20
> Harvard School of Public Health  email:  rgentlem@jimmy.harvard.edu
>
> ------------------------------
>
> Message: 4
> Date: Tue, 9 Nov 2004 08:32:08 -0500 (EST)
> From: John Zhang <jzhang@jimmy.harvard.edu>
> Subject: Re: [BioC] AnnBuilder bug // R-2.0.0 // getList4GO
> To: hathanassiou@automatedcell.com
> Cc: bioconductor@stat.math.ethz.ch
> Message-ID: <200411091332.IAA26906@blaise.dfci.harvard.edu>
> Content-Type: TEXT/plain; charset=us-ascii
>
> Thanks. I will have a look at the code and fix the problem.
>
>>From: "Harry Athanassiou" <hathanassiou@automatedcell.com>
>>To: <bioconductor@stat.math.ethz.ch>
>>Date: Tue, 9 Nov 2004 01:22:42 -0500
>>Subject: [BioC] AnnBuilder bug // R-2.0.0 // getList4GO
>>
>>I'm trying to use AnnBuilder to make some custom annotation files for a
>>non-standard microarray chip. In running the tests with R-2.0.0, I ran
>>across a problem in the function getList4GO. I'm not sure if this issue
>>is due to R-2.0.0 or not.
>>
>>Here's the issue:
>>when the sub-function procOne is called by sapply, names(goids) is NULL.
>>Thus when procOne calls:
>>    apply(temp, 1, vect2List, vectNames = c("GOID", "Evidence", "Ontology"))
>>the number of list elements to be named is mismatched.
>>
>>I do not know how to make sapply pass the names() of its first argument
>>to the FUN() it calls, so I modified procOne->procOne.new to drop the
>>column "Evidence", and add this column with a trick afterwards.
>>
>>I'm sure this is not the best solution; it just worked for me.
>>
>>>>>
>>getList4GO <- function (goNCat, goNEvi)
>>{
>>    procOne <- function(goids) {
>>        if (is.null(goids) || is.na(goids)) {
>>            return(NA)
>>        }
>>        else {
>>            temp <- cbind(goids, names(goids), goNCat[goids])
>>            rownames(temp) <- goids
>>            return(apply(temp, 1, vect2List,
>>                         vectNames = c("GOID", "Evidence", "Ontology")))
>>        }
>>    }
>>
>>    # the names(goids) do not get propagated through the sapply() in R-2.0.0!
>>    # remove the column evidence
>>    procOne.new <- function(goids) {
>>        if (is.null(goids) || is.na(goids)) {
>>            return(NA)
>>        }
>>        else {
>>            temp <- cbind(goids, goNCat[goids])
>>            rownames(temp) <- goids
>>            return(apply(temp, 1, vect2List,
>>                         vectNames = c("GOID", "Ontology")))
>>        }
>>    }
>>
>>    temp <- sapply(goNEvi, procOne.new)
>>    names(temp) <- 1:length(temp)
>>
>>    # add the evidence list-element
>>    # I do not know a better way; loop over an index to access two
>>    # arrays at the same time
>>    for (r in 1:length(goNEvi)) {
>>        if (!is.na(temp[r])) {
>>            temp[[r]] <- c(temp[[r]], "Evidence"=names(goNEvi)[r])
>>        }
>>    }
>>
>>    return(temp)
>>}
>>>>>
>>
>>Harry Athanassiou
>>BioInformatics manager
>>Automated Cell, Inc
>
> Jianhua Zhang
> Department of Biostatistics
> Dana-Farber Cancer Institute
> 44 Binney Street
> Boston, MA 02115-6084
>
> ------------------------------
>
> Message: 5
> Date: Tue, 09 Nov 2004 16:30:53 +0100
> From: Julia Engelmann <julia.engelmann@biozentrum.uni-wuerzburg.de>
> Subject: [BioC] comparing different experiments
> To: bioconductor@stat.math.ethz.ch
> Message-ID: <4190E2AD.1060501@biozentrum.uni-wuerzburg.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi list,
>
> I wonder if I can compare Affymetrix arrays of the same type (ATH1)
> which were made in different laboratories and with different tissue
> types and different references. I have: "tissue1 treated", "tissue1
> untreated" from one lab and "tissue2 treated", "tissue2 untreated" from
> the other lab.
> The references (untreated) are different because of the different
> tissue types. I am interested in the difference between tissue1 treated
> and tissue2 treated, so I thought I could use limma to make a contrast:
> (tissue1_treated-tissue1_untreated)-(tissue2_treated-tissue2_untreated).
> I am not sure if this is valid, though. For example, I do not account
> for the different labs that way.
> Maybe it is only possible to analyse each experiment by itself and
> compare the results at a later stage, say compare lists of
> differentially expressed genes?
>
> Any advice, comments or hints are highly appreciated,
>
> Julia
>
> ------------------------------
>
> Message: 6
> Date: Tue, 09 Nov 2004 15:37:56 +0000
> From: "Giulio Di Giovanni" <perimessaggini@hotmail.com>
> Subject: [BioC] Problems with heatmap on genes...
> To: bioconductor@stat.math.ethz.ch
> Message-ID: <bay10-f38bmhwrhiu340003395b@hotmail.com>
> Content-Type: text/plain; charset=iso-8859-1; format=flowed
>
> Hi,
>
> I'm trying to get a clear figure of gene clusters using heatmaps, but
> with more than 100-200 genes it's not possible to do it with default
> options (and I would like to do that with 1500 genes or so...). Gene
> names (and branches too) collapse together...
>
> I tried setting new device dimensions (jpeg() or png() height and
> width) and modifying par() options (fin, etc.) to get long cluster
> figures (to be clear, dChip style). Well, it works for other high-level
> graphical functions, but it doesn't work for heatmap(). I always obtain
> big figures, but with exactly the same square heatmap inside.
>
> I spent a long time on the documentation and searching the web, and
> when I found something, it was always heatmaps for 50-100 genes at most.
>
> I trust that someone working on gene clustering is confident with this,
> and I will appreciate any suggestion a lot... I almost went crazy over
> this!!!
>
> Thanks in advance,
>
> Giulio
>
> ------------------------------
>
> Message: 7
> Date: Tue, 9 Nov 2004 11:54:55 -0500
> From: "Kimpel, Mark W" <mkimpel@iupui.edu>
> Subject: [BioC] help with limma contrast matrix
> To: <bioconductor@stat.math.ethz.ch>
> Message-ID:
>     <2E6C5260C7C387449A96DF46EE76313C017D8985@iu-mssg-mbx02.exchange.iu.edu>
> Content-Type: text/plain; charset="us-ascii"
>
> I would appreciate advice on how to construct a contrast matrix for a
> 5X2 ANOVA design. Briefly, I have a genomic experiment to analyze that
> compares 5 brain regions in 2 strains of rats. We are interested in
> discovering overall differences between strains (collapsing all brain
> regions together) but also discovering differences that may only be
> expressed in one brain region.
>
> I have attempted to construct the appropriate matrix with the code
> listed below, but it does not work. I seem to get differences between
> strains, but all the brain region contrasts give exactly the same
> results, so I know something isn't correct.
>
> contrast <- makeContrasts(
>     (
>         (NPAccumbens + NPAmygdala + NPHippocampus + NPPrefrontal_Cortex
>          + NPStriatum) -   #all regions of strain "NP"
>         (PAccumbens + PAmygdala + PHippocampus + PPrefrontal_Cortex +
>          PStriatum)        #all regions of strain "P"
>     ),
>     (NPAccumbens - PAccumbens),                 #accumbens region of both strains
>     (NPAmygdala - PAmygdala),                   #amygdala region of both strains
>     (NPHippocampus - PHippocampus),             #hippocampus region of both strains
>     (NPPrefrontal_Cortex - PPrefrontal_Cortex), #Prefrontal_Cortex region of both strains
>     (NPStriatum - PStriatum),                   #striatum region of both strains
>     levels=design)
>
> Thanks!
>
> Mark
>
> Mark W. Kimpel MD
> (317) 490-5129 Home, Work, & Mobile
> (317) 278-4104 FAX
>
> ------------------------------
>
> Message: 8
> Date: Tue, 9 Nov 2004 16:54:09 -0000
> From: "michael watson (IAH-C)" <michael.watson@bbsrc.ac.uk>
> Subject: RE: [BioC] Problems with heatmap on genes...
> To: "Giulio Di Giovanni" <perimessaggini@hotmail.com>,
>     <bioconductor@stat.math.ethz.ch>
> Message-ID:
>     <8975119BCD0AC5419D61A9CF1A923E950121B868@iahce2knas1.iah.bbsrc.reserved>
> Content-Type: text/plain; charset="Windows-1252"
>
> Hi
>
> There's a function called heatmap.2 in the gregmisc library that will
> resize properly when you send it to a long png() or jpeg().
>
> It's similar to, but not the same as, heatmap(), so read the docs!
>
> Mick
>
> ------------------------------
>
> Message: 9
> Date: Tue, 9 Nov 2004 09:14:23 -0800 (PST)
> From: jeffrey rasmussen <rasmuss@u.washington.edu>
> Subject: Re: [BioC] Problems with heatmap on genes...
> To: Giulio Di Giovanni <perimessaggini@hotmail.com>
> Cc: bioconductor@stat.math.ethz.ch
> Message-ID:
>     <pine.a41.4.61b.0411090909320.309756@homer11.u.washington.edu>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
> Hi Giulio,
>
> If you have access to Adobe Illustrator, you could write your heatmap to
> a postscript file using postscript() and then open and edit the file in
> Illustrator. I've found that in many cases this is much easier than
> wrangling with the plotting parameters in R, in particular when it comes
> to fonts. Otherwise, trying to display more than 50 genes on a heatmap
> becomes prohibitively difficult.
>
> Best,
> Jeff.
>
> ------------------------------
>
> Message: 10
> Date: Tue, 9 Nov 2004 13:26:19 -0800 (PST)
> From: "Fangxin Hong" <fhong@salk.edu>
> Subject: Re: [BioC] comparing different experiments
> To: "Julia Engelmann" <julia.engelmann@biozentrum.uni-wuerzburg.de>
> Cc: bioconductor@stat.math.ethz.ch
> Message-ID: <1630.10.10.200.250.1100035579.squirrel@10.10.200.250>
> Content-Type: text/plain;charset=iso-8859-1
>
>> I am interested in the difference between tissue1 treated
>> and tissue2 treated, so I thought I could use limma to make a contrast:
>> (tissue1_treated-tissue1_untreated)-(tissue2_treated-tissue2_untreated).
>> I am not sure if this is valid, though? For example, I do not account
>> for the different labs that way.
>> Maybe it is just possible to analyse each experiment by itself and
>> compare the results at a later stage, say compare lists of
>> differentially expressed genes?
>
> Based on what I observed when studying data generated at different labs,
> the lab effect cannot be completely removed by the normalization step.
> If you do have some replicates or several data sets from each lab, and
> you want to combine the data, I would suggest including a fixed effect
> for the lab factor.
> Hopefully this will help.
>
> Fangxin
>
> --
> Fangxin Hong, Ph.D.
> Plant Biology Laboratory
> The Salk Institute
> 10010 N. Torrey Pines Rd.
> La Jolla, CA 92037
> E-mail: fhong@salk.edu
>
> ------------------------------
>
> Message: 11
> Date: Tue, 9 Nov 2004 16:46:03 -0500 (EST)
> From: "Sucheta Tripathy" <sutripa@vbi.vt.edu>
> Subject: [BioC] affy segmentation fault
> To: bioconductor@stat.math.ethz.ch
> Message-ID: <1815.199.3.136.4.1100036763.squirrel@webmail.vbi.vt.edu>
> Content-Type: text/plain;charset=iso-8859-1
>
> I know we have been cluttering this mailing list with this question over
> and over again. The reason I want to ask again is that, after seeing the
> segmentation fault error, I found it says 340000 KB is the size it needs.
>
> What puzzles me is that our memory is way beyond that (almost 5 GB with
> 10 GB swap memory).
>
> After trying all the remedies, it still fails. Can anyone suggest where
> in the source the exact memory allocation takes place, and how much the
> size is fixed to be? Can we not increase it? Or, to begin with, which
> version of the affy package has a fix for it?
>
> Thanks in advance.
>
> Sucheta
>
> --
> Sucheta Tripathy
> Virginia Bioinformatics Institute Phase-I
> Washington street.
> Virginia Tech.
> Blacksburg, VA 24061-0447
> phone: (540) 231-8138
> Fax: (540) 231-2606
>
> ------------------------------
>
> Message: 12
> Date: Tue, 09 Nov 2004 22:46:04 +0000
> From: Adaikalavan Ramasamy <ramasamy@cancer.org.uk>
> Subject: Re: [BioC] affy segmentation fault
> To: Sucheta Tripathy <sutripa@vbi.vt.edu>
> Cc: BioConductor mailing list <bioconductor@stat.math.ethz.ch>
> Message-ID: <1100040364.3326.10.camel@localhost.localdomain>
> Content-Type: text/plain
>
> I just checked the mailing archives. You sent 2 mails in November
> (excluding this one) and 2 in October, but none of them talk about a
> segmentation fault error. Perhaps you can explain who "we" are or,
> better yet, state the problem or link to past mail (perhaps from
> https://stat.ethz.ch/pipermail/bioconductor/).
>
> Start from a clean R session and see if you can repeat the problem.
> Next, reduce the number of arrays until you find out how many arrays
> your machine can handle. Try just.rma or just.gcrma. Also search the
> mailing archives. These are all guesses.
>
> Note that although 5 GB is available to a machine, there might be a
> limit to how much each process/user can have access to. Speak to your
> system administrator about any such limitation.
>
> Regards, Adai
>
> ------------------------------
>
> Message: 13
> Date: Tue, 09 Nov 2004 14:49:05 -0800
> From: Ben Bolstad <bolstad@stat.berkeley.edu>
> Subject: Re: [BioC] affy segmentation fault
> To: Sucheta Tripathy <sutripa@vbi.vt.edu>
> Cc: bioconductor@stat.math.ethz.ch
> Message-ID: <1100040545.2398.70.camel@bmbbox.dyndns.org>
> Content-Type: text/plain
>
> Please wait for the next version of the affy package, 1.6.0, which
> should appear on the web in a few days. It has the requisite fix to deal
> with your soybean arrays.
>
> Ben
>
> --
> Ben Bolstad <bolstad@stat.berkeley.edu>
> http://www.stat.berkeley.edu/~bolstad
>
> ------------------------------
>
> Message: 14
> Date: Tue, 09 Nov 2004 18:51:17 -0500
> From: Sucheta Tripathy <sutripa@vbi.vt.edu>
> Subject: Re: [BioC] affy segmentation fault
> To: bioconductor@stat.math.ethz.ch
> Message-ID: <5.1.0.14.0.20041109183629.01f966c8@mail.vbi.vt.edu>
> Content-Type: text/plain; charset="us-ascii"; format=flowed
>
> At 04:59 PM 11/9/2004 -0500, Robert Gentleman wrote:
>>On Tue, Nov 09, 2004 at 04:46:03PM -0500, Sucheta Tripathy wrote:
>> > What puzzles me is our memory is way beyond that (almost 5 GB with
>> > 10 GB swap memory).
>>
>> And as I have said very many times already, it likely has nothing to
>> do with that, but rather that you have a corrupted installation. You
>> almost surely need to recompile R with the correct set of compiler
>> flags for your system and to reinstall the appropriate
>> packages. I am not sure how I can say this more explicitly, but the
>> problem does not seem to be affy, it seems to be your installation.
>>
>> Robert
>
> I guess at this point, if anybody else who has done installation and
> compilation with any other flags would share the flags they used, I
> would really appreciate that. After digging through the installation
> instructions, I don't find anything other than
>
>     $ ./configure
>     $ make
>
> with maybe an option to set the prefix path (where R binaries and
> libraries should go).
>
> Probably I need help from someone who can point out where to find more
> detailed installation help. I have also been looking at the file
> config.site, and most of the default options look fine to me.
>
> If it is just a case of R being corrupted, it is no big deal, provided
> we know what flags to use when compiling next.
>
> -Sucheta
>
> ------------------------------
>
> Message: 15
> Date: Wed, 10 Nov 2004 08:16:32 +0000
> From: Adaikalavan Ramasamy <ramasamy@cancer.org.uk>
> Subject: Re: [BioC] affy segmentation fault
> To: Sucheta Tripathy <sutripa@vbi.vt.edu>
> Cc: BioConductor mailing list <bioconductor@stat.math.ethz.ch>
> Message-ID: <1100074592.7513.41.camel@localhost.localdomain>
> Content-Type: text/plain
>
> A normal installation procedure for me would be something like:
>
>     make clean   # or make distclean if you tried configuring before
>     ./configure --prefix=/home/adai/R
>     make
>     make check
>     make install
>
> There are variants of 'make check', such as 'make check-all', which do
> more comprehensive testing (see page 3 of R-admin).
>
> I do not fully comprehend the flags and various options. If there is an
> error or problem, I usually get my system administrator involved and,
> failing that, I would try the R-help mailing list, which is the more
> appropriate place.
>
> And when you email R-help, please mention some vital information such as
> your operating system (and kernel), gcc version, and R version. Have you
> tried checking the R-help or BioC mailing archives?
>
> BTW, does Ben Bolstad's reply about affy 1.6.0 answer your question?
>
> ------------------------------
>
> Message: 16
> Date: Wed, 10 Nov 2004 09:43:43 +0100
> From: "Johan Lindberg" <johanl@biotech.kth.se>
> Subject: RE: [BioC] Problems with heatmap on genes...
> To: "'Giulio Di Giovanni'" <perimessaggini@hotmail.com>,
>     <bioconductor@stat.math.ethz.ch>
> Message-ID: <000b01c4c701$62f059b0$27230a0a@biochem.kth.se>
> Content-Type: text/plain; charset="US-ASCII"
>
> Hi Giulio. Heatmap is, as you say, a great tool if you have a small
> number of genes but NOT if you have a lot of genes. I was dealing with
> the same thing as you are now some 6 months ago and I found no good
> solution using Heatmap. Therefore we use the freeware (note freeware)
> MeV from TIGR at our department to do hierarchical clustering and
> similar things.
>
> http://www.tigr.org/software/tm4/mev.html
>
> What we have done is to write a script (exportMEV) that takes an
> MA-object (package Aroma in R), exports that object to MeV format, and
> use it when doing clustering.
> http://www.biotech.kth.se/molbio/microarray/pages/kthpackagetransfer.html
>
> Best regards
>
> // Johan Lindberg
> > Thanks in advance, > > Giulio > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > > > > ------------------------------ > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > > > End of Bioconductor Digest, Vol 21, Issue 10 > ******************************************** >
Microarray Annotation Normalization Clustering Cancer affy limma AnnBuilder BRAIN Cancer • 1.4k views
Jenny Bryan ▴ 110
@jenny-bryan-949
> From: "Auer Michael" <michael.auer@meduniwien.ac.at>
>
> I would like to know whether there exists the possibility to cluster genes
> non-hierarchically, but with the correlation as distance measure? K-means,
> clara, pam, etc, only seem to work with euclidean metrics. I ask the

Many clustering algorithms, pam for example, will accept a dissimilarity object as input. The limitation you perceive arises only if you ask the pam function itself to compute the dissimilarity for you. Below is a tiny example of how to use a '1 minus correlation' type of dissimilarity.

############################
library(cluster)
library(MASS)

# Group 1: four observations on three correlated variables
Sigma.x <- matrix(0.7, nrow = 3, ncol = 3)
diag(Sigma.x) <- 1
x <- mvrnorm(n = 4, mu = c(3, 5, 3), Sigma = Sigma.x)

# Group 2: four observations with a different mean and correlation
Sigma.y <- matrix(0.6, nrow = 3, ncol = 3)
diag(Sigma.y) <- 1
y <- mvrnorm(n = 4, mu = rep(1, 3), Sigma = Sigma.y)

# Stack both groups (rows = objects to cluster) and plot the profiles
z <- rbind(x, y)
matplot(1:3, t(z), col = rep(c("red", "green"), each = 4), type = "l", lty = 1)

# Dissimilarity = 1 - |correlation| between rows; abs() treats strongly
# anti-correlated rows as similar (drop it if the sign should matter)
cor.dist.z <- as.dist(1 - abs(cor(t(z))))

# pam accepts the dissimilarity object directly
pamfit <- pam(cor.dist.z, k = 2)
plot(pamfit)
############################

--
Jenny Bryan
*----------------------------------*
* Assistant Professor              *
* Department of Statistics and     *
* the Michael Smith Laboratories   *
* University of British Columbia   *
*----------------------------------*
333-6356 Agricultural Road
Vancouver, BC V6T 1Z2 Canada
tel: 604.822.6422
fax: 604.822.6960
email: jenny@stat.ubc.ca
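
When there are too many genes to hold a full dissimilarity matrix in memory, one possible workaround is to standardize each gene's profile to mean 0 and unit length. The squared Euclidean distance between two standardized profiles then equals 2 * (1 - r), where r is their Pearson correlation, so Euclidean-only tools such as kmeans or clara end up grouping genes by (signed) correlation. The following is only a minimal sketch under assumptions: the toy matrix `expr`, the helper `row_standardize` and the choice of k = 4 are hypothetical and do not come from the posts above.

```r
library(cluster)

set.seed(1)
# Hypothetical toy data: 200 genes (rows) measured on 6 arrays (columns)
expr <- matrix(rnorm(200 * 6), nrow = 200)

# Hypothetical helper: centre each row, then scale it to unit length
row_standardize <- function(m) {
  centred <- m - rowMeans(m)
  centred / sqrt(rowSums(centred^2))
}
z <- row_standardize(expr)

# Check the identity on this small example:
# squared Euclidean distance between standardized rows == 2 * (1 - correlation)
d_euc2 <- as.matrix(dist(z))^2
d_cor  <- 2 * (1 - cor(t(expr)))
max_diff <- max(abs(d_euc2 - d_cor))

# Euclidean-based methods applied to the standardized rows;
# neither call needs the full gene-by-gene dissimilarity matrix
km <- kmeans(z, centers = 4, nstart = 10)
cl <- clara(z, k = 4)
```

kmeans minimizes squared Euclidean distance, so on standardized rows it works with 1 - r up to a constant factor. clara uses the unsquared distance sqrt(2 * (1 - r)), which orders pairs of genes the same way but is not the same dissimilarity. Unlike the example above, this keeps the sign of the correlation.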