cluster genes based on expression pattern

Asma rabe ▴ 290

@asma-rabe-4697

Japan

Hi all,

I have normalized microarray (Affy) gene expression data for several time points, and I would like to cluster genes with similar expression patterns. Which package should I use?

Thanks in advance.

Best regards,
Rabe

ADD COMMENT • link updated 13.4 years ago by Moshe Olshansky ▴ 260 • written 13.4 years ago by Asma rabe ▴ 290

Sean Davis 21k

@sean-davis-490

United States

On Thu, Jun 16, 2011 at 7:02 AM, Asma rabe <asma.rabe at gmail.com> wrote:

> Hi all,
>
> I have a normalized microarray (Affy) gene expression data for different
> time points and i would like to cluster genes with similar expression
> pattern ,which package shall i use?

Hi, Rabe.

Try the heatmap() function (assuming you are asking about making a picture). Also, consider searching the email archives. You'll probably want to filter your genes first, as clustering all the data is not likely to be a useful exercise. If, instead, you want to do clustering (and not just make a picture), try help.search('clustering').

Sean

ADD COMMENT • link 13.4 years ago Sean Davis 21k

Djork Clevert ▴ 210

@djork-clevert-422

Hi Rabe,

Try summarizing and filtering your data with the farms package, then biclustering it with the fabia package. This yields pairs of a gene set and a sample set such that the genes are similar to each other across those samples, and the samples are similar to each other across those genes.

Cheers,
Okko

ADD COMMENT • link 13.4 years ago Djork Clevert ▴ 210

Moshe Olshansky ▴ 260

@moshe-olshansky-4491

Hi Rabe,

You can check the timecourse package (Bioconductor). Sean's suggestion to filter genes is always a good idea.

My naive approach would be to define a sensible distance between two genes and use this distance for clustering (one possibility is hclust).

To define a distance, suppose that you have two genes, A and B, and n+1 time points: 0,1,...,n. Let Ai and Bi be the expression levels of genes A and B at time i (i=0,1,...,n). One possibility is just the Lp distance (for a suitable p). Another possibility is to say that we do not care about the absolute abundance but only about how it evolves in time; then we can look at AAi = Ai/A1 and BBi = Bi/B1, i=1,2,...,n, and take some Lp (or other) distance between AA and BB.

These are just some suggestions. You may think of another reasonable distance.

Best regards,
Moshe.

P.S. I would be interested to know what you have done and how well it works (since we also sometimes come across such data, but I never had time to explore this).
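
As a rough illustration of the two distances described in this post, here is a minimal sketch in plain Python (not R, and not the timecourse package). The function names, the toy profiles and the `baseline` parameter are all assumptions made for the example. In particular, the post divides by A1 while numbering time points from 0, so the reference time point is left as a parameter here, defaulting to the first one.

```python
def lp_distance(a, b, p=2.0):
    """Lp (Minkowski) distance between two equal-length expression profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of time points")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def relative_profile(profile, baseline=0):
    """Divide every time point by the value at the chosen baseline time point.

    The baseline index is an assumption of this sketch, not fixed by the post.
    """
    ref = profile[baseline]
    if ref == 0:
        raise ValueError("baseline expression is zero; cannot normalise")
    return [x / ref for x in profile]


def shape_distance(a, b, p=2.0, baseline=0):
    """Lp distance between baseline-normalised profiles (ignores abundance)."""
    return lp_distance(relative_profile(a, baseline),
                       relative_profile(b, baseline), p)


# Hypothetical toy profiles on a linear (not log) scale.
gene_a = [1.0, 2.0, 4.0, 8.0]
gene_b = [10.0, 20.0, 40.0, 80.0]  # same time course as gene_a, 10x abundance
gene_c = [8.0, 4.0, 2.0, 1.0]      # opposite trend

print("plain L2, a vs b:", lp_distance(gene_a, gene_b))
print("shape L2, a vs b:", shape_distance(gene_a, gene_b))
print("shape L2, a vs c:", shape_distance(gene_a, gene_c))
```

The plain Lp distance separates `gene_a` and `gene_b` because of their different abundance, while the normalised version treats them as having the same time course. A full matrix of such pairwise distances is the kind of input a hierarchical clustering routine such as hclust expects.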

ADD COMMENT • link 13.4 years ago Moshe Olshansky ▴ 260

On Fri, Jun 17, 2011 at 10:49:08AM +1000, Moshe Olshansky wrote:

> [...]
> These are just some suggestions. You may think of another reasonable
> distance.

Another good choice is Pearson correlation or the absolute value of Pearson correlation (in that case, anti-correlated genes will cluster with correlated genes).

In our lab we have had good experiences with a network-based approach. In this case one chooses a certain threshold, and only retains node pairs for which the (absolute) Pearson correlation falls above that threshold. It is possible/advisable to vary such a threshold and look at graph statistics such as average node degree and number of singletons to get an idea for an appropriate threshold.

From there on, any graph clustering can be used. We use MCL (developed in our lab, so naturally). With MCL it pays to further transform the data, but I will not elaborate here. Cei Abreu-Goodger and I have written a book chapter on this subject, available for anyone interested.

regards,
Stijn

--
Stijn van Dongen (forename pronunciation: [Stan])
EMBL-EBI, Hinxton, Cambridge, CB10 1SD, UK
Tel: +44-(0)1223-492675
http://micans.org/stijn
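
The thresholding step described in this post can be sketched roughly as follows, in plain Python. This is only an illustration under assumptions: the function names, gene names and toy profiles are invented, and the sketch stops at the thresholded graph and its statistics. It does not implement MCL or the further data transformation mentioned above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant profile: correlation is undefined")
    return sxy / sqrt(sxx * syy)


def correlation_graph(profiles, threshold, use_abs=True):
    """Keep only gene pairs whose (absolute) correlation exceeds the threshold."""
    genes = sorted(profiles)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if (abs(r) if use_abs else r) > threshold:
                edges.add((g, h))
    return edges


def graph_stats(profiles, edges):
    """Return (average node degree, number of singleton genes)."""
    degree = {g: 0 for g in profiles}
    for g, h in edges:
        degree[g] += 1
        degree[h] += 1
    average = sum(degree.values()) / len(degree)
    singletons = sum(1 for d in degree.values() if d == 0)
    return average, singletons


# Hypothetical toy data: four genes, four time points.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with g1
    "g4": [1.0, 3.0, 1.0, 3.0],
}

# Vary the threshold and watch how the graph statistics change.
for threshold in (0.4, 0.5, 0.999):
    edges = correlation_graph(profiles, threshold)
    average, singletons = graph_stats(profiles, edges)
    print(threshold, len(edges), average, singletons)
```

Scanning a range of thresholds this way shows where the graph starts to fall apart into singletons, which is the kind of signal the post suggests using to pick a cutoff. Setting `use_abs=False` keeps anti-correlated genes apart instead of linking them.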

ADD REPLY • link 13.4 years ago Stijn van Dongen ▴ 80

Hi Stijn,

Thank you for your note. Are you computing the Pearson correlation on the data itself or on its logarithm? What is the title of the book you mentioned?

Best regards,
Moshe.

ADD REPLY • link 13.4 years ago Moshe Olshansky ▴ 260
