Hi all,
I have normalized microarray (Affy) gene expression data for different
time points, and I would like to cluster genes with similar expression
patterns. Which package should I use?
Thanks in advance
Best Regards,
Rabe
On Thu, Jun 16, 2011 at 7:02 AM, Asma rabe <asma.rabe at gmail.com> wrote:
> Hi all,
>
> I have a normalized microarray (Affy) gene expression data for different
> time points and i would like to cluster genes with similar expression
> pattern ,which package shall i use?
Hi, Rabe.
Try the heatmap() function (assuming you are asking about making a
picture). Also, consider searching the email archives. You'll
probably want to filter your genes first, as clustering all the data
is not likely to be a useful exercise.
If, instead, you want to do clustering (and not just make a picture),
try help.search('clustering').
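A minimal sketch of the filtering step mentioned above, written in Python purely for illustration (the thread itself is about R). It keeps only the genes whose expression varies most across time points. The variance criterion, the function name top_variable_genes, and the toy values are all assumptions made for this example; nothing in the thread prescribes a particular filter.

```python
from statistics import pvariance


def top_variable_genes(expr, keep):
    """Return the names of the `keep` genes with the largest variance over time."""
    ranked = sorted(expr, key=lambda gene: pvariance(expr[gene]), reverse=True)
    return ranked[:keep]


# Hypothetical log-scale expression values: one list per gene,
# one value per time point.
expr = {
    "geneA": [7.0, 7.1, 6.9, 7.0],  # nearly flat over time
    "geneB": [5.0, 6.5, 8.0, 9.5],  # rising
    "geneC": [9.0, 7.0, 5.5, 4.0],  # falling
}

selected = top_variable_genes(expr, keep=2)
print(selected)
```

The nearly flat gene is dropped, so only genes with a visible time trend go on to clustering.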
Sean
Hi, Rabe.
Try summarizing and filtering your data with the farms package, and
then bicluster it with the fabia package. This finds biclusters: each
is a pair of a gene set and a sample set for which the genes are
similar to each other on those samples, and vice versa.
Cheers,
Okko
--
On 16.06.2011 at 13:02, Asma rabe wrote:
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
Hi Rabe,
You can check the timecourse package (Bioconductor).
Sean's suggestion to filter genes is always a good idea.
My naive approach would be to define a sensible distance between two
genes and use this distance for clustering (one possibility is hclust).
To define a distance, suppose that you have two genes, A and B, and n+1
time points: 0,1,...,n. Let Ai and Bi be the expression levels of genes
A and B at time i (i=0,1,...,n). One possibility is just the Lp distance
(for a suitable p). Another possibility is to say that we do not care
about the absolute abundance but only about how it evolves in time; then
we can look at AAi = Ai/A0 and BBi = Bi/B0, i=1,2,...,n, and take some
Lp (or other) distance between AA and BB.
These are just some suggestions. You may think of another reasonable
distance.
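The two distances described above can be sketched as follows, in Python for illustration only. The helper names lp_distance and relative_profile and the toy profiles are hypothetical. Dividing by the value at the first time point is this sketch's reading of the ratio above, and it assumes unlogged, positive expression values; for log-scale data, subtracting the baseline would be the analogous step.

```python
def lp_distance(x, y, p=2):
    """Lp (Minkowski) distance between two equal-length profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def relative_profile(values):
    """Divide every later time point by the value at the first time point.

    Assumes unlogged, strictly positive expression values.
    """
    baseline = values[0]
    return [v / baseline for v in values[1:]]


# Hypothetical profiles over three time points.
gene_a = [100.0, 200.0, 400.0]  # doubles at each step, high abundance
gene_b = [10.0, 20.0, 40.0]     # doubles at each step, low abundance
gene_c = [100.0, 50.0, 25.0]    # halves at each step

d_raw_ab = lp_distance(gene_a, gene_b)
d_rel_ab = lp_distance(relative_profile(gene_a), relative_profile(gene_b))
d_rel_ac = lp_distance(relative_profile(gene_a), relative_profile(gene_c))
print(d_raw_ab, d_rel_ab, d_rel_ac)
```

On raw values, gene_a and gene_b are far apart because of abundance alone; on relative profiles their distance is zero, while gene_c, which moves in the opposite direction, stays distant.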
Best regards,
Moshe.
P.S. I would be interested to know what you have done and how well it
works (since we also sometimes come across such data but I never had
time to explore this).
On Fri, Jun 17, 2011 at 10:49:08AM +1000, Moshe Olshansky wrote:
> These are just some suggestions. You may think of another reasonable
> distance.
Another good choice is Pearson correlation or the absolute value
of Pearson correlation (in that case, anti-correlated genes
will cluster with correlated genes).
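A small sketch of turning Pearson correlation into a distance, in Python for illustration only. Using 1 - r and 1 - |r| as the distance is one common convention and an assumption here; the function names and toy profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def correlation_distance(x, y, absolute=False):
    """1 - r, or 1 - |r| when anti-correlated genes should count as similar."""
    r = pearson(x, y)
    return 1.0 - (abs(r) if absolute else r)


# Hypothetical profiles: one rising, one falling in a mirrored way.
up = [1.0, 2.0, 3.0, 4.0]
down = [8.0, 6.0, 4.0, 2.0]

d_signed = correlation_distance(up, down)
d_abs = correlation_distance(up, down, absolute=True)
print(d_signed, d_abs)
```

With the signed version the two mirrored profiles are as far apart as possible; with the absolute version they are at distance zero and would end up in the same cluster.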
In our lab we have had good experiences with a network-based approach.
In this case one chooses a certain threshold, and only retains node pairs
for which the (absolute) Pearson correlation falls above that threshold.
It is possible/advisable to vary such a threshold and look at graph
statistics such as average node degree and number of singletons to
get an idea for an appropriate threshold.
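The threshold scan described above can be sketched as follows, in Python for illustration only. This is not the lab's actual pipeline: the function names, the toy profiles, and the candidate thresholds are hypothetical, and the graph is a plain adjacency dictionary rather than any particular tool's input format.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def threshold_graph(profiles, threshold):
    """Keep only gene pairs whose absolute correlation exceeds the threshold."""
    adjacency = {gene: set() for gene in profiles}
    for g, h in combinations(profiles, 2):
        if abs(pearson(profiles[g], profiles[h])) > threshold:
            adjacency[g].add(h)
            adjacency[h].add(g)
    return adjacency


def graph_stats(adjacency):
    """Average node degree and number of nodes without any neighbour."""
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    return {
        "avg_degree": sum(degrees) / len(degrees),
        "singletons": degrees.count(0),
    }


# Hypothetical expression profiles over four time points.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 5.9, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, 3.0, 1.0, 3.0],
}

stats_by_threshold = {
    t: graph_stats(threshold_graph(profiles, t)) for t in (0.4, 0.7, 0.95)
}
for t, stats in stats_by_threshold.items():
    print(t, stats)
```

As the threshold rises, weakly correlated genes lose their edges and become singletons, which is the kind of signal the scan is meant to expose. The resulting graph could then be handed to a graph clustering method.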
From there on, any graph clustering can be used. We use MCL (developed
in our lab, so naturally). With MCL it pays to further transform the
data, but I will not elaborate here. Cei Abreu-Goodger and I have written
a book chapter on this subject, available for anyone interested.
regards,
Stijn
--
Stijn van Dongen (forename pronunciation: [Stan])
EMBL-EBI, Tel: +44-(0)1223-492675
Hinxton, Cambridge, CB10 1SD, UK, http://micans.org/stijn
Hi Stijn,
Thank you for your note.
Are you doing Pearson correlation on the data itself or on its
logarithm?
What is the title of the book you mentioned?
Best regards,
Moshe.