Gene Selection

Heike Pospisil ▴ 310

@heike-pospisil-1097

Last seen 10.5 years ago

Dear users,

I am (nearly) a BioC beginner and hope someone can help me with my first analysis. I am looking for methods to select discriminating genes from a couple of CEL files using the following metrics: t-statistics, chi-square, Wilkins' and correlation-based feature selection. I would be glad to get some hints or links to tutorials.

Thanks in advance,
Heike

--
Dr. Heike Pospisil
Center for Bioinformatics, University of Hamburg
Bundesstrasse 43, 20146 Hamburg, Germany
phone: +49-40-42838-7303 fax: +49-40-42838-7312

ADD COMMENT • link updated 20.0 years ago by Arne.Muller@sanofi-aventis.com ▴ 210 • written 20.1 years ago by Heike Pospisil ▴ 310

Lin Tang ▴ 20

@lin-tang-1088

Last seen 10.5 years ago

Hi,

I am wondering whether I can run R code from the shell. Say I have a file.R; can I run it like this:

    R file.R

under the shell?

Thanks,
Lin

ADD COMMENT • link 20.1 years ago Lin Tang ▴ 20

Hello,

I think you can use the command below:

    R CMD BATCH file.R

See the R help for the other options available when running R from the command line.

Regards,
Gustavo

On Thu, 03 Feb 2005 10:44:19 -0500, Lin Tang <lintang@jhmi.edu> wrote:
> Hi,
> I am wondering can I run r codes in shell. Say I have a file.R , can I
> run it like:
>
> >R file.R
>
> under the shell?
>
> Thanks,
> Lin

--
"The truth you speak has no past and no future. It is, and that's all it needs to be."
Author unknown

Gustavo Henrique Esteves
e-mail: gesteves@gmail.com
Home Page: http://www.vision.ime.usp.br/~gesteves/
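As a minimal sketch of what such a script might look like, here is a hypothetical file.R written to run without any interactive input. The script contents, the optional command-line argument and the variable names are assumptions for illustration; only the `R CMD BATCH file.R` invocation comes from the reply above.

```r
# file.R: hypothetical script intended for non-interactive use.
# From the shell:  R CMD BATCH file.R   (output is written to file.Rout)
# In later R versions, `Rscript file.R` is an alternative that prints to the terminal.

# Extra command-line arguments, if any (empty when none are given).
args <- commandArgs(trailingOnly = TRUE)

# Use the first argument as the sample size, falling back to 10.
n <- suppressWarnings(as.integer(args[1]))
if (is.na(n) || n < 1L) n <- 10L

set.seed(1)                      # reproducible random numbers
x <- rnorm(n)                    # stand-in for a real computation
result <- c(n = length(x), mean = mean(x))
print(result)
```

Everything the script prints ends up in the batch output file, so explicit `print()` calls are the simplest way to record results.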

ADD REPLY • link 20.1 years ago Gustavo Henrique Esteves ▴ 60

Lin Tang wrote:
> Hi,
> I am wondering can I run r codes in shell. Say I have a file.R , can I
> run it like:
>
> >R file.R
>
> under the shell?

As you might imagine, this has been asked and answered many, many times on the R-help list (note too that this question is particular to R rather than BioC, so you are on the incorrect list as well). A quick R site search returned many hits, one of which was this one:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/0768.html

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109

ADD REPLY • link 20.1 years ago James W. MacDonald 68k

Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 3.8 years ago

United States

Dear Dr. Pospisil,

I am sure someone would be happy to assist you, but we need more information.

How many treatments (conditions, types of tissue, genotype, or whatever)? What is the objective of the study: differential expression? gene expression clustering? predicting tissue type?

--Naomi Altman

At 10:06 AM 2/3/2005, Heike Pospisil wrote:
>Dear users,
>
>I am (nearly) a BioC beginner and hope someone could help me with my first
>analysis.
>I am looking for methods to select discriminating genes from a couple of
>cel-files using the following metrics: T-statistics, chi-square, Wilkins'
>and correlation-based feature selection. I would be glad to get some hints
>or links to some tutorials.
>
>Thanks in advance,
>Heike

Naomi S. Altman                      814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                  814-863-7114 (fax)
Penn State University                814-865-1348 (Statistics)
University Park, PA 16802-2111

ADD COMMENT • link 20.1 years ago Naomi Altman ★ 6.0k

Dear Naomi (and Stephen),

thanks for your replies. Sorry for the little information I gave in my last email.

I have 79 CEL files. Each chip is classified according to three different criteria (categories). For each category, there are at least 3 subclasses:

              Cat.A           Cat.B           Cat.C
    1.CEL     g               l               n
    2.CEL     n               0               r
    3.CEL     r               n               l
    ...
    79.CEL    n               r               0
              ------------    ------------    ------------
              3 subclasses    4 subclasses    4 subclasses
              n,g,r           l,0,n,r         l,0,n,r

For the first analysis, I only need to select differentially expressed genes for one category.

I read some tutorials and could reproduce these analyses, but I am not sure what the right strategy is for me (limma or multtest or a simple t-test or whatever).

Thanks for your help and best wishes,
Heike
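A design like the one described in this post could be held in R as a small sample-annotation table, sketched below. The object name `pheno` and the column name `file` are assumptions; only the four chips listed in the post are filled in, because the remaining rows are elided there.

```r
# Hypothetical sample annotation: one row per CEL file, one factor per category.
# Levels are taken from the sub-class lists given in the post.
pheno <- data.frame(
  file  = c("1.CEL", "2.CEL", "3.CEL", "79.CEL"),
  Cat.A = factor(c("g", "n", "r", "n"), levels = c("n", "g", "r")),
  Cat.B = factor(c("l", "0", "n", "r"), levels = c("l", "0", "n", "r")),
  Cat.C = factor(c("n", "r", "l", "0"), levels = c("l", "0", "n", "r"))
)

# Number of sub-classes per category, and chips per sub-class of Cat.A.
n_levels <- sapply(pheno[c("Cat.A", "Cat.B", "Cat.C")], nlevels)
print(n_levels)
print(table(pheno$Cat.A))
```

Keeping each category as a factor with explicit levels means that a sub-class with no chips in a subset still shows up with a count of zero.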

ADD REPLY • link 20.1 years ago Heike Pospisil ▴ 310

Arne.Muller@sanofi-aventis.com ▴ 210

@arnemullersanofi-aventiscom-1086

Last seen 10.5 years ago

Dear Heike,

please correct me if I got it wrong: the experiment is a factorial design with factor 2 being nested within factor 1, i.e.

1. the "category" with three levels (category 1 to 3)
2. nested within each level of the above factor there is another factor (sub-categories) with 3 to 4 levels.

What do you mean by "select differential expressed genes for one category"? I see two choices:

1. Is there an overall difference between the three main categories?
2. Within each category, are all sub-categories the same in terms of gene expression, or is there a (any) difference?

Is that what you are looking for?

Kind regards,
Arne

> -----Original Message-----
> From: bioconductor-bounces@stat.math.ethz.ch
> [mailto:bioconductor-bounces@stat.math.ethz.ch] On Behalf Of Heike Pospisil
> Sent: 04 February 2005 16:28
> To: Naomi Altman
> Cc: bioconductor@stat.math.ethz.ch
> Subject: Re: [BioC] Gene Selection

ADD COMMENT • link 20.1 years ago Arne.Muller@sanofi-aventis.com ▴ 210

Dear Arne,

thanks for your reply.

> please correct me if I got it wrong: The experiment is a factorial design with factor 2 being nested within factor 1, i.e.
>
> 1. the "category" with three levels (category 1 to 3)
> 2. nested within each level of the above factor there is another factor (sub-categories) with 3 to 4 levels.

That is exactly the design I have.

> What do you mean by "select differential expressed genes for one category"?

First, I want to select those genes discriminating between the 3 to 4 sub-categories within one category (e.g. "Which genes are significantly differentially expressed [up or down] for sub-category 'g' and not for all others?").

> I see two choices:
>
> 1. is there an overall difference between the three main categories
> 2. Within each category, are all sub-categories the same in terms of gene expression or is there a (any) difference?

The second question is the one I am looking for.

Thanks for your help and best wishes,
Heike
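The "sub-category 'g' versus all others" question can be sketched with plain per-gene t-tests, one of the options mentioned earlier in the thread. This is only an illustration on simulated data: the matrix `expr`, the labels `catA`, the spiked-in shift and the 0.05 cut-off are all assumptions, and a moderated approach such as limma may well be preferable on real chips.

```r
set.seed(42)
n_genes <- 200
n_chips <- 79

# Simulated log-scale expression matrix (genes x chips); not real data.
expr <- matrix(rnorm(n_genes * n_chips), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Simulated sub-category labels for one category (level names from the post).
catA <- factor(sample(c("n", "g", "r"), n_chips, replace = TRUE))
is_g <- catA == "g"

# Spike a shift into the first 10 genes for sub-category 'g' only.
expr[1:10, is_g] <- expr[1:10, is_g] + 2

# Welch t-test per gene: 'g' chips versus all remaining chips.
pvals <- apply(expr, 1, function(y) t.test(y[is_g], y[!is_g])$p.value)

# Benjamini-Hochberg adjustment for testing many genes at once.
padj <- p.adjust(pvals, method = "BH")
selected <- names(padj)[padj < 0.05]
print(head(selected))
```

Adjusting for multiple testing matters here because, with thousands of probe sets on a chip, unadjusted p-values below 0.05 are expected for many genes by chance alone.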

ADD REPLY • link 20.0 years ago Heike Pospisil ▴ 310

Arne.Muller@sanofi-aventis.com ▴ 210

@arnemullersanofi-aventiscom-1086

Last seen 10.5 years ago

> First, I want to select those genes discriminating between the 3 to 4
> sub-categories within one category. (e.g. "Which genes are significantly
> differentially expressed [up or down] for sub-category 'g' and not for
> all others?")

You should use a linear model for this, maybe limma, for which you'd need to set up a proper model matrix and contrasts. I can only give you some hints for the standard "poor man's" linear models in R. Look at

http://cran.r-project.org/doc/manuals/R-intro.html#Formulae-for-statistical-models

for coding complex models in R.

For your purpose you may want to have a look at a nested model (sub-categories are nested within categories):

    fit <- lm(Intensity ~ category + subcategory %in% category, data=x)

summary(fit) gives you the estimates and p-values, and anova(fit) tells you whether there are overall differences in category:subcategory. This also compares the categories with each other. You can use the estimates to calculate fold changes using the predict function for a fit (it gives you predicted values, i.e. intensities, for the model, and you can use those to calculate ratios).

If you're interested in specific comparisons you need to construct contrasts, e.g.

    contrasts(x$subcategory) <- contr.treatment(levels(x$subcategory), base=1)

see ?contrasts

Kind regards,
Arne
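The nested fit described in this reply can be tried on a toy data set for a single gene, as sketched below. The names `x`, `Intensity`, `category` and `subcategory` follow the post; the level names, replicate count and simulated effects are invented, and `category/subcategory` is used as R's shorthand for `category + category:subcategory`.

```r
set.seed(7)

# Simulated data for ONE gene: 3 categories x 3 sub-categories x 4 replicates.
x <- expand.grid(rep = 1:4,
                 subcategory = c("s1", "s2", "s3"),
                 category = c("A", "B", "C"))
x$category <- factor(x$category)
x$subcategory <- factor(x$subcategory)

# Invented effects: category B is shifted, and s2 differs within category A.
x$Intensity <- 8 +
  0.5 * (x$category == "B") +
  1.0 * (x$category == "A" & x$subcategory == "s2") +
  rnorm(nrow(x), sd = 0.3)

# Nested model: sub-categories within categories.
fit <- lm(Intensity ~ category/subcategory, data = x)
tab <- anova(fit)   # sequential F-tests for category and category:subcategory
print(tab)

# Predicted intensities for two sub-categories of category A and their difference.
new <- data.frame(
  category = factor(c("A", "A"), levels = levels(x$category)),
  subcategory = factor(c("s1", "s2"), levels = levels(x$subcategory))
)
pred <- predict(fit, newdata = new)
print(diff(pred))   # estimated s2 minus s1 within category A
```

If the intensities are on a log2 scale, as is usual after normalization, the difference of two predicted values is a log2 fold change rather than a ratio. On real data the same fit would have to be repeated for every gene, which is the part limma automates.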

ADD COMMENT • link 20.0 years ago Arne.Muller@sanofi-aventis.com ▴ 210
