Re : Filtering of data

Guest User ★ 13k

@guest-user-4897

Last seen 10.2 years ago

I have normalized my dataset using the Bioconductor affy package, but I am now having trouble filtering the normalized dataset and do not know how to do it. Please help me resolve this problem.

-- output of sessionInfo():

I don't know the code for it

--
Sent via the guest posting facility at bioconductor.org.

Normalization Normalization • 1.4k views

ADD COMMENT • link updated 12.4 years ago by James W. MacDonald 67k • written 12.4 years ago by Guest User ★ 13k

fire1976 wyoming ▴ 380

@fire1976-wyoming-324

Last seen 10.2 years ago

Hi Gordon and LIMMA users,

I am sure this question has been answered before, and I tried looking through the archives for an answer but didn't have any success there.

My experimental design has blood from diseased and healthy volunteers treated with a drug. I have gene expression data for both before and after treatment. So I have disease, treatment and patient_ID (before vs. after treatment) as covariates. What I am interested in is as follows:

1. What genes change in untreated disease vs. untreated healthy volunteers?
2. What genes change in treated disease vs. untreated disease blood samples?
3. What genes change in treated healthy volunteers vs. untreated healthy volunteers blood samples?

Design of the experiment:

    design <- model.matrix(~ dis + tx + patient)

Based on the above design I am able to answer question 1. I was wondering how I would answer questions 2 and 3 in a paired t-test (to account for before vs. after treatment). Do I need to set up some contrasts? I have been trying to work from the lmFit output.

Any help would be greatly appreciated.

Thanks,
Som.

ADD COMMENT • link 12.4 years ago fire1976 wyoming ▴ 380

Your design matrix is not sufficient to answer questions 2 and 3. Your questions presume an interaction between treatment and disease, i.e., distinct treatment effects for diseased and healthy subjects, whereas your model formula assumes no interaction.

You need:

    design <- model.matrix(~patient + dis + dis:tx)

Then the last two coefficients answer questions 2 and 3.

Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
http://www.wehi.edu.au
http://www.statsci.org/smyth

In reply to somnath bandyopadhyay, Tue, 3 Jul 2012.
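A minimal sketch of the suggested design in base R may help show which columns "the last two coefficients" refers to. The sample sheet, patient labels and factor level names below are hypothetical and chosen only for illustration; the limma calls are left as comments and assume limma is installed and that `eset` holds the normalized expression data.

```r
# Hypothetical sample sheet: 4 patients, each measured before and after treatment
targets <- data.frame(
  patient = factor(rep(c("P1", "P2", "P3", "P4"), each = 2)),
  dis     = factor(rep(c("healthy", "disease"), each = 4),
                   levels = c("healthy", "disease")),
  tx      = factor(rep(c("untreated", "treated"), times = 4),
                   levels = c("untreated", "treated"))
)

# Patient blocking plus a separate treatment effect within each disease group
design <- model.matrix(~ patient + dis + dis:tx, data = targets)
colnames(design)

# The last two columns are the treatment effects nested within disease status
tx_coefs <- tail(colnames(design), 2)
tx_coefs

# Possible use with limma (not run here; 'eset' is a hypothetical object):
# library(limma)
# fit <- eBayes(lmFit(eset, design))
# topTable(fit, coef = "disdisease:txtreated")  # question 2
# topTable(fit, coef = "dishealthy:txtreated")  # question 3
```

In this toy layout each patient belongs to one disease group, so the `disdisease` column is a linear combination of the patient columns (7 columns, rank 6). The two interaction columns are not affected by this.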

ADD REPLY • link 12.4 years ago Gordon Smyth 52k

Hi Gordon,

Thanks for your suggestion. That helped a lot!

I had one more question: if the patient-to-patient variability is too large, would you recommend doing a Welch's t-test? Is there a way to do it in limma using the same linear model (~patient + dis + dis:tx)?

Thanks,
Som.

In reply to Gordon Smyth, Wed, 4 Jul 2012 (Subject: Re: LIMMA paired T-test).

ADD REPLY • link 12.4 years ago fire1976 wyoming ▴ 380

Dear Som,

I certainly do not recommend Welch's t-test. Your limma analysis is already fully adjusting for patient variability, and Welch's test has nothing to do with patient-to-patient variability anyway.

Best wishes
Gordon

In reply to somnath bandyopadhyay, Fri, 6 Jul 2012.

ADD REPLY • link 12.4 years ago Gordon Smyth 52k

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 2 days ago

United States

Hi Deeksha,

Each Bioconductor package has at least one vignette that should help you get started. For instance, genefilter has this one:

http://bioconductor.org/packages/2.10/bioc/vignettes/genefilter/inst/doc/howtogenefilter.pdf

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099

In reply to Deeksha [guest], 7/3/2012 3:30 PM.

ADD COMMENT • link 12.4 years ago James W. MacDonald 67k

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 2 days ago

United States

Please don't take things off-list. We like to think of the list archives as a useful repository of information, and taking threads off-list defeats that purpose.

On 7/3/2012 4:18 PM, Deeksha Malhan wrote:
> thanx but how to convert csv into expressionset format /?

Why are your data in csv format? You said that you used the affy package to normalize, so at some point you must have had an ExpressionSet.

Anyway, there is no requirement for your data to be in an ExpressionSet. If you look at the help page for genefilter(), you will see that you can pass a matrix as well.

At this point I should warn you that the amount of help you will receive on this list is correlated with the apparent amount of work you have done yourself. We are quite helpful to those who seem to really be stuck, less so to those who simply want someone to tell them what to do at each step.

Best,

Jim
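To illustrate the point about working from a plain matrix, here is a rough base R sketch that reads a CSV of normalized values into a numeric matrix and applies a simple expression filter. The simulated data, the file layout (probe IDs in the first column) and the thresholds `A` and `p` are assumptions made for the example, not recommendations. The genefilter call in the trailing comment assumes the package is installed and is not run here.

```r
set.seed(1)

# Hypothetical stand-in for a normalized, log2-scale expression table
expr <- matrix(rnorm(1000 * 6, mean = 7, sd = 1), nrow = 1000,
               dimnames = list(paste0("probe", 1:1000),
                               paste0("sample", 1:6)))
csv_file <- tempfile(fileext = ".csv")
write.csv(expr, csv_file)

# Read the CSV back as a numeric matrix, using the first column as row names
mat <- as.matrix(read.csv(csv_file, row.names = 1, check.names = FALSE))

# Keep probes whose value exceeds A in at least a proportion p of samples
A <- 7    # illustrative log2 intensity cutoff
p <- 0.5  # illustrative proportion of samples
keep <- rowMeans(mat > A) >= p
filtered <- mat[keep, , drop = FALSE]
dim(filtered)

# A comparable filter with genefilter, passing the matrix directly:
# library(genefilter)
# keep <- genefilter(mat, filterfun(pOverA(p = p, A = A)))
```

The same logical vector `keep` can be used to subset an ExpressionSet, should one be available, since probes are rows in both cases.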

ADD COMMENT • link 12.4 years ago James W. MacDonald 67k

Deeksha Malhan ▴ 10

@deeksha-malhan-5377

Last seen 9.9 years ago

Germany

I have normalized a rice dataset obtained from NCBI GEO, but I am not sure how to filter it using genefilter. Please help me resolve this issue.

Thanks in advance.

ADD COMMENT • link 12.4 years ago Deeksha Malhan ▴ 10
