edgeR: glmLRT test
1
0
Entering edit mode
KJ Lim ▴ 420
@kj-lim-5288
Last seen 4.2 years ago
Finland
Dear edgeR community, Good day. I have sets of RNA-Seq data of 2 phenotype (HighS, LowS) plants with area TZ & SW (control). I used exactTest to study which genes that are differential expressed in area TZ compare to SW of each phenotype. I would like to know do the phenotypes have effect on the area, in this case I should use glmLRT test. Before fit into the GLM, I defined my design matrix as: ~phenotype+area Could someone please advice me regarding the design matrix? Should I define my design matrix as ~area+phenotype ? Thank you very much for your time and help. Have a nice weekend. Best regards, KJ Lim [[alternative HTML version deleted]]
edgeR edgeR • 1.7k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 2 days ago
United States
Hi KJ, The short answer is that it doesn't matter, contingent upon you having set up the experiment correctly. Have you tried constructing the design matrix each way? > phenotype <- factor(rep(1:2, each=8)) > area <- factor(rep(1:2, times = 8)) > colnames(model.matrix(~area+phenotype)) [1] "(Intercept)" "area2" "phenotype2" > colnames(model.matrix(~phenotype+area)) [1] "(Intercept)" "phenotype2" "area2" The interpretation of each coefficient will be the same in both instances. Best, Jim On 3/22/2013 9:57 AM, KJ Lim wrote: > Dear edgeR community, > > Good day. > > I have sets of RNA-Seq data of 2 phenotype (HighS, LowS) plants with area > TZ& SW (control). I used exactTest to study which genes that are > differential expressed in area TZ compare to SW of each phenotype. > > I would like to know do the phenotypes have effect on the area, in this > case I should use glmLRT test. Before fit into the GLM, I defined my design > matrix as: > > ~phenotype+area > > Could someone please advice me regarding the design matrix? Should I define > my design matrix as ~area+phenotype ? > > Thank you very much for your time and help. > > Have a nice weekend. > > Best regards, > KJ Lim > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD COMMENT
0
Entering edit mode
Dear Jim, Thanks for your replied. I used read.delim to read my data into R session. The target object is: files phenotype area 1 H3TZ.txt HS TZ 2 H3SW.txt HS SW 3 H4TZ.txt HS TZ 4 H4SW.txt HS SW 5 L2TZ.txt LS TZ 6 L2SW.txt LS SW 7 L3TZ.txt LS TZ 8 L3SW.txt LS SW I built the factors as: > phenotype <- factor(targets$phenotype) > area <- factor(ts.targets$area, levels=c("TZ","SW")) I want to know do the phenotypes (HS & LS) have effect on the area. I built the design matrix in each way and I found the colnames is not as you have suggested. > colnames(model.matrix(~phenotype+area)) [1] "(Intercept)" "phenotypeL" "areaTZ" > colnames(model.matrix(~area+phenotype)) [1] "(Intercept)" "areaTZ" "phenotypeL" Perhaps, I made mistakes in between? Thanks for your time. Have a nice weekend. Best regards, KJ Lim On 22 March 2013 16:17, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi KJ, > > The short answer is that it doesn't matter, contingent upon you having set > up the experiment correctly. Have you tried constructing the design matrix > each way? > > > phenotype <- factor(rep(1:2, each=8)) > > area <- factor(rep(1:2, times = 8)) > > colnames(model.matrix(~area+**phenotype)) > [1] "(Intercept)" "area2" "phenotype2" > > colnames(model.matrix(~**phenotype+area)) > [1] "(Intercept)" "phenotype2" "area2" > > The interpretation of each coefficient will be the same in both instances. > > Best, > > Jim > > > > > On 3/22/2013 9:57 AM, KJ Lim wrote: > >> Dear edgeR community, >> >> Good day. >> >> I have sets of RNA-Seq data of 2 phenotype (HighS, LowS) plants with area >> TZ& SW (control). I used exactTest to study which genes that are >> >> differential expressed in area TZ compare to SW of each phenotype. >> >> I would like to know do the phenotypes have effect on the area, in this >> case I should use glmLRT test. Before fit into the GLM, I defined my >> design >> matrix as: >> >> ~phenotype+area >> >> Could someone please advice me regarding the design matrix? Should I >> define >> my design matrix as ~area+phenotype ? >> >> Thank you very much for your time and help. >> >> Have a nice weekend. >> >> Best regards, >> KJ Lim >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hi KJ, On 3/22/2013 11:17 AM, KJ Lim wrote: > Dear Jim, > > Thanks for your replied. > > I used read.delim to read my data into R session. The target object is: > > files phenotype area > 1 H3TZ.txt HS TZ > 2 H3SW.txt HS SW > 3 H4TZ.txt HS TZ > 4 H4SW.txt HS SW > 5 L2TZ.txt LS TZ > 6 L2SW.txt LS SW > 7 L3TZ.txt LS TZ > 8 L3SW.txt LS SW > > I built the factors as: > > > phenotype <- factor(targets$phenotype) > > area <- factor(ts.targets$area, levels=c("TZ","SW")) > > I want to know do the phenotypes (HS & LS) have effect on the area. I > built the design matrix in each way and I found the colnames is not as > you have suggested. Right. But that isn't the point I was trying to make (that the colnames would be something in particular). Instead, I was pointing out that the coefficients being estimated will be the same regardless, and the only difference is the order in the design matrix. > > > colnames(model.matrix(~phenotype+area)) > [1] "(Intercept)" "phenotypeL" "areaTZ" > > colnames(model.matrix(~area+phenotype)) > [1] "(Intercept)" "areaTZ" "phenotypeL" > > Perhaps, I made mistakes in between? No, you illustrated my point as well. The two coefficients of note here are areaTZ and phenotypeL, and the only difference between the design matrices is the order that they appear. But you make a further statement above 'I want to know do the phenotypes (HS & LS) have effect on the area.'. This looks to me like you want the interaction term, which captures the difference in phenotype, given the area (or conversely, the differences in area, given phenotype). I believe there is at least one example of an interaction term in the limma User's Guide that you can look at to gain understanding. Best, Jim > > Thanks for your time. Have a nice weekend. > > Best regards, > KJ Lim > > On 22 March 2013 16:17, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> wrote: > > Hi KJ, > > The short answer is that it doesn't matter, contingent upon you > having set up the experiment correctly. Have you tried > constructing the design matrix each way? > > > phenotype <- factor(rep(1:2, each=8)) > > area <- factor(rep(1:2, times = 8)) > > colnames(model.matrix(~area+phenotype)) > [1] "(Intercept)" "area2" "phenotype2" > > colnames(model.matrix(~phenotype+area)) > [1] "(Intercept)" "phenotype2" "area2" > > The interpretation of each coefficient will be the same in both > instances. > > Best, > > Jim > > > > > On 3/22/2013 9:57 AM, KJ Lim wrote: > > Dear edgeR community, > > Good day. > > I have sets of RNA-Seq data of 2 phenotype (HighS, LowS) > plants with area > TZ& SW (control). I used exactTest to study which genes that > are > > differential expressed in area TZ compare to SW of each phenotype. > > I would like to know do the phenotypes have effect on the > area, in this > case I should use glmLRT test. Before fit into the GLM, I > defined my design > matrix as: > > ~phenotype+area > > Could someone please advice me regarding the design matrix? > Should I define > my design matrix as ~area+phenotype ? > > Thank you very much for your time and help. > > Have a nice weekend. > > Best regards, > KJ Lim > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD REPLY
0
Entering edit mode
Dear Jim, Thanks for your prompt relied and suggestion. I will have a look on the Limma User's Guide. Thank you. Best regards, KJ Lim On 22 March 2013 17:28, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi KJ, > > > On 3/22/2013 11:17 AM, KJ Lim wrote: > >> Dear Jim, >> >> Thanks for your replied. >> >> I used read.delim to read my data into R session. The target object is: >> >> files phenotype area >> 1 H3TZ.txt HS TZ >> 2 H3SW.txt HS SW >> 3 H4TZ.txt HS TZ >> 4 H4SW.txt HS SW >> 5 L2TZ.txt LS TZ >> 6 L2SW.txt LS SW >> 7 L3TZ.txt LS TZ >> 8 L3SW.txt LS SW >> >> I built the factors as: >> >> > phenotype <- factor(targets$phenotype) >> > area <- factor(ts.targets$area, levels=c("TZ","SW")) >> >> I want to know do the phenotypes (HS & LS) have effect on the area. I >> built the design matrix in each way and I found the colnames is not as you >> have suggested. >> > > Right. But that isn't the point I was trying to make (that the colnames > would be something in particular). Instead, I was pointing out that the > coefficients being estimated will be the same regardless, and the only > difference is the order in the design matrix. > > > >> > colnames(model.matrix(~**phenotype+area)) >> [1] "(Intercept)" "phenotypeL" "areaTZ" >> > colnames(model.matrix(~area+**phenotype)) >> [1] "(Intercept)" "areaTZ" "phenotypeL" >> >> Perhaps, I made mistakes in between? >> > > No, you illustrated my point as well. The two coefficients of note here > are areaTZ and phenotypeL, and the only difference between the design > matrices is the order that they appear. > > But you make a further statement above 'I want to know do the phenotypes > (HS & LS) have effect on the area.'. This looks to me like you want the > interaction term, which captures the difference in phenotype, given the > area (or conversely, the differences in area, given phenotype). > > I believe there is at least one example of an interaction term in the > limma User's Guide that you can look at to gain understanding. > > Best, > > Jim > > > >> Thanks for your time. Have a nice weekend. >> >> Best regards, >> KJ Lim >> >> On 22 March 2013 16:17, James W. MacDonald <jmacdon@uw.edu <mailto:="">> jmacdon@uw.edu>> wrote: >> >> Hi KJ, >> >> The short answer is that it doesn't matter, contingent upon you >> having set up the experiment correctly. Have you tried >> constructing the design matrix each way? >> >> > phenotype <- factor(rep(1:2, each=8)) >> > area <- factor(rep(1:2, times = 8)) >> > colnames(model.matrix(~area+**phenotype)) >> [1] "(Intercept)" "area2" "phenotype2" >> > colnames(model.matrix(~**phenotype+area)) >> [1] "(Intercept)" "phenotype2" "area2" >> >> The interpretation of each coefficient will be the same in both >> instances. >> >> Best, >> >> Jim >> >> >> >> >> On 3/22/2013 9:57 AM, KJ Lim wrote: >> >> Dear edgeR community, >> >> Good day. >> >> I have sets of RNA-Seq data of 2 phenotype (HighS, LowS) >> plants with area >> TZ& SW (control). I used exactTest to study which genes that >> are >> >> differential expressed in area TZ compare to SW of each phenotype. >> >> I would like to know do the phenotypes have effect on the >> area, in this >> case I should use glmLRT test. Before fit into the GLM, I >> defined my design >> matrix as: >> >> ~phenotype+area >> >> Could someone please advice me regarding the design matrix? >> Should I define >> my design matrix as ~area+phenotype ? >> >> Thank you very much for your time and help. >> >> Have a nice weekend. >> >> Best regards, >> KJ Lim >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >> > >> >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: >> http://news.gmane.org/gmane.**science.biology.informatics.** >> conductor<http: news.gmane.org="" gmane.science.biology.informatics.c="" onductor=""> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Dear Jim and community, I read the Limma User's Guide and read again the edgeR User's Guide. I believe the section 3.5 "Comparison both Between and Within Subjects" in the edgeR User's Guide is suitable to answer my question: Do phenotypes HS & LS have effect on the area (SW is control)? I constructed the design matrix as: > Phenotype <- factor(targets$phenotype, levels=c("HS", "LS")) > Area <- factor(targets$area, levels=c("SW","TZ")) > Tree <- gl(2,2, length=8) > design <- model.matrix(~Phenotype+Phenotype:Tree+Phenotype:Area) > colnames(design) [1] "(Intercept)" "PhenotypeLS" "PhenotypeHS:Tree2" [4] "PhenotypeLS:Tree2" "PhenotypeHS:AreaTZ" "PhenotypeLS:AreaTZ" Then carried out the Likelihood Ratio Test (LRT): >lrtHvsL <- glmLRT(fit, contrast=c(0,0,0,0,1,-1)) > summary(decideTestsDGE(lrtHvsL)) [,1] -1 1 0 27849 1 0 Based on the LRT result, can I concluded that phenotypes HS & LS have NO effect on the area? If this is not the way, kindly please advice. Thank you very much for your time and help. Have a nice day. Best regards, KJ Lim On 25 March 2013 09:20, KJ Lim <jinkeanlim@gmail.com> wrote: > Dear Jim, > > Thanks for your prompt relied and suggestion. I will have a look on the > Limma User's Guide. > Thank you. > > Best regards, > KJ Lim > > > On 22 March 2013 17:28, James W. MacDonald <jmacdon@uw.edu> wrote: > >> Hi KJ, >> >> >> On 3/22/2013 11:17 AM, KJ Lim wrote: >> >>> Dear Jim, >>> >>> Thanks for your replied. >>> >>> I used read.delim to read my data into R session. The target object is: >>> >>> files phenotype area >>> 1 H3TZ.txt HS TZ >>> 2 H3SW.txt HS SW >>> 3 H4TZ.txt HS TZ >>> 4 H4SW.txt HS SW >>> 5 L2TZ.txt LS TZ >>> 6 L2SW.txt LS SW >>> 7 L3TZ.txt LS TZ >>> 8 L3SW.txt LS SW >>> >>> I built the factors as: >>> >>> > phenotype <- factor(targets$phenotype) >>> > area <- factor(ts.targets$area, levels=c("TZ","SW")) >>> >>> I want to know do the phenotypes (HS & LS) have effect on the area. I >>> built the design matrix in each way and I found the colnames is not as you >>> have suggested. >>> >> >> Right. But that isn't the point I was trying to make (that the colnames >> would be something in particular). Instead, I was pointing out that the >> coefficients being estimated will be the same regardless, and the only >> difference is the order in the design matrix. >> >> >> >>> > colnames(model.matrix(~phenotype+area)) >>> [1] "(Intercept)" "phenotypeL" "areaTZ" >>> > colnames(model.matrix(~area+phenotype)) >>> [1] "(Intercept)" "areaTZ" "phenotypeL" >>> >>> Perhaps, I made mistakes in between? >>> >> >> No, you illustrated my point as well. The two coefficients of note here >> are areaTZ and phenotypeL, and the only difference between the design >> matrices is the order that they appear. >> >> But you make a further statement above 'I want to know do the phenotypes >> (HS & LS) have effect on the area.'. This looks to me like you want the >> interaction term, which captures the difference in phenotype, given the >> area (or conversely, the differences in area, given phenotype). >> >> I believe there is at least one example of an interaction term in the >> limma User's Guide that you can look at to gain understanding. >> >> Best, >> >> Jim >> >> >> >>> Thanks for your time. Have a nice weekend. >>> >>> Best regards, >>> KJ Lim >>> >>> On 22 March 2013 16:17, James W. MacDonald <jmacdon@uw.edu <mailto:="">>> jmacdon@uw.edu>> wrote: >>> >>> Hi KJ, >>> >>> The short answer is that it doesn't matter, contingent upon you >>> having set up the experiment correctly. Have you tried >>> constructing the design matrix each way? >>> >>> > phenotype <- factor(rep(1:2, each=8)) >>> > area <- factor(rep(1:2, times = 8)) >>> > colnames(model.matrix(~area+phenotype)) >>> [1] "(Intercept)" "area2" "phenotype2" >>> > colnames(model.matrix(~phenotype+area)) >>> [1] "(Intercept)" "phenotype2" "area2" >>> >>> The interpretation of each coefficient will be the same in both >>> instances. >>> >>> Best, >>> >>> Jim >>> >>> >>> >>> >>> On 3/22/2013 9:57 AM, KJ Lim wrote: >>> >>> Dear edgeR community, >>> >>> Good day. >>> >>> I have sets of RNA-Seq data of 2 phenotype (HighS, LowS) >>> plants with area >>> TZ& SW (control). I used exactTest to study which genes that >>> are >>> >>> differential expressed in area TZ compare to SW of each >>> phenotype. >>> >>> I would like to know do the phenotypes have effect on the >>> area, in this >>> case I should use glmLRT test. Before fit into the GLM, I >>> defined my design >>> matrix as: >>> >>> ~phenotype+area >>> >>> Could someone please advice me regarding the design matrix? >>> Should I define >>> my design matrix as ~area+phenotype ? >>> >>> Thank you very much for your time and help. >>> >>> Have a nice weekend. >>> >>> Best regards, >>> KJ Lim >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org <mailto:bioconductor@r-project.org> >>> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >>> >>> -- James W. MacDonald, M.S. >>> Biostatistician >>> University of Washington >>> Environmental and Occupational Health Sciences >>> 4225 Roosevelt Way NE, # 100 >>> Seattle WA 98105-6099 >>> >>> >>> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hi KJ, On 3/26/2013 6:31 AM, KJ Lim wrote: > Dear Jim and community, > > I read the Limma User's Guide and read again the edgeR User's Guide. > > I believe the section 3.5 "Comparison both Between and Within > Subjects" in the edgeR User's Guide is suitable to answer my > question: Do phenotypes HS & LS have effect on the area (SW is control)? > > I constructed the design matrix as: > > > Phenotype <- factor(targets$phenotype, levels=c("HS", "LS")) > > Area <- factor(targets$area, levels=c("SW","TZ")) > > Tree <- gl(2,2, length=8) > > > design <- model.matrix(~Phenotype+Phenotype:Tree+Phenotype:Area) > > > colnames(design) > [1] "(Intercept)" "PhenotypeLS" "PhenotypeHS:Tree2" > [4] "PhenotypeLS:Tree2" "PhenotypeHS:AreaTZ" "PhenotypeLS:AreaTZ" > > Then carried out the Likelihood Ratio Test (LRT): > > >lrtHvsL <- glmLRT(fit, contrast=c(0,0,0,0,1,-1)) > > > summary(decideTestsDGE(lrtHvsL)) > [,1] > -1 1 > 0 27849 > 1 0 > > Based on the LRT result, can I concluded that phenotypes HS & LS have > NO effect on the area? If this is not the way, kindly please advice. This brings us to a pedantic discussion of what a p-value means, what a statistical test is actually testing, etc. This is in general boring for non statisticians, but is a fairly critical point that people should understand. Without getting into the nuances (if interested, there was a recent discussion on Simply Statistics; see point 4 here; http://simplystatistics.org/2013/03/17/sunday-datastatistics-link- roundup-31713/), when you are doing a statistical test you are only trying to reject the null hypothesis, which in this case is that there is no interaction between area and treatment. Failure to reject the null does not imply the converse of proving the null true. There are any number of reasons why you couldn't reject the null here. You might not have enough data, there might be an uncontrolled variable that is messing things up, etc. At this point all you can say is that you have no evidence to support an interaction between treatment and area. Best, Jim > > Thank you very much for your time and help. > > Have a nice day. > > Best regards, > KJ Lim > > > > > On 25 March 2013 09:20, KJ Lim <jinkeanlim at="" gmail.com=""> <mailto:jinkeanlim at="" gmail.com="">> wrote: > > Dear Jim, > > Thanks for your prompt relied and suggestion. I will have a look > on the Limma User's Guide. > Thank you. > > Best regards, > KJ Lim > > > On 22 March 2013 17:28, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> wrote: > > Hi KJ, > > > On 3/22/2013 11:17 AM, KJ Lim wrote: > > Dear Jim, > > Thanks for your replied. > > I used read.delim to read my data into R session. The > target object is: > > files phenotype area > 1 H3TZ.txt HS TZ > 2 H3SW.txt HS SW > 3 H4TZ.txt HS TZ > 4 H4SW.txt HS SW > 5 L2TZ.txt LS TZ > 6 L2SW.txt LS SW > 7 L3TZ.txt LS TZ > 8 L3SW.txt LS SW > > I built the factors as: > > > phenotype <- factor(targets$phenotype) > > area <- factor(ts.targets$area, levels=c("TZ","SW")) > > I want to know do the phenotypes (HS & LS) have effect on > the area. I built the design matrix in each way and I > found the colnames is not as you have suggested. > > > Right. But that isn't the point I was trying to make (that the > colnames would be something in particular). Instead, I was > pointing out that the coefficients being estimated will be the > same regardless, and the only difference is the order in the > design matrix. > > > > > colnames(model.matrix(~phenotype+area)) > [1] "(Intercept)" "phenotypeL" "areaTZ" > > colnames(model.matrix(~area+phenotype)) > [1] "(Intercept)" "areaTZ" "phenotypeL" > > Perhaps, I made mistakes in between? > > > No, you illustrated my point as well. The two coefficients of > note here are areaTZ and phenotypeL, and the only difference > between the design matrices is the order that they appear. > > But you make a further statement above 'I want to know do the > phenotypes (HS & LS) have effect on the area.'. This looks to > me like you want the interaction term, which captures the > difference in phenotype, given the area (or conversely, the > differences in area, given phenotype). > > I believe there is at least one example of an interaction term > in the limma User's Guide that you can look at to gain > understanding. > > Best, > > Jim > > > > Thanks for your time. Have a nice weekend. > > Best regards, > KJ Lim > > On 22 March 2013 16:17, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">>> wrote: > > Hi KJ, > > The short answer is that it doesn't matter, contingent > upon you > having set up the experiment correctly. Have you tried > constructing the design matrix each way? > > > phenotype <- factor(rep(1:2, each=8)) > > area <- factor(rep(1:2, times = 8)) > > colnames(model.matrix(~area+phenotype)) > [1] "(Intercept)" "area2" "phenotype2" > > colnames(model.matrix(~phenotype+area)) > [1] "(Intercept)" "phenotype2" "area2" > > The interpretation of each coefficient will be the > same in both > instances. > > Best, > > Jim > > > > > On 3/22/2013 9:57 AM, KJ Lim wrote: > > Dear edgeR community, > > Good day. > > I have sets of RNA-Seq data of 2 phenotype (HighS, > LowS) > plants with area > TZ& SW (control). I used exactTest to study > which genes that > are > > differential expressed in area TZ compare to SW of > each phenotype. > > I would like to know do the phenotypes have effect > on the > area, in this > case I should use glmLRT test. Before fit into the > GLM, I > defined my design > matrix as: > > ~phenotype+area > > Could someone please advice me regarding the > design matrix? > Should I define > my design matrix as ~area+phenotype ? > > Thank you very much for your time and help. > > Have a nice weekend. > > Best regards, > KJ Lim > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > <mailto:bioconductor at="" r-project.org=""> > <mailto:bioconductor at="" r-project.org=""> <mailto:bioconductor at="" r-project.org="">> > > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD REPLY
0
Entering edit mode
Dear Jim, Good day and thanks for your prompt replied. This brings us to a pedantic discussion of what a p-value means, what a > statistical test is actually testing, etc. This is in general boring for > non statisticians, but is a fairly critical point that people should > understand. > > Without getting into the nuances (if interested, there was a recent > discussion on Simply Statistics; see point 4 here; > http://simplystatistics.org/2013/03/17/sunday-datastatistics-link- roundup-31713/), > when you are doing a statistical test you are only trying to reject the > null hypothesis, which in this case is that there is no interaction between > area and treatment. Failure to reject the null does not imply the converse > of proving the null true. > > There are any number of reasons why you couldn't reject the null here. You > might not have enough data, there might be an uncontrolled variable that is > messing things up, etc. At this point all you can say is that you have no > evidence to support an interaction between treatment and area. > You are right about that. I should rephrase my sentence in my previous email. My apologies, I'm not a statistician. Thanks for pointing out statistic fundamental concern. Have a nice day. Best regards, KJ Lim On 26 March 2013 15:24, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi KJ, > > > On 3/26/2013 6:31 AM, KJ Lim wrote: > >> Dear Jim and community, >> >> I read the Limma User's Guide and read again the edgeR User's Guide. >> >> I believe the section 3.5 "Comparison both Between and Within Subjects" >> in the edgeR User's Guide is suitable to answer my question: Do phenotypes >> HS & LS have effect on the area (SW is control)? >> >> I constructed the design matrix as: >> >> > Phenotype <- factor(targets$phenotype, levels=c("HS", "LS")) >> > Area <- factor(targets$area, levels=c("SW","TZ")) >> > Tree <- gl(2,2, length=8) >> >> > design <- model.matrix(~Phenotype+**Phenotype:Tree+Phenotype:Area) >> >> > colnames(design) >> [1] "(Intercept)" "PhenotypeLS" "PhenotypeHS:Tree2" >> [4] "PhenotypeLS:Tree2" "PhenotypeHS:AreaTZ" "PhenotypeLS:AreaTZ" >> >> Then carried out the Likelihood Ratio Test (LRT): >> >> >lrtHvsL <- glmLRT(fit, contrast=c(0,0,0,0,1,-1)) >> >> > summary(decideTestsDGE(**lrtHvsL)) >> [,1] >> -1 1 >> 0 27849 >> 1 0 >> >> Based on the LRT result, can I concluded that phenotypes HS & LS have NO >> effect on the area? If this is not the way, kindly please advice. >> > > This brings us to a pedantic discussion of what a p-value means, what a > statistical test is actually testing, etc. This is in general boring for > non statisticians, but is a fairly critical point that people should > understand. > > Without getting into the nuances (if interested, there was a recent > discussion on Simply Statistics; see point 4 here; > http://simplystatistics.org/**2013/03/17/sunday-** > datastatistics-link- roundup-**31713/<http: simplystatistics.org="" 2013="" 03="" 17="" sunday-="" datastatistics-link-roundup-31713=""/>), > when you are doing a statistical test you are only trying to reject the > null hypothesis, which in this case is that there is no interaction between > area and treatment. Failure to reject the null does not imply the converse > of proving the null true. > > There are any number of reasons why you couldn't reject the null here. You > might not have enough data, there might be an uncontrolled variable that is > messing things up, etc. At this point all you can say is that you have no > evidence to support an interaction between treatment and area. > > Best, > > Jim > > > >> Thank you very much for your time and help. >> >> Have a nice day. >> >> Best regards, >> KJ Lim >> >> >> >> >> On 25 March 2013 09:20, KJ Lim <jinkeanlim@gmail.com <mailto:="">> jinkeanlim@gmail.com>> wrote: >> >> Dear Jim, >> >> Thanks for your prompt relied and suggestion. I will have a look >> on the Limma User's Guide. >> Thank you. >> >> Best regards, >> KJ Lim >> >> >> On 22 March 2013 17:28, James W. MacDonald <jmacdon@uw.edu>> <mailto:jmacdon@uw.edu>> wrote: >> >> Hi KJ, >> >> >> On 3/22/2013 11:17 AM, KJ Lim wrote: >> >> Dear Jim, >> >> Thanks for your replied. >> >> I used read.delim to read my data into R session. The >> target object is: >> >> files phenotype area >> 1 H3TZ.txt HS TZ >> 2 H3SW.txt HS SW >> 3 H4TZ.txt HS TZ >> 4 H4SW.txt HS SW >> 5 L2TZ.txt LS TZ >> 6 L2SW.txt LS SW >> 7 L3TZ.txt LS TZ >> 8 L3SW.txt LS SW >> >> I built the factors as: >> >> > phenotype <- factor(targets$phenotype) >> > area <- factor(ts.targets$area, levels=c("TZ","SW")) >> >> I want to know do the phenotypes (HS & LS) have effect on >> the area. I built the design matrix in each way and I >> found the colnames is not as you have suggested. >> >> >> Right. But that isn't the point I was trying to make (that the >> colnames would be something in particular). Instead, I was >> pointing out that the coefficients being estimated will be the >> same regardless, and the only difference is the order in the >> design matrix. >> >> >> >> > colnames(model.matrix(~**phenotype+area)) >> [1] "(Intercept)" "phenotypeL" "areaTZ" >> > colnames(model.matrix(~area+**phenotype)) >> [1] "(Intercept)" "areaTZ" "phenotypeL" >> >> Perhaps, I made mistakes in between? >> >> >> No, you illustrated my point as well. The two coefficients of >> note here are areaTZ and phenotypeL, and the only difference >> between the design matrices is the order that they appear. >> >> But you make a further statement above 'I want to know do the >> phenotypes (HS & LS) have effect on the area.'. This looks to >> me like you want the interaction term, which captures the >> difference in phenotype, given the area (or conversely, the >> differences in area, given phenotype). >> >> I believe there is at least one example of an interaction term >> in the limma User's Guide that you can look at to gain >> understanding. >> >> Best, >> >> Jim >> >> >> >> Thanks for your time. Have a nice weekend. >> >> Best regards, >> KJ Lim >> >> On 22 March 2013 16:17, James W. MacDonald <jmacdon@uw.edu>> <mailto:jmacdon@uw.edu> <mailto:jmacdon@uw.edu>> >> <mailto:jmacdon@uw.edu>>> wrote: >> >> Hi KJ, >> >> The short answer is that it doesn't matter, contingent >> upon you >> having set up the experiment correctly. Have you tried >> constructing the design matrix each way? >> >> > phenotype <- factor(rep(1:2, each=8)) >> > area <- factor(rep(1:2, times = 8)) >> > colnames(model.matrix(~area+**phenotype)) >> [1] "(Intercept)" "area2" "phenotype2" >> > colnames(model.matrix(~**phenotype+area)) >> [1] "(Intercept)" "phenotype2" "area2" >> >> The interpretation of each coefficient will be the >> same in both >> instances. >> >> Best, >> >> Jim >> >> >> >> >> On 3/22/2013 9:57 AM, KJ Lim wrote: >> >> Dear edgeR community, >> >> Good day. >> >> I have sets of RNA-Seq data of 2 phenotype (HighS, >> LowS) >> plants with area >> TZ& SW (control). I used exactTest to study >> which genes that >> are >> >> differential expressed in area TZ compare to SW of >> each phenotype. >> >> I would like to know do the phenotypes have effect >> on the >> area, in this >> case I should use glmLRT test. Before fit into the >> GLM, I >> defined my design >> matrix as: >> >> ~phenotype+area >> >> Could someone please advice me regarding the >> design matrix? >> Should I define >> my design matrix as ~area+phenotype ? >> >> Thank you very much for your time and help. >> >> Have a nice weekend. >> >> Best regards, >> KJ Lim >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >> > >> <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >> >> <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >> >> >> >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<ht tps:="" stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: >> http://news.gmane.org/gmane.**science.biology.informatics.** >> conductor<http: news.gmane.org="" gmane.science.biology.informatics.c="" onductor=""> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> >> > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 609 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6