Contrast among a subset of interactions, glm in edgeR
2
0
Entering edit mode
@maximo-rivarola-6613
Last seen 10.1 years ago
Hello everyone, I am writing my first email to this list. I have a question in regards to contrasting a few interactions... I have an RNA-seq experiment with the following structure: - 3 cepa (genotypes) - 3 tiempo (time points) - inf (Inoculated or not) Basically, three different genotypes were infected or not (inf) and then 3 time points were take and sequenced. Most samples have 2 biol replicates and some 4 biol replicates. My question is the following: I would like to make a contrast were I compare the interaction of cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc ???? My problem is I do not know where the column for this interaction is??? I placed the 0 as for it not to intercept in the model, but the later factors are always shown from level 2 on... How can I "see" all the interaction from my design??? Below is some code: Thanks in advanced!!! cheers, Design implemented: design7 <- model.matrix(~ 0 + cepa*tiempo*inf + linea) > dim(design7) [1] 56 24 > head(design7) cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4 linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 2 0 1 0 0 0 1 0 0 1 0 0 0 0 0 3 0 1 0 0 0 1 0 0 0 1 0 0 0 0 4 0 1 0 0 0 1 0 0 0 0 1 0 0 0 5 1 0 0 0 0 1 0 0 0 0 0 1 0 0 6 1 0 0 0 0 1 0 0 0 0 0 0 0 0 cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc 1 0 0 1 0 0 0 0 2 0 0 1 0 0 0 0 3 0 0 1 0 0 0 0 4 0 0 1 0 0 0 0 5 0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc cepaRK416:tiempo8:infinoc 1 0 0 0 2 0 0 0 3 0 0 0 4 0 0 0 5 0 0 0 6 0 0 0 > dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate common dispersion # Disp = 0.03873 , BCV = 0.1968 dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise dispersions # $tagwise.dispersion # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947 fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion) > class(fit7) [1] "DGEGLM" attr(,"package") [1] "edgeR" > sessionInfo() R version 3.1.0 (2014-04-10) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] edgeR_3.6.2 limma_3.20.4 thanks a lot! hope is explained well.... -------------------------------------------------------- Maximo Rivarola
• 1.2k views
ADD COMMENT
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 2 days ago
Icahn School of Medicine at Mount Sinaiā€¦
Hi Maximo, I believe that you want to use the following: designformula <- terms(~ 0 + cepa:tiempo:inf + linea, keep.order = TRUE) design7 <- model.matrix(designformula) This will give you one coefficient for each combination of cepa, timepo, and inf. Also, after doing that, you might also want to replace the colons in the column names with periods, to make syntacically valid names, which will make things easier when you start constructing contrasts: colnames(design7) <- make.names(colnames(design7)) -Ryan On Tue Jun 17 09:28:17 2014, Maximo Rivarola wrote: > Hello everyone, > > I am writing my first email to this list. I have a question in regards to contrasting a few interactions... > I have an RNA-seq experiment with the following structure: > - 3 cepa (genotypes) > - 3 tiempo (time points) > - inf (Inoculated or not) > > Basically, three different genotypes were infected or not (inf) and then 3 time points were take and sequenced. Most samples have 2 biol replicates and some 4 biol replicates. > > My question is the following: > I would like to make a contrast were I compare the interaction of cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc ???? > My problem is I do not know where the column for this interaction is??? > I placed the 0 as for it not to intercept in the model, but the later factors are always shown from level 2 on... > How can I "see" all the interaction from my design??? > > Below is some code: > Thanks in advanced!!! cheers, > > Design implemented: > design7 <- model.matrix(~ 0 + cepa*tiempo*inf + linea) > >> dim(design7) > [1] 56 24 > >> head(design7) > cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4 linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4 > 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 > 2 0 1 0 0 0 1 0 0 1 0 0 0 0 0 > 3 0 1 0 0 0 1 0 0 0 1 0 0 0 0 > 4 0 1 0 0 0 1 0 0 0 0 1 0 0 0 > 5 1 0 0 0 0 1 0 0 0 0 0 1 0 0 > 6 1 0 0 0 0 1 0 0 0 0 0 0 0 0 > cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc > 1 0 0 1 0 0 0 0 > 2 0 0 1 0 0 0 0 > 3 0 0 1 0 0 0 0 > 4 0 0 1 0 0 0 0 > 5 0 0 0 0 0 0 0 > 6 0 0 0 0 0 0 0 > cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc cepaRK416:tiempo8:infinoc > 1 0 0 0 > 2 0 0 0 > 3 0 0 0 > 4 0 0 0 > 5 0 0 0 > 6 0 0 0 >> > > dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate common dispersion > # Disp = 0.03873 , BCV = 0.1968 > > dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise dispersions > # $tagwise.dispersion > # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947 > > fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion) > >> class(fit7) > [1] "DGEGLM" > attr(,"package") > [1] "edgeR" > >> sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] edgeR_3.6.2 limma_3.20.4 > > thanks a lot! > hope is explained well.... > > -------------------------------------------------------- > Maximo Rivarola > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
Thanks ryan, I beleive this works!!!! for the interactions i wanted to test! For the contrast between main factors i use the other design? or is there a way from this one? thanks again :) -------------------------------------------------------- Maximo Rivarola PhD. ________________________________________ From: Ryan [rct@thompsonclan.org] Sent: Tuesday, June 17, 2014 1:39 PM To: Maximo Rivarola Cc: bioconductor at r-project.org Subject: Re: [BioC] Contrast among a subset of interactions, glm in edgeR Hi Maximo, I believe that you want to use the following: designformula <- terms(~ 0 + cepa:tiempo:inf + linea, keep.order = TRUE) design7 <- model.matrix(designformula) This will give you one coefficient for each combination of cepa, timepo, and inf. Also, after doing that, you might also want to replace the colons in the column names with periods, to make syntacically valid names, which will make things easier when you start constructing contrasts: colnames(design7) <- make.names(colnames(design7)) -Ryan On Tue Jun 17 09:28:17 2014, Maximo Rivarola wrote: > Hello everyone, > > I am writing my first email to this list. I have a question in regards to contrasting a few interactions... > I have an RNA-seq experiment with the following structure: > - 3 cepa (genotypes) > - 3 tiempo (time points) > - inf (Inoculated or not) > > Basically, three different genotypes were infected or not (inf) and then 3 time points were take and sequenced. Most samples have 2 biol replicates and some 4 biol replicates. > > My question is the following: > I would like to make a contrast were I compare the interaction of cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc ???? > My problem is I do not know where the column for this interaction is??? > I placed the 0 as for it not to intercept in the model, but the later factors are always shown from level 2 on... > How can I "see" all the interaction from my design??? > > Below is some code: > Thanks in advanced!!! cheers, > > Design implemented: > design7 <- model.matrix(~ 0 + cepa*tiempo*inf + linea) > >> dim(design7) > [1] 56 24 > >> head(design7) > cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4 linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4 > 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 > 2 0 1 0 0 0 1 0 0 1 0 0 0 0 0 > 3 0 1 0 0 0 1 0 0 0 1 0 0 0 0 > 4 0 1 0 0 0 1 0 0 0 0 1 0 0 0 > 5 1 0 0 0 0 1 0 0 0 0 0 1 0 0 > 6 1 0 0 0 0 1 0 0 0 0 0 0 0 0 > cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc > 1 0 0 1 0 0 0 0 > 2 0 0 1 0 0 0 0 > 3 0 0 1 0 0 0 0 > 4 0 0 1 0 0 0 0 > 5 0 0 0 0 0 0 0 > 6 0 0 0 0 0 0 0 > cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc cepaRK416:tiempo8:infinoc > 1 0 0 0 > 2 0 0 0 > 3 0 0 0 > 4 0 0 0 > 5 0 0 0 > 6 0 0 0 >> > > dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate common dispersion > # Disp = 0.03873 , BCV = 0.1968 > > dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise dispersions > # $tagwise.dispersion > # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947 > > fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion) > >> class(fit7) > [1] "DGEGLM" > attr(,"package") > [1] "edgeR" > >> sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] edgeR_3.6.2 limma_3.20.4 > > thanks a lot! > hope is explained well.... > > -------------------------------------------------------- > Maximo Rivarola > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
Hi ryan, thanks again, i figured out the last part, i used: designformula <- terms(~ 0 + cepa:tiempo:inf + linea, keep.order = TRUE) designformula <- terms(~ 0 + cepa + tiempo + inf + cepa:tiempo:inf + linea, keep.order = TRUE) design7 <- model.matrix(designformula) rownames(design7) <- sinfo$names colnames(design7) <- make.names(colnames(design7)) design7 dim(design7) thanks!!! -------------------------------------------------------- Maximo Rivarola PhD. Inv. Asistente CONICET. ________________________________________ From: Ryan [rct@thompsonclan.org] Sent: Tuesday, June 17, 2014 1:39 PM To: Maximo Rivarola Cc: bioconductor at r-project.org Subject: Re: [BioC] Contrast among a subset of interactions, glm in edgeR Hi Maximo, I believe that you want to use the following: designformula <- terms(~ 0 + cepa:tiempo:inf + linea, keep.order = TRUE) design7 <- model.matrix(designformula) This will give you one coefficient for each combination of cepa, timepo, and inf. Also, after doing that, you might also want to replace the colons in the column names with periods, to make syntacically valid names, which will make things easier when you start constructing contrasts: colnames(design7) <- make.names(colnames(design7)) -Ryan On Tue Jun 17 09:28:17 2014, Maximo Rivarola wrote: > Hello everyone, > > I am writing my first email to this list. I have a question in regards to contrasting a few interactions... > I have an RNA-seq experiment with the following structure: > - 3 cepa (genotypes) > - 3 tiempo (time points) > - inf (Inoculated or not) > > Basically, three different genotypes were infected or not (inf) and then 3 time points were take and sequenced. Most samples have 2 biol replicates and some 4 biol replicates. > > My question is the following: > I would like to make a contrast were I compare the interaction of cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc ???? > My problem is I do not know where the column for this interaction is??? > I placed the 0 as for it not to intercept in the model, but the later factors are always shown from level 2 on... > How can I "see" all the interaction from my design??? > > Below is some code: > Thanks in advanced!!! cheers, > > Design implemented: > design7 <- model.matrix(~ 0 + cepa*tiempo*inf + linea) > >> dim(design7) > [1] 56 24 > >> head(design7) > cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4 linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4 > 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 > 2 0 1 0 0 0 1 0 0 1 0 0 0 0 0 > 3 0 1 0 0 0 1 0 0 0 1 0 0 0 0 > 4 0 1 0 0 0 1 0 0 0 0 1 0 0 0 > 5 1 0 0 0 0 1 0 0 0 0 0 1 0 0 > 6 1 0 0 0 0 1 0 0 0 0 0 0 0 0 > cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc > 1 0 0 1 0 0 0 0 > 2 0 0 1 0 0 0 0 > 3 0 0 1 0 0 0 0 > 4 0 0 1 0 0 0 0 > 5 0 0 0 0 0 0 0 > 6 0 0 0 0 0 0 0 > cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc cepaRK416:tiempo8:infinoc > 1 0 0 0 > 2 0 0 0 > 3 0 0 0 > 4 0 0 0 > 5 0 0 0 > 6 0 0 0 >> > > dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate common dispersion > # Disp = 0.03873 , BCV = 0.1968 > > dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise dispersions > # $tagwise.dispersion > # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947 > > fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion) > >> class(fit7) > [1] "DGEGLM" > attr(,"package") > [1] "edgeR" > >> sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] edgeR_3.6.2 limma_3.20.4 > > thanks a lot! > hope is explained well.... > > -------------------------------------------------------- > Maximo Rivarola > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
Hi ryan and everyone!, thanks again for your help, I have one more question? One more Error... I get an error when i try to estimateGLMcommondispersion, When i use the design: designformula_er <- terms(~ 0 + cepa + tiempo + inf + cepa:tiempo:inf + linea, keep.order = TRUE) design7_er <- model.matrix(designformula_er) rownames(design7_er) <- sinfo$names colnames(design7_er) <- make.names(colnames(design7_er)) colnames(design7_er) dim(design7_er) #[1] 56 30 Colnames look like this: > colnames(design7_er) [1] "cepaHA853" "cepaHA89" "cepaRK416" "tiempo4" "tiempo8" [6] "infinoc" "cepaHA853.tiempo0.infcontrol" "cepaHA89.tiempo0.infcontrol" "cepaRK416.tiempo0.infcontrol" "cepaHA853.tiempo4.infcontrol" [11] "cepaHA89.tiempo4.infcontrol" "cepaRK416.tiempo4.infcontrol" "cepaHA853.tiempo8.infcontrol" "cepaHA89.tiempo8.infcontrol" "cepaRK416.tiempo8.infcontrol" [16] "cepaHA853.tiempo0.infinoc" "cepaHA89.tiempo0.infinoc" "cepaRK416.tiempo0.infinoc" "cepaHA853.tiempo4.infinoc" "cepaHA89.tiempo4.infinoc" [21] "cepaRK416.tiempo4.infinoc" "cepaHA853.tiempo8.infinoc" "cepaHA89.tiempo8.infinoc" "cepaRK416.tiempo8.infinoc" "linea2" [26] "linea3" "linea4" "linea5" "linea6" "linea7" > I checked every column and there is no interaction for which I have all zeros. Up to there ok, but....... when i try ... My error is when i try to : dge7_er <- estimateGLMCommonDisp(dge1, design7_er, verbose = TRUE) #To estimate common dispersion # Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, : # Design matrix not of full rank. The following coefficients not estimable: # cepaRK416:tiempo8:infcontrol cepaRK416:tiempo0:infinoc cepaRK416:tiempo4:infinoc cepaHA853:tiempo8:infinoc cepaHA89:tiempo8:infinoc cepaRK416:tiempo8:infinoc Does this mean i am trying to estimate "too" many parameters? I can not estimate the main factors? When i use the other design: which i estimate all 3 way interaction and batch effect, but NO main factors: This works!!! designformula <- terms(~ 0 + cepa:tiempo:inf + linea, keep.order = TRUE) design7 <- model.matrix(designformula) dim(design7) # [1] 56 24 dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate common dispersion # Disp = 0.03873 , BCV = 0.1968 All good! What do you think???? thanks a million!!!! great software! cheers, maximo -------------------------------------------------------- Maximo Rivarola PhD. ________________________________________ From: Ryan [rct@thompsonclan.org] Sent: Tuesday, June 17, 2014 1:39 PM To: Maximo Rivarola Cc: bioconductor at r-project.org Subject: Re: [BioC] Contrast among a subset of interactions, glm in edgeR Hi Maximo, I believe that you want to use the following: designformula <- terms(~ 0 + cepa:tiempo:inf + linea, keep.order = TRUE) design7 <- model.matrix(designformula) This will give you one coefficient for each combination of cepa, timepo, and inf. Also, after doing that, you might also want to replace the colons in the column names with periods, to make syntacically valid names, which will make things easier when you start constructing contrasts: colnames(design7) <- make.names(colnames(design7)) -Ryan On Tue Jun 17 09:28:17 2014, Maximo Rivarola wrote: > Hello everyone, > > I am writing my first email to this list. I have a question in regards to contrasting a few interactions... > I have an RNA-seq experiment with the following structure: > - 3 cepa (genotypes) > - 3 tiempo (time points) > - inf (Inoculated or not) > > Basically, three different genotypes were infected or not (inf) and then 3 time points were take and sequenced. Most samples have 2 biol replicates and some 4 biol replicates. > > My question is the following: > I would like to make a contrast were I compare the interaction of cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc ???? > My problem is I do not know where the column for this interaction is??? > I placed the 0 as for it not to intercept in the model, but the later factors are always shown from level 2 on... > How can I "see" all the interaction from my design??? > > Below is some code: > Thanks in advanced!!! cheers, > > Design implemented: > design7 <- model.matrix(~ 0 + cepa*tiempo*inf + linea) > >> dim(design7) > [1] 56 24 > >> head(design7) > cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4 linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4 > 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 > 2 0 1 0 0 0 1 0 0 1 0 0 0 0 0 > 3 0 1 0 0 0 1 0 0 0 1 0 0 0 0 > 4 0 1 0 0 0 1 0 0 0 0 1 0 0 0 > 5 1 0 0 0 0 1 0 0 0 0 0 1 0 0 > 6 1 0 0 0 0 1 0 0 0 0 0 0 0 0 > cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc > 1 0 0 1 0 0 0 0 > 2 0 0 1 0 0 0 0 > 3 0 0 1 0 0 0 0 > 4 0 0 1 0 0 0 0 > 5 0 0 0 0 0 0 0 > 6 0 0 0 0 0 0 0 > cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc cepaRK416:tiempo8:infinoc > 1 0 0 0 > 2 0 0 0 > 3 0 0 0 > 4 0 0 0 > 5 0 0 0 > 6 0 0 0 >> > > dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate common dispersion > # Disp = 0.03873 , BCV = 0.1968 > > dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise dispersions > # $tagwise.dispersion > # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947 > > fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion) > >> class(fit7) > [1] "DGEGLM" > attr(,"package") > [1] "edgeR" > >> sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] edgeR_3.6.2 limma_3.20.4 > > thanks a lot! > hope is explained well.... > > -------------------------------------------------------- > Maximo Rivarola > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 2 hours ago
WEHI, Melbourne, Australia
Dear Maximo, You ask why you don't see "all" the interactions as columns in your design matrix. Someone asks a variation of this question every week or so on Bioconductor. The confusion comes from a mis-understanding of what interactions and contrasts actually are. You say that you want to compare the interaction "cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc", but I am pretty sure that you actually do not want to do that. Do you simply mean that you want to compare time points 8 and 4 for genotype HA853 innoculated? I am guessing that this is what you are actually after. You cannot answer questions of this type by fitting an interaction model. Instead of fitting the complicated and esoteric 3-way factorial model, it would be simpler and more correct to follow the advice of Section 3.3.1 of the edgeR User's Guide. Simply combine the factors cepa, tiempo and inof into a single factor with 18 levels. Fit the model ~0+CombinedFactor, then test the contrast: HA853.4.inoc - HA853.8.inoc This is simple and intuitive as well as being correct. Best wishes Gordon > Maximo Rivarola rivarola.maximo at inta.gob.ar > Tue Jun 17 18:28:17 CEST 2014 > > I am writing my first email to this list. I have a question in regards to > contrasting a few interactions... > I have an RNA-seq experiment with the following structure: > - 3 cepa (genotypes) > - 3 tiempo (time points) > - inf (Inoculated or not) > > Basically, three different genotypes were infected or not (inf) and then 3 > time points were take and sequenced. Most samples have 2 biol replicates and > some 4 biol replicates. > > My question is the following: > I would like to make a contrast were I compare the interaction of > cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc ???? > My problem is I do not know where the column for this interaction is??? > I placed the 0 as for it not to intercept in the model, but the later factors > are always shown from level 2 on... > How can I "see" all the interaction from my design??? > > Below is some code: > Thanks in advanced!!! cheers, > > Design implemented: > design7 <- model.matrix(~ 0 + cepa*tiempo*inf + linea) > >> dim(design7) > [1] 56 24 > >> head(design7) > cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4 > linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4 > 1 0 1 0 0 0 1 0 1 0 0 > 0 0 0 0 > 2 0 1 0 0 0 1 0 0 1 0 > 0 0 0 0 > 3 0 1 0 0 0 1 0 0 0 1 > 0 0 0 0 > 4 0 1 0 0 0 1 0 0 0 0 > 1 0 0 0 > 5 1 0 0 0 0 1 0 0 0 0 > 0 1 0 0 > 6 1 0 0 0 0 1 0 0 0 0 > 0 0 0 0 > cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc > tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc > 1 0 0 1 0 0 > 0 0 > 2 0 0 1 0 0 > 0 0 > 3 0 0 1 0 0 > 0 0 > 4 0 0 1 0 0 > 0 0 > 5 0 0 0 0 0 > 0 0 > 6 0 0 0 0 0 > 0 0 > cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc > cepaRK416:tiempo8:infinoc > 1 0 0 0 > 2 0 0 0 > 3 0 0 0 > 4 0 0 0 > 5 0 0 0 > 6 0 0 0 >> > > dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate > common dispersion > # Disp = 0.03873 , BCV = 0.1968 > > dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise > dispersions > # $tagwise.dispersion > # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947 > > fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion) > >> class(fit7) > [1] "DGEGLM" > attr(,"package") > [1] "edgeR" > >> sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 > LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C > LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] edgeR_3.6.2 limma_3.20.4 > > thanks a lot! > hope is explained well.... > > -------------------------------------------------------- > Maximo Rivarola > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT
0
Entering edit mode
HI Gordon, thanks for your reply, I guess I was not clear in my question, I am by NO means an expert in stats, ... :) as it shows.... I agree with your idea of making the groups and be done... I will do this! I was not sure if that would be correct as of the model and so... I did want to mention that my "idea" was to be able to contrast the following: (cepaHA853t4INOC - cepaHA853t4Control) - (cepa853t8INOC - cepa853t8Control) In my case, time points make a big difference and not so much inoculated, thats why this type of contrasts... I guess i wanted to see the diff from 4day to 8 day inoculated but not genes diff expre just because of time. Thanks again!!!! I will have to take a course in stats again, great program and am learning a lot, cheers, maximo -------------------------------------------------------- Maximo Rivarola PhD. ________________________________________ From: Gordon K Smyth [smyth@wehi.EDU.AU] Sent: Friday, June 20, 2014 5:46 AM To: Maximo Rivarola Cc: Bioconductor mailing list Subject: Re: Contrast among a subset of interactions, glm in edgeR Dear Maximo, You ask why you don't see "all" the interactions as columns in your design matrix. Someone asks a variation of this question every week or so on Bioconductor. The confusion comes from a mis-understanding of what interactions and contrasts actually are. You say that you want to compare the interaction "cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc", but I am pretty sure that you actually do not want to do that. Do you simply mean that you want to compare time points 8 and 4 for genotype HA853 innoculated? I am guessing that this is what you are actually after. You cannot answer questions of this type by fitting an interaction model. Instead of fitting the complicated and esoteric 3-way factorial model, it would be simpler and more correct to follow the advice of Section 3.3.1 of the edgeR User's Guide. Simply combine the factors cepa, tiempo and inof into a single factor with 18 levels. Fit the model ~0+CombinedFactor, then test the contrast: HA853.4.inoc - HA853.8.inoc This is simple and intuitive as well as being correct. Best wishes Gordon > Maximo Rivarola rivarola.maximo at inta.gob.ar > Tue Jun 17 18:28:17 CEST 2014 > > I am writing my first email to this list. I have a question in regards to > contrasting a few interactions... > I have an RNA-seq experiment with the following structure: > - 3 cepa (genotypes) > - 3 tiempo (time points) > - inf (Inoculated or not) > > Basically, three different genotypes were infected or not (inf) and then 3 > time points were take and sequenced. Most samples have 2 biol replicates and > some 4 biol replicates. > > My question is the following: > I would like to make a contrast were I compare the interaction of > cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc ???? > My problem is I do not know where the column for this interaction is??? > I placed the 0 as for it not to intercept in the model, but the later factors > are always shown from level 2 on... > How can I "see" all the interaction from my design??? > > Below is some code: > Thanks in advanced!!! cheers, > > Design implemented: > design7 <- model.matrix(~ 0 + cepa*tiempo*inf + linea) > >> dim(design7) > [1] 56 24 > >> head(design7) > cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4 > linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4 > 1 0 1 0 0 0 1 0 1 0 0 > 0 0 0 0 > 2 0 1 0 0 0 1 0 0 1 0 > 0 0 0 0 > 3 0 1 0 0 0 1 0 0 0 1 > 0 0 0 0 > 4 0 1 0 0 0 1 0 0 0 0 > 1 0 0 0 > 5 1 0 0 0 0 1 0 0 0 0 > 0 1 0 0 > 6 1 0 0 0 0 1 0 0 0 0 > 0 0 0 0 > cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc > tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc > 1 0 0 1 0 0 > 0 0 > 2 0 0 1 0 0 > 0 0 > 3 0 0 1 0 0 > 0 0 > 4 0 0 1 0 0 > 0 0 > 5 0 0 0 0 0 > 0 0 > 6 0 0 0 0 0 > 0 0 > cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc > cepaRK416:tiempo8:infinoc > 1 0 0 0 > 2 0 0 0 > 3 0 0 0 > 4 0 0 0 > 5 0 0 0 > 6 0 0 0 >> > > dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate > common dispersion > # Disp = 0.03873 , BCV = 0.1968 > > dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise > dispersions > # $tagwise.dispersion > # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947 > > fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion) > >> class(fit7) > [1] "DGEGLM" > attr(,"package") > [1] "edgeR" > >> sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 > LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C > LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] edgeR_3.6.2 limma_3.20.4 > > thanks a lot! > hope is explained well.... > > -------------------------------------------------------- > Maximo Rivarola > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:6}}
ADD REPLY
0
Entering edit mode
You can test the contrast you mention easily by the same method I suggested. Gordon On Fri, 20 Jun 2014, Maximo Rivarola wrote: > HI Gordon, thanks for your reply, > I guess I was not clear in my question, > I am by NO means an expert in stats, ... :) as it shows.... > > I agree with your idea of making the groups and be done... I will do this! I was not sure if that would be correct as of the model and so... > I did want to mention that my "idea" was to be able to contrast the following: > > (cepaHA853t4INOC - cepaHA853t4Control) - (cepa853t8INOC - cepa853t8Control) > > In my case, time points make a big difference and not so much inoculated, thats why this type of contrasts... > > I guess i wanted to see the diff from 4day to 8 day inoculated but not genes diff expre just because of time. > Thanks again!!!! > I will have to take a course in stats again, > > great program and am learning a lot, > cheers, > maximo > > > -------------------------------------------------------- > Maximo Rivarola PhD. > ________________________________________ > From: Gordon K Smyth [smyth at wehi.EDU.AU] > Sent: Friday, June 20, 2014 5:46 AM > To: Maximo Rivarola > Cc: Bioconductor mailing list > Subject: Re: Contrast among a subset of interactions, glm in edgeR > > Dear Maximo, > > You ask why you don't see "all" the interactions as columns in your design > matrix. Someone asks a variation of this question every week or so on > Bioconductor. The confusion comes from a mis-understanding of what > interactions and contrasts actually are. > > You say that you want to compare the interaction > "cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc", but I am pretty > sure that you actually do not want to do that. Do you simply mean that > you want to compare time points 8 and 4 for genotype HA853 innoculated? I > am guessing that this is what you are actually after. > > You cannot answer questions of this type by fitting an interaction model. > Instead of fitting the complicated and esoteric 3-way factorial model, it > would be simpler and more correct to follow the advice of Section 3.3.1 of > the edgeR User's Guide. Simply combine the factors cepa, tiempo and inof > into a single factor with 18 levels. Fit the model ~0+CombinedFactor, > then test the contrast: > > HA853.4.inoc - HA853.8.inoc > > This is simple and intuitive as well as being correct. > > Best wishes > Gordon > > >> Maximo Rivarola rivarola.maximo at inta.gob.ar >> Tue Jun 17 18:28:17 CEST 2014 >> >> I am writing my first email to this list. I have a question in regards to >> contrasting a few interactions... >> I have an RNA-seq experiment with the following structure: >> - 3 cepa (genotypes) >> - 3 tiempo (time points) >> - inf (Inoculated or not) >> >> Basically, three different genotypes were infected or not (inf) and then 3 >> time points were take and sequenced. Most samples have 2 biol replicates and >> some 4 biol replicates. >> >> My question is the following: >> I would like to make a contrast were I compare the interaction of >> cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc ???? >> My problem is I do not know where the column for this interaction is??? >> I placed the 0 as for it not to intercept in the model, but the later factors >> are always shown from level 2 on... >> How can I "see" all the interaction from my design??? >> >> Below is some code: >> Thanks in advanced!!! cheers, >> >> Design implemented: >> design7 <- model.matrix(~ 0 + cepa*tiempo*inf + linea) >> >>> dim(design7) >> [1] 56 24 >> >>> head(design7) >> cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4 >> linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4 >> 1 0 1 0 0 0 1 0 1 0 0 >> 0 0 0 0 >> 2 0 1 0 0 0 1 0 0 1 0 >> 0 0 0 0 >> 3 0 1 0 0 0 1 0 0 0 1 >> 0 0 0 0 >> 4 0 1 0 0 0 1 0 0 0 0 >> 1 0 0 0 >> 5 1 0 0 0 0 1 0 0 0 0 >> 0 1 0 0 >> 6 1 0 0 0 0 1 0 0 0 0 >> 0 0 0 0 >> cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc >> tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc >> 1 0 0 1 0 0 >> 0 0 >> 2 0 0 1 0 0 >> 0 0 >> 3 0 0 1 0 0 >> 0 0 >> 4 0 0 1 0 0 >> 0 0 >> 5 0 0 0 0 0 >> 0 0 >> 6 0 0 0 0 0 >> 0 0 >> cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc >> cepaRK416:tiempo8:infinoc >> 1 0 0 0 >> 2 0 0 0 >> 3 0 0 0 >> 4 0 0 0 >> 5 0 0 0 >> 6 0 0 0 >>> >> >> dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE) #To estimate >> common dispersion >> # Disp = 0.03873 , BCV = 0.1968 >> >> dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise >> dispersions >> # $tagwise.dispersion >> # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947 >> >> fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion) >> >>> class(fit7) >> [1] "DGEGLM" >> attr(,"package") >> [1] "edgeR" >> >>> sessionInfo() >> R version 3.1.0 (2014-04-10) >> Platform: x86_64-pc-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 >> LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 >> [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C >> LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] edgeR_3.6.2 limma_3.20.4 >> >> thanks a lot! >> hope is explained well.... >> >> -------------------------------------------------------- >> Maximo Rivarola >> > > ______________________________________________________________________ > The information in this email is confidential and inte...{{dropped:9}}
ADD REPLY

Login before adding your answer.

Traffic: 273 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6