TCC::ERROR: Need the design matrix for GLM
@gordon-smyth
WEHI, Melbourne, Australia
Dear Pankaj,

It seems as if you are just using the TCC package to call methods from the edgeR package indirectly.

Why not use the edgeR package directly? That would probably be easier and you would have a more direct understanding of the methods being used. Your experiment is almost identical to the oral carcinoma case study in the edgeR User's Guide.

Best wishes
Gordon

> Date: Tue, 15 Apr 2014 13:51:17 +0000
> From: Pankaj Agarwal <p.agarwal at duke.edu>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Cc: "kadota at bi.a.u-tokyo.ac.jp" <kadota at bi.a.u-tokyo.ac.jp>
> Subject: [BioC] TCC::ERROR: Need the design matrix for GLM.
>
> Hi,
>
> I have RNA-seq data consisting of matched tumor/normal samples from two patients. For normalization of the counts I am following the steps in the TCC vignette section "3.3 Normalization of two-group count data without replicates (paired)". The output from the commands is as follows:
>
>> data=read.delim("count_bt2_iGenomes_Ensembl.tsv")
>
>> head(data)
>                 A.sorted.bam B.sorted.bam
> ENSG00000000003         2400         1130
> ENSG00000000005            2            3
> ENSG00000000419         1819          575
> ENSG00000000457         1317         1262
> ENSG00000000460          799         1743
> ENSG00000000938          203           41
>                 C.sorted.bam D.sorted.bam
> ENSG00000000003           12           72
> ENSG00000000005            0            0
> ENSG00000000419          938         1608
> ENSG00000000457          821         1469
> ENSG00000000460          367          800
> ENSG00000000938        33303        16355
>
>> group <- c(1,1,2,2)
>> pair <- c(1,2,1,2)
>> c1 <- data.frame(group=group, pair=pair)
>> colnames(data) <- c("T_BRPC13.1118", "T_BRPC_13.764", "N_DU04_PBMC", "N_DU06_PBMC")
>> tcc <- new("TCC", data, c1)
>> tcc <- calcNormFactors(tcc, norm.method="tmm", test.method="edger", iteration=1, FDR=0.1, floorPDEG=0.05, paired=TRUE)
> TCC::INFO: Calculating normalization factors using DEGES
> TCC::INFO: (iDEGES pipeline : tmm - [ edger - tmm ] X 1 )
> Error in .testByEdger.3(design = design, coef = coef, contrast = contrast) :
>   TCC::ERROR: Need the design matrix for GLM.
>
> Reading further for the steps needed for edgeR without TCC, I saw something related to a design matrix and tried it, but got the same error:
>
>> design <- model.matrix(~ group + pair)
>> tcc <- new("TCC", data, c1)
>> tcc <- calcNormFactors(tcc, norm.method="tmm", test.method="edger", iteration=1, FDR=0.1, floorPDEG=0.05, paired=TRUE)
> TCC::INFO: Calculating normalization factors using DEGES
> TCC::INFO: (iDEGES pipeline : tmm - [ edger - tmm ] X 1 )
> Error in .testByEdger.3(design = design, coef = coef, contrast = contrast) :
>   TCC::ERROR: Need the design matrix for GLM.
>
> I would appreciate help with understanding the cause of the error.
>
> The output from sessionInfo() and the package description is as follows:
>
>> sessionInfo()
> R version 3.0.3 (2014-03-06)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
>> packageDescription("TCC")
> Package: TCC
> Type: Package
> Title: TCC: Differential expression analysis for tag count data with
>         robust normalization strategies
> Version: 1.2.0
> Author: Jianqiang Sun, Tomoaki Nishiyama, Kentaro Shimizu, and Koji
>         Kadota
> Maintainer: Jianqiang Sun <wukong at bi.a.u-tokyo.ac.jp>, Tomoaki
>         Nishiyama <tomoakin at staff.kanazawa-u.ac.jp>
> Description: This package provides a series of functions for performing
>         differential expression analysis from RNA-seq count data using
>         robust normalization strategy (called DEGES). The basic idea of
>         DEGES is that potential differentially expressed genes or
>         transcripts (DEGs) among compared samples should be removed
>         before data normalization to obtain a well-ranked gene list
>         where true DEGs are top-ranked and non-DEGs are bottom ranked.
>         This can be done by performing a multi-step normalization
>         strategy (called DEGES for DEG elimination strategy). A major
>         characteristic of TCC is to provide the robust normalization
>         methods for several kinds of count data (two-group with or
>         without replicates, multi-group/multi-factor, and so on) by
>         virtue of the use of combinations of functions in other
>         sophisticated packages (especially edgeR, DESeq, and baySeq).
> Depends: R (>= 2.15), methods, DESeq, edgeR, baySeq, ROC
> Imports: EBSeq, samr
> Suggests: RUnit, BiocGenerics
> Enhances: snow
> biocViews: HighThroughputSequencing, DifferentialExpression, RNAseq
> License: GPL-2
> Copyright: Authors listed above
> Packaged: 2013-10-15 05:31:33 UTC; biocbuild
> Built: R 3.0.3; ; 2014-03-31 20:00:49 UTC; unix
>
> -- File: /general/installs/R/R-3.0.3/lib64/R/library/TCC/Meta/package.rds
>
> Thank you,
>
> - Pankaj
> --------------------------------------
> Pankaj Agarwal, M.S
> Bioinformatician
> Bioinformatics Shared Resource
> Duke Cancer Institute
> Duke University
> 919-681-6573
> p.agarwal at duke.edu
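For readers following the suggestion to use edgeR directly, here is a minimal sketch of how a paired design for this layout might be set up. It is an illustration under stated assumptions, not the poster's analysis: the `group` and `pair` values are copied from the post, but they are converted to factors here (the post passes plain numeric vectors to `model.matrix`), the counts are simulated placeholders, and the sample labels `T1`, `T2`, `N1`, `N2` are made up. The edgeR part runs only if edgeR is installed.

```r
# Sample annotation copied from the post: libraries 1-2 are group 1,
# libraries 3-4 are group 2; 'pair' identifies the patient.
# Assumption: both are treated as factors rather than numeric covariates.
group <- factor(c(1, 1, 2, 2))
pair  <- factor(c(1, 2, 1, 2))

# Additive paired design: patient is the blocking factor and the group
# coefficient is the effect of interest.
design <- model.matrix(~ pair + group)
print(design)

# Optional: the usual edgeR GLM steps on SIMULATED counts (placeholders,
# not the poster's data). Skipped when edgeR is not available.
if (requireNamespace("edgeR", quietly = TRUE)) {
  set.seed(1)
  counts <- matrix(rnbinom(4 * 2000, mu = 200, size = 10), ncol = 4,
                   dimnames = list(paste0("gene", 1:2000),
                                   c("T1", "T2", "N1", "N2")))
  y   <- edgeR::DGEList(counts = counts, group = group)
  y   <- edgeR::calcNormFactors(y)             # TMM is the default method
  y   <- edgeR::estimateDisp(y, design)        # dispersion estimation
  fit <- edgeR::glmFit(y, design)
  lrt <- edgeR::glmLRT(fit, coef = "group2")   # group 2 vs group 1 within patient
  print(edgeR::topTags(lrt, n = 5))
}
```

With two patients and two groups this design has three columns for four libraries, so only one residual degree of freedom is left for estimating dispersion. Note also that in the post the `design` object is created but never passed to any function call.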
DifferentialExpression Normalization Cancer edgeR baySeq DESeq EBSeq TCC
Koji Kadota
@koji-kadota-6504
Dear Gordon,

I am the corresponding author of the TCC paper. What Pankaj wants to do is not the same as the default procedure in edgeR. As explicitly described in the TCC paper, the differentially expressed gene elimination strategy (DEGES) implemented in TCC is important for obtaining more accurate DE results. Please read the original paper:

http://www.biomedcentral.com/1471-2105/14/219

Koji

P.S. Dear Sun, please send this mail again to the Bioconductor mailing list if I could not ..., thanks in advance.

------------------------------------------
Koji Kadota, Ph.D., Associate Professor
Agricultural Bioinformatics Research Unit,
Graduate School of Agricultural and Life Sciences,
The University of Tokyo
1-1-1, Yayoi, Bunkyo-ku Tokyo, 113-8657, JAPAN
E-mail: kadota at iu.a.u-tokyo.ac.jp
Web: http://www.iu.a.u-tokyo.ac.jp/~kadota
------------------------------------------

> -----Original Message-----
> From: Gordon K Smyth [mailto:smyth at wehi.EDU.AU]
> Sent: Friday, April 18, 2014 9:36 AM
> To: Pankaj Agarwal
> Cc: Bioconductor mailing list; kadota at bi.a.u-tokyo.ac.jp
> Subject: TCC::ERROR: Need the design matrix for GLM
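To make the DEG-elimination idea concrete, the following toy sketch in base R shows why removing candidate DEGs before computing scaling factors can matter. It is not TCC's implementation and does not use TMM or an edgeR test: total-count scaling and a simple ranking by absolute log fold change stand in for both, and the count matrix, the 20% cut-off and all variable names are invented for illustration.

```r
# Toy data: 100 genes, two samples. Genes 1-20 are truly 10-fold up in
# sample B; genes 21-100 are unchanged. (Invented numbers.)
counts <- cbind(A = rep(100, 100),
                B = c(rep(1000, 20), rep(100, 80)))
rownames(counts) <- paste0("gene", 1:100)

# Step 1: naive normalization by total library size.
lib  <- colSums(counts)
cpm1 <- t(t(counts) / lib) * 1e6
lfc1 <- log2(cpm1[, "B"] / cpm1[, "A"])

# Step 2: flag the 20 genes with the largest |log fold change| as
# candidate DEGs (a crude stand-in for a statistical test).
n_deg     <- 20
candidate <- order(abs(lfc1), decreasing = TRUE)[seq_len(n_deg)]

# Step 3: recompute the scaling factors from the remaining genes only,
# then apply them to all genes.
eff_lib <- colSums(counts[-candidate, ])
cpm2    <- t(t(counts) / eff_lib) * 1e6
lfc2    <- log2(cpm2[, "B"] / cpm2[, "A"])

# Unchanged genes look down-regulated after step 1 but not after step 3.
print(range(lfc1[21:100]))
print(range(lfc2[21:100]))
```

In this toy case the up-regulated genes inflate the library size of sample B, so total-count scaling makes every unchanged gene appear lower in B. Once the candidate DEGs are set aside, the scaling factors are computed from unchanged genes alone and that artificial shift disappears.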