samr - extract genes from siggenes.table

Assa Yeroslaviz ★ 1.5k

@assa-yeroslaviz-1597

Last seen 12 weeks ago

Germany

Hi BioC users,

I have a problem extracting the gene set I would like to work with. Here is how I work with my data:

    library(samr)
    normData <- read.delim("normalizedData.txt", sep = "\t")
    ######### two class unpaired comparison
    # y must take values 1,2
    classes <- c(-1, -2, 1, 2)
    # prepare the data for the samr analysis
    data.x <- as.matrix(normData[, 8:11])
    d <- list(x = data.x, y = classes,
              geneid = as.character(normData[, 1]),
              genenames = as.character(normData[, 1]),
              logged2 = TRUE)
    samr.obj <- samr(d, resp.type = "Two class paired", nperms = 100)
    delta.table <- samr.compute.delta.table(samr.obj)
    delta <- 0.4
    siggenes.table <- samr.compute.siggenes.table(samr.obj, delta, d, delta.table,
                                                  min.foldchange = 2)
    genes.up <- as.data.frame(siggenes.table$genes.up)
    genes.down <- as.data.frame(siggenes.table$genes.lo)

The data set I am working with has four columns from two experiments. When I run the samr.compute.siggenes.table command I get:

    > str(siggenes.table)
    List of 5
     $ genes.up           : chr [1:9769, 1:8] "6587" "865" "22929" "10172" ...
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : NULL
      .. ..$ : chr [1:8] "Row" "Gene ID" "Gene Name" "Score(d)" ...
     $ genes.lo           : chr [1:10788, 1:8] "10836" "22277" "1243" "10509" ...
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : NULL
      .. ..$ : chr [1:8] "Row" "Gene ID" "Gene Name" "Score(d)" ...
     $ color.ind.for.multi: NULL
     $ ngenes.up          : int 9769
     $ ngenes.lo          : int 10788

So I guess I have 9769 up-regulated and 10788 down-regulated genes. The problem is that not all of them are above 2-fold:

    > head(siggenes.table$genes.up)
         Row     Gene ID           Gene Name         Score(d)
    [1,] "6587"  "NM_001142426_at" "NM_001142426_at" "670.084615384572"
    [2,] "865"   "NM_000946_at"    "NM_000946_at"    "581.731543624152"
    [3,] "22929" "NM_147134_at"    "NM_147134_at"    "469.481132075439"
    [4,] "10172" "NM_003640_at"    "NM_003640_at"    "296.630872483217"
    [5,] "10956" "NM_004484_at"    "NM_004484_at"    "284.233163028334"
    [6,] "28444" "XM_001125699_at" "XM_001125699_at" "281.629310344832"
         Numerator(r) Denominator(s+s0)   Fold Change
    [1,] "435.555"    "0.650000000000041" "1.30352619041372e+131"
    [2,] "433.39"     "0.745000000000012" "2.90663046260321e+130"
    [3,] "248.825"    "0.530000000000037" "8.01288059495468e+74"
    [4,] "220.99"     "0.745000000000012" "3.34671508906627e+66"
    [5,] "3059.77"    "10.7649999999999"  "Inf"
    [6,] "163.345"    "0.579999999999991" "1.48506219251034e+49"
         q-value(%)
    [1,] "0"
    [2,] "0"
    [3,] "0"
    [4,] "1.95405681104834"
    [5,] "1.95405681104834"
    [6,] "1.95405681104834"

What do I need the parameter min.foldchange for, if it does not filter out genes with a fold change lower than 2?

I would also like to know how to extract a subset of a matrix from inside a list. My siggenes.table is a list with two matrices inside, and I would like to filter these matrices for genes with a 2-fold up- or down-regulation.

Thanks,
Assa

    > sessionInfo()
    R version 2.12.0 (2010-10-15)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] samr_1.28     impute_1.24.0

    loaded via a namespace (and not attached):
    [1] tools_2.12.0
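For the subsetting part of the question, a minimal sketch may help. It assumes the two matrices are character matrices with a "Fold Change" column, as the str() and head() output above suggests. The toy list `siggenes_toy`, its values, and the helper `filter_by_fold_change` are hypothetical and exist only for illustration; the cut-offs (at least 2 for up, at most 0.5 for down) are one possible reading of "2-fold".

```r
# Toy stand-in for siggenes.table: two character matrices inside a list.
# All values are made up for illustration.
cols <- c("Row", "Gene ID", "Gene Name", "Score(d)",
          "Numerator(r)", "Denominator(s+s0)", "Fold Change", "q-value(%)")
toy_up <- matrix(
  c("1", "geneA_at", "geneA_at", "5.1", "1.8", "0.35", "3.48", "0",
    "2", "geneB_at", "geneB_at", "4.2", "0.6", "0.14", "1.52", "0",
    "3", "geneC_at", "geneC_at", "3.9", "1.2", "0.31", "2.30", "1.9"),
  ncol = 8, byrow = TRUE, dimnames = list(NULL, cols)
)
toy_lo <- matrix(
  c("4", "geneD_at", "geneD_at", "-4.8", "-1.7", "0.35", "0.31", "0",
    "5", "geneE_at", "geneE_at", "-3.1", "-0.5", "0.16", "0.72", "1.9"),
  ncol = 8, byrow = TRUE, dimnames = list(NULL, cols)
)
siggenes_toy <- list(genes.up = toy_up, genes.lo = toy_lo)

# Hypothetical helper: keep the rows whose fold change passes `keep`.
# The column is character, so it is converted to numeric first;
# which() drops rows where the conversion gave NA.
filter_by_fold_change <- function(tab, keep) {
  fc <- as.numeric(tab[, "Fold Change"])
  tab[which(keep(fc)), , drop = FALSE]
}

# A list element is reached with $ (or [["name"]]), then subset as a matrix.
up2 <- filter_by_fold_change(siggenes_toy$genes.up, function(fc) fc >= 2)
lo2 <- filter_by_fold_change(siggenes_toy$genes.lo, function(fc) fc <= 0.5)

print(up2[, c("Gene ID", "Fold Change"), drop = FALSE])
print(lo2[, c("Gene ID", "Fold Change"), drop = FALSE])
```

The `drop = FALSE` keeps the result a matrix even when only one row survives the filter, so later code that expects columns by name still works.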

siggenes siggenes • 2.3k views

updated 14.2 years ago by Manca Marco PATH ▴ 340 • written 14.2 years ago by Assa Yeroslaviz ★ 1.5k

Manca Marco PATH ▴ 340

@manca-marco-path-4295

Last seen 10.7 years ago

I am not sure about this... I think they have a logFC larger than 2; you are simply seeing them in scientific notation.

Why don't you try setting the option scipen (the penalty for scientific notation) "on the high side"? Something like:

    > options(scipen = 50)

should be more than enough. Anyway, most likely other people can give you better hints.

All the best,
Marco

--
Marco Manca, MD
University of Maastricht, Faculty of Health, Medicine and Life Sciences (FHML), Cardiovascular Research Institute (CARIM)

(In reply to: Assa Yeroslaviz, Wednesday 9 February 2011, 17:35, sent to bioconductor and R help forum. Subject: [BioC] samr - extract genes from siggenes.table)
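As a small illustration of what scipen changes, here is a sketch using base R only. It assumes nothing about samr; the string `fc_chr` is copied from the table in the question. Note that the entries of the siggenes matrices are character strings, so the option only matters once they are converted with as.numeric().

```r
# Remember the current setting so it can be restored afterwards.
old <- options(scipen = 0)

# With the default penalty, R prefers the shorter scientific form.
default_fmt <- format(1e5)

# A large positive penalty makes R prefer fixed notation here.
options(scipen = 50)
fixed_fmt <- format(1e5)

# Restore the previous setting.
options(old)

print(default_fmt)
print(fixed_fmt)

# A fold change as stored in the matrix: a character string, not a number.
fc_chr <- "1.30352619041372e+131"
# Printing a string is unaffected by scipen; comparing needs a conversion.
print(as.numeric(fc_chr) > 2)
```

Whichever notation is displayed, the stored value is the same, so changing scipen does not change which genes pass a fold-change cut-off.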

14.2 years ago • Manca Marco PATH ▴ 340

Isn't 100,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 greater than 2?

Do you really want to penalize scientific notation so much that 1e+131 displays like that?

Do you really believe there are 1e131 times more transcripts in one sample than the other? The number of atoms in the known universe is estimated at only 1e80.

(In reply to: Manca Marco (PATH), Wednesday, February 09, 2011 11:54 AM, sent to Assa Yeroslaviz, bioconductor and R help forum. Subject: [R] R: [BioC] samr - extract genes from siggenes.table)
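One way to see where such numbers could come from is to compare the two columns of the posted table directly. The sketch below only uses values copied from the head() output in the question, and checks one assumption: that the reported fold change equals 2 raised to the Numerator(r) value. If that holds, it would be consistent with logged2=TRUE being applied to data that are not actually on the log2 scale, but that is an inference from these six rows, not something stated in the thread.

```r
# Numerator(r) values copied from the head() output in the question.
numerator <- c(435.555, 433.39, 248.825, 220.99, 3059.77, 163.345)

# Fold Change values reported next to them in the same table.
reported_fc <- c(1.30352619041372e+131, 2.90663046260321e+130,
                 8.01288059495468e+74, 3.34671508906627e+66,
                 Inf, 1.48506219251034e+49)

# Assumption being checked: fold change = 2^numerator.
implied_fc <- 2^numerator

comparison <- data.frame(numerator, reported_fc, implied_fc)
print(comparison)

# Relative difference on the rows where the reported value is finite.
finite <- is.finite(reported_fc)
rel_diff <- abs(implied_fc[finite] - reported_fc[finite]) / reported_fc[finite]
print(max(rel_diff))

# 2^3059.77 is beyond the range of a double, which would explain the "Inf".
print(is.infinite(implied_fc[5]))
```

A difference of 435 between two paired measurements is plausible for raw intensities but not for log2 values, so it may be worth checking whether normalizedData.txt was log-transformed before setting logged2=TRUE.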

14.2 years ago • rex.dwyer@syngenta.com ▴ 10
