SPIA package problem
@january-weiner-3999
Hello,

I am trying to use SPIA on some mouse results from an mgug4122a Agilent microarray. In summary, I have two vectors, DE_gr_iii and ALL_gr_iii (created following the SPIA vignette, see below).

> class( DE_gr_iii )
[1] "numeric"
> class( ALL_gr_iii )
[1] "character"
> names( DE_gr_iii ) <- ALL_gr_iii
> DE_gr_iii[1:10]
      12808       78369       71897      241568      102075       27273
 0.15805260  0.75696349 -0.02208268 -0.53025986 -0.09489560  0.16656121
      20321       57435       18010       18010
-0.13020754 -0.19411325 -0.02297658 -0.03317089
> ALL_gr_iii[1:10]
 [1] "12808"  "78369"  "71897"  "241568" "102075" "27273"  "20321"  "57435"
 [9] "18010"  "18010"
> length( DE_gr_iii )
[1] 1918

Now when I run spia, I get the following error:

> res <- spia( de=DE_gr_iii, all=ALL_gr_iii, organism="mmu", nB = 2000, plots=F, beta=NULL )
Error in spia(de = DE_gr_iii, all = ALL_gr_iii, organism = "mmu", nB = 2000, :
  de must be a vector of log2 fold changes. The names of de should be included in the refference array!

DE_gr_iii is definitely the log2 fold change vector. I'm not sure what is meant by the reference array, since I don't see it in the SPIA vignette, but I assume that the reference is either the data file mmuSPIA.RData or the ALL vector.

I am not sure whether ALL should really contain all Entrez IDs from the microarray, but I think not. I also tried with all Entrez IDs, and it did not work. I also used the Colorectal cancer data set from the SPIA package with only the first 100 values of the DE_Colorectal and ALL_Colorectal vectors, and it ran without problems.

The log fold changes were taken from a microarray experiment. I don't think there is a problem with that, because I also tried to fake the values by taking them from the Colorectal cancer data provided with SPIA.

I don't think that there is a problem with the length of the data. I also tried another data set with 20,000 genes, and the error was the same.
Furthermore, I tried to run the Colorectal data set using only the first 100 values, and there were no problems running that.

The SPIA package seems to be correctly installed, because I can run the example from the vignette without any problems.

The Entrez IDs that I used were derived from the Agilent annotation package for this chip:

> a2sel$EID <- unlist( mget( as.character( a2sel$SCode ), mgug4122aENTREZID ) )

(a2sel is a data frame containing the fold changes, gene information etc.; the Agilent identifiers are stored in the SCode column.)

I removed any identifiers that were not mapped to Entrez:

> length( which( is.na( a2sel$EID ) ) )
[1] 7022
> a2sel <- a2sel[ !is.na( a2sel$EID ),]
> length( which( is.na( a2sel$EID ) ) )
[1] 0

My last hypothesis was that, for whatever reason, there is a problem with the Entrez IDs (that they do not match the IDs from the mmuSPIA.RData file provided by the distribution). I tested this by using the identifiers found directly in the mmuSPIA.RData pathway info. I loaded the pathway info from the mmuSPIA.RData file:

> load( file=paste( system.file( "extdata/mmuSPIA.RData", package="SPIA" ) ) )

I chose a pathway that contains several interactions of the type "activation" and used the colnames and rownames of the matrix for my ALL vector:

> all_ttt <- c( colnames( path.info[["04010"]]$activation ), rownames( path.info[["04010"]]$activation ) )
> length( all_ttt )
[1] 564

I generated some random fold changes:

> de_ttt <- runif( length( all_ttt ), -10, 10 )
> names( de_ttt ) <- all_ttt

The result was, again, an error:

> res <- spia( de=de_ttt, all=all_ttt, organism="mmu", nB = 2000, plots=F, beta=NULL )
Error in spia(de = de_ttt, all = all_ttt, organism = "mmu", nB = 2000, plots = F, :
  de must be a vector of log2 fold changes. The names of de should be included in the refference array!

I have no idea what the problem is. Thanks in advance for any help. Maybe I should use another package?
I have lost two days on this problem already.

j.

P.S.

> sessionInfo()
R version 2.10.1 (2009-12-14)
i486-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] mgug4122a.db_2.3.6   SPIA_1.4.0           org.Mm.eg.db_2.3.6
 [4] BioIDMapper_2.1      gWidgetsRGtk2_0.0-65 gWidgets_0.0-41
 [7] lattice_0.18-3       XML_3.1-0            RCurl_1.4-2
[10] bitops_1.0-4.1       hgu95av2.db_2.3.5    org.Hs.eg.db_2.3.6
[13] GO.db_2.3.5          annotate_1.24.1      GOstats_2.12.0
[16] RSQLite_0.8-3        DBI_0.2-5            graph_1.26.0
[19] Category_2.12.1      AnnotationDbi_1.8.2  Biobase_2.6.1

loaded via a namespace (and not attached):
 [1] genefilter_1.24.3 grid_2.10.1       GSEABase_1.8.0    RBGL_1.24.0
 [5] RGtk2_2.12.15     splines_2.10.1    survival_2.35-8   tcltk_2.10.1
 [9] tools_2.10.1      xtable_1.5-6

--
-------- Dr. January Weiner 3 --------------------------------------
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin, Germany
Web : www.mpiib-berlin.mpg.de
Tel : +49-30-28460514
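One detail visible in the printed output is that Entrez ID 18010 appears twice among the first ten names. The sketch below, in base R only, checks the two properties the error message hints at: whether every name of de occurs in all, and whether any name is repeated. It is an assumption, not something confirmed in this post, that spia() rejects input with repeated names; printing the function body of the installed version (typing spia at the prompt) would settle that. The helper check_spia_input and the clean-up rule are hypothetical illustrations, not part of SPIA. The second test, which concatenates colnames and rownames of the same pathway matrix, would probably repeat IDs as well.

```r
# Values copied from the DE_gr_iii[1:10] and ALL_gr_iii[1:10] output in the post.
de <- c(0.15805260, 0.75696349, -0.02208268, -0.53025986, -0.09489560,
        0.16656121, -0.13020754, -0.19411325, -0.02297658, -0.03317089)
all_ids <- c("12808", "78369", "71897", "241568", "102075", "27273",
             "20321", "57435", "18010", "18010")
names(de) <- all_ids

# Hypothetical helper (not part of SPIA): reports properties of the input
# that the error message suggests might matter.
check_spia_input <- function(de, all) {
  list(
    # names of de that do not occur in all
    names_missing_from_all = setdiff(names(de), all),
    # names of de that occur more than once
    duplicated_names = unique(names(de)[duplicated(names(de))]),
    # intersect() drops repeats, so this is FALSE whenever a name is repeated
    unique_match = length(intersect(names(de), all)) == length(de)
  )
}

chk <- check_spia_input(de, all_ids)
print(chk)

# One possible clean-up (an assumption, not a SPIA recommendation):
# for each Entrez ID keep the entry with the largest absolute fold change.
ord <- order(abs(de), decreasing = TRUE)
de_unique <- de[ord][!duplicated(names(de)[ord])]
all_unique <- unique(all_ids)
```

If the duplicates turn out to be the cause, de_unique and all_unique would be the vectors to pass as de and all. Which probe to keep per gene (largest absolute change, mean, or most significant) is a judgment call that depends on the experiment.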
Microarray GO Cancer Organism hgu95av2 mgug4122a SPIA