Greetings Bioconductor community,
This is my first time ever posting and I apologize ahead of time if I don't get the code embedded/formatted correctly.
I have a question regarding the output from the pathview. I'm a relatively new to pathway analysis and visualization, but from what I understand, the pathview function maps the gene, or metabolite, data that is inputted into the gene.data argument. However, I noticed on my data that there are several genes in the pathview output that do not match the input data. I've used an adapted workflow described in the vignette (Code Chunk 37) to duplicate my workflow and highlight the mismatched output.
## Adapted from pathview vignette Code Chunk #37 library(gage) data(gse16873) cn <- colnames(gse16873) hn <- grep('HN',cn, ignore.case =TRUE) dcis <- grep('DCIS',cn, ignore.case =TRUE) data(kegg.gs) #prepare the differential expression data ## Vignette did not include compare argument. Believe it defaulted to "paired" gse16873.d <- gagePrep(gse16873, ref = hn, samp = dcis,compare="as.group") library(pathview) pvresults <- pathview(gene.data = gse16873.d, pathway.id = "hsa00500", species = "hsa") ## Capturing gene mapping output pv_pdgres <- pvresults[["plot.data.gene"]] ## Selecting genes that were mapped in from gagePrep data selected_gageprep <- gse16873.d[which(rownames(gse16873.d) %in% unique(pv_pdgres$kegg.names)),] ## Viewing data from pathview and gagePrep output pv_pdgres[,c("kegg.names","exp1")];selected_gageprep
> pv_pdgres[,c("kegg.names","exp1")];selected_gageprep
kegg.names exp1
86 57733 -0.03602881
87 57733 -0.03602881
88 3098 0.83039966
95 2548 0.32890221
97 6476 0.01425503
98 57733 -0.03602881
104 80146 0.16358451
111 5167 0.38402892
116 283209 NA
119 7358 0.93451774
120 7360 0.31232441
122 7363 0.62364563
123 2990 0.36034810
125 2997 0.06405324
127 5236 -0.07510014
129 2538 0.14398629
130 3098 0.83039966
131 2645 -0.02486297
132 2821 0.54499094
133 178 0.16979765
134 2632 0.21139016
137 11181 -0.01405438
144 276 NA
145 8972 0.16845517
146 2548 0.32890221
152 5834 0.20479492
153 178 0.16979765
171 6476 0.01425503
172 8972 0.16845517
173 6476 0.01425503
11181 178 2538 2548
-0.0140543792 0.1697976464 -0.0002384661 0.1604470369
2632 2645 2821 2990
0.2113901626 -0.0248629701 0.5449909414 0.4510192780
2997 3098 5167 5236
0.0707799289 0.1950114770 0.3840289168 -0.0751001356
57733 5834 6476 7358
-0.0360288070 -0.1675698656 0.0142550297 0.9345177389
7360 7363 80146 8972
0.3123244073 -0.0050854818 0.1635845085 0.1684551748
round(selected_gageprep, 6) %in% round(unique(pv_pdgres[-which(pv_pdgres$exp1 %in% NA),"exp1"]),6)
> round(selected_gageprep, 6) %in% round(unique(pv_pdgres[-which(pv_pdgres$exp1 %in% NA),"exp1"]),6)
[1] TRUE TRUE FALSE FALSE TRUE TRUE TRUE FALSE FALSE FALSE
[11] TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE TRUE TRUE
So, gene numbers 2538,2548,2990,2997,3098,5834, and 7363 do not match the gagePrep input.
Am I missing something? I don't typically work with genomic data, so I readily admit that I may be interpreting these results incorrectly. However, I don't feel that there is anything in the help page that would suggest that the input will be modified.
Greatly appreciate anyone's help!!!
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils
[7] datasets methods base
other attached packages:
[1] pathview_1.8.0 org.Hs.eg.db_3.1.2 RSQLite_1.0.0
[4] DBI_0.3.1 AnnotationDbi_1.30.1 GenomeInfoDb_1.4.1
[7] IRanges_2.2.4 S4Vectors_0.6.0 Biobase_2.28.0
[10] BiocGenerics_0.14.0 KEGGgraph_1.26.0 graph_1.46.0
[13] XML_3.98-1.2 gage_2.18.0
loaded via a namespace (and not attached):
[1] Biostrings_2.36.1 png_0.1-7 grid_3.2.1
[4] magrittr_1.5 httr_0.6.1 zlibbioc_1.14.0
[7] stringi_0.4-1 XVector_0.8.0 Rgraphviz_2.12.0
[10] tools_3.2.1 stringr_1.0.0 KEGGREST_1.8.0
Luo,
Thank you for the help and clarification.