Hi everyone,
I am a PhD-student from Germany (Munich) and actually I am working on adipose tissue biology. As a side project I started working with RNA-seq data. I successfully managed to test for differential gene expression using "DESeq2" followed by GO-analysis with "goseq". So far, so good. However, I would like to know which genes (annotated to this pathway) are differentially expressed in my dataset.
library(DESeq2) library(goseq) library(GO.db) library (org.Mm.eg.db) library(annotate)
#"DESeq2": Pulling out the contrast of interest res <- results(dds, contrast=c("genotype","iWATKO", "iWATWT")) #Constructing a gene vector for "goseq" assayed.genes <- rownames(res) fdr.threshold <- 0.1 de.genes <- rownames(res)[ which(res$padj < fdr.threshold) ] gene.vector=as.integer(assayed.genes%in%de.genes) names(gene.vector)=assayed.genes #Fitting PWF pwf=nullp(gene.vector,"mm9","geneSymbol") #From here on I used the code posted by Iain Gallagher goseq analysis - extracting list of my genes which are DE in the enriched GO category goCats <- goseq(pwf, "mm9","geneSymbol", test.cats=("GO:BP")) sigCats <- goCats[which(goCats[,2] < 0.05),] cats <- sigCats$category terms <- stack(lapply(mget(cats,GOTERM, ifnotfound=NA),Term)) sigCats$Term <- with(sigCats, terms$values[match(terms$ind, sigCats$category)] ) ##Original code (Where does topGenes come from?) allGos <- stack(getgo(rownames(topGenes$table), "mm9", "geneSymbol")) ##I replaced topGenes with res, not sure if this is correct?! However, it does not produce an error. allGos <- stack(getgo(rownames(res), "mm9", "geneSymbol")) onlySigCats <- allGos[allGos$values %in% sigCats$category,] onlySigCats$Term <- with( onlySigCats, terms$value[match(onlySigCats$values, terms$ind)] ) ##Next line is producing an error, what is annot2??? #add the gene symbol onlySigCats$Symbol <- with( onlySigCats, annot2[,2][match(onlySigCats$ind, rownames(annot2) )] ) Error in eval(substitute(expr), data, enclos = parent.frame()) : Object 'annot2' not found > traceback () 5: eval(substitute(expr), data, enclos = parent.frame()) 4: eval(substitute(expr), data, enclos = parent.frame()) 3: with.default(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))]) 2: with(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))]) 1: with(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))])
##Again,where does topGenes come from???
#add the gene logFC
onlySigCats$logFC <- with( onlySigCats, topGenes$table$logFC[match(onlySigCats$ind, rownames(topGenes$table) )] )
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 10240)
Matrix products: default
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] org.Mm.eg.db_3.5.0 goseq_1.30.0
[3] geneLenDataBase_1.14.0 BiasedUrn_1.07
[5] GO.db_3.5.0 annotate_1.56.1
[7] XML_3.98-1.9 AnnotationDbi_1.40.0
[9] DESeq2_1.18.1 SummarizedExperiment_1.8.1
[11] DelayedArray_0.4.1 matrixStats_0.53.1
[13] Biobase_2.38.0 GenomicRanges_1.30.0
[15] GenomeInfoDb_1.14.0 IRanges_2.12.0
[17] S4Vectors_0.16.0 BiocGenerics_0.24.0
[19] BiocInstaller_1.28.0
loaded via a namespace (and not attached):
[1] httr_1.3.1 RMySQL_0.10.13 bit64_0.9-7
[4] splines_3.4.3 assertthat_0.2.0 Formula_1.2-2
[7] latticeExtra_0.6-28 blob_1.1.0 GenomeInfoDbData_1.0.0
[10] Rsamtools_1.30.0 progress_1.1.2 pillar_1.1.0
[13] RSQLite_2.0 backports_1.1.2 lattice_0.20-35
[16] digest_0.6.15 RColorBrewer_1.1-2 XVector_0.18.0
[19] checkmate_1.8.5 colorspace_1.3-2 htmltools_0.3.6
[22] Matrix_1.2-12 plyr_1.8.4 pkgconfig_2.0.1
[25] biomaRt_2.34.2 genefilter_1.60.0 zlibbioc_1.24.0
[28] xtable_1.8-2 scales_0.5.0 BiocParallel_1.12.0
[31] htmlTable_1.11.2 tibble_1.4.2 mgcv_1.8-23
[34] ggplot2_2.2.1 GenomicFeatures_1.30.3 nnet_7.3-12
[37] lazyeval_0.2.1 survival_2.41-3 magrittr_1.5
[40] memoise_1.1.0 nlme_3.1-131 foreign_0.8-69
[43] prettyunits_1.0.2 tools_3.4.3 data.table_1.10.4-3
[46] stringr_1.2.0 munsell_0.4.3 locfit_1.5-9.1
[49] cluster_2.0.6 Biostrings_2.46.0 compiler_3.4.3
[52] rlang_0.1.6 grid_3.4.3 RCurl_1.95-4.10
[55] rstudioapi_0.7 htmlwidgets_1.0 bitops_1.0-6
[58] base64enc_0.1-3 gtable_0.2.0 DBI_0.7
[61] R6_2.2.2 GenomicAlignments_1.14.1 gridExtra_2.3
[64] knitr_1.19 rtracklayer_1.38.3 bit_1.1-12
[67] Hmisc_4.1-1 stringi_1.1.6 Rcpp_0.12.15
[70] geneplotter_1.56.0 rpart_4.1-12 acepack_1.4.1
>
I really appreciate your help :)
Kind regards,
Josef
Thank you, I'll try!