affycoretools::annotateEset(eset, pd.ht.hg.u133.plus.pm) returns an error
giroudpaul ▴ 40
@giroudpaul-10031
France

Hi

I have been trying to analyze a microarray dataset, GSE85543, which was generated on the Affymetrix HT HG-U133+ PM Array Plate.

I usually annotate microarray data using affycoretools::annotateEset and the corresponding ChipDb package (e.g. hgu133plus2.db). The reference manual for affycoretools indicates that annotateEset can work with either a ChipDb object or an AffyGenePDInfo object.

However, when I try to annotate my data (after RMA), I get the following error:

"There is no annotation object provided with the x package"

What does this mean? Is there a problem with the package, or did I do something wrong?

Code:

library("BiocManager")
library("GEOquery")
library("affy")
library("oligo")
library("pd.ht.hg.u133.plus.pm")
library("affycoretools")
library("ggplot2")

celpath = "C:/Users/pgiroud/OneDrive - Elsalys Biotech/Bioinfo/GSE85543/CEL/"
celFiles <- list.celfiles(celpath, full.names=TRUE)
data <- oligo::read.celfiles(celFiles)

data.rma = oligo::rma(data, background=TRUE, normalize=TRUE) 

data.ann <- affycoretools::annotateEset(data.rma, pd.ht.hg.u133.plus.pm)

Session Info :

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252
[4] LC_NUMERIC=C                   LC_TIME=French_France.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] affycoretools_1.58.2         pd.ht.hg.u133.plus.pm_3.12.0 DBI_1.0.0                   
 [4] RSQLite_2.1.4                oligo_1.50.0                 ggplot2_3.2.1               
 [7] Biostrings_2.54.0            XVector_0.26.0               IRanges_2.20.1              
[10] S4Vectors_0.24.1             oligoClasses_1.48.0          affy_1.64.0                 
[13] GEOquery_2.54.1              Biobase_2.46.0               BiocGenerics_0.32.0         
[16] BiocManager_1.30.10         

loaded via a namespace (and not attached):
  [1] backports_1.1.5             GOstats_2.52.0              Hmisc_4.3-0                
  [4] BiocFileCache_1.10.2        plyr_1.8.4                  lazyeval_0.2.2             
  [7] GSEABase_1.48.0             splines_3.6.1               BiocParallel_1.20.0        
 [10] GenomeInfoDb_1.22.0         digest_0.6.23               ensembldb_2.10.2           
 [13] foreach_1.4.7               htmltools_0.4.0             GO.db_3.10.0               
 [16] gdata_2.18.0                magrittr_1.5                checkmate_1.9.4            
 [19] memoise_1.1.0               BSgenome_1.54.0             cluster_2.1.0              
 [22] gcrma_2.58.0                limma_3.42.0                readr_1.3.1                
 [25] annotate_1.64.0             matrixStats_0.55.0          R.utils_2.9.2              
 [28] ggbio_1.34.0                askpass_1.1                 prettyunits_1.0.2          
 [31] colorspace_1.4-1            blob_1.2.0                  rappdirs_0.3.1             
 [34] xfun_0.11                   dplyr_0.8.3                 crayon_1.3.4               
 [37] RCurl_1.95-4.12             graph_1.64.0                genefilter_1.68.0          
 [40] zeallot_0.1.0               VariantAnnotation_1.32.0    survival_3.1-8             
 [43] iterators_1.0.12            glue_1.3.1                  gtable_0.3.0               
 [46] zlibbioc_1.32.0             DelayedArray_0.12.0         Rgraphviz_2.30.0           
 [49] scales_1.1.0                GGally_1.4.0                edgeR_3.28.0               
 [52] Rcpp_1.0.3                  xtable_1.8-4                progress_1.2.2             
 [55] htmlTable_1.13.3            foreign_0.8-72              bit_1.1-14                 
 [58] OrganismDbi_1.28.0          preprocessCore_1.48.0       Formula_1.2-3              
 [61] AnnotationForge_1.28.0      htmlwidgets_1.5.1           httr_1.4.1                 
 [64] gplots_3.0.1.1              RColorBrewer_1.1-2          ellipsis_0.3.0             
 [67] acepack_1.4.1               ff_2.2-14                   R.methodsS3_1.7.1          
 [70] pkgconfig_2.0.3             reshape_0.8.8               XML_3.98-1.20              
 [73] nnet_7.3-12                 dbplyr_1.4.2                locfit_1.5-9.1             
 [76] tidyselect_0.2.5            rlang_0.4.2                 reshape2_1.4.3             
 [79] AnnotationDbi_1.48.0        munsell_0.5.0               tools_3.6.1                
 [82] stringr_1.4.0               knitr_1.26                  bit64_0.9-7                
 [85] caTools_1.17.1.3            purrr_0.3.3                 AnnotationFilter_1.10.0    
 [88] RBGL_1.62.1                 R.oo_1.23.0                 xml2_1.2.2                 
 [91] biomaRt_2.42.0              compiler_3.6.1              rstudioapi_0.10            
 [94] curl_4.3                    affyio_1.56.0               PFAM.db_3.10.0             
 [97] tibble_2.1.3                geneplotter_1.64.0          stringi_1.4.3              
[100] GenomicFeatures_1.38.0      lattice_0.20-38             ProtGenerics_1.18.0        
[103] Matrix_1.2-18               vctrs_0.2.0                 pillar_1.4.2               
[106] lifecycle_0.1.0             data.table_1.12.6           bitops_1.0-6               
[109] rtracklayer_1.46.0          GenomicRanges_1.38.0        hwriter_1.3.2              
[112] R6_2.4.1                    latticeExtra_0.6-28         KernSmooth_2.23-16         
[115] gridExtra_2.3               affxparser_1.58.0           codetools_0.2-16           
[118] dichromat_2.0-0             gtools_3.8.1                assertthat_0.2.1           
[121] SummarizedExperiment_1.16.0 openssl_1.4.1               DESeq2_1.26.0              
[124] Category_2.52.1             ReportingTools_2.26.0       withr_2.1.2                
[127] GenomicAlignments_1.22.1    Rsamtools_2.2.1             GenomeInfoDbData_1.2.2     
[130] hms_0.5.2                   grid_3.6.1                  rpart_4.1-15               
[133] tidyr_1.0.0                 biovizBase_1.34.1           base64enc_0.1-3
@james-w-macdonald-5106
United States

For most of the pdInfo packages, there is a file in the extdata directory that contains the annotation for that array. So if you ask for annotations from a pdInfo package, the function looks in the requisite place and tries to load it. So as an example, here is the Clariom D array:

> dir(system.file("extdata/", package = "pd.clariom.d.human"))
[1] "netaffxProbeset.rda"       "netaffxTranscript.rda"    
[3] "pd.clariom.d.human.sqlite"

And the file called netaffxTranscript.rda would be loaded. But for the package you are using, this is what is in that directory:

> dir(system.file("extdata/", package = "pd.ht.hg.u133.plus.pm"))
[1] "pd.ht.hg.u133.plus.pm.sqlite"

Which is why you get the error saying there isn't an annotation file in this package.
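
The directory check above can be wrapped in a small helper to test any pdInfo package before passing it to annotateEset. This is only a sketch: has_netaffx_annotation is a hypothetical name, not part of affycoretools, and it assumes the annotation files are always named netaffx*.rda, as in the Clariom D listing.

```r
# Hypothetical helper (not part of affycoretools): report whether an installed
# package ships a netaffx*.rda annotation file in its extdata directory.
# Assumption: annotation files follow the "netaffx*.rda" naming seen above.
has_netaffx_annotation <- function(pkg) {
  extdata <- system.file("extdata", package = pkg)
  # system.file() returns "" when the package or the directory is missing
  if (!nzchar(extdata)) {
    return(FALSE)
  }
  any(grepl("^netaffx.*\\.rda$", dir(extdata)))
}

# Usage, assuming the package is installed:
# has_netaffx_annotation("pd.ht.hg.u133.plus.pm")
```

Given the directory listings above, this should be TRUE for pd.clariom.d.human and FALSE for pd.ht.hg.u133.plus.pm.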

I think this array has the same content as the hgu133plus2 array, so you might try

library(hgu133plus2.db)
anno <- do.call(cbind, lapply( c("PROBEID", "ENTREZID", "SYMBOL", "GENENAME"),
                       function(x) mapIds(x, featureNames(data.rma))))
fData(data.rma) <- AnnotatedDataFrame(data = anno)
validObject(data.rma)


Hi James,

Thank you for the explanation! You are right, it's the same chip as the hgu133plus2, but with only perfect match (PM) probes. However, I cannot get your solution to work. I get the following message:

 Error in (function (classes, fdef, mtable)  : 
 unable to find an inherited method for function 'mapIds' for signature '"character"'

My bad. It should be

library(hgu133plus2.db)
anno <- do.call(cbind, lapply( c("PROBEID", "ENTREZID", "SYMBOL", "GENENAME"),
                       function(x) mapIds(hgu133plus2.db, featureNames(data.rma), x, "PROBEID")))
fData(data.rma) <- AnnotatedDataFrame(data = anno)
validObject(data.rma)


Hello James,

I still ran into problems because the probe names are not exactly the same in hgu133plus2.db and in my data:

>head(featureNames(data.rma))
[1] "1007_PM_s_at" "1053_PM_at"   "117_PM_at"    "121_PM_at"    "1255_PM_g_at" "1294_PM_at"  
> head(keys(hgu133plus2.db))
[1] "1007_s_at" "1053_at"   "117_at"    "121_at"    "1255_g_at" "1294_at"

I got around it with this:

probes <- gsub("PM_", "", featureNames(data.rma))
anno <- do.call(cbind, lapply(c("ENTREZID", "SYMBOL", "GENENAME"),
                              function(x) mapIds(hgu133plus2.db, keys=probes,
                                                 column = x, keytype = "PROBEID")))

I also removed "PROBEID" because it caused a memory error.
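
For reference, the renaming can be checked on the six IDs printed above without loading any annotation package. This is a small sketch in base R; it uses sub("_PM_", "_", ..., fixed = TRUE), a slightly stricter variant of the gsub call, and the two vectors are simply copied from the head() output.

```r
# Probe IDs as they appear on the PM array and in hgu133plus2.db
# (copied from the head() output above).
pm_ids  <- c("1007_PM_s_at", "1053_PM_at", "117_PM_at",
             "121_PM_at", "1255_PM_g_at", "1294_PM_at")
db_keys <- c("1007_s_at", "1053_at", "117_at",
             "121_at", "1255_g_at", "1294_at")

# Replace the literal "_PM_" infix with a single underscore
stripped <- sub("_PM_", "_", pm_ids, fixed = TRUE)

# IDs that still have no counterpart among the database keys
unmatched <- setdiff(stripped, db_keys)

identical(stripped, db_keys)   # TRUE
length(unmatched)              # 0
```

On the full array, running setdiff against keys(hgu133plus2.db) would list any probe IDs that still fail to match after renaming.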

But now I am stuck at the next step, just when I thought I had it all figured out. The next line returns:

fData(data.rma) <- AnnotatedDataFrame(data = anno)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function 'AnnotatedDataFrame' for signature '"matrix", "missing"'

So from what I understand, the problems are:

  • anno is a matrix
  • There are NA values in the anno matrix (10384 of 54715 probes have no annotation)

Is it normal that so many probes return no gene, given that this should be a "perfect match" only array? How do I solve this issue?

EDIT: I found the problem: I removed the PM in the probe names of my feature data, but my assay data still has it, so I was trying to insert feature data with names like "1007_s_at" into an ExpressionSet whose assay data featureNames look like "1007_PM_s_at". Instead of removing PM, I should add it. I will look into how to do this.
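
Both points (matrix instead of data frame, and row names that must match the featureNames) can be reproduced in base R. This is a sketch with placeholder values standing in for the mapIds() output; fake_lookup and its contents are invented for illustration and are not real annotations.

```r
# Placeholder stand-in for the named character vectors that mapIds() returns,
# one per annotation column (values are invented, not real lookups).
probes <- c("1007_s_at", "1053_at", "117_at")
cols   <- c("ENTREZID", "SYMBOL", "GENENAME")
fake_lookup <- lapply(cols, function(col) {
  setNames(c(paste0(col, "_a"), NA, paste0(col, "_c")), probes)
})

# cbind() on vectors gives a character matrix, which is what
# AnnotatedDataFrame() refused in the error above.
anno <- do.call(cbind, fake_lookup)
is.matrix(anno)

# Convert to a data frame and name the columns explicitly.
anno_df <- as.data.frame(anno, stringsAsFactors = FALSE)
colnames(anno_df) <- cols

# Row names come from the probe IDs, so they can be compared with the
# featureNames of the ExpressionSet before assigning fData().
identical(rownames(anno_df), probes)

# Probes with no gene symbol show up as NA.
n_unannotated <- sum(is.na(anno_df$SYMBOL))
```

Naming the columns explicitly keeps the feature data readable, since the list passed to cbind() carries no names of its own.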


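It works like this:

# Strip "PM_" so the probe IDs match the keys in hgu133plus2.db
featureNames(data.rma) <- gsub("PM_", "", featureNames(data.rma))
# One mapIds() lookup per annotation column, bound together column-wise
anno <- do.call(cbind, lapply(c("ENTREZID", "SYMBOL", "GENENAME"),
                              function(x) mapIds(hgu133plus2.db, keys = featureNames(data.rma),
                                                 column = x, keytype = "PROBEID")))
# fData<- needs a data.frame, not a matrix
fData(data.rma) <- as.data.frame(anno)
validObject(data.rma)

I simply removed the PM_ from the feature names in my data as well.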
