Annotation for MoGene-2_0-st Chip/Array Data
@michaelafuchs-8899
Germany

Hello to you all,

I'm very new to the whole process of microarray analysis via R/Bioconductor.

I had a course in R and microarray analysis, but it only used the ALL sample package for affy. So everything about getting the right packages and getting the data to the point where I can start the actual analysis is new to me.

I have imported the data from .CEL files into Bioconductor via the oligo package and normalized it with RMA.

My problem now is how to get the annotation in there so that I can see, for example on a heatmap, which genes turn up in the analysis. As you can see in the sessionInfo() below, I have tried a variety of things.

It's probably just a simple syntax mistake, but I'm absolutely stuck.

The last thing that I tried was:

----------------------------------------

library(annotationTools)

sel <- order(rsd, decreasing =TRUE) [1:10]    ## just 10 genes for trial

annotation(eset) <- "mogene20sttranscriptcluster.db"

getGENESYMBOL(sel,annotation(eset))
Error in annot[, 1] : incorrect number of dimensions

---------------------------------

I'm sure the problem is with annotation(eset), but what do I put in there so that it works?

Do I need an additional annotation file (I think I read something about a .csv)? And if so, where do I get it?

When I type :

--------

mogene20sttranscriptcluster()

###lots of other stuff and this warning

Warning messages:
1: In (function ()  :
  mogene20sttranscriptclusterCHR is deprecated. Please use an appropriate
  TxDb object or package for this kind of data.
2: In (function ()  :
  mogene20sttranscriptclusterCHRLENGTHS is deprecated. Please use an
  appropriate TxDb object or package for this kind of data.
3: In (function ()  :
  mogene20sttranscriptclusterCHRLOC is deprecated. Please use an
  appropriate TxDb object or package for this kind of data.
4: In (function ()  :
  mogene20sttranscriptclusterCHRLOCEND is deprecated. Please use an
  appropriate TxDb object or package for this kind of data.

--------------------------------

We used a MoGene-2_0-st chip. I just need the names of the genes for now :)

Any help is appreciated!

 

----------------------------------------------------------

sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] annotationTools_1.42.0               GO.db_3.1.2                         
 [3] annotate_1.46.1                      XML_3.98-1.3                        
 [5] mogene20sttranscriptcluster.db_8.3.1 org.Mm.eg.db_3.1.2                  
 [7] AnnotationDbi_1.30.1                 GenomeInfoDb_1.4.3                  
 [9] estrogen_1.14.0                      affy_1.46.1                         
[11] pd.mogene.2.0.st_3.14.1              RSQLite_1.0.0                       
[13] DBI_0.3.1                            oligo_1.32.0                        
[15] Biostrings_2.36.4                    XVector_0.8.0                       
[17] IRanges_2.2.7                        S4Vectors_0.6.6                     
[19] Biobase_2.28.0                       oligoClasses_1.30.0                 
[21] BiocGenerics_0.14.0                  BiocInstaller_1.18.4                

loaded via a namespace (and not attached):
 [1] affxparser_1.40.0     GenomicRanges_1.20.8  splines_3.2.2        
 [4] zlibbioc_1.14.0       bit_1.1-12            xtable_1.7-4         
 [7] foreach_1.4.2         tools_3.2.2           ff_2.2-13            
[10] iterators_1.0.7       preprocessCore_1.30.0 affyio_1.36.0        
[13] codetools_0.2-14     

 

microarray affymetrix mouse gene arrays annotation pd.mogene.2.0.st
@james-w-macdonald-5106
United States

If you just need the gene symbols, then you can use select() or maybe mapIds() on the mogene20sttranscriptcluster.db package.

Say you want symbols for all the probesets, and you did something like

eset <- rma(dat)

so your summarized data are in an ExpressionSet called 'eset'. You can get all the gene symbols thusly

library(mogene20sttranscriptcluster.db)

gns <- select(mogene20sttranscriptcluster.db, featureNames(eset), "SYMBOL")

But do note that you WILL have duplicates, because a probeset can map to more than one gene symbol and select() returns one row per mapping! You can get rid of them by doing something like

gns <- gns[!duplicated(gns[,1]),]

An alternative that gives the same first-match symbols, but as a named vector rather than a data.frame, would be

gns <- mapIds(mogene20sttranscriptcluster.db, featureNames(eset), "SYMBOL","PROBEID", multiVals = "first")
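To see what the de-duplication step does without loading the annotation package, here is a minimal sketch on a mock data.frame shaped like the output of select(). The probeset-to-symbol pairs below are made up for illustration (an assumption, not taken from the real chip annotation), and the name gns_first is just a placeholder.

```r
# Mock stand-in for the data.frame that select() returns: one row per
# PROBEID/SYMBOL pair, so a probeset that maps to two symbols appears twice.
# These ID/symbol pairs are made up for illustration only.
gns <- data.frame(
  PROBEID = c("17210850", "17210852", "17210852", "17210855"),
  SYMBOL  = c("Xkr4", "Rp1", "Gm6101", NA),
  stringsAsFactors = FALSE
)

# Keep only the first row for each probeset ID (column 1).
gns_first <- gns[!duplicated(gns[, 1]), ]

# A named character vector (names = probeset IDs), similar in shape to
# what mapIds(..., multiVals = "first") gives you.
sym <- setNames(gns_first$SYMBOL, gns_first$PROBEID)
print(sym)
```

Probesets without a symbol stay in the result as NA, so the vector still lines up one-to-one with the probeset IDs.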

You should probably read the AnnotationDbi vignettes to familiarize yourself with the ins and outs of annotating things.


Thank you so much!

It worked nicely.

A further question: is there a way to show the gene names alongside the sample names in a heatmap? The heatmap itself is not really a problem, I get it with

------

heatmap(exprs(eset)[selected,],col=gentlecol(256), cexCol =1, cexRow= 0.25) ## for example

-----

## but I get this

heatmap(exprs(gns)[,selected],col=gentlecol(256), cexCol =1, cexRow= 0.25)
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function ‘exprs’ for signature ‘"data.frame"’

-----

I do understand the problem: "gns" is a data.frame, not an ExpressionSet-type object. But I haven't yet figured out a way around that.

Any ideas?
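One possible way around this, sketched here with made-up data: keep passing the expression matrix to heatmap() and hand the gene symbols over separately through its labRow argument. The matrix mat and the lookup vector sym below are hypothetical stand-ins for exprs(eset)[selected,] and for a named vector of symbols (names = probeset IDs); the colour argument is left at its default.

```r
set.seed(1)

# Made-up expression matrix (6 probesets x 4 samples), standing in for
# exprs(eset)[selected, ].
mat <- matrix(rnorm(24), nrow = 6,
              dimnames = list(paste0("probe", 1:6), paste0("sample", 1:4)))

# Made-up lookup vector, standing in for a named vector of gene symbols
# whose names are the probeset IDs.
sym <- c(probe1 = "GeneA", probe2 = "GeneB", probe3 = NA,
         probe4 = "GeneD", probe5 = "GeneE", probe6 = "GeneF")

# Look up the symbols in the row order of the matrix; fall back to the
# probeset ID wherever no symbol is available.
labs <- sym[rownames(mat)]
labs <- ifelse(is.na(labs), rownames(mat), labs)

# labRow only changes the row labels; the data plotted is still the matrix.
heatmap(mat, labRow = labs, cexCol = 1, cexRow = 1)
```

Because the labels are looked up by row name, they stay matched to the right rows even after heatmap() reorders them for the dendrogram.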
