Question

Annotation for MoGene-2_0-st Chip/Array Data

Michaela.Fuchs • 0

@michaelafuchs-8899

Last seen 9.6 years ago

Germany

Hello to you all,

I'm very new to the whole process of microarray analysis via R/Bioconductor.

I had a course in R and microarray analysis, but it only worked with the ALL sample package for affy. So everything about getting the right packages and getting the data to the point where I can start the actual analysis is new to me.

I have imported the data from .CEL files with the oligo package and normalized it with RMA.

My problem now is how to get the annotation in there so that I can see, for example on a heatmap, which genes turn up in the analysis. As you can see from the sessionInfo() output below, I have tried a variety of things.

It's probably just a simple syntax mistake, but I'm absolutely stuck.

The last thing that I tried was:

----------------------------------------

library(annotationTools)

sel <- order(rsd, decreasing =TRUE) [1:10] ## just 10 genes for trial

annotation(eset) <- "mogene20sttranscriptcluster.db"

getGENESYMBOL(sel,annotation(eset))
Error in annot[, 1] : incorrect number of dimensions

---------------------------------

I'm sure the problem is with annotation(eset), but what do I put in there so that it works?

Do I need an additional annotation file (.csv is something I read about, I think)? And if yes, where do I get it?

When I type:

--------

mogene20sttranscriptcluster()

###lots of other stuff and this warning

Warning messages:
1: In (function () :
mogene20sttranscriptclusterCHR is deprecated. Please use an appropriate
TxDb object or package for this kind of data.
2: In (function () :
mogene20sttranscriptclusterCHRLENGTHS is deprecated. Please use an
appropriate TxDb object or package for this kind of data.
3: In (function () :
mogene20sttranscriptclusterCHRLOC is deprecated. Please use an
appropriate TxDb object or package for this kind of data.
4: In (function () :
mogene20sttranscriptclusterCHRLOCEND is deprecated. Please use an
appropriate TxDb object or package for this kind of data.

--------------------------------

We used a MoGene-2_0-st chip. I just need the names of the genes for now :)

Any help is appreciated!

----------------------------------------------------------

sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] annotationTools_1.42.0               GO.db_3.1.2
[3] annotate_1.46.1                      XML_3.98-1.3
[5] mogene20sttranscriptcluster.db_8.3.1 org.Mm.eg.db_3.1.2
[7] AnnotationDbi_1.30.1                 GenomeInfoDb_1.4.3
[9] estrogen_1.14.0                      affy_1.46.1
[11] pd.mogene.2.0.st_3.14.1              RSQLite_1.0.0
[13] DBI_0.3.1                            oligo_1.32.0
[15] Biostrings_2.36.4                    XVector_0.8.0
[17] IRanges_2.2.7                        S4Vectors_0.6.6
[19] Biobase_2.28.0                       oligoClasses_1.30.0
[21] BiocGenerics_0.14.0                  BiocInstaller_1.18.4

loaded via a namespace (and not attached):
[1] affxparser_1.40.0     GenomicRanges_1.20.8 splines_3.2.2
[4] zlibbioc_1.14.0       bit_1.1-12            xtable_1.7-4
[7] foreach_1.4.2         tools_3.2.2           ff_2.2-13
[10] iterators_1.0.7       preprocessCore_1.30.0 affyio_1.36.0
[13] codetools_0.2-14

microarray affymetrix mouse gene arrays annotation pd.mogene.2.0.st • 1.9k views

ADD COMMENT • link 9.6 years ago Michaela.Fuchs • 0

score 0 · Answer 1 · 2015-09-29

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 8 hours ago

United States

If you just need the gene symbols, then you can use select() or maybe mapIds() on the mogene20sttranscriptcluster.db package.

Say you want symbols for all the probesets, and you did something like

eset <- rma(dat)

so your summarized data are in an ExpressionSet called 'eset'. You can get all the gene symbols thusly

library(mogene20sttranscriptcluster.db)

gns <- select(mogene20sttranscriptcluster.db, featureNames(eset), "SYMBOL")

But do note that you WILL have duplicates! Which you can get rid of by doing something like

gns <- gns[!duplicated(gns[,1]),]

An alternative that gives the same first-match result, returned as a named vector rather than a data.frame, would be

gns <- mapIds(mogene20sttranscriptcluster.db, featureNames(eset), "SYMBOL","PROBEID", multiVals = "first")
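To see what the de-duplication step does, here is a minimal base-R sketch on made-up data. The probeset IDs and symbols are invented for illustration, and no annotation package is used; the toy data.frame only imitates the shape of what select() returns.

```r
# Toy stand-in for the data.frame that select() returns: one row per
# probeset-to-symbol mapping, so a probeset can appear more than once.
# The IDs and symbols below are invented for illustration.
gns <- data.frame(
  PROBEID = c("17200001", "17200001", "17200003", "17200005"),
  SYMBOL  = c("GeneA", "GeneB", NA, "GeneC"),
  stringsAsFactors = FALSE
)

# Keep only the first row for each probeset ID.
gns_first <- gns[!duplicated(gns[, 1]), ]

# The same information as a named character vector, which is the shape
# that mapIds(..., multiVals = "first") returns.
sym_first <- setNames(gns_first$SYMBOL, gns_first$PROBEID)

print(gns_first)
```

Note that probesets without a symbol stay in the result with NA, so the output still has one entry per probeset.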

You should probably read the AnnotationDbi vignettes to familiarize yourself with the ins and outs of annotating things.

ADD COMMENT • link 9.6 years ago James W. MacDonald 68k

Thank you so much!

It worked nicely.

A further question would be whether there is a way to put the gene names, along with the samples, into a heatmap. The heatmap itself is not really a problem; I get it with

------

heatmap(exprs(eset)[selected,],col=gentlecol(256), cexCol =1, cexRow= 0.25) ## for example

-----

## but I get this

heatmap(exprs(gns)[,selected],col=gentlecol(256), cexCol =1, cexRow= 0.25)
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘exprs’ for signature ‘"data.frame"’

-----

I do understand the problem: "gns" is not an ExpressionSet-type object. But I haven't yet figured out a way around that.

Any ideas?

ADD REPLY • link 9.6 years ago Michaela.Fuchs • 0
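One possible way around the error above, sketched with toy data: keep plotting the expression values and use the symbols only as row labels, via the labRow argument of heatmap(). The matrix, IDs, symbols, and the 'selected' indices below are all invented for illustration; this is a sketch under those assumptions, not a tested recipe for the real data.

```r
# Toy expression matrix: rows are probesets, columns are samples.
# All IDs, symbols, and values are invented for illustration.
set.seed(1)
mat <- matrix(rnorm(20), nrow = 5,
              dimnames = list(c("p1", "p2", "p3", "p4", "p5"),
                              c("s1", "s2", "s3", "s4")))

# Toy annotation table in the shape select() returns after de-duplication.
gns <- data.frame(PROBEID = c("p1", "p2", "p3", "p4", "p5"),
                  SYMBOL  = c("GeneA", "GeneB", NA, "GeneD", "GeneE"),
                  stringsAsFactors = FALSE)

selected <- c(2, 3, 5)   # row indices of the probesets to plot

# Look up the symbol for each plotted row by probeset ID, and fall back
# to the probeset ID where no symbol is available.
ids <- rownames(mat)[selected]
labs <- gns$SYMBOL[match(ids, gns$PROBEID)]
labs[is.na(labs)] <- ids[is.na(labs)]

# Plot the expression values; the symbols are used only as row labels.
heatmap(mat[selected, ], labRow = labs, cexCol = 1, cexRow = 0.8)
```

With the real data, 'mat' would correspond to exprs(eset) and 'gns' to the result of the select() call, so the lookup by probeset ID keeps the labels aligned with the rows even if the two objects are ordered differently.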