"James W. MacDonald" <jmacdon at="" med.umich.edu=""> writes:
> D wrote:
>> Oleg Moskvin <ovm at ...> writes:
>>
>>> Colleagues,
>>>
>>> I think this should be pretty simple task but I cannot find an appropriate
>>> package for that.
>>> I need to generate a subset of eSet object which contains certain probesets
>>> indicated in an external genelist (outside R environment).
>>>
>>> I.e. this procedure should look like this:
>>>
>>> mylist <- read.table .....
>>> filtered.eset <- someFunction(eSet, mylist)
>>>
>>> Probably this is already implemented somewhere.
>>> Any hints will be appreciated.
>>>
>>> All the best,
>>>
>>> Oleg
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at ...
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>>
>> I have the exact same question. I am working with 2-color data in limma
>> however. I'd like to be able to make a table of Mvalues corresponding to a list
>> of geneIDs from an external table. Any help is appreciated.
>
> That is not the same question, really. Your question should be easily
> answered by reading 'An Introduction to R', as that is a simple
> subsetting problem.
Maybe helpful to know that MALists contain or can be made to contain
(e.g., when reading in the original data files) whatever information
the manufacturer might provide in terms of additional annotations. You
might then do something like (the details depend entirely on how the
MAList object was created)
> idx <- ma$genes$Labels %in% c("EST1", "Actin")
> ma1 <- ma[idx,]
where this creates a (logical) index and then uses it for subsetting.
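
To go from there to the table of M-values that was asked for, something along these lines might work. This is only a sketch: a plain list stands in for the MAList, and the object names, labels and values are made up for illustration. A real MAList from limma has `M` and `genes` components, but which annotation column holds the IDs depends on how the object was created.

```r
# Toy stand-in for an MAList: a matrix of M-values plus a 'genes' annotation
# data frame with one row per spot (all values here are made up).
ma <- list(
  M = matrix(c(0.5, -1.2, 0.1, 2.0, 0.3, -0.7, 1.1, 0.0),
             nrow = 4, dimnames = list(NULL, c("array1", "array2"))),
  genes = data.frame(Labels = c("EST1", "Actin", "GAPDH", "EST2"),
                     stringsAsFactors = FALSE)
)

# Gene IDs of interest; in practice these might come from read.table()
wanted <- c("EST1", "Actin")

# Logical index: TRUE where the spot's label is in the wanted list
idx <- ma$genes$Labels %in% wanted

# Table of M-values for the selected spots, with the label alongside
mtable <- data.frame(Label = ma$genes$Labels[idx],
                     ma$M[idx, , drop = FALSE],
                     stringsAsFactors = FALSE)
print(mtable)
```

The `drop = FALSE` keeps the result a matrix even if only one spot matches, so the table always has one column per array.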
> The answer to the original question is also pretty simple. I don't know
> if this is documented somewhere, but I think the principle of least
> surprise applies here:
>
> mylist <- as.character(read.table("my_external_list")[[1]])  # IDs assumed in first column
> filtered.eset <- original.eset[mylist,]
>
> As an example:
>
> > library(fibroEset)
> > data(fibroEset)
> > thenames <- featureNames(fibroEset)[sample(1:12625, 300)]
> > subsetted.eset <- fibroEset[thenames,]
> > subsetted.eset
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 300 features, 46 samples
> element names: exprs
> phenoData
> sampleNames: 1, 2, ..., 46 (46 total)
> varLabels and varMetadata:
> samp: sample code
> species: h: human, b: bonobo, g: gorilla
> featureData
> rowNames: 37599_at, 34494_at, ..., 36333_at (300 total)
> varLabels and varMetadata: none
> experimentData: use 'experimentData(object)'
> pubMedIds: 12840040
> Annotation [1] "hgu95av2"
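
One practical wrinkle when the list comes from an external file: read.table() returns a data frame, not a character vector, and indexing by an ID that is not on the chip is an error. A rough sketch of handling both, using a plain matrix as a stand-in for the expression data (the file contents, the values, and the ID `not_on_chip_at` are made up; with an ExpressionSet, featureNames() would play the role of rownames()):

```r
# Stand-in for the expression data: a features-by-samples matrix with
# probeset IDs as row names (values are made up).
expr <- matrix(1:8, nrow = 4,
               dimnames = list(c("37599_at", "34494_at", "36333_at", "36512_at"),
                               c("s1", "s2")))

# Write a small external gene list to a temporary file, one ID per line,
# so the example is self-contained.
listfile <- tempfile()
writeLines(c("34494_at", "36512_at", "not_on_chip_at"), listfile)

# read.table() gives a data frame; pull out the first column as character
mylist <- as.character(read.table(listfile, stringsAsFactors = FALSE)[[1]])

# Keep only IDs that are actually present, and note the ones that are not
found  <- intersect(mylist, rownames(expr))
absent <- setdiff(mylist, rownames(expr))

filtered <- expr[found, , drop = FALSE]
print(filtered)
print(absent)
```

intersect() keeps the order of the external list, so the rows of the result follow the list rather than the original object.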
A bit trickier when thenames are not probesets. One can use the maps
in the annotation package to get there, though, e.g., from SYMBOL:
> library(hgu95av2)
> rmap <- l2e(reverseSplit(as.list(hgu95av2SYMBOL)))
> head(ls(rmap))
[1] "2'-PDE" "3.8-1" "76P" "AADAC" "AAK1" "AAMP"
> rmap[["AADAC"]]
[1] "36512_at"
> thenames <- head(ls(rmap)) # the symbols we're looking for?
> mget(thenames, rmap)
$`2'-PDE`
[1] "38144_at"
$`3.8-1`
[1] "34934_at"
$`76P`
[1] "40985_g_at" "40986_s_at" "40984_at"
$AADAC
[1] "36512_at"
$AAK1
[1] "34949_at" "40628_at" "39456_at" "40572_at" "39463_at"
$AAMP
[1] "38434_at"
> idx <- unique(unlist(mget(thenames, rmap), use.names=FALSE))
> fibroEset[idx,]
ExpressionSet (storageMode: lockedEnvironment)
assayData: 12 features, 46 samples
element names: exprs
phenoData
sampleNames: 1, 2, ..., 46 (46 total)
varLabels and varMetadata description:
samp: sample code
species: h: human, b: bonobo, g: gorilla
featureData
featureNames: 38144_at, 34934_at, ..., 38434_at (12 total)
fvarLabels and fvarMetadata description: none
experimentData: use 'experimentData(object)'
pubMedIds: 12840040
Annotation: hgu95av2
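
The reverse-mapping step can also be sketched in base R, without the annotation package, to show what is going on. Here a small hand-made list stands in for as.list(hgu95av2SYMBOL); the probeset-to-symbol pairs are taken from the output above, except `00000_at`, which is a made-up unmapped probeset.

```r
# Forward map (probeset -> symbol); NA marks a probeset with no symbol.
fwd <- list("36512_at" = "AADAC", "38434_at" = "AAMP",
            "34949_at" = "AAK1", "40628_at" = "AAK1",
            "00000_at" = NA_character_)

# Drop unmapped probesets, then group probeset IDs by symbol
sym <- unlist(fwd)
sym <- sym[!is.na(sym)]
rmap <- split(names(sym), sym)

# Look up the probesets for the symbols of interest; one symbol may map
# to several probesets, hence unlist() and unique()
thenames <- c("AADAC", "AAK1")
idx <- unique(unlist(rmap[thenames], use.names = FALSE))
print(idx)
```

The resulting `idx` is a character vector of probeset IDs and can be used for subsetting in the same way as above.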
This will be a bit simpler in the forthcoming release, where the
AnnotationDbi package provides 'revmap'.
Martin
> Best,
>
> Jim
>
>
>>
>> Thanks,
>>
>> D
>>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
>
--
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org