Custom GeneSetCollection PFAM
@fabian-grammes-6591
Dear all,

I'm working with an unsupported organism. However, I have a table of PFAM annotations for my genes and would like to make a GeneSetCollection out of it (to use it later for hypergeometric testing etc.).

So how would I get a data set like this:

    gene_id      pfam_id
    XLOC_000002  PF00354
    XLOC_000002  PF13385
    XLOC_000005  PF10523
    XLOC_000005  PF13385
    XLOC_000007  PF00013
    XLOC_000007  PF02791
    XLOC_000007  PF13385

into a GeneSetCollection?

Thanks, kind regards,
Fabian
@martin-morgan-1513
Hi Fabian --

On 06/06/2014 06:28 AM, Fabian Grammes wrote:
> Dear all
>
> I'm working with an unsupported organism. However I have a table
> of PFAM annotations for my genes and would like to make a
> GeneSetCollection out of it (to use it later for hypergeometric testing
> etc...)
>
> So how would I get a data set like this:
>
> gene_id pfam_id
> XLOC_000002 PF00354
> XLOC_000002 PF13385
> XLOC_000005 PF10523
> XLOC_000005 PF13385
> XLOC_000007 PF00013
> XLOC_000007 PF02791
> XLOC_000007 PF13385
>
> into a GeneSetCollection ??

I read in your data

    df <- read.csv(textConnection("gene_id,pfam_id
    XLOC_000002,PF00354
    XLOC_000002,PF13385
    XLOC_000005,PF10523
    XLOC_000005,PF13385
    XLOC_000007,PF00013
    XLOC_000007,PF02791
    XLOC_000007,PF13385"), stringsAsFactors=FALSE, row.names=NULL)

then split it into groups based on pfam identifier

    sets <- split(df$gene_id, df$pfam_id)

then created one gene set for each pfam id, and collected the set into a collection

    library(GSEABase)
    gsc <- GeneSetCollection(Map(function(pid, gids) {
        GeneSet(gids, setName=pid, collectionType=PfamCollection(pid))
    }, names(sets), sets))

resulting in

    > gsc
    GeneSetCollection
      names: PF00013, PF00354, ..., PF13385 (5 total)
      unique identifiers: XLOC_000007, XLOC_000002, XLOC_000005 (3 total)
      types in collection:
        geneIdType: NullIdentifier (1 total)
        collectionType: PfamCollection (1 total)
    > gsc[["PF13385"]]
    setName: PF13385
    geneIds: XLOC_000002, XLOC_000005, XLOC_000007 (total: 3)
    geneIdType: Null
    collectionType: Pfam
      ids: PF13385 (1 total)
    details: use 'details(object)'

Hope that helps,

Martin

--
Computational Biology / Fred Hutchinson Cancer Research Center
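Since the question mentions hypergeometric testing as the next step, here is a minimal sketch of a per-domain over-representation test in base R. It is not part of the answer above and is only one way to do it. It works on the plain list produced by `split()`, so it runs without GSEABase. The `selected` gene vector, the choice of all annotated genes as the `universe`, and the helper name `pfam_enrichment` are assumptions made for illustration.

```r
# Toy annotation table from the thread (gene -> Pfam domain).
df <- data.frame(
  gene_id = c("XLOC_000002", "XLOC_000002", "XLOC_000005", "XLOC_000005",
              "XLOC_000007", "XLOC_000007", "XLOC_000007"),
  pfam_id = c("PF00354", "PF13385", "PF10523", "PF13385",
              "PF00013", "PF02791", "PF13385"),
  stringsAsFactors = FALSE
)

# Named list: one character vector of gene ids per Pfam id.
sets <- split(df$gene_id, df$pfam_id)

# Assumption: the universe is every annotated gene; adjust to your experiment.
universe <- unique(df$gene_id)
# Hypothetical genes of interest (e.g. differentially expressed genes).
selected <- c("XLOC_000002", "XLOC_000005")

# Hypothetical helper: one-sided hypergeometric test for each gene set.
pfam_enrichment <- function(sets, selected, universe) {
  selected <- intersect(selected, universe)
  res <- lapply(names(sets), function(pid) {
    members <- intersect(unique(sets[[pid]]), universe)
    overlap <- length(intersect(members, selected))
    # P(X >= overlap) when drawing length(selected) genes without
    # replacement from the universe.
    p <- phyper(overlap - 1L,
                length(members),
                length(universe) - length(members),
                length(selected),
                lower.tail = FALSE)
    data.frame(pfam_id = pid, set_size = length(members),
               overlap = overlap, p_value = p,
               stringsAsFactors = FALSE)
  })
  res <- do.call(rbind, res)
  # Benjamini-Hochberg correction across all tested domains.
  res$p_adjust <- p.adjust(res$p_value, method = "BH")
  res[order(res$p_value), ]
}

result <- pfam_enrichment(sets, selected, universe)
print(result)
```

With a GeneSetCollection in hand, `geneIds(gsc)` should give a comparable named list to pass as `sets`. With only three genes the p-values here are not meaningful; the toy data just shows the mechanics.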