Dale Richardson
Last seen 10.5 years ago
Hi All,
I'm currently working on a differential gene expression analysis and
I've used GOSeq to find enriched GO categories, just like what is
mentioned here, except I am using a non-supported organism (Arabidopsis). I've come
to the exact point in the analysis as Fernando has in the above link,
where I would like to extract all gene IDs associated with the
GO terms in my DE analysis.
My question is, how can I do this with a non-supported organism?
For a supported organism, the process looks to be straightforward, but
for an unsupported genome and for a newbie in R, the process isn't so clear.
This is some of the code that got me to where I am now.
# calculate the probability weighting function (pwf)
pwf = nullp(genes, bias.data=overlapLengths)
# read in the GO categories file
tairgo <- read.table("ATH_GO_GOSLIM.txt", header=F, sep="\t", fill=T)
# use only the gene ID and GO columns from tairgo
GO.wall <- goseq(pwf, gene2cat=tairgo[,c(1,6)])
GO.samp <- goseq(pwf, gene2cat=tairgo[,c(1,6)],
enriched.GO =
GO.wall$category[p.adjust(GO.wall$over_represented_pvalue, method =
"BH") < 0.05]
enriched.sampgo =
GO.samp$category[p.adjust(GO.samp$over_represented_pvalue, method =
"BH") < 0.05]
What I've been thinking of doing is looping through my enriched GO
terms vector and finding all gene IDs that have matching GO terms in
"tairgo". However, is there a better way to do this using one of the
functions built into GOSeq?
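For concreteness, the lookup I have in mind could be sketched roughly as below, using base R only and avoiding an explicit loop. The data frame `gene2cat`, the vector `de.genes`, and every gene ID and GO term in the example are made-up stand-ins (for `tairgo[,c(1,6)]` and my list of DE genes), so this only illustrates the idea and is not a tested solution on the real TAIR file.

```r
# Toy stand-in for tairgo[, c(1, 6)]: one row per gene/GO-term pair.
# All IDs and GO terms here are illustrative, not real annotations.
gene2cat <- data.frame(
  gene = c("AT1G01010", "AT1G01020", "AT1G01030", "AT1G01010", "AT1G01040"),
  go   = c("GO:0000001", "GO:0000001", "GO:0000002", "GO:0000003", "GO:0000003"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-ins for the DE gene list and the enriched.GO vector
de.genes    <- c("AT1G01010", "AT1G01030", "AT1G01040")
enriched.GO <- c("GO:0000001", "GO:0000003")

# Keep only rows whose GO term is enriched and whose gene is DE
hits <- gene2cat[gene2cat$go %in% enriched.GO & gene2cat$gene %in% de.genes, ]

# Build a list with one vector of unique gene IDs per enriched GO term
genes.by.go <- lapply(split(hits$gene, hits$go), unique)
print(genes.by.go)
```

Dropping the `gene2cat$gene %in% de.genes` condition would return every annotated gene for each enriched term rather than only the DE ones.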
Thanks so much for your valuable input!!
Dale Richardson, Ph.D.
Laboratory of Plant Molecular Biology
Instituto Gulbenkian de Ciência
Rua da Quinta Grande, 6
2780-156 Oeiras
Tel: +351 214 464 647