Guest User
Dear All,
I appreciate your time and need your expertise. I am trying to use goseq
for GO analysis of my RNA-seq experiments.
I used TopHat -> Cufflinks for DE, with mouse mm10 for annotation.
I am trying to build the gene length database myself, because the
current version of goseq does not support the mm10 build.
1. Cufflinks seems to ignore the original gene identifiers that come
with mm10 and assigns its own, but it does keep the gene name in its
records, so I will use the gene name as the identifier in my process. I
have already used the gene names to build the assayed gene vector,
and I have built the DE gene vector too.
2. Then there is the transcript length issue. I noticed that one of the
Cufflinks output files, genes.fpkm_tracking, contains both the gene name
and gene length information. The length column has this format:
chr1:4807892-4846735. This is for the Lypla1 gene. But this sequence
range includes introns too, so I cannot simply get the transcript length
by subtracting the first number from the second. I went through every
output file of Cufflinks/Cuffdiff and could not find one containing
the transcript length information. Where can I get the transcript
length information?
3. In my experiment, I only have 39 DE genes. Do you think it is even
worth using goseq, or should I simply go to DAVID?
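Regarding point 2, one possible workaround is to compute exon-only lengths directly from the annotation GTF instead of from the Cufflinks locus range. The following is only a minimal sketch under several assumptions: the GTF has "exon" rows carrying a gene_name attribute, each gene name maps to a single locus, and "gene length" is taken as the union of all exons of the gene (only one possible definition; a median transcript length would be another). The function name exonic_gene_lengths and the sample records are hypothetical, and the coordinates are made up rather than taken from mm10.

```python
import io
import re
from collections import defaultdict

# Tiny made-up GTF fragment (hypothetical coordinates, not real mm10 annotation).
SAMPLE_GTF = """\
chr1\tsrc\texon\t100\t199\t.\t+\t.\tgene_id "G1"; gene_name "GeneA"; transcript_id "T1";
chr1\tsrc\texon\t300\t399\t.\t+\t.\tgene_id "G1"; gene_name "GeneA"; transcript_id "T1";
chr1\tsrc\texon\t150\t249\t.\t+\t.\tgene_id "G1"; gene_name "GeneA"; transcript_id "T2";
chr2\tsrc\tCDS\t10\t50\t.\t-\t.\tgene_id "G2"; gene_name "GeneB"; transcript_id "T3";
chr2\tsrc\texon\t10\t59\t.\t-\t.\tgene_id "G2"; gene_name "GeneB"; transcript_id "T3";
"""

GENE_NAME_RE = re.compile(r'gene_name "([^"]+)"')


def exonic_gene_lengths(gtf_lines):
    """Return {gene_name: number of bases covered by at least one exon}.

    Assumes each gene name occurs at a single locus (hypothetical helper).
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records contribute to the length
        match = GENE_NAME_RE.search(fields[8])
        if match is None:
            continue  # no gene_name attribute on this record
        # GTF coordinates are 1-based and inclusive on both ends.
        exons[match.group(1)].append((int(fields[3]), int(fields[4])))

    lengths = {}
    for gene, intervals in exons.items():
        intervals.sort()
        total = 0
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:
                # Overlapping or adjacent exon: extend the current block.
                cur_end = max(cur_end, end)
            else:
                # Gap (an intron): close the current block, start a new one.
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
        lengths[gene] = total
    return lengths


lengths = exonic_gene_lengths(io.StringIO(SAMPLE_GTF))
print(lengths)
```

On real data the same function could be given an open file handle for the GTF used in the Cufflinks run. The resulting table, ordered to match the assayed gene vector, might then serve as the length data that goseq accepts through the bias.data argument of nullp, though that step is not shown here.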
Best,
Tom
-- output of sessionInfo():
goseq
--
Sent via the guest posting facility at bioconductor.org.