Hello all,
My name is Maria Maqueda and I am working with some data from HuGene20st microarrays (at transcript cluster level). This is not the first time I have worked with these arrays, but I am again struggling with the annotation. I have two main questions:
1) Regarding lincRNA annotation. I am obtaining around 730 lincRNA-related transcripts through hugene20sttranscriptcluster.db (v8.3.0), while the annotation file from Affymetrix lists around 12k (mrna assignment category). I already asked about this difference in lincRNA annotation some time ago, in late 2013 (https://support.bioconductor.org/p/56347/#56349). Do you foresee any better alignment between the two sources?
2) Regarding the cross-hybridization category. I have obtained 2613 transcripts from hugene20sttranscriptcluster.db (v8.3.0) that have a "Mixed" cross-hybridization value in the Affymetrix annotation file. My initial idea was to keep only "main" and "unique" (X-hyb) transcripts for further analysis, but based on this result I have my doubts. Could it be an error in the Affymetrix annotation files? Does anyone have any suggestions on how to deal with these "mixed" X-hyb transcripts?
Many thanks in advance for any help you can offer.
Kind Regards,
Maria
sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.1 (Yosemite)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] hugene20sttranscriptcluster.db_8.3.0 org.Hs.eg.db_3.1.2
[3] RSQLite_1.0.0 DBI_0.3.1
[5] AnnotationDbi_1.30.1 GenomeInfoDb_1.4.0
[7] IRanges_2.2.1 S4Vectors_0.6.0
[9] Biobase_2.28.0 BiocGenerics_0.14.0
loaded via a namespace (and not attached):
[1] tools_3.2.0
Thanks Jim for your quick and explanatory answer!
1) Understood. I fully agree that I will most probably end up prioritizing transcripts with no additional information rather than an ID.
2) Thanks for sharing your experience, it's somehow... disturbing.
So, my takeaway is that I will have to be very careful with the transcripts annotated through the Affy annotation file, which will basically be the non-coding ones.
Thanks again for your support,
Cheers,
María