Janet Young
Fred Hutchinson Cancer Research Center,…
Hi,
I'm working with lumiHumanAll.db and chromosomal locations using the
CHR and CHRLOC tables.
Mostly things turn out fine, but I think I have found some probes for
which the information in CHR and CHRLOC doesn't match up. (I'm not
sure whether I found all the problem probes, or just the few that
were most obvious because they seemed to be off the end of the
chromosome.)
My guess is that this has something to do with how probes mapping to
multiple locations are handled, which is tricky, but it seems important
for CHR and CHRLOC to be internally consistent.
I've tried to explain everything with the code at the bottom of the
email.
thanks very much,
Janet
-------------------------------------------------------------------
Dr. Janet Young
Tapscott and Malik labs
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung ...at... fhcrc.org
-------------------------------------------------------------------
library(lumiHumanAll.db)
library(lumi)
library(annotate)
### these have mismatched CHR and CHRLOC info - I noticed them among a
### much larger set of probes
odd_mappers <- c("cS._E8f0CEAHsPH.oU", "3B5Dx.5FBcAstHt9Iw",
"Ho.7bwAyQBWQ8f_RQU", "0k9AKLpXv97vAFU.rk")
### and a few other probes that looked fine
good_mappers <- c("Ku8QhfS0n_hIOABXuE", "fqPEquJRRlSVSfL.8A",
"ckiehnugOno9d7vf1Q", "x57Vw5B5Fbt5JUnQkI")
probes <- c(odd_mappers,good_mappers)
probeType <- c( rep("odd",length(odd_mappers)),
rep("good",length(good_mappers)) )
### get their map info from CHR and CHRLOC
chrs <- lookUp(probes, "lumiHumanAll.db", "CHR")
locs <- lookUp(probes, "lumiHumanAll.db", "CHRLOC")
### some probes have two locs, which is OK, but make sure we know
### which information to double up when we make a table later
numLocsPerProbe <- sapply(locs,length)
#### put that info into a table
mapping <- data.frame( probe=rep( probes, numLocsPerProbe),
probeType=rep( probeType, numLocsPerProbe),
chrLoc=abs(unlist(locs,use.names=FALSE)), #ignore strand
chrsFromChrsList=rep(unlist(chrs,use.names=FALSE),
numLocsPerProbe),
chrsFromLocsList=unlist(lapply(locs, names),use.names=FALSE) )
#### looking at CHRLENGTHS was how I realized some of the CHR info
#### wasn't right - probe maps way after end of chromosome
mapping[,"chrLengthChrsList"] <- org.Hs.egCHRLENGTHS[
as.character(mapping[,"chrsFromChrsList"]) ]
mapping[,"chrLengthLocsList"] <- org.Hs.egCHRLENGTHS[
as.character(mapping[,"chrsFromLocsList"]) ]
#### add probe sequences
mapping[,"seq"] <- id2seq(as.character(mapping[,"probe"]))
#### take a look at the table, and do some BLAT searches at the UCSC
#### website to see where each probe really maps
mapping
### BLAT search results (these are the exact matches, but all have
### other non-exact matches)
# first probe cS._E8f0CEAHsPH.oU maps to chr10:56367644-56367693
# second probe 3B5Dx.5FBcAstHt9Iw maps to chr17:13446846-13446895
# third probe Ho.7bwAyQBWQ8f_RQU maps to chr7:34980375-34980424
# fourth probe 0k9AKLpXv97vAFU.rk maps to chr3:149699708-149699757
####### so in each of those cases it looks like lumiHumanAllCHR has
####### the correct chromosome, and CHRLOC is wrong (perhaps it took one of
####### the secondary, non-exact matches?). Does that mean the locations on
####### the correct chromosome are not available in any table?
#################
sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] annotate_1.32.1 lumi_2.6.0 nleqslv_1.9.1
[4] methylumi_2.0.1 lumiHumanAll.db_1.16.0 org.Hs.eg.db_2.6.4
[7] RSQLite_0.11.0 DBI_0.2-5 AnnotationDbi_1.16.10
[10] Biobase_2.14.0
loaded via a namespace (and not attached):
[1] affy_1.32.0 affyio_1.22.0 BiocInstaller_1.2.1
[4] grid_2.14.0 hdrcde_2.15 IRanges_1.12.5
[7] KernSmooth_2.23-7 lattice_0.20-0 MASS_7.3-16
[10] Matrix_1.0-2 mgcv_1.7-11 nlme_3.1-102
[13] preprocessCore_1.16.0 xtable_1.6-0 zlibbioc_1.0.0
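
To look for problem probes systematically rather than by eye, the two checks used above (does the chromosome in CHR agree with the chromosome name attached to each CHRLOC position, and does the position fit within the length of the CHR chromosome) could be sketched roughly as follows. This is only a minimal base-R sketch: the probe IDs, positions and chromosome lengths are made-up stand-ins for the lookUp() and org.Hs.egCHRLENGTHS results, and it assumes each probe has exactly one chromosome in CHR and no NA locations.

```r
# Toy stand-ins for lookUp(probes, "lumiHumanAll.db", "CHR") / "CHRLOC".
# All IDs, chromosomes, positions and lengths are hypothetical.
chrs <- list(probeA = "10", probeB = "21", probeC = "3")
locs <- list(
  probeA = c("10" = 56000000),                 # agrees with CHR
  probeB = c("1" = -150000000),                # different chromosome, minus strand
  probeC = c("3" = 149000000, "5" = 1200000)   # second location disagrees
)
# Toy stand-in for org.Hs.egCHRLENGTHS (rounded, illustrative values)
chrLengths <- c("1" = 249e6, "3" = 198e6, "5" = 181e6,
                "10" = 135e6, "21" = 48e6)

# One row per (probe, location) pair, as in the mapping table above
n <- sapply(locs, length)
check <- data.frame(
  probe         = rep(names(locs), n),
  chrFromCHR    = rep(unlist(chrs, use.names = FALSE), n),
  chrFromCHRLOC = unlist(lapply(locs, names), use.names = FALSE),
  pos           = abs(unlist(locs, use.names = FALSE)),  # ignore strand
  stringsAsFactors = FALSE
)

# Check 1: chromosome named in CHRLOC differs from the one in CHR
check$chrMismatch <- check$chrFromCHR != check$chrFromCHRLOC
# Check 2: position lies beyond the end of the chromosome given by CHR
check$offEnd <- unname(check$pos > chrLengths[check$chrFromCHR])

# Probes with at least one suspicious row
flagged <- unique(check$probe[check$chrMismatch | check$offEnd])
print(check)
print(flagged)
```

With the real annotation, chrs and locs would come from the lookUp() calls and chrLengths from org.Hs.egCHRLENGTHS. Probes with several CHR entries or NA locations would need extra handling that this sketch leaves out.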