Kemal Akat
Dear colleagues,
I am currently analyzing an Illumina Mouse v2 bead array dataset using
limma and ran into an error I don't quite understand. It occurs when I
try to annotate the differentially expressed genes later in the
analysis. The problem seems to stem from empty strings in the vector I
pass to retrieve the annotation info, but I don't understand how they
got there in the first place.
The probe and control profiles were exported from GenomeStudio without
background correction and normalization.
Here is the code I ran:
R> x = read.ilmn(files = "ProbeProfile.txt", ctrlfiles =
"ControlProbeProfile.txt", probeid = "Probe_ID", annotation =
"TargetID", other.columns = c("Detection", "Avg_NBEADS"), verbose =
FALSE)
R> y = neqc(x)
R> expressed = rowSums(y$other$Detection < 0.05) > 4
R> y = y[expressed, ]
R> ids = rownames(y)
R> entrez = unlist(mget(ids, illuminaMousev2ENTREZID, ifnotfound =
NA))
Error in unlist(mget(ids, illuminaMousev2ENTREZID, ifnotfound = NA)) :
error in evaluating the argument 'x' in selecting a method for
function 'unlist': Error in FUN(c("ILMN_2735294", "ILMN_2417611",
"ILMN_2545897", "ILMN_2762289", :
attempt to use zero-length variable name
Calls: mget ... as.list -> as.list -> .formatList -> lapply -> lapply
-> FUN
R> traceback()
1: unlist(mget(ids, illuminaMousev2ENTREZID, ifnotfound = NA))
R> ids[ids == ""]
[1] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[55] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[109] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[163] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[217] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[271] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[325] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[379] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[433] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[487] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[541] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[595] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[649] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[703] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[757] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[811] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[865] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[919] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
"" "" "" "" "" "" "" "" "" ""
[973] "" ""
So there seem to be 974 empty strings among the row names, yet there is
nothing like that in the original data file. And shouldn't R refuse
empty row names in the first place?
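As a stopgap while the origin of the blank names is unclear, the lookup can be restricted to non-empty identifiers. The following is only a sketch under stated assumptions: `toy_map` is a plain environment standing in for the real `illuminaMousev2ENTREZID` map, the Entrez IDs in it are made up, and `ILMN_0000000` is a hypothetical probe ID used to show the not-found case.

```r
# Toy stand-in for the annotation map: a plain environment keyed by probe ID.
# (Assumption: the real illuminaMousev2ENTREZID map is not used here and the
# Entrez IDs below are invented for illustration.)
toy_map <- new.env()
assign("ILMN_2735294", "11111", envir = toy_map)
assign("ILMN_2417611", "22222", envir = toy_map)

# Probe IDs including empty strings, as seen in rownames(y)
ids <- c("ILMN_2735294", "", "ILMN_2417611", "", "ILMN_0000000")

# Keep only non-missing, non-empty identifiers for the lookup
valid <- !is.na(ids) & nzchar(ids)

# Pre-fill with NA so the result stays aligned with the original ids
entrez <- rep(NA_character_, length(ids))
entrez[valid] <- unlist(mget(ids[valid], envir = toy_map, ifnotfound = NA))

print(entrez)
```

This keeps `entrez` the same length as `ids`, so it can still be attached to the rows of `y`. It works around the error but does not explain where the blank names come from.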
Here is what the EListRaw object looks like after reading it into R.
R> x = read.ilmn(files = "ProbeProfile.txt", ctrlfiles =
"ControlProbeProfile.txt", probeid = "Probe_ID", annotation =
"TargetID", other.columns = c("Detection", "Avg_NBEADS"), verbose =
FALSE)
R> x
An object of class "EListRaw"
$source
[1] "illumina"
$E
9379087005_A 9379087005_B 9379087022_A 9379087022_B
9379087005_C 9379087005_D 9379087022_C 9379087022_D 9379087005_E
9379087005_F 9379087022_E
ILMN_2735294 420.8 401.8 395.8 422.9
360.1 358.5 420.7 327.1 178.8 343.4
425.5
ILMN_2417611 323.8 280.2 294.1 315.5
542.5 301.0 398.0 133.7 235.9 382.0
512.7
ILMN_2545897 98.3 109.2 128.0 124.5
111.3 102.6 110.2 106.6 87.2 104.6
101.8
ILMN_2762289 91.7 88.3 94.2 95.5
88.1 81.2 88.5 88.0 79.4 85.3
84.5
ILMN_1248788 87.6 84.7 92.0 92.9
85.9 84.0 93.8 86.9 77.5 84.9
86.3
9379087022_F
ILMN_2735294 322.0
ILMN_2417611 185.7
ILMN_2545897 107.8
ILMN_2762289 88.8
ILMN_1248788 85.1
46250 more rows ...
$genes
TargetID Status
1 0610005A07RIK regular
2 0610005C13RIK regular
3 0610005H09RIK regular
4 0610005I04 regular
5 0610005K03RIK regular
46250 more rows ...
$other
$Detection
9379087005_A 9379087005_B 9379087022_A 9379087022_B
9379087005_C 9379087005_D 9379087022_C 9379087022_D 9379087005_E
9379087005_F 9379087022_E
ILMN_2735294 0.00000 0.00000 0.0000 0.0000
0.0000 0.0000 0.00000 0.0000 0.00000
0.00000 0.00000
ILMN_2417611 0.00000 0.00000 0.0000 0.0000
0.0000 0.0000 0.00000 0.0000 0.00000
0.00000 0.00000
ILMN_2545897 0.08974 0.00321 0.0000 0.0000
0.0000 0.0000 0.00107 0.0000 0.00214
0.00214 0.00107
ILMN_2762289 0.34402 0.49359 0.1998 0.1827
0.6068 0.9220 0.71047 0.4776 0.27350
0.58654 0.77991
ILMN_1248788 0.76603 0.86004 0.3472 0.3718
0.8440 0.6645 0.21902 0.6004 0.58120
0.63675 0.53419
9379087022_F
ILMN_2735294 0.0000
ILMN_2417611 0.0000
ILMN_2545897 0.0000
ILMN_2762289 0.3440
ILMN_1248788 0.7949
46250 more rows ...
$Avg_NBEADS
9379087005_A 9379087005_B 9379087022_A 9379087022_B
9379087005_C 9379087005_D 9379087022_C 9379087022_D 9379087005_E
9379087005_F 9379087022_E
ILMN_2735294 51 63 58 57
36 46 49 60 62 50
58
ILMN_2417611 44 56 46 51
66 51 42 66 40 47
57
ILMN_2545897 51 69 45 67
47 39 44 56 59 43
50
ILMN_2762289 48 49 53 59
43 55 47 49 54 41
53
ILMN_1248788 43 42 29 38
39 42 36 36 29 31
45
9379087022_F
ILMN_2735294 50
ILMN_2417611 56
ILMN_2545897 58
ILMN_2762289 42
ILMN_1248788 38
46250 more rows ...
Now looking at the last rows of the expression matrix:
R> tail(x$E)
9379087005_A 9379087005_B 9379087022_A 9379087022_B 9379087005_C
9379087005_D 9379087022_C 9379087022_D 9379087005_E 9379087005_F
9379087022_E 9379087022_F
92.2 92.6 92.6 93.8 92.1
86.9 91.4 85.7 78.9 86.5 89.0
91.7
89.2 85.7 92.3 89.9 85.9
83.7 91.3 89.5 76.6 91.4 86.3
85.8
89.8 85.5 92.7 92.1 92.7
87.3 90.1 86.2 79.1 83.7 86.4
84.9
96.9 88.9 92.4 94.6 90.7
87.9 96.2 85.6 78.0 82.0 86.4
84.1
87.8 83.5 85.9 90.2 81.6
81.5 92.5 83.8 73.1 80.6 86.1
86.8
89.8 87.4 87.1 89.6 88.1
84.4 91.9 85.7 80.5 88.3 86.8
86.3
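Since the rows with blank names sit at the end of the matrix, one possible check is to count them and tabulate the `Status` recorded for those rows. The sketch below uses toy data only: the intensities are taken from the first array in the output above, and the `"NEGATIVE"` status label is an assumption for illustration, not something read from the actual object.

```r
# Toy stand-in for x$E and x$genes (one array, four probes).
# Assumption: the "NEGATIVE" status label is invented for this example.
E <- matrix(c(420.8, 323.8, 92.2, 89.2), ncol = 1,
            dimnames = list(c("ILMN_2735294", "ILMN_2417611", "", ""),
                            "9379087005_A"))
genes <- data.frame(Status = c("regular", "regular", "NEGATIVE", "NEGATIVE"),
                    stringsAsFactors = FALSE)

# Flag rows whose name is missing or an empty string
blank <- is.na(rownames(E)) | rownames(E) == ""

# How many rows are affected, and what Status do they carry?
n_blank <- sum(blank)
status_of_blank <- table(genes$Status[blank])

print(n_blank)
print(status_of_blank)
```

On the real object the same idea would use `rownames(x$E)` and `x$genes$Status`. If every blank row turned out to have a non-regular status, that would point towards the control probe file rather than the probe profile.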
R> sessionInfo()
R Under development (unstable) (2013-06-26 r63071)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] splines parallel stats graphics grDevices utils
datasets methods base
other attached packages:
[1] xtable_1.7-1 vsn_3.29.1
reshape2_1.2.2 ratr_1.0 pheatmap_0.7.4
illuminaMousev2.db_1.18.0
[7] org.Mm.eg.db_2.9.0 GOstats_2.27.1 graph_1.39.3
ggplot2_0.9.3.1 edgeR_3.3.8 limma_3.17.23
[13] codetools_0.2-8 Category_2.27.3 GO.db_2.9.0
RSQLite_0.11.4 DBI_0.2-7 Matrix_1.0-12
[19] lattice_0.20-15 Biostrings_2.29.19 XVector_0.1.4
IRanges_1.19.37 AnnotationDbi_1.23.23 Biobase_2.21.7
[25] BiocGenerics_0.7.5 knitr_1.4.1
setwidth_1.0-3
loaded via a namespace (and not attached):
[1] affy_1.39.2 affyio_1.29.0 annotate_1.39.0
AnnotationForge_1.3.22 BiocInstaller_1.11.4 colorspace_1.2-2
dichromat_2.0-0
[8] digest_0.6.3 evaluate_0.4.7 formatR_0.9
genefilter_1.43.0 grid_3.1.0 GSEABase_1.23.0
gtable_0.1.2
[15] highr_0.2.1 labeling_0.2 MASS_7.3-26
munsell_0.4 plyr_1.8 preprocessCore_1.23.0
proto_0.3-10
[22] RBGL_1.37.2 RColorBrewer_1.0-5 scales_0.2.3
stats4_3.1.0 stringr_0.6.2 survival_2.37-4
tools_3.1.0
[29] XML_3.98-1.1 zlibbioc_1.7.0
R>
Any help and explanations appreciated!
Cheers,
Kemal
--
Kemal Akat
Laboratory of RNA Molecular Biology
The Rockefeller University
1230 York Avenue, Box #186
New York, NY 10065