Removing samples from S4 object (in lumi)
Chi Hong
Last seen 8.7 years ago
United States

How do I remove (failed) samples from my LumiBatch object? 


I've detected outlying samples from the various QC plots generated by lumiQ (actually, lumiR). I know their sample names. I'd like to remove them before normalizing the data and subsequent analysis.


I tried something like this:

sampleNames(lumiExpr) <- pData(lumiExpr)$sampleID

badsamp = c("5406958048_G","5406958047_I")

toremove = pData(lumiBatch.object)$sampleID %in% badsamp

lumiExpr[ , !toremove]


Got this error message:

The sample names in the controlData don't match sampleNames(object).
Warning message:
In lumiExpr[ , !toremove] :
  The controlData slot does not match the sampleNames.
The subsetting did not execute on controlData.


Is there a better way to do this?



Chi Hong
Last seen 8.7 years ago
United States

Summarizing a side-conversation:

Levi was correct. The bug is now fixed and will be a part of lumi 2.19.1.


Here's the temporary solution to remove 2 samples ('badsamp') (Thanks, Pan!):

sampleNames(lumiExpr) = pData(lumiExpr)$sampleID
badsamp = c("5406958048_G","5406958047_I")
toremove = pData(lumiExpr)$sampleID %in% badsamp
lumiExpr[,-which(toremove)]


Features  Samples
   47231       34

Features  Samples
   47231       36
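
For readers who want to see why the `-which(toremove)` form behaves differently from `!toremove`, here is a minimal base-R sketch. It uses a toy matrix as a stand-in for the expression data (the matrix, the sample names `S1` to `S4`, and the variable names are all hypothetical, not taken from the thread). It also illustrates a general R caveat with `-which()`: if none of the names match, it selects zero columns instead of all of them.

```r
# Toy stand-in for the expression matrix: 3 features x 4 samples
# (hypothetical data, not output from lumi).
expr <- matrix(1:12, nrow = 3,
               dimnames = list(paste0("probe", 1:3),
                               c("S1", "S2", "S3", "S4")))
badsamp <- c("S2", "S4")

toremove <- colnames(expr) %in% badsamp

# Logical and negative-integer indexing select the same columns here.
by_logical <- expr[, !toremove, drop = FALSE]
by_index   <- expr[, -which(toremove), drop = FALSE]
identical(by_logical, by_index)

# Caveat: when nothing matches, which() returns integer(0), and indexing
# with -integer(0) selects zero columns rather than all of them.
none <- colnames(expr) %in% "not_a_sample"
ncol(expr[, -which(none), drop = FALSE])
ncol(expr[, !none, drop = FALSE])

# A positive-integer form that behaves sensibly in both cases.
keep <- which(!toremove)
by_keep <- expr[, keep, drop = FALSE]
```

Since subsetting by integer positions worked in the affected lumi version, `lumiExpr[, which(!toremove)]` would presumably work as well, though that is an assumption and was not tested in this thread.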



Levi Waldron
Last seen 29 days ago
CUNY Graduate School of Public Health

Pan, I think this is a bug in how lumi subsets from logical vectors, not user error - for example:

> library(lumi)
> data(example.lumi)
> dim(example.lumi)
Features  Samples 
    8000        4 
> example.lumi[, 1:3]
Summary of data information:
Illumina Inc. BeadStudio version
                Normalization = none
                Array Content = 11188230_100CP_MAGE-ML.XML
                Error Model = none
                DateTime = 2/3/2005 3:21 PM
                Local Settings = en-US

Major Operation History:
            submitted            finished
1 2007-04-22 00:08:36 2007-04-22 00:10:36
2 2007-04-22 00:10:36 2007-04-22 00:10:38
3 2007-04-22 00:13:06 2007-04-22 00:13:10
4 2007-04-22 00:59:20 2007-04-22 00:59:36
5 2015-03-05 14:32:23 2015-03-05 14:32:23
                                             command lumiVersion
1           lumiR("../data/Barnes_gene_profile.txt")       1.1.6
2                             lumiQ(x.lumi = x.lumi)       1.1.6
3 addNuId2lumi(x.lumi = x.lumi, lib = "lumiHumanV1")       1.1.6
4            Subsetting 8000 features and 4 samples.       1.1.6
5                              Subsetting 4 samples.      2.18.0

Object Information:
LumiBatch (storageMode: lockedEnvironment)
assayData: 8000 features, 3 samples 
  element names: beadNum, detection, exprs, se.exprs 
protocolData: none

  sampleNames: A01 A02 B01
  varLabels: sampleID label
  varMetadata: labelDescription

  featureNames: oZsQEQXp9ccVIlwoQo 9qedFRd_5Cul.ueZeQ ...
    33KnLHy.RFaieogAF4 (8000 total)
  fvarLabels: TargetID
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation: lumiHumanAll.db 
Control Data: Available
QC information: Please run summary(x, 'QC') for details!

> example.lumi[, c(T, T, T, F)]
The sample names in the controlData don't match sampleNames(object).
Summary of data information:
Illumina Inc. BeadStudio version
                Normalization = none
                Array Content = 11188230_100CP_MAGE-ML.XML
                Error Model = none
                DateTime = 2/3/2005 3:21 PM
                Local Settings = en-US

Major Operation History:
            submitted            finished
1 2007-04-22 00:08:36 2007-04-22 00:10:36
2 2007-04-22 00:10:36 2007-04-22 00:10:38
3 2007-04-22 00:13:06 2007-04-22 00:13:10
4 2007-04-22 00:59:20 2007-04-22 00:59:36
5 2015-03-05 14:32:23 2015-03-05 14:32:23
                                             command lumiVersion
1           lumiR("../data/Barnes_gene_profile.txt")       1.1.6
2                             lumiQ(x.lumi = x.lumi)       1.1.6
3 addNuId2lumi(x.lumi = x.lumi, lib = "lumiHumanV1")       1.1.6
4            Subsetting 8000 features and 4 samples.       1.1.6
5                              Subsetting 4 samples.      2.18.0

Object Information:
LumiBatch (storageMode: lockedEnvironment)
assayData: 8000 features, 3 samples 
  element names: beadNum, detection, exprs, se.exprs 
protocolData: none

  sampleNames: A01 A02 B01
  varLabels: sampleID label
  varMetadata: labelDescription

  featureNames: oZsQEQXp9ccVIlwoQo 9qedFRd_5Cul.ueZeQ ...
    33KnLHy.RFaieogAF4 (8000 total)
  fvarLabels: TargetID
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation: lumiHumanAll.db 
Control Data: Available
QC information: Please run summary(x, 'QC') for details!
Warning message:
In example.lumi[, c(T, T, T, F)] :
  The controlData slot does not match the sampleNames.
The subsetting did not execute on controlData.


> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    

 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] lumi_2.18.0         Biobase_2.26.0      BiocGenerics_0.12.1

loaded via a namespace (and not attached):
 [1] affy_1.44.0             affyio_1.34.0           annotate_1.44.0        
 [4] AnnotationDbi_1.28.1    base64_1.1              base64enc_0.1-2        
 [7] BatchJobs_1.5           BBmisc_1.9              beanplot_1.2           
[10] BiocInstaller_1.16.1    BiocParallel_1.0.3      biomaRt_2.22.0         
[13] Biostrings_2.34.1       bitops_1.0-6            brew_1.0-6             
[16] BSgenome_1.34.1         bumphunter_1.6.0        checkmate_1.5.1        
[19] codetools_0.2-10        colorspace_1.2-5        DBI_0.3.1              
[22] digest_0.6.8            doRNG_1.6               fail_1.2               
[25] foreach_1.4.2           genefilter_1.48.1       GenomeInfoDb_1.2.4     
[28] GenomicAlignments_1.2.2 GenomicFeatures_1.18.3  GenomicRanges_1.18.4   
[31] grid_3.1.2              illuminaio_0.8.0        IRanges_2.0.1          
[34] iterators_1.0.7         KernSmooth_2.23-14      lattice_0.20-30        
[37] limma_3.22.6            locfit_1.5-9.1          MASS_7.3-39            
[40] Matrix_1.1-5            matrixStats_0.14.0      mclust_4.4             
[43] methylumi_2.12.0        mgcv_1.8-5              minfi_1.12.0           
[46] multtest_2.22.0         nleqslv_2.5             nlme_3.1-120           
[49] nor1mix_1.2-0           pkgmaker_0.22           plyr_1.8.1             
[52] preprocessCore_1.28.0   quadprog_1.5-5          RColorBrewer_1.1-2     
[55] Rcpp_0.11.4             RCurl_1.95-4.5          registry_0.2           
[58] reshape_0.8.5           rngtools_1.2.4          Rsamtools_1.18.3       
[61] RSQLite_1.0.0           rtracklayer_1.26.2      S4Vectors_0.4.0        
[64] sendmailR_1.2-1         siggenes_1.40.0         splines_3.1.2          
[67] stats4_3.1.2            stringr_0.6.2           survival_2.38-1        
[70] tcltk_3.1.2             tools_3.1.2             XML_3.98-1.1           
[73] xtable_1.7-4            XVector_0.6.0           zlibbioc_1.12.0        


Pan Du
Last seen 9.7 years ago
United States

As the error message shows, please check the colnames of controlData. If they are inconsistent with the sampleNames of lumiExpr, just rename them and try again.

Pan

Chi Hong
Last seen 8.7 years ago
United States


The colnames of controlData appear to be consistent with sampleNames.  See below:



 [1] "5406958043_A" "5406958043_B" "5406958043_C" "5406958043_D" "5406958043_E"
 [6] "5406958043_F" "5406958043_G" "5406958043_H" "5406958043_I" "5406958043_J"
[11] "5406958043_K" "5406958043_L" "5406958047_A" "5406958047_B" "5406958047_C"
[16] "5406958047_D" "5406958047_E" "5406958047_F" "5406958047_G" "5406958047_H"
[21] "5406958047_I" "5406958047_J" "5406958047_K" "5406958047_L" "5406958048_A"
[26] "5406958048_B" "5406958048_C" "5406958048_D" "5406958048_E" "5406958048_F"
[31] "5406958048_G" "5406958048_H" "5406958048_I" "5406958048_J" "5406958048_K"
[36] "5406958048_L"


 [1] "controlType"  "ProbeID"      "5406958043_A" "5406958043_B" "5406958043_C"
 [6] "5406958043_D" "5406958043_E" "5406958043_F" "5406958043_G" "5406958043_H"
[11] "5406958043_I" "5406958043_J" "5406958043_K" "5406958043_L" "5406958047_A"
[16] "5406958047_B" "5406958047_C" "5406958047_D" "5406958047_E" "5406958047_F"
[21] "5406958047_G" "5406958047_H" "5406958047_I" "5406958047_J" "5406958047_K"
[26] "5406958047_L" "5406958048_A" "5406958048_B" "5406958048_C" "5406958048_D"
[31] "5406958048_E" "5406958048_F" "5406958048_G" "5406958048_H" "5406958048_I"
[36] "5406958048_J" "5406958048_K" "5406958048_L"


table( is.element( names(controlData(lumiExpr)), badsamp ) )

FALSE  TRUE 
   36     2 


table( is.element( sampleNames(lumiExpr), badsamp) )

FALSE  TRUE 
   34     2 
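
Judging by the two listings above, the only difference between the sets of names seems to be the two leading annotation columns (`controlType` and `ProbeID`) in controlData. A more direct way to confirm that kind of thing is `setdiff()`. The sketch below uses short hypothetical vectors as stand-ins for `sampleNames(lumiExpr)` and `names(controlData(lumiExpr))`; the variable names are illustrative only.

```r
# Toy stand-ins (hypothetical values) for sampleNames(lumiExpr) and
# names(controlData(lumiExpr)).
sample_names  <- c("5406958043_A", "5406958043_B", "5406958043_C")
control_names <- c("controlType", "ProbeID", sample_names)

# Columns present in controlData but not among the sample names.
extra_in_control <- setdiff(control_names, sample_names)

# Samples that have no matching controlData column.
missing_in_control <- setdiff(sample_names, control_names)

# Do the shared names appear in the same order in both?
shared <- control_names[control_names %in% sample_names]
same_order <- identical(shared, sample_names)
```

If `missing_in_control` is empty and `same_order` is `TRUE`, the names themselves are consistent, which points away from a naming problem and toward the subsetting code.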



