BACKGROUND
I have four .fcs files from CyTOF samples, exported from FlowJo 10.0.8. I am using flowCore to read in the samples so that I can run a t-SNE analysis with the Rtsne package. I expect to have many more samples to analyze in the future, so I would like to streamline reading them in.
PROBLEM
1) I can read in the samples by building a flowSet "from scratch", as in the first example in the flowCore vignette, but when I try to use read.flowSet I get an error saying that the files (at least the first one) are not valid.
2) I also get an error if I try to use "SAMPLE ID" as a keyword. I think this might be because the data come from CyTOF software rather than from DiVa on an LSR. When I look in FlowJo at the keywords available for my samples, SAMPLE ID isn't there. This isn't a big problem; I am just curious.
SESSION INFO
sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base
other attached packages:
[1] gghlab_0.1 stringr_1.0.0 dplyr_0.4.2
[4] reshape2_1.4.1 ggplot2_1.0.1 Rtsne_0.10
[7] flowCore_1.32.2
loaded via a namespace (and not attached):
[1] assertthat_0.1 Biobase_2.26.0
[3] BiocGenerics_0.12.1 cluster_2.0.3
[5] colorspace_1.2-6 corpcor_1.6.8
[7] curl_0.9.1 DBI_0.3.1
[9] DEoptimR_1.0-3 devtools_1.8.0
[11] digest_0.6.8 git2r_0.10.1
[13] graph_1.44.1 grid_3.1.2
[15] gtable_0.1.2 httr_1.0.0
[17] labeling_0.3 lattice_0.20-33
[19] lazyeval_0.1.10 magrittr_1.5
[21] MASS_7.3-43 memoise_0.2.1
[23] munsell_0.4.2 mvtnorm_1.0-3
[25] parallel_3.1.2 pcaPP_1.9-60
[27] plyr_1.8.3 proto_0.3-10
[29] R6_2.1.0 Rcpp_0.12.0
[31] robustbase_0.92-5 rrcov_1.3-8
[33] rversions_1.0.2 scales_0.2.5
[35] stats4_3.1.2 stringi_0.5-5
[37] tools_3.1.2 xml2_0.1.1
CODE
> #locate the folder where the fcs files are
> folderPath<-"J:\\MacLabUsers\\Claire\\Projects\\GAPPS Project\\GAPPS 2015 Neutrophils\\data and analysis\\fcs and compiled flowjo\\normed live export"
>
> #The flowCore method for reading in files
> frames<-lapply(dir(folderPath, pattern = "\\.fcs",full.names = TRUE),
+ read.FCS)
> as(frames,"flowSet")
A flowSet with 4 experiments.
column names:
Cd110Di Cd111Di Cd112Di Cd113Di Cd114Di Cd116Di Dy161Di Dy162Di Dy163Di Dy164Di Er166Di Er167Di Er168Di Eu151Di Eu153Di Event_length Gd155Di Gd156Di Gd157Di Gd158Di Gd160Di Ho165Di In115Di Ir191Di Ir193Di Lu175Di Nd142Di Nd143Di Nd144Di Nd145Di Nd146Di Nd148Di Nd150Di Pr141Di Pt195Di Sm147Di Sm149Di Sm152Di Sm154Di Tb159Di Tm169Di Yb170Di Yb171Di Yb172Di Yb173Di Yb174Di Yb176Di Time
#add names to the list of flow frames
names(frames) <- sapply(frames, keyword, "SAMPLE ID")
fs<-as(frames,"flowSet")
#ERROR
Error: Replacement values are not unique.
traceback()
5: stop("Replacement values are not unique.", call. = FALSE)
4: `sampleNames<-`(`*tmp*`, value = c("NULL", "NULL", "NULL", "NULL"
))
3: `sampleNames<-`(`*tmp*`, value = c("NULL", "NULL", "NULL", "NULL"
))
2: asMethod(object)
1: as(frames, "flowSet")
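The traceback shows all four replacement names as "NULL", which is consistent with the "SAMPLE ID" keyword being absent from every file. The sketch below uses base R only: plain lists stand in for the keyword lists of the flowFrames, and the values, the helper name pick_name and its fallback to the file name are hypothetical. It shows how a missing keyword turns into the string "NULL" and one possible way to fall back to another keyword.

```r
# Plain lists standing in for per-file keyword lists (hypothetical values).
kw <- list(
  list(FILENAME = "folder/sample_a.fcs"),
  list(FILENAME = "folder/sample_b.fcs"),
  list(FILENAME = "folder/sample_c.fcs", `SAMPLE ID` = "donor_3")
)

# A missing list element is NULL, and coercing to character gives "NULL".
raw_ids <- as.character(lapply(kw, `[[`, "SAMPLE ID"))
print(raw_ids)

# Hypothetical helper: use the requested keyword if present,
# otherwise fall back to the base name of another keyword's value.
pick_name <- function(k, key = "SAMPLE ID", fallback = "FILENAME") {
  value <- k[[key]]
  if (is.null(value)) basename(k[[fallback]]) else value
}

sample_names <- vapply(kw, pick_name, character(1))
print(sample_names)

# Sample names must be unique before they can be used for a flowSet.
stopifnot(anyDuplicated(sample_names) == 0)
```

With real flowFrames, the list for each frame would presumably come from keyword(frame) rather than from a hand-written list.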
#try with FILENAME instead of SAMPLE ID
names(frames) <- sapply(frames, keyword, "FILENAME")
> fs<-as(frames,"flowSet") #no error here
> str(fs)
Formal class 'flowSet' [package "flowCore"] with 3 slots
..@ frames :<environment: 0x000000001bba4180>
..@ phenoData:Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
.. .. ..@ varMetadata :'data.frame': 1 obs. of 1 variable:
.. .. .. ..$ labelDescription: chr "Name"
.. .. ..@ data :'data.frame': 4 obs. of 1 variable:
.. .. .. ..$ name: chr [1:4] "J:\\MacLabUsers\\Claire\\Projects\\GAPPS Project\\GAPPS 2015 Neutrophils\\data and analysis\\fcs and compiled flowjo\\normed li"| __truncated__ "J:\\MacLabUsers\\Claire\\Projects\\GAPPS Project\\GAPPS 2015 Neutrophils\\data and analysis\\fcs and compiled flowjo\\normed li"| __truncated__ "J:\\MacLabUsers\\Claire\\Projects\\GAPPS Project\\GAPPS 2015 Neutrophils\\data and analysis\\fcs and compiled flowjo\\normed li"| __truncated__ "J:\\MacLabUsers\\Claire\\Projects\\GAPPS Project\\GAPPS 2015 Neutrophils\\data and analysis\\fcs and compiled flowjo\\normed li"| __truncated__
.. .. ..@ dimLabels : chr [1:2] "rowNames" "columnNames"
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 1
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ colnames : chr [1:48] "Cd110Di" "Cd111Di" "Cd112Di" "Cd113Di" ...
#now try reading in with read.flowSet
fs2<-read.flowSet(dir(folder, pattern = "\\.fcs"),
+ name.keyword="FILENAME",
+ phenoData=list(name="FILENAME",Filename="$FIL"))
#ERROR
Error in FUN(c("export_15-130-01_normalized_live.fcs", "export_3633_normalized_live.fcs", :
'export_15-130-01_normalized_live.fcs' is not a valid file
4: stop(paste("'", filename, "' is not a valid file", sep = ""))
3: FUN(c("export_15-130-01_normalized_live.fcs", "export_3633_normalized_live.fcs",
"export_4396_normalized_live.fcs", "exportCorrected_5217.txt_live.fcs"
)[[1L]], ...)
2: lapply(files, read.FCS, alter.names = alter.names, transformation = transformation,
which.lines = which.lines, column.pattern = column.pattern,
invert.pattern = invert.pattern, decades = decades, ncdf = ncdf,
min.limit = min.limit, emptyValue = emptyValue, dataset = dataset)
1: read.flowSet(dir(folder, pattern = "\\.fcs"), name.keyword = "FILENAME",
phenoData = list(name = "FILENAME", Filename = "$FIL"))
Thanks in advance for any insight anyone might be able to provide. I tried to push the flowSet to GitHub but the file was too big. I can put it on Dropbox or something similar if someone thinks that would be helpful.
Claire
Claire Levy
Research Technologist
University of Washington
Hladik Lab UW-OB/GYN BB630
FHCRC VIDD Affiliate
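One possible cause of the "is not a valid file" error, offered as an assumption rather than a confirmed diagnosis: the error message lists bare file names, which suggests that dir() was called without full.names = TRUE, so the names are looked up relative to the working directory instead of the data folder. The base R sketch below illustrates the difference using a temporary folder and made-up file names.

```r
# Base R only: a temporary folder stands in for the real data folder.
demo_dir <- file.path(tempdir(), "fcs_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("demo_sample_1.fcs", "demo_sample_2.fcs")))

bare <- dir(demo_dir, pattern = "\\.fcs$")                     # file names only
full <- dir(demo_dir, pattern = "\\.fcs$", full.names = TRUE)  # folder + name

# Bare names resolve against getwd(), so they are found only if the
# working directory happens to be the data folder.
bare_found <- file.exists(bare)
full_found <- file.exists(full)
print(bare_found)
print(full_found)

# With flowCore loaded, the corresponding calls might look like this (not run):
# fs2 <- read.flowSet(dir(folderPath, pattern = "\\.fcs$", full.names = TRUE))
# fs2 <- read.flowSet(path = folderPath, pattern = "\\.fcs$")
```

Either commented call should let read.flowSet find the files regardless of the working directory; the name.keyword and phenoData arguments from the original call can be added back once the files are being read.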
Fantastic, this appears to work. Thanks for your help and sorry for taking so long to respond :)