FlowCore/FlowViz issues
0
0
Entering edit mode
Roger Leigh ▴ 60
@roger-leigh-4255
Last seen 10.2 years ago
On Tue, Sep 14, 2010 at 03:18:31PM -0700, Nishant Gopalakrishnan wrote: > Hi Roger, > > Answers to your questions are posted below. Dear Nishant, Many thanks for your very helpful answers. Many apologies for the delay in replying--I got sidetracked writing up my thesis and didn't have time to get back to this until recently. > 1) Gating with an ellipsoidGate > > Is there any explanation regarding how to construct a covariance matrix > from the actual > dimension I want ?. Is there an alternative constructor to create a gate > from real > dimensions? > > Ans: > As mentioned by Josef, the decision to represent gates using covariance > matrix > was made because the standards for definition of gates for flow > cytometry (gatingML specification) > decided to make use of covariance matrices as input. > > I do agree that an alternative constructor that takes in the half axes, > center > point etc would be more intuitive to use. I will try to add a method for > that. > Until then please make use of the spreadsheet provided by Josef. This is the helper function I used; feel free to adopt it if you haven't written your own yet: # Covariance matrix in Mahalanobis space (required for ellipsoidGate) cov.matrix <- function (a, b, angle) { theta <- angle * (pi/180) c1 <- ((cos(theta)^2)/a^2) + ((sin(theta)^2)/b^2) c2 <- sin(theta) * cos(theta) * ((1/a^2) - (1/b^2)) c3 <- ((sin(theta)^2)/a^2) + ((cos(theta)^2)/b^2) m1 <- matrix(c(c1, c2, c2, c3), byrow=TRUE, ncol=2) m2 <- solve(m1) m2 } # Convenience constructure for ellipsoidGate ellipsoidGate.simple <- function (distances, mean, rotation=0, parameters, filterId="defaultEllipsoidGate") { cov <- cov.matrix(distances[1], distances[2], rotation) rownames(cov) <- parameters colnames(cov) <- parameters gate <- cells <- ellipsoidGate(filterId=filterId, .gate=cov, mean=mean) gate } > 2) Plotting on a log scale > When plotting, I have the log values on the axes. However, is it > possible to either: > a) plot on logarithmic axes with the appropriate log binning, o > b) change the axis labels to be a log scale (even though the > axes are linear)? > I can do this with regular R plots, but I'm not sure if it's possible > with lattice > graphics. > > Ans: > Yes it is possible with lattice graphics as well. > Please have a look at figure 8.4 and 8.5 and the accompanying code for > generating the plots from Deepayans book at > http://lmdvr.r-forge.r-project.org/ > That should probably get close to what you are looking for. Yes, that was almost exactly what I needed. I also looked at the implementation of these examples in latticeExtra. Unfortunately, I couldn't get these to work directly for some reason, but they did when I removed some of the checks. These are the functions I'm currently using for log axes on log scales in xyplots and densityplots: logTicks <- function (lim, loc = c(1, 5)) { ii <- floor(log10(range(lim))) + c(-1, 2) main <- 10^(ii[1]:ii[2]) r <- as.numeric(outer(loc, main, "*")) r[lim[1] <= r & r <= lim[2]] } l.yscale.components.log10ticks <- function (lim, logsc = FALSE, at = NULL, ...) { ans <- yscale.components.default(lim = lim, logsc = logsc, at = at, ...) logbase <- 10 tick.at <- logTicks(logbase^lim, loc = 1:9) tick.at.major <- logTicks(logbase^lim, loc = 1) major <- tick.at %in% tick.at.major ans$left$ticks$at <- logtick.at, logbase) ans$left$ticks$tck <- ifelse(major, 1, 0.5) ans$left$labels$at <- logtick.at, logbase) ans$left$labels$labels <- as.charactertick.at) ans$left$labels$labels[!major] <- "" ans$left$labels$check.overlap <- FALSE ans } l.xscale.components.log10ticks <- function (lim, logsc = FALSE, at = NULL, ...) { ans <- xscale.components.default(lim = lim, logsc = logsc, at = at, ...) logbase <- 10 tick.at <- logTicks(logbase^lim, loc = 1:9) tick.at.major <- logTicks(logbase^lim, loc = 1) major <- tick.at %in% tick.at.major ans$bottom$ticks$at <- logtick.at, logbase) ans$bottom$ticks$tck <- ifelse(major, 1, 0.5) ans$bottom$labels$at <- logtick.at, logbase) ans$bottom$labels$labels <- as.charactertick.at) ans$bottom$labels$labels[!major] <- "" ans$bottom$labels$check.overlap <- FALSE ans } > 3) Plotting with lattice/cairo_pdf is broken > > try the above statements, but include > print(xyplot(`SS.Log` ~ `FS.Lin`, set, filter=cells, xlab="FS", > ylab=expression(log[10]~(SS)))) > instead and see if it works. > You need to call print on trellis objects to get it to work. This works just fine with the addition of print, thanks. > 4) Overlaying density plots > By default, density plots are stacked down the y-axis with a user- > specified overlap. Is is possible to directly overlay one- > dimensional histograms on top of each other to allow direct > comparison? > > Ans: > I do not think that there is an existing function available in flowViz > that let > you stack them up. > > If you want to use lattice, you will have to reshape your data a bit to > get it to work. > > nms <- sampleNames(dat) > k <- lapply(nms, function(x){ > nr <- nrow(exprs(dat[[x]])) > data.frame("FSC-H" = exprs(dat[[x]])[,"FSC-H"], samp =rep(x,= nr), > check.names =FALSE) > }) > df <- do.call(rbind, k) > densityplot( ~ `FSC-H`, df, groups = samp) > > Here is a simple way of doing that without lattice > data(GvHD) > dat <- GvHD[1:3] > plot(density(exprs(dat[[1]])[,"FSC-H"], n =512), col= col[1]) > col <- c("#FF0000", "#00FF00", "#0000FF") > for (i in 1:3) > polygon(density(exprs(dat[[i]])[,"FSC-H"], n =512), col = col[i] ) Thanks, I'll give some of these approaches a try; for now I'm using densityplots with zero overlap. > 5) Updating phenoData > I need to add certain information to my flowSet, so I'm currently > doing this by getting the phenoData, adding the missing bits and > setting it back: > > sampleNames(flowset) <- c("unstained", "isotype", "ng2") > > pd <- pData(phenoData(flowset)) > pd["isotype"] <- c(FALSE, TRUE, FALSE) > pd["description"] <- c("Unstained", "Isotype", "NG2") > > pData(phenoData(flowset)) <- pd > varMetadata(phenoData(set))["labelDescription"] <- c("Name", "Isotype", > "Description") > > This is quite long-winded. Is there a convenience function present > (or planned?) to add a single parameter which could update the > phenodata and varmetadata at the same time? It would ensure they > never get out of sync. > > Ans: I am not quite sure I understand your questions here . > There are update method for pData, phenoData, varLabels, varMetadata etc > that directly work with flowSets. Please see if these methods or their > update methods (<-) are what you are looking for . Sorry, I probably didn't describe this too clearly. If I have some descriptive metadata I wish to add to a flowSet, for example isotypecontrol <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE) treatment <- c("c1", "c1", "c2", "c2", "c2", "c2" What would be the recommended way to append these two extra descriptive items to the phenodata slots? The existing methods would appear to require multiple calls to append the two columns to the pData and then to update the varMetadata. e.g. pData(set)["isotype"] <- isotypecontrol pData(set)["treatment"] <- treatment varMetadata(set)["isotype","labelDescription"] <- "Isotype Control" varMetadata(set)["treatment","labelDescription"] <- "Treatment" is quite long-winded, and if you fail to update the labelDescription you get errors. Is there a convenience function or method to add everything in one go to ensure that you always have internal consistency in the flowSet phenoData/varMetadata? e.g. addphenoData("isotype", isotypecontrol, description="Isotype Control") > There's a convenience each_col to apply a function to each column in the > flowSet, but is it possible to just access a single column, or set of > specified columns? I wrote this little helper: > > capply <- function (col, func) { > function(x) { func(x at exprs[,col]) } > } > > which can then be used like so: > > fsApply(flowset, capply("FL.1.Log", median)) > fsApply(flowset, capply("FL.1.Log", sd)) > > > to get the median and sd for a single channel. However, it > doesn't work for mean, and I'm not sure why: > > fsApply(set2, capply("FL.1.Log", mean)) > Error in FUN(if (use.exprs) exprs(y) else y, ...) : > could not find function "func" > > If I call it directly, it works just fine: > > mean(flowset[[1]]@exprs[,"FL.1.Log"]) > [1] 0.4648052 > > It's not clear do me what the difference is here between the > two forms. Is there a built-in or better way to achieve the > same thing? > > ANS: If you are looking for median and median statistics for a flowSet > wont the summary method give you that ?? > data(GvHD) > summary(GvHD) It does, but it requires a number of steps to get to the data you want, and it's only a subset of the data I really need. There's no SD or %CV for example, or % falling in different gates compared to an unfiltered dataset (which requires comparing the filtered and unfiltered sets). > If you want to modify your code to get it to work, try this > fsApply(GvHD, function(x){ > median(exprs(x)[,"FSC-H"]) > }) This also works. But it's still rather long-winded, especially when you need to do it 10 times! I think I'll have to write a number of wrapper functions to make the analysis look readable (and also more robust since it will reduce the chance of mistakes!). > 7) Getting at filter summary data > > > > positive <- filter(flowset, pos) > > > summary(positive) > filter summary for frame 'isotype' > positive+: 111 of 9186 events (1.21%) > > filter summary for frame 'pos' > positive+: 12070 of 13246 events (91.12%) > > Is it possible to actually get at the raw data here? i.e. > the raw event number and/or the percentages. I'm currently > doing it the "hard way", by: > > newset <- Subset(flowset, pos) > fsApply(newset, capply("FL.1.Log", length)) / fsApply(flowset, > capply("FL.1.Log", length)) * 100 > > But there must be an easier and more efficient way of extracting this > information! Looking at the filter object, I can only extract the > strings as printed. > > Ans: > > Perhaps this example gives you what you are looking for. > dat <- read.FCS(system.file("extdata","0877408774.B08", > package="flowCore")) > rg <- rectangleGate(filterId="myRectGate", list("FSC-H"=c(200, 600), > "SSC-H"=c(0, 400))) > res <- filter(dat, rg) > > ##------------- > tmp <- as(res, "logical") > inside <- sum(tmp) > totalCount <- length(tmp) > pr <- inside*100/totalCount Ah, this does work just fine thanks. However, given that this value is summarised for you: summary(f[[1]]) CellGate+: 18662 of 23673 events (78.83%) having a method to directly extract it rather than having to reimplement the wheel (and having knowledge of the object's internals) would, I think, be a very useful improvement. Just the event numbers would be sufficient. Another issue I've found in the interim: xyplot(..., smooth=FALSE, outline=TRUE) prints the outline inconsistently. If I, for example, do an xyplot on a whole flowSet, I often see the first plot missing the outline completely, while all the following plots have an outline. I also see this issue when doing a single plot but not consistently. So far, it's always just the first plot, so it looks like a bug somewhere, but I haven't found what triggers it. Kind regards, Roger Leigh -- Centre for Immunology and Infection Tel: +44 (0)1904 328912 Department of Biology and HYMS Fax: + 44 (0)1904 328844 University of York PO Box 373 YORK YO10 5YW
GO GO • 1.5k views
ADD COMMENT

Login before adding your answer.

Traffic: 581 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6