Wow. Justin, thank you for thoroughly researching this problem. I'll
hopefully have an answer for you in the next day or two.
Kipper
On 06/25/2014 07:09 PM, Justin Meskas wrote:
> Hello Kipper and Pratip,
>
> Thank you for your explanations. After looking at my data more
closely, I found that about half of the cases where flowClean was
removing only the first compartment were consistent with the shape of
the data. The other half of these files seemed to just remove the
first compartment randomly. I have created an R source code file you
can use to replicate this result. I have put it into a .tar.gz file
and will transfer it to you from my Google Drive in a follow-up email.
Please do not redistribute the data. Inside the .tar.gz file there is
a folder called Figures that can be regenerated using the code. The
figures in Figures/Clean show the output of flowClean, while
Figures/CleanTest show plots of Marker vs Time that I created using
plotDens from flowDensity. (I am using these Marker vs Time plots to
judge if a certain section of the data should be removed or not.)
>
> Files "SPLN_L000030297_P3_090.fcs" and "SPLN_L000031107_P3_141.fcs"
show when flowClean has removed the first compartment when I believe
it should not have been. The other 5 FCS files show cases where
flowClean seems to also give poor results (The other non-first-
compartment-removed files all looked good). In my opinion, flowClean
should be removing, from the following files, the following sections:
>
> SPLN_L000018651_Size_113 - 0-5% marks
> SPLN_L000018653_Size_115 - 0-5% and 75-80% marks
> SPLN_L000018656_Size_118 - 0-5% marks
> SPLN_L000019881_Size_148 - 0-5% and 20-25% marks
> SPLN_L000028450_P3_054 - 0-5%, 12-17% and 55-60% marks
> SPLN_L000030297_P3_090 - Nothing
> SPLN_L000031107_P3_141 - Nothing
>
> For SPLN_L000018653_Size_115, SPLN_L000018656_Size_118 and
SPLN_L000028450_P3_054 there seem to be certain locations where only
one marker has a problem and that section is not removed. Is it the case
that flowClean does not consider 1 marker problems to be substantial
enough to remove?
>
> Any insight you might have on any of these problems would be greatly
appreciated. Thank you very much,
>
> Justin
>
> P.S. I have made the code, hopefully, easy enough to use so all you
have to do is change the working directory to the folder that the
files have been extracted to. Let me know if there are any problems
with the code.
>
> ________________________________________
> From: Pratip K. Chattopadhyay [pchattop at mail.nih.gov]
> Sent: June 25, 2014 7:07 AM
> To: Kipper Fletez-Brant
> Cc: Justin Meskas; Ryan Brinkman; bioconductor at r-project.org
> Subject: Re: flowClean
>
> There are probably a couple of factors at work here...
>
> The HTS is more likely to exhibit anomalies early in collection for
various reasons... The pressure in the system may still be building
up, the cells are settled in the bottom of the well and so more events
go through at once, clogs/debris from previous wells/runs may
dislodge. In principle, the system is engineered to avoid these
issues, but in practice, I often (but not always) see anomalies at the
beginning of the collection. Interestingly, on days/runs where there
aren't many bad regions flagged, the early regions also look good.
This inspires confidence that the algorithm is detecting true problems
and doesn't have some systematic problem.
>
> The second factor - relevant to the case where you felt the first
events weren't too bad - is guilt by association. Kipper has built in
a little buffer to take out some bins that neighbor trouble spots,
just to keep things as clean as possible.
>
> Best, Pratip
>
> Kipper Fletez-Brant <cafletezbrant at gmail.com>
> June 25, 2014 8:56 AM
> Hi Justin,
>
> We (Pratip and I) think it is likely your data - we have
observed that the early time points of collection in a flow run tend
to have the most errors. Pratip can speak a little more to the
technical causes of this. We appreciate your comments and look
forward to the results of your tests.
>
> Kipper
>
>
> Hi Kipper,
>
> On second thought, I think it is my data. I just checked a few files
and they seem to be consistent with only removing the first
compartment. I will run some tests tomorrow to validate this. Sorry
for the emails.
>
> Thanks,
> Justin
>
> ________________________________________
> From: Justin Meskas
> Sent: June 24, 2014 4:31 PM
> To: Kipper Fletez-Brant
> Cc: Ryan Brinkman; bioconductor at r-project.org
> Subject: RE: flowClean
>
> Hi Kipper,
>
> Sorry to keep emailing you, but I had another question about
flowClean. I have been noticing that the clean function seems to label
the first compartment for removal every time. This seems odd to me. I
attached two figures. The figure called "A..." looks like most other
figures, where the first compartment is labelled for removal. And the
other figure, called "B...", is my unique case where, I believe
anyway, the first compartment should be removed, but not the second.
Are all these files somehow accidentally removing the first
compartment? Or do you think it is the case that all these files have
bad data at the beginning?
>
> Thank you,
> Justin