The documentation examples of `AnnotatedDataFrame` are quite limited and only show how to coerce between `data.frame` and `AnnotatedDataFrame`. How can I subset a column of an `AnnotatedDataFrame` and check for equality to a particular value, without converting it to a `data.frame` first, for example?
Sorry if I am missing the point but do you mean something different from this?:
Yes, but shouldn't the usual column accessor work?
Ah I see. This is what I get:
Which explains at least why it is not working (you get similar for `tmp[1,]`). I guess the idea is that subsetting with `[` should return an `AnnotatedDataFrame` object, but accessing directly with `$` gets you the values. I have no idea if this is the intended behavior.
The philosophy is that `[` is an 'endomorphism' -- it returns the same class as the object it is applied to. `$` and `[[` are not. Also, use `pData()` rather than slot access, and (strongly) consider `S4Vectors::DataFrame` for a more modern implementation of the `AnnotatedDataFrame` concept.
Thanks! Why is `pData()` preferred over slot access?
The 'usual' reasons for object-oriented programming -- it separates the user-oriented interface from design considerations employed by the developer. Often there is not much divergence, but for instance the slots (internal developer business) of a `DNAStringSet` have little to do with the interface designed for the user.
Sorry if I look persistent on this but I didn't consider using $ or [[ as slot access. But maybe I am mistaken? I see (at least) three ways to access the data in the example above:
Maybe I misunderstood and when you said "pData() rather than slot access" you meant example 3 here?
Yes, I meant example 3; the `@data` in C: Comparison of Column of AnnotatedDataFrame is slot access. `tmp$Sepal.Length`, `pData(tmp)$Sepal.Length`, `pData(tmp[, "Sepal.Length"])$Sepal.Length`, etc. would be acceptable, as would `[[`.
I see! I completely forgot I used the slot to access the data for checking in my original example. Now I have no idea why I did that in the first place. I have updated that comment to avoid misleading potential readers. Thank you!
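For later readers, here is a minimal sketch of the accessor routes discussed above. It assumes Biobase is installed and that `tmp` was built from the built-in `iris` data (the construction of `tmp` is an assumption; the original code is not shown in the thread). The comparison value `5.1` is purely illustrative.

```r
# Sketch only: assumes the Bioconductor package Biobase is installed and
# that `tmp` wraps the built-in iris data (an assumption for illustration).
library(Biobase)

tmp <- AnnotatedDataFrame(data = iris)

# `[` is an endomorphism: the result is still an AnnotatedDataFrame,
# so comparing it directly to a value is not what you want.
sub <- tmp[, "Sepal.Length"]
class(sub)

# `$`, `[[` and pData() return the underlying values, which can be
# compared to a value without touching any slot.
by_dollar  <- tmp$Sepal.Length == 5.1
by_bracket <- tmp[["Sepal.Length"]] == 5.1
by_pdata   <- pData(tmp)$Sepal.Length == 5.1

# All three routes should give the same logical vector.
identical(by_dollar, by_bracket) && identical(by_dollar, by_pdata)

# The logical vector can then subset rows; the result is again an
# AnnotatedDataFrame.
hits <- tmp[by_dollar, ]
```

None of these lines uses `tmp@data`, so they keep working even if the internal representation of the class changes.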
And this works also:
One more thought. I guess this behavior is also consistent with subsetting a `data.frame` with `drop = FALSE`: `[` then always returns a data.frame, whereas `$` and `[[` return a vector.
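The base R analogy can be checked without any Bioconductor package. This is a small sketch on the built-in `iris` data; the variable names are only for illustration.

```r
df <- iris

# Selecting a single column with `[` drops to a vector by default ...
dropped <- class(df[, "Sepal.Length"])                # "numeric"

# ... but with drop = FALSE the result stays a data.frame.
kept <- class(df[, "Sepal.Length", drop = FALSE])     # "data.frame"

# `$` and `[[` always return the column itself.
via_dollar  <- class(df$Sepal.Length)                 # "numeric"
via_bracket <- class(df[["Sepal.Length"]])            # "numeric"
```

So the `AnnotatedDataFrame` method for `[` behaves like the `drop = FALSE` case here, while `$` and `[[` behave the same for both classes.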
It would be nice if there was a section of documentation titled Accessors.