We are processing a number of arrays from the GPL13158 platform, aka Affymetrix HT HG-U133+ PM Array Plate, in a machine-learning/classification context, so we wish to use fRMA for normalization. However, this platform is different enough from ordinary HG-U133 arrays that I'm not sure whether the fRMA vectors would still apply and, if so, how one would apply them. (Specifically, the mismatch probes have been removed, and some probesets have been reduced from 11 probes to 9 or 10.) This platform also comes in plates of 16 or 24 arrays, which could affect how one defines "batches".
So my question is: are there fRMA vectors available for this platform? If not, can I obtain them easily by subsetting from a related platform, or should I just generate my own vectors from the available GEO data and/or my own data?
I mean no disrespect to the fRMA method, but an alternative would be to use the SCAN algorithm (http://www.bioconductor.org/packages/release/bioc/html/SCAN.UPC.html), which can be applied to these arrays on a single-sample basis without having to generate a vector.
I tried both SCAN and fRMA on a similar dataset on a platform for which fRMA vectors were available. In that particular dataset, fRMA seemed to work better. I'll certainly try both of them out on this dataset as well, if I can.
Did you use the barcodes or the SCAN/fRMA-corrected estimates for your classifications? Just curious.
It seems like SCAN.UPC is ultimately a bit more flexible than the current incarnation of fRMA for completely reannotated arrays (unpublished observations of my own, which with any luck will eventually make it out there), so if there are HUGE differences, I'm interested. If the differences are minor, then in my applications the biological noise appears to dwarf them.
Anyways, in your specific case, I wonder if http://www.bioconductor.org/packages/release/data/annotation/html/hthgu133afrmavecs.html would help, since
1) RMA ignores MM probes last time I checked, and
2) if there are no new probe sequences, the platform should (!) be a strict subset of the HT-HGU133A design
Have you taken a peek at that, and perhaps tried subsetting the included vectors for your chips?
Hope this helps.
I haven't messed around with the barcoding using either SCAN or fRMA. That's another item on my list of things to try. I was just using the corrected estimates from both.
I believe that this platform is in fact a strict subset of standard HG-U133 arrays, and I was thinking of maybe subsetting the vectors from another platform. But I'll have to figure out the internal structure of that package first. I also worry that the probes may have been moved around on the new design, or that the difference between single arrays and plates of 16 or 24 arrays might affect things somehow.
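To make the subsetting idea concrete, here is a rough, language-agnostic sketch in plain Python of what I have in mind. It matches probes between the two designs by probe sequence rather than by position on the chip, so it would not matter if probes had been moved around. Everything in it is an assumption for illustration: the function name `subset_per_probe_vector` is made up, the toy sequences and values are invented, and it does not reflect the actual internal structure of any fRMA vectors package. It also only makes sense for quantities stored one-per-probe; anything that is not tied to individual probes (a reference distribution for quantile normalization, for instance) would need separate handling.

```python
from collections import Counter


def subset_per_probe_vector(source_seqs, source_values, target_seqs):
    """Carry a per-probe vector from a source design over to a target design.

    source_seqs   : probe sequences of the source platform, in vector order
    source_values : one numeric value per source probe (hypothetical example:
                    a per-probe effect estimate)
    target_seqs   : probe sequences of the target platform, in the desired order

    Returns (values, unmatched). `values` has one entry per target probe and is
    None wherever no unique match was found; `unmatched` lists those indices.
    """
    if len(source_seqs) != len(source_values):
        raise ValueError("source_seqs and source_values must have equal length")

    counts = Counter(source_seqs)
    # Keep only sequences that occur exactly once in the source design,
    # so that every match is unambiguous.
    lookup = {
        seq: val
        for seq, val in zip(source_seqs, source_values)
        if counts[seq] == 1
    }

    values, unmatched = [], []
    for i, seq in enumerate(target_seqs):
        if seq in lookup:
            values.append(lookup[seq])
        else:
            values.append(None)  # absent from source, or ambiguous there
            unmatched.append(i)
    return values, unmatched


# Toy data, invented purely for illustration.
source_seqs = ["ACGT", "TTGA", "GGCC", "CATG"]
source_effects = [0.10, -0.25, 0.40, 0.05]
target_seqs = ["GGCC", "ACGT", "AAAA"]  # reordered; last probe not in source

values, unmatched = subset_per_probe_vector(source_seqs, source_effects, target_seqs)
print(values)     # [0.4, 0.1, None]
print(unmatched)  # [2]
```

If `unmatched` came back non-empty on the real probe annotations, that would tell me the platform is not a strict subset after all, and that generating fresh vectors is probably the safer route.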