Combining Affymetrix miRNA samples from different arrays
@michal-blazejczyk-2231
Dear group,

We have a customer who has two datasets obtained using two Affymetrix miRNA arrays: miRNA-1_0 and miRNA-1_0_2xgain. The data are supposed to be analyzed together but the problem is that the expression profiles from both arrays look very different. Here's what I did so far:

- Normalized both datasets separately using APT. I used APT because to my knowledge there is no CDF package for the 2xgain array in Bioconductor.
- Then I applied an additional quantile normalization step using limma.

The problem is that even with that, the datasets have very different profile patterns (when I do a PCA plot, 96% of the variance is explained by the difference in array types).

I was hoping that maybe someone here has encountered this type of a problem before, and would be willing to share his/her thoughts.

Best regards,

Michal Blazejczyk
FlexArray Lead Developer
McGill University and Genome Quebec Innovation Centre
http://genomequebec.mcgill.ca/FlexArray
Normalization cdf • 777 views
@james-w-macdonald-5106
Hi Michal,

On 12/9/2010 12:41 PM, Michal Blazejczyk wrote:
> Dear group,
>
> We have a customer who has two datasets obtained using two Affymetrix
> miRNA arrays: miRNA-1_0 and miRNA-1_0_2xgain. The data are supposed
> to be analyzed together but the problem is that the expression profiles
> from both arrays look very different. Here's what I did so far:
> - Normalized both datasets separately using APT.
>   I used APT because to my knowledge there is no CDF package for the
>   2xgain array in Bioconductor.

True, and there probably won't be one until April. I have built the cdf package, and it is identical to the miRNA-1_0 package, so you can simply change the cdfName of the AffyBatch you get with the 2xgain chips to miRNA-1_0 and go from there.

> - Then I applied an additional quantile normalization step using limma.
> The problem is that even with that, the datasets have very different
> profile patterns (when I do a PCA plot, 96% of the variance is explained
> by the difference in array types).

You won't be able to normalize together. There are things you can do, however. The simplest thing would be to assume that the differences between sample types are just location differences (rather than scale differences), and fit a model with a batch effect. Or you can assume scale differences as well, and convert to Z-scores first. Or you can go the full meta-analysis route and just use the p-values from separate analyses.

It all depends on what your goals are, and what the data look like. But more complicated analyses like these lend themselves to help from a local statistician rather than help via listserv.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
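For readers who want to see what the three options mean numerically, here is a minimal sketch in plain Python (standard library only). It is not the limma or affy API, and the function names and toy values are made up for illustration. It treats one probe measured on two array types, where the second type reads at twice the scale of the first. Per-array-type centering is only a stand-in for the location-only batch term: in a real analysis that term would be estimated in the linear model together with the condition of interest, since centering alone also removes real biology when conditions are unbalanced across array types. The p-value combination shown is Fisher's method, which is one common choice and not necessarily the one intended above.

```python
import math
from statistics import mean, stdev


def _by_batch(values, batches):
    """Group values by their batch label (here: array type)."""
    groups = {}
    for v, b in zip(values, batches):
        groups.setdefault(b, []).append(v)
    return groups


def center_by_batch(values, batches):
    """Location-only adjustment: subtract each batch's mean."""
    means = {b: mean(vs) for b, vs in _by_batch(values, batches).items()}
    return [v - means[b] for v, b in zip(values, batches)]


def zscore_by_batch(values, batches):
    """Location-and-scale adjustment: z-score within each batch."""
    stats = {b: (mean(vs), stdev(vs))
             for b, vs in _by_batch(values, batches).items()}
    return [(v - stats[b][0]) / stats[b][1] for v, b in zip(values, batches)]


def fisher_combined_p(pvalues):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k df under H0."""
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # this is X / 2
    # Closed-form chi-square survival function, valid for even df = 2k.
    return math.exp(-half_x) * sum(
        half_x ** i / math.factorial(i) for i in range(k)
    )


if __name__ == "__main__":
    # Hypothetical log-intensities for one probe: four samples per array type.
    values = [5.0, 6.0, 7.0, 8.0, 10.0, 12.0, 14.0, 16.0]
    batches = ["miRNA-1_0"] * 4 + ["miRNA-1_0_2xgain"] * 4

    print("centered:", center_by_batch(values, batches))
    print("z-scored:", zscore_by_batch(values, batches))
    # Hypothetical per-array-type p-values for the same probe.
    print("combined p:", fisher_combined_p([0.05, 0.05]))
```

In this toy case centering leaves the second array type with a wider spread than the first, while z-scoring makes the two sets of values line up, which is the difference between assuming location-only and location-plus-scale effects.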