Questions about processing a large dataset in batches
@li-aiguo-nihnci-828
Last seen 10.1 years ago
Hello, everyone.

Our project is a beta tester of Affy HGU133 plus 2 chips, which contain 56,000 probes; each .cel file in text format is about 32 MB. We currently have more than 100 chips to process. I tried to read the .cel files into my machine (1 GB RAM), and it can only read in 19 chips.

I have been communicating with several R experts on our mailing list, and some of them suggested that I split the data into batches for the probe-level analysis and then combine the results at the probeset level using the R cbind/merge functions. I think this is probably the best option for me, because I have concerns about data handling capacity even if I upgrade my RAM to 4 GB.

My second concern is whether batch analysis will affect the final data analysis results. In my opinion, normalization should be done once across all chips. Can I do probe-level normalization within each batch and then apply an additional normalization at the probeset level across all chips, after combining the data, using R/Bioconductor?

Thanks in advance,
Aiguo Lee
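To make the question concrete, the workflow I have in mind could be sketched roughly as below. This is only an illustration in base R under stated assumptions: the per-batch probeset summaries are simulated with random numbers rather than computed from real .cel files, `simulate_batch` and `quantile_normalize` are hypothetical helper names written for this sketch, and quantile normalization is just one possible choice for the second, across-chip step. Whether such a two-stage procedure gives results comparable to normalizing all chips together at the probe level is exactly what I am unsure about.

```r
# Sketch only: simulated probeset-level summaries stand in for real batch output.
set.seed(1)
n_probesets <- 1000
ids <- paste0("ps_", seq_len(n_probesets))

# Hypothetical helper: pretend each batch of chips was read and summarized
# separately, returning a probesets x chips matrix of expression values.
simulate_batch <- function(n_chips, shift, prefix) {
  matrix(rnorm(n_probesets * n_chips, mean = 8 + shift, sd = 2),
         nrow = n_probesets,
         dimnames = list(ids, paste0(prefix, "_chip", seq_len(n_chips))))
}

batch1 <- simulate_batch(19, shift = 0,   prefix = "b1")
batch2 <- simulate_batch(19, shift = 0.5, prefix = "b2")  # artificial batch offset

# Combine at the probeset level; cbind() requires identical row order.
stopifnot(identical(rownames(batch1), rownames(batch2)))
combined <- cbind(batch1, batch2)

# Hypothetical helper: plain quantile normalization across all chips.
# Ties are broken by order of appearance, which is a simplification.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  sorted <- apply(x, 2, sort)
  target <- rowMeans(sorted)  # mean of each quantile across chips
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

normalized <- quantile_normalize(combined)

# After this step every chip shares the same distribution of values.
summary(colMeans(combined))
summary(colMeans(normalized))
```

With real data, the simulated matrices would be replaced by the probeset-level output of each batch, and the rows would need to be matched by probeset ID before combining.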
Normalization probe affy PROcess • 679 views