ReadAffy question


Kimpel, Mark W ▴ 890

@kimpel-mark-w-727

Last seen 10.6 years ago

I work with CEL files that frequently have names assigned randomly with respect to phenotype. I create my pdata files by modifying spreadsheets with file and phenotype information already in appropriate columns. I had been assuming that it did not matter what order the filenames were in the first column of the pdata file: that after being read in, the CEL files would be matched to the appropriate row in pdata and would thus have the correct phenotype assigned.

Some recent work has indicated to me that this is probably NOT the case. Instead, it appears that the files are read in by filename alphanumeric order, and the phenotype and sample name are assigned by row order of the pdata file. This, of course, will often result in incorrect sample names and phenotypes being assigned to files.

I have searched the documentation and help files for an answer to this question to no avail.

How is this supposed to work?

sessionInfo()

Version 2.3.0 Under development (unstable) (2006-01-01 r36947)
i386-pc-mingw32

attached base packages:
 [1] "tcltk"     "splines"   "tools"     "methods"   "stats"     "graphics"
 [7] "grDevices" "utils"     "datasets"  "base"

other attached packages:
    tkWidgets        DynDoc    reposTools   widgetTools    rat2302cdf
      "1.9.0"       "1.9.0"       "1.9.1"       "1.7.0"       "1.5.1"
affycoretools       GOstats      multtest    genefilter      survival
      "1.3.1"       "1.5.4"       "1.8.0"       "1.9.2"        "2.20"
       xtable          RBGL      annotate            GO         graph
      "1.3-0"       "1.7.6"       "1.8.0"       "1.6.5"       "1.9.4"
        Ruuid       cluster         limma          affy       Biobase
      "1.9.0"      "1.10.2"       "2.4.4"       "1.9.6"       "1.9.2"
      RWinEdt
      "1.7-3"

Mark W. Kimpel
I.U. School of Medicine

GO Ruuid DynDoc annotate genefilter multtest tkWidgets reposTools affy widgetTools limma • 1.1k views

Updated 19.3 years ago by James W. MacDonald 68k • written 19.3 years ago by Kimpel, Mark W ▴ 890


James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 2 hours ago

United States

Kimpel, Mark William wrote:

> I work with CEL files that frequently have names assigned randomly in
> respect to phenotype. I create my pdata files by modifying spreadsheets
> with file and phenotype information already in appropriate columns. I
> had been assuming that it did not matter what order the filenames were
> in in the first column of the pdata file, that after being read in the
> CEL files would be matched to the appropriate row in pdata and would
> thus have the correct phenotype assigned.
>
> Some recent work has indicated to me that this is probably NOT the case,
> instead, it appears that the files are read in by filename alphanumeric
> order and the phenotype and sample is assigned by row order of the pdata
> file. This, of course, will often result in incorrect sample names and
> phenotypes being assigned to files.
>
> I have searched the documentation and help files for an answer to this
> question to no avail.
>
> How is this supposed to work?

Two ways: you can either read in your data using the widget-based interface, or keep doing things the way you are now, except with the rows of the phenoData object in the same alphanumeric order as the filenames.

The function read.affybatch() simply takes the phenoData object as is and assumes you have ordered things correctly. The relevant line in read.affybatch() is this:

    samplenames <- rownames(pdata)

A call to list.celfiles() can be used to set the order of your phenoData object so that things line up correctly.
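As a rough illustration of that last suggestion, here is a minimal sketch in base R of reordering a phenotype table to match the file order. The file names, the `FileName` and `phenotype` columns, and the variable names are all hypothetical; the file vector is hard-coded so the snippet runs without any CEL files, whereas in a real session it would presumably come from list.celfiles().

```r
# Hypothetical file names. In a real session this vector would come from
# list.celfiles() in the affy package rather than being typed in.
cel_files <- sort(c("B7.CEL", "A3.CEL", "C1.CEL"))

# Hypothetical phenotype table, in arbitrary (spreadsheet) row order.
pd <- data.frame(
  FileName  = c("C1.CEL", "A3.CEL", "B7.CEL"),
  phenotype = c("treated", "control", "treated"),
  stringsAsFactors = FALSE
)

# Stop if the two sets of names differ or a file is listed twice,
# rather than silently attaching phenotypes to the wrong arrays.
stopifnot(setequal(pd$FileName, cel_files), !anyDuplicated(pd$FileName))

# Reorder the rows so that row i describes cel_files[i].
pd <- pd[match(cel_files, pd$FileName), , drop = FALSE]
rownames(pd) <- pd$FileName

# The row names now follow the file order.
stopifnot(identical(rownames(pd), cel_files))
```

Matching by name with match(), rather than sorting both sides separately, keeps the check explicit: the rows follow whatever order the file vector has, and the stopifnot() calls fail if a file is missing from the table.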
Best,

Jim

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

Written 19.3 years ago by James W. MacDonald 68k
