makecdfenv package bug
@james-w-macdonald-5106
Hi Danny,

In general, you don't use the makecdfenv/affy pipeline for tiling arrays, as there aren't (to my knowledge) any probesets. Instead, there are just probes, tiled along the genome.

The affy package is predicated upon the idea that a set of probes is grouped into a probeset, which is intended to measure the expression of a transcript. Since tiling arrays are completely different, the two don't really mix.

Normally I would point you to the oligo package, but you need to build a pdInfoPackage, which expects a bpmap file, not a cdf. In addition, I tried to read in the cdf that you can get from GEO using readCdfUnits() from affxparser, and it consistently segfaulted, so there might be a problem with the cdf itself.

Looking around, it appears you might be better served by using aroma (http://www.aroma-project.org/), which is supposed to handle tiling arrays (but since aroma uses affxparser, maybe it won't work).

Or you could try Affy's software:

http://www.affymetrix.com/estore/partners_programs/programs/developer/TilingArrayTools/index.affx

Best,

Jim

On 6/2/2014 2:55 PM, Danny Arends wrote:
> Hey,
>
> I got a bug trying to create a custom cdf environment, which I need to
> analyse some affy arrays:
>
> Both functions give the same error:
>
> > make.cdf.package("GPL16303_TilingatSNPtilx520433_At_TAIRG.cdf",
>     species="Arabidopsis_Thaliana")
> Reading CDF file.
> Creating CDF environment
> Wait for about 0 dots
> Error in assign(x[i], value[[i2]], envir = envir, inherits = inherits) :
>   invalid first argument
>
> > env <- make.cdf.env("GPL16303_TilingatSNPtilx520433_At_TAIRG.cdf")
> Reading CDF file.
> Creating CDF environment
> Wait for about 0 dots
> Error in assign(x[i], value[[i2]], envir = envir, inherits = inherits) :
>   invalid first argument
>
> My R version:
> > version
> platform       x86_64-pc-linux-gnu
> arch           x86_64
> os             linux-gnu
> system         x86_64, linux-gnu
> status
> major          3
> minor          1.0
> year           2014
> month          04
> day            10
> svn rev        65387
> language       R
> version.string R version 3.1.0 (2014-04-10)
> nickname       Spring Dance
>
> Is there any fix for this???, because I really wanna look into my array
> data...
>
> Gr,
> Danny Arends

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
cdf affy affxparser oligo
@james-w-macdonald-5106
Hi Danny,

Depends on what you think is 'easy'. ;-D

Note that the celfiles are read in row by row. This includes all the poly-A 'landing lights' that the scanner uses to figure out how to align the camera. Also note that there is a function in affy called read.probematrix(), which just reads the data from multiple celfiles into a matrix, where the row of the matrix corresponds to the 'index' location of the probe on the array.

The mapping file you reference has this sort of data:

AT1G01010_at  1  +  3783  1193  1062  NA
AT1G01010_at  1  +  3888   980   824  NA
AT1G01010_at  1  +  4015   927   176  NA
AT1G01010_at  1  +  4195  1542   695  NA
AT1G01010_at  1  +  4525  1527   762  NA
AT1G01010_at  1  +  4712   760   830  NA
AT1G01010_at  1  +  4789  1239     6  NA
AT1G01010_at  1  +  4860   626    38  NA
AT1G01010_at  1  +  5009  1021     7  NA

Where the columns are (in order)

Probe ID
Chr
Strand
Start
X
Y
Probeset name (which is NA, as there are no probesets)

So you have the (x,y) coordinates of the probes on the chip, and where the probes are in the genome, but when you read the data in you just have the index position. So you need to convert the (x,y) coordinates to the index positions. There is a function in affy called xy2indices() that you can use to convert things. All you need to know is the number of columns in the array, which you can get from read.celfile.header().

So you could hypothetically read in the data using read.probematrix(), normalize using (probably easiest) normalizeBetweenArrays() from limma, convert the (x,y) probe locations to indices, merge things appropriately, and then if you want to be really cool, put all that into a GRanges object so you can use things like Gviz to make sweet plots.

Best,

Jim

On 6/3/2014 11:07 AM, Danny Arends wrote:
> Hey James,
>
> Thanks for your answer, I'll look into your suggestions...
>
> However just to be sure, is there an 'easy' hack to get the probes out
> of the CDF file and match them to the CEL file information?
>
> I have found the following files that describe the array:
> cdf.gz
> desc.txt.gz
> mapping.txt.gz
> probe_tab.txt.gz
>
> Just getting the probe locations or sequences is enough, then I could
> start the analysis myself
> (either by mapping probes to the reference using blast, or using the
> supplied locations),
>
> I was hoping that I could use the Affy package for normalization of
> probe intensities, etc
>
> Gr,
> Danny

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
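The (x,y)-to-index step described in this reply can be sketched in base R. This is only an illustration, not the affy implementation: it assumes zero-based (x,y) coordinates and cells stored row by row, so that index = y * n_cols + x + 1, which should be checked against xy2indices() on the real data. The helper names xy_to_index() and attach_intensities() and the value of n_cols are hypothetical; the real column count comes from the CEL header.

```r
# First three probe rows copied from the mapping file excerpt in this thread.
mapping <- data.frame(
  probe_id = rep("AT1G01010_at", 3),
  chr      = c(1L, 1L, 1L),
  strand   = c("+", "+", "+"),
  start    = c(3783L, 3888L, 4015L),
  x        = c(1193L, 980L, 927L),
  y        = c(1062L, 824L, 176L),
  stringsAsFactors = FALSE
)

# HYPOTHETICAL column count; read the real value from the CEL file header.
n_cols <- 2560L

# ASSUMPTION: (x, y) are zero-based and cells are stored row by row,
# so the one-based index of a cell is y * n_cols + x + 1.
xy_to_index <- function(x, y, n_cols) {
  stopifnot(all(x >= 0), all(x < n_cols), all(y >= 0))
  y * n_cols + x + 1L
}

mapping$index <- xy_to_index(mapping$x, mapping$y, n_cols)

# Look up each probe's intensities in a (cells x arrays) matrix whose
# row number is the cell index, and bind them to the annotation.
attach_intensities <- function(mapping, intensities) {
  stopifnot(all(mapping$index <= nrow(intensities)))
  cbind(mapping, intensities[mapping$index, , drop = FALSE])
}
```

With a real intensity matrix (one row per cell, one column per CEL file), attach_intensities(mapping, intensities) would give one row per probe with its genomic position and its intensity on each array, which is the shape needed before building something like a GRanges object.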
Danny Arends
@danny-arends-6585
Hey James,

Thanks for your answer, I'll look into your suggestions...

However, just to be sure, is there an 'easy' hack to get the probes out of the CDF file and match them to the CEL file information?

I have found the following files that describe the array:

cdf.gz
desc.txt.gz
mapping.txt.gz
probe_tab.txt.gz

Just getting the probe locations or sequences is enough; then I could start the analysis myself (either by mapping probes to the reference using blast, or using the supplied locations).

I was hoping that I could use the Affy package for normalization of probe intensities, etc.

Gr,
Danny
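For the normalization of probe intensities asked about here, a plain quantile normalization across arrays can be sketched in base R. This is a simplified stand-in, not the code used by affy or limma: it assumes a numeric matrix with one row per probe, one column per array, no missing values, and it breaks ties by order of appearance. The function name quantile_normalize() and the toy values are hypothetical.

```r
# Quantile-normalize the columns of a probe-by-array intensity matrix so
# that every array ends up with the same empirical distribution.
# ASSUMPTIONS: numeric matrix, at least two rows, no NA values.
quantile_normalize <- function(m) {
  stopifnot(is.matrix(m), is.numeric(m), nrow(m) > 1, !anyNA(m))
  # Rank of each value within its own array (ties broken by position).
  ranks <- apply(m, 2, rank, ties.method = "first")
  # Reference distribution: mean of the k-th smallest value across arrays.
  target <- rowMeans(apply(m, 2, sort))
  # Replace each value with the reference value of the same rank.
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(m)
  out
}

# Toy placeholder intensities (not real data): 4 probes on 2 arrays.
toy <- cbind(array1 = c(5, 2, 3, 4), array2 = c(40, 10, 30, 20))
toy_normalized <- quantile_normalize(toy)
```

After this step each column contains the same set of values, only in a different order, so differences in overall brightness between arrays are removed while the ranking of probes within each array is kept.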