Missing probesets when creating Affymetrix GeneChip miRNA 4.0 CDF package using makecdfenv package
Isaac Neuhaus ▴ 360
@isaac-neuhaus-22
Last seen 9.5 years ago
United States
Lei Huang [guest] <guest at ...> writes:

> Dear all,
>
> I am working on a set of Affymetrix GeneChip miRNA 4.0 microarray data and would like to perform differential expression analysis using Bioconductor packages. Since this is a fairly new platform, no CDF or annotation packages are available in the Bioconductor repository at the moment. The Affymetrix folks kindly provided me with the miRNA 4.0 CDF file as well as sample CEL data, so I decided to create a CDF package on my own using make.cdf.package() from the makecdfenv package. I was able to make the package and install it without trouble. However, after I read the raw CEL files and normalized the AffyBatch with vsnrma()/rma(), I found that the number of probesets is only 25065, while the original Affymetrix miRNA 4.0 CDF file contains 36249. I am aware that from version 4 Affymetrix changed their naming convention for the probeset IDs, but this shouldn't cause probesets to go missing. What did I do wrong? I would really appreciate any hints or advice on solving this problem.
>
> -Lei
>
> --
> Lei Huang
> Center for Research Informatics
> Biological Science Division
> University of Chicago
> http://cri.uchicago.edu
> --
>
> P.S. The following are the code and output from my R session:
>
> > setwd("~/Documents/Project/mirna/GeneChip 4-0 Array Sample Data")
> > library(affy)
> > library(makecdfenv)
> Loading required package: affyio
> > pkgpath <- tempdir()
> > pname <- cleancdfname(whatcdf("20131118_Human-Brain-AM7962-130ng_rep1_(miRNA-4_0).CEL"))
> > make.cdf.package("miRNA-4_0-st-v1.cdf", cdf.path="~/Documents/Project/mirna/miRNA-4_0-st-v1_CDF",
> + compress=FALSE, species = "", packagename=pname, package.path = pkgpath)
> Reading CDF file.
> Creating CDF environment
> Wait for about 251 dots........................................................................
> ............................................................................
> ............................................................................
> .............................
> Creating package in /var/folders/rh/rrlg3bcs6kgcj89zm4mgjjxh0000gq/T//RtmpRos3Be/mirna40cdf
>
> README PLEASE:
> A source package has now been produced in
> /var/folders/rh/rrlg3bcs6kgcj89zm4mgjjxh0000gq/T//RtmpRos3Be/mirna40cdf.
> Before using this package it must be installed via 'R CMD INSTALL'
> at a terminal prompt (or DOS command shell).
> If you are using Windows, you will need to get set up to install packages.
> See the 'R Installation and Administration' manual, specifically
> Section 6 'Add-on Packages' as well as 'Appendix E: The Windows Toolset'
> for more information.
>
> Alternatively, you could use make.cdf.env(), which will not require you to install a package.
> However, this environment will only persist for the current R session
> unless you save() it.
>
> > ## install the cdf package from shell
> > ## cd to mirna40cdf location
> > ## R CMD INSTALL mirna40cdf
> >
> > library(limma)
> > library(vsn)
> > library(mirna40cdf)
> >
> > affybatch <- ReadAffy(filenames=list.files())
> > affybatch@cdfName
> [1] "miRNA-4_0"
> > ## normalization
> > eset.norm <- vsnrma(affybatch)
> vsn2: 292681 x 8 matrix (1 stratum).
> Please use 'meanSdPlot' to verify the fit.
> Calculating Expression
> > ## only 25,065 probesets, the original Affymetrix cdf file contains 36,249 probesets
> > dim(eset.norm)
> Features  Samples
>    25065        8
>
> -- output of sessionInfo():
>
> > sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] mirna40cdf_1.38.0 AnnotationDbi_1.24.0 vsn_3.30.0
> [4] limma_3.18.9 makecdfenv_1.38.0 affyio_1.30.0
> [7] affy_1.40.0 Biobase_2.22.0 BiocGenerics_0.8.0
>
> loaded via a namespace (and not attached):
> [1] BiocInstaller_1.12.0 compiler_3.0.2 DBI_0.2-7
> [4] grid_3.0.2 IRanges_1.20.6 lattice_0.20-24
> [7] preprocessCore_1.24.0 RSQLite_0.11.4 stats4_3.0.2
> [10] tools_3.0.2 zlibbioc_1.8.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at ...
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

I came across a similar problem with a brainCDF, where makecdfenv was producing a package with fewer probesets. I believe the problem is in the C code that parses ASCII CDF files, since I was able to correct the problem by converting the text CDF to binary and then reading the binary file with the makecdfenv package:

    library("affxparser")
    library(makecdfenv)

    convertCdf("HGU133PLUS2_HS_REFSEQ.CDF", "hgu133plus2hsrefseqcdf", version=4, verbose=TRUE)

    make.cdf.package("hgu133plus2hsrefseqcdf",
                     version = packageDescription("makecdfenv", field = "Version"),
                     species = "H. sapiens", unlink = TRUE)

I hope this helps.

Isaac
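Before or after rebuilding the package, it can help to confirm how many probesets the text CDF itself contains, so that the figure from dim(eset.norm) has something to be compared against. The following is a minimal sketch in base R, not part of makecdfenv or affxparser. It assumes the usual ASCII CDF layout (a [Chip] section with a NumberOfUnits= line, one [UnitN] section per unit, and [UnitN_BlockM] sections whose Name= line carries the probeset ID). The function name count_cdf_units and the small demo file are hypothetical. The duplicate-name count is only something worth ruling out, not a confirmed cause of the missing probesets.

```r
# Count probesets in an ASCII (text) CDF using base R only.
# Assumed layout: [Chip] has NumberOfUnits=, each unit has a [UnitN] section,
# and each [UnitN_BlockM] section has a Name= line with the probeset ID.
count_cdf_units <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))

  # Unit count declared in the [Chip] header (NA if the line is absent).
  declared <- grep("^NumberOfUnits=", lines, value = TRUE)
  declared <- if (length(declared) > 0) {
    as.integer(sub("^NumberOfUnits=", "", declared[1]))
  } else {
    NA_integer_
  }

  # Number of [UnitN] section headers (block headers are excluded).
  unit_sections <- length(grep("^\\[Unit[0-9]+\\]$", lines))

  # Map every line to the section header it belongs to.
  is_header <- grepl("^\\[.*\\]$", lines)
  section <- cumsum(is_header)
  header_text <- lines[is_header]
  in_block <- section > 0 &
    grepl("^\\[Unit[0-9]+_Block[0-9]+\\]$", header_text[pmax(section, 1)])

  # Probeset IDs are the Name= lines inside block sections.
  block_names <- sub("^Name=", "", lines[in_block & grepl("^Name=", lines)])

  list(
    declared = declared,
    unit_sections = unit_sections,
    block_names = length(block_names),
    unique_block_names = length(unique(block_names))
  )
}

# Hypothetical three-unit CDF in which two blocks share one name.
demo <- c(
  "[CDF]", "Version=GC3.0", "",
  "[Chip]", "Name=demo", "Rows=2", "Cols=2", "NumberOfUnits=3", "",
  "[Unit1]", "Name=NONE", "NumberBlocks=1", "",
  "[Unit1_Block1]", "Name=probeset_a", "",
  "[Unit2]", "Name=NONE", "NumberBlocks=1", "",
  "[Unit2_Block1]", "Name=probeset_b", "",
  "[Unit3]", "Name=NONE", "NumberBlocks=1", "",
  "[Unit3_Block1]", "Name=probeset_b"
)
tmp <- tempfile(fileext = ".cdf")
writeLines(demo, tmp)
counts <- count_cdf_units(tmp)
print(counts)
```

Running count_cdf_units() on the real text CDF and comparing its counts with the number of features reported after normalization shows whether units are dropped while the file is parsed or whether the file itself holds fewer distinct names than expected. The sketch only reads text CDFs; a binary CDF would need a dedicated reader.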
miRNA Annotation cdf makecdfenv • 2.0k views