Guest User
@guest-user-4897
Dear all,
I am working on a set of Affymetrix GeneChip miRNA 4.0 microarray data
and would like to perform differential expression analysis using
Bioconductor packages. Since this is a fairly new platform, no CDF or
annotation packages are available in the Bioconductor repository at the
moment. The Affymetrix folks kindly provided me with the miRNA 4.0 CDF
file as well as sample CEL data, so I decided to create a CDF package on
my own using make.cdf.package() from the makecdfenv package. I was able
to build the package and install it without trouble. However, after I
read the raw CEL files and normalized the AffyBatch with vsnrma()/rma(),
I found that the number of probesets is only 25,065, while the original
Affymetrix miRNA 4.0 CDF file contains 36,249. I am aware that, starting
with version 4, Affymetrix changed the naming convention for probeset
IDs, but this shouldn't cause probesets to go missing. What did I do
wrong? I would really appreciate it if anyone could give me some hints
or advice on solving this problem.
-Lei
--
Lei Huang
Center for Research Informatics
Biological Science Division
University of Chicago
http://cri.uchicago.edu
--
P.S. The following are the code and output from my R session:
> setwd("~/Documents/Project/mirna/GeneChip 4-0 Array Sample Data")
> library(affy)
> library(makecdfenv)
Loading required package: affyio
> pkgpath <- tempdir()
> pname <- cleancdfname(whatcdf("20131118_Human-Brain-AM7962-130ng_rep1_(miRNA-4_0).CEL"))
> make.cdf.package("miRNA-4_0-st-v1.cdf",
+ cdf.path="~/Documents/Project/mirna/miRNA-4_0-st-v1_CDF",
+ compress=FALSE, species = "", packagename=pname,
+ package.path = pkgpath)
Reading CDF file.
Creating CDF environment
Wait for about 251 dots...............................................
......................................................................
......................................................................
..................................................................
Creating package in /var/folders/rh/rrlg3bcs6kgcj89zm4mgjjxh0000gq/T//RtmpRos3Be/mirna40cdf
README PLEASE:
A source package has now been produced in
/var/folders/rh/rrlg3bcs6kgcj89zm4mgjjxh0000gq/T//RtmpRos3Be/mirna40cdf.
Before using this package it must be installed via 'R CMD INSTALL'
at a terminal prompt (or DOS command shell).
If you are using Windows, you will need to get set up to install
packages.
See the 'R Installation and Administration' manual, specifically
Section 6 'Add-on Packages' as well as 'Appendix E: The Windows
Toolset'
for more information.
Alternatively, you could use make.cdf.env(), which will not require
you to install a package.
However, this environment will only persist for the current R session
unless you save() it.
## install the cdf package from shell
## cd to mirna40cdf location
## R CMD INSTALL mirna40cdf
> library(limma)
> library(vsn)
> library(mirna40cdf)
>
> affybatch <- ReadAffy(filenames=list.files())
> affybatch@cdfName
[1] "miRNA-4_0"
## normalization
> eset.norm <- vsnrma(affybatch)
vsn2: 292681 x 8 matrix (1 stratum).
Please use 'meanSdPlot' to verify the fit.
Calculating Expression
## only 25,065 probesets; the original Affymetrix CDF file contains
## 36,249 probesets
> dim(eset.norm)
Features Samples
25065 8
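One way to narrow down where the probesets are lost might be to compare the probeset IDs at each stage rather than only the totals. The sketch below uses a hypothetical helper, compare_probesets(), which is not part of affy or makecdfenv, and runs on toy IDs so it is self-contained. The commented lines show where the real inputs could come from, assuming the affxparser package is installed and that the objects from the session above are still available.

```r
# Hypothetical helper (not from affy/makecdfenv): summarise how two
# vectors of probeset IDs differ.
compare_probesets <- function(cdf_ids, eset_ids) {
  list(
    n_cdf        = length(cdf_ids),            # IDs listed in the CDF file
    n_cdf_unique = length(unique(cdf_ids)),    # distinct IDs in the CDF file
    n_eset       = length(eset_ids),           # IDs after normalization
    duplicated   = unique(cdf_ids[duplicated(cdf_ids)]),  # IDs seen more than once
    missing      = setdiff(cdf_ids, eset_ids)  # IDs absent from the result
  )
}

# With the real data, the inputs could be obtained along these lines
# (assumptions, not run here):
#   cdf_ids  <- affxparser::readCdfUnitNames("miRNA-4_0-st-v1.cdf")
#   eset_ids <- featureNames(eset.norm)
#   length(ls(mirna40cdf))  # probesets in the installed CDF environment

# Toy IDs standing in for the real ones.
toy_cdf  <- c("ps1", "ps2", "ps2", "ps3", "ps4")
toy_eset <- c("ps1", "ps2", "ps3")

res <- compare_probesets(toy_cdf, toy_eset)
str(res)
```

If n_cdf_unique turned out to be smaller than n_cdf, that would point to repeated IDs in the CDF file; a non-empty missing vector would instead show which probesets were dropped somewhere between the CDF file and the expression set. Either outcome is only a starting point for further checks, not a diagnosis.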
-- output of sessionInfo():
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] mirna40cdf_1.38.0 AnnotationDbi_1.24.0 vsn_3.30.0
[4] limma_3.18.9 makecdfenv_1.38.0 affyio_1.30.0
[7] affy_1.40.0 Biobase_2.22.0 BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] BiocInstaller_1.12.0 compiler_3.0.2 DBI_0.2-7
[4] grid_3.0.2 IRanges_1.20.6 lattice_0.20-24
[7] preprocessCore_1.24.0 RSQLite_0.11.4 stats4_3.0.2
[10] tools_3.0.2 zlibbioc_1.8.0
--
Sent via the guest posting facility at bioconductor.org.