Guilherme Rocha
Dear all,
I am trying to build the pdInfoBuilder annotation package for Affymetrix's GeneChip Human Transcriptome Array 2.0 (HTA 2.0).
I am using the original pgf, clf, mps, and probeset.csv files from the library files on Affymetrix's website (http://www.affymetrix.com/Auth/analysis/downloads/lf/hta/HTA-2_0/AGCC_library_installer_HTA-2_0.zip).
I was able to read the probeset.csv file with plain read.csv, so the solution given for a similar problem with the Arabidopsis chips probably does not apply (see "pdInfoBuilder fails on the new Arabidopsis Gene ST 1.0 & 1.1 arrays", https://stat.ethz.ch/pipermail/bioconductor/2012-March/044231.html).
The package build fails with a "datatype mismatch" error while inserting rows into the featureSet table. Details are shown below.
Any help greatly appreciated.
Regards,
Guilherme Rocha
----------------------------------------------------------------------
R Code and output:
> library(pdInfoBuilder)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following object is masked from 'package:stats':
xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, as.vector, cbind, colnames, duplicated, eval, evalq,
get, intersect, is.unsorted, lapply, mapply, match, mget, order,
paste, pmax, pmax.int, pmin, pmin.int, rank, rbind, rep.int,
rownames, sapply, setdiff, sort, table, tapply, union, unique,
unlist
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: RSQLite
Loading required package: DBI
Loading required package: affxparser
Loading required package: oligo
Loading required package: oligoClasses
Welcome to oligoClasses version 1.24.0
================================================================================
Welcome to oligo version 1.26.0
================================================================================
Attaching package: 'oligo'
The following object is masked from 'package:BiocGenerics':
normalize
>
> base_dir = "./"
>
> pgf = paste(base_dir, "/HTA-2_0.r1.pgf", sep="")
> clf = paste(base_dir, "/HTA-2_0.r1.clf", sep="")
> prob = paste(base_dir, "/HTA-2_0.na33.hg19.probeset.csv", sep="")
> core_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
> extended_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
> full_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
>
> test_csv = read.csv(paste(base_dir, "/HTA-2_0.na33.hg19.probeset.csv", sep=""), skip=14, header=T)
>
> seed = new("AffyExonPDInfoPkgSeed",
+ pgfFile = pgf,
+ clfFile = clf,
+ probeFile = prob,
+ coreMps = core_mps,
+ extendedMps = extended_mps,
+ fullMps = full_mps,
+ author = "GR",
+ email = "anemailadress@gmail.com",
+ biocViews = "AnnotationData",
+ genomebuild = "GRCh37",
+ organism = "Human",
+ species = "Homo sapiens",
+ url = "")
>
> makePdInfoPackage(seed, destDir=base_dir);
================================================================================
Building annotation package for Affymetrix Exon ST Array
PGF.........: HTA-2_0.r1.pgf
CLF.........: HTA-2_0.r1.clf
Probeset....: HTA-2_0.na33.hg19.probeset.csv
Transcript..: TheTranscriptFile
Core MPS....: HTA-2_0.r1.Psrs.mps
Full MPS....: HTA-2_0.r1.Psrs.mps
Extended MPS: HTA-2_0.r1.Psrs.mps
================================================================================
Parsing file: HTA-2_0.r1.pgf... OK
Parsing file: HTA-2_0.r1.clf... OK
Creating initial table for probes... OK
Creating dictionaries... OK
Parsing file: HTA-2_0.na33.hg19.probeset.csv... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Creating package in .//pd.hta.2.0
Inserting 850 rows into table chrom_dict... OK
Inserting 5 rows into table level_dict... OK
Inserting 11 rows into table type_dict... OK
Inserting 577432 rows into table core_mps... OK
Inserting 577432 rows into table full_mps... OK
Inserting 577432 rows into table extended_mps... OK
Inserting 1839617 rows into table featureSet... Error in sqliteExecStatement(con, statement, bind.data) :
  RS-DBI driver: (RS_SQLite_exec: could not execute: datatype mismatch)
>
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] pdInfoBuilder_1.26.0 oligo_1.26.0 oligoClasses_1.24.0
[4] affxparser_1.34.0 RSQLite_0.11.4 DBI_0.2-7
[7] Biobase_2.22.0 BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] BiocInstaller_1.12.0 Biostrings_2.30.0 GenomicRanges_1.14.1
[4] IRanges_1.20.0 XVector_0.2.0 affyio_1.30.0
[7] bit_1.1-10 codetools_0.2-8 ff_2.2-12
[10] foreach_1.4.1 iterators_1.0.6 preprocessCore_1.24.0
[13] splines_3.0.2 stats4_3.0.2 zlibbioc_1.8.0
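
A possible follow-up check, sketched under assumptions: since the build stops with a "datatype mismatch" while filling featureSet, one thing worth ruling out is whether some column of the probeset annotation holds values that are not whole numbers where the table might expect integers. That this is the cause is only a guess. The helper `non_integer_columns` and the toy data frame below are hypothetical and not part of pdInfoBuilder; the real input would be the data frame returned by the read.csv call shown above.

```r
# Hypothetical helper (not part of pdInfoBuilder): return the names of the
# columns in a data frame that contain at least one non-empty value which
# cannot be read as a whole number.
non_integer_columns <- function(df) {
  is_bad <- vapply(df, function(col) {
    vals <- as.character(col)
    vals <- vals[!is.na(vals) & vals != ""]   # ignore missing/empty cells
    any(!grepl("^-?[0-9]+$", vals))           # TRUE if any value is not an integer string
  }, logical(1))
  names(df)[is_bad]
}

# Toy stand-in for a few rows of an annotation table (made-up values,
# not taken from the HTA 2.0 files).
toy <- data.frame(
  probeset_id = c("1", "2", "3"),
  start       = c("100", "200", "---"),
  stringsAsFactors = FALSE
)

print(non_integer_columns(toy))
```

Applied to `test_csv` from the session above, the same call would list every column with non-integer content. Most of those columns are expected to be text, so only the ones that end up in integer-typed fields of the featureSet table would be of interest.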