Question

Selecting probe data from Clariom D assays using the fid probe tags


AliStair Rust • 0

@alistair-rust-18202

Last seen 6.5 years ago

Hi,

I'm likely to be working with some Clariom D assays and I'm starting to do some initial explorations.

There are some example CEL files provided on the Thermo Fisher website and I'm using those as my test set.

I'd like to select specific sets of probes from the total set of probes on the assay. Having run the code below for a single CEL file, I'm confused as to how I can use a probe's fid to select it from the associated expression set.

I can extract the probe fids using getProbeInfo but I'm stuck after that. I notice that featureData is set to "none" for the CEL data read in.

Is a probe's fid its index in the expression set matrix?

Thanks

Alistair

> library(oligo)
> library(affycoretools)
> d_file <- "Clariom/Clariom_D/sample_data/WTPlus_Liver_Rep1_ClariomD.CEL"
> d_data <- read.celfiles(d_file)
Platform design info loaded.
Reading in : /Clariom/Clariom_D/sample_data/WTPlus_Liver_Rep1_ClariomD.CEL

> d_data
HTAFeatureSet (storageMode: lockedEnvironment)
assayData: 6892960 features, 1 samples
  element names: exprs
protocolData
  rowNames: WTPlus_Liver_Rep1_ClariomD.CEL
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: WTPlus_Liver_Rep1_ClariomD.CEL
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.clariom.d.human

> d_probes <- getProbeInfo(d_data, field = c('fid', 'fsetid', 'type', 'x', 'y'))
> head(d_probes)
  fid         man_fsetid   fsetid  x y      type
1   6 PSR1700199794.hg.1 24403702  5 0 main->psr
2   8           24657315 24657315  7 0      <NA>
3   8 PSR1300152110.hg.1 24258776  7 0 main->psr
4   9 PSR0200224250.hg.1 23827198  8 0 main->psr
5  11           24587906 24587906 10 0      <NA>
6  11 PSR0300183028.hg.1 23858357 10 0 main->psr

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=C
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] pd.clariom.d.human_3.14.1 DBI_1.0.0
 [3] RSQLite_2.1.1             affycoretools_1.50.6
 [5] oligo_1.42.0              Biostrings_2.46.0
 [7] XVector_0.18.0            IRanges_2.12.0
 [9] S4Vectors_0.16.0          Biobase_2.38.0
[11] oligoClasses_1.40.0       BiocGenerics_0.24.0

clariom pd.clariom.d.human • 1.1k views

ADD COMMENT • link updated 6.5 years ago by James W. MacDonald 68k • written 6.5 years ago by AliStair Rust • 0

score 1 · Answer 1 · 2018-11-05


James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 3 hours ago

United States

The simple answer is yes, the fid is the row of the probe-level expression matrix.
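
As a rough illustration of that answer, here is a minimal base-R sketch that uses a small made-up matrix in place of exprs(d_data), so it runs without oligo or a CEL file. The names probe_matrix, probe_info, wanted_fids and selected are hypothetical, and the intensity values are invented; only the fid values are taken from the head(d_probes) output in the question. Note that a fid can appear on more than one row of the probe table (fids 8 and 11 do above), so the sketch de-duplicates before indexing.

```r
# Toy stand-in for exprs(d_data): 12 probes (rows) by 2 samples (columns).
# The values are made up; only the row layout matters here.
probe_matrix <- matrix(
  seq_len(24) * 10,
  nrow = 12,
  dimnames = list(as.character(1:12), c("sample_A", "sample_B"))
)

# Probe table shaped like the head(d_probes) output in the question.
probe_info <- data.frame(
  fid    = c(6, 8, 8, 9, 11, 11),
  fsetid = c(24403702, 24657315, 24258776, 23827198, 24587906, 23858357)
)

# The same fid can be listed once per probe set it belongs to,
# so de-duplicate before using it as a row index.
wanted_fids <- unique(probe_info$fid)

# Treat each fid as a row number; drop = FALSE keeps the result a matrix.
selected <- probe_matrix[wanted_fids, , drop = FALSE]
print(selected)
```

With real data the same pattern would presumably be applied to exprs(d_data) directly, on the assumption (per the answer above) that row i of that matrix holds the probe with fid i.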

ADD COMMENT • link 6.5 years ago James W. MacDonald 68k


Thanks James.

So something like the following, for the small example above, would give me what I expect:

wish_list_probe_fids <- c(6, 8, 9, 11)

eset_filtered <- exprs(d_data)[wish_list_probe_fids]
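
One detail worth checking in the line above: indexing a matrix without a comma uses R's linear (column-major) indexing. With a single CEL file that happens to return the right values as a plain vector, but with several samples it would only reach into the first column. The sketch below shows the difference on a made-up 12 x 2 matrix; m, fids, as_vector and as_rows are hypothetical names used only for illustration.

```r
# Toy 12 x 2 integer matrix standing in for a probe-level matrix
# with two samples; the values are invented.
m <- matrix(1:24, nrow = 12,
            dimnames = list(as.character(1:12), c("s1", "s2")))
fids <- c(6, 8, 9, 11)

# No comma: linear indexing, giving a plain vector taken from column 1 only.
as_vector <- m[fids]

# With a comma: whole rows are kept, across every sample.
as_rows <- m[fids, , drop = FALSE]

print(as_vector)
print(dim(as_rows))
```

Under the same assumption that a fid is the row number, the multi-sample form of the selection would be exprs(d_data)[wish_list_probe_fids, , drop = FALSE].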