Question

oligo: how to extract the number of probes that make up a probeset?

Guido Hooiveld ★ 4.1k

I am analyzing some mouse Gene ST 2.0 arrays, and I would like to know how many probes make up each probeset, given a vector of probeset IDs (in order to evaluate the 'reliability' of the expression measurement). For example, probeset "17200001" might consist of 8 probes, probeset "17200003" of 6, etc. However, I don't know how to do this (though I expect some fancy SQL querying is required). I would appreciate it if someone could give me a pointer on how best to approach this. Thanks, Guido

> celFiles <- list.celfiles(full.names = TRUE, listGzipped=TRUE)
> affy.data <- read.celfiles(celFiles)
Loading required package: pd.mogene.2.0.st
Loading required package: RSQLite
Loading required package: DBI
Platform design info loaded.
Reading in : ./GSM2028011_Ctrl1.CEL.gz
Reading in : ./GSM2028012_Ctrl2.CEL.gz
Reading in : ./GSM2028013_Ctrl3.CEL.gz
Reading in : ./GSM2028014_LPS1.CEL.gz
Reading in : ./GSM2028015_LPS2.CEL.gz
Reading in : ./GSM2028016_LPS3.CEL.gz
>
> x.norm.plm <- fitProbeLevelModel(affy.data)
Background correcting... OK
Normalizing... OK
Summarizing... OK
Extracting...
  Estimates... OK
  StdErrors... OK
  Weights..... OK
  Residuals... OK
  Scale....... OK
>
>
> probesets <- head(rownames(coef(x.norm.plm)))
> probesets
[1] "17200001" "17200003" "17200005" "17200007" "17200009" "17200011"
>

oligo pd.mogene.2.0.st • 1.3k views

Updated 8.6 years ago by James W. MacDonald 68k • written 8.6 years ago by Guido Hooiveld ★ 4.1k

score 1 · Accepted Answer · 2016-09-15

Well, if you wanna hang with the cool kids it will take some SQL. Or you can just use a function that does the SQL for you...

And btw, your count is off: each of these two probesets has just four probes.

> library(pd.mogene.2.0.st)
> z <- oligo:::stArrayPmInfo(pd.mogene.2.0.st)
> z[z$fsetid == 17200001,]
      fid   fsetid
1 2255251 17200001
2 2274726 17200001
3 2330659 17200001
4  642742 17200001
> z[z$fsetid == 17200003,]
      fid   fsetid
5 2146495 17200003
6 1253268 17200003
7 1561595 17200003
8  882972 17200003

> table(table(z$fsetid))

   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
 725  551  796 4618  105   81  116 1330 1195 1001  868  736  758  735  791  836
  17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32
1067 1303 1640 2130 2319 2508 2499 2286 2787 1421 1176  938  670  614  446  390
  33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48
 353  246  233  200  172  132  100   68   62   39   44   29   23   17   16   13
  49   50   51   52   53   54   55   56   57   58   59   60   61   62   63   64
  15    7    7    7    1    9    6    9    3    5    6    4    5    3    4    6
  65   66   67   68   69   70   71   73   74   75   76   77   78   79   80   82
   1    2    1    2    1    1    1    4    1    1    2    2    3    2    1    3
  83   84   86   87   88   92   93   96   98  103  140  190  268  273  322  407
   1    3    1    1    1    2    1    1    1    1    1    1    1    1    1    1
 585  697  703  813  849  873  912  914  940  942  949  952  959  960  963  968
   1    1    1    1    1    1    1    1    1    1    1    1    1    2    1    1
 973
   1
>
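
To get from that table to a count for each ID in a vector of probeset IDs, something along these lines should work. This is only a sketch: the `z` below is a toy stand-in built from the eight rows printed above, so that it runs without the annotation package (with the real data you would use the `z` returned by `oligo:::stArrayPmInfo()`). The ID "99999999" and the names `probes.per.set` and `n.probes` are made up for illustration.

```r
# Toy stand-in for z; the rows are copied from the output shown above.
z <- data.frame(
  fid    = c(2255251L, 2274726L, 2330659L, 642742L,
             2146495L, 1253268L, 1561595L, 882972L),
  fsetid = c(17200001L, 17200001L, 17200001L, 17200001L,
             17200003L, 17200003L, 17200003L, 17200003L)
)

# Probeset IDs of interest, as character strings (e.g. rownames of coef()).
# "99999999" is a hypothetical ID that is absent from z.
probesets <- c("17200001", "17200003", "99999999")

# Number of probes per probeset; the table names are the fsetids as strings.
probes.per.set <- table(z$fsetid)

# Look the IDs up by name; match() returns NA for IDs not present in z,
# so those end up with NA instead of causing an error.
n.probes <- as.vector(probes.per.set)[match(probesets, names(probes.per.set))]
names(n.probes) <- probesets
n.probes
```

Going through `match()` keeps the result in the same order as the input vector, and an `NA` flags any ID that has no entry in `z`.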