Javier Pérez Florido
Dear list,
A possible data analysis workflow for EXON arrays could be as follows
(extracted from "Exon Array data analysis using Affymetrix Power Tools
and R statistical software", Briefings in Bioinformatics):
  * Normalization and summarization (at exon or gene level) of the
    array set.
  * Quality control of the summarized exon array data (to remove
    possible outliers).
  * Specific filtering steps, for example:
      o Restrict the analysis to core probesets.
      o Filter out undetected probesets (i.e., undetected exons),
        making use of DABG (Detected Above Background) analysis.
      o Filter out cross-hybridizing probesets (exons).
      o Filter out genes that are undetected in all groups.
I'm running a gene-level data analysis on Human GENE ST 1.0 (not EXON)
arrays, which are, in principle, designed for gene expression
profiling, that is, a gene-level analysis. My question is about the
filtering step. Once normalization and summarization have been run at
the transcript level (core), giving 33297 transcripts, can the
following filtering be applied before the differential expression
analysis?
  * Remove control transcripts such as other_spike, AFFX, pos_control
    (normgene->exon) and neg_control (normgene->intron). This step
    removes around 4156 transcripts.
  * Remove transcripts with very low variability using the varFilter
    function (genefilter package).
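
To make the two steps concrete, here is a minimal, language-neutral sketch of the logic in plain Python. It is not the genefilter API: the function names, the category labels and the toy data are all hypothetical, and the choice of the interquartile range with a 0.5 quantile cutoff is an assumption meant to resemble what I understand varFilter does by default.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range of one transcript's values across arrays."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def filter_transcripts(expr, category, control_categories, var_cutoff=0.5):
    """Drop control transcripts, then the least variable fraction.

    expr: dict mapping transcript id -> list of log2 expression values
    category: dict mapping transcript id -> annotation category label
    control_categories: set of labels treated as controls (assumed names)
    var_cutoff: fraction of the remaining transcripts to discard
    """
    # Step 1: remove transcripts annotated as controls.
    kept = {tid: vals for tid, vals in expr.items()
            if category.get(tid) not in control_categories}

    # Step 2: rank the remaining transcripts by IQR and discard the
    # lowest var_cutoff fraction.
    ranked = sorted(kept, key=lambda tid: iqr(kept[tid]))
    n_drop = int(len(ranked) * var_cutoff)
    return {tid: kept[tid] for tid in ranked[n_drop:]}


# Toy data: four "main" transcripts and one control (all hypothetical).
expr = {
    "t1": [5.0, 5.1, 5.0, 5.1],
    "t2": [4.0, 8.0, 4.5, 7.5],
    "t3": [6.0, 6.0, 6.1, 6.0],
    "t4": [3.0, 9.0, 3.5, 8.5],
    "ctrl1": [10.0, 2.0, 10.0, 2.0],
}
category = {"t1": "main", "t2": "main", "t3": "main", "t4": "main",
            "ctrl1": "neg_control"}
controls = {"other_spike", "AFFX", "pos_control", "neg_control"}

result = filter_transcripts(expr, category, controls, var_cutoff=0.5)
print(sorted(result))
```

Note that the control transcript is removed before the variability ranking, so its large spread does not shift the cutoff applied to the remaining transcripts.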
These are the steps recommended in the "Bioconductor Case Studies"
book for 3' IVT arrays (where the controls are different). Can these
two filtering steps also be used on Human Gene arrays for a gene-level
analysis, or do I instead have to run the filtering steps described
above for EXON arrays?
Thanks,
Javier
P.S. If you know of any data analysis workflow document for HuGene
arrays, please let me know.