Seungwoo Hwang
Dear all,
I am analyzing data from the Affymetrix Human Gene 1.0 ST Array.
While inspecting its probe annotation file, I noticed that it contains
many probesets without transcript annotation, as follows:
Total number of probesets: 33,298
(1) Probesets with annotation: 24,409 (73%)
(2) Control probesets: 4,201 (13%)
(3) Probesets without any annotation: 4,688 (14%)
I am thinking about filtering out the probesets in (2) and (3) before
running the statistical tests, in order to reduce the number of
probesets being tested. Doing so would make a substantial difference
to the multiple testing correction, compared with testing all
probesets in (1), (2), and (3) and then removing the probesets in (2)
and (3) from the resulting DEG list.
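To make the difference concrete, here is a minimal Python sketch of the two orderings. It assumes the Benjamini-Hochberg correction (the correction method is my assumption for illustration only), and the p-values are made-up toy numbers, not results from my data. The names `bh_adjust`, `annotated`, and `other` are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (invented for illustration).
annotated = [0.001, 0.004, 0.03, 0.2]        # stands in for group (1)
other = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95]      # stands in for groups (2) and (3)

alpha = 0.05

# Option A: filter first, then correct over the annotated probesets only.
adj_filtered = bh_adjust(annotated)
hits_filter_first = sum(p < alpha for p in adj_filtered)

# Option B: correct over everything, then keep only the annotated probesets.
adj_all = bh_adjust(annotated + other)
hits_filter_last = sum(p < alpha for p in adj_all[:len(annotated)])

print(hits_filter_first, hits_filter_last)
```

With these toy numbers, filtering first leaves more annotated probesets below the 0.05 threshold than filtering afterwards, because the correction is applied over 4 tests instead of 10. The sketch only shows the size of the effect; it does not say whether the filtering is statistically valid.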
Is this type of filtering prior to statistical testing valid? Also, has
anyone encountered a similar situation (dealing with array data that
contain many non-gene probesets)?
Thanks,
Seungwoo
------------------------------------
Seungwoo Hwang, Ph.D.
Senior Research Scientist
Korean Bioinformation Center (http://www.kobic.re.kr)