Question

Limma filtering based on treatment group

jamie.gearing ▴ 60

@jamiegearing-12556

Last seen 8 months ago

Australia

Hello,
I have a question about filtering lowly expressed probes (or genes) when using limma.
For the example in section 17.4 of the limma user's guide, probes are called expressed if they exceed a cut-off in at least a given number of samples (here 4, the size of the smallest treatment group):

isexpr <- rowSums(y$E > cutoff) >= 4

I was wondering whether it would make sense to use the treatment group information instead, and keep probes that are expressed in every sample of at least one treatment group (or perhaps in a given proportion of that group's samples). Something like this perhaps:

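proportion <- 1.0
# Keep a probe if, in at least one treatment group, the number of samples
# above the cutoff reaches `proportion` of that group's size
isexpr2 <- apply(y$E > cutoff, 1, function(z) {
  any(sapply(levels(Treatment), function(treat) {
    sum(z[Treatment == treat]) >= sum(Treatment == treat) * proportion
  }))
})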

For this example, the normal approach yields 32754 expressed probes, whereas this yields 30840 probes.
I have seen other answers on this subject (e.g. https://support.bioconductor.org/p/52762/) warning against filtering based on variance because it would affect the limma algorithms, but I am not sure whether this is quite the same thing.
Would this be technically valid? Even if it is, it may just be unnecessarily complicated.
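For anyone who wants to try the two filters without the user's guide data, here is a minimal sketch on simulated values. The matrix `E`, the group labels and the cutoff are all made up for illustration and are not the data from section 17.4. With `proportion <- 1.0` and two groups of four, any probe that passes the group-aware filter must also pass the group-blind one, so the second set can only be a subset of the first:

```r
set.seed(1)

# Hypothetical toy data: 1000 probes x 8 samples, two treatment groups of 4
Treatment <- factor(rep(c("A", "B"), each = 4))
E <- matrix(rnorm(1000 * 8, mean = 7, sd = 1.5), nrow = 1000,
            dimnames = list(paste0("probe", 1:1000), paste0("s", 1:8)))
cutoff <- 7  # arbitrary cutoff, chosen only for this illustration

# Group-blind filter: above the cutoff in at least 4 samples, in any group
isexpr <- rowSums(E > cutoff) >= 4

# Group-aware filter: above the cutoff in every sample of at least one group
proportion <- 1.0
isexpr2 <- apply(E > cutoff, 1, function(z) {
  any(sapply(levels(Treatment), function(treat) {
    sum(z[Treatment == treat]) >= sum(Treatment == treat) * proportion
  }))
})

# Cross-tabulate the two filters
table(group_blind = isexpr, group_aware = isexpr2)
```

The cross-table should have no probes in the cell where the group-aware filter passes and the group-blind filter fails.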

limma filtering • 1.3k views

updated 7.9 years ago by Steve Lianoglou ★ 13k • written 7.9 years ago by jamie.gearing ▴ 60

score 2 · Accepted Answer · 2017-05-17

Steve Lianoglou ★ 13k

@steve-lianoglou-2771

Last seen 17 days ago

United States

No. Don't do that. Your filtering criterion needs to be independent of the test statistic:

http://m.pnas.org/content/107/21/9546.long
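To make the independence point concrete, here is a rough sketch on simulated null data, where no probe truly differs between groups. Everything in it is an assumption for illustration: the values are random normal draws, the test is an ordinary pooled two-sample t-test rather than limma's moderated t, and the overall-mean filter is just one example of a criterion that ignores the group labels. The group-aware filter is written in vectorized form for `proportion = 1.0`.

```r
set.seed(1)

# Hypothetical null data: 20000 probes x 8 samples, no true group differences
n_probes <- 20000
Treatment <- factor(rep(c("A", "B"), each = 4))
E <- matrix(rnorm(n_probes * 8, mean = 7, sd = 1.5), nrow = n_probes)
cutoff <- 7  # arbitrary cutoff for this illustration
inA <- Treatment == "A"
inB <- Treatment == "B"
nA <- sum(inA)
nB <- sum(inB)

# Row-wise sample variance without looping over probes
row_var <- function(x) rowSums((x - rowMeans(x))^2) / (ncol(x) - 1)

# Ordinary pooled-variance two-sample t-test for every probe
pooled <- ((nA - 1) * row_var(E[, inA]) + (nB - 1) * row_var(E[, inB])) /
  (nA + nB - 2)
tstat <- (rowMeans(E[, inA]) - rowMeans(E[, inB])) /
  sqrt(pooled * (1 / nA + 1 / nB))
p <- 2 * pt(-abs(tstat), df = nA + nB - 2)

# Group-aware filter: every sample of at least one group exceeds the cutoff
keep_aware <- rowSums(E[, inA] > cutoff) == nA |
  rowSums(E[, inB] > cutoff) == nB

# Group-blind filter: overall mean across all samples exceeds the cutoff
keep_mean <- rowMeans(E) > cutoff

# Fraction of retained null probes with p < 0.05 under each filter
frac_aware <- mean(p[keep_aware] < 0.05)
frac_mean <- mean(p[keep_mean] < 0.05)
c(group_aware = frac_aware, overall_mean = frac_mean)
```

Under the null, a filter that is independent of the test statistic should leave roughly 5% of the retained probes below 0.05. The group-aware filter tends to keep probes where one group happens to sit high, so its fraction should come out noticeably larger.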

Very good. Thanks Steve.

7.9 years ago • jamie.gearing ▴ 60