Combining pfilter (wateRmelon) and preprocessNoob (minfi)
@sarablocquiaux-21717
Last seen 5.0 years ago

Dear all,

I want to combine the packages minfi and wateRmelon to analyse my EPIC methylation data. I loaded my data into R as an RGChannelSetExtended with minfi. Next, I wanted to filter out low-quality samples and probes with the pfilter function of wateRmelon. The problem is that pfilter returns a MethylSet, which I cannot pass to the preprocessNoob function of minfi (it requires an RGChannelSet). I noticed that in older versions of wateRmelon it was possible to get an RGChannelSetExtended back.

Is there anyone who can help me with this problem?

Best,

Sara

wateRmelon minfi methylation pfilter preprocessNoob • 1.8k views

Is the goal to filter on detection p-values only, or do you want to filter on bead count as well?

tgorri ▴ 10
@tgorri-10034
Last seen 5.3 years ago

Hi Sara,

Running pfilter on RGChannelSets has always been a problem because of how the detection p-values are calculated for the probes. The previous method did return an RGChannelSet, which then required manual filtering of the data after normalization, but this was changed some time ago.

If you do want to use pfilter, the only thing I can recommend is to run pfilter on your data and store the result in an object, then normalize the original (unfiltered) data object with preprocessNoob, and finally subset the normalized data by the row names and column names of the pfilter object.

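# Filter low-quality probes and samples (returns a MethylSet)
filt <- pfilter(data)
# Normalize the original, unfiltered RGChannelSet
norm <- preprocessNoob(data)
# Keep only the probes and samples that passed pfilter
norm_filt <- norm[rownames(filt), colnames(filt)]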

I know this is not ideal, as we do recommend applying pfilter prior to normalization. We will look into this and try to come up with something for the upcoming Bioconductor version.
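
As a minimal sketch of that last subsetting step (subset_to_filtered is a hypothetical helper name, not part of wateRmelon or minfi), the indexing can be restricted to probe and sample names present in both objects, so a name mismatch does not raise a subscript error. It relies only on rownames() and colnames(), so it can be tried on plain matrices before applying it to the real objects:

```r
# Hypothetical helper: keep only the probes (rows) and samples (columns)
# of `norm` that are also present in the pfilter result `filt`.
subset_to_filtered <- function(norm, filt) {
  keep_probes  <- intersect(rownames(filt), rownames(norm))
  keep_samples <- intersect(colnames(filt), colnames(norm))
  norm[keep_probes, keep_samples, drop = FALSE]
}
```

With the objects from the snippet above, this would be called as norm_filt <- subset_to_filtered(norm, filt).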


In minfi we have the function subsetByLoci(), which might be useful in these situations. The issue is that an RGChannelSet is indexed by addresses (roughly, the locations of the probes on the array), while a MethylSet (and friends) is indexed by CpG names, and because of the array design we sometimes have multiple addresses mapping to one CpG.

The function is documented as "Subset an RGChannelSet by CpG loci." The usage should be pretty clear from the man page:

   loci <- c("cg00050873", "cg00212031", "cg00213748", "cg00214611")
   subsetByLoci(RGsetEx.sub, includeLoci = loci)
   subsetByLoci(RGsetEx.sub, excludeLoci = loci)
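
Combined with pfilter, one possible workflow is to take the surviving probe and sample names from the pfilter result, subset the RGChannelSet with them, and only then run preprocessNoob. The following is a sketch under assumptions rather than a tested recipe: filter_then_noob is a hypothetical helper name, and rgSet is assumed to be the RGChannelSetExtended from the question.

```r
# Hypothetical helper: filter an RGChannelSetExtended with pfilter,
# then normalize only the surviving probes and samples with Noob.
filter_then_noob <- function(rgSet) {
  # pfilter() returns a MethylSet indexed by CpG names (as described in the question)
  filt <- wateRmelon::pfilter(rgSet)
  keep_loci    <- rownames(filt)
  keep_samples <- colnames(filt)
  # Drop failed samples by column, then keep only the surviving CpG loci;
  # subsetByLoci() translates CpG names back to probe addresses
  rg_filt <- minfi::subsetByLoci(rgSet[, keep_samples], includeLoci = keep_loci)
  minfi::preprocessNoob(rg_filt)
}
```

By default subsetByLoci() keeps the control and SNP probes. Whether removing probes before Noob changes its background estimate is not something this sketch addresses, so it may be worth comparing the result against the normalize-then-subset approach on your own data.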

Best, Kasper
