CNV parameter not working - MEDIPS
1
0
Entering edit mode
@anupriyaverma1408-13011
Last seen 6.5 years ago

I ran following sets of commands for getting differential methylation profiles & CNVs :

library(MEDIPS)
library(BSgenome.Mmusculus.UCSC.mm9)

BSgenome="BSgenome.Mmusculus.UCSC.mm9"

uniq=1e-3

extend=300

shift=0

ws=100


WT_Input = MEDIPS.createSet(file = "/Medip-seq_project/Alignment_BAMs_and_BEDs/3-WT-Input_S3.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select=NULL, paired = T)

WT_5mC = MEDIPS.createSet(file = "/Medip-seq_project/Alignment_BAMs_and_BEDs/1-WT-5mC_S1.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select=NULL, paired = T)


LshKO_Input = MEDIPS.createSet(file = "/Medip-seq_project/Alignment_BAMs_and_BEDs/4-LshKO-Input_S4.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select=NULL, paired = T)

LshKO_5mC = MEDIPS.createSet(file = "/Medip-seq_project/Alignment_BAMs_and_BEDs/2-LshKO-5mC_S2.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select=NULL, paired = T)  #chr.select=NULL doesn't affect further steps.

CS1 = MEDIPS.couplingVector(pattern = "CG", refObj = WT_Input)

mr.edgeR = MEDIPS.meth(MSet1 = WT_5mC, MSet2 = LshKO_5mC, CSet = CS1, ISet1 = WT_Input , ISet2 = LshKO_Input, p.adj = "bonferroni", diff.method = "edgeR", MeDIP = T, CNV = T, minRowSum = 10)

I got this error :

Creating results table...
Adding differential coverage results...
Adding CNV results...
Error in cnv.combined[i, 1]:cnv.combined[i, 2] : NA/NaN argument
In addition: Warning messages:
1: In MEDIPS.rms(MSet1[[i]], CSet, ccObj = ccObj) : NaNs produced
2: In MEDIPS.rms(MSet2[[i]], CSet, ccObj = ccObj) : NaNs produced

What does this error means? If I run function MEDIPS.meth with "CNV = F" jsut to get differntial methylation profiles then it's coming fine. But how to get CNVs?

Please help me out. Thank you :)

medips medip-seq error lukas • 1.9k views
ADD COMMENT
1
Entering edit mode
@matthias-lienhard-6292
Last seen 10 months ago
Max Planck Institute for molecular Gene…

Hi Anupriya Verma,

MEDIPS.createSet calculates CNVs based on the window size selected for the primary MeDIP seq analysis. In most cases, the input coverage is not sufficient for CNV detection at this resolution. Maybe, that this somehow leads to the error message you got. The suggested way would be to set CNV=F in this step, and then add CNV information using MEDIPS.addCNV. This function allows the user to specify a different window size. See also the notes on the CNV parameter at the MEDIPS.meth help page

> ?MEDIPS.meth

Furthermore, the error might be somehow related to the nonchoromosomal conigs in the mm9 reference. You could restrict the analysis on the chromosomes, e.g. by setting

> chr.select=paste0("chr",c(1:19,"X",Y"))

By the way, our new analysis package "QSEA" provides far more options to estimate and deal with CNV (among other benefits). In particular, it allows using the CNV within the normalization, and thereby considers CNV information in the detection of differentially methylated regions, rather than just providing this information in the output table. For MeDIP seq analysis it can completely replace the MEDIPS package -except from some QC metrics, that are not implemented in QSEA.

Best, Matthias

 

ADD COMMENT
0
Entering edit mode

Hi Matthias,
Thanks a lot. It worked the way you told me. Sub-setting the chromosome set using -> chr.select=paste0("chr",c(1:19,"X",Y")) was the thing that was needed. Now MEDIPS.meth is also working with 'CNV = T'. 

Can MEDIPS be used to get CNVs for one MSet say MSet1 ? :

mr.edgeR_LshKO = MEDIPS.meth(MSet1 = LshKO_5mC, CSet = CS1, ISet1 = LshKO_Input, p.adj = "bonferroni", diff.method = "edgeR", MeDIP = T, CNV = T, minRowSum = 10)


Running this above commands throws error : Cannot perform CNV analysis- please specify two groups of INPUT SETs!

Probably I'm not getting how CNVs' log2 ratios are calculated.

Regards,
 Anupriya 

ADD REPLY
1
Entering edit mode

Hi Anupriya,

within MEDIPS, the calculation of CNV works by comparing the local coverage of the input sample(s)  that contain CNVs (ISet2) to the control input sample(s) (ISet1). Therefore it is required to specify both sets.

However, probably the read coverage is not sufficient to run the CNV detection on 100 base windows. Typical values for the CNV analysis would be 100000 to 1000000 base windows. In MEDIPS, this would be done by initially specifying CNV=FALSE, and than use MEDIPS.addCNV().

I again want to encourage you to have a look at the QSEA package, which is has a much more integrated and convenient way of inferring, using, and depicting CNV information.

ADD REPLY
0
Entering edit mode

Hi Matthias,

I've one more doubt. After getting annotations with MEDIPS.getAnnotation function I'm getting some annotations that are having only Ensembl enzyme/protein/transcript ID (no geneIDs)  :

chr    start    stop    chr.1    start.1    stop.1    CF    X1.WT.5mC_S1.bam.counts    X2.LshKO.5mC_S2.bam.counts    X1.WT.5mC_S1.bam.rpkm    X2.LshKO.5mC_S2.bam.rpkm    X1.WT.5mC_S1.bam.rms    X2.LshKO.5mC_S2.bam.rms    X3.WT.Input_S3.bam.counts    X4.LshKO.Input_S4.bam.counts    X3.WT.Input_S3.bam.rpkm    X4.LshKO.Input_S4.bam.rpkm    MSets1.counts.mean    MSets1.rpkm.mean    MSets1.rms.mean    MSets2.counts.mean    MSets2.rpkm.mean    MSets2.rms.mean    edgeR.logFC    edgeR.logCPM    edgeR.p.value    edgeR.adj.p.value    CNV.log2.ratio    DMR_ID    1_id    1_id    2_id    1_id    2_id    3_id    4_id    5_id    6_id
chrX    130684601    130684700    chrX    130684601    130684700    1    146    194    177.27    317.53    0.55    0.62    38    32    5.85    9.77    146    177.27    0.55    194    317.53    0.62    -0.7990199711    5.3553704235    7.23115947184283E-07    0.0214613582    -0.0736    ID_90    NA    NA    NA    ENSMUSE00000546399    NA    NA    NA    NA    NA
chrX    130684701    130684800    chrX    130684701    130684800    2    147    196    178.48    320.8    0.55    0.61    39    31    6    9.47    147    178.48    0.55    196    320.8    0.61    -0.8039704999    5.3682293686    4.81168180090305E-07    0.0142805904    -0.0736    ID_90    NA    NA    NA    ENSMUSE00000546399    NA    NA    NA    NA    NA

chr9    3033801    3033900    chr9    3033801    3033900    2    3094    1929    3756.66    3157.3    0.85    0.83    9978    5311    1535.81    1622.26    3094    3756.66    0.85    1929    3157.3    0.83    0.2920778945    9.1580107642    3.16711714802772E-10    9.39968698363148E-06    -0.0316    ID_49    NA    TSS_ENSMUST00000170073    NA    NA    NA    NA    NA    NA    NA
 

Commands used:
anno.mart.gene = MEDIPS.getAnnotation(host="may2012.archive.ensembl.org",dataset=c("mmusculus_gene_ensembl")[1],annotation=c("TSS","EXON","GENE"),tssSz=c(-1000,500),chr=chr.select1) #for mm9 annotations.

roi_new = MEDIPS.setAnnotation(regions = rois, annotation = anno.mart.gene)  

How is it possible not to get geneIDs associated with transcripts/protein IDs? 

As mentioned by you, I'll be trying QSEA package after I'm done with MEDIPS. 

Thank you,
Anupriya

ADD REPLY
1
Entering edit mode

Unfortunately, MEDIPS.getAnnotation does not fetch gene names, and the function cannot be configured, such that it does fetch them. You could either modify the function, or match the ids and change the resulting output tables manually.

ADD REPLY

Login before adding your answer.

Traffic: 546 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6