Question

MEDIPS: Output appears to be truncated

0

tptacek3050 • 0

@tptacek3050-7477

Last seen 9.9 years ago

United States

I am attempting to run the latest version of MEDIPS. I have been following the example in this tutorial to set up my R script: http://www.bioconductor.org/packages/2.12/bioc/vignettes/MEDIPS/inst/doc/MEDIPS.pdf

Everything runs without errors; however, the output doesn't look right to me. For example, here are the first few lines of the MEDIPS.meth output:

         chr     start      stop CF X2668.SMC.0001.sort.bam.counts
1       chr1         1       100  0                              4
2       chr1       101       200  0                              6
3       chr1       201       300  2                              7
4       chr1       301       400  2                              6
5       chr1       401       500  0                              6
6       chr1       501       600  2                              4
7       chr1       601       700  0                              2
8       chr1       701       800  0                              2
9       chr1       801       900  0                              0

I think there are supposed to be more columns (one for each sample). Set 1 had 6 samples, and set 2 had 11 samples. Additionally, the first sample's file name does not have an "X" at the beginning, although the column name does (the rest of the name is correct). Below is the code that I used to generate this data. Note that I am calling this script via Rscript from a Linux bash script. All file paths and PATH issues in .bashrc have been addressed.

#Get Arguments
args <- commandArgs(TRUE)
filepath1 <- args[1]
filepath2 <- args[2]
chrm <- args[3]
input1 <- args[4]
input2 <- args[5]

#Load MEDIPS libraries and set environment variables
library(MEDIPS)
library(BSgenome.Rnorvegicus.UCSC.rn5)
BSgenome="BSgenome.Rnorvegicus.UCSC.rn5"
uniq=TRUE
extend=300
shift=0
ws=100
chr.select <- chrm

#Get vectors with lists of files and file names
files1=list.files(path = filepath1, pattern = "\\.bam$", all.files = FALSE, full.names = TRUE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, no.. =FALSE)
names1=list.files(path = filepath1, pattern = "\\.bam$", all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)
files2=list.files(path = filepath2, pattern = "\\.bam$", all.files = FALSE, full.names = TRUE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, no.. =FALSE)
names2=list.files(path = filepath2, pattern = "\\.bam$", all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)

#Get length of file list vectors
len1 <- length(files1)
len2 <- length(files2)

#Set max lines for printing
options(max.print=1000000000)

#Loop through first set of files to create data set
x <- 2:len1
set1 = MEDIPS.createSet(file = files1[1], BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)
#print(files1[1])
for (i in seq(along=x)) {
        y <- i + 1
        set1 = c(set1, MEDIPS.createSet(file = files1[y], BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select))
        #print(files1[y])
}

set1

#Loop through second set of files to create data set
x <- 2:len2
#print(files2[1])
set2 = MEDIPS.createSet(file = files2[1], BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)
for (i in seq(along=x)) {
        y <- i + 1
        set2 = c(set2, MEDIPS.createSet(file = files2[y], BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select))
        #print(files2[y])
}

set2

#Generate input file sets
input_set1 = MEDIPS.createSet(file = input1, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)
input_set2 = MEDIPS.createSet(file = input2, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)

#Generate coupling set
CS = MEDIPS.couplingVector(pattern = "CG", refObj = set1[[1]])

#Calculate differential coverage
mr.edgeR = MEDIPS.meth(MSet1 = set1, MSet2 = set2, CSet = CS, ISet1 = input_set1, ISet2 = input_set2, p.adj = "bonferroni", diff.method = "edgeR", prob.method = "poisson", MeDIP = T, CNV = F, type = "rpkm", minRowSum = 1)
sink(paste0(chr.select,"_mr.edgeR.txt"), append=FALSE, split=FALSE)
mr.edgeR
sink()
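
A side note on the two loops in this script: 2:len1 counts down (2, 1) when a directory holds a single BAM file, so the loop would then index past the end of the file vector. Below is a minimal sketch of a length-safe alternative using lapply(). The function create_set is a hypothetical stand-in for MEDIPS.createSet so that the snippet runs without the package; it is an illustration, not the package's behaviour.

```r
# Hypothetical stand-in for MEDIPS.createSet(); it only records its input.
create_set <- function(file) list(regions_file = basename(file))

# A directory that happens to contain a single BAM file.
files <- c("raw_data_test_1/2668-SMC-0001.sort.bam")

# The 2:length(files) idiom counts down when only one file is present.
bad_index <- 2:length(files)

# lapply() visits each file exactly once, whatever the vector's length.
sets <- lapply(files, create_set)
print(length(sets))
```

With the real package, the function passed to lapply() would presumably wrap MEDIPS.createSet with the same arguments as in the script; whether the resulting list behaves identically to one built with c() has not been checked here.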

medips • 2.0k views

ADD COMMENT • link updated 10.1 years ago by Lukas Chavez ▴ 570 • written 10.1 years ago by tptacek3050 • 0

score 0 · Answer 1 · 2015-03-18

0

Lukas Chavez ▴ 570

@lukas-chavez-5781

Last seen 7.2 years ago

USA/La Jolla/UCSD

Dear tptacek3050,

Indeed, the output should contain many more columns, not just the counts for sample 'X2668.SMC.0001.sort.bam.counts'. My guess is that MEDIPS fails, for some reason, to process the file names of the BAM files. What is the content of files1 and files2, and what is the output when you type set1 and set2? Did MEDIPS create all individual MEDIPS SETs successfully, with non-redundant and unremarkable names?

All the best,
Lukas

(Reply to tptacek3050's question of 18 Mar 2015, 15:09.)

ADD COMMENT • link 10.1 years ago Lukas Chavez ▴ 570

0

Everything appeared to finish with no errors. Here are the contents of the various data structures:

files1

[1] "raw_data_test_1/2668-SMC-0001.sort.bam"
[2] "raw_data_test_1/2668-SMC-0002.sort.bam"
[3] "raw_data_test_1/2668-SMC-0003.sort.bam"
[4] "raw_data_test_1/2668-SMC-0004.sort.bam"
[5] "raw_data_test_1/2668-SMC-0005.sort.bam"
[6] "raw_data_test_1/2668-SMC-0006.sort.bam"

names1

[1] "2668-SMC-0001.sort.bam" "2668-SMC-0002.sort.bam" "2668-SMC-0003.sort.bam"
[4] "2668-SMC-0004.sort.bam" "2668-SMC-0005.sort.bam" "2668-SMC-0006.sort.bam"

files2

 [1] "raw_data_test_2/2833-SMC-0003.sort.bam"
 [2] "raw_data_test_2/2833-SMC-0004.sort.bam"
 [3] "raw_data_test_2/2833-SMC-0005.sort.bam"
 [4] "raw_data_test_2/2833-SMC-0006.sort.bam"
 [5] "raw_data_test_2/2833-SMC-0007.sort.bam"
 [6] "raw_data_test_2/2833-SMC-0008.sort.bam"
 [7] "raw_data_test_2/2833-SMC-0009.sort.bam"
 [8] "raw_data_test_2/2833-SMC-0010.sort.bam"
 [9] "raw_data_test_2/2833-SMC-0011.sort.bam"
[10] "raw_data_test_2/2833-SMC-0012.sort.bam"
[11] "raw_data_test_2/2833-SMC-0013.sort.bam"

names2

 [1] "2833-SMC-0003.sort.bam" "2833-SMC-0004.sort.bam" "2833-SMC-0005.sort.bam"
 [4] "2833-SMC-0006.sort.bam" "2833-SMC-0007.sort.bam" "2833-SMC-0008.sort.bam"
 [7] "2833-SMC-0009.sort.bam" "2833-SMC-0010.sort.bam" "2833-SMC-0011.sort.bam"
[10] "2833-SMC-0012.sort.bam" "2833-SMC-0013.sort.bam"

set1

[[1]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2668-SMC-0001.sort.bam
File path:  raw_data_test_1
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3236283
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[2]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2668-SMC-0002.sort.bam
File path:  raw_data_test_1
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3510565
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[3]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2668-SMC-0003.sort.bam
File path:  raw_data_test_1
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  4116556
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[4]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2668-SMC-0004.sort.bam
File path:  raw_data_test_1
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3067967
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[5]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2668-SMC-0005.sort.bam
File path:  raw_data_test_1
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2381139
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[6]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2668-SMC-0006.sort.bam
File path:  raw_data_test_1
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2774979
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

set2

[[1]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0003.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2948651
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[2]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0004.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3529363
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[3]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0005.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3567814
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[4]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0006.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2481498
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[5]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0007.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2433645
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[6]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0008.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3051548
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[7]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0009.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2341505
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[8]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0010.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2359429
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[9]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0011.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2039338
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[10]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0012.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2869252
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[11]]
S4 Object of class MEDIPSset
=======================================
Regions file:  2833-SMC-0013.sort.bam
File path:  raw_data_test_2
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2044900
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

ADD REPLY • link 10.1 years ago tptacek3050 • 0

0

Dear tptacek3050,

I agree that everything looks fine. Although I do not immediately see what the actual problem in MEDIPS.meth() could be, I still suspect that it is caused by the file names. Could you do me a favour and rename one or two BAM files by removing the 2668- prefix and replacing each - with an _ (that is, avoid file names that start with a number or contain minus signs)? I am sorry for just guessing at the moment, but could you please let me know whether MEDIPS.meth still creates a truncated table?

If it is not the file names, I will need to narrow down which part of the MEDIPS.meth function causes this problem. Could you then also run the dummy example with only one or two renamed samples, without the MeDIP functionality (MeDIP=F), and let me know the output of head(mr.edgeR)?

Thank you!
Lukas

(Reply to tptacek3050's comment of 18 Mar 2015, 18:05.)
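
The "X" prefix and the dots in the reported column name are consistent with how base R sanitises data-frame column names: make.names() prepends "X" to a name that starts with a digit and replaces characters such as "-" with ".". The sketch below assumes that the column names are the BAM file names plus a ".counts" suffix, as the output in this thread suggests; how MEDIPS builds them internally is not shown here.

```r
# Column names as they would look before sanitising (assumed from the thread).
raw_names <- c("2668-SMC-0001.sort.bam.counts",
               "test_SMC_0001.sort.bam.counts")

# make.names() turns them into syntactically valid R names.
clean_names <- make.names(raw_names)
print(clean_names)

# data.frame() applies the same rule by default (check.names = TRUE).
mangled <- names(data.frame("2668-SMC" = 1L))
kept    <- names(data.frame("2668-SMC" = 1L, check.names = FALSE))
print(c(mangled, kept))
```

A name that already starts with a letter and uses underscores passes through unchanged, so renaming the files is expected to change the column label only; on its own the prefix does not indicate that data are missing.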

ADD REPLY • link 10.1 years ago Lukas Chavez ▴ 570

0

I tried running again after renaming two of the files in each group to match your specifications (- replaced by _, no digits at the start of the file name), and got the same results.

Next I tried running against a smaller data set (3 files in each set) in which all of the files have been renamed in the same way. I'm still getting the same problem.

Here are the first 10 lines of the output file:

         chr     start      stop CF test_SMC_0001.sort.bam.counts
1       chr1         1       100  0                             4
2       chr1       101       200  0                             6
3       chr1       201       300  2                             7
4       chr1       301       400  2                             6
5       chr1       401       500  0                             6
6       chr1       501       600  2                             4
7       chr1       601       700  0                             2
8       chr1       701       800  0                             2
9       chr1       801       900  0                             0

And here are the data structures again:

files1

[1] "raw_data_test_3/test_SMC_0001.sort.bam"
[2] "raw_data_test_3/test_SMC_0002.sort.bam"
[3] "raw_data_test_3/test_SMC_0004.sort.bam"

names1

[1] "test_SMC_0001.sort.bam" "test_SMC_0002.sort.bam" "test_SMC_0004.sort.bam"

files2

[1] "raw_data_test_4/test2_SMC_0004.sort.bam"
[2] "raw_data_test_4/test2_SMC_0013.sort.bam"
[3] "raw_data_test_4/test_SMC_0003.sort.bam"

names2

[1] "test2_SMC_0004.sort.bam" "test2_SMC_0013.sort.bam"
[3] "test_SMC_0003.sort.bam"

set1

[[1]]
S4 Object of class MEDIPSset
=======================================
Regions file:  test_SMC_0001.sort.bam
File path:  raw_data_test_3
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3236283
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[2]]
S4 Object of class MEDIPSset
=======================================
Regions file:  test_SMC_0002.sort.bam
File path:  raw_data_test_3
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3510565
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[3]]
S4 Object of class MEDIPSset
=======================================
Regions file:  test_SMC_0004.sort.bam
File path:  raw_data_test_3
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3067967
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

set2

[[1]]
S4 Object of class MEDIPSset
=======================================
Regions file:  test2_SMC_0004.sort.bam
File path:  raw_data_test_4
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  3529363
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[2]]
S4 Object of class MEDIPSset
=======================================
Regions file:  test2_SMC_0013.sort.bam
File path:  raw_data_test_4
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2044900
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

[[3]]
S4 Object of class MEDIPSset
=======================================
Regions file:  test_SMC_0003.sort.bam
File path:  raw_data_test_4
Genome:  BSgenome.Rnorvegicus.UCSC.rn5
Number of regions:  2948651
Chromosomes: chr1
Chromosome lengths: 290094216
Genome wide window size:  100
Reads extended to:  300
Reads shifted by:  0
Parameter uniq:  TRUE

ADD REPLY • link 10.1 years ago tptacek3050 • 0

1

Dear tptacek3050,

Thank you for testing the changed file names. Unfortunately, that did not help, except that we got rid of the X at the beginning of the column names.

As I had no clue what might cause the truncated table, I have now run an example and written out the result table using sink(), as you do at the end of your script. The resulting text file also has only a few columns and only a subset of the rows. I therefore assume that your mr.edgeR object contains the complete data as usual, and that sink() is simply not suitable in this case. You could write out the table using write.table() instead. I also recommend not writing out the whole table, but selecting the significant regions and writing out the data table for those.

Please let me know if this solves your problem.

All the best,
Lukas

(Reply to tptacek3050's comment of 18 Mar 2015, 22:22.)
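
The difference between the two approaches can be reproduced without MEDIPS. sink() captures what print() would show on the console, and print() wraps a wide data frame into blocks of columns that fit the console width, so the top of the file shows only the leading columns. write.table() writes every column on one line per row. The sketch below uses a small invented data frame as a stand-in for the MEDIPS.meth result; the column names, including edgeR.adj.p.value, and the 0.05 cutoff are assumptions for illustration only.

```r
# Toy stand-in for a wide MEDIPS.meth() result; all values are invented.
res <- data.frame(
  chr   = rep("chr1", 3),
  start = c(1L, 101L, 201L),
  stop  = c(100L, 200L, 300L),
  sample_A.bam.counts = c(4L, 6L, 7L),
  sample_B.bam.counts = c(5L, 3L, 9L),
  edgeR.adj.p.value   = c(0.20, 0.01, 0.03)
)

# What sink() would record: the console rendering, wrapped to the width.
old <- options(width = 40)
printed <- capture.output(print(res))
options(old)

# write.table() keeps all columns together, one line per row.
out <- tempfile(fileext = ".txt")
write.table(res, file = out, sep = "\t", quote = FALSE, row.names = FALSE)
header <- readLines(out, n = 1)

# Writing only the significant rows keeps the file small.
sig <- res[!is.na(res$edgeR.adj.p.value) & res$edgeR.adj.p.value < 0.05, ]
sig_out <- tempfile(fileext = ".txt")
write.table(sig, file = sig_out, sep = "\t", quote = FALSE, row.names = FALSE)
```

In the script from the question, the equivalent change would be to replace the three sink() lines with a single write.table() call on mr.edgeR. For selecting significant regions, MEDIPS provides MEDIPS.selectSig; its arguments are described in the package vignette and are not reproduced here.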

ADD REPLY • link 10.1 years ago Lukas Chavez ▴ 570

0

write.table worked!

The output looks right now. Thanks for your help. I'll post a new topic if something else looks off.

ADD REPLY • link 10.1 years ago tptacek3050 • 0