Question

Unable to read data using readBismark in BiSeq

Neel Aluru ▴ 460

@neel-aluru-3760

United States

I apologize for asking a very naive question. I am using BiSeq for the first time and am having a hard time reading in my data: the readBismark command seems to stall. Any advice would be highly appreciated.

Here is what I am trying to do.

> library(BiSeq)
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: Formula
> setwd("~/Documents/OneDrive/CpG_RRBS_Brain")
> file=system.file("CpG_zr1121_9.bismark.cov", package="BiSeq")

> readBismark(file, colData = DataFrame(row.names="sample1"))
Processing sample sample1 ...
1:

#### It is taking a long time, and I am not sure whether it is even working. The same thing happens when I give it 8 samples at once. Here is the code for that.

> file=system.file("CpG_zr1121_9.bismark.cov", "CpG_zr1121_10.bismark.cov", "CpG_zr1121_11.bismark.cov", "CpG_zr1121_12.bismark.cov", "CpG_zr1121_13.bismark.cov", "CpG_zr1121_14.bismark.cov", "CpG_zr1121_15.bismark.cov", "CpG_zr1121_16.bismark.cov", package="BiSeq")

> readBismark(file, colData = DataFrame(row.names="DMSO1", "DMSO2", "DMSO3", "DMSO4", "PCB1", "PCB2", "PCB3", "PCB4"))

Processing sample DMSO1 ...
1:

#### Same thing here. It has been in this state for the past 3-4 hours. Each sample file is approximately 100 MB. Any help would be highly appreciated.

Session info:

> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel stats4 stats graphics grDevices
[6] utils datasets methods base

other attached packages:
[1] BiSeq_1.6.0          Formula_1.2-1
[3] GenomicRanges_1.18.4 GenomeInfoDb_1.2.5
[5] AnnotationHub_1.6.0 IRanges_2.0.1
[7] S4Vectors_0.4.0      BiocGenerics_0.12.1

Thank you,

Neel

BiSeq readBismark • 2.2k views

ADD COMMENT • link updated 9.5 years ago by James W. MacDonald 68k • written 9.5 years ago by Neel Aluru ▴ 460

I noticed that system.file() is not working.

> file=system.file("CpG_zr1121_9.bismark.cov", package="BiSeq")

> file

""

# It is returning an empty string! Is there something wrong with the command?

Thanks,

Neel

ADD REPLY • link 9.5 years ago Neel Aluru ▴ 460

score 2 · Answer 1 · 2015-11-04

James W. MacDonald 68k

@james-w-macdonald-5106

United States

You shouldn't follow vignettes blindly. For a vignette that needs real data to build, the usual approach is to ship the data either inside the package itself or in a separate data package. In the former case the vignette uses system.file(), which looks up files that were installed as part of the package. Your own files are not part of BiSeq, so system.file() cannot find them and returns an empty string.

But when you are doing 'real' work with your own data, start R in the directory that contains the data (or setwd() to it, as you did); then you don't need to tell R where the files are. R resolves relative file names against its working directory, so you would just do

readBismark("CpG_zr1121_9.bismark.cov")
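To make the difference concrete, here is a minimal sketch using only base R. The package name "stats" and the made-up file name are stand-ins chosen for illustration, not anything from BiSeq. As a side note, the `1:` prompt shown in the question looks like R waiting for console input, which is what scan() does when it is handed an empty file name; that readBismark reads its input through scan() is an assumption here.

```r
# system.file() only searches inside an installed package's directory.
# "stats" stands in for any installed package; the file name is made up.
not_found <- system.file("no_such_file.bismark.cov", package = "stats")
print(nzchar(not_found))   # is the returned path non-empty?

# A file that really ships with a package resolves to a full path.
found <- system.file("DESCRIPTION", package = "stats")
print(file.exists(found))

# Your own data are located relative to the working directory instead,
# so it is worth checking the file is visible before trying to read it.
cov_file <- "CpG_zr1121_9.bismark.cov"
if (!file.exists(cov_file)) {
  message("Not found in ", getwd(), ": ", cov_file)
}
```

If the message appears, the working directory is not the one holding the coverage file, and setwd() or a full path is needed before calling readBismark.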

Also, when you want to add a comment to your question, you shouldn't use the 'Add Answer' box, because you are not adding an answer. Instead, use the 'ADD COMMENT' button at the bottom of the post you want to comment on.

ADD COMMENT • link 9.5 years ago James W. MacDonald 68k

Hi, I know this thread is almost 5 years old, but I'm a bit stuck, so I'm hoping I'll get a reply.

I am in the same directory as all my files, and I am trying to use readBismark. I want to read in all my samples at once, but whenever I try this, I get an error. This is what I have entered:

setwd("~/Desktop/control_nucleated RBC_data")
rrbs <- readBismark("RRBS_C1_S19_bismark_bt2.bismark.cov", "RRBS_C2_S20_bismark_bt2.bismark.cov", colData = DataFrame(row.names = c("control_1", "control_2"), group = "control"))

The error I get is:

Error in readBismark("RRBS_C1_S19_bismark_bt2.bismark.cov", "RRBS_C2_S20_bismark_bt2.bismark.cov",  : 
  unused argument ("RRBS_C2_S20_bismark_bt2.bismark.cov")

Do you know what I'm doing wrong?

ADD REPLY • link 4.9 years ago akhaira • 0

In your code the file names are passed to readBismark as separate arguments, so the second one cannot be matched to any parameter. For it to work you need to put the file names into a single character vector, like this:

rrbs <- readBismark(files = c("RRBS_C1_S19_bismark_bt2.bismark.cov",
                              "RRBS_C2_S20_bismark_bt2.bismark.cov"),
                    colData = DataFrame(row.names = c("control_1", "control_2"),
                                        group = c("control", "control")))
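The "unused argument" error can be reproduced without BiSeq. The sketch below uses a hypothetical stand-in function, read_files(), that only borrows the two parameter names from the call above; it is not BiSeq's implementation. The file names "a.cov" and "b.cov" are made up, and the final length check reflects the assumption that the sample annotation needs one row per file.

```r
# Hypothetical stand-in with the same two parameter names as the call above.
# It is NOT BiSeq's code; it only reports how many file names it received.
read_files <- function(files, colData) {
  length(files)
}

# Separate strings: "b.cov" has no parameter left to bind to, so R stops
# with an "unused argument" error during argument matching.
err <- tryCatch(
  read_files("a.cov", "b.cov", colData = NULL),
  error = function(e) conditionMessage(e)
)
print(err)

# One character vector: both names arrive through `files`.
n_files <- read_files(files = c("a.cov", "b.cov"), colData = NULL)
print(n_files)

# Assumption: one annotation row per file, so check the lengths agree
# before building colData.
files <- c("RRBS_C1_S19_bismark_bt2.bismark.cov",
           "RRBS_C2_S20_bismark_bt2.bismark.cov")
sample_names <- c("control_1", "control_2")
stopifnot(length(files) == length(sample_names))
```

The same pattern scales to more samples: keep the file names and the sample names in two vectors of equal length and pass the first one as a whole.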

I hope this solves the problem. And sorry for the late reply!

ADD REPLY • link 4.8 years ago Katja Hebestreit ▴ 130