This file: UCSD.H1.H2AK5ac.SAK201.bed.gz
looks like this:
chr1 9942 10141 SOLEXA2_1:1:101:4024:16163 -
chr1 9988 10187 SOLEXA2_1:1:10:12241:10803 -
chr1 9992 10191 SOLEXA2_1:1:93:18918:18953 -
chr1 9997 10196 SOLEXA2_1:1:30:11903:16499 -
It doesn't have a scores column. When I try to load it with
import.bed("UCSD.H1.H2AK5ac.SAK201.bed.gz")
I get:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got '-'
Is there a way to instruct import.bed to deal with missing scores? And would the same options work with files that have scores? The problem is that other files from the same source (e.g. UCSD.H1_BMP4_Derived_Mesendoderm_Cultured_Cells.H2AK5ac.AK126.bed.gz) do have scores, and I'd like to process them with the same instruction. I just need the Ranges info. I expected the format to be the same for every file on that site.
I'm quite new to R and to Bioconductor, so forgive my ignorance. (I did try reading the help documents and searching the web.)
João Rodrigues
Thanks Michael, that really works. It reads files without scores as well as files with scores! I don't really understand this function, but it seems to solve my problem.
It does, however, produce a warning when reading either of the files I mentioned:
It seems that this is caused by the files not having any "*" in the strand column, but the output seems fine to me. Probably a bug?
Yeah, it could be smarter. But I think this is already fixed in devel, which will be released soon.
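For the case in the question, where only the ranges are needed, one possible fallback is to skip import.bed and read the interval columns with base R. The sketch below is not the approach suggested in the thread and not part of rtracklayer: `read_bed_ranges` is a hypothetical helper name, and it assumes whitespace-separated files with no track or browser header lines, with chromosome, start and end in the first three columns. It finds the strand column by its content, so it should not matter whether a score column sits before it. The six-column demo row is made up for illustration.

```r
# Hypothetical helper (not an rtracklayer function): read chrom/start/end and,
# if present, strand from a BED-like file with or without a score column.
read_bed_ranges <- function(path) {
  # gzfile() reads gzip-compressed files and plain text files alike
  tab <- read.table(gzfile(path), header = FALSE, stringsAsFactors = FALSE,
                    comment.char = "", quote = "")
  stopifnot(ncol(tab) >= 3)
  out <- data.frame(chrom = tab[[1]], start = tab[[2]], end = tab[[3]],
                    strand = "*", stringsAsFactors = FALSE)
  if (ncol(tab) >= 4) {
    # Assumption: the strand column is the first column after 'end' whose
    # values are all valid strand symbols
    for (j in seq(4, ncol(tab))) {
      vals <- as.character(tab[[j]])
      if (all(vals %in% c("+", "-", "*", "."))) {
        out$strand <- ifelse(vals == ".", "*", vals)
        break
      }
    }
  }
  out  # start is still 0-based, as in the BED file
}

# Small gzip-compressed demo files written to a temporary location
write_gz <- function(lines) {
  path <- tempfile(fileext = ".bed.gz")
  con <- gzfile(path, "w")
  writeLines(lines, con)
  close(con)
  path
}

# Two rows copied from the question (no score column)
tmp5 <- write_gz(c("chr1 9942 10141 SOLEXA2_1:1:101:4024:16163 -",
                   "chr1 9988 10187 SOLEXA2_1:1:10:12241:10803 -"))
# One invented row with a score column before the strand
tmp6 <- write_gz("chr1 9942 10141 read1 0 -")

print(read_bed_ranges(tmp5))
print(read_bed_ranges(tmp6))
```

The result is a plain data frame with BED's 0-based starts. If a GRanges object is wanted, it can be passed to `GenomicRanges::makeGRangesFromDataFrame()` with `starts.in.df.are.0based = TRUE`.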