Entering edit mode
Eric,
You need to put the as.character() in space, what happened is that
without as.character, the chromosome is treated as factor. Please use
the exact code I sent to you. It should work. Regarding strand,
reverseComplement in Biostrings package should help. But I will try to
add the strand to the function before the new release.
Best regards,
Julie
On 3/25/10 10:54 AM, "Eric Bremer" <ericgbremer@gmail.com> wrote:
Hi Julie,
Thanks so much for the help!! Below is the result I get when using
chromosome names. I still get a numeric value that does not
correspond to the chromosome from the original table (2R). Another
question is how do I take into account "strand" when retreiving the
sequences?
Thanks again,
Eric
> peaks = RangedData(IRanges(start=c(SeqTest$start),
end=c(SeqTest$end), names=c(SeqTest$peakID)), space=c(SeqTest$Chrom))
> head(peaks)
RangedData with 6 rows and 0 value columns across 1 space
space ranges |
<character> <iranges> |
6010 1 [16831502, 16831539] |
7922 1 [20702347, 20702374] |
2598 1 [ 8146910, 8146941] |
7048 1 [18952292, 18952318] |
5159 1 [14120268, 14120294] |
7367 1 [19830055, 19830090] |
On Wed, Mar 24, 2010 at 11:56 AM, Zhu, Julie <julie.zhu@umassmed.edu>
wrote:
Hi Eric,
Please try the following code. You had the gene names in space which
needs to contain chromosome names.
peaks = RangedData(IRanges(start=SeqTest$start, end=SeqTest$end,
names=SeqTest$peakID), space=as.character(SeqTest$Chrom))
library(BSgenome.Dmelanogaster.UCSC.dm2)
peaksWithSequences = getAllPeakSequence(peaks, upstream=60,
downstream=40, genome=Dmelanogaster)
Best regards,
Julie
On 3/21/10 8:20 PM, "Julie Zhu" <julie.zhu@umassmed.edu <http:="" julie.zhu@umassmed.edu=""> > wrote:
Hi Eric,
Is CAGE the cap analysis of gene expression? Thanks!
I think the error has to do with the chromosome naming since there are
only chromosome X, 2L, 2R, 3L, 3R and 4 are available in drosophila
genome.
I am revising the ChIPpeakAnno paper which is due on Tuesday. Could I
get back to you after that? Thank so much!
Meanwhile, if others could help out, that would be very much
appreciated!
Best regards,
Julie
On 3/21/10 12:52 PM, "Eric Bremer" <ericgbremer@gmail.com <http:="" ericgbremer@gmail.com=""> > wrote:
HI Julie,
I am trying to get sequence data from the CAGE peaks identified to be
within 50 bases of a TSS. With your previous help I was able to
nicely idenitify these using ChIPeakAnno. I seem to be having trouble
getting the gene identifier into ranged data following the example in
the Vignette. I have pasted my approach below. Thanks in advance for
any suggestions.
> peaks = RangedData(IRanges(start = c(100, 500), end = c(300,
+ 600), names = c("peak1", "peak2")), space = c("NC_008253",
+ "NC_010468"))
> peaks
RangedData with 2 rows and 0 value columns across 2 spaces
space ranges |
<character> <iranges> |
peak1 NC_008253 [100, 300] |
peak2 NC_010468 [500, 600] |
> SeqTest = read.table(file="2Rp_50base.txt", header=TRUE)
> head(SeqTest)
Chrom start end length peakID strand FlyBaseGene
1 2R 16831502 16831539 38 6010 1 FBgn0000044
2 2R 20702347 20702374 28 7922 1 FBgn0000157
3 2R 8146910 8146941 32 2598 1 FBgn0000253
4 2R 18952292 18952318 27 7048 1 FBgn0000487
5 2R 14120268 14120294 27 5159 1 FBgn0000658
6 2R 19830055 19830090 36 7367 1 FBgn0001123
> peaks = RangedData(IRanges(start=c(SeqTest$start),
end=c(SeqTest$end), names=c(SeqTest$peakID)),
space=c(SeqTest$FlyBaseGene))
> head(peaks)
RangedData with 6 rows and 0 value columns across 247 spaces
space ranges |
<character> <iranges> |
6010 1 [16831502, 16831539] |
7922 2 [20702347, 20702374] |
2598 3 [ 8146910, 8146941] |
7048 4 [18952292, 18952318] |
5159 5 [14120268, 14120294] |
7367 6 [19830055, 19830090] |
> peaks$space[1:6]
[1] "1" "2" "3" "4" "5" "6"
> library(BSgenome.Dmelanogaster.UCSC.dm2)
> peaksWithSequences = getAllPeakSequence(peaks, upstream=60,
downstream=40, genome=Dmelanogaster)
Error in .getOneSeq(x, name) : sequence chr1 not found
Error in subseq(.getOneSeq(x, name), start = start, end = end, width =
width) :
error in evaluating the argument 'x' in selecting a method for
function 'subseq'
[[alternative HTML version deleted]]
_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org <http: bioc-sig-="" sequencing@r-project.org="">
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
[[alternative HTML version deleted]]