Hi,
I am working on a ribosome profiling data set and wanted to give riboSeqR a shot. I prepared my alignment files as specified in the vignette, and readRibodata ran without problems.
I wanted to use a gff file for the annotations, converted into a GRanges object as below:
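library(GenomicRanges)  # provides makeGRangesFromDataFrame()
# Read the tab-separated gff as a plain data frame and name its nine columns.
gff <- read.table("C:/Users/Alper Celik/Documents/analysis files/new-set/clean.gff.txt", header=FALSE, sep="\t", as.is=TRUE)
colnames(gff) <- c("chrom", "source", "type", "start", "end", "score", "strand", "phase", "name")
# Convert to GRanges, keeping the remaining columns as metadata.
gff_gr <- makeGRangesFromDataFrame(gff, keep.extra.columns=TRUE, ignore.strand=FALSE, seqnames.field="chrom", start.field="start", end.field="end", strand.field="strand", starts.in.df.are.0based=FALSE)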
When I tried the frameCounting function I kept getting an error. I have also tried subsets of this gff, such as only the "gene" rows or only the "CDS" rows, but I always get the same error (see below).
Here is a snapshot of a subset of the gff as a GRanges object:
GRanges object with 6600 ranges and 3 metadata columns:
         seqnames             ranges strand |      source        type        name
            <Rle>          <IRanges>  <Rle> | <character> <character> <character>
     [1]    chrVI         [ 53, 535]      + |         SGD        gene     YFL068W
     [2]     chrV        [264, 4097]      - |         SGD        gene     YEL077C
     [3]    chrII        [280, 2658]      - |         SGD        gene     YBL113C
     [4]   chrXVI        [280, 6007]      - |         SGD        gene     YPL283C
     [5]     chrI        [335,  649]      + |         SGD        gene     YAL069W
     ...      ...                ...    ... .         ...         ...         ...
  [6596]    chrIV [1523249, 1523611]      + |         SGD        gene     YDR542W
  [6597]    chrIV [1524634, 1524933]      - |         SGD        gene     YDR543C
  [6598]    chrIV [1525095, 1525523]      - |         SGD        gene     YDR544C
  [6599]    chrIV [1526321, 1531711]      + |         SGD        gene     YDR545W
  [6600]    chrIV [1530863, 1531342]      - |         SGD        gene   YDR545C-A
And this is the error I get no matter how I sort the gff file (by chromosome and then start location, or by start location alone; it makes no difference):
Calling frames...Error in findInterval(spl27.f[[ii]], splfr0e[[ii]]) :
'vec' must be sorted non-decreasingly and not contain NAs
In addition: Warning messages:
1: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, :
duplicated levels in factors are deprecated
2: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, :
duplicated levels in factors are deprecated
3: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, :
duplicated levels in factors are deprecated
4: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, :
duplicated levels in factors are deprecated
5: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, :
duplicated levels in factors are deprecated
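For reference, the first message is the one base R's findInterval() gives whenever its second argument (vec) is unsorted or contains an NA. Below is a minimal sketch that reproduces it with made-up vectors. The names breaks_sorted, breaks_unsorted and positions are hypothetical and have nothing to do with riboSeqR's internal objects, so this only shows what the message means, not where it arises inside frameCounting.

```r
# Made-up breakpoints and positions; NOT values from my data or from riboSeqR.
breaks_sorted   <- c(10, 20, 30)
breaks_unsorted <- c(10, 30, 20)
positions       <- c(5, 15, 25, 35)

# With sorted breakpoints, findInterval() returns the interval index of each position.
idx <- findInterval(positions, breaks_sorted)

# With unsorted breakpoints it stops; capture the error message instead of failing.
msg_unsorted <- tryCatch(
  findInterval(positions, breaks_unsorted),
  error = function(e) conditionMessage(e)
)

# An NA among the breakpoints triggers the same error.
msg_na <- tryCatch(
  findInterval(positions, c(10, NA, 30)),
  error = function(e) conditionMessage(e)
)

print(idx)
print(msg_unsorted)
print(msg_na)
```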
BTW, there are no duplicated values (at least in the $name column).
Thanks in advance,
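A rough sketch of the kind of checks I mean, run here on a small stand-in table (gff_demo, a hypothetical name) built from the first three rows of the snapshot above rather than on the full file. It only uses base R, and whether these are the right things to check for frameCounting is my assumption.

```r
# Stand-in for my gff table: the first three rows of the snapshot above.
gff_demo <- data.frame(
  chrom  = c("chrVI", "chrV", "chrII"),
  start  = c(53L, 264L, 280L),
  end    = c(535L, 4097L, 2658L),
  strand = c("+", "-", "-"),
  name   = c("YFL068W", "YEL077C", "YBL113C"),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when no element is repeated.
n_dup_names <- anyDuplicated(gff_demo$name)

# The error message also mentions NAs, so count them in the coordinate columns.
n_na_coords <- sum(is.na(gff_demo$start) | is.na(gff_demo$end))

# Order by chromosome, then start, as in one of the sort orders I tried.
gff_sorted <- gff_demo[order(gff_demo$chrom, gff_demo$start), ]

print(n_dup_names)
print(n_na_coords)
print(gff_sorted)
```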
Alper