Dear CAGEr team
I have been trying to run CAGEr on my test dataset and am having a problem reading in the BAM files to make a CAGEset object.
> input = list.files("test",pattern = "test-.*sorted.bam$",full.names = TRUE)
> mycage <- new("CAGEset", genomeName = "BSgenome.Dmelanogaster.UCSC.dm6",
+ inputFiles = input, inputFilesType = "bam",
+ sampleLabels = c("test-C", "test-D", "test-E", "test-G"))
> ctss <- getCTSS(mycage,removeFirstG = TRUE,correctSystematicG = TRUE)
This gives me:
Reading in file: test/test-C_sorted.bam...
-> Filtering out low quality reads...
-> Removing the first base of the reads if 'G' and not aligned to the genome...
-> Estimating the frequency of adding a 'G' nucleotide and correcting the systematic bias...
Error in setnames(CTSS, c("chr", "pos", "strand", sample.labels[i])) :
  Can't assign 4 names to a 1 column data.table
After looking into the code, I thought I should try not using the removeFirstG option, so I tried:
ctss <- getCTSS(mycage,removeFirstG = FALSE,correctSystematicG = FALSE)
and got:
Reading in file: test/test-C_sorted.bam...
-> Filtering out low quality reads...
Error in `$<-.data.frame`(`*tmp*`, "tag_count", value = 1) :
  replacement has 1 row, data has 0
So I basically got stuck at the same place in the source code, but with a different error.
If I set only correctSystematicG to FALSE, I get the same error:
ctss <- getCTSS(mycage, removeFirstG = TRUE, correctSystematicG = FALSE)
Reading in file: ../rawdata_results/fetish-C_sorted.bam...
-> Filtering out low quality reads...
-> Removing the first base of the reads if 'G' and not aligned to the genome...
Error in `$<-.data.frame`(`*tmp*`, "tag_count", value = 1) :
  replacement has 1 row, data has 0
I would appreciate any help with this problem.
Thanks
Vivek
Hi, I am using CAGEr too. I downloaded the BAM files from the FANTOM5 website and got the same problem as you. Can you tell me how you solved it?
The website I downloaded the BAM files from is http://fantom.gsc.riken.jp/5/datafiles/latest/basic/mouse.primary_cell.hCAGE/
By the way, the info of my R is
thanks
zhong
Hi
It's been a while since I had this, but I think the problem was solved by switching the gene annotation to Ensembl style.
Good point with seqlevelsStyle! I just pushed versions 1.33.1 and 1.32.1 to Bioconductor, which should address these problems.
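For anyone hitting the "data has 0" error on a CAGEr version without this fix, one possible cause suggested by the replies above is that the chromosome names in the BAM header (e.g. Ensembl style, "2L") do not match those in the BSgenome package (UCSC style, "chr2L"). The following is only a sketch for checking that, not CAGEr's own code. The helper name `shared_seqnames` is made up for illustration, the file path and genome package are taken from the original post, and the BAM part runs only if the file and packages are available.

```r
# Hypothetical helper (not part of CAGEr): sequence names present in both sets.
shared_seqnames <- function(bam_seqs, genome_seqs) {
  intersect(bam_seqs, genome_seqs)
}

# Toy illustration: Ensembl-style vs UCSC-style names share nothing.
shared_seqnames(c("2L", "X"), c("chr2L", "chrX"))

bam_file   <- "test/test-C_sorted.bam"            # path from the original post
genome_pkg <- "BSgenome.Dmelanogaster.UCSC.dm6"   # genome from the original post

if (file.exists(bam_file) &&
    requireNamespace("Rsamtools", quietly = TRUE) &&
    requireNamespace(genome_pkg, quietly = TRUE)) {
  # Attaching the genome package also attaches BSgenome and its dependencies.
  suppressPackageStartupMessages(library(genome_pkg, character.only = TRUE))

  # Sequence names recorded in the BAM header.
  header   <- Rsamtools::scanBamHeader(bam_file)[[1]]
  bam_seqs <- names(header$targets)

  # Sequence names used by the BSgenome object.
  genome_seqs <- seqnames(get(genome_pkg))

  cat("BAM header:  ", head(bam_seqs), "\n")
  cat("BSgenome:    ", head(genome_seqs), "\n")
  cat("Shared names:", length(shared_seqnames(bam_seqs, genome_seqs)), "\n")
}
```

If the number of shared names is zero, the naming styles differ, which would fit the explanation given in this thread. Updating CAGEr to the versions mentioned above is probably the simpler route.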
Hi, I am also running into this problem. If I understand correctly, is the solution to redo the mapping of the UCSC GTF file?