Dear all,
I stumbled over a problem calling tallyVariants on a BAM file from an Ion Torrent (which was most likely aligned using some proprietary tool) that causes R to crash after the message:
Two different genomic chars and G at position 197880852
I believe this message comes from the gsnap/gmap code in gmapR, but am not sure. So, in order to better understand the cause of the problem, I would like to pose some questions:
1) Does VariantTools require that BAM files were generated by gmap?
2) Is it required to use exactly the same gmap index for alignment and tallyVariants? (This I obviously don't have, since I only received the BAM file and built my own gmap index using the hg19 BSgenome package.)
Thanks in advance for any insights.
cheers, jo
Getting the stuff from the BAM file might be tricky. In the meantime I will also try that on a different machine (possibly related to the OSX?).
my session info:
What do you mean by "related to the OSX"? VariantTools is not supported on OSX, only Linux.
Ah, didn't know that. I'm always compiling my R on OS X and thus also installing the source packages, not the pre-compiled ones. VariantTools compiled nicely on my system. I'll check if I get the same error on linux.
I guess it's unsupported in the sense that Bioc does not distribute a binary. But it builds and runs just fine on my Mac and on others, apparently.
Hi Michael,
Just to clarify, we don't support VariantTools on OS X because it depends on gmapR, which we don't support either because you asked us to mark it as unsupported on OS X (see commit 77246). Should we change that?
Thanks,
H.
I guess we could try again to get things building on your side. I looked back at the thread. It was the old autoconf via svn issue, which we solved with "--disable-maintainer-mode".
Done (in devel only, commit 111229). Note that you have full control on this via the .BBSoptions file in gmapR top-level directory.
H.
Thanks. I know I could change it, but it might take some of Dan's time, so wanted to coordinate.
I must be missing something. Is there something that needs to be done on the Mac build nodes?
I said "might". Hopefully not ;)
Could you at least look at the reads at that region to see if there is anything suspicious, like in the cigar string? Also, I realize gmapR is temporarily broken in devel, but if you could try grabbing the latest version from svn, that would help.
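For anyone following along, a rough sketch of how one could pull the reads around the reported position and check their CIGAR strings. This is not code from the thread: the function names cigar_query_width and reads_near, the flank size, and the file and chromosome in the usage line are placeholders (the error message does not name the chromosome). It assumes Rsamtools, GenomicRanges, IRanges and BiocGenerics are installed and that the BAM is indexed.

```r
# Number of query bases a CIGAR string accounts for.
# Operations M, I, S, = and X consume the query; D, N, H and P do not.
cigar_query_width <- function(cigar) {
  vapply(cigar, function(cg) {
    lens <- as.integer(regmatches(cg, gregexpr("[0-9]+", cg))[[1]])
    ops  <- regmatches(cg, gregexpr("[MIDNSHP=X]", cg))[[1]]
    sum(lens[ops %in% c("M", "I", "S", "=", "X")])
  }, integer(1), USE.NAMES = FALSE)
}

# Fetch reads overlapping pos +/- flank and flag those whose CIGAR
# does not add up to the length of the stored sequence.
reads_near <- function(bam_file, chrom, pos, flank = 50L) {
  region <- GenomicRanges::GRanges(
    chrom, IRanges::IRanges(pos - flank, pos + flank)
  )
  param <- Rsamtools::ScanBamParam(
    which = region,
    what  = c("qname", "flag", "pos", "mapq", "cigar", "seq")
  )
  res <- Rsamtools::scanBam(bam_file, param = param)[[1]]
  seq_len   <- BiocGenerics::width(res$seq)
  cigar_len <- cigar_query_width(res$cigar)
  data.frame(
    qname = res$qname, flag = res$flag, pos = res$pos, mapq = res$mapq,
    cigar = res$cigar, seq_len = seq_len, cigar_len = cigar_len,
    suspicious = cigar_len != seq_len,
    stringsAsFactors = FALSE
  )
}

# Usage (placeholders; substitute the real file and chromosome):
# reads_near("sample.bam", "chrN", 197880852L)
```

A read flagged as suspicious is only a hint. Unusual operations (long soft clips, N, P) may be just as relevant as a length mismatch.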
Hi Michael,
that seems to be the problematic read from the BAM file:
There is no other read even close to that one, so it must be this. I'm now trying the svn version of gmapR.
with the latest gmapR from svn I get the following message:
I'll try to recompile R on my Mac and run the code on a linux machine.
So, it did work on the linux machine, thus I guess it has something to do with my R/OSX. I'll check some more stuff.
Below is the session info from the linux machine:
I tried using your BAM with a freshly built genome index from the hg19 BSgenome package. It seemed to work, except for the mismatch in chrM length between the UCSC hg19 and your BAM's header. Note that this was on Linux, but with the latest devel of gmapR.
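For reference, the kind of setup described here could look roughly like the sketch below. It is an assumption about the workflow, not the code actually run in this thread: the wrapper name tally_with_fresh_index, the index directory, the index name and the base quality cutoff are all illustrative. It needs gmapR, VariantTools and BSgenome.Hsapiens.UCSC.hg19, and building the index takes a while.

```r
# Sketch only: build a GMAP index from the hg19 BSgenome package and
# tally variants on a BAM file against it.
tally_with_fresh_index <- function(bam_file,
                                   index_dir = "~/gmap_index",  # hypothetical path
                                   index_name = "hg19",         # hypothetical name
                                   high_base_quality = 23L) {   # illustrative cutoff
  # create = TRUE asks gmapR to build the index if it is not there yet.
  genome <- gmapR::GmapGenome(
    genome    = BSgenome.Hsapiens.UCSC.hg19::Hsapiens,
    directory = index_dir,
    name      = index_name,
    create    = TRUE
  )
  talVarPar <- VariantTools::TallyVariantsParam(
    genome,
    high_base_quality = high_base_quality
  )
  VariantTools::tallyVariants(bam_file, talVarPar)
}

# Usage (placeholder file name):
# tally <- tally_with_fresh_index("sample.bam")
```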
Thanks for testing, Michael. On Linux with the stable version it also worked for me. I'm still waiting to test the devel version on Mac, but there is some problem with VariantAnnotation that prevents me from installing it. I guess there must be something odd with my R installation...
Yes, the difference in chrM length is also strange, but that's the file I got from my collab partners.
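A quick way to see such header mismatches is to compare the sequence lengths in the BAM header with those of the reference. The helper below is a minimal sketch with a made-up name (compare_seqlengths). The numbers in the example are illustrative values, not the ones from this BAM. The commented lines assume Rsamtools, GenomeInfoDb and BSgenome.Hsapiens.UCSC.hg19 are installed.

```r
# Report sequences present in both named length vectors whose lengths differ.
compare_seqlengths <- function(bam_lengths, ref_lengths) {
  shared <- intersect(names(bam_lengths), names(ref_lengths))
  # which() drops comparisons that are NA (unknown lengths)
  differ <- shared[which(bam_lengths[shared] != ref_lengths[shared])]
  data.frame(
    seqname   = differ,
    bam       = unname(bam_lengths[differ]),
    reference = unname(ref_lengths[differ]),
    stringsAsFactors = FALSE
  )
}

# With real data (placeholder file name):
# bam_lengths <- GenomeInfoDb::seqlengths(Rsamtools::BamFile("sample.bam"))
# ref_lengths <- GenomeInfoDb::seqlengths(BSgenome.Hsapiens.UCSC.hg19::Hsapiens)

# Toy example with illustrative values:
bam_lengths <- c(chr1 = 249250621, chrM = 16569)
ref_lengths <- c(chr1 = 249250621, chrM = 16571)
compare_seqlengths(bam_lengths, ref_lengths)
```

In the toy example only chrM is reported, since chr1 has the same length in both vectors.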
The linux machine is using the stable release version of gmapR/bam_tally. We've made many changes to the devel gmapR. It would be great to get a hold of a BAM that causes this problem. Or at the very least the code constructing talVarPar.
UPDATE:
I've now compiled and installed R-3.2.3 and installed Bioc 3.2 on that fresh install (that used XCode 7.2, more specifically, clang from Apple's LLVM version 7.0.2; gfortran comes from homebrew in gcc version 5.4.0). With that install tallyVariants did run smoothly.