Polyester and mutations
H.Hasani • 0
@hhasani-11134
Last seen 8.4 years ago

Hello all,

I'm looking for a package that simulates RNA-Seq data and, besides the sequences, also outputs the true number of mutations (SNPs, indels) and their positions. So far I have tried two tools, and neither could do this.

Polyester (http://bioconductor.org/packages/release/bioc/html/polyester.html) is my third attempt, but from the documentation I gather that this can only be achieved by giving the tool the list of mutations in advance. Is that the only way?

Thanks

 

 

rnaseq pair-end reads mutations • 1.5k views
Jeff Leek ▴ 650
@jeff-leek-5015
Last seen 3.7 years ago
United States

Yes, you are correct: Polyester does not simulate SNPs, indels, or other mutations. You would need to decide where the mutations are before giving the sequences to Polyester. You could then introduce them into the FASTA file and have Polyester simulate reads from the mutated sequences.
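
One way to follow this suggestion is sketched below in Python (Polyester itself is an R package, so this covers only the preprocessing step). The sketch places random SNPs into each sequence of a FASTA file and records the position, reference base, and alternate base in a truth table. The function names `introduce_snps` and `mutate_fasta`, the file paths, and the TSV layout are all hypothetical. Indels are left out because they shift downstream coordinates and need more bookkeeping.

```python
import random


def introduce_snps(seq, n_snps, rng):
    """Return (mutated_seq, truth); truth is a list of (pos_1based, ref, alt)."""
    bases = "ACGT"
    # Only mutate unambiguous bases so every SNP has a well-defined ref allele.
    candidates = [i for i, b in enumerate(seq) if b.upper() in bases]
    positions = sorted(rng.sample(candidates, min(n_snps, len(candidates))))
    mutated = list(seq)
    truth = []
    for i in positions:
        ref = seq[i].upper()
        # The alternate base is always different from the reference base.
        alt = rng.choice([b for b in bases if b != ref])
        mutated[i] = alt  # note: written in upper case even if ref was soft-masked
        truth.append((i + 1, ref, alt))
    return "".join(mutated), truth


def mutate_fasta(in_path, out_fasta, out_truth, snps_per_seq, seed=0):
    """Write a mutated copy of in_path plus a TSV listing every introduced SNP."""
    rng = random.Random(seed)  # fixed seed so the truth table is reproducible
    records, name, chunks = [], None, []
    with open(in_path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))

    with open(out_fasta, "w") as fa, open(out_truth, "w") as tsv:
        tsv.write("sequence\tposition\tref\talt\n")
        for name, seq in records:
            mutated, truth = introduce_snps(seq, snps_per_seq, rng)
            fa.write(f">{name}\n{mutated}\n")
            for pos, ref, alt in truth:
                tsv.write(f"{name}\t{pos}\t{ref}\t{alt}\n")
```

For example, `mutate_fasta("transcripts.fa", "transcripts_mut.fa", "truth.tsv", snps_per_seq=5)` (placeholder file names) would write the mutated sequences and the truth table, with positions 1-based within each sequence. The mutated FASTA could then be passed to Polyester, for instance through the `fasta` argument of `simulate_experiment`.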

(edited version of a response from Alyssa Frazee via email)

Jeff


Thank you for answering!

I'm trying to create a custom error model, but GemErr raises an error [1]. I have already tried to reach the author of the tool but got no answer.

The command line I'm using is:
$tools_path/GemErr.py -r 100 -f $anno_path/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa -s $GemErr/ToGemErr.sam -n $GemErr/illm -m 12

Thanks


[1]

Traceback (most recent call last):
  File "...../Me/Tools/GemSIM_v1.6//GemErr.py", line 758, in <module>
    main(sys.argv[1:])
  File "...../Me/Tools/GemSIM_v1.6//GemErr.py", line 755, in main
    mkMxSingle(readLen,reference,samfile,name,skip,circular,maxIndel,excl,minK)
  File "...../Me/Tools/GemSIM_v1.6//GemErr.py", line 530, in mkMxSingle
    updateM(ref[chr],pos,seq,qual,cigList,circular,0,maxIndel,'f',readLen,excl)
KeyError: 'chr1'

 


GemErr is a completely separate project from Polyester (it's just one way to go if you want a custom error model), and the authors of Polyester aren't involved in GemErr development at all, so this is just a debugging suggestion and not "official" GemErr help :) That said, the error suggests that the chromosome names in your alignment file (ToGemErr.sam) differ from the chromosome names in your reference .fa file. I'd double-check that the FASTA file you're providing as the reference is the same one that the reads in the SAM file were aligned to.
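
A quick way to test this suggestion is sketched below: collect the reference names used by the SAM records and compare them with the names in the FASTA headers. Ensembl FASTA files such as the one in the command above normally name chromosomes `1`, `2`, ... without a `chr` prefix, so a SAM file that refers to `chr1` would be consistent with the `KeyError: 'chr1'` in the traceback. The helper names here are hypothetical, and the sketch assumes the reference is looked up by the first whitespace-delimited token of each FASTA header; how GemErr actually builds its keys has not been checked.

```python
def fasta_names(path):
    """Collect the first whitespace-delimited token of every FASTA header."""
    names = set()
    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:
                    names.add(tokens[0])
    return names


def sam_reference_names(path):
    """Collect the reference names (column 3, RNAME) used by SAM records."""
    names = set()
    with open(path) as fh:
        for line in fh:
            if line.startswith("@"):  # skip header lines
                continue
            fields = line.rstrip("\n").split("\t")
            # '*' marks an unmapped read, which has no reference sequence.
            if len(fields) > 2 and fields[2] != "*":
                names.add(fields[2])
    return names


def missing_from_fasta(sam_path, fasta_path):
    """Reference names that occur in the SAM file but not in the FASTA file."""
    return sorted(sam_reference_names(sam_path) - fasta_names(fasta_path))
```

If `missing_from_fasta` returns a non-empty list, the two files disagree on naming. In that case the reads would need to be realigned against the same FASTA, or the names in one of the files made to match the other.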


Unfortunately, that did not work!
