Alejandro Reyes
Dear Julien, dear Mar, and everyone interested in DEXSeq,
You recently reported some problems in DEXSeq related to the way the
HTSeq Python scripts handle exons that overlap with more than one
gene ID.
The solution we had adopted so far was to merge the gene IDs sharing
an exon into a single "aggregate gene" ID. From the input of some
users and from our own experience, we know that this was not the most
appropriate solution: when the merged genes were differentially
expressed, DEXSeq falsely called differential usage in other exons of
the aggregate gene. We have therefore added a "-r" parameter to the
script "prepare_annotation_dexseq.py" that lets the user decide what
to do with these exons: either ignore the exons associated with more
than one gene ID and treat each gene separately, or merge the genes
and take these exons into account.
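
To make the two options concrete, here is a minimal toy sketch in Python. It is not the code of the DEXSeq script: the function name `resolve_shared_exons`, its `aggregate` argument, the input format, and the "+"-joined aggregate names are assumptions made only for illustration.

```python
from collections import defaultdict


def resolve_shared_exons(exon_to_genes, aggregate=True):
    """Toy illustration of the two ways to handle shared exons.

    exon_to_genes maps an exon ID to the non-empty set of gene IDs it
    belongs to. Returns a dict mapping each (possibly aggregate) gene ID
    to a sorted list of its exon IDs.
    """
    if not aggregate:
        # Option 1: ignore exons assigned to more than one gene ID and
        # keep every gene separate.
        kept = defaultdict(list)
        for exon, genes in exon_to_genes.items():
            if len(genes) == 1:
                (gene,) = genes
                kept[gene].append(exon)
        return {gene: sorted(exons) for gene, exons in kept.items()}

    # Option 2: merge all genes linked through shared exons into one
    # aggregate gene, using a small union-find structure.
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path compression
            gene = parent[gene]
        return gene

    for genes in exon_to_genes.values():
        ordered = sorted(genes)
        root = find(ordered[0])
        for other in ordered[1:]:
            parent[find(other)] = root

    # Collect the members of each merged group to build its name.
    members = defaultdict(set)
    for gene in list(parent):
        members[find(gene)].add(gene)

    merged = defaultdict(list)
    for exon, genes in exon_to_genes.items():
        root = find(next(iter(genes)))
        # Assumption: aggregate IDs are the member IDs joined with "+".
        merged["+".join(sorted(members[root]))].append(exon)
    return {gene: sorted(exons) for gene, exons in merged.items()}


# Hypothetical annotation: exon E2 is shared by geneA and geneB.
exons = {
    "E1": {"geneA"},
    "E2": {"geneA", "geneB"},
    "E3": {"geneB"},
    "E4": {"geneC"},
}
print(resolve_shared_exons(exons, aggregate=True))
print(resolve_shared_exons(exons, aggregate=False))
```

With merging, geneA and geneB end up in one aggregate gene containing E1, E2 and E3; without merging, E2 is dropped and each gene keeps only its unambiguous exons.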
Additionally, we have implemented R/Bioconductor functions equivalent
to the Python scripts. These functions were implemented using code
contributed by Mike Love.
All these changes are available in the latest svn version (1.5.9).
Best regards,
Alejandro Reyes
Hi Alejandro,
Just to let you know that adding the junctions to the test of
differential expression of DEXSeq worked fine! The "hack" was actually
straightforward: I just had to modify the count files taken as input.
On a different note, I noticed that many false positives were
generated because of "aggregate" gene models that were composed of
different overlapping genes. When these overlapping genes behave
differently in different conditions, this is interpreted as
differential expression of some exons, while it is actually
differential expression of genes... See the attached picture; it
might make this easier to understand.
Did you notice this behavior of DEXSeq, and do you have any comment on
this?
Thanks again for your work on DEXSeq
Julien
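
The false-positive effect discussed in this thread can be illustrated with a small numeric sketch. The counts below are made up, and comparing simple exon fractions is only a stand-in for the statistical model DEXSeq actually fits; the helper `exon_fractions` is hypothetical.

```python
def exon_fractions(counts, exons):
    """Share of the reads of `exons` that falls on each of those exons."""
    total = sum(counts[exon] for exon in exons)
    return {exon: counts[exon] / total for exon in exons}


# Hypothetical read counts: gene A (exons A1, A2) is three times higher
# in the treated condition, gene B (exons B1, B2) does not change.
control = {"A1": 100, "A2": 100, "B1": 100, "B2": 100}
treated = {"A1": 300, "A2": 300, "B1": 100, "B2": 100}

gene_a = ["A1", "A2"]
aggregate = ["A1", "A2", "B1", "B2"]

# Within gene A alone, exon usage is identical in both conditions.
print(exon_fractions(control, gene_a), exon_fractions(treated, gene_a))

# Within the aggregate A+B, every exon's share shifts, although only
# the expression of gene A as a whole has changed.
print(exon_fractions(control, aggregate))
print(exon_fractions(treated, aggregate))
```

In this toy case the share of A1 within gene A stays at 0.5, but within the aggregate it moves from 0.25 to 0.375 while the share of B1 drops from 0.25 to 0.125, which looks like differential exon usage.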