Motif search -- access to JASPAR, MotIV package, more TF-PWM relationships?
Vince Schulz ▴ 160
@vince-schulz-3553
United States
In regard to "Questions, suggestions, use cases and data sources are all welcome for working with TF-PWM motifs", my 2c:

For other methods for de novo ChIP-seq motif finding besides MotIV and MEME, there is a new paper that describes a fast method for ChIP-seq sets and has recent references:
http://www.ncbi.nlm.nih.gov/pubmed/22228832

I like the HOMER suite (Perl, not R based), which is very fast, easy to use and gives reasonable results:
http://biowhat.ucsd.edu/homer/chipseq/index.html

If Bioconductor matrix annotation packages are developed, it would be good IMO to have:
-obviously all of JASPAR
-some sort of phylogenetic grouping, e.g. vertebrate, plant, in addition to species-based grouping, since species-specific info is usually limited and not usually required
-the latest UniPROBE matrices, since JASPAR seems to be slow to update: http://the_brain.bwh.harvard.edu/uniprobe/
-maybe the free version of TRANSFAC, even if it is terribly old?
-unlikely, perhaps, but it would be great if someone systematically went through all public ChIP-seq datasets and extracted the top motifs. Or you could grab the HOMER motifs from the website above, where they have already done this to a limited extent.

Another useful area would be methods for identifying statistically significantly overrepresented known motifs in sets of DNA sequences, compared to some user-choosable control set (sequences from a control set, all promoter sequences, or the same genome randomized in some way). This has been implemented in user-friendly non-R ways many times, especially for promoter analysis of differentially expressed gene sets; see e.g.:
-the HOMER package above
-http://dire.dcode.org/
-http://159.149.109.9/pscan/
-http://www.dbi.tju.edu/dbi/tools/paint/
-clover at http://biowulf.bu.edu/MotifViz/
-http://www.bioinfo.tsinghua.edu.cn/~zhengjsh/OTFBS/
-http://www.telis.ucla.edu/index.php?cmd=transfac
-http://grenada.lumc.nl/HumaneGenetica/CORE_TF/

Finally, it would be good to be able to easily search for a given motif in a DNA sequence and attach a score to the match, like the possum program listed at clover/MotifViz above or like the TRANSFAC Match software. This would use some kind of test against control sequences and could currently be done using existing Bioconductor tools, but it is a common enough use that a package would be good. The idea is not just to give a score for how close the sequence is to the PWM, but also to say how likely such a match is to happen by chance, since many of the PWMs are very sloppy.

Vince

.........

On 4/24/12 11:02 PM, "Paul Shannon" <pshannon at fhcrc.org> wrote:
> Hi Julie,
>
> FlyFactorSurvey looks great. Would that we had such a resource (curated,
> current, and growing) for all organisms!
>
> A few questions, if I may:
>
> 1) What role with respect to FlyFactorSurvey do you picture us taking here
> at BioC? How can we help?
>
> 2) Your website (http://pgfe.umassmed.edu/TFDBS) recommends meme and TOMTOM
> for motif comparison. Do you use them yourself? If so, can you tell us about
> their strengths and weaknesses? How do they compare to clover?
> (http://zlab.bu.edu/clover/)
>
> In that same spirit -- trying to find out more about this topic -- here are
> some more questions:
>
> 3) The JASPAR database seems to be mostly unchanged since 2009.
> (http://jaspar.genereg.net/html/DOWNLOAD). Does anyone know their update
> policy?
>
> 4) Is TRANSFAC only for license holders?
>
> 5) Are there any other organism-specific gems like FlyFactorSurvey to be
> discovered out on the web?
>
> Thanks!
>
> - Paul
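The last two requests in the post above (a PWM match score that comes with an estimate of how often it arises by chance, and over-representation of a motif relative to a control set) can be sketched roughly as follows. This is plain Python rather than a Bioconductor package, and it is only an illustration under stated assumptions: the function names, the toy count matrix, the pseudocount, the uniform background, forward-strand-only scanning, and the use of shuffled copies of the sequence as controls are all hypothetical choices, not taken from possum, clover, TRANSFAC Match or any other tool named in this thread.

```python
import math
import random

BASES = "ACGT"


def log_odds_pwm(counts, pseudocount=0.5, background=0.25):
    """Turn per-position base counts into a log2-odds matrix (uniform background assumed)."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({b: math.log2((column[b] + pseudocount) / total / background)
                    for b in BASES})
    return pwm


def best_score(pwm, seq):
    """Highest log-odds score over all forward-strand windows; seq must be uppercase ACGT."""
    width = len(pwm)
    if len(seq) < width:
        return float("-inf")
    return max(sum(pwm[i][seq[start + i]] for i in range(width))
               for start in range(len(seq) - width + 1))


def empirical_p(pwm, seq, n_controls=1000, seed=0):
    """Score seq, then estimate how often a shuffled copy scores at least as well."""
    rng = random.Random(seed)
    observed = best_score(pwm, seq)
    as_good = 0
    for _ in range(n_controls):
        letters = list(seq)
        rng.shuffle(letters)  # control keeps base composition, destroys order
        if best_score(pwm, "".join(letters)) >= observed:
            as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (as_good + 1) / (n_controls + 1)


def count_hits(pwm, seqs, threshold):
    """Number of sequences whose best match reaches the threshold."""
    return sum(best_score(pwm, s) >= threshold for s in seqs)


def enrichment_p(fg_hits, fg_total, bg_hits, bg_total):
    """One-sided hypergeometric tail: P(at least fg_hits hits land in the foreground)."""
    n_all = fg_total + bg_total
    k_all = fg_hits + bg_hits
    upper = min(k_all, fg_total)
    tail = sum(math.comb(k_all, k) * math.comb(n_all - k_all, fg_total - k)
               for k in range(fg_hits, upper + 1))
    return tail / math.comb(n_all, fg_total)


if __name__ == "__main__":
    # Toy count matrix around the consensus TGACTCA (made-up numbers).
    counts = [{b: (17 if b == c else 1) for b in BASES} for c in "TGACTCA"]
    pwm = log_odds_pwm(counts)
    score, p = empirical_p(pwm, "ACGTACGTTGACTCAGGCATCGAT")
    print(f"best score {score:.2f}, empirical p {p:.4f}")
    foreground = ["ACGTTGACTCAGG", "TTTGACTCATTTT", "GGGGCCCCAAAAT"]
    control = ["ACGTACGTACGTA", "GGGGCCCCAAAAT", "TTTTTTTTTTTTT"]
    fg = count_hits(pwm, foreground, threshold=10.0)
    bg = count_hits(pwm, control, threshold=10.0)
    print("enrichment p:", enrichment_p(fg, len(foreground), bg, len(control)))
```

Shuffling preserves base composition only; a control that also preserves dinucleotide content, or a user-supplied set such as all promoter sequences, would slot into the same two functions. With a sloppy PWM the shuffled copies score well often, so the empirical p-value rises even when the raw score looks respectable, which is the point being made above.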
Annotation GO MotIV • 2.5k views
Arno BioC ▴ 10
@arno-bioc-5258
Hi,

We are the developers of the de novo motif discovery package rGADEM. While rGADEM fits nicely in the ChIP-seq analysis pipeline PICS/rGADEM/MotIV, it can also easily be used to analyze enriched regions obtained from any other peak caller. Among rGADEM's many outputs are position weight matrices (PWMs), which can be used directly for analysis in MotIV as well as in other programs that compare motifs with databases. Since it is implemented using OpenMP, the time required for each analysis is greatly reduced.

We would happily consider collaborating with you if you are interested in testing our de novo motif discovery package with your dataset.

Regards,
Arnaud

On Fri, Apr 27, 2012 at 10:22 AM, Vincent Schulz <vincent.schulz@yale.edu> wrote:
> In regards to Questions, suggestions, use cases and data sources are all
> welcome for working with TF-PWM motifs, my 2c:
> [...]