occurrence of rGADEM motifs

mattia pelizzola ▴ 200

@mattia-pelizzola-3304

Last seen 18 months ago

Italy

Hi,

I am using rGADEM and MotIV to find enriched motifs in my ChIPseq peaks and to determine their similarity with Jaspar TFBS. These tools look very useful!

rGADEM provides a list of enriched motifs. The number of occurrences of each motif is provided by the nOccurrences function, but I can't find a way to determine which peak regions contain these motifs. In particular, what are the startPos and endPos functions supposed to do? I would expect a set of genomic positions (or positions relative to the peak regions) with the same length as nOccurrences, but I only get one number for each motif, with no chromosome associated. Even in the rGADEM vignette, nOccurrences equals 60, but the startPos and endPos functions return only one number. Am I missing or misunderstanding anything?

Additionally, I was wondering whether it is possible to control the maximum number of processors used in the analysis. I am working on a cluster shared between many people, and apparently the software uses as many processors as possible, while I do not want to be that greedy with other users.

Thanks for any hint,
mattia

rGADEM MotIV • 1.9k views

updated 13.8 years ago by Charles Joly Beauparlant ▴ 170 • written 13.8 years ago by mattia pelizzola ▴ 200

SimonNoël ▴ 450

@simonnoel-3455

Last seen 10.6 years ago

What do you think? Also, it uses Jaspar, a very powerful and complete database that we are not using yet.

Simon Noël
CdeC

(Sent in reply to mattia pelizzola's message to the Bioconductor mailing list, "[BioC] occurrence of rGADEM motifs", dated 26 July 2011, 06:51.)

written 13.8 years ago by SimonNoël ▴ 450

Sorry for the last mail, I hit the wrong button.

Simon Noël
CdeC

(Refers to the message "[BioC] RE : occurrence of rGADEM motifs", sent by Simon Noël to mattia pelizzola and the Bioconductor mailing list on 26 July 2011, 13:12.)

written 13.8 years ago by SimonNoël ▴ 450

Heidi Dvinge ★ 2.0k

@heidi-dvinge-2195

Last seen 10.6 years ago

Hi Mattia,

> rGADEM provides a list of enriched motifs. The total number of motifs is
> provided by the nOccurrences function, but I can't find a way to get to know
> which peak regions do contain these motifs. In particular, what are the
> startPos and endPos functions supposed to do? I would expect a set of
> genomic positions (or positions relative to the peak regions) with the same
> length as nOccurrences, but I only get one number for each motif, with no
> chromosome associated.

I don't really know rGADEM, so I can't tell you whether there's a direct way of doing that. You can, however, extract the PWM itself using getPWM (rGADEM) and then match it to your sequence(s) of interest with matchPWM (Biostrings). The latter also lets you control the score, i.e. how similar the found motif should be to your PWM of interest.

I seem to remember that rGADEM sometimes gives you quite long motifs, where the information content around the edges is quite low. If you extract the actual PWMs, you might therefore want to consider trimming them before matching to your sequences.

HTH
\Heidi

written 13.8 years ago by Heidi Dvinge ★ 2.0k

Charles Joly Beauparlant ▴ 170

@charles-joly-beauparlant-4777

Last seen 6.1 years ago

Canada

Hi mattia,

About the number of processors used: rGADEM uses OpenMP to speed up certain parts of the calculation. So, to control the maximum number of processors used, one option is to set the OMP_NUM_THREADS environment variable.

For example, you can run the command "export OMP_NUM_THREADS=8" in a Linux terminal before starting an rGADEM analysis, and OpenMP will use only 8 threads.

I do not have access to other operating systems right now to test this, but I found information on setting the OMP_NUM_THREADS environment variable on this page:
http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/cpp/lin/optaps/common/optaps_par_var.htm

Best regards,
Charles Joly Beauparlant
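As a minimal sketch of the OMP_NUM_THREADS suggestion, assuming a POSIX shell on Linux: the snippet below exports the variable and checks that a child process inherits it. The script name run_rgadem.R is hypothetical and only appears in comments. Whether rGADEM honours the limit depends on how it was built with OpenMP on your system.

```shell
#!/bin/sh
# Cap the number of OpenMP threads for every program started from this shell.
# 8 is the value from the example in the answer; choose what suits your cluster.
export OMP_NUM_THREADS=8

# Exported variables are inherited by child processes. This captures the
# value that a program launched from this shell would see.
child_value=$(sh -c 'printf "%s" "$OMP_NUM_THREADS"')
echo "OMP_NUM_THREADS seen by child processes: $child_value"

# Hypothetical launch of the analysis (the script name is an assumption):
#   Rscript run_rgadem.R
#
# To limit a single command without changing the rest of the session:
#   OMP_NUM_THREADS=8 Rscript run_rgadem.R
```

Setting the variable before R starts is the safer choice, since OpenMP runtimes typically read it once when they initialise. On a scheduler-managed cluster, the value would normally match the number of cores requested for the job.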

written 13.8 years ago by Charles Joly Beauparlant ▴ 170
