Is that possible to find if the sequence is enriched by specific motifs
0
1
Entering edit mode
Liliian • 0
@67340e11
Last seen 20 months ago
United States

I'm wondering is that possible to find if the sequence is enriched by specific motifs?

For example, I have generated Granges object and DNAStringset :

> gr_up <-  GRanges(gr_up)
GRanges object with 376 ranges and 3 metadata columns:
      seqnames              ranges strand |            ENSG      pval      dPSI
         <Rle>           <IRanges>  <Rle> |     <character> <numeric> <numeric>
    1     chr1 167012370-167012510      + | ENSG00000143194 0.0000000  1.000000
    2     chr1 186304109-186304256      + | ENSG00000116690 0.0874126  0.666667
    3     chr1 154176116-154176178      - | ENSG00000143549 0.0000000  0.488313
    4     chr1 203347293-203348276      - | ENSG00000122176 0.0969031  0.482400
    5     chr1   54017377-54017426      - | ENSG00000203985 0.0289710  0.433965
  ...      ...                 ...    ... .             ...       ...       ...
  634     chrX   38804311-38804412      + | ENSG00000165175 0.0974026  0.134225
  635     chrX   68511850-68511976      + | ENSG00000181704 0.0309690  0.128942
  636     chrX   48575562-48575629      + | ENSG00000102317 0.0299700  0.119535
  637     chrX   47583409-47583535      + | ENSG00000102265 0.0109890  0.115762
  638     chrX 136548590-136549007      + | ENSG00000102243 0.0559441  0.100380
  -------
  seqinfo: 23 sequences from an unspecified genome; no seqlengths

> seqs_up <- getSeq(BSgenome.Hsapiens.UCSC.hg38, gr_up)
DNAStringSet object of length 376:
      width seq                                                                                                          names               
  [1]   141 TCCCCTAGTCTCCTGATGCTTCTTGTCATAATTCTTCTTGGACTAATTCACTG...TAATCAACAAATATTTATTGAGCACCTCCTCTGTGCCAGAAGATGATCCAAA 1
  [2]   148 GCATAATCCCACATCACCACCATCTTCAAAGAAAGCACCTCCACCTTCAGGAG...ACCAAACAAGAAGAAGACTAAGAAAGTTATAGAATCAGAGGAAATAACAGAA 2
  [3]    63 AGAGTGAGAGTAGTCGTCGAAAAAGTCGAAGAAGGTCGAAAACGTCCCGTCACCGGTCCGCGA                                              3
  [4]   984 TAACTAGGATAACGGAACCTCCATCTCCAAGAGGTCCAACCACAACTGACCCC...TCTCCCTCTTCTCGGGACGGTCGTCGTCCTCCCTCCAGGTGACGTAAAACAG 4
  [5]    50 AAGGGGAGGTCCCACCCGAAACCTAAGTCGTCGTCACATCGGTAAGAGAG                                                           5
  ...   ... ...
[372]   102 CACAGTCCAGCCACTGACCGCAGCAGCGCCCTTGCGTAGCAGCCGCTTGCAGCGAGAACACTGAATTGCCAACGAGCAGGAGAGTCTCAAGGCGCAAGAGGA       634
[373]   127 TTGCAGGCCTTTCAGATATATCCATCTCACAAGACATCCCCGTAGAAGGAGAA...GCATCCGGGAGTTTGACAGCTCCACATTAAATGAATCTGTTCGCAATACCAT 635
[374]    68 GGTCGTTGTCAAGGACCGGGAGACTCAGCGGTCCAGGGGTTTTGGTTTCATCACCTTCACCAACCCAG                                         636
[375]   127 ACCCACCATGGCCCCCTTTGAGCCCCTGGCTTCTGGCATCCTGTTGTTGCTGT...CTGCACCTGTGTCCCACCCCACCCACAGACGGCCTTCTGCAATTCCGACCTC 637
[376]   418 TGATAGCATGTCTCCAAATCAGTGGCGTTACTCGTCTCCATGGACAAAGCCAC...GATAGCTGGAAGCACAGGGTTGCTCTTCAACCTGCCTCCCGGCTCAGTTCAC 638

And then I have conducted motif enrichment analysis and found that these sequence set have some "C" enrich.

Motif Enrichment

I'm wondering is that possible to find which specific sequence(in Granges or DNAStringset ) has "C" enrich?

MotifDiscovery RNASeq • 721 views
ADD COMMENT

Login before adding your answer.

Traffic: 431 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6