FourCseq -problem with countFragmentOverlaps
4
0
Entering edit mode
dzis • 0
@dzis-8446
Last seen 9.0 years ago
Poland

Hi,

I have a problem using FourCseq, which I don't really understand. i am trying to use the tool to my data of Arabidopsis Thaliana . i have 2 replications for 2 conditions everything seems to work normal but when i am  using the  countFragmentOverlaps either i have 0 counts and the counts files are emplty but without any error (reference: Arabidopsis_thaliana.TAIR10.24.dna.genome.fa) or if i use another version of reference genome (Arabidopsis_thaliana.TAIR10.27.dna.toplevel.fa)i have a warning like :

reading bam files
calculating overlaps
Warning messages:
1: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
2: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
3: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
4: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
5: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
6: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
7: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
8: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)

 the result in the first case is an assay like that :

 fcar
class: FourC
dim: 37838 4
exptData(7): projectPath fragmentDir ... primerFile bamFilePath
assays(3): counts countsLeftFragmentEnd countsRightFragmentEnd
rownames: NULL
rowRanges metadata column names(4): leftSize rightSize leftValid rightValid
colnames(4): FLC_condition_1_1 FLC_condition_1_2 FLC_condition_2_1 FLC_condition_2_2
colData names(21): viewpoint condition ... mappedReads mappingRatio
> assays(fcar)
List of length 3
names(3): counts countsLeftFragmentEnd countsRightFragmentEnd
> head(assay(fcar, "counts"))
     FLC_condition_1_1 FLC_condition_1_2 FLC_condition_2_1 FLC_condition_2_2
[1,]                 0                 0                 0                 0
[2,]                 0                 0                 0                 0
[3,]                 0                 0                 0                 0
[4,]                 0                 0                 0                 0
[5,]                 0                 0                 0                 0
[6,]                 0                 0                 0                 0

Why the tool is not able to find overlaprs ? Am i doing something wrong ???

 

Here is my code:

library(FourCSeq)

referenceGenomeFile = system.file("extdata/Arabidopsis_thaliana.TAIR10.24.dna.genome.fa",package ="FourCSeq")
referenceGenomeFile
bamFilePath = system.file ("extdata/bam", package="FourCSeq")
bamFilePath
primerFile = system.file ("extdata/primerAGATCT.fa",package="FourCSeq")
primerFile
writeLines(readLines(primerFile))
exptData <- SimpleList (projectPath="exampleData",
                        fragmentDir="re_fragments",
                        referenceGenomeFile = referenceGenomeFile,
                        reSequence1 = "AGATCT",
                        reSequence2 = "CATG",
                        primerFile = primerFile ,
                        bamFilePath = bamFilePath)
exptData

colData <- DataFrame (viewpoint = "FLC",
                    condition  = factor(rep(c("condition_1","condition_2"),
                                         each=2),
                                     levels = c("condition_1","condition_2")),
                    replicate = rep(c(1,2), 2),
                    bamFile = c("INDEX_1_3_2misTAIR10sorted.bam",
                                "INDEX_1_4_2misTAIR10sorted.bam",
                                "INDEX_1_5_2misTAIR10sorted.bam",
                                "INDEX_1_6_2misTAIR10sorted.bam"),
                    sequencingPrimer="first")

colData
fcar <- FourC(colData, exptData)
fcar
fcar<-addFragments(fcar, minSize = 20, filter = TRUE, save =TRUE )
fcar
rowRanges(fcar)
findViewpointFragments(fcar)

fcar <- addViewpointFrags(fcar)
fcar <- countFragmentOverlaps(fcar, trim=6, minMapq=-1, shift=0)
fcar
fcar <- combineFragEnds(fcar)
fcar
assays(fcar)
head(assay(fcar, "counts"))
FourCseq • 2.3k views
ADD COMMENT
0
Entering edit mode

The files in your example (Arabidopsis_thaliana.TAIR10.24.dna.genome.fa, primerAGATCT.fa) are not part of the FourCSeq package, and I'm a bit confused how they ended up in the data directories of the package. If you can specify where this data is coming from, it will be easier to reproduce your issue.

ADD REPLY
0
Entering edit mode
felix.klein ▴ 150
@felixklein-6900
Last seen 6.4 years ago
Germany
Hello, could you please specifiy against which genome you aligned the sequencing data. Because this should be the input to FourCSeq. Unfortunately I am not familiar with Arabidopsis, so I do not know what the reference genomes are. One reason for the 0 count overlaps could be that the settings in trimming the reads are not correct. Did you try to change those? In order to check that this is correct it would be great if you could show the head of one bam file that you are using as input. Best regards, Felix
ADD COMMENT
0
Entering edit mode
dzis • 0
@dzis-8446
Last seen 9.0 years ago
Poland

Hello,

Here is the head of one bam file

@HD     VN:1.0  SO:coordinate
@SQ     SN:Chr1 LN:30427671
@SQ     SN:Chr2 LN:19698289
@SQ     SN:Chr3 LN:23459830
@SQ     SN:Chr4 LN:18585056
@SQ     SN:Chr5 LN:26975502
@PG     ID:bowtie2      PN:bowtie2      VN:2.1.0
FCD2DFWACXX:5:1101:2545:2109#0/1        0       Chr5    3178597 255     49M     *       0       0       TCCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       PPPS\`c`egcgfhfhffghdhfggghhhhYebebdgffedhfhccegh       AS:i:-6 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:0A48       YT:Z:UU
FCD2DFWACXX:5:1101:3593:2236#0/1        0       Chr5    3178597 255     49M     *       0       0       ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       PPPSSQQ`bca`^dggf`hffhdgd]ffeffh^fcgb^cgfhfROY^a]       AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:49 YT:Z:UU
FCD2DFWACXX:5:1101:5905:2186#0/1        0       Chr5    3178597 255     49M     *       0       0       ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       bbbeeeeeeegggihhiiiiihihiR`_QYHPPPXcffcdf]]e^^fgh       AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:49 YT:Z:UU
FCD2DFWACXX:5:1101:11395:2216#0/1       0       Chr5    3178597 255     49M     *       0       0       ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       ^_aS`cccgggggiiiiiiiiiiihiiigghifggiiiihiihihiiii       AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:49 YT:Z:UU
FCD2DFWACXX:5:1101:12558:2188#0/1       0       Chr5    3178597 255     49M     *       0       0       ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       a_bcceccgggggiiiiiihhiihhhghffghegfhiiiiiiiidhhhh       AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:49 YT:Z:UU
FCD2DFWACXX:5:1101:17853:2237#0/1       0       Chr5    3178597 255     49M     *       0       0       GCCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       JY\^ccccgeggghhhhhhhhhhhhhhhghhhgghhhhhhhhhhhhhhh       AS:i:-6 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:0A48       YT:Z:UU
FCD2DFWACXX:5:1101:1754:2374#0/1        0       Chr5    3178597 255     49M     *       0       0       ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       aa_S\acceggggiiiiiiiiiiiihiihghieghiiiihiiii]^egh       AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:49 YT:Z:UU
FCD2DFWACXX:5:1101:3029:2484#0/1        0       Chr5    3178597 255     49M     *       0       0       ACTCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       JYPSaRaa`cacchhdhcececcY`ecdbedbdhhhhhhc_cdc[ccch       AS:i:-6 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:2C46       YT:Z:UU
FCD2DFWACXX:5:1101:3545:2484#0/1        0       Chr5    3178597 255     49M     *       0       0       ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC       ___cceeegfcgefhhiiihifdghhhhhffffgcdcfhbfhd]^aefd       AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:49 YT:Z:UU

My primer sequence is also here :

>FLC
ACCCCATAGCAACTCTATAGATCT

And the viewpoint is the FLC gene of arabidopsis thaliana with coordinates :

colData(fcar)$chr="Chr5"
colData(fcar)$start=3173504
colData(fcar)$end=3178620

I did the aligment of sequencing data with Arabidopsis_thaliana.TAIR10.24.dna.genome.fa and the head of this file is like that :

>Chr5 dna:chromosome chromosome:TAIR10:5:1:26975502:1
TATACCATGTACCCTCAACCTTAAAACCCTAAAACCTATACTATAAATCTTTAAAACCTA
TACTCTAAACCATAGGGTTTGTGAGTTTGCATAAAGTGTCACGTATAAGTGTTTCTAACA
TGTGAGTTTGCATAAGAGTCTCGACTATGTGTTTGTTCAAAAGTGACGTAAGTGTTTAGA
CTAGAGCCGGCCGTGAGCACAAGCGGGCCAAGCCCATGCTTGCGGCAGATTATCCTATAA
TTATGTTTTGCGGCTTTACAATTTTGAATTTGTTTTGTTGTTTAGTGGTCGAGTCAGGGA
TGAGTTTGTTCTCAAACATCTCAAATTTCTAATCTTCCAAGTGTTAGGTTCACTCTACCT
TATTTTTTTTTTTTTTTTTTATTTTAAATTTCTATTAGAAAGAACTTGAACCTTATTTCA
AGTGAACATTAAAAAGAATTGTAACCCTACCTTATTTCTATTATTATGAATATGTGTAAA
GTAAATTTTTTTACAATAAGCATGTATAAAGTAACATATTTTTGTTCTTTTCTTATAGCC
TTGTAAATCATAGGGACGGATCTAGTTTAGAATAAAGTGTTGCATAAGTTCTTCGACTAA
CATGTGAGTCCATTTTTGTGAGTTTTGCATAACTTTTGCGCCTTCGACAATTTGAATTTT
ATGCATAAGTGCTTTGACTAATTTATATGTTTTGTCTAAGTGATTCGACATTTCGACTAA
GTGATGTGAATTGTGTCATTCGTCGTCGTTGTCGCATCCGGAAGCGTGTTTGTGGTTCTT
GTGTCTTACTTGCCGGGTTATGAAGTCCTTTGTTTCGAGAGATCTGCTCTCATCATGATA
GAAGATGGTAGATGTTATCTTTGGATTGCCGCGGGTTGAATTGTCGCGGGCATCCTTCTT
AAATCTCATTCACATGTTCTTTTGAGGAAGAGTTTTTTTTGTTTTTTATTTATTTATTGT
TTCCCTTAAGATTTATACTTTATTCCTTCTATTTTTTGAGGACTTTTATTTTGTCTGTTG
GCTTTAGTTTTCCTTGTTTGGACATTTGTATTTTTGCGTCTTGTAACAACTATCCTTGTG
AAAATATCAATATAAGAACAACCCTCCTCATTTTAATTCCTTCTTGTCTACTTAGTTTAA
TATTTTCCAGCCGCAATGGGCCCATTAGCATCAACACCGGCCTATTTAGACGGCCCGTTA
TCTCCTCTTTGCCAATTTTCACCTTCTGCAATGATATTGAAATTGAAGTAAATGCAAACA
AAATAGTATGTTCCTGGTAGCAGATATTTAAGTTCCATTGATGTATTTCAAAGGGTTCAG
AGTTTTATCGTTTTACAAAGAAATGATGAGTGTTCATGACTGTAAAATCCACCTTCATCT
TCCACTTTCAGTTTAACGGCTCCGGCTCTGGATCCGGCTCCACTACTGGTGCTATGCAAT
CCACCTTATATCGAAGAGTCTTTCACAGAGGAAAAGAAACGGCCCAATGCAAAGATCCAA
ATTAATGAACGCAATGGAAAAATGGGTTACTGATTTACTCCGTCTACCCCAAAATGTAAA
TATCGCCCTACCTTTTTGAAGAAAGTTGAATATGAATTATCCCAAACAAGTTTGTAGTTT
CCAGCCATTAGCGTAGAGAAGTTGCCCTGCAGGTATAACCACAGTGGATCCCCAAAACGA
TTACCTTCATAACAACAACAAAAAATACCCAAAAGCTTCCCCACAAATGCACATGGGCAC
TGACCTGGTCAGCTTCATACCGACGGTAAGGTAATATCAGCTGAAATACGAAAAATTCAT
GATCAACCCTCAGAAATACACTATCTGGATGAATATTTGATAGAGTACGAGATTATATAT
ACTATGCGAAAGTAAAGTTTAACCAGAGTTTGGAAATTAATGAGCAGACTCACAGTCTTT
TCCCCTGAAGCATTTATGTACTCCACGCTAAATCCGATATCCTGGTAAAAGACAACAGAA
ATATAAACGGAGGACAAACAACAGAAGCTGAGAACCTAACTTAGATCCCGTATCATGTCA
AGAAATAACCATTCTTGTCCTTTTACATAAAAGATTCATCACAGACCAAGATAACGCTGT
ATACTATGTATGAAAACAAAGATAACCAGAGTCTGGCAAAAATAGCAAAAGATACAACAA
TAAAAATCTAATATACGGCAGCTAGAGAGAGTGTGATGTTCTCGACATGCAATCCAGAGA
CCAAAAACTAAGTTTCATGAATTCTTGAACTCTACTCATAAGAGAACAAACTAGACACCC
CTGTGAAGAGGGAAAGAACCCCAATACCATGCTTGGGAAACATATTGCAGGTTTAAAAAT
CAATACCATGATGGGGAAACATATCGCGAGTTCAATCATCCGAAAAAGCAATCAAATGAT
TCAAGGTTACGTATTGGGGTTTAAAAATTATTACCATGCTTATCTTGCCTTGCATCAAAG
AGAAATCCCAAGCAATATACGAATTTTCAGATTCCACCATCAATGAGATCTAGATAAACA
AAAAGAACATGCATCAATACTTACGCCAGACTCAAACGAAAGTGCTTATCATATGCTTTC
GTTTTTCTGCCAGCAGAAAATCTGAATGCTACATACTACAGTCATCAACATCACTTTTCA
CGTTTTTGAACATGACAATTTCAGATGGAACAAGTGTCAAGAGTCTTAGAATGGCAATAA
GTTTCAGTTTCATTAATAACCAAAAGCATCTACGGAGAACATATTACCTCATAGGTTCTC
CCAGCTTGAACGATTATCTCCTTGTACTCCTCATTGTCCCTAAGAAGTATAAACACTTAG
AGCTAGAAATCGTGAATTGAAGTGAGAATCTTGTGAATGATGGAAGCTAACCTTGACCAC
AATATGCCAGCTGTATCAGCTGAAATAAAGAAGGGGGAATTGGCGAATACATCGCTGAAA
AAGTAAAATGTTAATGACCCTCCATAGTGAGAAAGCAATACCTAAAGACGTCCAGCATAA
TTAAGGTTGGAACTGGGGATATAAATTTTAATCAAAACCGAATAAAGCTGAAAATTAACA
CAAAAATAATGAATTCGAGATAATGTCTAACACTTTATTAGATTTGTCTCTAATAACGCA
TGCCTTTGCAATTTGCATACTAGGAAAACTAATAAAAGAATACAATGGAGAACAAACCTT
AAAATAGACATAAACGTAATTGTCCTTAACTTTCCAAGGAATGTTTCAAAAGTTAAGTAC
ATCATATACTCAAAAGAAAATAAAGCAACCCACGACCAAGCCCATAGAAAGAGAGAAAGT
ACCTGATAAATGTGACGAATCTTTCAAACTCAGAAGTGTAGACTATAGCAGCTTCAGACA

Like all the fasta referench genome files ! I also tried with other versions but always i am using the same referench for mapping and for the FourCseq tool.

I tried to change the trimming settings from here :

fcar <- countFragmentOverlaps(fcar, trim=24, minMapq=0, shift=6) and only at that case i have one only peak but again all 0 counts. Is it possible that my data are problematic ???

Thank you for your quick reply.

Best Regards

Dimitris Zisis

 

ADD COMMENT
0
Entering edit mode
Hello, normally I would assume, that you remove the primers befor you align to the reference genome. Because the captured sequence just starts at the restriction fragment. Do your files only map to the one position listed here? Or are they mapped throughout the genome? Check this in a genome browser becaus from what I see I would assume that it is all piled up at one position as you have observed right now and there are a lot of unaligned reads in the library (check this eg. with samtools stat). Anyway, if you want to use FourCSeq countFragmentOverlaps, you have to remove the complete primer. Which means you have to set trimm to length( ACCCCATAGCAACTCTATAGATCT)=24 as you did and shift could be set to 0. Best regards, Felix On 07/23/2015 11:02 AM, dzis [bioc] wrote: > Activity on a post you are following on support.bioconductor.org > <https: support.bioconductor.org=""> > > User dzis <https: support.bioconductor.org="" u="" 8446=""/> wrote Answer: > FourCseq -problem with countFragmentOverlaps > <https: support.bioconductor.org="" p="" 70236="" #70278="">: > > Hello, > > Here is the head of one bam file > > @HD VN:1.0 SO:coordinate > @SQ SN:Chr1 LN:30427671 > @SQ SN:Chr2 LN:19698289 > @SQ SN:Chr3 LN:23459830 > @SQ SN:Chr4 LN:18585056 > @SQ SN:Chr5 LN:26975502 > @PG ID:bowtie2 PN:bowtie2 VN:2.1.0 > FCD2DFWACXX:5:1101:2545:2109#0/1 0 Chr5 3178597 255 49M * 0 0 TCCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC PPPS\`c`egcgfhfhffghdhfggghhhhYebebdgffedhfhccegh AS:i:-6 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:0A48 YT:Z:UU > FCD2DFWACXX:5:1101:3593:2236#0/1 0 Chr5 3178597 255 49M * 0 0 ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC PPPSSQQ`bca`^dggf`hffhdgd]ffeffh^fcgb^cgfhfROY^a] AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:49 YT:Z:UU > FCD2DFWACXX:5:1101:5905:2186#0/1 0 Chr5 3178597 255 49M * 0 0 ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC bbbeeeeeeegggihhiiiiihihiR`_QYHPPPXcffcdf]]e^^fgh AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:49 YT:Z:UU > FCD2DFWACXX:5:1101:11395:2216#0/1 0 Chr5 3178597 255 49M * 0 0 ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC ^_aS`cccgggggiiiiiiiiiiihiiigghifggiiiihiihihiiii AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:49 YT:Z:UU > FCD2DFWACXX:5:1101:12558:2188#0/1 0 Chr5 3178597 255 49M * 0 0 ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC a_bcceccgggggiiiiiihhiihhhghffghegfhiiiiiiiidhhhh AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:49 YT:Z:UU > FCD2DFWACXX:5:1101:17853:2237#0/1 0 Chr5 3178597 255 49M * 0 0 GCCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC JY\^ccccgeggghhhhhhhhhhhhhhhghhhgghhhhhhhhhhhhhhh AS:i:-6 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:0A48 YT:Z:UU > FCD2DFWACXX:5:1101:1754:2374#0/1 0 Chr5 3178597 255 49M * 0 0 ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC aa_S\acceggggiiiiiiiiiiiihiihghieghiiiihiiii]^egh AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:49 YT:Z:UU > FCD2DFWACXX:5:1101:3029:2484#0/1 0 Chr5 3178597 255 49M * 0 0 ACTCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC JYPSaRaa`cacchhdhcececcY`ecdbedbdhhhhhhc_cdc[ccch AS:i:-6 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:2C46 YT:Z:UU > FCD2DFWACXX:5:1101:3545:2484#0/1 0 Chr5 3178597 255 49M * 0 0 ACCCCATAGCAACTCTATAGATCTCCCGTAAGTGCATTGCATACAAATC ___cceeegfcgefhhiiihifdghhhhhffffgcdcfhbfhd]^aefd AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:49 YT:Z:UU > > My primer sequence is also here : > > >FLC > ACCCCATAGCAACTCTATAGATCT > > And the viewpoint is the FLC gene of arabidopsis thaliana with > coordinates : > > colData(fcar)$chr="Chr5" > colData(fcar)$start=3173504 > colData(fcar)$end=3178620 > > I did the aligment of sequencing data with > Arabidopsis_thaliana.TAIR10.24.dna.genome.fa and the head of this file > is like that : > > >Chr5 dna:chromosome chromosome:TAIR10:5:1:26975502:1 > TATACCATGTACCCTCAACCTTAAAACCCTAAAACCTATACTATAAATCTTTAAAACCTA > TACTCTAAACCATAGGGTTTGTGAGTTTGCATAAAGTGTCACGTATAAGTGTTTCTAACA > TGTGAGTTTGCATAAGAGTCTCGACTATGTGTTTGTTCAAAAGTGACGTAAGTGTTTAGA > CTAGAGCCGGCCGTGAGCACAAGCGGGCCAAGCCCATGCTTGCGGCAGATTATCCTATAA > TTATGTTTTGCGGCTTTACAATTTTGAATTTGTTTTGTTGTTTAGTGGTCGAGTCAGGGA > TGAGTTTGTTCTCAAACATCTCAAATTTCTAATCTTCCAAGTGTTAGGTTCACTCTACCT > TATTTTTTTTTTTTTTTTTTATTTTAAATTTCTATTAGAAAGAACTTGAACCTTATTTCA > AGTGAACATTAAAAAGAATTGTAACCCTACCTTATTTCTATTATTATGAATATGTGTAAA > GTAAATTTTTTTACAATAAGCATGTATAAAGTAACATATTTTTGTTCTTTTCTTATAGCC > TTGTAAATCATAGGGACGGATCTAGTTTAGAATAAAGTGTTGCATAAGTTCTTCGACTAA > CATGTGAGTCCATTTTTGTGAGTTTTGCATAACTTTTGCGCCTTCGACAATTTGAATTTT > ATGCATAAGTGCTTTGACTAATTTATATGTTTTGTCTAAGTGATTCGACATTTCGACTAA > GTGATGTGAATTGTGTCATTCGTCGTCGTTGTCGCATCCGGAAGCGTGTTTGTGGTTCTT > GTGTCTTACTTGCCGGGTTATGAAGTCCTTTGTTTCGAGAGATCTGCTCTCATCATGATA > GAAGATGGTAGATGTTATCTTTGGATTGCCGCGGGTTGAATTGTCGCGGGCATCCTTCTT > AAATCTCATTCACATGTTCTTTTGAGGAAGAGTTTTTTTTGTTTTTTATTTATTTATTGT > TTCCCTTAAGATTTATACTTTATTCCTTCTATTTTTTGAGGACTTTTATTTTGTCTGTTG > GCTTTAGTTTTCCTTGTTTGGACATTTGTATTTTTGCGTCTTGTAACAACTATCCTTGTG > AAAATATCAATATAAGAACAACCCTCCTCATTTTAATTCCTTCTTGTCTACTTAGTTTAA > TATTTTCCAGCCGCAATGGGCCCATTAGCATCAACACCGGCCTATTTAGACGGCCCGTTA > TCTCCTCTTTGCCAATTTTCACCTTCTGCAATGATATTGAAATTGAAGTAAATGCAAACA > AAATAGTATGTTCCTGGTAGCAGATATTTAAGTTCCATTGATGTATTTCAAAGGGTTCAG > AGTTTTATCGTTTTACAAAGAAATGATGAGTGTTCATGACTGTAAAATCCACCTTCATCT > TCCACTTTCAGTTTAACGGCTCCGGCTCTGGATCCGGCTCCACTACTGGTGCTATGCAAT > CCACCTTATATCGAAGAGTCTTTCACAGAGGAAAAGAAACGGCCCAATGCAAAGATCCAA > ATTAATGAACGCAATGGAAAAATGGGTTACTGATTTACTCCGTCTACCCCAAAATGTAAA > TATCGCCCTACCTTTTTGAAGAAAGTTGAATATGAATTATCCCAAACAAGTTTGTAGTTT > CCAGCCATTAGCGTAGAGAAGTTGCCCTGCAGGTATAACCACAGTGGATCCCCAAAACGA > TTACCTTCATAACAACAACAAAAAATACCCAAAAGCTTCCCCACAAATGCACATGGGCAC > TGACCTGGTCAGCTTCATACCGACGGTAAGGTAATATCAGCTGAAATACGAAAAATTCAT > GATCAACCCTCAGAAATACACTATCTGGATGAATATTTGATAGAGTACGAGATTATATAT > ACTATGCGAAAGTAAAGTTTAACCAGAGTTTGGAAATTAATGAGCAGACTCACAGTCTTT > TCCCCTGAAGCATTTATGTACTCCACGCTAAATCCGATATCCTGGTAAAAGACAACAGAA > ATATAAACGGAGGACAAACAACAGAAGCTGAGAACCTAACTTAGATCCCGTATCATGTCA > AGAAATAACCATTCTTGTCCTTTTACATAAAAGATTCATCACAGACCAAGATAACGCTGT > ATACTATGTATGAAAACAAAGATAACCAGAGTCTGGCAAAAATAGCAAAAGATACAACAA > TAAAAATCTAATATACGGCAGCTAGAGAGAGTGTGATGTTCTCGACATGCAATCCAGAGA > CCAAAAACTAAGTTTCATGAATTCTTGAACTCTACTCATAAGAGAACAAACTAGACACCC > CTGTGAAGAGGGAAAGAACCCCAATACCATGCTTGGGAAACATATTGCAGGTTTAAAAAT > CAATACCATGATGGGGAAACATATCGCGAGTTCAATCATCCGAAAAAGCAATCAAATGAT > TCAAGGTTACGTATTGGGGTTTAAAAATTATTACCATGCTTATCTTGCCTTGCATCAAAG > AGAAATCCCAAGCAATATACGAATTTTCAGATTCCACCATCAATGAGATCTAGATAAACA > AAAAGAACATGCATCAATACTTACGCCAGACTCAAACGAAAGTGCTTATCATATGCTTTC > GTTTTTCTGCCAGCAGAAAATCTGAATGCTACATACTACAGTCATCAACATCACTTTTCA > CGTTTTTGAACATGACAATTTCAGATGGAACAAGTGTCAAGAGTCTTAGAATGGCAATAA > GTTTCAGTTTCATTAATAACCAAAAGCATCTACGGAGAACATATTACCTCATAGGTTCTC > CCAGCTTGAACGATTATCTCCTTGTACTCCTCATTGTCCCTAAGAAGTATAAACACTTAG > AGCTAGAAATCGTGAATTGAAGTGAGAATCTTGTGAATGATGGAAGCTAACCTTGACCAC > AATATGCCAGCTGTATCAGCTGAAATAAAGAAGGGGGAATTGGCGAATACATCGCTGAAA > AAGTAAAATGTTAATGACCCTCCATAGTGAGAAAGCAATACCTAAAGACGTCCAGCATAA > TTAAGGTTGGAACTGGGGATATAAATTTTAATCAAAACCGAATAAAGCTGAAAATTAACA > CAAAAATAATGAATTCGAGATAATGTCTAACACTTTATTAGATTTGTCTCTAATAACGCA > TGCCTTTGCAATTTGCATACTAGGAAAACTAATAAAAGAATACAATGGAGAACAAACCTT > AAAATAGACATAAACGTAATTGTCCTTAACTTTCCAAGGAATGTTTCAAAAGTTAAGTAC > ATCATATACTCAAAAGAAAATAAAGCAACCCACGACCAAGCCCATAGAAAGAGAGAAAGT > ACCTGATAAATGTGACGAATCTTTCAAACTCAGAAGTGTAGACTATAGCAGCTTCAGACA > > Like all the fasta referench genome files ! I also tried with other > versions but always i am using the same referench for mapping and for > the FourCseq tool. > > I tried to change the trimming settings from here : > > fcar <- countFragmentOverlaps(fcar, trim=24, minMapq=0, shift=6) and only at that case i have one only peak but again all 0 counts. Is it possible that my data are problematic ??? > > Thank you for your quick reply. > > Best Regards > > Dimitris Zisis > > ------------------------------------------------------------------------ > > Post tags: FourCseq > > You may reply via email or visit > A: FourCseq -problem with countFragmentOverlaps >
ADD REPLY
0
Entering edit mode
dzis • 0
@dzis-8446
Last seen 9.0 years ago
Poland

Hello,

Actually what i did here is a 2nd way of testing my data.I just did a different alligment of all read without treatment before in irder to check if it works.

At the beginning i did it as you said. the alignment is done by removing  the primers. But when i used also those bam files to the FourCseq tool the result was again the same.i have this warning mesage :

reading bam files
calculating overlaps
Warning messages:
1: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
2: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
3: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
4: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
5: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
6: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
7: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
8: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)

 

 

my previous bam files were like that

0       TTGACGATGATGATGACTCAACTCGAGATCT fhfghhgbc_b^^^gd`bId`fffe]gcgdc AS:i:0  
XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:31 YT:Z:UU
301532  16      Chr1:13125-13175        20      255     31M     *       0       
0       TTGACGATGATGATGACTCAACTCGAGATCT ihiiihiiiiiihgffhgbgfiiiihiiiih AS:i:0  
XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:31 YT:Z:UU
1750998 16      Chr1:13125-13175        20      255     31M     *       0       
0       TTGACGATGATGATGACTCAACTCGAGATCT dhfiiihhigehhgfbd[hgegd_eXihfhi AS:i:0  
XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:31 YT:Z:UU
5968892 16      Chr1:13125-13175        20      255     31M     *       0       0       TTGACGATGATGATGACTCAACTCGAGATCT iihiiiiiiiiiihhgiihhgihiihihigi AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:31 YT:Z:UU
7768220 16      Chr1:13125-13175        20      255     31M     *       0       0       TTGACGATGATGATGACTCAACTCGAGATCT heiihiiiihffhgfdhhhgghhhhhhiiii AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:31 YT:Z:UU
66822   0       Chr1:13169-13219        1       255     31M     *       0       0       AGATCTTCAACATACATTAGAGAGATAATAT ihhighiiiihhiiihiiiigiiiiiiiiii AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:31 YT:Z:UU
432535  0       Chr1:13169-13219        1       255     31M     *       0       0       AGATCTTCAACATACATTAGAGAGATAATAT iiiiiihihihiiiiiiiiiiihhhhhfhih AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:31 YT:Z:UU
441231  0       Chr1:13169-13219        1       255     31M     *       0       0       AGATCTTCAACATACATTAGAGAGATAATAT iiiiiiiiiiiiiiiiiiiiiihihiiiiii AS:i:0  :

so at that case i suppose that i have to set trimm to 6 and shift=0 as i did but also the result is 0 counts all the time.

Best

Dimitris

 

ADD COMMENT
0
Entering edit mode
Hello Dimitris, in the fragment folder in the 4c project you can find 2 bed files. These are the Positions of the restriction fragments found in the provided reference genome. I would recommend that you have a look at the position of these fragments together with your bam files (primer trimmed) in a genome browser. There you can get a good feeling if your library worked. You should see a peak at the viepoint that sharply falls off to the sides. In the browser you can check the direction and location of the restriction sites and adjust the parameters for counting accordingly. One thing I have not mentioned yet. If you sequencing is starting from the second primer you have to use the countFragementOverlapsSecondCutter function. Best regards, Felix
ADD REPLY
0
Entering edit mode
@herve-pages-1542
Last seen 3 days ago
Seattle, WA, United States

Hi Dimitris,

countFragmentOverlaps() is calling countOverlaps() internally (twice per BAM file), and I think the 8 warnings you get are coming from these internal calls to countOverlaps(). These warnings are saying that the query and subject passed to countOverlaps() have no chromosome names (a.k.a. sequence levels) in common.

Can you please show us the output of seqlevels(rowRanges(fcar)) ? rowRanges(fcar) contains the fragment ranges and is passed to countOverlaps() via the query argument. Its seqlevels (or at least some of them) should be the same as those of your BAM files, which are Chr1, Chr2, etc... If not, then no overlap can be found between the fragments and the reads in your BAM files so countOverlaps() returns 0.

H.

ADD COMMENT

Login before adding your answer.

Traffic: 667 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6