DEXSeq for 1 gene
2
0
Entering edit mode
@mallon-eamonn-b-dr-5947
Last seen 10.4 years ago
Dear All, I have RNA-seq data for 20 bumblebee samples divided into treatment and control. I am interested in differential exon usage. Unfortunately the bumblebee genome is not at the stage where I can do a complete DEXSeq analysis genome wide. I decided to look at a single gene where I have a decent gene model. Using TopHat2 and the python scripts included in DEXSeq I was able to produce an ExonCountSet object. >head(counts(ecs)) K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 XLOC_000001:E001 1 0 1 0 1 0 0 0 0 XLOC_000001:E002 1 0 0 0 2 1 0 1 1 XLOC_000001:E003 0 0 0 0 3 0 0 0 0 XLOC_000001:E004 1 0 1 0 1 0 0 1 0 XLOC_000001:E005 1 3 3 2 3 0 0 0 0 XLOC_000001:E006 0 0 0 0 0 0 0 0 0 As I expected, its all come crashing down round my ears at the analysis stage > sizeFactors(ecs) K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 NA NA NA NA NA NA NA NA NA I tried using the shorth value > ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((counts(ecs )),tie.action="min"))) Error in .local(object, ...) : unused argument (0.2) I understand that my main problem is that the DEXSeq analysis should be genome wide (my data has too few counts). Is there a way to use DEXSeq to ONLY look at a single gene? If not can anyone think of a statistical way to analyse my data (cross tabs? How should I normalise?) Thanks for your time Eamonn > sessionInfo() R version 3.0.0 (2013-04-03) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] genefilter_1.42.0 DEXSeq_1.6.0 Biobase_2.20.0 BiocGenerics_0.6.0 loaded via a namespace (and not attached): [1] annotate_1.38.0 AnnotationDbi_1.22.5 biomaRt_2.16.0 Biostrings_2.28.0 bitops_1.0-5 DBI_0.2-7 [7] GenomicRanges_1.12.3 hwriter_1.3 IRanges_1.18.1 RCurl_1.95-4.1 Rsamtools_1.12.3 RSQLite_0.11.3 [13] splines_3.0.0 statmod_1.4.17 stats4_3.0.0 stringr_0.6.2 survival_2.37-4 tools_3.0.0 [19] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0 Dr Eamonn Mallon Lecturer in Evolutionary Biology Adrian 220 Biology Department University of Leicester http://www2.le.ac.uk/departments/biology/people/mallon [[alternative HTML version deleted]]
DEXSeq DEXSeq • 1.8k views
ADD COMMENT
0
Entering edit mode
Alejandro Reyes ★ 1.9k
@alejandro-reyes-5124
Last seen 6 months ago
Novartis Institutes for BioMedical Rese…
Dear Eamonn, Thanks for your interest in DEXSeq. In your size factor estimation, where did you get your second parameter from? (check ?estimateSizeFactors) There is no need to specify anything else than the ExonCountSet object and that is the reason you are getting the error message. If you are only interested in a single gene maybe you could visualize directly the counts per exon using the function plotDEXSeq. Best regards, Alejandro Reyes > Dear All, > I have RNA-seq data for 20 bumblebee samples divided into treatment and control. I am interested in differential exon usage. Unfortunately the bumblebee genome is not at the stage where I can do a complete DEXSeq analysis genome wide. I decided to look at a single gene where I have a decent gene model. Using TopHat2 and the python scripts included in DEXSeq I was able to produce an ExonCountSet object. > >> head(counts(ecs)) > K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 > XLOC_000001:E001 1 0 1 0 1 0 0 0 0 > XLOC_000001:E002 1 0 0 0 2 1 0 1 1 > XLOC_000001:E003 0 0 0 0 3 0 0 0 0 > XLOC_000001:E004 1 0 1 0 1 0 0 1 0 > XLOC_000001:E005 1 3 3 2 3 0 0 0 0 > XLOC_000001:E006 0 0 0 0 0 0 0 0 0 > > As I expected, its all come crashing down round my ears at the analysis stage > >> sizeFactors(ecs) > K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 > NA NA NA NA NA NA NA NA NA > > I tried using the shorth value > >> ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((counts(ec s)),tie.action="min"))) > Error in .local(object, ...) : unused argument (0.2) > > I understand that my main problem is that the DEXSeq analysis should be genome wide (my data has too few counts). > > Is there a way to use DEXSeq to ONLY look at a single gene? If not can anyone think of a statistical way to analyse my data (cross tabs? How should I normalise?) > > Thanks for your time > > Eamonn > > >> sessionInfo() > R version 3.0.0 (2013-04-03) > Platform: x86_64-apple-darwin10.8.0 (64-bit) > > locale: > [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods base > > other attached packages: > [1] genefilter_1.42.0 DEXSeq_1.6.0 Biobase_2.20.0 BiocGenerics_0.6.0 > > loaded via a namespace (and not attached): > [1] annotate_1.38.0 AnnotationDbi_1.22.5 biomaRt_2.16.0 Biostrings_2.28.0 bitops_1.0-5 DBI_0.2-7 > [7] GenomicRanges_1.12.3 hwriter_1.3 IRanges_1.18.1 RCurl_1.95-4.1 Rsamtools_1.12.3 RSQLite_0.11.3 > [13] splines_3.0.0 statmod_1.4.17 stats4_3.0.0 stringr_0.6.2 survival_2.37-4 tools_3.0.0 > [19] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0 > Dr Eamonn Mallon > Lecturer in Evolutionary Biology > Adrian 220 > Biology Department > University of Leicester > > http://www2.le.ac.uk/departments/biology/people/mallon > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
Hi Alejandro, Thanks very much for getting back to me. I think I got the second parameter idea from DESeq. I see I am wrong. so I ran > ecs<-estimateSizeFactors(ecs) > sizeFactors(ecs) K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 NA NA NA NA NA NA NA NA NA Is there any way to get DEXSeq to estimate size factors with my small dataset? If you are only interested in a single gene maybe you could visualize directly the counts per exon using the function plotDEXSeq. I'm not sure how to do this I tried >plotDEXSeq(ecs,"XLOC_000001", fitExpToVar="condition", norCounts=FALSE, expression=FALSE, splicing=FALSE, displayTranscripts=FALSE, legend=TRUE) Error in plotDEXSeq(ecs, "XLOC_000001", fitExpToVar = "condition", norCounts = FALSE, : Please estimate sizeFactors first I guess this is because modelFrameForGene requires sizefactors. Any ideas? Eamonn On 22/05/13 18:07, Alejandro Reyes wrote: > Dear Eamonn, > > Thanks for your interest in DEXSeq. > > In your size factor estimation, where did you get your second parameter > from? (check ?estimateSizeFactors) There is no need to specify anything > else than the ExonCountSet object and that is the reason you are getting > the error message. > > If you are only interested in a single gene maybe you could visualize > directly the counts per exon using the function plotDEXSeq. > > Best regards, > Alejandro Reyes > > >> Dear All, >> I have RNA-seq data for 20 bumblebee samples divided into treatment and control. I am interested in differential exon usage. Unfortunately the bumblebee genome is not at the stage where I can do a complete DEXSeq analysis genome wide. I decided to look at a single gene where I have a decent gene model. Using TopHat2 and the python scripts included in DEXSeq I was able to produce an ExonCountSet object. >> >>> head(counts(ecs)) >> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >> XLOC_000001:E001 1 0 1 0 1 0 0 0 0 >> XLOC_000001:E002 1 0 0 0 2 1 0 1 1 >> XLOC_000001:E003 0 0 0 0 3 0 0 0 0 >> XLOC_000001:E004 1 0 1 0 1 0 0 1 0 >> XLOC_000001:E005 1 3 3 2 3 0 0 0 0 >> XLOC_000001:E006 0 0 0 0 0 0 0 0 0 >> >> As I expected, its all come crashing down round my ears at the analysis stage >> >>> sizeFactors(ecs) >> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >> NA NA NA NA NA NA NA NA NA >> >> I tried using the shorth value >> >>> ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((counts(e cs)),tie.action="min"))) >> Error in .local(object, ...) : unused argument (0.2) >> >> I understand that my main problem is that the DEXSeq analysis should be genome wide (my data has too few counts). >> >> Is there a way to use DEXSeq to ONLY look at a single gene? If not can anyone think of a statistical way to analyse my data (cross tabs? How should I normalise?) >> >> Thanks for your time >> >> Eamonn >> >> >>> sessionInfo() >> R version 3.0.0 (2013-04-03) >> Platform: x86_64-apple-darwin10.8.0 (64-bit) >> >> locale: >> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 >> >> attached base packages: >> [1] parallel stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] genefilter_1.42.0 DEXSeq_1.6.0 Biobase_2.20.0 BiocGenerics_0.6.0 >> >> loaded via a namespace (and not attached): >> [1] annotate_1.38.0 AnnotationDbi_1.22.5 biomaRt_2.16.0 Biostrings_2.28.0 bitops_1.0-5 DBI_0.2-7 >> [7] GenomicRanges_1.12.3 hwriter_1.3 IRanges_1.18.1 RCurl_1.95-4.1 Rsamtools_1.12.3 RSQLite_0.11.3 >> [13] splines_3.0.0 statmod_1.4.17 stats4_3.0.0 stringr_0.6.2 survival_2.37-4 tools_3.0.0 >> [19] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0 >> Dr Eamonn Mallon >> Lecturer in Evolutionary Biology >> Adrian 220 >> Biology Department >> University of Leicester >> >> http://www2.le.ac.uk/departments/biology/people/mallon >> >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
Dear Eamonn from your initial post -i.e. the output of head(counts(ecs))- it looks like your counts are very low? This would make estimateSizeFactors produce lots of NA values, but even if it did not, it would mean that you'd have essentially no statistical power to detect anything useful from such low counts. You mentioned "20 bumblebee samples" but the table whose head you show only has 9? Can you post the full output of 'counts(ecs)' and make sure its content is plausible based on your understanding of the data? Best wishes Wolfgang On May 23, 2013, at 2:03 pm, Eamonn Mallon <ebm3 at="" leicester.ac.uk=""> wrote: > Hi Alejandro, > Thanks very much for getting back to me. > > I think I got the second parameter idea from DESeq. I see I am wrong. so I ran > > > ecs<-estimateSizeFactors(ecs) > > sizeFactors(ecs) > K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 > NA NA NA NA NA NA NA NA NA > > Is there any way to get DEXSeq to estimate size factors with my small dataset? > > If you are only interested in a single gene maybe you could visualize > directly the counts per exon using the function plotDEXSeq. > > I'm not sure how to do this I tried > > >plotDEXSeq(ecs,"XLOC_000001", fitExpToVar="condition", norCounts=FALSE, expression=FALSE, splicing=FALSE, displayTranscripts=FALSE, legend=TRUE) > > Error in plotDEXSeq(ecs, "XLOC_000001", fitExpToVar = "condition", norCounts = FALSE, : > Please estimate sizeFactors first > > I guess this is because modelFrameForGene requires sizefactors. > > Any ideas? > > Eamonn > > > On 22/05/13 18:07, Alejandro Reyes wrote: >> Dear Eamonn, >> >> Thanks for your interest in DEXSeq. >> >> In your size factor estimation, where did you get your second parameter >> from? (check ?estimateSizeFactors) There is no need to specify anything >> else than the ExonCountSet object and that is the reason you are getting >> the error message. >> >> If you are only interested in a single gene maybe you could visualize >> directly the counts per exon using the function plotDEXSeq. >> >> Best regards, >> Alejandro Reyes >> >> >>> Dear All, >>> I have RNA-seq data for 20 bumblebee samples divided into treatment and control. I am interested in differential exon usage. Unfortunately the bumblebee genome is not at the stage where I can do a complete DEXSeq analysis genome wide. I decided to look at a single gene where I have a decent gene model. Using TopHat2 and the python scripts included in DEXSeq I was able to produce an ExonCountSet object. >>> >>>> head(counts(ecs)) >>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>> XLOC_000001:E001 1 0 1 0 1 0 0 0 0 >>> XLOC_000001:E002 1 0 0 0 2 1 0 1 1 >>> XLOC_000001:E003 0 0 0 0 3 0 0 0 0 >>> XLOC_000001:E004 1 0 1 0 1 0 0 1 0 >>> XLOC_000001:E005 1 3 3 2 3 0 0 0 0 >>> XLOC_000001:E006 0 0 0 0 0 0 0 0 0 >>> >>> As I expected, its all come crashing down round my ears at the analysis stage >>> >>>> sizeFactors(ecs) >>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>> NA NA NA NA NA NA NA NA NA >>> >>> I tried using the shorth value >>> >>>> ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((counts( ecs)),tie.action="min"))) >>> Error in .local(object, ...) : unused argument (0.2) >>> >>> I understand that my main problem is that the DEXSeq analysis should be genome wide (my data has too few counts). >>> >>> Is there a way to use DEXSeq to ONLY look at a single gene? If not can anyone think of a statistical way to analyse my data (cross tabs? How should I normalise?) >>> >>> Thanks for your time >>> >>> Eamonn >>> >>> >>>> sessionInfo() >>> R version 3.0.0 (2013-04-03) >>> Platform: x86_64-apple-darwin10.8.0 (64-bit) >>> >>> locale: >>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 >>> >>> attached base packages: >>> [1] parallel stats graphics grDevices utils datasets methods base >>> >>> other attached packages: >>> [1] genefilter_1.42.0 DEXSeq_1.6.0 Biobase_2.20.0 BiocGenerics_0.6.0 >>> >>> loaded via a namespace (and not attached): >>> [1] annotate_1.38.0 AnnotationDbi_1.22.5 biomaRt_2.16.0 Biostrings_2.28.0 bitops_1.0-5 DBI_0.2-7 >>> [7] GenomicRanges_1.12.3 hwriter_1.3 IRanges_1.18.1 RCurl_1.95-4.1 Rsamtools_1.12.3 RSQLite_0.11.3 >>> [13] splines_3.0.0 statmod_1.4.17 stats4_3.0.0 stringr_0.6.2 survival_2.37-4 tools_3.0.0 >>> [19] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0 >>> Dr Eamonn Mallon >>> Lecturer in Evolutionary Biology >>> Adrian 220 >>> Biology Department >>> University of Leicester >>> >>> http://www2.le.ac.uk/departments/biology/people/mallon >>> >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
Hi Wolfgang Although I have 20 samples, only 11 are under the conditions I'm interested in this analysis. Not sure why I mentioned 20 in the first place sorry. Of those 11, 2 are giving odd results from top hat so I haven't included them yet until I figure out what is wrong. So I have 9 to work with > counts(ecs) K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 XLOC_000001:E001 1 0 1 0 1 0 0 0 0 XLOC_000001:E002 1 0 0 0 2 1 0 1 1 XLOC_000001:E003 0 0 0 0 3 0 0 0 0 XLOC_000001:E004 1 0 1 0 1 0 0 1 0 XLOC_000001:E005 1 3 3 2 3 0 0 0 0 XLOC_000001:E006 0 0 0 0 0 0 0 0 0 XLOC_000001:E007 0 1 2 2 2 0 3 2 0 XLOC_000001:E008 2 1 2 0 0 2 0 1 0 XLOC_000001:E009 1 2 0 0 1 0 0 2 0 XLOC_000001:E010 4 0 1 1 0 0 0 1 0 XLOC_000001:E011 0 0 0 0 0 1 0 0 0 XLOC_000001:E012 0 0 1 1 0 2 0 2 0 XLOC_000001:E013 1 1 3 2 2 2 1 3 0 XLOC_000001:E014 3 3 1 2 2 1 3 0 1 XLOC_000001:E015 1 2 1 2 4 2 0 0 3 XLOC_000001:E016 3 5 2 2 7 1 0 5 0 XLOC_000001:E017 1 2 0 2 2 2 1 3 1 XLOC_000001:E018 0 4 3 1 6 3 2 2 1 XLOC_000001:E019 2 2 3 2 2 2 0 5 3 XLOC_000001:E020 3 3 2 0 3 1 2 4 2 XLOC_000001:E021 2 1 2 4 3 1 0 3 1 XLOC_000001:E022 2 0 1 4 3 1 0 2 1 XLOC_000001:E023 0 0 3 1 0 2 0 2 1 XLOC_000001:E024 3 1 0 1 0 0 0 1 1 XLOC_000001:E025 0 1 0 2 2 2 2 0 0 XLOC_000001:E026 0 0 2 2 1 2 0 1 0 XLOC_000001:E027 6 1 2 3 0 0 0 4 1 XLOC_000001:E028 7 2 3 3 0 4 1 5 0 XLOC_000001:E029 5 4 1 5 0 5 5 3 1 XLOC_000001:E030 0 1 1 0 3 0 0 0 0 XLOC_000001:E031 1 1 0 0 1 0 0 1 0 The low read count I put down to only looking at a single gene. Is my data a loss cost for the dexseq pipeline? Thanks for your help Eamonn Dr Eamonn Mallon Lecturer in Evolutionary Biology Adrian 220 Biology Department University of Leicester http://www2.le.ac.uk/departments/biology/people/mallon On 26/05/2013 13:12, "Wolfgang Huber" <whuber at="" embl.de=""> wrote: >Dear Eamonn > >from your initial post -i.e. the output of head(counts(ecs))- it looks >like your counts are very low? This would make estimateSizeFactors >produce lots of NA values, but even if it did not, it would mean that >you'd have essentially no statistical power to detect anything useful >from such low counts. > >You mentioned "20 bumblebee samples" but the table whose head you show >only has 9? > >Can you post the full output of 'counts(ecs)' and make sure its content >is plausible based on your understanding of the data? > > Best wishes > Wolfgang > > >On May 23, 2013, at 2:03 pm, Eamonn Mallon <ebm3 at="" leicester.ac.uk=""> wrote: > >> Hi Alejandro, >> Thanks very much for getting back to me. >> >> I think I got the second parameter idea from DESeq. I see I am wrong. >>so I ran >> >> > ecs<-estimateSizeFactors(ecs) >> > sizeFactors(ecs) >> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >> NA NA NA NA NA NA NA NA NA >> >> Is there any way to get DEXSeq to estimate size factors with my small >>dataset? >> >> If you are only interested in a single gene maybe you could visualize >> directly the counts per exon using the function plotDEXSeq. >> >> I'm not sure how to do this I tried >> >> >plotDEXSeq(ecs,"XLOC_000001", fitExpToVar="condition", >>norCounts=FALSE, expression=FALSE, splicing=FALSE, >>displayTranscripts=FALSE, legend=TRUE) >> >> Error in plotDEXSeq(ecs, "XLOC_000001", fitExpToVar = "condition", >>norCounts = FALSE, : >> Please estimate sizeFactors first >> >> I guess this is because modelFrameForGene requires sizefactors. >> >> Any ideas? >> >> Eamonn >> >> >> On 22/05/13 18:07, Alejandro Reyes wrote: >>> Dear Eamonn, >>> >>> Thanks for your interest in DEXSeq. >>> >>> In your size factor estimation, where did you get your second parameter >>> from? (check ?estimateSizeFactors) There is no need to specify anything >>> else than the ExonCountSet object and that is the reason you are >>>getting >>> the error message. >>> >>> If you are only interested in a single gene maybe you could visualize >>> directly the counts per exon using the function plotDEXSeq. >>> >>> Best regards, >>> Alejandro Reyes >>> >>> >>>> Dear All, >>>> I have RNA-seq data for 20 bumblebee samples divided into treatment >>>>and control. I am interested in differential exon usage. Unfortunately >>>>the bumblebee genome is not at the stage where I can do a complete >>>>DEXSeq analysis genome wide. I decided to look at a single gene where >>>>I have a decent gene model. Using TopHat2 and the python scripts >>>>included in DEXSeq I was able to produce an ExonCountSet object. >>>> >>>>> head(counts(ecs)) >>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>>> XLOC_000001:E001 1 0 1 0 1 0 0 0 0 >>>> XLOC_000001:E002 1 0 0 0 2 1 0 1 1 >>>> XLOC_000001:E003 0 0 0 0 3 0 0 0 0 >>>> XLOC_000001:E004 1 0 1 0 1 0 0 1 0 >>>> XLOC_000001:E005 1 3 3 2 3 0 0 0 0 >>>> XLOC_000001:E006 0 0 0 0 0 0 0 0 0 >>>> >>>> As I expected, its all come crashing down round my ears at the >>>>analysis stage >>>> >>>>> sizeFactors(ecs) >>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>>> NA NA NA NA NA NA NA NA NA >>>> >>>> I tried using the shorth value >>>> >>>>> >>>>>ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((counts( ecs)) >>>>>,tie.action="min"))) >>>> Error in .local(object, ...) : unused argument (0.2) >>>> >>>> I understand that my main problem is that the DEXSeq analysis should >>>>be genome wide (my data has too few counts). >>>> >>>> Is there a way to use DEXSeq to ONLY look at a single gene? If not >>>>can anyone think of a statistical way to analyse my data (cross tabs? >>>>How should I normalise?) >>>> >>>> Thanks for your time >>>> >>>> Eamonn >>>> >>>> >>>>> sessionInfo() >>>> R version 3.0.0 (2013-04-03) >>>> Platform: x86_64-apple-darwin10.8.0 (64-bit) >>>> >>>> locale: >>>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 >>>> >>>> attached base packages: >>>> [1] parallel stats graphics grDevices utils datasets >>>>methods base >>>> >>>> other attached packages: >>>> [1] genefilter_1.42.0 DEXSeq_1.6.0 Biobase_2.20.0 >>>>BiocGenerics_0.6.0 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] annotate_1.38.0 AnnotationDbi_1.22.5 biomaRt_2.16.0 >>>>Biostrings_2.28.0 bitops_1.0-5 DBI_0.2-7 >>>> [7] GenomicRanges_1.12.3 hwriter_1.3 IRanges_1.18.1 >>>>RCurl_1.95-4.1 Rsamtools_1.12.3 RSQLite_0.11.3 >>>> [13] splines_3.0.0 statmod_1.4.17 stats4_3.0.0 >>>>stringr_0.6.2 survival_2.37-4 tools_3.0.0 >>>> [19] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0 >>>> Dr Eamonn Mallon >>>> Lecturer in Evolutionary Biology >>>> Adrian 220 >>>> Biology Department >>>> University of Leicester >>>> >>>> http://www2.le.ac.uk/departments/biology/people/mallon >>>> >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: >>>>http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >>http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY
0
Entering edit mode
Alejandro Reyes ★ 1.9k
@alejandro-reyes-5124
Last seen 6 months ago
Novartis Institutes for BioMedical Rese…
Dear Eamonn, With such low counts for a each exon, you can not infer any isoform regulation using either DEXSeq or any tool, since the signal to noise ratio is very low. Do all your genes have similar low numbers? If so, I would check the data quality or the quality of the transcript assemblies that you are using. If other genes have higher counts it could also mean that this gene is simply not so expressed in your samples. Best regards, Alejandro Reyes > Hi Wolfgang > Although I have 20 samples, only 11 are under the conditions I'm > interested in this analysis. Not sure why I mentioned 20 in the first > place sorry. Of those 11, 2 are giving odd results from top hat so I > haven't included them yet until I figure out what is wrong. So I have 9 to > work with > >> counts(ecs) > K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 > XLOC_000001:E001 1 0 1 0 1 0 0 0 0 > XLOC_000001:E002 1 0 0 0 2 1 0 1 1 > XLOC_000001:E003 0 0 0 0 3 0 0 0 0 > XLOC_000001:E004 1 0 1 0 1 0 0 1 0 > XLOC_000001:E005 1 3 3 2 3 0 0 0 0 > XLOC_000001:E006 0 0 0 0 0 0 0 0 0 > XLOC_000001:E007 0 1 2 2 2 0 3 2 0 > XLOC_000001:E008 2 1 2 0 0 2 0 1 0 > XLOC_000001:E009 1 2 0 0 1 0 0 2 0 > XLOC_000001:E010 4 0 1 1 0 0 0 1 0 > XLOC_000001:E011 0 0 0 0 0 1 0 0 0 > XLOC_000001:E012 0 0 1 1 0 2 0 2 0 > XLOC_000001:E013 1 1 3 2 2 2 1 3 0 > XLOC_000001:E014 3 3 1 2 2 1 3 0 1 > XLOC_000001:E015 1 2 1 2 4 2 0 0 3 > XLOC_000001:E016 3 5 2 2 7 1 0 5 0 > XLOC_000001:E017 1 2 0 2 2 2 1 3 1 > XLOC_000001:E018 0 4 3 1 6 3 2 2 1 > XLOC_000001:E019 2 2 3 2 2 2 0 5 3 > XLOC_000001:E020 3 3 2 0 3 1 2 4 2 > XLOC_000001:E021 2 1 2 4 3 1 0 3 1 > XLOC_000001:E022 2 0 1 4 3 1 0 2 1 > XLOC_000001:E023 0 0 3 1 0 2 0 2 1 > XLOC_000001:E024 3 1 0 1 0 0 0 1 1 > XLOC_000001:E025 0 1 0 2 2 2 2 0 0 > XLOC_000001:E026 0 0 2 2 1 2 0 1 0 > XLOC_000001:E027 6 1 2 3 0 0 0 4 1 > XLOC_000001:E028 7 2 3 3 0 4 1 5 0 > XLOC_000001:E029 5 4 1 5 0 5 5 3 1 > XLOC_000001:E030 0 1 1 0 3 0 0 0 0 > XLOC_000001:E031 1 1 0 0 1 0 0 1 0 > > The low read count I put down to only looking at a single gene. Is my data > a loss cost for the dexseq pipeline? > > Thanks for your help > > Eamonn > > > > Dr Eamonn Mallon > Lecturer in Evolutionary Biology > Adrian 220 > Biology Department > University of Leicester > > http://www2.le.ac.uk/departments/biology/people/mallon > > > > > > On 26/05/2013 13:12, "Wolfgang Huber" <whuber at="" embl.de=""> wrote: > >> Dear Eamonn >> > >from your initial post -i.e. the output of head(counts(ecs))- it looks >> like your counts are very low? This would make estimateSizeFactors >> produce lots of NA values, but even if it did not, it would mean that >> you'd have essentially no statistical power to detect anything useful > >from such low counts. >> You mentioned "20 bumblebee samples" but the table whose head you show >> only has 9? >> >> Can you post the full output of 'counts(ecs)' and make sure its content >> is plausible based on your understanding of the data? >> >> Best wishes >> Wolfgang >> >> >> On May 23, 2013, at 2:03 pm, Eamonn Mallon <ebm3 at="" leicester.ac.uk=""> wrote: >> >>> Hi Alejandro, >>> Thanks very much for getting back to me. >>> >>> I think I got the second parameter idea from DESeq. I see I am wrong. >>> so I ran >>> >>>> ecs<-estimateSizeFactors(ecs) >>>> sizeFactors(ecs) >>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>> NA NA NA NA NA NA NA NA NA >>> >>> Is there any way to get DEXSeq to estimate size factors with my small >>> dataset? >>> >>> If you are only interested in a single gene maybe you could visualize >>> directly the counts per exon using the function plotDEXSeq. >>> >>> I'm not sure how to do this I tried >>> >>>> plotDEXSeq(ecs,"XLOC_000001", fitExpToVar="condition", >>> norCounts=FALSE, expression=FALSE, splicing=FALSE, >>> displayTranscripts=FALSE, legend=TRUE) >>> >>> Error in plotDEXSeq(ecs, "XLOC_000001", fitExpToVar = "condition", >>> norCounts = FALSE, : >>> Please estimate sizeFactors first >>> >>> I guess this is because modelFrameForGene requires sizefactors. >>> >>> Any ideas? >>> >>> Eamonn >>> >>> >>> On 22/05/13 18:07, Alejandro Reyes wrote: >>>> Dear Eamonn, >>>> >>>> Thanks for your interest in DEXSeq. >>>> >>>> In your size factor estimation, where did you get your second parameter >>>> from? (check ?estimateSizeFactors) There is no need to specify anything >>>> else than the ExonCountSet object and that is the reason you are >>>> getting >>>> the error message. >>>> >>>> If you are only interested in a single gene maybe you could visualize >>>> directly the counts per exon using the function plotDEXSeq. >>>> >>>> Best regards, >>>> Alejandro Reyes >>>> >>>> >>>>> Dear All, >>>>> I have RNA-seq data for 20 bumblebee samples divided into treatment >>>>> and control. I am interested in differential exon usage. Unfortunately >>>>> the bumblebee genome is not at the stage where I can do a complete >>>>> DEXSeq analysis genome wide. I decided to look at a single gene where >>>>> I have a decent gene model. Using TopHat2 and the python scripts >>>>> included in DEXSeq I was able to produce an ExonCountSet object. >>>>> >>>>>> head(counts(ecs)) >>>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>>>> XLOC_000001:E001 1 0 1 0 1 0 0 0 0 >>>>> XLOC_000001:E002 1 0 0 0 2 1 0 1 1 >>>>> XLOC_000001:E003 0 0 0 0 3 0 0 0 0 >>>>> XLOC_000001:E004 1 0 1 0 1 0 0 1 0 >>>>> XLOC_000001:E005 1 3 3 2 3 0 0 0 0 >>>>> XLOC_000001:E006 0 0 0 0 0 0 0 0 0 >>>>> >>>>> As I expected, its all come crashing down round my ears at the >>>>> analysis stage >>>>> >>>>>> sizeFactors(ecs) >>>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>>>> NA NA NA NA NA NA NA NA NA >>>>> >>>>> I tried using the shorth value >>>>> >>>>>> ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((counts(ecs)) >>>>>> ,tie.action="min"))) >>>>> Error in .local(object, ...) : unused argument (0.2) >>>>> >>>>> I understand that my main problem is that the DEXSeq analysis should >>>>> be genome wide (my data has too few counts). >>>>> >>>>> Is there a way to use DEXSeq to ONLY look at a single gene? If not >>>>> can anyone think of a statistical way to analyse my data (cross tabs? >>>>> How should I normalise?) >>>>> >>>>> Thanks for your time >>>>> >>>>> Eamonn >>>>> >>>>> >>>>>> sessionInfo() >>>>> R version 3.0.0 (2013-04-03) >>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit) >>>>> >>>>> locale: >>>>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 >>>>> >>>>> attached base packages: >>>>> [1] parallel stats graphics grDevices utils datasets >>>>> methods base >>>>> >>>>> other attached packages: >>>>> [1] genefilter_1.42.0 DEXSeq_1.6.0 Biobase_2.20.0 >>>>> BiocGenerics_0.6.0 >>>>> >>>>> loaded via a namespace (and not attached): >>>>> [1] annotate_1.38.0 AnnotationDbi_1.22.5 biomaRt_2.16.0 >>>>> Biostrings_2.28.0 bitops_1.0-5 DBI_0.2-7 >>>>> [7] GenomicRanges_1.12.3 hwriter_1.3 IRanges_1.18.1 >>>>> RCurl_1.95-4.1 Rsamtools_1.12.3 RSQLite_0.11.3 >>>>> [13] splines_3.0.0 statmod_1.4.17 stats4_3.0.0 >>>>> stringr_0.6.2 survival_2.37-4 tools_3.0.0 >>>>> [19] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0 >>>>> Dr Eamonn Mallon >>>>> Lecturer in Evolutionary Biology >>>>> Adrian 220 >>>>> Biology Department >>>>> University of Leicester >>>>> >>>>> http://www2.le.ac.uk/departments/biology/people/mallon >>>>> >>>>> >>>>> [[alternative HTML version deleted]] >>>>> >>>>> _______________________________________________ >>>>> Bioconductor mailing list >>>>> Bioconductor at r-project.org >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>> Search the archives: >>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
Hi Alejandro Thanks for your reply.I haven't checked other genes. An edgeR analysis worked out fine. It definitely could be a low expression gene. Thanks for all your help Eamonn Dr Eamonn Mallon Lecturer in Evolutionary Biology Adrian 220 Biology Department University of Leicester http://www2.le.ac.uk/departments/biology/people/mallon On 28/05/2013 16:56, "Alejandro Reyes" <alejandro.reyes at="" embl.de=""> wrote: >Dear Eamonn, > >With such low counts for a each exon, you can not infer any isoform >regulation using either DEXSeq or any tool, since the signal to noise >ratio is very low. >Do all your genes have similar low numbers? If so, I would check the >data quality or the quality of the transcript assemblies that you are >using. If other genes have higher counts it could also mean that this >gene is simply not so expressed in your samples. > >Best regards, >Alejandro Reyes > > > >> Hi Wolfgang >> Although I have 20 samples, only 11 are under the conditions I'm >> interested in this analysis. Not sure why I mentioned 20 in the first >> place sorry. Of those 11, 2 are giving odd results from top hat so I >> haven't included them yet until I figure out what is wrong. So I have 9 >>to >> work with >> >>> counts(ecs) >> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >> XLOC_000001:E001 1 0 1 0 1 0 0 0 0 >> XLOC_000001:E002 1 0 0 0 2 1 0 1 1 >> XLOC_000001:E003 0 0 0 0 3 0 0 0 0 >> XLOC_000001:E004 1 0 1 0 1 0 0 1 0 >> XLOC_000001:E005 1 3 3 2 3 0 0 0 0 >> XLOC_000001:E006 0 0 0 0 0 0 0 0 0 >> XLOC_000001:E007 0 1 2 2 2 0 3 2 0 >> XLOC_000001:E008 2 1 2 0 0 2 0 1 0 >> XLOC_000001:E009 1 2 0 0 1 0 0 2 0 >> XLOC_000001:E010 4 0 1 1 0 0 0 1 0 >> XLOC_000001:E011 0 0 0 0 0 1 0 0 0 >> XLOC_000001:E012 0 0 1 1 0 2 0 2 0 >> XLOC_000001:E013 1 1 3 2 2 2 1 3 0 >> XLOC_000001:E014 3 3 1 2 2 1 3 0 1 >> XLOC_000001:E015 1 2 1 2 4 2 0 0 3 >> XLOC_000001:E016 3 5 2 2 7 1 0 5 0 >> XLOC_000001:E017 1 2 0 2 2 2 1 3 1 >> XLOC_000001:E018 0 4 3 1 6 3 2 2 1 >> XLOC_000001:E019 2 2 3 2 2 2 0 5 3 >> XLOC_000001:E020 3 3 2 0 3 1 2 4 2 >> XLOC_000001:E021 2 1 2 4 3 1 0 3 1 >> XLOC_000001:E022 2 0 1 4 3 1 0 2 1 >> XLOC_000001:E023 0 0 3 1 0 2 0 2 1 >> XLOC_000001:E024 3 1 0 1 0 0 0 1 1 >> XLOC_000001:E025 0 1 0 2 2 2 2 0 0 >> XLOC_000001:E026 0 0 2 2 1 2 0 1 0 >> XLOC_000001:E027 6 1 2 3 0 0 0 4 1 >> XLOC_000001:E028 7 2 3 3 0 4 1 5 0 >> XLOC_000001:E029 5 4 1 5 0 5 5 3 1 >> XLOC_000001:E030 0 1 1 0 3 0 0 0 0 >> XLOC_000001:E031 1 1 0 0 1 0 0 1 0 >> >> The low read count I put down to only looking at a single gene. Is my >>data >> a loss cost for the dexseq pipeline? >> >> Thanks for your help >> >> Eamonn >> >> >> >> Dr Eamonn Mallon >> Lecturer in Evolutionary Biology >> Adrian 220 >> Biology Department >> University of Leicester >> >> http://www2.le.ac.uk/departments/biology/people/mallon >> >> >> >> >> >> On 26/05/2013 13:12, "Wolfgang Huber" <whuber at="" embl.de=""> wrote: >> >>> Dear Eamonn >>> >> >from your initial post -i.e. the output of head(counts(ecs))- it looks >>> like your counts are very low? This would make estimateSizeFactors >>> produce lots of NA values, but even if it did not, it would mean that >>> you'd have essentially no statistical power to detect anything useful >> >from such low counts. >>> You mentioned "20 bumblebee samples" but the table whose head you show >>> only has 9? >>> >>> Can you post the full output of 'counts(ecs)' and make sure its content >>> is plausible based on your understanding of the data? >>> >>> Best wishes >>> Wolfgang >>> >>> >>> On May 23, 2013, at 2:03 pm, Eamonn Mallon <ebm3 at="" leicester.ac.uk=""> >>>wrote: >>> >>>> Hi Alejandro, >>>> Thanks very much for getting back to me. >>>> >>>> I think I got the second parameter idea from DESeq. I see I am wrong. >>>> so I ran >>>> >>>>> ecs<-estimateSizeFactors(ecs) >>>>> sizeFactors(ecs) >>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>>> NA NA NA NA NA NA NA NA NA >>>> >>>> Is there any way to get DEXSeq to estimate size factors with my small >>>> dataset? >>>> >>>> If you are only interested in a single gene maybe you could visualize >>>> directly the counts per exon using the function plotDEXSeq. >>>> >>>> I'm not sure how to do this I tried >>>> >>>>> plotDEXSeq(ecs,"XLOC_000001", fitExpToVar="condition", >>>> norCounts=FALSE, expression=FALSE, splicing=FALSE, >>>> displayTranscripts=FALSE, legend=TRUE) >>>> >>>> Error in plotDEXSeq(ecs, "XLOC_000001", fitExpToVar = "condition", >>>> norCounts = FALSE, : >>>> Please estimate sizeFactors first >>>> >>>> I guess this is because modelFrameForGene requires sizefactors. >>>> >>>> Any ideas? >>>> >>>> Eamonn >>>> >>>> >>>> On 22/05/13 18:07, Alejandro Reyes wrote: >>>>> Dear Eamonn, >>>>> >>>>> Thanks for your interest in DEXSeq. >>>>> >>>>> In your size factor estimation, where did you get your second >>>>>parameter >>>>> from? (check ?estimateSizeFactors) There is no need to specify >>>>>anything >>>>> else than the ExonCountSet object and that is the reason you are >>>>> getting >>>>> the error message. >>>>> >>>>> If you are only interested in a single gene maybe you could visualize >>>>> directly the counts per exon using the function plotDEXSeq. >>>>> >>>>> Best regards, >>>>> Alejandro Reyes >>>>> >>>>> >>>>>> Dear All, >>>>>> I have RNA-seq data for 20 bumblebee samples divided into treatment >>>>>> and control. I am interested in differential exon usage. >>>>>>Unfortunately >>>>>> the bumblebee genome is not at the stage where I can do a complete >>>>>> DEXSeq analysis genome wide. I decided to look at a single gene >>>>>>where >>>>>> I have a decent gene model. Using TopHat2 and the python scripts >>>>>> included in DEXSeq I was able to produce an ExonCountSet object. >>>>>> >>>>>>> head(counts(ecs)) >>>>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>>>>> XLOC_000001:E001 1 0 1 0 1 0 0 0 0 >>>>>> XLOC_000001:E002 1 0 0 0 2 1 0 1 1 >>>>>> XLOC_000001:E003 0 0 0 0 3 0 0 0 0 >>>>>> XLOC_000001:E004 1 0 1 0 1 0 0 1 0 >>>>>> XLOC_000001:E005 1 3 3 2 3 0 0 0 0 >>>>>> XLOC_000001:E006 0 0 0 0 0 0 0 0 0 >>>>>> >>>>>> As I expected, its all come crashing down round my ears at the >>>>>> analysis stage >>>>>> >>>>>>> sizeFactors(ecs) >>>>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82 >>>>>> NA NA NA NA NA NA NA NA NA >>>>>> >>>>>> I tried using the shorth value >>>>>> >>>>>>> >>>>>>>ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((count s(ecs >>>>>>>)) >>>>>>> ,tie.action="min"))) >>>>>> Error in .local(object, ...) : unused argument (0.2) >>>>>> >>>>>> I understand that my main problem is that the DEXSeq analysis should >>>>>> be genome wide (my data has too few counts). >>>>>> >>>>>> Is there a way to use DEXSeq to ONLY look at a single gene? If not >>>>>> can anyone think of a statistical way to analyse my data (cross >>>>>>tabs? >>>>>> How should I normalise?) >>>>>> >>>>>> Thanks for your time >>>>>> >>>>>> Eamonn >>>>>> >>>>>> >>>>>>> sessionInfo() >>>>>> R version 3.0.0 (2013-04-03) >>>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit) >>>>>> >>>>>> locale: >>>>>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 >>>>>> >>>>>> attached base packages: >>>>>> [1] parallel stats graphics grDevices utils datasets >>>>>> methods base >>>>>> >>>>>> other attached packages: >>>>>> [1] genefilter_1.42.0 DEXSeq_1.6.0 Biobase_2.20.0 >>>>>> BiocGenerics_0.6.0 >>>>>> >>>>>> loaded via a namespace (and not attached): >>>>>> [1] annotate_1.38.0 AnnotationDbi_1.22.5 biomaRt_2.16.0 >>>>>> Biostrings_2.28.0 bitops_1.0-5 DBI_0.2-7 >>>>>> [7] GenomicRanges_1.12.3 hwriter_1.3 IRanges_1.18.1 >>>>>> RCurl_1.95-4.1 Rsamtools_1.12.3 RSQLite_0.11.3 >>>>>> [13] splines_3.0.0 statmod_1.4.17 stats4_3.0.0 >>>>>> stringr_0.6.2 survival_2.37-4 tools_3.0.0 >>>>>> [19] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0 >>>>>> Dr Eamonn Mallon >>>>>> Lecturer in Evolutionary Biology >>>>>> Adrian 220 >>>>>> Biology Department >>>>>> University of Leicester >>>>>> >>>>>> http://www2.le.ac.uk/departments/biology/people/mallon >>>>>> >>>>>> >>>>>> [[alternative HTML version deleted]] >>>>>> >>>>>> _______________________________________________ >>>>>> Bioconductor mailing list >>>>>> Bioconductor at r-project.org >>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>> Search the archives: >>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY

Login before adding your answer.

Traffic: 482 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6