RNAseq expression threshold using DESeq2 normalised counts
Guest User ★ 13k
@guest-user-4897
Last seen 10.3 years ago
Hi Mike,

This question is similar to one posted on Biostars a few months ago (https://www.biostars.org/p/94680/) that you came across.

I want to determine whether a gene is expressed or not using RNA-seq data. There has been quite a lot of discussion on this, with papers defining a range of FPKM values (generally generated using Cufflinks) as a cutoff for calling a gene expressed.

Could we instead use normalised counts from DESeq2, look at their distribution, and determine a suitable cutoff? Better still, if one has negative controls such as spike-ins in the RNA protocol, could those be used to set the cutoff? (Unfortunately, I don't have spike-in control data.)

Or do you think one should extract FPKM values and then perhaps apply a zFPKM transformation (http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000598), as most people are suggesting?

I look forward to your opinion and suggestions.

Thanks!
Aditi

-- output of sessionInfo():

R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base

other attached packages:
 [1] DESeq2_1.4.5 RcppArmadillo_0.4.300.0 Rcpp_0.11.1
 [4] EDASeq_1.10.0 aroma.light_2.0.0 matrixStats_0.8.14
 [7] ShortRead_1.22.0 GenomicAlignments_1.0.1 BSgenome_1.32.0
[10] Rsamtools_1.16.0 GenomicRanges_1.16.3 GenomeInfoDb_1.0.2
[13] Biostrings_2.32.0 XVector_0.4.0 IRanges_1.22.7
[16] BiocParallel_0.6.1 Biobase_2.24.0 BiocGenerics_0.10.0

loaded via a namespace (and not attached):
 [1] annotate_1.42.0 AnnotationDbi_1.26.0 BatchJobs_1.2
 [4] BBmisc_1.6 bitops_1.0-6 brew_1.0-6
 [7] codetools_0.2-8 DBI_0.2-7 DESeq_1.16.0
[10] digest_0.6.4 fail_1.2 foreach_1.4.2
[13] genefilter_1.46.1 geneplotter_1.42.0 grid_3.1.0
[16] hwriter_1.3 iterators_1.0.7 lattice_0.20-29
[19] latticeExtra_0.6-26 locfit_1.5-9.1 plyr_1.8.1
[22] RColorBrewer_1.0-5 R.methodsS3_1.6.1 R.oo_1.18.0
[25] RSQLite_0.11.4 sendmailR_1.1-2 splines_3.1.0
[28] stats4_3.1.0 stringr_0.6.2 survival_2.37-7
[31] tools_3.1.0 XML_3.98-1.1 xtable_1.7-3
[34] zlibbioc_1.10.0

--
Sent via the guest posting facility at bioconductor.org.
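The idea raised in the question, thresholding on normalised counts, could be sketched roughly as follows. This is only an illustration in plain Python, not DESeq2's API and not a method endorsed anywhere in this thread. It computes median-of-ratios size factors (the normalisation approach behind DESeq2's normalised counts), divides the raw counts by them, and flags a gene as "expressed" when its normalised count reaches a cutoff in a minimum number of samples. The function names, the cutoff of 10, and the minimum of 2 samples are hypothetical placeholders; a real cutoff would have to be chosen from the distribution of the data or from negative controls such as spike-ins.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    """
    n_samples = len(counts[0])
    # Keep only genes with a non-zero count in every sample, so that the
    # log geometric mean across samples is finite.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has non-zero counts in every sample")
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count in sample j to its geometric mean.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalised_counts(counts, factors):
    """Divide each sample's raw counts by that sample's size factor."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


def expressed(counts, min_norm_count=10.0, min_samples=2):
    """Flag genes whose normalised count is >= min_norm_count in at least
    min_samples samples. Both thresholds are arbitrary placeholders."""
    norm = normalised_counts(counts, size_factors(counts))
    return [sum(v >= min_norm_count for v in row) >= min_samples
            for row in norm]


# Toy data (made up): 4 genes x 2 samples, sample 2 sequenced twice as deep.
demo_counts = [[10, 20], [100, 200], [5, 10], [0, 0]]
print(size_factors(demo_counts))
print(expressed(demo_counts))
```

Plotting a histogram of log-transformed normalised counts before fixing `min_norm_count` would be the "look at the distribution" step mentioned in the question; the sketch leaves that choice to the user.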
RNASeq • 2.2k views
@qamra-aditi-gis-6128
Last seen 10.3 years ago
Sorry, I posted the wrong link when referring to the paper that uses the zFPKM transformation. Here is the correct one: http://www.biomedcentral.com/1471-2164/14/778
@mikelove
Last seen 1 day ago
United States
Hi Aditi,

I don't have any concrete suggestions for this, that is, for how to determine whether a gene is expressed or not.

Mike

On Sat, Jul 19, 2014 at 11:51 AM, Aditi [guest] <guest at bioconductor.org> wrote:
> I want to determine if a gene is expressed or not using RNAseq data.