using easyRNASeq to calculate RPKM values
@fatemehsadat-seyednasrollah-5367
Dear list,

I have used HTSeq to get the count table of an RNA-seq dataset with 8 samples in two conditions (so 4 biological replicates for each condition). The count table looks like this:

    > head(a)
               V1  V2 V3  V4  V5 V6  V7 V8  V9
    1 1/2-SBSRNA4   3  5   4   4  2   3  1   1
    2        A1BG 200 93 246 102 86  46 58  85
    3    A1BG-AS1  24 28  16  32 17  10 19  14
    4        A1CF   1  1   1   2  1   0  0   1
    5       A2LD1 100 71  98  97 59 128 88 114
    6         A2M   5  5  23   1  5   6 10   5

To get familiar with the expression level of each gene, I want to calculate RPKM values. Can I use the easyRNASeq package on the above count table to calculate them?

Thank you in advance
@delhommeemblde-3232
Dear Fatemehsadat,

It is indeed possible. The function RPKM would do that for you. Have a look at the help page by doing ?RPKM after loading easyRNASeq. The last example takes as arguments a matrix (your count table), the gene sizes (or the sizes of whatever feature you used, e.g. transcripts) and the sizes of your RNA-Seq libraries. These last two arguments should be named vectors whose names are the rownames and colnames of your count table, respectively. The library sizes can be retrieved simply by summing your columns, i.e. colSums(count.table).

Words of caution though: RPKM is a correction and not a normalization, so it's fine for visualizing the data, but I would not use it as input to any statistical tools such as DESeq, edgeR, etc. Moreover, depending on how you counted your reads per feature, you might have counted some reads multiple times, in which case it is better to retrieve your library size from your original BAM file using samtools.

HTH,

Nico

Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory

(In reply to the message from Fatemehsadat Seyednasrollah of Jan 22, 2013, 2:03 PM.)
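To make the calculation described in the reply concrete, here is a minimal base-R sketch of the standard RPKM formula, 10^9 * count / (library size * feature length in bp). It does not call easyRNASeq's RPKM function (see ?RPKM for its actual arguments). The helper name `rpkm` and the gene lengths are hypothetical placeholders introduced only for illustration. The counts are the six rows shown in the question.

```r
# Count table from the question: genes in rows, eight samples in columns.
counts <- matrix(
  c(  3,  5,   4,   4,  2,   3,  1,   1,
    200, 93, 246, 102, 86,  46, 58,  85,
     24, 28,  16,  32, 17,  10, 19,  14,
      1,  1,   1,   2,  1,   0,  0,   1,
    100, 71,  98,  97, 59, 128, 88, 114,
      5,  5,  23,   1,  5,   6, 10,   5),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("1/2-SBSRNA4", "A1BG", "A1BG-AS1", "A1CF", "A2LD1", "A2M"),
    paste0("V", 2:9)
  )
)

# HYPOTHETICAL gene lengths in bp: placeholders, not real annotation values.
gene.size <- c(1500, 2000, 1800, 9000, 2500, 4800)
names(gene.size) <- rownames(counts)

# Library sizes as suggested in the reply: the column sums of the count table.
# With only six genes these are far too small; use the full table in practice.
lib.size <- colSums(counts)

# Hypothetical helper implementing RPKM = 1e9 * count / (lib.size * gene.size).
rpkm <- function(counts, gene.size, lib.size) {
  # The named vectors must line up with the rows and columns of the table.
  stopifnot(identical(names(gene.size), rownames(counts)),
            identical(names(lib.size), colnames(counts)))
  per.kb <- counts / (gene.size / 1e3)    # each row divided by its length in kb
  sweep(per.kb, 2, lib.size / 1e6, "/")   # each column divided by millions of reads
}

rpkm.values <- rpkm(counts, gene.size, lib.size)
print(round(rpkm.values, 1))
```

If reads were counted more than once across features, `lib.size` could instead hold the per-sample read totals obtained from the BAM files, as the reply suggests, and the rest of the sketch would stay unchanged.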
Many thanks.

(In reply to the message from Nicolas Delhomme sent Tuesday, January 22, 2013, 4:46 PM.)