Error in ball gown command
1
0
Entering edit mode
@kritikamish99-9648
Last seen 6.3 years ago
India

Hello All

I have 4 samples M1 , M2 , W1, W2 . I have done alignment with Ecoli ref genome usng hisat2 on this sample i ran stringtie for abundance estimation  Now i am trying to run ball gown on this samples of now i tried this commands:

data_directory <- "/Hisat_stringtie_/string_out/"
bg = ballgown(dataDir=data_directory, samplePattern="string_out_", meas='all')

transcript_fpkm = texpr(bg, 'FPKM')
transcript_cov = texpr(bg, 'cov')
whole_tx_table = texpr(bg, 'all')
exon_mcov = eexpr(bg, 'mcov')
junction_rcount = iexpr(bg)
whole_intron_table = iexpr(bg, 'all')
gene_expression = gexpr(bg)


pData(bg) = data.frame(id=sampleNames(bg), group=rep(c(1,0), each=2))

pheno_data <- pData(bg)

head(pheno_data)

head(bg@dirs)

bg_2 = subset(bg,"rowVars(texpr(bg))>1", genomesubset = TRUE )

stat_results = stattest(bg_2, feature='transcript' ,meas='FPKM', covariate='group')

After running stattest i am getting error:

Error in solve.default(t(mod) %*% mod) :
Lapack routine dgesv: system is exactly singular: U[3,3] = 0

please help me i am not able to understand the error .

Where i am getting wrong

sessionInfo()

R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 15.10

locale:
 [1] LC_CTYPE=en_IN.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_IN.UTF-8        LC_COLLATE=en_IN.UTF-8    
 [5] LC_MONETARY=en_IN.UTF-8    LC_MESSAGES=en_IN.UTF-8   
 [7] LC_PAPER=en_IN.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
 [1] dplyr_0.5.0                genefilter_1.52.1         
 [3] BiocStyle_1.8.0            ballgown_2.2.0            
 [5] Rqc_1.4.2                  ggplot2_2.2.0             
 [7] ShortRead_1.28.0           GenomicAlignments_1.6.3   
 [9] SummarizedExperiment_1.0.2 Biobase_2.30.0            
[11] Rsamtools_1.22.0           GenomicRanges_1.22.4      
[13] GenomeInfoDb_1.6.3         Biostrings_2.38.4         
[15] XVector_0.10.0             IRanges_2.4.8             
[17] S4Vectors_0.8.11           BiocGenerics_0.16.1       
[19] BiocParallel_1.4.3        

loaded via a namespace (and not attached):
 [1] splines_3.2.2            GenomicFiles_1.6.2      
 [3] Formula_1.2-1            shiny_0.14.2            
 [5] assertthat_0.1           highr_0.6               
 [7] latticeExtra_0.6-28      BSgenome_1.38.0         
 [9] RSQLite_1.1              lattice_0.20-34         
[11] biovizBase_1.18.0        limma_3.26.9            
[13] digest_0.6.10            RColorBrewer_1.1-2      
[15] colorspace_1.3-1         htmltools_0.3.5         
[17] httpuv_1.3.3             Matrix_1.2-7.1          
[19] plyr_1.8.4               XML_3.98-1.5            
[21] biomaRt_2.26.1           zlibbioc_1.16.0         
[23] xtable_1.8-2             scales_0.4.1            
[25] htmlTable_1.7            tibble_1.2              
[27] annotate_1.48.0          mgcv_1.8-16             
[29] GenomicFeatures_1.22.13  nnet_7.3-12             
[31] lazyeval_0.2.0           survival_2.40-1         
[33] magrittr_1.5             mime_0.5                
[35] evaluate_0.10            memoise_1.0.0           
[37] nlme_3.1-128             hwriter_1.3.2           
[39] foreign_0.8-67           tools_3.2.2             
[41] data.table_1.10.0        stringr_1.1.0           
[43] munsell_0.4.3            cluster_2.0.5           
[45] AnnotationDbi_1.32.3     lambda.r_1.1.9          
[47] futile.logger_1.4.3      grid_3.2.2              
[49] RCurl_1.95-4.8           dichromat_2.0-0         
[51] VariantAnnotation_1.16.4 labeling_0.3            
[53] bitops_1.0-6             gtable_0.2.0            
[55] DBI_0.5-1                markdown_0.7.7          
[57] reshape2_1.4.2           R6_2.2.0                
[59] gridExtra_2.2.1          knitr_1.15.1            
[61] rtracklayer_1.30.4       Hmisc_4.0-0             
[63] futile.options_1.0.0     stringi_1.1.2           
[65] sva_3.18.0               Rcpp_0.12.8             
[67] rpart_4.1-10             acepack_1.4.1  

Thanks

rnaseq ballgown differential gene expression • 2.2k views
ADD COMMENT
0
Entering edit mode
Jeff Leek ▴ 650
@jeff-leek-5015
Last seen 3.8 years ago
United States

Hello

 

The problem could be because you only have one sample you are testing for each group. Ballgown can not compute a test statistic for comparing single samples because it can not estimate the variance. You could simply calculate the fold change and then use that as your ranking system, but it is not usually possible to get a measure of statistical significance if you only have one sample in each group. 

 

Best

 

Jeff

ADD COMMENT
0
Entering edit mode
Okay but I want to find differential expression analysis on W1 vs M1 and W2 vs M2. I can't perform ball gown diff estimation with this? For bal gown minimum how many sample one need??
ADD REPLY
0
Entering edit mode

The output of `pData(bg)` isn't showing up, but I'm assuming you do have two replicates per group (two W's, and two M's). @Jeff if you only have one rep, ballgown throws this error message which isn't what's happening: https://github.com/alyssafrazee/ballgown/blob/master/R/stattest.R#L215

Without seeing the data, my *guess* as to what's going on is that you have 0 variance in one of your groups in one of your genes, i.e., both W's or both M's have equal FPKM values (which is pretty likely with many genes and only two samples per group). It's easy to get 0 variance there, and linear regression then requires dividing by the variance, and you're essentially getting a "divide by 0" error.

If you like, you can work around this by excluding genes with this property (having equal FPKM in either the W or the M group). But in general, it's good to have more than 2 samples per group (as many as possible). For smaller sample sizes, limma might be a good choice (as they use empirical Bayes to share info about variance across genes). 

ADD REPLY

Login before adding your answer.

Traffic: 884 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6