I am trying to conduct single time course analysis in maSigPro. While the manual says this is straight forward, I'm not convinced. Full disclosure- I am not particularly experienced with R and as a result, am not very good at interpreting its error messages. I have normalized reads in DESEQ2 (though I get the same error with reads normalized in edgeR). I obtained p-values and extracted significant gene expression profiles using the p.vector function in maSigPro. This all works fine. Then when I try to cluster genes based on similar expression profiles using the see.genes
function I get an error:
> see.genes(NBp$SELEC, show.fit = TRUE, dis=d$dis, edesign=d$edesign, cluster.method="hclust", groups.vector=d$groups.vector, cluster.data = 1, k=8)
Error in rep(0, (7 - length(a))) : invalid 'times' argument
I am not clear what the "rep(0, (7 - length(a)))" portion of this error refers to and traceback:
> traceback()
2: PlotGroups(data = dat[cut == j, ], show.fit = show.fit, dis = dis,
step.method = step.method, min.obs = min.obs, alfa = alfa,
nvar.correction = nvar.correction, show.lines = show.lines,
time = time, groups = groups, repvect = repvect, summary.mode = summary.mode,
xlab = "time", main = paste("Cluster", j, sep = " "), ylim = ylim,
cexlab = cexlab, legend = legend, groups.vector = groups.vector,
item = item, ...)
1: see.genes(NBp$SELEC, show.fit = TRUE, dis = d$dis, edesign = d$edesign,
cluster.method = "hclust", groups.vector = d$groups.vector,
cluster.data = 1, k = 8)
Is not especially helpful...
I have looked at the manual and as I said, it says very little about single time course data analysis. For the record, I have tried the very basic approach recommended like so:
exMSP <- maSigPro(normItai, design, vars="each")
and I seem to only get two significant genes?
> exMSP$summary
independ rep
1 Itaiw_v1_scaffold_63_g32259 Itaiw_v1_scaffold_63_g32259
2 Itaiw_v1_scaffold_163_g42033 Itaiw_v1_scaffold_163_g42033
rep2
1 Itaiw_v1_scaffold_63_g32259
2 Itaiw_v1_scaffold_163_g42033
Likewise I have searched for other people having the same problem and cannot seem to find an answer. Here's what I've done:
## Normalized read counts in DESEQ2:
countData <- as.matrix(read.csv("./gene_count_matrix_21.csv",row.names=1))
colData <- read.csv("./e_design-21-DESEQ.csv",row.names=1)
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~date_time)
dds <- DESeq(dds)
normItai <- counts(dds, normalized = TRUE)
normItai<-normItai[rowSums(normItai)!=0, ] # remove all zero rows from data
## Identify genes with significant expression profiles:
design<-read.csv("./e_design-21-MSP.csv")
rownames(design) <- design$id
design$id <- NULL
all(rownames(design) == colnames(normItai)) # make sure row names in design object match columns in count data
d <- make.design.matrix(design, time.col=2, repl.col = 1, degree=8) # have to define non-default values for time and replicate
## calculate polynomial regressions for each gene using "negative binomial distribution"
NBp<-p.vector(normItai, d, counts=TRUE)
profiles <- see.genes(NBp$SELEC, show.fit = TRUE, dis=d$dis, edesign=d$edesign, cluster.method="hclust", cluster.data = 1, k=8)
Here is the error message (again) followed by traceback and session Info:
> traceback()
2: PlotGroups(data = dat[cut == j, ], show.fit = show.fit, dis = dis,
step.method = step.method, min.obs = min.obs, alfa = alfa,
nvar.correction = nvar.correction, show.lines = show.lines,
time = time, groups = groups, repvect = repvect, summary.mode = summary.mode,
xlab = "time", main = paste("Cluster", j, sep = " "), ylim = ylim,
cexlab = cexlab, legend = legend, groups.vector = groups.vector,
item = item, ...)
1: see.genes(NBp$SELEC, show.fit = TRUE, dis = d$dis, edesign = d$edesign,
cluster.method = "hclust", groups.vector = d$groups.vector,
cluster.data = 1, k = 8)
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] C
attached base packages:
[1] tcltk parallel stats4 stats graphics grDevices utils
[8] datasets methods base
other attached packages:
[1] edgeR_3.24.3 limma_3.38.3
[3] Mfuzz_2.42.0 DynDoc_1.60.0
[5] widgetTools_1.60.0 e1071_1.7-3
[7] maSigPro_1.54.0 ggplot2_3.3.0
[9] RColorBrewer_1.1-2 DESeq2_1.22.2
[11] SummarizedExperiment_1.12.0 DelayedArray_0.8.0
[13] BiocParallel_1.16.6 matrixStats_0.56.0
[15] Biobase_2.42.0 GenomicRanges_1.34.0
[17] GenomeInfoDb_1.18.2 IRanges_2.16.0
[19] S4Vectors_0.20.1 BiocGenerics_0.28.0
loaded via a namespace (and not attached):
[1] bit64_0.9-7 splines_3.5.1 Formula_1.2-3
[4] assertthat_0.2.1 latticeExtra_0.6-28 blob_1.2.1
[7] GenomeInfoDbData_1.2.0 pillar_1.4.4 RSQLite_2.2.0
[10] backports_1.1.5 lattice_0.20-40 glue_1.3.2
[13] digest_0.6.25 XVector_0.22.0 checkmate_2.0.0
[16] tkWidgets_1.60.0 colorspace_1.4-1 htmltools_0.4.0
[19] Matrix_1.2-18 XML_3.99-0.3 pkgconfig_2.0.3
[22] venn_1.9 genefilter_1.64.0 zlibbioc_1.28.0
[25] purrr_0.3.3 xtable_1.8-4 scales_1.1.1
[28] htmlTable_1.13.3 tibble_2.1.3 annotate_1.60.1
[31] admisc_0.8 farver_2.0.3 withr_2.2.0
[34] nnet_7.3-13 mclust_5.4.5 survival_3.1-11
[37] magrittr_1.5 crayon_1.3.4 memoise_1.1.0
[40] MASS_7.3-51.5 class_7.3-15 foreign_0.8-71
[43] tools_3.5.1 data.table_1.12.8 lifecycle_0.2.0
[46] stringr_1.4.0 munsell_0.5.0 locfit_1.5-9.1
[49] cluster_2.1.0 AnnotationDbi_1.44.0 compiler_3.5.1
[52] rlang_0.4.5 grid_3.5.1 RCurl_1.98-1.1
[55] rstudioapi_0.11 htmlwidgets_1.5.1 labeling_0.3
[58] bitops_1.0-6 base64enc_0.1-3 gtable_0.3.0
[61] DBI_1.1.0 R6_2.4.1 gridExtra_2.3
[64] knitr_1.28 dplyr_0.8.5 bit_1.1-15.2
[67] Hmisc_4.3-1 stringi_1.4.6 Rcpp_1.0.4
[70] vctrs_0.2.4 geneplotter_1.60.0 rpart_4.1-15
[73] acepack_1.4.1 tidyselect_1.0.0 xfun_0.13
Sorry for the belated reply. This makes sense given the error message but after arbitrarily removing a time point and rerunning everything with degree = 7 I am still getting the same 'invalid times' error...