Question

glmTreat Errors for multi coefficient and analysis of deviance situations.

0

Entering edit mode

BharathAnanth ▴ 80

@bharathananth-10049

Last seen 6.8 years ago

Hi

I am fitting some time course RNA-seq data using glmQLFit and I then wanted to use glmTreat to test either of two coefficients with 0.5 lfc. Here is the simplified version (with fewer samples).

set.seed(0)
T <- 24
t <- rep(seq(3,42, by=3), 11)
c <- cos(2*pi*t/T)  # a time course basis function
s <- sin(2*pi*t/T)
batch_effect <- factor(ceiling(runif(14*11)*11))
subjects <- sprintf("P%02d", rep(1:11, each = 14))
design1 <- model.matrix(~ batch_effect + c + s)

fit <- glmQLFit(y, design, robust=TRUE)

These work

treat1 <- glmTreat(fit, coef="c", lfc=0.5)
treat2 <- glmTreat(fit, coef=c("c", "s"), lfc=0)

but this gives an error

treat3 <- glmTreat(fit, coef=c("c","s"), lfc=0.5)

error: Error in .addCompressedMatrices(offset.old, offset.adj) :
matrix dimensions are not consistent for non-repeating number of columns

Instead, if I use contrasts instead of specifying coefficients, I get a different error

treat4 <- glmTreat(fit, contrast = con, lfc=0.5)

error: Error in data.frame(logCPM = glmfit$AveLogCPM, PValue = p.value, row.names = rownames(glmfit)) :
row names supplied are of the wrong length

Is glmTreat not design to be used for multi-coefficient or analysis of deviance situtations? The help page seems to suggested multiple coefficients at least are permitted.

Thank you for your help.

My sessionInfo is

R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.12.so

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] bindrcpp_0.2               sva_3.24.4                 BiocParallel_1.10.1        genefilter_1.58.1          mgcv_1.8-22
[6] nlme_3.1-131               SummarizedExperiment_1.6.5 DelayedArray_0.2.7         matrixStats_0.52.2         Biobase_2.36.2
[11] GenomicRanges_1.28.4       GenomeInfoDb_1.12.3        IRanges_2.10.5             S4Vectors_0.14.7           BiocGenerics_0.22.1
[16] ggplot2_2.2.1              PoiClaClu_1.0.2            magrittr_1.5               edgeR_3.18.1               limma_3.32.5

loaded via a namespace (and not attached):
[1] tidyr_0.7.2             bit64_0.9-7             splines_3.4.2           Formula_1.2-2           assertthat_0.2.0        statmod_1.4.30
[7] latticeExtra_0.6-28     blob_1.1.0              cellranger_1.1.0        GenomeInfoDbData_0.99.0 yaml_2.1.15             RSQLite_2.0
[13] backports_1.1.1         lattice_0.20-35         glue_1.2.0              digest_0.6.12           RColorBrewer_1.1-2      XVector_0.16.0
[19] checkmate_1.8.5         colorspace_1.3-2        htmltools_0.3.6         Matrix_1.2-11           plyr_1.8.4              XML_3.98-1.9
[25] pkgconfig_2.0.1         zlibbioc_1.22.0         purrr_0.2.4             xtable_1.8-2            scales_0.5.0            pracma_2.1.1
[31] htmlTable_1.11.0        tibble_1.3.4            annotate_1.54.0         nnet_7.3-12             lazyeval_0.2.1          readxl_1.0.0
[37] survival_2.41-3         memoise_1.1.0           foreign_0.8-69          tools_3.4.2             data.table_1.10.4-3     stringr_1.2.0
[43] munsell_0.4.3           locfit_1.5-9.1          cluster_2.0.6           AnnotationDbi_1.38.2    compiler_3.4.2          rlang_0.1.2
[49] grid_3.4.2              RCurl_1.95-4.8          rstudioapi_0.7          htmlwidgets_0.9         labeling_0.3            bitops_1.0-6
[55] base64enc_0.1-3         gtable_0.2.0            DBI_0.7                 R6_2.2.2                gridExtra_2.3           knitr_1.17
[61] dplyr_0.7.4             bit_1.1-12              bindr_0.1               Hmisc_4.0-3             stringi_1.1.6           Rcpp_0.12.12
[67] geneplotter_1.54.0      rpart_4.1-11            acepack_1.4.1           tidyselect_0.2.3

edgeR glmTreat analysis of deviance • 1.9k views

ADD COMMENT • link updated 7.4 years ago by Ryan C. Thompson ★ 7.9k • written 7.4 years ago by BharathAnanth ▴ 80

score 0 · Answer 1 · 2017-12-07

0

Entering edit mode

Ryan C. Thompson ★ 7.9k

@ryan-c-thompson-5618

Last seen 6 months ago

Icahn School of Medicine at Mount Sinai…

As far as I know, the treat method is only defined for a single coefficient or contrast at a time. I'm not even sure what the null hypothesis would be for testing multiple coefficients at once.

ADD COMMENT • link 7.4 years ago Ryan C. Thompson ★ 7.9k

0

Entering edit mode

Normally the null hypothesis for the glm would be (in my case) H0: c=0 AND s =0.

What I want to test is the null hypothesis H0: |c| < lfc AND |s| <lfc using glmTreat. Is this inconceivable?

ADD REPLY • link 7.4 years ago BharathAnanth ▴ 80

0

Entering edit mode

The problem with that is that the region defined by your null hypothesis is now dependent on how you've parametrized your design matrix in ways that are probably not intuitive. That is, tests using different design matrices that would be equivalent in glmLRT or glmQLFTest would give different results in glmTreat.

ADD REPLY • link 7.4 years ago Ryan C. Thompson ★ 7.9k

0

Entering edit mode

Just to give a few other possible ways to define the null hypothesis for multi-coefficient treat that make equal sense, consider: |c| + |s| < threshold, or c^2 + s^2 < threshold^2. Of course, all of these are still dependent on the parametrization.

ADD REPLY • link 7.4 years ago Ryan C. Thompson ★ 7.9k

0

Entering edit mode

To elaborate on Ryan's example, say we have two contrasts, A - B and A - C. If we test A - B = 0 and A - C = 0 via an ANODEV, this is the same null hypothesis as A - B = 0 and B - C = 0. However, if we ask for |A - B| <= 1 and |A - C| <= 1, this is not equivalent to |A - B| <= 1 and |B - C| <= 1. (For example, the first version is true while the second is not when A = 0, B = 1 and C = -1.)

This isn't an insurmountable difficulty - it just requires some care when phrasing the contrasts - but it does highlight the issues with dealing with composite null hypotheses. Indeed, when you start stacking together contrasts to try to perform a log-fold-change-aware equivalent of an ANODEV, you will become increasingly conservative if you only perform testing at the lfc limit, like treat; or you will need to consider the correlation in the true effect sizes between contrasts to obtain an appropriate p-value, which does not seem possible on a gene-by-gene basis. Maybe something could be made to work by sharing information across genes with empirical Bayes, but this would require some thought.

ADD REPLY • link 7.4 years ago Aaron Lun ★ 28k