Entering edit mode
Guest User
★
13k
@guest-user-4897
Last seen 10.4 years ago
Dear Bioconductor,
I am trying to use the attract package to find the processes that are
differentially activated between cell types of the same lineage, using
RNA-Seq data. Since the attract package is designed to work with
microarray data, I decided to use the voom function to transform my
data and change the findAttractors() function accordingly, to
accommodate this type of data. Since this is not trivial, I want to
make sure that I am using the output from the voom function correctly.
The main part of the findAttractors() uses lm to model the expression
in relation to the cell type (group) and then an anova to get the F
statistic for each gene:
fstat <- apply(dat.detect.wkegg, 1, function(y, x) {
anova(lm(y ~ x))[[4]][1]}, x = group)
where dat.detect.wkegg is the matrix of the normalized expression
values with the genes per row and the samples per column.
(To give some more context, the function then uses the log2 values of
the fstat and does a t test between the gene values of a specific
pathway vs all the gene values to identify the significant pathways. )
What I want to do is change the above to:
counts_data <- DGEList(counts=rnaseq,group=celltype)
counts_data_norm <- calcNormFactors(counts_data)
design <- model.matrix( ~ celltype)
anal_voom <- voom(counts_data_norm, design)
dat.detect.wkegg <- as.list(as.data.frame(t(anal_voom$E)))
voom_weights <- as.list(as.data.frame(t(anal_voom$weights)))
fstat <- mapply(function(y, w, group) {anova(lm(y ~ group,
weights=w))[[4]][1]},
dat.detect.wkegg, voom_weights, MoreArgs =
list(group=celltype))
Is this the way to go with using the weights from voom, or am I
getting this very wrong?
Many thanks in advance for your reply!
Best wishes,
Emmanouela
-- output of sessionInfo():
sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.ISO-8859-1 LC_NUMERIC=C
LC_TIME=en_GB.ISO-8859-1 LC_COLLATE=en_GB.ISO-8859-1
LC_MONETARY=en_GB.ISO-8859-1 LC_MESSAGES=en_GB.ISO-8859-1
[7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C LC_TELEPHONE=C
LC_MEASUREMENT=en_GB.ISO-8859-1 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods base
other attached packages:
[1] attract_1.14.0 GOstats_2.28.0 graph_1.40.1
Category_2.28.0 GO.db_2.10.1 Matrix_1.1-3
cluster_1.15.2 annotate_1.40.1 org.Mm.eg.db_2.10.1
[10] KEGG.db_2.10.1 RSQLite_0.11.4 DBI_0.2-7
AnnotationDbi_1.24.0 Biobase_2.22.0 BiocGenerics_0.8.0
edgeR_3.4.2 limma_3.18.13
loaded via a namespace (and not attached):
[1] AnnotationForge_1.4.4 genefilter_1.44.0 grid_3.0.1
GSEABase_1.24.0 IRanges_1.20.7 lattice_0.20-29
RBGL_1.38.0 splines_3.0.1
[9] stats4_3.0.1 survival_2.37-7 tcltk_3.0.1
tools_3.0.1 XML_3.98-1.1 xtable_1.7-3
--
Sent via the guest posting facility at bioconductor.org.