Entering edit mode
Amos Kirilovsky
▴
50
@amos-kirilovsky-5407
Last seen 7.6 years ago
Dear Bioconductor list,
I'm try to get differentially expressed genes from two groups of
samples
using EdgeR.
In the first group I have only 1 sample and in the second there are 3
biological replicates. (I will get more samples per group in the
future but
for now I have to make an analysis pipeline with only that 4 samples)
The sample came from a metatranscriptomic experience.
For the comparison I selected only the genes that match to a species
of
interest.
I got error messages doing differential expression test and MDS :
1) *Differential expression*:
> et <- exactTest(d)
Error in `[.data.frame`(mu1, g) : undefined columns selected
or (only if the name of my first group is starting with the letter A
!)
> et <- exactTest(d)
Error in 0:s1[g] : argument NA / NaN
I found out that my error message is due to the presence of only one
sample
in my first group when using exactTest() function. I used glmLRT()
function
and I obtained many significantly differentially expressed genes.
Could you confirm that exactTest() don't work with group of only one
sample
and that I used correctly glmLRT() ? My code is at the end of the
message.
2) *MDS*
> plotMDS(d)
Error in is.finite(x$counts) :
default method unavailable for 'list' type (translated from French)
For the second problem I made some tests with the first Case study
("SAGE
profiles of normal and tumour tissue") in edgeR user Guide.
My code was very similar and this time it worked fine. I couldn't find
out
why. Do you have any idea?
I copy the output of my DGEList object after my message. I don't know
if
this is helpfull but the function plotBCV() seems ok with my data.
During my tests with the first Case study I get different results when
doing plotMDS(d) and plotMDS(d$counts).
The axis length and the sample position were different.
Is it expected? why such a difference? I guess it has to do with the
gene.selection option.
I hope I was clear in my explanations.
Many thanks in advance.
Amos Kirilovsky
--
Dr Amos Kirilovsky
CEA
Institut de génomique
2 rue Gaston Cremieux
CP 5706 91057 Evry cedex - France
Tel: (33) 01 60 87 25 35
####################################################################
> *d*
An object of class "DGEList"
$samples
files group lib.size norm.factors
s1 s1 E 271842 0.9414223
s2 s2 b 1310877 1.0216270
s3 s3 b 669731 1.0476370
s4 s4 b 1184754 0.9924584
$counts
s1 s2 s3 s4
804851 0 45 19 49
587117 0 27 12 30
427810 0 16 6 10
367020 0 8 16 20
980437 0 2 8 14
14236 more rows ...
$common.dispersion
[1] 0.4296784
$pseudo.counts
s1 s2 s3 s4
804851 0 24.4565653 19.756230 30.425129
587117 0 14.6478020 12.480244 18.638575
427810 0 8.7504091 6.242240 6.135612
367020 0 4.1452733 16.616435 12.407833
980437 0 0.8736513 8.306954 8.780709
14236 more rows ...
$logCPM
[1] 4.803775 4.158376 3.185720 3.723672 2.957607
14236 more elements ...
$pseudo.lib.size
[1] 729208.6
$prior.n
[1] 10
$tagwise.dispersion
[1] 0.3771734 0.3789361 0.3849697 0.4188742 0.4511627
14236 more elements
####################################################################
####################################################################
*#script*
library(edgeR)
#read file
filinName = "metaTranscriptome"
filin = read.table(filinName, sep ="\t", header = TRUE)
x <- list()
sets <- c("s1","s2","s3","s4")
x$samples <-
data.frame(files=as.character(sets),stringsAsFactors=FALSE)
x$samples$group <- factor(c("A","B","B","B"))
# read count matrix
x$counts = filin[,2:5] #Attention s'il y a des NA ça va bugger
rownames(x$counts) <- filin[,1]
colnames(x$counts) <- sets
# tot read number by sample
x$samples$lib.size <- colSums(x$counts)
# DGEList
x$samples$norm.factors <- 1
row.names(x$samples) <- colnames(x$counts)
d <- new("DGEList",x) #creation de DGElist à partir de x
# normalisation factor
d <- calcNormFactors(d) # TMM
summary(d$counts)
################ Not working ###################
plotMDS(d)
################################################
# estimate common dispersion
d <- estimateCommonDisp(d, verbose=TRUE)
# estimate Tagwise Disp
d <- estimateTagwiseDisp(d, trend="none")
# plots the tagwise dispersions against log2-CPM
# plotBCV(d, cex=0.4)
################ Not working ###################
# Differential expression
# et <- exactTest(d)
# topTags(et)
################################################
# matrix design
design <- model.matrix(~0+group, data=d$samples)
colnames(design) <- levels(d$samples$group)
design
# Differential expression
fit <- glmFit(d,design)
lrt <- glmLRT(fit, contrast=c(1,-1))
topTags(lrt)
####################################################################
####################################################################
sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: x86_64-pc-linux-gnu (64-bit)
other attached packages:
[1] edgeR_3.0.8 limma_3.14.4
[[alternative HTML version deleted]]