Dear List
I have been using the glmFit method in edgeR to analyse some RNA-Seq
data. I will soon be presenting this data to a more statistically
naive audience (and I'm no expert myself) and I was hoping to be able
to prepare a figure demonstrating how this particular edgeR analysis
approach works.
Basically what I'd like to do would be to plot count data for one (or
perhaps a few) of my genes and then draw a couple of lines showing the
fit of the null and alternative models used in the glmLRT method of
edgeR to assess gene regulation between conditions.
I was hoping that this would allow me to illustrate the concept of
testing for the likelihood of model fit and hence gene regulation
between conditions.
If anyone could help I'd be grateful.
Best
Iain
Hi Iain,
You're asking a hard question, as drawing nice tutorial pictures for
any statistical method can be lots of work, and the context here is
harder than most. I think I'd find it hard to think of a good picture
like you describe, even if I was just doing an ordinary multiple
regression using lm() with univariate normal data. What covariate or
factor are you testing for? Can you describe the picture you would
draw if this was just an ordinary multiple regression problem?
Best wishes
Gordon
------------- original message ----------------
[BioC] visualise model fit in edgeR
Iain Gallagher iaingallagher at btopenworld.com
Tue Oct 25 16:06:11 CEST 2011
Dear Gordon
Thanks for your reply. There's nothing like someone else's question to
make one focus on what exactly one wants. This was certainly the case
here!
I have given this some thought from my statistically naive point of
view and I have attached a mock-up picture of the kind of thing I
envisaged (although I appreciate the real-life situation is more
complicated).
The experimental design is as follows:
Cells were collected from 6 animals and infected with one of 4 strains
of bacteria or left uninfected. RNA was sampled at 2, 6, 24 & 48 hours
post infection. There are thus 120 data points across the whole
experiment.
I have used edgeR to analyse the infected v control data at each
timepoint using the GLM approach - effectively a paired samples
analysis for each timepoint as per the edgeR manual (section 11).
Perhaps there's something more sophisticated I could do here though.
If you had any advice that would be great!
# design matrix: cow as blocking factor, plus infection
design <- model.matrix(~ cow + infection)
# dispersion estimate
d <- estimateGLMCommonDisp(d, design)
# fit the NB GLM for each gene
fitFiltered <- glmFit(d, design, dispersion = d$common.dispersion)
# carry out the likelihood ratio test on coefficient 7 of the design
lrtFiltered <- glmLRT(d, fitFiltered, coef = 7)
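[Editorial sketch] To make the two models behind the test concrete for a single gene, the following is a minimal sketch in base R plus MASS. It is not edgeR's own implementation. It assumes simulated counts for 6 cows, each with one control and one infected sample, and a fixed dispersion of 0.1; the names cow, infection, counts, disp and baseline are hypothetical and chosen only for this example. The null model has cow effects only, the alternative adds infection, and the likelihood ratio statistic is taken as the drop in deviance between the two fits, compared with a chi-square distribution on 1 degree of freedom.

```r
library(MASS)  # negative.binomial() family with a fixed theta

set.seed(1)

# Hypothetical design: 6 cows, each sampled as control (C) and infected (I)
cow       <- factor(rep(paste0("cow", 1:6), each = 2))
infection <- factor(rep(c("C", "I"), times = 6), levels = c("C", "I"))

# Simulated counts for one gene (assumption, not real data):
# a cow-specific baseline and a 3-fold increase on infection
disp     <- 0.1                                    # assumed NB dispersion
baseline <- rep(c(80, 120, 100, 150, 90, 110), each = 2)
mu       <- baseline * ifelse(infection == "I", 3, 1)
counts   <- rnbinom(length(mu), mu = mu, size = 1 / disp)

# Null model: cow effects only. Alternative model: cow + infection.
nb   <- negative.binomial(theta = 1 / disp)
fit0 <- glm(counts ~ cow,             family = nb)
fit1 <- glm(counts ~ cow + infection, family = nb)

# Likelihood ratio statistic = drop in deviance, tested on 1 df
lr   <- deviance(fit0) - deviance(fit1)
pval <- pchisq(lr, df = 1, lower.tail = FALSE)

ncol(model.matrix(fit1))  # 7 columns: intercept, 5 cow terms, infection
lr
pval
```

With a two-level infection factor and 6 cows, the infection term is the seventh column of the design matrix, which is why the test above targets that coefficient.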
For my audience I simply wanted to illustrate the fitting of the two
models and how likelihood ratio tests are used rather than a t-test
approach.
In the attached pdf each black line represents the H1 model (with
infection) and each red line represents the null model (cows only) for
one gene only. The points are the 'raw data' (but not real data); C =
control, I = infected. I realise this illustration is showing
essentially a linear fit but I'm trying to aim for simplicity for the
audience (a conceptual rather than entirely accurate approach). I
would be happy to get my hands dirty coding something more lifelike as
I think that would aid my understanding as well.
I was going to describe this in terms of the 'fit' of each line to the
data i.e. for the regulated gene the black line is the more 'likely'
model whereas in the non-regulated gene there is little to separate
the models.
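[Editorial sketch] A figure along these lines could be drawn roughly as follows. This is only a sketch under stated assumptions: the counts are simulated, the dispersion is fixed at 0.1, the fits use glm() with the MASS negative binomial family rather than edgeR, and plot_fits() is a hypothetical helper written for this example. For each cow, the black line joins the fitted values of the H1 model (cow + infection) and the red line joins those of the null model (cow only).

```r
library(MASS)  # negative.binomial() family with a fixed theta

set.seed(2)

# Hypothetical data layout: 6 cows, control (C) and infected (I) each
cow       <- factor(rep(paste0("cow", 1:6), each = 2))
infection <- factor(rep(c("C", "I"), times = 6), levels = c("C", "I"))
disp      <- 0.1                                   # assumed NB dispersion
baseline  <- rep(c(80, 120, 100, 150, 90, 110), each = 2)

# Hypothetical helper: simulate one gene, fit both models, draw the panel
plot_fits <- function(fold, main) {
  mu     <- baseline * ifelse(infection == "I", fold, 1)
  counts <- rnbinom(length(mu), mu = mu, size = 1 / disp)
  nb     <- negative.binomial(theta = 1 / disp)
  fit0   <- glm(counts ~ cow,             family = nb)  # null: cows only
  fit1   <- glm(counts ~ cow + infection, family = nb)  # H1: cows + infection

  x <- as.integer(infection)  # 1 = C, 2 = I
  plot(x, counts, log = "y", xaxt = "n", xlim = c(0.5, 2.5), pch = 16,
       col = "grey40", xlab = "", ylab = "count (log scale)", main = main)
  axis(1, at = 1:2, labels = c("C", "I"))
  for (k in levels(cow)) {
    i <- which(cow == k)
    lines(x[i], fitted(fit1)[i], col = "black")  # H1 fit for this cow
    lines(x[i], fitted(fit0)[i], col = "red")    # null fit for this cow
  }
  legend("topleft", legend = c("H1: cow + infection", "null: cow only"),
         col = c("black", "red"), lty = 1, bty = "n")
  invisible(list(null = fit0, alt = fit1))
}

op <- par(mfrow = c(1, 2))
regulated   <- plot_fits(fold = 3, main = "regulated gene")
unregulated <- plot_fits(fold = 1, main = "non-regulated gene")
par(op)
```

Under the null model both samples from a cow share one fitted value, so the red lines are flat; under H1 the black lines are parallel on the log scale, because a single infection coefficient applies to every cow. How far the black lines tilt away from the red ones, relative to the scatter of the points, is what the likelihood ratio test quantifies.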
Hope this is somewhat useful.
Best
Iain
________________________________
From: Gordon K Smyth <smyth@wehi.edu.au>
To: Iain Gallagher <iaingallagher at btopenworld.com>
Cc: Yunshun Chen <yuchen at wehi.edu.au>; Bioconductor mailing list
<bioconductor at r-project.org>
Sent: Friday, 28 October 2011, 6:36
Subject: visualise model fit in edgeR
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mockModel.pdf
Type: application/pdf
Size: 64641 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20111031/5c2543b2/attachment.pdf>