Hi Andreia,

if your samples are indeed very different, then that's why a rank-invariant
scaling fails. Quantile normalisation might be quite conservative, but at
least it seems to bring the C3 sample together with the other C samples,
based on your plots.
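
To illustrate why quantile normalisation is conservative, here is a minimal base-R sketch of the textbook procedure on a made-up Ct matrix. The helper name quantile.norm and the numbers are purely illustrative assumptions, and this is not necessarily line for line what normalizeCtData does internally:

```r
# Toy Ct matrix: rows are genes, columns are samples (made-up numbers);
# S1 has several undetermined wells set to Ct = 40
ct <- cbind(S1 = c(20, 40, 40, 40),
            S2 = c(22, 27, 32, 40),
            S3 = c(19, 24, 29, 40))

# Hypothetical helper implementing textbook quantile normalisation
quantile.norm <- function(ct) {
  # Rank of each value within its own sample (ties broken by position)
  ranks <- apply(ct, 2, rank, ties.method = "first")
  # Target distribution: mean of the sorted values across samples
  target <- rowMeans(apply(ct, 2, sort))
  # Replace each value by the target value of the same rank
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(ct)
  out
}

ct.q <- quantile.norm(ct)
# After normalisation every sample has exactly the same sorted values
apply(ct.q, 2, sort)
```

In this toy case two of the three Ct=40 wells in S1 are pulled down to values below 35, simply because the other samples have expressed genes at those ranks.
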

Depending on how adventurous you feel, you can also try some other
scaling/normalisation methods yourself. For example, this article
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718498/ recommends scaling to
the mean of all Ct values when dealing with miRNA qPCR values.
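
As a rough illustration of that idea, here is a minimal sketch on a plain matrix rather than a qPCRset. The 35-cycle cutoff, the made-up numbers, the additive shift and the helper name mean.center.ct are all assumptions for illustration, not something taken from the article or from HTqPCR:

```r
# Toy Ct matrix: rows are genes, columns are samples (made-up numbers)
ct <- cbind(S1 = c(20, 25, 30, 40),
            S2 = c(22, 27, 32, 40),
            S3 = c(19, 24, 29, 40))

# Hypothetical helper: shift each sample so that the mean Ct of its
# expressed genes (Ct < Ct.max) matches that of the reference sample
mean.center.ct <- function(ct, Ct.max = 35, ref = 1) {
  # Mean Ct per sample, ignoring genes at or above the cutoff
  m <- apply(ct, 2, function(x) mean(x[x < Ct.max]))
  # Subtract the per-sample offset relative to the reference, column by column
  sweep(ct, 2, m - m[ref])
}

ct.norm <- mean.center.ct(ct)
# Mean Ct of the expressed genes, before and after
apply(ct, 2, function(x) mean(x[x < 35]))
apply(ct.norm, 2, function(x) mean(x[x < 35]))
```

Note that the undetermined wells (Ct=40 here) are shifted along with everything else, so you may want to reset them afterwards.
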

Such a method might not work if you expect large overall differences in
expression level between your samples. However, it's easy to implement and
test this. Say, for example, that you want to use the geometric mean of all
expressed genes (Ct<35) and use the first sample as a reference; you could
do something like this:

# Load the package and some example data
library(HTqPCR)
data(qPCRraw)
# Define normalisation function (or just use the individual commands directly)
my.norm.method <- function(q, Ct.max=35, ref=1)
{
   # Get the data
   data <- exprs(q)
   # For each column, calculate the geometric mean of Ct values<Ct.max
   geo.mean <- apply(data, 2, function(x) {
                        xx <- log2(subset(x, x<Ct.max))
                        2^mean(xx)})
   # Calculate the scaling factor that brings each sample's geometric
   # mean onto that of the reference sample
   geo.scale <- geo.mean[ref]/geo.mean
   # Adjust the data accordingly (one factor per column)
   data.norm <- t(t(data) * geo.scale)
   # Return the normalised object
   exprs(q) <- data.norm
   q
}
# Normalise
q.norm <- my.norm.method(qPCRraw)
# Plot raw versus normalised data, one colour per sample
plot(exprs(qPCRraw), exprs(q.norm), col=rep(1:n.samples(q.norm),
each=n.wells(q.norm)))
# Followed by the usual QC and sanity check of your data...

If you decide to give that (or something similar) a go, I'd be interested in
hearing whether it works for your data or not.

Cheers
\Heidi

> Dear Heidi,
>
> thanks for your reply. Indeed I am comparing cell types which have huge
> differences between miRNA profiles, and unfortunately the qPCR assay only
> has one endogenous gene, which is being affected by cell type, and
> therefore the dCt method is not adequate. I have tried quantile. The reason
> why I wanted to find another method is because, as you can see in the
> distribution of Ct values, the S1 cells have many miRs which are not
> expressed and that I am analyzing as Ct=40. So these cells are very
> different, and with quantile some differences will not pop up in the
> analyses because I am forcing it to have a distribution similar to the
> other cells. Still, I think that this approach is conservative, given that
> some differences do appear, as you can see in the files after quantile
> normalization. Implementing other methods that could deal with these
> problems of working with cell types which behave differently, as in my
> case, and lacking endogenous genes to normalize could be a suggestion for
> your package.
> Kind regards,
> Andreia
>
> PS: attached are two files with the correlations and data distribution.
>
> On Mon, Jan 24, 2011 at 11:11 PM, Heidi Dvinge <heidi at ebi.ac.uk> wrote:
>
>> Hello Andreia,
>>
>> I can reproduce the error you get if I say:
>>
>> > data(qPCRraw)
>> > temp <- normalizeCtData(qPCRraw, norm="scale.rankinvariant")
>> Scaling Ct values
>> Using rank invariant genes: Gene1 Gene29
>> Scaling factors: 1.00 1.06 1.00 1.03 1.00 1.00
>> # Select just the first genes so that Gene29 is excluded
>> > normalizeCtData(qPCRraw[1:10,], norm="scale.rankinvariant")
>> Error in smooth.spline(ref[i.set], data[i.set]) :
>> need at least four unique 'x' values
>>
>> After looking into the code, the problem occurs when there's only a
>> single (or no) rank-invariant gene between any individual sample and the
>> reference sample (the mean or median across all samples). At least two
>> rank-invariant genes are required between the reference and each sample.
>> I'll make a note of this in the help file.
>>
>> This means that a rank-invariant method is not going to be robust enough
>> for your normalisation. Instead, you'll have to go with ddCt or quantile.
>> In the future there might be other options available in HTqPCR (e.g.
>> scale by arithmetic or geometric mean) depending on demand.
>>
>> The likely cause of this is that your samples are quite different. Have
>> you tried investigating them with e.g. plotCtCor or clusterCt to see if
>> they group as expected, or if there's any marked difference in the
>> distribution of Ct values (plotCtDensity)? Even a relatively harsh method
>> such as quantile normalisation might be suitable for your data.
>>
>> Cheers
>> \Heidi
>>
>>
>> > Dear Heidi,
>> >
>> > thanks for the quick reply,
>> >
>> > after traceback() I get
>> >
>> > traceback()
>> > 5: stop("need at least four unique 'x' values")
>> > 4: smooth.spline(ref[i.set], data[i.set])
>> > 3: FUN(newX[, i], ...)
>> > 2: apply(data, 2, normalize.invariantset, ref = ref.data)
>> > 1: normalizeCtData(raw.cat, norm = "scale.rank")
>> >
>> > information about the session
>> > sessionInfo()
>> > R version 2.11.1 (2010-05-31)
>> > i386-apple-darwin9.8.0
>> >
>> > locale:
>> > [1] C
>> >
>> > attached base packages:
>> > [1] stats graphics grDevices utils datasets methods base
>> >
>> > other attached packages:
>> > [1] statmod_1.4.8 HTqPCR_1.2.0 limma_3.4.4
>> > RColorBrewer_1.0-2 Biobase_2.8.0
>> >
>> > loaded via a namespace (and not attached):
>> > [1] affy_1.26.1 affyio_1.16.0 gdata_2.7.2
>> > gplots_2.8.0 gtools_2.6.2 preprocessCore_1.10.0
>> >
>> > On Fri, Jan 21, 2011 at 5:41 PM, Heidi Dvinge <heidi at ebi.ac.uk> wrote:
>> >
>> >> Dear Andreia,
>> >>
>> >> > Dear all,
>> >> >
>> >> > I am analysing qPCR data from Exiqon where I have one card per
>> >> > sample; in each card I have one observation for each miRNA. I have
>> >> > in total 8 cards: 2 for treatment 1, 3 for treatment 2 and 3 for
>> >> > treatment 3. Each card has one endogenous gene, which I wouldn't
>> >> > like to use to normalize Ct values because it is being affected by
>> >> > the type of treatment. So I would like to use scale.rank.
>> >> > I am getting the following error:
>> >> >
>> >> > sr.norm <- normalizeCtData(raw.cat, norm = "scale.rank")
>> >> > Error in smooth.spline(ref[i.set], data[i.set]) :
>> >> > need at least four unique 'x' values
>> >> >
>> >> It sounds like there aren't enough rank-invariant genes across your 8
>> >> cards. If that's the case, then this is admittedly not the most useful
>> >> error message, and it should be changed. What does it say when you run
>> >> traceback() following the error?
>> >>
>> >> The parameter "scale.rank.samples" in normalizeCtData() will let you
>> >> set how many of the samples each gene has to be rank-invariant across
>> >> in order to be excluded. Per default this is the number of samples-1.
>> >> You can try lowering that number, although keeping in mind that the
>> >> lower it is, the less robust your resulting rank-invariant genes are.
>> >> If your samples are all highly variable across all genes, it might not
>> >> be possible for you to use this normalisation method.
>> >>
>> >> If this does not seem to be the problem, something else might be going
>> >> on with the function. In that case, please report back here and I can
>> >> perhaps have a look at your data.
>> >>
>> >> I have been considering adding an additional parameter to
>> >> normalizeCtData, so that genes just have to be rank-invariant within a
>> >> certain interval, e.g. be located within -/+5 of each other on the
>> >> ranked list. For rather low-throughput qPCR cards that could mess
>> >> things up though.
>> >>
>> >> HTH
>> >> \Heidi
>> >>
>> >> > Does this mean I don't have enough replicates?
>> >> >
>> >> > thanks for the help
>> >> >
>> >> > Andreia
>> >> >
>> >> > --
>> >> > --------------------------------------------
>> >> > Andreia J. Amaral
>> >> > Unidade de Imunologia Clínica
>> >> > Instituto de Medicina Molecular
>> >> > Universidade de Lisboa
>> >> > email: andreiaamaral at fm.ul.pt
>> >> > andreia.fonseca at gmail.com
>> >> >