Hello Simon,
your questions seems to have made it to the Bioconductor list with a
few
days delay, so I'm continuing our off-list discussion from last week
here
(everything copied in below).
Regarding the normalisation, it sounds like you might want to try
either
the "norm.rankinvariant" or "scale.rankinvariant" methods in
normalizeCtData. Both of them identifies a set of rank-invariant
genes,
i.e. don't rely on e.g. actin. "scale.rankinvariant" simply scales
each
qPCR card by a fixed factor identified based on the rank-invariant
genes.
"norm.rankinvariant" fits a smoothing line through the rank-invariant
genes, and then applies this smoothing to the rest of the genes.
The latter works best if there is more than a few rank-invariant
genes.
For a comparison of the data before/after normalisation, you can do a
simple scatterplot, similar to figure 7 in the vignette (also included
in
the examples in ?normalizeCtData). This gives you an idea of how each
method performs, and what the differences between them are.
Regarding how replicates are treated, then this depends on the
function.
Functions such as setCategory() for marking some genes as potentially
unreliable can account for technical replicates within each qPCR plate
by
setting replicates=TRUE. However, replicates from different qPCR
plates
are treated identically, whether they come from biological or
technical
sample replicates.
When testing for differential expression, technical vs biological
replicates can be accounted for using the function limmaCtData, which
incorporates funcitonality from the "limma" package. There are some
examples of this in the limma user manual, accessible via
limmaUsersGuide() in R, especially in section 8.2.
In your case that would involve calling duplicateCorrelation() first,
followed by limmaCtData(). However, if you give some more details
about
which samples you have + what type of replicates they are, we can
probably
help you construct you design matrix if you haven't used limma before.
HTH
\Heidi
======================
Off-list conversation copied in below, for the archives
======================
Hi Simon,
> Thanks Heidi,
> I posted my questions (plus a few others), on the forum as well.
hm, I seemed to have overlooked those in my Bioconductor email box. Do
you
know which date you sent your questions? I that case I can probably
chase
them up.
> So for
> the flags, HTqPCR needs them to be either "Passed" or "Flagged"?
Then you
> just substitute using rep as you suggest below?
Actually, the flags can be anything. For an example, I just used the
default qPCRset object that comes with HTqPCR. You can read in data
without specifying flags, and alter that later as you wish.
In the example I email you, the flag should be read in automatically:
test <- readCtData(files="1131135160.csv", # file name
+ n.features=2304, # if you use the 48.48 assay
+ flag=9, # the column containing flags
+ feature=5, # column containing gene names
+ type=6, # column containing probe type
+ Ct=7, # column containing the actual values
+ skip=24, # additional argument to read.table; the number of
lines to
skip before hte data starts. Probably different in your file!
+ sep=",") # additional argument to read.table; the data is ","
not
"\t" separated.
Warning message:
In readCtData(files = "1131135160.csv", n.features = 2304, flag = 9,
:
2304 gene names (rows) expected, got 2314
show(test)
An object of class "qPCRset"
Size: 2304 features, 1 samples
Feature types: NRC, Test
Feature names: T09F3_3_HK T09F3_3_HK F33H1_2_HK ...
Feature classes:
Feature categories: OK
Sample names: 1131135160 NA NA ...
head(flag(test))
1131135160
1 Pass
2 Pass
3 Pass
4 Pass
5 Pass
6 Pass
table(flag(test))
Fail Pass
835 1469
> Also, if you could respond to the post on the forum (with potential
> answers to the questions about technical versus biological, and
potential
> normalizers, that would be great).
I'm not exactly sure what your question here is. You say that you're
not
clear about technical versus biological reps. Do you mean how many you
need, whether to normalise one kind different from another or
something
else?
Cheers
\Heidi
>> Hi Simon,
>>
>> thanks for your interest in HTqPCR. In principle, there's no reason
why it
>> shouldn't be applicable to Fluidigm data. I used it briefly at some
point
>> for one of the 96.96 assays, and I've also had other questions from
people
>> on the Bioconductor list using HTqPCR for Fluidigm. I've been
meaning to
>> add a section about that to the HTqPCR vignette, but didn't have
time for
>> the before the previous bioconductor release.
>>
>> For specifying the flags, you can use the flag(<your_object>) <-
>> <your_flags> syntax. The flags have to be in a data.frame, but
apart from
>> that they can be anything. Although I must admit that I never
really
>> tested this... E.g. for changing just the first column, if flags
are
>> already present:
>>
>> data(qPCRraw)
>> head(flag(qPCRraw))
>> sample1 sample2 sample3 sample4 sample5 sample6
>> 1 Passed Passed Passed Flagged Passed Passed
>> 2 Passed Passed Passed Flagged Passed Passed
>> 3 Passed Passed Passed Passed Passed Passed
>> 4 Passed Passed Passed Passed Passed Passed
>> 5 Passed Passed Passed Passed Passed Passed
>> 6 Flagged Flagged Flagged Flagged Flagged Flagged
>> new.flags <- data.frame(rep("OK", 384))
>> flag(qPCRraw)[,1] <- new.flags
>> head(flag(qPCRraw))
>> sample1 sample2 sample3 sample4 sample5 sample6
>> 1 OK Passed Passed Flagged Passed Passed
>> 2 OK Passed Passed Flagged Passed Passed
>> 3 OK Passed Passed Passed Passed Passed
>> 4 OK Passed Passed Passed Passed Passed
>> 5 OK Passed Passed Passed Passed Passed
>> 6 OK Flagged Flagged Flagged Flagged Flagged
>>
>> Alternatively, it should be possible to read them into a qPCRset
object at
>> the same time as your data. My mac has done strange things to the
line
>> ending in your file, so this might not work directly for you, but
>> something along the lines of:
>>
>> test <- readCtData(files="1131135160.csv", # file name
>> n.features=2304, # if you use the 48.48 assay
>> flag=9, # the column containing flags
>> feature=5, # column containing gene names
>> type=6, # column containing probe type
>> Ct=7, # column containing the actual values
>> skip=24, # additional argument to read.table; the number of
lines to
>> skip before hte data starts. Probably different in your file!
>> sep=",") # additional argument to read.table; the data is ","
not "\t"
>> separated.
>>
>> HTH
>> \Heidi
>>> I'm trying to use the package developed by Heidi Dvinge (HTqPCR),
and
I've
>>> a couple of basic questions. The package seems very nice, however,
I'm
>>> using a high throughput real-time PCR system, the Biomark from
Fluidigm,
>>> which can carry out upto 40,000 PCR's in a single run
(approximately).
>>> Most of my runs use about 10,000 PCRs, with multiple technical
replicates,
>>> and occasionally I use several plates if I'm assying many samples,
and
>>> dozens of genes.
>>>
>>> Typically, I've been using the geNorm approach for identifying
potential
>>> normalizers in a suite of genes in any given biological paradigm,
as a
>>> priori its always best to identify a gene which doesnt change in
the
>>> question of interest (as opposed to using something like actin,
and
>>> assuming it doesnt change). Is there a way of identifying the
"best"
>>> normalizer in HTqPCR? Also, I'm having trouble specifying flags. I
>>> understand how to designate the appropriate column in the file as
the
flag
>>> column, however, I dont understand how to specify the pass/fail
criteria.
>>> In the Biomarks output, each PCR reaction is specified as "Pass"
or
>>> "Fail". I've been over the vignette for HTqPCR, and cant see how
to
define
>>> the "Faliure" criteria.
>>>
>>> In addition, I'm a bit fuzzy on how HtQPCR handles technical reps
as
>>> opposed to biological reps. There is some discussion of this in
the
>>> vignette, but I'm having trouble following it.
>>>
>>> Here are a few lines of our typical output from our Biomark
machine in
>>> case this helps.
>>>
>>>
>>> Chip Run Filename
>>>
>>> X:\Real-time studies\Biomark\HK_Mar2010\ChipRun.bml
>>>
>>>
>>>
>>>
>>>
>