Hello Vicencio,
> Dear list,
>
> I'm having some difficulties in using HTqPCR to analyze qPCR data
> obtained using the Biomark Fluidigm 96.96 array.
>
Interesting question. For a while I was toying with the idea of
incorporating functions specifically for Fluidigm data into HTqPCR. I
never went through with it though, since each individual Fluidigm array
can have its own design, so it's not necessarily common across samples
the way it is for e.g. ABI and Roche cards. Nevertheless, it should be
possible to use HTqPCR for Fluidigm data.
> With the Fluidigm chips, one can measure expression of 96 genes in 96
> samples on one plate, i.e. 9216 PCRs per plate (see
> http://www.fluidigm.com/products/biomark-chips.html for details).
>
> In my experiment, I use 9 such plates. On each plate I have 88
> different experimental samples, with different samples on each plate,
> totalling 792 unique experimental samples (associated with specific
> experimental conditions). On each plate, I also have 8 standard samples
> that are the same across all plates (1 NTC, 1 cDNA mix +RT, 1 cDNA mix
> -RT, 5 samples of a dilution series).
>
>
> I use 32 different genes (features), each replicated 3 times, in the
> same order on each plate.
>
> Each original data file (as exported by the Fluidigm software) has data
> on one plate, i.e. 9216 rows with one PCR per row, with columns for
> sample name, feature name, Ct, quality calls, etc.
> I managed to read in the 9 data files (from 9 plates) into one qPCRset
> object:
> An object of class "qPCRset"
> Size: 96 features, 864 samples
> Feature types: Reference, Test
> Feature names: 1.BGRP 1.BGRP 1.BGRP ...
> Feature classes:
> Feature categories: OK, Undetermined
> Sample names: A1.1h A1.6a A1.11h ...
>
> Is this a good way to structure my data? Or would it be better to
> create 9 qPCRset objects (1 for each plate)? Before spending more effort
> continuing this approach I'd appreciate your opinion on whether this is
> the way forward.
>
It depends a bit on how clean your data is, and how you want to
preprocess it. If you suspect there are any array-specific effects at
all, you'll probably want to normalise your 9 plates separately, i.e.
have them in a qPCRset object with 96x96 = 9216 rows and 9 columns.
Do you have your data in a single file or in 9 files? Either way, you
can create such a qPCRset. Or, if you want to use the object you already
have loaded into R, you can split it up using something like this
(untested, and inelegant):
    q <- your_qPCRset
    # First column of each array: 1, 97, 193, ... (9 arrays x 96 samples)
    start <- seq(1, 9*96, 96)
    # Make a list of 9 individual 96x96 qPCRset objects
    q.list <- list()
    for (i in seq_along(start)) {
        q.list[[i]] <- q[, start[i]:(start[i]+95)]
    }
    # Convert each list entry from 96x96 to 9216x1 dimension qPCRset
    # by stacking its 96 single-sample columns on top of each other
    for (i in seq_along(q.list)) {
        temp <- list()
        for (j in 1:96) {
            temp[[j]] <- q.list[[i]][, j]
        }
        q.list[[i]] <- do.call("rbind", temp)
    }
    # Join them all together into 9216x9
    q.new <- do.call("cbind", q.list)
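To make the intended reshaping concrete, here is a minimal sketch on a plain numeric matrix in base R, not on a qPCRset. The toy dimensions (4 features, 3 samples, 2 plates) and the names ct, plate and ct.long are made up for illustration; the real data would be 96 features x 864 samples going to 9216 x 9.

```r
# Toy stand-in for 96 features x (96 samples x 9 plates)
n.feat <- 4; n.samp <- 3; n.plate <- 2
ct <- matrix(seq_len(n.feat * n.samp * n.plate),
             nrow = n.feat, ncol = n.samp * n.plate)

# Columns 1..n.samp belong to plate 1, the next n.samp to plate 2, etc.
plate <- rep(seq_len(n.plate), each = n.samp)

# For each plate, stack its sample columns into one long column
ct.long <- sapply(split(seq_len(ncol(ct)), plate),
                  function(cols) as.vector(ct[, cols]))

dim(ct.long)  # (n.feat * n.samp) rows, n.plate columns
```

The stacking order is sample by sample within a plate, which is the same order the rbind loop above is meant to produce.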
A bit of data exploration is probably required to check whether you have
any particular biases that need correcting in your data. Based on the
qPCRset object you have now, you can e.g. try clustering your data using
clusterCt(), and see if the samples, especially the controls, cluster
together by sample type or based on what array they were run on. Also,
what's the correlation between samples like (plotCtCor())?
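As a rough illustration of what such a check looks for, here is a sketch in base R using hclust() on simulated values. It does not use clusterCt() or plotCtCor(); the matrix, the plate labels and the artificial shift of 10 cycles are all assumptions made up so that the plate effect is visible.

```r
set.seed(1)
# Simulated Ct matrix: 20 features x 6 samples, 3 samples per plate
ct <- matrix(rnorm(20 * 6, mean = 25, sd = 1), nrow = 20,
             dimnames = list(NULL, paste0("s", 1:6)))
plate <- rep(c("plate1", "plate2"), each = 3)

# Add an artificial plate-wide shift to mimic an array-specific bias
ct[, plate == "plate2"] <- ct[, plate == "plate2"] + 10

# Cluster the samples (columns) on Euclidean distance
hc <- hclust(dist(t(ct)))
cl <- cutree(hc, k = 2)

# Cross-tabulate cluster membership against plate of origin
table(cluster = cl, plate = plate)
```

If samples group by plate rather than by sample type in a table like this, that would point to a plate effect worth correcting before testing.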
By the time you get to the actual statistical testing you'd want your
data in a format like the one you have now, i.e. 1 row per gene (3 rows
per gene in your case due to your replicates) and 1 column per sample. If
you start with 9216 rows x 9 columns for doing the normalisation, you can
reformat the data afterwards using the changeCtLayout() function.
>
> Among other things, I would like to do the following:
> 1. Check for spatial effects. When I use plotCtCard, it only plots one
> sample at a time, even though I have 96 samples on each plate. Is it
> possible to plot my 96 samples x 96 features? How can I specify this
> kind of layout?
To plot each array separately, you'd need to have each array in a single
column!
Note though that plotCtCard is optimised for the standard-size
rectangular well plate. Aesthetically speaking it might not look so nice
for a 96x96 square array. I started making a plotCtArray function for
Fluidigm data at some point; let me know if you're keen to be a guinea
pig.
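In the meantime, a square plate can be drawn with base R graphics. The sketch below is not plotCtCard or plotCtArray; it assumes the Ct values of one plate are ordered sample by sample, with all features of a sample listed together, and it uses a toy 4x4 grid with made-up values. Whether the feature x sample grid matches the physical chamber layout on the chip is also an assumption.

```r
n <- 4  # toy size; a real 96.96 plate would use n <- 96
ct.plate <- seq(15, 30, length.out = n * n)  # made-up Ct values

# Fold the long vector into a features x samples grid
ct.grid <- matrix(ct.plate, nrow = n, ncol = n,
                  dimnames = list(paste0("feature", 1:n),
                                  paste0("sample", 1:n)))

# Heatmap with samples on the x axis and features on the y axis
image(x = 1:n, y = 1:n, z = t(ct.grid),
      xlab = "Sample", ylab = "Feature", main = "Ct values, one plate")
```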
> 2. Control for plate-specific effects. I have the same 8 standard
> samples on each plate (for all genes), and would like to use these
> repeated measurements to 'normalize' all other data across plates.
> However, I'm having a hard time even accessing and plotting the data.
For using these 8 control genes for normalisation you can use the
function normalizeCtData(q, norm = "deltaCt", deltaCt.genes) where
deltaCt.genes is a vector of the gene names you want to use as standard.
Note that these 8 gene names must appear exactly as they are in
featureNames(q).
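For reference, the arithmetic behind a deltaCt-style normalisation can be sketched in base R as below. This is a hand-rolled illustration on a plain matrix, not the normalizeCtData() implementation; the gene names ref1, ref2, geneA, geneB and the Ct values are made up.

```r
ct <- matrix(c(20, 21, 25, 30,
               22, 23, 27, 31),
             nrow = 4,
             dimnames = list(c("ref1", "ref2", "geneA", "geneB"),
                             c("sample1", "sample2")))
ref.genes <- c("ref1", "ref2")  # hypothetical reference gene names

# Pick reference rows by name; %in% also copes with repeated names
is.ref <- rownames(ct) %in% ref.genes

# Mean Ct of the reference genes, computed per sample (column)
ref.mean <- colMeans(ct[is.ref, , drop = FALSE])

# Subtract each sample's reference mean from all of its Ct values
delta.ct <- sweep(ct, 2, ref.mean, "-")
delta.ct
```

The exact-match requirement mentioned above shows up here too: a name in ref.genes that is misspelled relative to the row names simply selects nothing.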
> 3. Specify technical replicates. Each sample has been run on 32 genes
> in triplicate. Each feature name is represented 3 times (once for each
> technical rep). How can I specify that my 96 features are grouped per
> 3?
You don't have to specify technical replicates directly anywhere within
your qPCRset objects. Several functions, such as ttestCtData, have a
parameter "replicates" which can be set to TRUE if you want to consider
replicates. If so, the function(s) combine data across genes that have
identical featureNames.
A small note here: featureNames don't have to be unique; in fact it's
often easier for downstream analysis if identical genes are named the
same, and not e.g. gene1_rep1, gene1_rep2 etc. The way to tell them apart
is then using the featurePos information. This corresponds to the
location of each gene on the array, or pos1...pos9216 if no positional
information is supplied to readCtData. The output from e.g. ttestCtData
will report both the featureNames and featurePos, so even for replicates
you can always trace each result back to the original value.
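To show the idea of grouping rows by identical names, here is a base R sketch that averages technical replicates in a plain matrix. Averaging is only one simple way of combining replicates and is an assumption here, not a description of what ttestCtData does internally; the gene names, values and the pos labels are made up.

```r
ct <- matrix(c(20, 21, 22, 30, 31, 32,
               24, 24, 24, 28, 29, 30),
             nrow = 6,
             dimnames = list(rep(c("geneA", "geneB"), each = 3),
                             c("sample1", "sample2")))

# Position labels keep every row traceable even with repeated names
pos <- paste0("pos", seq_len(nrow(ct)))

# Sum the rows that share a feature name ...
ct.sum <- rowsum(ct, group = rownames(ct))
# ... and divide by the number of replicates of each name
n.rep <- as.vector(table(rownames(ct))[rownames(ct.sum)])
ct.mean <- ct.sum / n.rep
ct.mean
```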
> 4. Add information about my experimental design. My 792 experimental
> samples were obtained in a full factorial design with several
> biological replicates per treatment. Is it possible to add extra data
> to my qPCR set object? E.g. a matrix containing, per sample,
> information on sample name, value for factor 1, value for factor 2,
> etc.?
>
I'm afraid there's no "optional" slot in qPCRset objects where users can
add additional data, whether that's data frames, matrices or lists.
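One possible workaround is to keep the design in a separate data frame and line it up with the sample names yourself. The sketch below is base R only; the sample name "A1.2b", the factor columns and their values are hypothetical, and in practice sample.names would be taken from the qPCRset (e.g. via sampleNames(q)).

```r
# Sample names in the order they appear as columns in the Ct data
sample.names <- c("A1.1h", "A1.6a", "A1.11h", "A1.2b")

# Design table kept outside the qPCRset, in arbitrary row order
design <- data.frame(
  sample  = c("A1.6a", "A1.1h", "A1.2b", "A1.11h"),
  factor1 = c("high", "low", "high", "low"),
  factor2 = c("wet", "wet", "dry", "dry"),
  stringsAsFactors = FALSE
)

# Reorder the design rows to match the column order of the Ct data
design <- design[match(sample.names, design$sample), ]
stopifnot(identical(design$sample, sample.names))

# One group label per sample, usable as a grouping vector later on
groups <- interaction(design$factor1, design$factor2)
groups
```

The stopifnot() line is the important part: it fails loudly if the design table and the Ct columns ever get out of step.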
HTH
\Heidi
> I do understand that this package was not developed specifically for
> dealing with data from these Fluidigm chips, but I haven't found any
> such package and as far as I know HTqPCR is the best package around for
> analysis of high-throughput qPCR data.
>
> I hope someone can help me out a little bit. I'm new to R, but I'm not
> asking you to do my work for me, just some directions to help me do it
> myself. Thanks!
>
> Cheers,
> Vicencio