Hello Bioconductor community,
I downloaded .cel files from an Affymetrix U133A array study from NCBI GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE55235
Afterwards, to get more up-to-date annotation, I used the version 25 custom CDF files for GENCODE from the umich.edu Brainarray website and read in the data with the customCDF and affy packages from there:
http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download.asp
Afterwards, I used this tutorial/guide to perform gcrma() normalization:
https://www.biostars.org/p/61987/#62003
Finally, the 10 rheumatoid arthritis (RA) samples were compared with the 10 ND (healthy) samples using a t-test for single genes of interest, without any p-value adjustment or modeling: just the mean of the 10 RA samples vs. the mean of the 10 healthy samples for gene x, repeated for gene y, generating a t value and its corresponding p-value at an alpha of 0.05.
Is this correct or flawed? Could a housekeeping gene such as GAPDH be used to normalize, or at least to see how the expression of GAPDH (or another housekeeping gene) compares with that of gene x and gene y of interest?
As a final point, I did do the analysis as described in section 8.2 of the limma User's Guide. But I was wondering: would it be correct to do it the way it has already been done, with single genes and a t-test? And again, could a housekeeping gene be used as a reference for "baseline" expression of a common gene, to compare against gene x and gene y of interest?
Thank you very much in advance.
Respectfully, Pratik
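To make the comparison in this question concrete, here is a minimal sketch of the two approaches side by side. It uses simulated log2 values rather than the GSE55235 data, so the matrix `expr`, the gene names, and the effect size are all assumptions made for illustration. It is not the poster's actual analysis. It runs a single-gene `t.test()` (Welch by default in R) and a limma fit on the same gene.

```r
# Sketch only: simulated data stands in for a gcrma-normalized matrix.
library(limma)

set.seed(1)
n_genes <- 200
group <- factor(rep(c("ND", "RA"), each = 10), levels = c("ND", "RA"))

# Simulated log2 expression: genes in rows, 20 samples in columns
expr <- matrix(rnorm(n_genes * 20, mean = 8, sd = 0.5), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("s", 1:20)))
# Shift one hypothetical gene upward in the RA group
expr["gene1", group == "RA"] <- expr["gene1", group == "RA"] + 2

# Approach 1: ordinary t-test on one gene, using only that gene's variance
tt <- t.test(expr["gene1", group == "RA"], expr["gene1", group == "ND"])

# Approach 2: limma, which fits every gene and moderates each gene's
# variance estimate using information from all genes
design <- model.matrix(~ group)            # columns: (Intercept), groupRA
fit <- eBayes(lmFit(expr, design))
res <- topTable(fit, coef = "groupRA", number = Inf, sort.by = "none")

# Same difference in means, different test statistics and p-values;
# adj.P.Val additionally accounts for testing many genes.
tt$p.value
res["gene1", c("logFC", "t", "P.Value", "adj.P.Val")]
```

Both approaches report the same difference in group means for a given gene. What differs is the variance estimate behind the test statistic, and limma's table also includes a multiple-testing adjusted p-value.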
Thank you very much for your answer.
If I am understanding correctly (and maybe extrapolating a little bit here), would plotting the mean normalized GCRMA values for gene X in healthy and disease be acceptable (through a histogram)? And then, as a replacement for the t-test (for determining statistical significance) that has already been done (using single genes), would I instead use limma's output with confint=TRUE when running topTable in limma, as suggested here: "Display error (error bars) for fold-change estimate from replicates in edgeR" and here: "Confidence intervals on edgeR logFC" (to use for adding error bars on the above histogram)?
The objective here is to make sure the analysis is conventionally correct before submitting the manuscript. I provided the gcrma-normalized values to a colleague about 2 years ago, and a histogram was made with the mean gcrma-normalized values for gene of interest X, but with error bars taken from the single-gene t-test (as described in the original question post). I just want to make sure that what we are submitting now is solid evidence for, hopefully, the truth or something closer to it.
Thank you again :)
Respectfully, Pratik
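For reference, a minimal sketch of what `confint=TRUE` gives in `topTable`. It again uses simulated data, with a hypothetical `gene1` standing in for gene X. The resulting `CI.L` and `CI.R` columns bound the log2 fold change (RA minus ND), not the individual group means, so they fit a fold-change plot rather than a plot of per-group means.

```r
# Sketch only: simulated log2 values, not the actual study data.
library(limma)

set.seed(2)
group <- factor(rep(c("ND", "RA"), each = 10), levels = c("ND", "RA"))
expr <- matrix(rnorm(100 * 20, mean = 8, sd = 0.5), nrow = 100,
               dimnames = list(paste0("gene", 1:100), NULL))
# Hypothetical gene of interest with a shift in the RA group
expr["gene1", group == "RA"] <- expr["gene1", group == "RA"] + 1.5

fit <- eBayes(lmFit(expr, model.matrix(~ group)))

# confint = TRUE adds CI.L and CI.R: a 95% interval for logFC
tab <- topTable(fit, coef = "groupRA", number = Inf,
                sort.by = "none", confint = TRUE)
gene <- tab["gene1", c("logFC", "CI.L", "CI.R")]

# Bar for the log2 fold change with the limma interval as error bars
mid <- barplot(gene$logFC, names.arg = "gene1",
               ylab = "log2 fold change (RA vs ND)",
               ylim = range(0, gene$CI.L, gene$CI.R))
arrows(mid, gene$CI.L, mid, gene$CI.R, angle = 90, code = 3, length = 0.1)
```

Passing a number between 0 and 1 to `confint` instead of `TRUE` requests a different confidence level.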
Your questions are off-topic. I am unable to say what you should plot for a manuscript. As the author, that is entirely up to you.
Using limma is about as conventionally correct as you can get for microarray analyses. Over 22K citations is probably sufficient for any journal, I would imagine.