I have two questions. I am new to this area. I have EPIC array data from 4 different cells with 2 replicates each, and I have to use the minfi package. I have loaded my IDAT files into the environment. I checked dozens of workflows and all of them used a sample sheet, but there is no sample sheet for my data. Should I create one somehow, or should I start analyzing without it? If I should create one, how can I do it? My second question is how to handle the replicates. I will continue with bumphunter and limma for differential methylation. At what point should the replicates be merged, if they should be merged at all? Thanks in advance.
> RGset
class: RGChannelSet
dim: 1105209 8
metadata(0):
assays(2): Green Red
rownames(1105209): 1600157 1600179 ... 99810982 99810990
rowData names(0):
colnames(8): 207716530101_R01C01 207716530101_R02C01 ... 207716530101_R07C01
207716530101_R08C01
colData names(14): X Date ... Basename filenames
Annotation
array: IlluminaHumanMethylationEPIC
annotation: ilm10b4.hg19
You don't need a sample sheet. It is just a simple way to identify the IDAT files and possibly include sample information. But you appear to have read the data in already, so you are good to go.
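If sample annotation is wanted anyway, a sample sheet is essentially a table with one row per array. The sketch below builds one in base R. The array names follow the pattern visible in the RGset printout (the positions hidden behind the "..." are assumed to be R03 to R06), and the CellLine and Plate labels are hypothetical placeholders that should be replaced with the real ones.

```r
# Minimal sample-sheet sketch in base R.
# Assumption: the 8 arrays are 207716530101_R01C01 ... R08C01.
# Assumption: arrays 1-4 come from plate 1 and arrays 5-8 from plate 2,
# with the four cells in the same order on each plate (placeholder labels).
targets <- data.frame(
  Basename = sprintf("207716530101_R%02dC01", 1:8),
  CellLine = rep(c("lineA", "lineB", "lineC", "lineD"), times = 2),
  Plate    = rep(c("plate1", "plate2"), each = 4),
  stringsAsFactors = FALSE
)
print(targets)

# Optional: save it next to the IDAT files for later use.
# write.csv(targets, "SampleSheet.csv", row.names = FALSE)

# With minfi loaded, the labels could be attached to the existing object
# instead of re-reading the data (not run here):
# RGset$CellLine <- targets$CellLine[match(colnames(RGset), targets$Basename)]
# RGset$Plate    <- targets$Plate[match(colnames(RGset), targets$Basename)]
```

Matching on colnames(RGset) rather than relying on row order keeps the labels aligned even if the arrays were read in a different order.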
It is not clear what you mean by replicates. Are these technical replicates or something more like a biological replicate? Ideally you would have biological replicates so your results would be more likely to generalize to the underlying population, but perhaps you have replicates that are simply different plates of cells? In that case your replicates are more like technical replicates and you could either collapse to a single value by taking the mean of the M-values for each replicate (using e.g., avearrays from limma, which averages replicate arrays; avereps does the same for replicate probes), or perhaps by fitting a GLS model by specifying the within-replicate correlation (using duplicateCorrelation).
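As a rough illustration of the first option, the sketch below averages replicate columns of a toy M-value matrix in base R. The matrix, the probe names, and the cell labels are made up for the example. To my knowledge limma's avearrays does the same column-wise averaging, so it is called only if limma is installed. The duplicateCorrelation route is shown as comments, with `design` and `block_id` standing in for objects you would define yourself.

```r
set.seed(1)

# Toy M-value matrix: 5 probes x 8 arrays (random numbers, not real data).
M <- matrix(rnorm(40), nrow = 5,
            dimnames = list(paste0("cg", 1:5), sprintf("R%02dC01", 1:8)))

# Hypothetical labels: arrays 1 and 5 are the same cells, 2 and 6, and so on.
cell_line <- rep(c("lineA", "lineB", "lineC", "lineD"), times = 2)

# Option 1: collapse replicate arrays to one column per cell line.
M_mean <- sapply(split(seq_along(cell_line), cell_line),
                 function(idx) rowMeans(M[, idx, drop = FALSE]))
print(dim(M_mean))  # probes x cell lines

# The same averaging via limma, if the package is available.
if (requireNamespace("limma", quietly = TRUE)) {
  M_mean_limma <- limma::avearrays(M, ID = cell_line)
}

# Option 2 (not run): keep all arrays and model the within-block correlation.
# `design` and `block_id` are placeholders for your own design matrix and
# the factor that identifies which arrays are replicates of each other.
# corfit <- limma::duplicateCorrelation(M, design, block = block_id)
# fit <- limma::lmFit(M, design, block = block_id,
#                     correlation = corfit$consensus.correlation)
```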
Thank you so much for the clarification. As you said, the replicates are cells from different plates. For instance, both R01C01 and R05C01 are the SUP-B15 cell line grown in different plates, and R02C01 and R06C01 are from a resistant cell line derived from SUP-B15. I want to use preprocessFunnorm for normalization, and then find differentially methylated regions between the resistant and the sensitive cell lines using bumphunter. After that, I want to use limma. How should I set up the design matrix?
You can use model.matrix to generate the design matrix. See examples in the 'limma User's Guide' or the vignette for the DMRcate package.
Thanks a lot for the help. You saved my day.
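For anyone landing on this thread later, the model.matrix suggestion above might look roughly like the sketch below. Only SUP-B15 and its resistant derivative are named in the thread, so "lineC" and "lineD" are hypothetical placeholders, and the array order (R01 to R08, with the same four cells on each plate) is an assumption. The minfi and limma steps are left as comments because they need the real data, and the bumphunter cutoff value is illustrative only.

```r
# Group labels, one per array, in column order of the data.
cell_line <- factor(
  rep(c("SUPB15", "SUPB15_res", "lineC", "lineD"), times = 2),
  levels = c("SUPB15", "SUPB15_res", "lineC", "lineD")
)

# Group-means design: one column per cell line, convenient for limma contrasts.
design <- model.matrix(~ 0 + cell_line)
colnames(design) <- levels(cell_line)
rownames(design) <- sprintf("207716530101_R%02dC01", 1:8)
print(design)

# Treatment-coded design: column 2 is resistant minus sensitive SUP-B15,
# which suits functions that test a single coefficient.
design_bh <- model.matrix(~ cell_line)

# Contrast for limma, built only if limma is installed.
if (requireNamespace("limma", quietly = TRUE)) {
  contrast <- limma::makeContrasts(ResVsSens = SUPB15_res - SUPB15,
                                   levels = design)
}

# Not run: the downstream steps on the real object.
# GRset <- preprocessFunnorm(RGset)
# dmrs  <- bumphunter(GRset, design = design_bh, coef = 2,
#                     cutoff = 0.2, B = 0, type = "Beta")
# fit <- limma::lmFit(getM(GRset), design)
# fit <- limma::eBayes(limma::contrasts.fit(fit, contrast))
# limma::topTable(fit, coef = "ResVsSens")
```

Using two parameterizations is a convenience, not a requirement: both describe the same model, and the choice only changes which comparisons appear directly as coefficients.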