I have a total of 8 samples: 4 controls and 4 samples overexpressing the Foxcut gene. I have a data frame, data,
with genes as rows, samples as columns, and raw counts as values.
The column data for all 8 samples, with replicate and cell-line information, look like this:
Samples TYPE Replicate Cell-lines
Cell1_HA1 Control 1 1
Cell1_HA2 Control 2 1
Cell1_foxcut11 FOXCUT_OverExpression 1 1
Cell1_foxcut12 FOXCUT_OverExpression 2 1
Cell2_HA1 Control 3 2
Cell2_HA2 Control 4 2
Cell2_foxcut11 FOXCUT_OverExpression 3 2
Cell2_foxcut12 FOXCUT_OverExpression 4 2
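For anyone who wants to reproduce the layout, the sample table above can be typed in as an R data frame. This is only a sketch: the object name coldata matches the code in the rest of this post, while check.names = FALSE is an assumption made here so that the Cell-lines header keeps its hyphen.

```r
# Sample annotation table, typed in by hand from the listing above.
# check.names = FALSE keeps the "Cell-lines" header unchanged (assumption:
# the real table uses exactly these column names).
coldata <- data.frame(
  Samples = c("Cell1_HA1", "Cell1_HA2", "Cell1_foxcut11", "Cell1_foxcut12",
              "Cell2_HA1", "Cell2_HA2", "Cell2_foxcut11", "Cell2_foxcut12"),
  TYPE = rep(rep(c("Control", "FOXCUT_OverExpression"), each = 2), times = 2),
  Replicate = c(1, 2, 1, 2, 3, 4, 3, 4),
  `Cell-lines` = rep(c(1, 2), each = 4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Cross-tabulate treatment against cell line to see the group sizes
table(coldata$TYPE, coldata$`Cell-lines`)
```

Because the column name contains a hyphen, it has to be quoted with backticks whenever it is accessed with $.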
I have count data for all 8 samples after STAR alignment. I'm using the edgeR package for differential analysis. This is the first time I'm doing differential analysis on cell-line data with replicate information. I don't know how to create the design matrix and contrast.matrix for differential analysis within samples from the same cell line.
I want to make the following comparisons in the differential analysis:
Cell1_foxcut samples vs Cell1_HA samples
Cell2_foxcut samples vs Cell2_HA samples
I tried the following, but I'm not sure whether it is right:
colnames(data) %in% coldata$Samples
coldata <- coldata[match(colnames(data), coldata$Samples),]
table(coldata$TYPE)
library(edgeR)
group <- factor(coldata$TYPE)
y <- DGEList(data,group = group)
y$samples
## Filtering
keep <- rowSums(cpm(y) > 0.5) >= 1
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y,method = "TMM") ##Normalization
## Create design matrix
design2 <- model.matrix(~ 0 + group + coldata$Replicate + coldata$`Cell-lines`)
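One quick sanity check on a design like this is to look at its column names and its rank. The sketch below rebuilds the same formula from a hand-typed copy of the sample table, using base R only; the names type, replicate, and cell_line are placeholders introduced here, not objects from the script above. With the replicate and cell-line columns left as numbers, model.matrix() treats them as continuous covariates. With both converted to factors, the matrix is no longer full rank, because replicates 1 and 2 occur only in cell line 1 and replicates 3 and 4 only in cell line 2.

```r
# Hand-typed copy of the sample annotation (placeholder names, see above)
type <- factor(rep(rep(c("Control", "FOXCUT_OverExpression"), each = 2), times = 2))
replicate <- c(1, 2, 1, 2, 3, 4, 3, 4)
cell_line <- rep(c(1, 2), each = 4)

# Same formula as design2: numeric columns enter as single continuous covariates
design_numeric <- model.matrix(~ 0 + type + replicate + cell_line)
colnames(design_numeric)
qr(design_numeric)$rank == ncol(design_numeric)  # is the matrix full rank?

# Same formula with replicate and cell line as factors
design_factor <- model.matrix(~ 0 + type + factor(replicate) + factor(cell_line))
ncol(design_factor)      # number of coefficients requested
qr(design_factor)$rank   # number that can actually be estimated
```

When the rank is smaller than the number of columns, at least one coefficient cannot be estimated from the data.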
And how should I specify coef or the contrast.matrix for differential analysis between the different sample groups?
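For the two within-cell-line comparisons listed earlier, one commonly suggested layout is to merge cell line and treatment into a single grouping factor and to write each comparison as a contrast between two of its levels. The following is a minimal sketch under that assumption, using base R only. The names cell, type, group, design, and the two contrast vectors are illustrative, and nothing here has been checked against the real counts. In edgeR, vectors like these would typically be passed as the contrast argument of glmQLFTest() (or glmLRT()) after fitting with glmQLFit() (or glmFit()) on the same design.

```r
# One label per sample, in the same order as the sample table
cell <- rep(c("Cell1", "Cell2"), each = 4)
type <- rep(rep(c("HA", "foxcut"), each = 2), times = 2)

# Combined factor: one level per cell line / treatment combination
group <- factor(paste(cell, type, sep = "_"),
                levels = c("Cell1_HA", "Cell1_foxcut", "Cell2_HA", "Cell2_foxcut"))

# No-intercept design: one column (group mean) per level
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Contrast vectors: overexpression minus control within each cell line
cell1_fox_vs_ha <- c(Cell1_HA = -1, Cell1_foxcut = 1, Cell2_HA = 0, Cell2_foxcut = 0)
cell2_fox_vs_ha <- c(Cell1_HA = 0, Cell1_foxcut = 0, Cell2_HA = -1, Cell2_foxcut = 1)

# The contrast entries must line up with the design columns
stopifnot(identical(names(cell1_fox_vs_ha), colnames(design)))
stopifnot(identical(names(cell2_fox_vs_ha), colnames(design)))
```

With a no-intercept design of this kind, each coefficient is a group mean, so a comparison is a difference of two coefficients (a contrast) rather than a single coef.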
If my design matrix (design2) is not right, could you please show me how to set this up? I have read tutorials and many other questions but couldn't reach a conclusion, because I find this type of analysis confusing.
Thanks a lot.
@Gordon Could you please help me with my post? Thank you.