Hi
I am trying to run a PCA on my samples. I generated the matrix using the tximport package. Transcript IDs are the rows and sample names are the columns.
txi <- tximport(files, type="salmon", tx2gene=NULL, ignoreTxVersion=TRUE, dropInfReps=TRUE, txOut=TRUE)
tpm <- txi$abundance[apply(txi$abundance, MARGIN = 1, FUN = function(x) sd(x) != 0), ]
tpm <- log2(tpm + 1)
tpm_centered <- t(tpm - rowMeans(tpm))
pca <- prcomp(tpm_centered, scale = TRUE, center = TRUE)
cols <- as.factor(as.numeric(colnames(tpm_centered)))
plot(pca$x[,1], pca$x[,2], xlab = "PC1", ylab = "PC2", main = "PCA replicate1", col = cols)
text(pca$x[,1], pca$x[,2], row.names(pca$x), cex = 0.5, pos = 3)
I have a couple of questions.
1. Is txi$abundance a good input for generating a PCA plot?
2. I am unable to get the points colored by sample.
Can someone please help me?
Thanks
Tanya
Hi Michael
I am now using the following code:
txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion=TRUE,dropInfReps=TRUE)
sampleTable <- data.frame(condition =samples$condition,time=factor(samples$time))
rownames(sampleTable) <- colnames(txi$counts)
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition+time)
dds <- dds[ rowSums(counts(dds)) > 1, ]
rld <- rlog(dds, blind = FALSE)
plotPCA(rld, intgroup = c("condition", "time"))
My conditions are wild type and mutant, and the time points are 0hr, 6hr and 24hr. I have 4 replicates for each of them. In the PCA plot the replicates are not grouping together. Do you think this is normal, or am I making a mistake?
Regards
Tanya
Make sure that the files and the sample table are in the same order. This is very important.
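One way to guard against a mismatch is to build both objects from the same sample sheet and then compare their names explicitly. Note that assigning rownames(sampleTable) <- colnames(txi$counts) only relabels the table; it does not verify the order. The following is a minimal sketch, not the poster's actual setup: the toy sample sheet, its column names (run, condition, time) and the "quants/<run>/quant.sf" layout are all assumptions made for illustration.

```r
# Hypothetical sample sheet; the column names `run`, `condition` and `time`
# are assumptions, not taken from the original post.
samples <- data.frame(
  run       = c("s1", "s2", "s3", "s4"),
  condition = c("WT", "WT", "mutant", "mutant"),
  time      = c("0hr", "6hr", "0hr", "6hr"),
  stringsAsFactors = FALSE
)

# Build the file paths from the sample sheet itself so both share one order.
# The directory layout "quants/<run>/quant.sf" is an assumption.
files <- file.path("quants", samples$run, "quant.sf")
names(files) <- samples$run

# Build the sample table from the same sheet, using the run IDs as row names.
sampleTable <- data.frame(
  condition = factor(samples$condition),
  time      = factor(samples$time),
  row.names = samples$run
)

# tximport names the columns of its matrices after names(files), so this is
# the same comparison you would make against colnames(txi$counts).
stopifnot(identical(names(files), rownames(sampleTable)))
```

If the check fails, reorder the sample table to match the files (for example sampleTable[names(files), ]) before calling DESeqDataSetFromTximport, rather than overwriting the row names.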