You can do it with base R. One way is to first transpose the TPM matrix (assuming you want to run PCA on the samples rather than the genes), center it, run an SVD, and then plot the first and second columns of the u matrix (assuming you are interested in the first and second principal components). Alternatively, use the prcomp() function instead of SVD and plot the first and second columns of the x matrix. Both should yield the same pattern (the values will differ, because x is u scaled by the singular values, but the pattern will not).
As for the question at hand, you could use the base R plot() function or ggplot2::ggplot(). I prefer the latter, as it makes a lot of nice plots once you get the hang of it.
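To make the two routes concrete, here is a minimal sketch on simulated data. The matrix `tpm_mat` and its dimensions are made up for illustration; substitute your own genes-by-samples TPM matrix. It checks that the prcomp() scores match the SVD u matrix once u is scaled by the singular values (signs can flip per component, hence the abs()).

```r
# Hypothetical toy TPM matrix: 100 genes (rows) x 6 samples (columns)
set.seed(1)
tpm_mat <- matrix(rexp(600, rate = 0.1), nrow = 100, ncol = 6,
                  dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6)))

# Transpose so that samples are rows, then center each gene (column)
centered <- scale(t(tpm_mat), center = TRUE, scale = FALSE)

# Route 1: SVD; the columns of u hold the unscaled sample coordinates
sv <- svd(centered)

# Route 2: prcomp; the x matrix holds the sample scores
pca <- prcomp(t(tpm_mat), center = TRUE, scale. = FALSE)

# Scale u by the singular values and compare with the prcomp scores,
# ignoring possible sign flips of individual components
scores_from_svd <- sv$u %*% diag(sv$d)
max_diff <- max(abs(abs(scores_from_svd[, 1:2]) - abs(unname(pca$x[, 1:2]))))
```

If `max_diff` is close to zero, plotting `sv$u[, 1:2]` or `pca$x[, 1:2]` shows the same arrangement of samples, only with different axis scales.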
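# base R: turn the sample names into integer colour codes
# (tpm is assumed to hold the svd() output)
cols <- as.numeric(as.factor(sample_names))
plot(tpm$u[,1], tpm$u[,2], col = cols, xlab = "PC1", ylab = "PC2")
# ggplot2: map the sample names to the point colour
library(ggplot2)
plot_df <- data.frame(PC1 = tpm$u[,1], PC2 = tpm$u[,2], Samples = sample_names)
ggplot(plot_df, aes(x = PC1, y = PC2, col = Samples)) +
  geom_point()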
Assuming you run this in R/RStudio, that should work. If not, you would need to set up a plotting device.
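For the case where no plot window is available (for example when running a script non-interactively), a rough sketch of setting up a file device could look like this. The output path and the plotted coordinates are placeholders, not values from the thread.

```r
# Hypothetical output file; replace with a path of your choice
out_file <- tempfile(fileext = ".pdf")

# Open a PDF device, draw the plot, then close the device to write the file
pdf(out_file, width = 6, height = 5)
plot(c(-1, 0, 1), c(0.5, -0.2, 0.1), xlab = "PC1", ylab = "PC2")
dev.off()
```

With ggplot2, the plot object has to be printed explicitly (print(p)) between pdf() and dev.off() when the code runs inside a script or function.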
Thanks Andy. Could you please show me how to add colors to this PCA plot based on sample names?
Thanks
Tanya
I would recommend you do some reading on plotting in R.
Example tutorials:
Hi Andy, what if I want to color the points by genotype or sex from my metadata? How would I do that?
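One possible way to do this, sketched under assumptions: the metadata lives in a data frame (here called `meta`, a made-up name) with one row per sample and columns `sample`, `genotype` and `sex`. The expression values below are simulated only so that the example runs on its own.

```r
# Simulated stand-in for a samples x genes matrix (hypothetical data)
set.seed(1)
expr <- matrix(rnorm(6 * 50), nrow = 6,
               dimnames = list(paste0("sample", 1:6), NULL))
pca <- prcomp(expr, center = TRUE)

# Hypothetical metadata table, one row per sample
meta <- data.frame(
  sample   = paste0("sample", 1:6),
  genotype = c("WT", "WT", "WT", "KO", "KO", "KO"),
  sex      = c("F", "M", "F", "M", "F", "M")
)

# Align the metadata to the PCA rows by sample name before combining
meta <- meta[match(rownames(pca$x), meta$sample), ]
plot_df <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], meta)

# Map genotype to colour and sex to point shape (only if ggplot2 is installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plot_df, aes(x = PC1, y = PC2, colour = genotype, shape = sex)) +
    geom_point(size = 3)
}
```

Typing `p` at the console (or calling print(p)) draws the plot. To color by sex instead, swap the two column names inside aes().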