ecg1g15
I am working with a set of samples and around 4000 genes expressed across them (there is no control group).
- I have a counts data frame with samples as columns and genes as rows.
- I have a coldata data frame with samples as rows and observations as columns.
I have plotted a PCA using plotPCA from DESeq2:
library(DESeq2)
library(ggplot2)
# Quick look with the default DESeq2 plot
plotPCA(vsd, intgroup=c("condition", "type"))
# Same PCA, but return the per-sample coordinates for a custom ggplot
pcaData <- plotPCA(vsd, intgroup=c("condition", "type"), returnData=TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))
ggplot(pcaData, aes(PC1, PC2, color=condition, shape=type)) +
  geom_point(size=3) +
  xlab(paste0("PC1: ",percentVar[1],"% variance")) +
  ylab(paste0("PC2: ",percentVar[2],"% variance")) +
  coord_fixed()
I would like to draw a biplot on top of this PCA, with arrows for the environmental variables in my coldata pointing in the direction in which they explain the variability.
- Is there a way to overlay the eigenvectors, or the factors from my "coldata", that potentially explain the variability of my samples in the PCA?
- What format does my data need to be in (i.e. numeric only, transposed, coldata merged with the count data)?
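To make the data-format question concrete, here is a minimal base R sketch of one way to overlay sample-level variables on a PCA. It is an illustration under stated assumptions, not the method plotPCA or PCAtools uses internally: the matrix and the variables "temperature" and "salinity" are simulated, hypothetical stand-ins, and the arrows are simply the correlations of each numeric variable with the PC1 and PC2 sample scores. The layout it assumes is a numeric matrix with genes as rows and samples as columns, plus a numeric data frame with one row per sample whose row names match the matrix column names.

```r
set.seed(1)

# Simulated stand-in for the transformed expression matrix (genes x samples).
# With real data this would be something like assay(vsd).
n_genes <- 200
n_samples <- 12
mat <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("sample", seq_len(n_samples))))

# Hypothetical numeric environmental variables, one row per sample.
env <- data.frame(temperature = rnorm(n_samples, mean = 15, sd = 3),
                  salinity    = rnorm(n_samples, mean = 35, sd = 1),
                  row.names   = colnames(mat))

# Sample order must agree between the two objects.
stopifnot(identical(rownames(env), colnames(mat)))

# prcomp() expects observations (samples) as rows, hence the transpose.
pc <- prcomp(t(mat), center = TRUE, scale. = FALSE)
scores <- pc$x[, 1:2]

# Correlation of each variable with PC1 and PC2 gives the arrow direction.
arrows_xy <- cor(env, scores)

# Stretch the arrows so they are visible on the scale of the scores.
mult <- 0.8 * max(abs(scores))
plot(scores, pch = 19, asp = 1, xlab = "PC1", ylab = "PC2")
arrows(0, 0, arrows_xy[, 1] * mult, arrows_xy[, 2] * mult,
       length = 0.1, col = "red")
text(arrows_xy[, 1] * mult, arrows_xy[, 2] * mult,
     labels = rownames(arrows_xy), pos = 3, col = "red")
```

Only numeric variables can be drawn as arrows this way; categorical columns such as condition or type are usually shown through colour or shape instead, as in the ggplot call above.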
I tried PCAtools, but the call fails:
library(PCAtools)
biplot(pcaData)
# Error in nrow(y) : argument "y" is missing, with no default
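For context on the failing call: the biplot function in PCAtools expects the object returned by PCAtools::pca(), whereas pcaData is the plain data frame of coordinates returned by plotPCA. A hedged sketch of the PCAtools route follows. The data are simulated so the block runs on its own, and "temperature" and "salinity" are hypothetical column names; with the objects from this post, the call would presumably be pca(assay(vsd), metadata = colData(vsd)), provided the row names of the metadata match the column names of the matrix.

```r
library(PCAtools)

set.seed(1)

# Simulated stand-in for assay(vsd): genes as rows, samples as columns.
mat <- matrix(rnorm(500 * 12), nrow = 500,
              dimnames = list(paste0("gene", 1:500),
                              paste0("sample", 1:12)))

# Simulated stand-in for colData(vsd): one row per sample,
# row names identical to colnames(mat). Column names are hypothetical.
meta <- data.frame(condition   = rep(c("A", "B"), each = 6),
                   temperature = rnorm(12, mean = 15, sd = 3),
                   salinity    = rnorm(12, mean = 35, sd = 1),
                   row.names   = colnames(mat))

# Run the PCA on the matrix itself, not on the plotPCA output.
p <- pca(mat, metadata = meta)

# Sample biplot coloured by a metadata column.
biplot(p, colby = "condition")

# Correlate the first principal components with numeric metadata.
eigencorplot(p,
             components = getComponents(p, 1:5),
             metavars = c("temperature", "salinity"))
```

Note that plotPCA keeps only the 500 most variable genes by default (ntop = 500), so the components from pca() on the full matrix may not match the plotPCA figure exactly.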
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.4