Dear all,
I would like to plot the correlation between my samples. I have a matrix with genes in rows and samples in columns, containing a normalized logCPM value for each gene. When I try to look at the correlation between 2 samples, I get an error:
x <- as.data.frame(t(log.cpm.norm[,1]))
y <- as.data.frame(t(log.cpm.norm[,2]))
correlation <- as.data.frame(cor(x, y,use="pairwise.complete.obs"))
plot(correlation)
Error in plot.new() : figure margins too large
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
Is it correct to transpose the columns?
Furthermore, are logCPM values usually used to compute correlations between samples?
To solve the margin problem I have tried to adjust the margins with:
par('mar')
Thank you for any suggestion.
Regards
What exactly are you trying to do? After the transpose, x and y are each a single row with one column per gene, so cor() treats every gene as a variable with just one observation. You are producing a matrix of pairwise correlations between single observations, which will be NA for all of those pairs (you can't compute a correlation from one observation per variable), and that is why plot() has nothing to draw. You could instead compute the correlation between the two samples, but that would be a single number. Neither of these things is interesting or illuminating.
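To make the difference concrete, here is a minimal sketch in base R. The matrix is simulated and only stands in for your log.cpm.norm (the dimensions, gene names and sample names are made up), so treat it as an illustration rather than a recipe for your data.

```r
# Simulated stand-in for log.cpm.norm: genes in rows, samples in columns.
# The values and dimnames are hypothetical, for illustration only.
set.seed(1)
log.cpm.norm <- matrix(rnorm(200 * 4, mean = 5), nrow = 200,
                       dimnames = list(paste0("gene", 1:200),
                                       paste0("sample", 1:4)))

# Transposing one column gives a 1 x 200 matrix, so cor() sees 200 variables
# with a single observation each and can only return NA.
x <- t(log.cpm.norm[, 1])
y <- t(log.cpm.norm[, 2])
na_cor <- suppressWarnings(cor(x, y, use = "pairwise.complete.obs"))

# Without the transpose, the two samples are compared across genes,
# which gives a single number.
r12 <- cor(log.cpm.norm[, 1], log.cpm.norm[, 2],
           use = "pairwise.complete.obs")

# Passing the whole matrix gives the samples x samples correlation matrix.
sample_cor <- cor(log.cpm.norm, use = "pairwise.complete.obs")

# A scatterplot of one sample against the other is what can be plotted here.
plot(log.cpm.norm[, 1], log.cpm.norm[, 2],
     xlab = "sample1 (logCPM)", ylab = "sample2 (logCPM)")
```

So the transpose is not needed: cor() already correlates columns, and your samples are the columns.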
Are you perhaps trying to show which samples are more similar to each other? If so, the main tool is either MDS or PCA, both of which are useful for showing that sort of thing.
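As a rough sketch of what that could look like with base R only, assuming a logCPM matrix with genes in rows and samples in columns. The data below are simulated, and the group shift is invented purely so that the plot has some structure; substitute your own matrix.

```r
# Simulated stand-in for log.cpm.norm (hypothetical values and names).
set.seed(1)
log.cpm.norm <- matrix(rnorm(500 * 6, mean = 5), nrow = 500,
                       dimnames = list(paste0("gene", 1:500),
                                       paste0("sample", 1:6)))
# Invented group effect: raise the first 100 genes in samples 4 to 6.
log.cpm.norm[1:100, 4:6] <- log.cpm.norm[1:100, 4:6] + 4

# PCA: prcomp() expects observations in rows, so here the whole matrix
# is transposed to put samples in rows.
pca <- prcomp(t(log.cpm.norm))
plot(pca$x[, 1], pca$x[, 2], type = "n", xlab = "PC1", ylab = "PC2")
text(pca$x[, 1], pca$x[, 2], labels = colnames(log.cpm.norm))

# Classical MDS on Euclidean distances between samples.
mds <- cmdscale(dist(t(log.cpm.norm)), k = 2)
plot(mds[, 1], mds[, 2], type = "n", xlab = "Dim 1", ylab = "Dim 2")
text(mds[, 1], mds[, 2], labels = colnames(log.cpm.norm))
```

Samples that sit close together in either plot have similar expression profiles. The package-specific functions described in the guides mentioned below are usually a better choice than this bare-bones version.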
You might want to spend some time reading some tutorials, which could help you get on track. Here is one, if you are using DESeq2. You could also use the DESeq2 vignette, or if you are using edgeR, there is a User's Guide for that package.