Hi! I would like to run a paired RankProd analysis in R. My dataset consists of matched cancer samples and healthy controls. For example, the Colorectal Cancer data set consists of 40 cancer samples and 40 matched healthy controls. I defined the class vectors and origin vectors accordingly:
Class vector:
n1 <- 40
n2 <- 40
cl <- rep(c(0, 1), c(n1, n2))
Origin vector:
origin <- rep(1, n1 + n2)
RankProd call:
RP.out <- RankProducts(cancer_data, cl, origin, logged = TRUE, plot = FALSE, gene.names = genenames, rand = 123)
I have two questions: (1) This calculation always runs as "Rank Product analysis for unpaired case". Is there a specific argument with which I can specify the pairing of my cancer samples and healthy controls?
(2) If I run a dataset with more than 60 samples (30 cancer + 30 healthy controls), I receive an error message that reads: "Error: vector memory exhausted (limit reached?)" (NB: I use a Mac). I've followed advice from other forums to set "R_MAX_VSIZE=100GB" in my .Renviron file, but with matrices this large the error does not go away. Is there any solution?
Thank you very much for your help in advance, Cheers
Hi!
Thank you very much for your answer! I apologize, I chose the wrong term above. The data is actually paired. The healthy and cancer samples are taken from the same patient. Is there a way to specify a paired analysis? The RankProd package documentation/tutorial only describes unpaired cases.
Regarding the memory issue: I have a Mac with 16 GB of RAM. I followed the instructions from a post on StackOverflow where someone had the same issue. They apparently solved the problem by setting "R_MAX_VSIZE=100GB". For me, this didn't work.
My hope is that the calculation is so resource-hungry only because it assumes an unpaired case, and that once I can specify the function correctly so that only the true pairs are considered, the memory issue will be solved as well.
Conventionally, for a paired analysis in a non-parametric setting, you would first compute the difference (after taking logs) between the cancer and healthy tissue for each patient, then do a one-sample analysis on those differences. This is essentially a sign test. That may also help with the memory issues, which seems to be the case for me.
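Here is a rough sketch of what that could look like in R. It is not tested on your data. The toy matrix, the column order (healthy columns first, cancer columns second, with column i and column i + n_pairs from the same patient) and all the variable names are my own assumptions. It also assumes a RankProd version that exports RankProducts and treats a class vector of all 1s as one-class data, as the package documentation describes. Note that the matrix passed to the function has one column per patient instead of one per sample, so your 80 columns would become 40.

```r
# Hypothetical toy data: 100 genes x 8 samples, already log-transformed.
# Assumed column order: columns 1..4 healthy, columns 5..8 cancer,
# with column i and column i + 4 taken from the same patient.
set.seed(123)
n_pairs <- 4
n_genes <- 100
toy_data <- matrix(rnorm(n_genes * 2 * n_pairs), nrow = n_genes)
rownames(toy_data) <- paste0("gene", seq_len(n_genes))

healthy <- toy_data[, seq_len(n_pairs), drop = FALSE]
cancer  <- toy_data[, n_pairs + seq_len(n_pairs), drop = FALSE]

# Per-patient difference of log values: one column per pair.
paired_diff <- cancer - healthy

# One-class analysis: every column gets the label 1.
cl_paired <- rep(1, ncol(paired_diff))

# Only call RankProd if it is installed and exports RankProducts.
if (requireNamespace("RankProd", quietly = TRUE) &&
    "RankProducts" %in% getNamespaceExports("RankProd")) {
  RP.paired <- RankProd::RankProducts(paired_diff, cl_paired,
                                      logged = TRUE, plot = FALSE,
                                      gene.names = rownames(paired_diff),
                                      rand = 123)
}
```

With your own data you would replace toy_data with cancer_data and make sure the columns are ordered so that each cancer sample lines up with the healthy sample from the same patient before subtracting.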