The ruv package isn't a Bioconductor package, though it's arguably Bioconductor-adjacent. Technically, since it's a CRAN package, you should be asking on R-help or Biostars instead. Anyway, this is a good example of why you need to read the help pages closely for any package you want to use: the information is there, but it's usually pretty terse and every word counts.
You are asking how to get the adjusted data matrix for downstream analysis, apparently without realizing that RUV4 is already carrying out the analysis for you. From ?RUV4:
Arguments:
Y: The data. A m by n matrix, where m is the number of samples
and n is the number of features.
X: The factor(s) of interest. A m by p matrix, where m is the
number of samples and p is the number of factors of interest.
Very often p = 1. Factors and dataframes are also
permissible, and converted to a matrix by 'design.matrix'.
## and further down
Details:
Implements the RUV-4 algorithm as described in Gagnon-Bartsch,
Jacob, and Speed (2013), using the SVD as the factor analysis
routine. Unwanted factors W are estimated using control genes. Y
is then regressed on the variables X, Z, and W.
That pretty clearly states that this function does the regression for you. That said, ruv could use a vignette, because RUV4 returns an object that isn't very useful on its own. If you look at ?ruv_summary it becomes somewhat clearer:
RUV Summary
Description:
Post-process and summarize the results of call to RUV2, RUV4,
RUVinv, or RUVrinv.
Usage:
ruv_summary(Y, fit, rowinfo=NULL, colinfo=NULL, colsubset=NULL, sort.by="F.p",
var.type=c("ebayes", "standard", "pooled"),
p.type=c("standard", "rsvar", "evar"), min.p.cutoff=10e-25)
## and further down
Details:
This function post-processes the results of a call to
RUV2/4/inv/rinv and then nicely summarizes the output. The
post-processing step primarily consists of a call to
variance_adjust, which computes various adjustments to variances,
t-statistics, and p-values. See variance_adjust for details.
The 'var.type' and 'p.type' options determine which of these
adjustments are used. An additional post-processing step is that
the column means of the 'Y' matrix are computed, both before and
after the call to 'RUV1' (if 'eta' was specified).
After post-processing, the results are summarized into a list
containing 4 objects: 1) the data matrix 'Y'; 2) a dataframe 'R'
containing information about the rows (samples); 3) a dataframe
'C' containing information about the columns (features, e.g.
genes), and 4) a list 'misc' of other information returned by
RUV2/4/inv/rinv.
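To make that concrete, here is a minimal sketch of the fit-then-summarize workflow the two help pages describe. Everything in it is an assumption for illustration: the data are simulated noise, the choice of the first 50 features as negative controls is arbitrary, and k = 2 is just a placeholder. I'm relying on the usual RUV4(Y, X, ctl, k) argument order, so check ?RUV4 for your installed version.

```r
# Illustrative sketch on simulated data; not a recommended analysis.
# Assumed/hypothetical: the dimensions, the control set, and k = 2.
if (requireNamespace("ruv", quietly = TRUE)) {
  library(ruv)
  set.seed(1)

  m <- 20    # number of samples (rows of Y)
  n <- 200   # number of features (columns of Y)

  # Y is m by n, samples in rows, as ?RUV4 requires
  Y <- matrix(rnorm(m * n), nrow = m, ncol = n,
              dimnames = list(paste0("sample", seq_len(m)),
                              paste0("gene", seq_len(n))))

  # X is m by p with p = 1: a two-group factor of interest
  X <- matrix(rep(c(0, 1), each = m / 2), ncol = 1)

  # Logical index of negative control features (arbitrary here)
  ctl <- rep(FALSE, n)
  ctl[1:50] <- TRUE

  # RUV4 estimates the unwanted factors W and does the regression itself
  fit <- RUV4(Y, X, ctl, k = 2)

  # ruv_summary post-processes the fit into the list described in ?ruv_summary
  res <- ruv_summary(Y, fit)

  print(names(res))   # the list components
  print(head(res$C))  # per-feature results, sorted by sort.by
}
```

The per-feature results you would normally carry forward are in the 'C' dataframe of the summary, not in an "adjusted" version of Y.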
There are other functions in ruv that are presumably useful as well, but I leave it to you to explore them further.