limma + voom combine voomWithQualityWeights and duplicateCorrelation
1
0
Entering edit mode
@df6c68e9
Last seen 10 weeks ago
United States

Suppose I want to combine voomWithQualityWeights and duplicateCorrelation. Would the modification of voomWithQualityWeights below be the recommended approach? I only added lines for duplicateCorrelation (based on using duplicateCorrelation with limma+voom for RNA-seq data) and included the correlation as part of the output list to be used with lmFit later, but I wasn't sure about which set of weights to pass to each instance of duplicateCorrelation:

  1. The array weights (aw, what I am currently using)
  2. The voom observation-level weights (v$weights, the default when weights = NULL)
  3. The product of the two t(aw * t(v$weights))

Note: voomWithCorrelationAndWeights(..., ndups = 0) (no blocking) is equivalent to voomWithQualityWeights() with the addition of the correlation (value of 0) in the output.

voomWithCorrelationAndWeights <-
  function(counts, design = NULL, lib.size = NULL, normalize.method = "none",
           plot = FALSE, span = 0.5, var.design = NULL, var.group = NULL,
           method = "genebygene", maxiter = 50, tol = 1e-05,
           trace = FALSE, col = NULL,
           # from duplicateCorrelation
           ndups = 2, spacing = 1, trim = 0.15, block = NULL,
           ...)
  {
    if (plot) {
      oldpar <- par(mfrow = c(1, 2))
      on.exit(par(oldpar))
    }

    # Iter 1
    v <- voom(counts, design = design, lib.size = lib.size,
              normalize.method = normalize.method, plot = FALSE, span = span,
              ...)
    aw <- arrayWeights(v, design = design, method = method,
                       maxiter = maxiter, tol = tol,
                       var.design = var.design, var.group = var.group)
    dupecor <- duplicateCorrelation(v, design = design, block = block,
                                    ndups = ndups, spacing = spacing, trim = trim,
                                    weights = aw) # update v$weights or use aw here?

    # Iter 2
    v <- voom(counts, design = design, lib.size = lib.size,
              normalize.method = normalize.method, plot = plot, span = span,
              # Changes from first iteration
              block = block, correlation = dupecor$consensus.correlation,
              weights = aw,
              ...)
    aw <- arrayWeights(v, design = design, method = method,
                       maxiter = maxiter, tol = tol, trace = trace,
                       var.design = var.design, var.group = var.group)
    dupecor <- duplicateCorrelation(v, design = design, block = block,
                                    ndups = ndups, spacing = spacing, trim = trim,
                                    weights = aw) # same question as above

    # Update weights and add correlation to output (needed for lmFit)
    v$weights <- t(aw * t(v$weights))
    v$targets$sample.weights <- aw
    v$correlation <- dupecor$consensus.correlation

    if (plot) {
      barplot(aw, names = 1:length(aw), main = "Sample-specific weights",
              ylab = "Weight", xlab = "Sample", col = col)
      abline(h = 1, col = 2, lty = 2)
    }

    return(v)
  }

Highly simplified version of how this function may be used:

# dge = DGEList object, design = design matrix, block = blocking factor, 
# var.group = each level of this factor is assigned a different sample weight
v <- voomWithCorrelationAndWeights(dge, design = design, 
                                   block = block, var.group = var.group)
fit <- lmFit(v, design = design, block = block, correlation = v$correlation)
fit.smooth <- eBayes(fit)

Edit: added voom tag and changed counts in last code chunk to dge.

voom RNASeq limma • 973 views
ADD COMMENT
2
Entering edit mode
@gordon-smyth
Last seen 2 hours ago
WEHI, Melbourne, Australia

Just use edgeR::voomLmFit() with sample.weights=TRUE and block=yourblockfactor and it will do it all for you. It does what you're trying to implement, and in addition takes account of any groups with all-zero counts. The function will also accept var.group or var.design instead of sample.weights.

The reason why voomLmFit is part of edgeR rather than part of limma is because it needs to use edgeR functionality to identify the counts with exact zero fitted values.

ADD COMMENT
0
Entering edit mode

Thanks, Gordon! Exactly what I was looking for. I'm putting together a series of wrapper functions for some limma proteomics data analysis pipelines that my workplace commonly uses or is likely to use in the future, and this will simplify my (probably over-engineered) code quite a bit.

ADD REPLY
0
Entering edit mode

What sort of proteomics data?

Please be mindful that voom is designed for sequencing data and for RNA-seq in particular. We have never tested voom on proteomics data and I would not expect it to be ideal for that sort of data.

ADD REPLY
0
Entering edit mode

I apologize for the confusion. I meant that the wrapper functions will mainly handle TMT proteomics data, since that is what myself and colleagues encounter most often, but it will also include the ability to analyze RNA-Seq data. It will be similar to what I've done in this limma_full function, but only the MArrayLM object from eBayes will be output, and I will be including more robust checks for user input, among other improvements. The portion of the code that analyzes RNA-Seq data follows what is outlined in Law et al. (2018), but I will switch to using voomLmFit.

ADD REPLY

Login before adding your answer.

Traffic: 892 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6