Hello everyone! I have a question about using the limma R package for differential expression (DE) analysis of proteomics data from multiplex immunoassay technologies. My experiment uses blood samples profiled with targeted Olink assay panels. My main goal is to investigate whether protein expression differs between two profiled groups. The complete experiment was run on 3 separate plates (not completely randomized, but each including individuals from both groups), yielding 3 distinct outputs in 3 different Excel files:
colnames(prot_dat)
[1] "SampleID" "Index" "OlinkID"
[4] "UniProt" "Assay" "MissingFreq"
[7] "Panel" "Panel_Version" "PlateID"
[10] "QC_Warning" "PlateLQL" "PlateLOD"
[13] "LLOQ" "ULOQ" "Quantified_value"
[16] "Unit" "Assay_Warning" "Normalization"
[19] "QC Deviation Inc Ctrl" "QC Deviation Det Ctrl" "Olink NPX Signature Version"
In addition, the column "Normalization" has the value "Calibrator Normalized" in all 3 Excel files. After thorough investigation, I understand that these numeric values represent the absolute quantification of the panel proteins, which depends on the included calibrator samples. This process brings all samples to the same scale (pg/mL), and hence (according to the general guidelines) no additional "bridging" is required to make these runs directly comparable. In parallel, a direct comparison of the total protein expression distributions showed no significant difference between the 3 plates; hence, I proceeded to merge all plates directly into one concatenated dataset. On this premise, my main question is the following:
range(prot_dat$Quantified_value, na.rm = TRUE)
[1] 0.00122 15801.02184
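A minimal sketch of the merge and per-plate check described above, using simulated stand-in data rather than the real exports. The assay names, sample counts, and values are placeholders (assumptions for illustration only); only the column names `SampleID`, `Assay`, `PlateID`, and `Quantified_value` come from the actual files.

```r
# Toy stand-ins for the three per-plate exports (hypothetical values, not real data).
set.seed(1)
make_plate <- function(plate_id, n_samples = 6,
                       assays = c("IL6", "TNF", "CXCL8")) {  # placeholder assay names
  grid <- expand.grid(SampleID = paste0(plate_id, "_S", seq_len(n_samples)),
                      Assay = assays, stringsAsFactors = FALSE)
  grid$PlateID <- plate_id
  grid$Quantified_value <- rlnorm(nrow(grid), meanlog = 2, sdlog = 1)
  grid
}
plates <- lapply(c("P1", "P2", "P3"), make_plate)

# Concatenate the plates into one long table; this requires identical columns.
prot_dat <- do.call(rbind, plates)

# Overall per-plate summary of the log2 values, as a quick check for plate shifts.
prot_dat$log2_value <- log2(prot_dat$Quantified_value + 0.5)
plate_summary <- tapply(prot_dat$log2_value, prot_dat$PlateID, summary)
print(plate_summary)

# Per-assay medians by plate, in case a shift affects only some proteins.
assay_medians <- aggregate(log2_value ~ Assay + PlateID, data = prot_dat, FUN = median)
print(assay_medians)
```

With real data, the `plates` list would instead hold the three imported Excel tables; the rest of the check stays the same.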
In the column "Quantified_value", the values range from close to 0 up to several thousand (about 15,800 at most, per the range above). I applied a simple log2(x + 0.5) transformation, and also tried an additional median-centering step. As the density plots below show, there is not much difference between log2 alone and log2 with additional median centering, thus:
For differential expression with limma, and based on my description above, could the log2 transformation alone suffice? In addition, although there is only a limited number of features in total (~89), might the limma-trend pipeline still be beneficial for a basic two-group comparison, potentially including some categorical confounders?
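To make the question concrete, here is a minimal sketch of the analysis in mind, run on simulated data. Everything about the toy inputs is an assumption: the group labels, the `plate` and `sex` covariates (stand-ins for whichever categorical confounders apply), and the choice of per-sample median centering, since the centering direction is not fixed above. The limma calls themselves (`lmFit`, `eBayes` with `trend = TRUE`, `topTable`) are the standard limma-trend steps.

```r
library(limma)

set.seed(42)
n_prot <- 89
n_samp <- 24

# Toy concentrations (hypothetical): rows = proteins, columns = samples.
conc <- matrix(rlnorm(n_prot * n_samp, meanlog = 3, sdlog = 1.5),
               nrow = n_prot,
               dimnames = list(paste0("prot", seq_len(n_prot)),
                               paste0("S", seq_len(n_samp))))

# Hypothetical sample annotation: group of interest plus two categorical covariates.
pheno <- data.frame(
  group = factor(rep(c("A", "B"), each = n_samp / 2)),
  plate = factor(rep(c("P1", "P2", "P3"), times = n_samp / 3)),
  sex   = factor(rep(c("F", "M"), times = n_samp / 2))
)

# log2 transform with the 0.5 offset.
log_expr <- log2(conc + 0.5)

# Optional per-sample median centering (assumed direction); swap in for log_expr to compare.
centered <- sweep(log_expr, 2, apply(log_expr, 2, median), "-")

# Two-group comparison adjusted for the categorical covariates.
design <- model.matrix(~ group + plate + sex, data = pheno)
fit <- lmFit(log_expr, design)

# limma-trend: the prior variance is allowed to depend on average log-abundance.
fit <- eBayes(fit, trend = TRUE)

# Results for group B versus group A, sorted by p-value.
res <- topTable(fit, coef = "groupB", number = Inf, sort.by = "P")
print(head(res))
```

Fitting `centered` instead of `log_expr` in `lmFit` gives the median-centered variant, so the two result tables can be compared directly.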
Thanks a gazillion,
Efstathios