I am analyzing an RNA-seq experiment using limma in which I have 4 different RNA species (A, B, C, D), each with three replicates. I am using the following (assuming data_counts is my count matrix):
As you can see in the above code, I want to test whether the sum A+B-C-D is equal to zero or not. However, when I play with the lfc argument in the treat function I get different results. What does the log fold change correspond to in my case?
"when I play with the lfc argument in the treat function I get different results."
Actually you don't. You will get exactly the same logFC for every gene regardless of the lfc argument to treat().
Of course, the order of the genes in the top table may change depending on the lfc argument, but that's the purpose of the argument, to rank large fold changes more highly.
BTW, 'logFC' in the topTreat table and 'lfc' are both abbreviations for the same thing, log2-fold-change. logFC is the estimated value, whereas lfc gives the threshold against which logFC is tested. Normally you want to use much smaller values for lfc than 2; something like lfc=log2(1.5) is common.
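To make the point about logFC versus lfc concrete, here is a minimal sketch rather than the original poster's code. It assumes a simulated log2-expression matrix `y` (a hypothetical stand-in for what voom would produce from data_counts), a group-means design, and a contrast named `ABvsCD`; those names and the two thresholds are chosen purely for illustration.

```r
library(limma)

set.seed(1)
# Simulated log2-expression values (hypothetical stand-in for real data):
# 100 genes x 12 samples, groups A, B, C, D with three replicates each.
group <- factor(rep(c("A", "B", "C", "D"), each = 3))
y <- matrix(rnorm(100 * 12), nrow = 100,
            dimnames = list(paste0("gene", 1:100), NULL))

# Group-means parametrization: one coefficient per group.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Mean of A and B minus mean of C and D (illustrative contrast).
cont <- makeContrasts(ABvsCD = (A + B) / 2 - (C + D) / 2, levels = design)
fit <- contrasts.fit(lmFit(y, design), cont)

# Same fitted model, two different thresholds.
t1 <- treat(fit, lfc = log2(1.2))
t2 <- treat(fit, lfc = log2(1.5))

# sort.by = "none" keeps the original gene order so the rows line up.
tab1 <- topTreat(t1, coef = 1, number = Inf, sort.by = "none")
tab2 <- topTreat(t2, coef = 1, number = Inf, sort.by = "none")

# Compare the estimated logFC column between the two thresholds.
same_logFC <- isTRUE(all.equal(tab1$logFC, tab2$logFC))
print(same_logFC)
```

Only the test statistics, p-values and hence the ranking depend on lfc; the logFC column is the contrast estimate from the linear model and is not touched by treat().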
I'm guessing that you want a contrast more like (A+B)/2 - (C+D)/2, i.e. the mean of A & B minus the mean of C & D. Without the division by two, you are testing the log fold change for the difference between the sum of A & B and the sum of C & D, which are probably not quantities you are interested in.
Hi Ryan,
Yes, since I have three replicates I am testing (A+B)/3-(C+D)/3=0. What does the fold change represent in this case?
I think you are misunderstanding the meaning of coefficients and contrasts. A contrast is just an arithmetic expression involving coefficients, which are columns of the design matrix. Assuming you have used a group means parametrization in constructing your design matrix, each column represents the mean (log) expression for that group of replicates. You do not need to divide by the group size to get the mean expression for that group. In my contrast, I'm using (A+B)/2 to represent the average expression for all samples in groups A & B. I'm dividing by two because that's the number of groups I'm taking the average of, not the number of replicates there are in each group.
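The arithmetic described above can be checked in base R for a single gene. This is only a sketch under stated assumptions: the log2 expression values are simulated, and the variable names (`coefs`, `mean_contrast`, `sum_contrast`) are made up for the example.

```r
# Hypothetical log2 expression for one gene, three replicates per group.
set.seed(42)
group <- factor(rep(c("A", "B", "C", "D"), each = 3))
y <- rnorm(12, mean = c(8, 9, 6, 7)[group])

# Group-means parametrization: one column per group.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Least-squares coefficients for this gene.
coefs <- lm.fit(design, y)$coefficients

# Each coefficient is already the mean of its three replicates,
# so no division by the number of replicates is needed.
group_means <- as.vector(tapply(y, group, mean))
coefs_are_means <- isTRUE(all.equal(unname(coefs), group_means))

# Contrast weights: average over two groups vs. plain sum.
mean_contrast <- c(A = 0.5, B = 0.5, C = -0.5, D = -0.5)
sum_contrast  <- c(A = 1,   B = 1,   C = -1,   D = -1)

lfc_mean <- sum(mean_contrast * coefs)
lfc_sum  <- sum(sum_contrast * coefs)

# The sum contrast is exactly twice the mean contrast.
sum_is_double <- isTRUE(all.equal(lfc_sum, 2 * lfc_mean))
print(c(coefs_are_means = coefs_are_means, sum_is_double = sum_is_double))
```

By the same arithmetic, dividing by 3 instead of 2 just rescales the estimate to two thirds of the mean contrast, so a given lfc threshold no longer refers to a difference between group averages.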
That makes sense, and I am doing what you describe. In your case, what does the lfc correspond to in the treat() function: (A+B)/(C+D)?
Thanks
Are you asking what the lfc argument in treat means? It stands for "log-fold change", and represents the absolute log-fold change threshold that you consider meaningful. Read the documentation for the treat function and the associated paper for more information.
Also, note that since coefficients and contrasts are on a logarithmic scale, you almost certainly want to add and subtract them, not multiply or divide.
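As a small worked example of the log-scale point, the numbers below are hypothetical group means on the log2 scale for one gene (not taken from the thread). The sketch shows that the contrast (A+B)/2 - (C+D)/2 corresponds, on the unlogged scale, to a ratio of geometric means rather than to (A+B)/(C+D), and that an lfc threshold is simply the log2 of a fold-change threshold.

```r
# Hypothetical group means on the log2 scale for one gene.
A <- 8; B <- 9; C <- 6; D <- 7

# Contrast on the log2 scale: differences and averages, no ratios.
logFC <- (A + B) / 2 - (C + D) / 2

# The same quantity on the unlogged scale is a ratio of geometric means.
ratio <- sqrt(2^A * 2^B) / sqrt(2^C * 2^D)
matches <- isTRUE(all.equal(2^logFC, ratio))
print(matches)

# An lfc threshold is the log2 of a fold-change threshold.
lfc <- log2(1.5)
fold_change <- 2^lfc
print(fold_change)
```

So with lfc=log2(1.5), treat() tests whether the absolute contrast exceeds log2(1.5), i.e. whether the ratio of geometric means differs from 1 by more than a factor of 1.5 in either direction.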