Question

Agilent One Channel Data: Will I be able to use limma with flagged data removed.

0

Entering edit mode

CantExitVIM ▴ 10

@cantexitvim-15274

Last seen 3 months ago

United States

First, I apologize if these are dumb questions... I am new to R and Microarray data. That being said, I would appreciate any helpful feedback and/or links.

From what I've read here, it is unusual to remove flagged data-points in limma and I couldn't find any post showing how to do it within the limma package (read.maimages does have a weight function, but it's not for Agilent data). That being said, I'm doing a rotation with someone who prefers such methodology. Thus, I wrote a script that re-formatted the Agilent data with poorly performing probes removed. This means my .txt files now have different lengths, which I believe is causing an error with read.maimages:

Error in RG[[a]][, i] <- obj[, columns[[a]]] : 
number of items to replace is not a multiple of replacement length

But I would like to take advantage of limma's normalization processes. Is this possible I could reformat the data in another way so those probes are removed? Did I miss something, and eliminating such probe values is actually possible in limma? Is there another package, which may be able to do what I am seeking?

Again, sorry if this seems like a stupid question or if removing flagged values is considered "bad practice." But would really appreciate any feedback on overcoming this issue.

Thank you

limma agilent • 2.3k views

ADD COMMENT • link 6.8 years ago CantExitVIM ▴ 10

0

Entering edit mode

"You simply read the microarray data into limma in the normal way, then set all the spots that are flagged to have zero weights Then all flagged spots will automatically be removed from any limma downstream analysis."

I was playing around with some code that I found in another thread ( Filtering genes based on Agilent flags ). But after reading what I could find on wt.fun() and the limma documentation... I am a bit loss on how to create my own function.

Attempted Code:

wtfun.Agilent <- function(x) {
  okAboveBG <- x[,"gIsWellAboveBG"]==1
  as.numeric(okAboveBG)
}
x = read.maimages(targets, source="agilent", green.only=TRUE, wt.fun=wtfun.Agilent)

Again, I am appreciative of your help.

ADD REPLY • link updated 6.8 years ago by Gordon Smyth 52k • written 6.8 years ago by CantExitVIM ▴ 10

0

Entering edit mode

And the problem is what? Did this code not work for you? In what way did it not work?

PS. Please be careful with the buttons on this website, which confuse many users. I've moved your new question to be a comment, because you posted it as an independent answer instead of as a comment on my answer. In other words, don't answer your own questions.

ADD REPLY • link 6.8 years ago Gordon Smyth 52k

0

Entering edit mode

I broke the code down in order to understand it. I am sure much of the confusion stems from me being new to R. But I don't understand (1) what values are being passed to that function, (2) how weights are being assigned, and (3) how that information is being fed-back into read.maimages.

For instance, lets say I "gIsWellAboveBG" values, with a 1, to have a weight of 1. And values of 0 to have a weight of 0. Is limma intelligent enough where it is able to parse-out that information and automatically deduce the weights?

Again, I apologize if these questions seem simple. Furthermore, I realized you posted an alternative way to obtain specific readings and that certainly seems like a promising route. But if there is some documentation you can point me to, I would appreciate that.

Thank you for your time.

ADD REPLY • link updated 6.8 years ago by Gordon Smyth 52k • written 6.8 years ago by CantExitVIM ▴ 10

0

Entering edit mode

In other words, the code already worked perfectly for you, but you just don't believe it for some reason.

You code is unnecessarily complicated, as simply

wtfun.Agilent <- function(x) x$gIsWellAboveBG

would give the same result. There is no need for limma to deduce anything. What is returned by the above function is the weight.

More importantly though, as I've said elsewhere, there is no need for you to be using wt.fun at all. In particular, you should not be using wt.fun with gIsWellAboveBG.

I don't find your questions simple, but you do have some misconceptions, and I think you are making microarray analysis more complicated by trying to micro-manage it. limma is a sophisticated program. It will do a good job of analysing your Agilent data, quickly and easily if you run a standard analysis. You are not expected to read the code but instead just follow the documentation.

If you are new to R, new to microarrays and (perhaps) new to statistics as well, trying to invent your own complicated analysis right from the start isn't necessarily productive. What about trying a simple vanilla analysis first? If highly unusual finesses really are needed for your data, then you can easily add them afterwards.

ADD REPLY • link 6.8 years ago Gordon Smyth 52k

0

Entering edit mode

As you probably inferred from my first post, I understand that I am not following standard practices. And I hope this isn't interpreted as some brash disregard of thoughtful and tested practices. Long story short, I am bioinformatics grad student that is trying my best to analyze Agilent one-channel data while following the wishes of my PI. As so many guides point to limma when analyzing such data... limma has been my focused (and also the source of some frustration, as I am not quite understanding what is going underneath the hood).

What about trying a simple vanilla analysis first?

That would likely be prudent and I would love to find examples of basic Agilent analysis without depending on limma so much (or at all). I am curious about what you are thinking about when you say "vanilla analysis."

Again, I appreciate the feedback.

ADD REPLY • link updated 6.8 years ago by Gordon Smyth 52k • written 6.8 years ago by CantExitVIM ▴ 10

0

Entering edit mode

limma is the "vanilla analysis" == "standard practices".

It's hard to see what can be frustrating you about limma because you don't seem to have actually tried it yet. You haven't mentioned anything that you tried and didn't work, apart from reading the incorrect data files after you had hacked them, and those couldn't possibly have been read by any software.

But if you want try something else that's up to you.

ADD REPLY • link 6.8 years ago Gordon Smyth 52k

0

Entering edit mode

Can I ask you what you specifically mean by a "flagged" spot? Which Agilent columns specifically were you planning to treat as quality flags?

ADD REPLY • link 6.8 years ago Gordon Smyth 52k

0

Entering edit mode

I would like to conduct normalization with probe values that have a "gIsWellAboveBG" value of 1 and ignore/extract probes with a value of 0.

ADD REPLY • link 6.8 years ago CantExitVIM ▴ 10

0

Entering edit mode

What you're saying doesn't make sense, at least not as you've said it. I suspect that you are conflating two concepts that are really separate: spot quality flags and probe-filtering.

Quality flags apply to individual spots. You can use limma to down-weight low quality spots but, so far, nothing you've said suggests that you have any low quality spots. You haven't mentioned any Agilent flag that corresponds to spot quality.

Probe-filtering on the other hand applies to a probe across a whole series of arrays. If you filter out a probe, then will you will filter out all n spots for that probe, where n is the number of arrays.

gIsWellAboveBG takes on values for each spot, not for each probe. For any probe, you will get a whole string of values for gIsWellAboveBG, one for each microarray. If you want to use gIsWellAboveBG for probe filtering, then you have to compute some summary measure of the whole set of n values for that probe. I've already told you how to do that, in my answer below.

ADD REPLY • link 6.8 years ago Gordon Smyth 52k

0

Entering edit mode

I think you are correct and I probably was misinterpreting what was being asked. If I remove/keep individual probes based on gIsWellAboveBG... I am creating a bunch of data-sets that can't be compared.

Based on your comment, I suspect he wanted me to remove probes that are universally "poor" in all the arrays. I am not sure if that is a more acceptable practice but will confirm my suspicious with him.

I appreciate the feedback. Thank you.

ADD REPLY • link 6.8 years ago CantExitVIM ▴ 10

score 2 · Answer 1 · 2018-03-16

Flags that identify poor quality spots can be set as zero weights

It is easy to remove "flagged" spots in limma, but not the way that you are doing it. It is all much easier than what you are attempting.

First, you must not change the original Agilent image analysis text files. They must be left as they are.

You simply read the microarray data into limma in the normal way, then set all the spots that are flagged to have zero weights Then all flagged spots will automatically be removed from any limma downstream analysis.

Note that this does not physically remove the flagged data, it simply causes the flagged data to be ignored during any statistical analysis.

You can set the weights automatically as the Agilent files are read in, by setting the wt.fun argument of read.maimages(). Or you can set the weights after the data has been read in. I'd do it afterwards myself, but it's easy either way.

It's not usually recommended however

I don't recommend the practice of removing flagged spots, especially for Agilent microarray. There is no need for it, and it will mostly likely make your analysis worse. In my experience, Agilent microarrays are generally good quality and flagging is not required. But I've explained how to do it if you want to do it anyway.

Detection flags should be using for filtering, not as weights

One thing that you should definitely not do, is to treat a detection column like gIsWellAboveBG, is if it was a quality flag. Such columns can be used, but they should be used to filter non-expressed probes, not to flag individual spots. The probe filtering should be done at a later step, after reading the intensity data into R.

To illustrate this, consider the Agilent microarray data analysed in Section 17.4 of the limma User's Guide. We could use gIsWellAboveBG like this. First, read the gIsWellAboveBG column in during the read step:

x <- read.maimages(SDRF[,"Array Data File"],source="agilent",
              green.only=TRUE,other.columns="gIsWellAboveBG")

Then we can filter probes using this variable:

is.expressed <- rowSums(x$other$gIsWellAboveBG > 0) >= 4
x <- x[is.expressed,]

This code keeps probes that have intensities well above background on at least 4 arrays.