processCGH in snapCGH package

jhs1jjm@leeds.ac.uk ▴ 230

R 2.5.0 on openSUSE 10.2 x86_64.

Hi,

I'm using the snapCGH package to analyse two 244k Agilent CGH arrays with the aim of identifying regions of gain/loss. So far I've done the following:

> targets <- readTargets("targets.txt")
> RG1 <- read.maimages(targets$File_names, source="agilent")
> RG2 <- readPositionalInfo(RG1, source="agilent")
> RG2$design <- c(-1-1)
> RG3 <- backgroundCorrect(RG2, method="minimum")
> MA1 <- normalizeWithinArrays(RG2, method="median")

then

> MA2 <- processCGH(MA1, method.of.averaging=mean, ID="MA1$genes$ProbeName")
Error in order(na.last, decreasing, ...) :
  argument 2 is not a vector

I've looked at ?processCGH and am following the vignette for the snapCGH package fairly closely. Can anyone help with the error?

Also, I'm unsure which background correction and normalization function to use (I've been informed that non-linear methods are unsuitable). If anyone has experience with Agilent CGH arrays, could they also tell me whether the default estimates used for the foreground and background intensities in read.maimages are suitable? I'd like to determine the most suitable methods beforehand, as I think the segmentation may take some time on my machine. If it's a case of trial and error then that's fine.

Thanks for any input.

Regards
John

CGH snapCGH • 1.9k views

ADD COMMENT • link updated 17.6 years ago by Sean Davis 21k • written 17.6 years ago by jhs1jjm@leeds.ac.uk ▴ 230

Sean Davis 21k

jhs1jjm at leeds.ac.uk wrote:

> then
>> MA2 <- processCGH(MA1,method.of.averaging=mean,ID="MA1$genes$ProbeName")
> Error in order(na.last, decreasing, ...) :
> argument 2 is not a vector
>
> I've looked at ?processCGH and am following the vignette for the snapCGH package
> fairly closely. Can anyone help with the error.

You can't quote variable names like above. I'm not sure that is going to fix the problem, but until the syntax is correct, it will be hard to diagnose the issue.

> Also i'm unsure of what background correction to use and normalization function
> (I've been informed that non-linear methods are unsuitable). Also if anyone has
> any experience of Agilent CGH arrays could they also tell me whether the
> default estimates used for the foreground and background intensities in
> read.maimages are suitable.

I would use the LogRatio column of the Agilent file without any further normalization. The LogRatio is already background corrected. The CGH algorithms in snapCGH do not depend on the center of the data, so there isn't really a need to do any further median centering, etc. In fact, there are probably better methods to center the data, but these use the segmented data.

Hope that helps.

Sean
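The point about quoting can be seen in base R alone. The following is a minimal sketch that uses a made-up list in place of the real MAList (the probe names are invented); it does not call processCGH and makes no claim about which form its ID argument expects, so ?processCGH remains the reference.

```r
# Hypothetical stand-in for MA1; the real object is an MAList from limma.
MA1 <- list(
  genes = data.frame(
    ProbeName = c("probe_1", "probe_2", "probe_3"),  # invented names
    stringsAsFactors = FALSE
  )
)

quoted   <- "MA1$genes$ProbeName"  # a single character string; nothing is looked up
unquoted <- MA1$genes$ProbeName    # the actual vector of probe names

length(quoted)    # 1
length(unquoted)  # 3
```

Wrapping the expression in quotes turns it into a literal string, so a function receiving it sees one piece of text rather than the column contents.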

ADD COMMENT • link 17.6 years ago Sean Davis 21k

Quoting Sean Davis <sdavis2 at mail.nih.gov> on Wed 26 Sep 2007 17:30:18 BST:

> I would use the LogRatio column of the Agilent file without any further
> normalization. The LogRatio is already background corrected.

Hi Sean,

I'm struggling to import the LogRatio column from the Agilent text files. I'm using read.delim2, but this brings my machine to a standstill and after 45 minutes it still hadn't finished. Is the following the same?

> RG1 <- read.maimages(targets$File_names,source="agilent")
> RG2 <- readPositionalInfo(RG1,"agilent")
> RG2$design <- c(1,-1)
> RG3 <- backgroundCorrect(RG2,method="none")
> MA1 <- normalizeWithinArrays(RG3,method="none")
> LogRatio <- MA1$M

Having just looked at the text file, it doesn't appear to be. I've looked through the R data import guide but haven't found anything yet.

Thanks again
John

ADD REPLY • link 17.6 years ago jhs1jjm@leeds.ac.uk ▴ 230

Quoting jhs1jjm at leeds.ac.uk on Wed 26 Sep 2007 22:54:01 BST:

> I'm struggling to import the LogRatio column from the Agilent text files.

Additionally Sean, I tried:

> LogRatio <- log2(RG1$R)-log2(RG1$G)

This gives me different results to the text file?

ADD REPLY • link 17.6 years ago jhs1jjm@leeds.ac.uk ▴ 230

jhs1jjm at leeds.ac.uk wrote:

> I'm struggling to import the LogRatio column from the Agilent text files. I'm
> using read.delim2 but this is bringing my machine to a standstill and after
> 45 mins hadn't finished.
>
> Having just looked at the text file it doesn't appear to be. I've looked
> through the data import R guide but haven't found anything yet.

You will probably need to read the read.maimages help pretty carefully. You will need to specify other columns to read in if you want to read in the LogRatio column. Alternatively, change the red and green foreground columns to be rProcessedSignal and gProcessedSignal and then do not do background correction, as LogRatio is calculated from these. You will also potentially benefit from looking at the Agilent Feature Extraction Reference Manual, which explains the columns in the Agilent files.

http://www.chem.agilent.com/scripts/LiteraturePDF.asp?iWHID=50416

> Additionally Sean I tried:
>
>> LogRatio <-log2(RG1$R)-log2(RG1$G)
>
> This gives me different results to the text file?

The LogRatio column is calculated from rProcessedSignal and gProcessedSignal in the Agilent file. These columns are not loaded by limma by default.

Hope that helps some.

Sean
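Both suggestions can be combined in a single read.maimages call. The sketch below is not taken from the thread: the wrapper name read_agilent_with_logratio is hypothetical, and it assumes a limma version in which the other.columns argument stores extra columns under RG$other. Check ?read.maimages for the installed version before relying on it.

```r
# Hypothetical wrapper; requires the limma package only when it is called.
read_agilent_with_logratio <- function(files) {
  limma::read.maimages(
    files,
    source = "agilent",
    # Use the processed signals as red and green foreground,
    # with no separate background columns.
    columns = list(R = "rProcessedSignal", G = "gProcessedSignal"),
    # Also carry the LogRatio column from the file along (assumed to land in RG$other).
    other.columns = "LogRatio"
  )
}

# Usage (not run here; needs limma and the Agilent files listed in targets.txt):
# targets <- limma::readTargets("targets.txt")
# RG <- read_agilent_with_logratio(targets$File_names)
# head(RG$other$LogRatio)
```

Because no background columns are supplied, no background correction step is needed afterwards, which matches the advice above.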

ADD REPLY • link 17.6 years ago Sean Davis 21k

Quoting Sean Davis <sdavis2 at mail.nih.gov> on Thu 27 Sep 2007 00:13:04 BST:

> You will need to specify other columns to read in if you want to read in
> the LogRatio column. Alternatively, change the red and green foreground
> columns to be rProcessedSignal and gProcessedSignal and then do not do
> background correction, as LogRatio is calculated from these.

Hi Sean,

I did the following:

# read in the intensity data
> RG1 <- read.maimages(targets$File_names,source="agilent", columns=list(R="rProcessedSignal",G="gProcessedSignal"))

It sounded like there was an alternative in your email, but having looked at the reference manual's column explanations I couldn't see one.

# insert info on the chromosome position of each clone into the $genes matrix
> RG2 <- readPositionalInfo(RG1,source="agilent")
Warning message:
NAs introduced by coercion

# normalize
> MA1 <- normalizeWithinArrays(RG2,method="none")

This gives the log2 ratio, whereas the Agilent text file is log10. Is this important?

Following this I'm getting the same error with processCGH:

> MA2 <- processCGH(MA1,ID="ProbeName")
Error in order(na.last, decreasing, ...) :
  argument 2 is not a vector

Some of the probes do not have location information; could this be the problem?

Thanks again
John
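On the log2 versus log10 question, the two scales differ only by a constant factor, so values can be converted for a direct comparison with the text file. This is a minimal base-R sketch with made-up signal values; it assumes, as reported in the post, that the Agilent LogRatio column is on the log10 scale.

```r
# Made-up processed signals for three probes (not real data).
r <- c(1200, 800, 1500)
g <- c(1000, 1000, 750)

log10_ratio <- log10(r / g)       # scale reported for the Agilent LogRatio column
log2_ratio  <- log2(r) - log2(g)  # scale of the M values in limma

# Change of base: log2(x) = log10(x) / log10(2)
converted <- log10_ratio / log10(2)

all.equal(converted, log2_ratio)  # TRUE
```

Whether the scale matters downstream depends on whether any thresholds are expressed in absolute log-ratio units; the sketch only shows how to put both on the same footing.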

ADD REPLY • link 17.6 years ago jhs1jjm@leeds.ac.uk ▴ 230

J-C. Marioni ▴ 40

Hi John,

Having a quick look through your code, I think that the line

RG2$design <- c(-1-1)

is incorrect. It should be

RG2$design <- c(-1,-1)

I think.

Hope this helps!

Cheers,
John
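The difference between the two forms of the design line can be checked directly in base R. This small sketch only evaluates the two expressions; it says nothing about which sign convention is right for these arrays.

```r
wrong <- c(-1-1)    # parsed as arithmetic: -1 - 1
right <- c(-1, -1)  # one entry per array

wrong          # -2
length(wrong)  # 1
length(right)  # 2
```

Without the comma, the design vector has a single element instead of one per array.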

ADD COMMENT • link 17.6 years ago J-C. Marioni ▴ 40

Hi John,

Unfortunately that was a typo and hasn't fixed the problem, but thanks for looking.

Regards
John

Quoting "J-C. Marioni" <jcm68 at hermes.cam.ac.uk> on Wed 26 Sep 2007 16:59:34 BST:

> Having a quick look through your code, I think that the line
> RG2$design <- c(-1-1)
> is incorrect.
>
> It should be
> RG2$design <- c(-1,-1)
> I think.

ADD REPLY • link 17.6 years ago jhs1jjm@leeds.ac.uk ▴ 230
