CGH analysis without genome positions
adam_pgsql ▴ 70
@adam_pgsql-3901
Hi,

I am trying to do some CGH analysis with Agilent arrays, but all the analysis methods seem to require genome position information. Does anyone know of any packages that will call genes as present/absent without the genome position?

Thanks for any help,
Adam
CGH CGH • 1.6k views
@sean-davis-490
United States
On Tue, Apr 6, 2010 at 1:47 PM, adam_pgsql wrote:
> Does anyone know of any packages that will call genes as present/absent without the genome position?

I don't think of CGH analysis as "present/absent", but perhaps I am not clear on what you mean by CGH analysis. For Agilent arrays, presumably you have two colors, one representing the sample and the other the reference. Simply make a ratio and then rank the probes based on that.

Sean
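The ratio-and-rank suggestion can be sketched in a few lines. This is only an illustration, not code from the thread: the probe names, the intensities, and the `floor` used to avoid taking the log of zero are all hypothetical, and the intensities are assumed to be background-corrected already.

```python
import math

# Hypothetical background-corrected intensities per probe: (test, reference).
probes = {
    "probe_A": (1200.0, 1100.0),
    "probe_B": (60.0, 950.0),
    "probe_C": (2400.0, 1000.0),
}

def log2_ratio(test, reference, floor=1.0):
    """Return log2(test / reference), flooring both channels to avoid log(0)."""
    return math.log2(max(test, floor) / max(reference, floor))

# One log ratio per probe, then rank from lowest (least test signal) to highest.
ratios = {name: log2_ratio(t, r) for name, (t, r) in probes.items()}
ranked = sorted(ratios, key=ratios.get)

for name in ranked:
    print(f"{name}\t{ratios[name]:+.2f}")
```

Probes at the low end of the ranking are the candidates for sequences missing from the test strain. The ranking alone does not say where to draw the line.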
On Tue, Apr 6, 2010 at 1:58 PM, Sean Davis wrote:
> Simply make a ratio and then rank the probes based on that.

I'm making an assumption here that you are using some custom array based on an organism with no assembled genome. If there is an assembled genome, then you should map your probes to the genome using an alignment tool (blast, blat, etc.) and use those alignments for more standard CGH analysis.

Sean
On 6 Apr 2010, at 19:15, Sean Davis wrote:
> I'm making an assumption here that you are using some custom array based on an organism with no assembled genome. If there is an assembled genome, then you should map your probes to the genome using an alignment tool (blast, blat, etc.) and use those alignments for more standard CGH analysis.

Thanks, Sean, for your reply.

This is a custom bacterial pan-genome array. The problem is that many of the oligos target genes found in unfinished genome sequences (not the reference strain), so I don't really have a genome position for them. Also, due to the nature of bacterial genomes, when I hybridise DNA from unsequenced strains there is no guarantee that the gene arrangement will be exactly the same as in the sequenced reference strain.

In terms of "present/absent", I would like to score each gene sequence represented on the array as present or absent in the test strain. I guess this could be done by ranking the ratios and determining some cutoff for presence or absence, but the question is: are there any tools that provide a more statistically sound approach to suggesting a good cutoff value to use?

Thanks again for your help,
Adam
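One data-driven way to pick such a cutoff, offered here purely as an illustration and not as something proposed in the thread, is to split the log ratios into two groups so that the values within each group are as tight as possible (a one-dimensional two-means, or Otsu-style, split). The gene names and values below are hypothetical, and the approach assumes the ratios really do fall into a low and a high cluster.

```python
def two_group_cutoff(values):
    """Return a cutoff that splits values into a low and a high group.

    Every split point of the sorted values is tried; the one with the
    smallest total within-group sum of squares wins, and the cutoff is
    placed midway between the two values on either side of the split.
    """
    xs = sorted(values)
    if len(xs) < 2:
        raise ValueError("need at least two values")
    best_cost, best_cut = float("inf"), None
    for i in range(1, len(xs)):
        low, high = xs[:i], xs[i:]
        mean_low = sum(low) / len(low)
        mean_high = sum(high) / len(high)
        cost = (sum((x - mean_low) ** 2 for x in low)
                + sum((x - mean_high) ** 2 for x in high))
        if cost < best_cost:
            best_cost, best_cut = cost, (xs[i - 1] + xs[i]) / 2
    return best_cut

# Hypothetical per-gene log2(test/reference) values.
log_ratios = {"geneA": 0.1, "geneB": -0.2, "geneC": -3.8, "geneD": 0.3, "geneE": -4.1}

cutoff = two_group_cutoff(log_ratios.values())
calls = {g: ("present" if v > cutoff else "absent") for g, v in log_ratios.items()}
print(cutoff, calls)
```

A split like this always returns some cutoff, even when the data form a single cluster, so the histogram of ratios should be inspected before trusting the calls.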
On Tue, Apr 6, 2010 at 6:53 PM, adam_pgsql wrote:
> I guess this could be done by ranking the ratios and determining some cutoff for presence or absence, but the question is are there any tools that provide a more statistically sound approach to suggesting a good cutoff value to use?

Hi, Adam.

There are many ways to go here, but one would really need to know the experimental design in more detail. If you have replicates, then there are MANY statistical methodologies that could be applied to find differences between the reference and the test. Any gene expression hypothesis testing packages could probably be applied.

Sean
On 7 Apr 2010, at 00:01, Sean Davis wrote:
> There are many ways to go here, but one would really need to know the experimental design in more detail. If you have replicates, then there are MANY statistical methodologies that could be applied to find differences between the reference and the test.

Thanks again for your reply, Sean.

For the arrays that have been performed so far, the design is simply test against reference strain (no biological replicates), with 3 or more different oligos per gene, each printed in duplicate. The problem with the reference design is that, as I mentioned before, many of the oligos map to genes that are not present in the reference strain, so there will be lots of features with little or no signal in the reference channel. We would in fact like to be able to do this with single colour data if possible. Are there any packages that could help with this?

Thanks again,
Adam
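With several oligos per gene and duplicate spots per oligo, a single-channel signal can be collapsed to one value per gene before any present/absent call is made. The sketch below is an assumption-laden illustration rather than an established method: the record layout, gene and oligo names, and intensities are hypothetical, and medians are used simply because they tolerate one misbehaving oligo.

```python
from collections import defaultdict
from statistics import median

# Hypothetical single-channel records: (gene, oligo, spot intensity).
spots = [
    ("geneA", "oligo1", 1500.0), ("geneA", "oligo1", 1400.0),
    ("geneA", "oligo2", 1300.0), ("geneA", "oligo2", 1250.0),
    ("geneA", "oligo3", 90.0),   ("geneA", "oligo3", 110.0),
    ("geneB", "oligo1", 80.0),   ("geneB", "oligo1", 95.0),
    ("geneB", "oligo2", 70.0),   ("geneB", "oligo2", 60.0),
    ("geneB", "oligo3", 120.0),  ("geneB", "oligo3", 100.0),
]

def summarise_by_gene(records):
    """Median of duplicate spots per oligo, then median of oligos per gene."""
    per_oligo = defaultdict(list)
    for gene, oligo, intensity in records:
        per_oligo[(gene, oligo)].append(intensity)

    per_gene = defaultdict(list)
    for (gene, _oligo), values in per_oligo.items():
        per_gene[gene].append(median(values))

    return {gene: median(values) for gene, values in per_gene.items()}

gene_signal = summarise_by_gene(spots)
print(gene_signal)
```

In this toy data one of the three geneA oligos is near background, yet the gene-level median still reflects the two strong oligos.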
On Wed, Apr 7, 2010 at 4:25 PM, adam_pgsql wrote:
> We would in fact like to be able to do this with single colour data if possible. Are there any packages that could help with this?

Agilent generates several statistics that might be relevant. You might look at the Feature Extraction manual to determine which columns of output will help you determine if a single channel is thought to be above background. In any case, I don't think there are any Bioconductor packages that will do exactly what you want without some creativity.

Sean
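As a rough illustration of the "some creativity" route, per-feature above-background flags from the Feature Extraction output could be combined into a per-gene vote. Everything specific here is an assumption to be checked against the Feature Extraction manual and your own file header: the column names `GeneName` and `gIsWellAboveBG`, the `"1"`/`"0"` encoding, and the majority threshold `min_fraction` are all hypothetical choices, and the rows are made-up data.

```python
from collections import defaultdict

# Hypothetical rows, as if parsed from a Feature Extraction text file.
# Column names are assumptions; confirm them in the FE manual.
rows = [
    {"GeneName": "geneA", "gIsWellAboveBG": "1"},
    {"GeneName": "geneA", "gIsWellAboveBG": "1"},
    {"GeneName": "geneA", "gIsWellAboveBG": "0"},
    {"GeneName": "geneB", "gIsWellAboveBG": "0"},
    {"GeneName": "geneB", "gIsWellAboveBG": "0"},
    {"GeneName": "geneB", "gIsWellAboveBG": "1"},
]

def call_presence(rows, gene_col="GeneName", flag_col="gIsWellAboveBG",
                  min_fraction=0.5):
    """Call a gene present when more than min_fraction of its features are flagged."""
    flags = defaultdict(list)
    for row in rows:
        flags[row[gene_col]].append(row[flag_col] == "1")
    return {
        gene: "present" if sum(f) / len(f) > min_fraction else "absent"
        for gene, f in flags.items()
    }

presence = call_presence(rows)
print(presence)
```

A vote like this ignores how far above background each feature is, so it is best treated as a first pass to be compared with the intensity-based summaries.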
Hi,

You really need the chromosome and position where the probes map in the genome. You can use the readPositionalInfo function in the snapCGH package to get that information from previously parsed Agilent txt files. Check the package vignette.

Best,
Daniel

On Apr 6, 2010, at 7:58 PM, Sean Davis wrote:
> I don't think of CGH analysis as "present/absent", but perhaps I am not clear on what you mean by CGH analysis. For Agilent arrays, presumably you have two colors, one representing the sample and the other the reference. Simply make a ratio and then rank the probes based on that.

Daniel Rico Rodriguez, PhD.
Structural Computational Biology Group
Spanish National Cancer Research Center, CNIO