lumiHumanAll.db, GOstats

Al Ivens ▴ 270

@al-ivens-1646

Last seen 10.6 years ago

Hi,

I am working with the Illumina Human WG6 chips and lumiHumanAll.db. After linear model fitting and clustering, I have identified a cluster of ~130 loci that show an interesting profile. However, as 125 of them have no GeneName or EntrezID, I am struggling to figure out what they might be biologically.

From a closer inspection of lumiHumanAll.db, I find that approximately half of the features have no EntrezID, so I wasn't just unlucky with the constituents of my cluster!

Does anyone have any recommendations as to the best approach to get around this problem (which came to light when I tried to run GOstats, which requires EntrezID for mapping of terms)?

Many thanks in advance.

Cheers,
a

Clustering GOstats • 1.4k views

updated 16.9 years ago by Wei Shi ★ 3.6k • written 16.9 years ago by Al Ivens ▴ 270

Sean Davis 21k

@sean-davis-490

Last seen 10 weeks ago

United States

On Tue, Jun 10, 2008 at 1:40 AM, Al Ivens wrote:
> Does anyone have any recommendations as to the best approach to get around this problem (which came to light when I tried to run GOstats, which requires EntrezID for mapping of terms)?

You could use one of the illuminaHuman... (rather than lumiHumanAll.db) annotation packages, which look to have a larger number of mapped probes.

Sean

16.9 years ago Sean Davis 21k

The illuminaHuman... annotation packages are based on accession IDs taken from BLAST results from Nuno Barbosa-Morais. Check out www.compbio.group.cam.ac.uk for more details.

Lynn

Sean Davis wrote:
> You could use one of the illuminaHuman... (rather than lumiHumanAll.db) annotation packages which look to have a larger number of mapped probes.

16.9 years ago Lynn Amon ▴ 280

Wei Shi ★ 3.6k

@wei-shi-2183

Last seen 2 days ago

Australia/Melbourne

Hi Al,

I would suggest mapping the sequences of your ~130 probes directly to the genome using BLAT in the UCSC Genome Browser. You should then be able to find the genes that these probes match by sequence, whether perfectly or imperfectly. You also need to check whether the probes fall within the exon regions of those genes. The probe sequences can be retrieved from Illumina's manifest file.

Hope this helps.

Cheers,
Wei
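The two manual steps in this suggestion (pulling probe sequences out of the manifest, then checking whether an alignment falls inside an exon) could be sketched roughly as below. This is only an illustration in Python, not part of the original answer: it assumes a tab-delimited manifest with `Probe_Id` and `Probe_Sequence` columns, the probe IDs and sequences are made up, and the exon coordinates are hypothetical half-open intervals that you would take from a gene annotation track.

```python
import csv
import io


def manifest_to_fasta(manifest_text, wanted_ids):
    """Return FASTA text for the wanted probes from a tab-delimited manifest.

    Assumes columns named Probe_Id and Probe_Sequence (an assumption about
    the manifest layout; adjust to the file you actually have).
    """
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    records = []
    for row in reader:
        if row["Probe_Id"] in wanted_ids:
            records.append(f">{row['Probe_Id']}\n{row['Probe_Sequence']}")
    return "\n".join(records) + ("\n" if records else "")


def overlaps_exon(hit_start, hit_end, exons):
    """True if the half-open interval [hit_start, hit_end) overlaps any exon."""
    return any(
        hit_start < exon_end and exon_start < hit_end
        for exon_start, exon_end in exons
    )


# Made-up manifest rows, for illustration only.
manifest = (
    "Probe_Id\tProbe_Sequence\n"
    "probe_1\tACGTACGT\n"
    "probe_2\tTTTTCCCC\n"
)
fasta = manifest_to_fasta(manifest, {"probe_2"})
print(fasta, end="")

# Hypothetical alignment at 100-150 against exons at 120-200 and 400-450.
print(overlaps_exon(100, 150, [(120, 200), (400, 450)]))
```

The FASTA text can be pasted into the BLAT web form; the overlap check is then applied to each reported hit on the matching chromosome and strand.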

16.9 years ago Wei Shi ★ 3.6k

On Thu, Jun 12, 2008 at 10:57 PM, Wei Shi wrote:
> I would suggest mapping the sequences of your ~130 probes to the genome directly using "blat" in the UCSC genome browser. You should be able to get genes which these probes can be sequence matched to (perfectly or not perfectly). And also you need to check if probes are matched with the exon regions of the genes. The probe sequences can be retrieved from Illumina's manifest file.

For Affy probes, this works reasonably well. However, for longer probes, it might be easier to map to a transcript database like RefSeq or Ensembl genes.

Sean
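If the probes are aligned against a transcript set instead, the hits still need filtering down to full-length matches. A minimal Python sketch follows, assuming the aligner writes BLAST-style 12-column tabular output (query ID, subject ID, percent identity, alignment length, ...); the thread does not name a tool or format, so that choice, the 50-base default probe length, and the probe and transcript IDs below are all assumptions for illustration.

```python
def full_length_hits(tabular_text, probe_length=50, min_identity=100.0):
    """Map each probe ID to the transcripts it matches over its full length.

    Expects tab-separated rows whose first four fields are query ID,
    subject ID, percent identity, and alignment length (BLAST-style
    tabular output, assumed here rather than taken from the thread).
    """
    hits = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        probe_id, transcript_id = fields[0], fields[1]
        identity, aligned_length = float(fields[2]), int(fields[3])
        if identity >= min_identity and aligned_length >= probe_length:
            hits.setdefault(probe_id, []).append(transcript_id)
    return hits


# Made-up alignment rows, for illustration only.
example = "\n".join([
    "probe_A\ttranscript_1\t100.00\t50\t0\t0\t1\t50\t301\t350\t1e-20\t93.5",
    "probe_A\ttranscript_2\t96.00\t50\t2\t0\t1\t50\t10\t59\t1e-15\t80.0",
    "probe_B\ttranscript_3\t100.00\t32\t0\t0\t1\t32\t5\t36\t1e-8\t60.0",
])
mapped = full_length_hits(example)
print(mapped)
```

Probes that end up with no entry, or with transcripts from more than one gene, are the ones worth inspecting by hand. Loosening `min_identity` admits the imperfect matches mentioned in the quoted suggestion.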

16.9 years ago Sean Davis 21k
