217216_x_at is NOT dihydrolipoamide S-succinyltransferase
Kevin Dawson ▴ 80
@kevin-dawson-538
Last seen 10.2 years ago
Is any one of you the curator of the Affy annotation packages?

The probeset 217216_x_at is mislabeled as dihydrolipoamide S-succinyltransferase. The correct dihydrolipoamide S-succinyltransferase probeset is 215210_s_at.

On the other hand, 217216_x_at is "mismatch repair gene MLH3, mutL (E.coli) homolog 3; mutL homolog 3".

hgu133plus2GENENAME and hgu133plus2GENENAME are both incorrect. I didn't test others.

Please let me know if you are able to fix this error in the next release.

Thanks,
Kevin Dawson
Annotation affy
rgentleman ★ 5.5k
@rgentleman-7725
Last seen 9.6 years ago
United States
Hi,

On Mar 28, 2005, at 4:51 PM, Kevin Dawson wrote:

> Is any one of you the curator of the Affy annotation packages.
>
> The probeset 217216_x_at is mislabeled as dihydrolipoamide
> S-succinyltransferase.
> The correct dihydrolipoamide S-succinyltransferase is 215210_s_at
>
> On the other hand, 217216_x_at is "mismatch repair gene MLH3, mutL
> (E.coli) homolog 3; mutL homolog 3".

Can you please provide complete details on why you believe that to be true (where did you get your information from, and did you check it against the latest Entrez Gene to be sure)? We have documented our sources (see for example the hgu133plus2 documentation).

> hgu133plus2GENENAME and hgu133plus2GENENAME are both incorrect. I
> didn't test others.

They are the same? Sorry, what did you check?

> Please let me know if you are able to fix this error in the next
> release.

If it is, it will be fixed. And for sure lots of annotation will change in the next release (it is constantly being updated).

Robert

> Thanks,
>
> Kevin Dawson

Robert Gentleman
Head, Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
phone: (206) 667-7700
fax: (206) 667-1319
office: M2-B865
email: rgentlem@fhcrc.org
Dear Robert,

I would make the following corrections.

Thank you,
Kevin

215210_s_at:
ACCNUM S72422
CHR 1 -> 14
GENENAME dihydrolipoamide S-succinyltransferase pseudogene (E2 component of 2-oxo-glutarate complex) -> dihydrolipoamide S-succinyltransferase (E2 component of 2-oxo-glutarate complex)
GO 6085,8415,5947,41498152,45252,16740,6099 -> 8415,4149,16740,6091,8152,6099,5739,45252
LOCUSID 1744 -> 1743
MAP 1p31 -> 14q24.3
PMID 8076640,8009371 -> 8076640,8009371,15038610,12805207,12477932,8889548,8268217,8240324,8009371
REFSEQ NG_002326 -> NM_001933
SYMBOL DLSTP -> DLST
UNIGENE Hs.480230 -> Hs.525459

217216_x_at:
ACCNUM AC006530
CHR 14
GENENAME dihydrolipoamide S-succinyltransferase (E2 component of 2-oxo-glutarate complex) -> mutL homolog 3 (E. coli)
GO 8415,4149,6091,8152,5739,45252,16740,6099 -> 5524,5515,3696,7131,6298,5634
LOCUSID 1743 -> 27030
MAP 14q24.3
PMID 15038610,12805207,12477932,8889548,8268217,8240324,8009371 -> 7596406,11292842,12800209,12095912,10615123
REFSEQ NM_001933,NP_001924 -> AB039667
SYMBOL DLST -> MLH3
UNIGENE Hs.525459 -> Hs.279843
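The "FIELD old -> new" notation used in the corrections above can be read mechanically. The following is only a minimal sketch in Python (not part of any Bioconductor package): the helper names `parse_change` and `id_set` are hypothetical, and the input lines are copied from the 217216_x_at block of the post. It lists which fields change and checks whether the old and new GO lists share any ID.

```python
def parse_change(line):
    """Split 'FIELD old -> new' into (field, old, new).

    A line without ' -> ' is treated as unchanged (old == new).
    """
    field, _, rest = line.strip().partition(" ")
    if " -> " in rest:
        old, new = rest.split(" -> ", 1)
    else:
        old = new = rest
    return field, old.strip(), new.strip()


def id_set(text):
    """Turn a comma-separated ID list into a set of strings."""
    return {item.strip() for item in text.split(",") if item.strip()}


# Lines copied from the 217216_x_at block of the post (subset).
lines = [
    "CHR 14",
    "GO 8415,4149,6091,8152,5739,45252,16740,6099 -> 5524,5515,3696,7131,6298,5634",
    "LOCUSID 1743 -> 27030",
    "SYMBOL DLST -> MLH3",
]

changes = {}
for line in lines:
    field, old, new = parse_change(line)
    changes[field] = (old, new)

# Fields whose proposed value differs from the packaged value.
changed_fields = [f for f, (old, new) in changes.items() if old != new]

# GO IDs present both before and after the proposed change.
old_go, new_go = (id_set(value) for value in changes["GO"])
shared_go = old_go & new_go

print("changed fields:", changed_fields)
print("GO IDs shared by old and new:", sorted(shared_go))
```

For this probeset the two GO lists have no ID in common, which is consistent with the claim that the probeset was attached to an entirely different gene rather than to a slightly outdated record.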
On Tue, Mar 29, 2005 at 03:10:46PM -0800, Kevin Dawson wrote:

> Dear Robert,
>
> I would make the following corrections.

What Robert was interested in was primarily _why_ you would make these changes, i.e. what evidence do you have that your annotation is correct? As he said, at least the BioC annotation has a clearly identified source.

Kasper

--
Kasper Daniel Hansen, Research Assistant
Department of Biostatistics, University of Copenhagen
Dear Kasper,

Apparently my previous message did not appear on the Bioconductor list serve due to its attachment. I think Robert should have received it, though.

The annotation error was caused by the fact that DLST and MLH3 are close neighbors on chromosome 14 and, at one point, somebody used the same locus info for both. I cannot attach an image that appears on the list serve; therefore, probably the easiest way to demonstrate what I am talking about is to do the following:

(1) Go to affymetrix.com and go to NetAffy
(2) Look up 217216_x_at. It will tell you it refers to MLH3.
(3) Click on <details>
(4) At the alignments, click on the chr14 entry. This will take you to the UCSC genome browser with the Affymetrix probe set information aligned with the gene information.
(5) Extend the chromosomal location to chr14:74,422,000-74,600,000 to see both genes at a time.

On the left side of the map, you'll see DLST, and on the right side you'll see MLH3. The 217216_x_at probeset matches MLH3 and NOT DLST.

A second problem caused by the misannotation is that, in the annotation packages, 215210_s_at is annotated as the DLST pseudogene. It is true that the probeset also matches the pseudogene on chromosome 1; however, it primarily matches DLST itself on chromosome 14. That is why I am suggesting to clarify the issues and change the annotation of 215210_s_at to DLST and the annotation of 217216_x_at to MLH3, as described in my previous post.

Thank you,
Kevin
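The browser check in steps (4) and (5) amounts to an interval-overlap test: which gene's span contains the probe set alignment? Below is a minimal Python sketch of that idea. All coordinates in it are hypothetical placeholders chosen only to sit inside the chr14:74,422,000-74,600,000 window with DLST on the left and MLH3 on the right; they are not real genome positions and would have to be replaced with values read from the browser.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


def overlaps(a, b):
    """True if two closed intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def genes_hit(probe, genes):
    """Names of all genes whose interval overlaps the probe alignment."""
    return [name for name, span in genes.items() if overlaps(probe, span)]


# HYPOTHETICAL gene spans, for illustration only (not real coordinates).
genes = {
    "DLST": Interval("chr14", 74_430_000, 74_450_000),
    "MLH3": Interval("chr14", 74_550_000, 74_590_000),
}

# HYPOTHETICAL alignment of the probe set target sequence.
probe_217216_x_at = Interval("chr14", 74_560_000, 74_560_500)

print(genes_hit(probe_217216_x_at, genes))
```

With real coordinates substituted, a result naming a single gene would support the assignment argued for above, while an empty or multi-gene result would indicate that the alignment alone does not settle the question.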
Hi Kevin,

Thank you for the report, but I think you are under some misapprehensions about both what we do and what we can do.

First, we "merely" (in quotes because it is a big job) map from what the manufacturer tells us to a set of best matches from published and reliable sources, such as the NCBI. The routines used are well documented (see the AnnBuilder package and relevant publications named therein) and applied as documented.

While you view this as an error, we do not. An error is when we have either misidentified the match or misaligned it with the public data source. It is not an error if the knowledge about the biology has changed, or if the public source itself has made a mistake. We are repackagers of data, not arbiters; we have neither the resources nor the capability to do that. We have tried to be clear about what we do and why. If you find it useful, that is great. If you do not, that is your choice. We do provide AnnBuilder as a tool so that you could build your own set of packages if that suits you.

In this case, the data we are using from Affymetrix maps as follows:

217216_x_at AC006530

So the question then becomes what is known about AC006530, and we tend to rely on Entrez Gene for that information: see http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=4680764 (or however you want to get there). That is the information we primarily use and our data packages reflect that.

Note that Affymetrix has:

215210_s_at S72422

so a different mapping.

We do not have the resources (nor does anyone that I know of) to hand curate annotations as you are suggesting. If you would like our matches to be different, I can suggest making your improvements known to the NCBI or whoever their primary sources are, and as they propagate we will be able to pick them up.

You may want to change your local copy to reflect your local conditions and views. If you do so, please make sure to change the version number to something that is different from our set, perhaps using -'s or similar, so that there can be no possible chance of confusion.

Best wishes,
Robert
Kevin Dawson ▴ 80
@kevin-dawson-538
Last seen 10.2 years ago
Dear Robert,

I understand that you don't have the resources to hand-curate the data. I did not know how the info is propagated from Affy (who is selecting the probesets) to the BioConductor packages.

However, you could have picked up this annotation error automatically. AC006530 is a long piece of DNA that includes both DLST and MLH3, among others (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=search&term=AC006530). If you used the info from Affymetrix on where the oligos are, it would have become clear that the oligoset is in MLH3 and not in DLST.

Thanks anyway,
Kevin
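One way to read the suggestion of an automatic check: when the accession reported for a probe set is linked to more than one gene, the probeset-to-gene assignment cannot be decided from the accession alone and could be flagged for a location-based lookup. The Python sketch below illustrates only that flagging step. The function name and both lookup tables are assumptions made for illustration; the probeset-to-accession pairs come from this thread, while the gene lists are a toy stand-in for what a real Entrez Gene query would return.

```python
def ambiguous_probesets(probe_to_accession, accession_to_genes):
    """Return {probeset: genes} for accessions linked to more than one gene."""
    flagged = {}
    for probe, accession in probe_to_accession.items():
        genes = accession_to_genes.get(accession, [])
        if len(genes) > 1:
            flagged[probe] = genes
    return flagged


# Probeset -> accession pairs as quoted in this thread.
probe_to_accession = {
    "217216_x_at": "AC006530",
    "215210_s_at": "S72422",
}

# TOY gene lists (assumption, not a database dump). Per the thread,
# AC006530 spans DLST and MLH3 "among others"; S72422 is shown with the
# symbol the package carried at the time.
accession_to_genes = {
    "AC006530": ["DLST", "MLH3"],
    "S72422": ["DLSTP"],
}

for probe, genes in ambiguous_probesets(probe_to_accession, accession_to_genes).items():
    print(f"{probe}: accession maps to several genes: {', '.join(genes)}")
```

A flag of this kind does not say which gene is correct; it only marks probe sets for which the probe location would be needed to choose.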
Kevin Dawson wrote:

> Dear Robert,
>
> I understand, you don't have the resources to hand-curate the data. I did
> not know how the info is propagated from Affy (who is selecting the
> probesets) to the BioConductor packages.
>
> However, you could have picked up this annotation error automatically.
> AC006530 is a long piece of DNA that includes both DLST and MLH3 among
> others (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=search&term=AC006530).
> If you used the info from Affymetrix, where the oligos are, it would have
> become clear that the oligoset is in MLH3 and not in DLST.

You might note that arguing about the annotation of an _x_at probeset is probably not worth the trouble. From the Affy website:

    Occasionally, it is not possible to select either a unique probe set or a probe set with all probes common among multiple transcripts ("_s_at"). In such cases, similarity criteria are suspended, and the resulting probe set name is appended with the "_x_at" extension. These probe sets contain some probes that are identical, or highly similar, to unrelated sequences. These probes may cross-hybridize in an unpredictable manner with sequences other than the main target. Data generated from these probe sets should be interpreted with caution, due to the likelihood that some of the signal is from transcripts other than the one being intentionally measured.

In other words, you probably aren't measuring what you think anyway.

Best,
Jim

> Thanks anyway,
>
> Kevin

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
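The naming convention quoted from the Affy website can be applied directly to probe set IDs. The following Python sketch is an illustration only: the function names are hypothetical, and it covers just the three suffixes mentioned in this thread ("_x_at", "_s_at" and plain "_at"), not every suffix Affymetrix uses.

```python
# Longest suffixes first, because every ID here also ends in "_at".
SUFFIXES = ("_x_at", "_s_at", "_at")


def probeset_suffix(probeset_id):
    """Return the first matching suffix from SUFFIXES, or None."""
    for suffix in SUFFIXES:
        if probeset_id.endswith(suffix):
            return suffix
    return None


def needs_caution(probeset_id):
    """True for "_x_at" probe sets, which may cross-hybridize per the quote."""
    return probeset_suffix(probeset_id) == "_x_at"


for pid in ("217216_x_at", "215210_s_at"):
    note = "interpret with caution" if needs_caution(pid) else "no _x_at warning"
    print(f"{pid}: suffix {probeset_suffix(pid)}, {note}")
```

Checking the longer suffixes before the generic "_at" is the one design point that matters here; testing "_at" first would classify every ID as a plain probe set.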