problems with pd.genomewidesnp.6

Sebastian Thieme ▴ 60

@sebastian-thieme-5020

Last seen 10.2 years ago

Hi all,

I have some problems with the pd.genomewidesnp.6 package and I hope someone can help me. The info from get(objects("package:pd.genomewidesnp.6")) is:

    #Class........: AffySNPCNVPDInfo
    #Manufacturer.: Affymetrix
    #Genome Build.: HG19
    #Chip Geometry: 2572 rows x 2680 columns

I want to match the man_fsetid of each probe to one gene. To do that, I look at the gene_assoc field and take the gene with the minimum distance to the probe as the corresponding gene. My commands for getting the raw information are:

    snp.f <- dbGetQuery(con6, "select * from featureSet")
    snp.f <- snp.f[,c("fsetid","man_fsetid","chrom","physical_pos","strand","cytoband","gene_assoc")]

    cn.f <- dbGetQuery(con6, "select * from featureSetCNV")
    cn.f <- cn.f[,c("fsetid","man_fsetid","chrom","chrom_start","strand","cytoband","gene_assoc")]
    # rbind() on data frames needs matching column names, so rename chrom_start
    names(cn.f)[names(cn.f) == "chrom_start"] <- "physical_pos"

    snp6.f <- rbind(snp.f,cn.f)

I then process the gene_assoc field. The problem is that gene_assoc lists genes that are not on the same chromosome as the respective probe, e.g.

    fsetid  man_fsetid  chrom  physical_pos  strand  cytoband
    650443  CN_618877   12     93793083      -       q22

    gene_assoc
    ENST00000358888 // upstream // 315610 // Hs.112553 // RPL41 // 6171 // ribosomal protein L41
    /// ENST00000318066 // downstream // 8981 // Hs.524630 // UBE2N // 7334 // ubiquitin-conjugating enzyme E2N (UBC13 homolog, yeast)
    /// NR_002212 // exon // 0 // --- // NUDT4P1 // 440672 // nudix (nucleoside diphosphate linked moiety X)-type motif 4 pseudogene 1
    /// NM_199040 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix (nucleoside diphosphate linked moiety X)-type motif 4
    /// NM_019094 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix (nucleoside diphosphate linked moiety X)-type motif 4

The gene "NUDT4P1" is annotated on chromosome 1, not 12, and this is only one example. Another example is:

    fsetid  man_fsetid     chrom  physical_pos  strand  cytoband
    186938  SNP_A-4227519  12     31784081      -       p11.21

    gene_assoc
    ENST00000294419 // upstream // 14576 // Hs.10862 // AK3L1 // 205 // adenylate kinase 3-like 1
    /// ENST00000412352 // upstream // 16012 // Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72
    /// NM_013410 // upstream // 14564 // Hs.10862 // AK3L1 // 205 // adenylate kinase 3-like 1
    /// NM_001135864 // upstream // 16012 // Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72

AK3L1 is annotated on chromosome 9, not 12. The corresponding Ensembl ID (ENST00000294419) maps to AK4-201, which is annotated on chromosome 1. These are only two examples; there are many more. Can someone help?

Best regards,

Basti
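The minimum-distance lookup described in the question can be sketched roughly as follows. This is only an illustration: `parse_gene_assoc` and `nearest_genes` are hypothetical helpers, not part of pd.genomewidesnp.6, and the column labels are names inferred from the example records, which appear to use `///` between records and `//` between fields.

```r
# Hypothetical helper: split one gene_assoc string into one row per record.
# Assumes "///" separates records and "//" separates the seven fields.
parse_gene_assoc <- function(x) {
  records <- strsplit(x, "///", fixed = TRUE)[[1]]
  fields <- lapply(records, function(r) {
    f <- trimws(strsplit(r, "//", fixed = TRUE)[[1]])
    length(f) <- 7  # pad short records with NA, drop anything beyond 7 fields
    f
  })
  out <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
  # Column labels are assumptions based on the examples in this thread.
  names(out) <- c("accession", "relationship", "distance", "unigene",
                  "symbol", "entrez_id", "description")
  out$distance <- as.numeric(out$distance)
  out
}

# Hypothetical helper: all gene symbols tied at the minimum distance.
nearest_genes <- function(x) {
  ga <- parse_gene_assoc(x)
  ga <- ga[!is.na(ga$distance), ]
  if (nrow(ga) == 0) return(character(0))
  unique(ga$symbol[ga$distance == min(ga$distance)])
}

# The gene_assoc value for CN_618877, copied from the question.
ga_string <- paste(
  "ENST00000358888 // upstream // 315610 // Hs.112553 // RPL41 // 6171 // ribosomal protein L41",
  "ENST00000318066 // downstream // 8981 // Hs.524630 // UBE2N // 7334 // ubiquitin-conjugating enzyme E2N (UBC13 homolog, yeast)",
  "NR_002212 // exon // 0 // --- // NUDT4P1 // 440672 // nudix (nucleoside diphosphate linked moiety X)-type motif 4 pseudogene 1",
  "NM_199040 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix (nucleoside diphosphate linked moiety X)-type motif 4",
  "NM_019094 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix (nucleoside diphosphate linked moiety X)-type motif 4",
  sep = " /// "
)

print(nearest_genes(ga_string))
```

For CN_618877 this returns both NUDT4P1 and NUDT4, because both are listed at distance 0, so a single "closest gene" is not always well defined.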

PROcess PROcess • 989 views

updated 13.0 years ago by James W. MacDonald 67k • written 13.0 years ago by Sebastian Thieme ▴ 60

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 4 days ago

United States

Hi Sebastian,

On 12/20/11 6:01 PM, Sebastian Thieme wrote:

> gene "NUDT4P1" is annotated on Chromosome 1 not 12 and this is only
> one.

In what build is that true? UCSC claims that NUDT4 and NUDT4P1 are overlapping, on chr12 (hg19).

Anyway, the larger point here is a discussion of what a SNP is, and how SNPs are localized. Essentially, a SNP is a single base that has been found to vary with a certain frequency in a population. SNPs are localized by their flanking sequence, which means that in the case of a pseudogene (which may or may not be on the same chromosome), you will see the same flanking sequence and cannot reliably say where the SNP is really located.

Since DNA chips work by binding to the SNP and its flanking sequence, you cannot say whether you have measured the gene, the pseudogene, or some combination thereof. Listing all possibilities for the SNP location is therefore not a 'problem'; it just reflects our lack of precision.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
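One way to act on this, rather than collapsing each probe to a single gene, is to keep every annotated candidate in a long table and flag probes with more than one candidate symbol. The sketch below assumes a data frame shaped like the `snp6.f` table from the question. The toy `probes` data frame and the output column names are illustrative only, and the field positions are inferred from the two examples in this thread.

```r
# Toy stand-in for snp6.f, holding the two gene_assoc values from the question.
probes <- data.frame(
  man_fsetid = c("CN_618877", "SNP_A-4227519"),
  gene_assoc = c(
    paste(
      "ENST00000358888 // upstream // 315610 // Hs.112553 // RPL41 // 6171 // ribosomal protein L41",
      "ENST00000318066 // downstream // 8981 // Hs.524630 // UBE2N // 7334 // ubiquitin-conjugating enzyme E2N (UBC13 homolog, yeast)",
      "NR_002212 // exon // 0 // --- // NUDT4P1 // 440672 // nudix (nucleoside diphosphate linked moiety X)-type motif 4 pseudogene 1",
      "NM_199040 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix (nucleoside diphosphate linked moiety X)-type motif 4",
      "NM_019094 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix (nucleoside diphosphate linked moiety X)-type motif 4",
      sep = " /// "
    ),
    paste(
      "ENST00000294419 // upstream // 14576 // Hs.10862 // AK3L1 // 205 // adenylate kinase 3-like 1",
      "ENST00000412352 // upstream // 16012 // Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72",
      "NM_013410 // upstream // 14564 // Hs.10862 // AK3L1 // 205 // adenylate kinase 3-like 1",
      "NM_001135864 // upstream // 16012 // Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72",
      sep = " /// "
    )
  ),
  stringsAsFactors = FALSE
)

# One row per (probe, candidate gene record); nothing is discarded.
candidates <- do.call(rbind, lapply(seq_len(nrow(probes)), function(i) {
  records <- strsplit(probes$gene_assoc[i], "///", fixed = TRUE)[[1]]
  parts <- lapply(records, function(r) trimws(strsplit(r, "//", fixed = TRUE)[[1]]))
  data.frame(
    man_fsetid   = probes$man_fsetid[i],
    relationship = vapply(parts, `[`, character(1), 2),
    distance     = as.numeric(vapply(parts, `[`, character(1), 3)),
    symbol       = vapply(parts, `[`, character(1), 5),
    stringsAsFactors = FALSE
  )
}))

# Transcripts of the same gene can produce identical rows; keep each once.
candidates <- unique(candidates)

# Number of distinct candidate symbols per probe; values > 1 mark ambiguity.
n_symbols <- tapply(candidates$symbol, candidates$man_fsetid,
                    function(s) length(unique(s)))
print(n_symbols)
```

Downstream analyses can then decide per probe how to treat the ambiguous cases instead of having that choice made silently at annotation time.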

13.0 years ago James W. MacDonald 67k
