pd.mapping250k.sty package: featureSet:fragment_length
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 11 months ago
United States
Hi,

Could someone please tell me whether the fragment_length in the featureSet of pd.mapping250k.sty is the fragment_length of the sample? Is there documentation available for looking up the meaning of each field?

Some rows have NAs for most of the fields even though the allele information is known. Is this expected?

Thanks so much for your help!

library("pd.mapping250k.sty")
con = db(pd.mapping250k.sty)
dbListFields(con, "featureSet")
 [1] "fsetid"          "man_fsetid"      "dbsnp_rs_id"     "chrom"
 [5] "physical_pos"    "strand"          "cytoband"        "allele_a"
 [9] "allele_b"        "gene_assoc"      "fragment_length" "dbsnp"
[13] "cnv"

dbGetQuery(con, "select * from featureSet order by fsetid desc limit 2")
  fsetid    man_fsetid dbsnp_rs_id chrom physical_pos strand cytoband allele_a allele_b
1 238378 SNP_A-4301986   rs6989223     8      5214036      -    p23.2        A        G
2 238377 SNP_A-2291495  rs11644392  <NA>           NA   <NA>     <NA>        A        G
  fragment_length dbsnp
1            1667     0
2              NA    NA

Best regards,

Julie

sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] pd.mapping250k.sty_1.0.0 RSQLite_0.9-2            DBI_0.2-5
[4] oligo_1.12.2             oligoClasses_1.10.0      Biobase_2.8.0
[7] affxparser_1.20.0

loaded via a namespace (and not attached):
[1] affyio_1.16.0         Biostrings_2.16.9     IRanges_1.6.11        preprocessCore_1.10.0
[5] splines_2.11.1        tools_2.11.1
@james-w-macdonald-5106
Last seen 1 day ago
United States
Hi Julie,

On 9/17/2010 9:53 AM, Zhu, Julie wrote:
> Hi,
>
> Could someone please tell me whether the fragment_length in the featureSet
> of pd.mapping250k.sty is the fragment_length of the sample? Are there
> documentations available for looking up the meanings of each field?

The fragment_length is the length of the restriction fragment. You could hypothetically have figured this out yourself by comparing the fragment length to the data on the netaffx site. Unfortunately, it looks like the current version of the pd.mapping250k.sty package is out of date when compared to what netaffx has, as the fragment length data for these two probesets don't agree.

This is not true of the pd.genomewidesnp.6 package, which is what I have installed. So for instance,

> dbGetQuery(con, "select fragment_length, fragment_length2, man_fsetid from featureSet limit 10;")
   fragment_length fragment_length2    man_fsetid
1              395              217 SNP_A-2131660
2               NA              702 SNP_A-1967418
3              633              883 SNP_A-1969580
4              831              399 SNP_A-4263484
5              970              611 SNP_A-1978185
6             1508              711 SNP_A-4264431
7               NA              921 SNP_A-1980898
8               NA              243 SNP_A-1983139
9               NA              194 SNP_A-4265735
10             420              858 SNP_A-1995832

the fragment_length and fragment_length2 data here do agree (well, at least the two I checked agree ;-P) with netaffx.

As for the other field names, most seem clear to me. Is there one in particular that is not clear?

> Some rows have NAs for most the fields even though the allele information is
> known, is this expected?

It is expected, depending on when the package was built. We are simply taking data from Affymetrix and re-packaging into an object that is easier to use, so we are dependent on the data we get from Affy. Since annotation of genetic data is a moving target, things are always changing.

We only build these packages on a semi-annual basis, so we end up out of date quite quickly. This is a tradeoff between having the most up-to-date data, and having stable data packages that people can rely on.

We do provide the functionality to build your own, so if you desire the most up-to-date package, you can build a personal package using the pdInfoBuilder package.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
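For anyone wanting to see how widespread the missing annotation is, here is a minimal sketch of counting NA (SQL NULL) values with plain SQL. So that it runs without the annotation package installed, it uses a toy in-memory SQLite table built only from the two rows shown in the question; the table is a stand-in, not the real featureSet. The same query strings should work against the real connection from db(pd.mapping250k.sty), assuming the column names listed by dbListFields().

```r
library(DBI)
library(RSQLite)

# Toy stand-in for featureSet: only the two rows quoted in the question,
# with a subset of the columns. Not the real annotation data.
con <- dbConnect(SQLite(), ":memory:")
toy <- data.frame(
  fsetid          = c(238378L, 238377L),
  man_fsetid      = c("SNP_A-4301986", "SNP_A-2291495"),
  chrom           = c("8", NA),
  allele_a        = c("A", "A"),
  allele_b        = c("G", "G"),
  fragment_length = c(1667L, NA),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "featureSet", toy)

# In SQLite, "x is null" evaluates to 0 or 1, so sum() counts the NULLs.
na_summary <- dbGetQuery(con, "
  select count(*)                     as n_total,
         sum(fragment_length is null) as n_missing_fragment_length,
         sum(chrom is null)           as n_missing_chrom
  from featureSet")
print(na_summary)

# Probesets whose alleles are known but whose position and fragment
# length are missing, as in the second row of the question.
unannotated <- dbGetQuery(con, "
  select man_fsetid, allele_a, allele_b
  from featureSet
  where chrom is null and fragment_length is null")
print(unannotated)

dbDisconnect(con)
```

Against the real package, replacing the toy connection with con <- db(pd.mapping250k.sty) and skipping dbWriteTable() should give the package-wide counts; in that case the connection belongs to the package and probably should not be closed with dbDisconnect().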
Jim,

Thank you very much for the detailed information! It all makes sense.

Best regards,

Julie