Question

BeadarraySNP/read.SnpSetIllumina



Boris Umylny ▴ 120

@boris-umylny-3254


Hi,

We are trying to use beadarraySNP on Illumina sample files. We are using R 2.9 with Bioconductor 3.4. If anyone can help us, it would be most appreciated.

At this time we have access only to final report files, so we are constructing the sample sheet manually, based on the documentation:

    [Header],,,,,,
    Investigator Name,Test,,,,,
    Project Name,Test,,,,,
    Experiment Name,Test,,,,,
    Date,27072009,,,,,
    [Data],,,,,,
    Sample_Name,Sample_Well,Sample_Plate,Sample_Group,Pool_ID,Sentrix_ID,Sentrix_Position
    NA12155,well1,plate1,group,GS001-OPA,1280260,R001_C00
    NA10861,well2,plate1,group,GS001-OPA,1280260,R002_C00
    NA12814,well3,plate1,group,GS001-OPA,1280260,R003_C00
    NA11829,well4,plate1,group,GS001-OPA,1280260,R004_C00

The report file we have contains the following columns:

    [Header]
    Processing Date,29/7/2009 14:28
    Content,Human610_Quad_v1
    Num SNPs,620901
    Total SNPs,620901
    Num Samples,4
    Total Samples,4
    [Data]
    SNP Name,Sample ID,Allele1 - Forward,Allele2 - Forward,GC Score,X,Y,X Raw,Y Raw,Log R Ratio,B Allele Freq
    200003,NA12155,A,G,0.9299,0.633,0.530,7898,7050,-0.1552,0.5069
    200006,NA12155,T,T,0.7877,1.563,0.143,18645,2500,-0.1099,0.0102
    200047,NA12155,A,A,0.8612,0.472,0.048,5916,1201,0.0005,0.0242
    200050,NA12155,C,C,0.8331,0.009,1.209,761,15072,0.0075,1.0000

This differs from the documentation. In particular, it does not have the GT Score, Chr and Position columns. For our purposes those columns are not important: we are taking annotation information from dbSNP and the Illumina platform annotations.
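Since the report lacks the GT Score, Chr and Position columns, one way to add placeholders and rename the allele columns without hand-editing is sketched below in base R. This is only an illustration: `patch_report_data` is a hypothetical helper, not part of beadarraySNP, and the placeholder values (0.0, 1, 1) are arbitrary dummies rather than real annotation.

```r
# Hypothetical helper (not part of beadarraySNP): takes the lines of the
# [Data] section (header row first), renames the "Forward" allele columns
# to "AB" and appends dummy GT Score / Chr / Position columns.
patch_report_data <- function(data_lines) {
  con <- textConnection(data_lines)
  on.exit(close(con))
  # Read everything as character: otherwise an allele column that contains
  # only "T" would be parsed as logical TRUE, and "0.530" would lose its
  # trailing zero.
  d <- read.csv(con, check.names = FALSE, colClasses = "character")
  names(d) <- sub(" - Forward$", " - AB", names(d))
  d[["GT Score"]] <- "0.0"  # placeholder, not a real score
  d[["Chr"]] <- "1"         # placeholder chromosome
  d[["Position"]] <- "1"    # placeholder position
  d
}

# Two rows copied from the report shown above
data_lines <- c(
  "SNP Name,Sample ID,Allele1 - Forward,Allele2 - Forward,GC Score,X,Y,X Raw,Y Raw,Log R Ratio,B Allele Freq",
  "200003,NA12155,A,G,0.9299,0.633,0.530,7898,7050,-0.1552,0.5069",
  "200006,NA12155,T,T,0.7877,1.563,0.143,18645,2500,-0.1099,0.0102"
)

patched <- patch_report_data(data_lines)

# Print the patched section as comma-separated text
write.csv(patched, file = "", row.names = FALSE, quote = FALSE)
```

For a real file, the lines after the `[Data]` marker would come from `readLines()` on the report, and the original `[Header]` block would need to be written back in front of the patched data.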
After modifying the file to add these columns and to rename Allele1/2 - Forward to Allele1/2 - AB, we have:

    [Header]
    Processing Date,29/7/2009 14:30
    Content,Human610_Quad_v1
    Num SNPs,620901
    Total SNPs,620901
    Num Samples,4
    Total Samples,4
    [Data]
    SNP Name,Sample ID,Allele1 - AB,Allele2 - AB,GC Score,X,Y,X Raw,Y Raw,Log R Ratio,B Allele Freq,GT Score,Chr,Position
    200003,NA12155,A,G,0.9299,0.633,0.530,7898,7050,-0.1552,0.5069,0.0,1,1
    200006,NA12155,T,T,0.7877,1.563,0.143,18645,2500,-0.1099,0.0102,0.0,1,1
    200047,NA12155,A,A,0.8612,0.472,0.048,5916,1201,0.0005,0.0242,0.0,1,1
    200050,NA12155,C,C,0.8331,0.009,1.209,761,15072,0.0075,1.0000,0.0,1,1
    200052,NA12155,T,T,0.9466,0.012,0.901,961,12084,-0.0236,1.0000,0.0,1,1

In both cases we got the same error:

    xx <- read.SnpSetIllumina(samplesheet="sample_sheet.csv", reportfile = "report_file.csv")
    Error in read.SnpSetIllumina(samplesheet = "sample_sheet.csv", reportfile = "report_file.csv") :
      Columns:SNP Name, Sample ID, GC Score, GT Score, X Raw, Y Raw, Chr, Position are missing in the report file

We also tried to remove those columns that are not mentioned in the document (for example X, Y and Log R Ratio); the results were the same.

Many, many thanks in advance!

Sincerely,
Boris Umylny
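The error lists every required column as missing, including SNP Name and Sample ID, which are clearly present. One possibility worth ruling out (an assumption, not a confirmed diagnosis) is that the header line is being split on a different delimiter than the comma used in the file. The base-R sketch below checks which required names can be found in the `[Data]` header under a given separator. `missing_report_columns` is a hypothetical helper and does not reproduce what `read.SnpSetIllumina` does internally; the required names are copied from the error message.

```r
# Column names taken from the error message above
required <- c("SNP Name", "Sample ID", "GC Score", "GT Score",
              "X Raw", "Y Raw", "Chr", "Position")

# Hypothetical helper (not part of beadarraySNP): return the required
# column names that are absent from the header row following "[Data]"
# when that row is split on `sep`.
missing_report_columns <- function(lines, required, sep) {
  data_idx <- grep("^\\[Data\\]", lines)
  if (length(data_idx) == 0) stop("no [Data] section found")
  header <- lines[data_idx[1] + 1]
  found <- strsplit(header, sep, fixed = TRUE)[[1]]
  setdiff(required, found)
}

# Abbreviated copy of the modified report; for a real file use
# report_lines <- readLines("report_file.csv")
report_lines <- c(
  "[Header]",
  "Num Samples,4",
  "[Data]",
  "SNP Name,Sample ID,Allele1 - AB,Allele2 - AB,GC Score,X,Y,X Raw,Y Raw,Log R Ratio,B Allele Freq,GT Score,Chr,Position",
  "200003,NA12155,A,G,0.9299,0.633,0.530,7898,7050,-0.1552,0.5069,0.0,1,1"
)

missing_comma <- missing_report_columns(report_lines, required, ",")
missing_tab <- missing_report_columns(report_lines, required, "\t")

print(missing_comma)  # names not found when splitting on commas
print(missing_tab)    # names not found when splitting on tabs
```

If splitting on tabs reports all eight names as missing while splitting on commas reports none, the file contents are fine and the separator is the likelier culprit. In that case it may be worth checking `?read.SnpSetIllumina` in the installed version for a separator argument, or re-exporting the report as tab-delimited text.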

beadarraySNP

Posted 15.8 years ago by Boris Umylny