Error reading IDAT files and creating rgSet
sharangiiv • 0
@d5fb3054

Hi,

I am reading EPIC samples using minfi's read.metharray.exp() function as follows:

library(minfi)
library(knitr)
library(limma)
library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
library(IlluminaHumanMethylation450kmanifest)
library(RColorBrewer)
library(Gviz)
library(stringr)

setwd("C:\\Users\\Sarah Vasavan\\Documents\\DNA methylation project")
baseDir <- getwd()
targets <- read.metharray.sheet(baseDir)
rgSet <- read.metharray.exp(targets=targets, force=TRUE)

My R output is as follows:

> setwd("C:\\Users\\Sarah Vasavan\\Documents\\DNA methylation project")
> baseDir <- getwd()
> targets <- read.metharray.sheet(baseDir)
[read.metharray.sheet] Found the following CSV files:
[1] "C:/Users/Sarah Vasavan/Documents/DNA methylation project/DeLuca BAN11196 Methylation450 Dec 2013.csv"
> rgSet <- read.metharray.exp(targets=targets, force=TRUE)
Error in read.metharray(basenames = files, extended = extended, verbose = verbose,  : 
  !anyDuplicated(basenames) is not TRUE

Can you please tell me what I am doing wrong? I've checked that all my idat files match the samples in the sample sheet.

My idat files are as follows:

"C:\Users\Sarah Vasavan\Documents\DNA methylation project\R01C01_Grn.idat"
"C:\Users\Sarah Vasavan\Documents\DNA methylation project\R01C01_Red.idat"
"C:\Users\Sarah Vasavan\Documents\DNA methylation project\R01C02_Grn.idat"
"C:\Users\Sarah Vasavan\Documents\DNA methylation project\R01C02_Red.idat"
...
"C:\Users\Sarah Vasavan\Documents\DNA methylation project\R06C02_Grn.idat"
"C:\Users\Sarah Vasavan\Documents\DNA methylation project\R06C02_Red.idat"

minfi MethylationArray • 876 views
@james-w-macdonald-5106

Yes. It's saying that you have multiple files with the same name. You should check the targets data.frame to identify and figure out why that is so.
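
One way to run that check is sketched below in base R. The data frame here is a made-up stand-in for targets (the paths are illustrative only); the error message itself suggests the failing test is stopifnot(!anyDuplicated(basenames)), so any repeated value in the Basename column would trigger it.

```r
# Toy stand-in for the data frame returned by read.metharray.sheet()
# (values are illustrative, not taken from the question)
targets_demo <- data.frame(
  Array = c("R01C01", "R01C01", "R02C01"),
  Basename = c("/data/slide1_R01C01", "/data/slide1_R01C01", "/data/slide1_R02C01"),
  stringsAsFactors = FALSE
)

# Index of the first repeated entry; 0 would mean there are no duplicates
first_dup <- anyDuplicated(targets_demo$Basename)

# How often each Basename occurs; anything above 1 is a problem
basename_counts <- table(targets_demo$Basename)

print(first_dup)
print(basename_counts[basename_counts > 1])
```

Running the same two calls on targets$Basename should show which entries collide.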

> print(targets)

This is the output I get in R when I do that:

   Sample_Name Sample_Group Sample_Plate  Project Pool_ID Sample_Well  Array
1           10            0           10 BAN11196       0         A10 R01C01
2           13            0           13 BAN11196       0         B10 R02C01
3           17            0           17 BAN11196       0         C10 R03C01
4           18            0           18 BAN11196       0         D10 R04C01
5           19            0           19 BAN11196       0         E10 R05C01
6           21            0           21 BAN11196       0         F10 R06C01
7           25            0           25 BAN11196       0         G10 R01C02
8           29            0           29 BAN11196       0         H10 R02C02
9           31            0           31 BAN11196       0         A11 R03C02
10          41            0           41 BAN11196       0         B11 R04C02
11          47            0           47 BAN11196       0         C11 R05C02
12          51            0           51 BAN11196       0         D11 R06C02
        Slide     Basename
1  9344730088 character(0)
2  9344730088 character(0)
3  9344730088 character(0)
4  9344730088 character(0)
5  9344730088 character(0)
6  9344730088 character(0)
7  9344730088 character(0)
8  9344730088 character(0)
9  9344730088 character(0)
10 9344730088 character(0)
11 9344730088 character(0)
12 9344730088 character(0)
>

Right. And what is in the Basename column?


character(0) for all of them


And since that's supposed to point to the directory that contains the files, it probably won't work. This is where your skills as a detective have to come into play.
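
As a hint for the detective work: my understanding (an assumption based on reading the minfi source, worth verifying against your installed version) is that when the sheet has Slide and Array columns, read.metharray.sheet fills Basename by searching the directory for files named <Slide>_<Array>_Grn.idat. The sketch below imitates that search in base R using file names from this thread. find_basename is a hypothetical helper written for illustration, not a minfi function.

```r
# File names as listed in the question (no slide number in front)
idat_files <- c("R01C01_Grn.idat", "R01C01_Red.idat",
                "R01C02_Grn.idat", "R01C02_Red.idat")

# Hypothetical helper imitating the lookup: match <Slide>_<Array>_Grn.idat
# and strip the channel suffix from whatever is found
find_basename <- function(slide, array, files) {
  pattern <- sprintf("%s_%s_Grn.idat", slide, array)
  hits <- grep(pattern, files, value = TRUE)
  sub("_Grn\\.idat$", "", hits)
}

# With the files named as they are, nothing matches
no_match <- find_basename("9344730088", "R01C01", idat_files)

# If the files carried the slide prefix, the lookup would succeed
renamed <- paste0("9344730088_", idat_files)
with_prefix <- find_basename("9344730088", "R01C01", renamed)

print(no_match)
print(with_prefix)
```

An empty result for every row would explain why each Basename prints as character(0), and twelve identical entries are exactly what a duplicate check rejects.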


Thank you! My updated sample sheet now contains this column:

Basename
/Users/Sarah Vasavan/Documents/DNA methylation project/R01C01_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R01C01_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R02C01_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R02C01_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R03C01_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R03C01_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R04C01_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R04C01_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R05C01_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R05C01_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R06C01_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R06C01_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R01C02_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R01C02_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R02C02_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R02C02_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R03C02_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R03C02_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R04C02_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R04C02_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R05C02_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R05C02_Red.idat
/Users/Sarah Vasavan/Documents/DNA methylation project/R06C02_Grn.idat;/Users/Sarah Vasavan/Documents/DNA methylation project/R06C02_Red.idat

However, I still get the same error:

> #create rgSet
> rgSet <- read.metharray.exp(targets=targets, force=TRUE)
Error in read.metharray(basenames = files, extended = extended, verbose = verbose,  : 
  !anyDuplicated(basenames) is not TRUE

That's not what it should look like. The Basename doesn't include the idat file names. The help page for read.metharray.sheet has an example you can run to see what it should look like.

> example(read.metharray.sheet)

rd.mt.> if(require(minfiData)) {
rd.mt.+ 
rd.mt.+ baseDir <- system.file("extdata", package = "minfiData")
rd.mt.+ sheet <- read.metharray.sheet(baseDir)
rd.mt.+ 
rd.mt.+ }
[read.metharray.sheet] Found the following CSV files:
[1] "C:/Users/jmacdon/AppData/Local/R/win-library/4.3/minfiData/extdata/SampleSheet.csv"
> sheet
  Sample_Name Sample_Well
1    GroupA_3          H5
2    GroupA_2          D5
3    GroupB_3          C6
4    GroupB_1          F7
5    GroupA_1          G7
6    GroupB_2          H7
  Sample_Plate Sample_Group Pool_ID
1         <NA>       GroupA    <NA>
2         <NA>       GroupA    <NA>
3         <NA>       GroupB    <NA>
4         <NA>       GroupB    <NA>
5         <NA>       GroupA    <NA>
6         <NA>       GroupB    <NA>
  person age sex status  Array
1    id3  83   M normal R02C02
2    id2  58   F normal R04C01
3    id3  83   M cancer R05C02
4    id1  75   F cancer R04C02
5    id1  75   F normal R05C02
6    id2  58   F cancer R06C02
       Slide
1 5723646052
2 5723646052
3 5723646052
4 5723646053
5 5723646053
6 5723646053
                                                                                         Basename
1 C:/Users/jmacdon/AppData/Local/R/win-library/4.3/minfiData/extdata/5723646052/5723646052_R02C02
2 C:/Users/jmacdon/AppData/Local/R/win-library/4.3/minfiData/extdata/5723646052/5723646052_R04C01
3 C:/Users/jmacdon/AppData/Local/R/win-library/4.3/minfiData/extdata/5723646052/5723646052_R05C02
4 C:/Users/jmacdon/AppData/Local/R/win-library/4.3/minfiData/extdata/5723646053/5723646053_R04C02
5 C:/Users/jmacdon/AppData/Local/R/win-library/4.3/minfiData/extdata/5723646053/5723646053_R05C02
6 C:/Users/jmacdon/AppData/Local/R/win-library/4.3/minfiData/extdata/5723646053/5723646053_R06C02
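
Following that layout, here is a minimal sketch of filling Basename by hand, assuming the files sit directly in the project directory and are named <Array>_Grn.idat and <Array>_Red.idat as listed in the question. The data frame is a shortened stand-in for targets, and as far as I know the _Grn.idat and _Red.idat suffixes get appended for you when the files are read, so they should not be part of Basename.

```r
# Stand-ins for the values in this thread; adjust to your own setup
baseDir <- "C:/Users/Sarah Vasavan/Documents/DNA methylation project"
targets_demo <- data.frame(
  Slide = "9344730088",
  Array = c("R01C01", "R02C01", "R03C01"),
  stringsAsFactors = FALSE
)

# Basename = directory plus file-name prefix, without _Grn.idat / _Red.idat
targets_demo$Basename <- file.path(baseDir, targets_demo$Array)

# Every row must be unique, which is what the failing check requires
stopifnot(!anyDuplicated(targets_demo$Basename))

# The two files that would be looked up for the first sample
first_files <- paste0(targets_demo$Basename[1], c("_Grn.idat", "_Red.idat"))
print(first_files)
```

With the real data you would set targets$Basename the same way on the data frame returned by read.metharray.sheet and then call read.metharray.exp(targets = targets). The alternative is to rename the files so they carry the slide number in front (for example 9344730088_R01C01_Grn.idat), which is the naming the example sheet above uses.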