Working flow_own CEL file reading problem
@junshi-yazaki-1259
Hi Jim, Seth, Reddy, Paul,

Thank you very much for your suggestions. I may have made the cdf environment. Could you please tell me how to confirm whether the environment is OK or not? Next, I tried to read and normalize CEL files from our custom Affy array. If my workflow could be useful for Affy beginners like me, could you please help me?

At first, I typed the following:

> source("http://www.bioconductor.org/getBioC.R")
> getBioC("all")
> library(makecdfenv)
> Library(affy)
> make.cdf.package ("arabidopsistlgF.cdf")

Then I moved to the Terminal on my Mac:

> R CMD INSTALL arabidopsistlgFcdf

I returned to R:

> arabidopsistlgF = make.cdf.env("arabidopsistlgF.cdf")

Then I shut down my Mac. Are these steps correct for making the cdf environment? After that I started again:

> source("http://www.bioconductor.org/getBioC.R")
> getBioC()
> library(affy)
> Data <- readAffy()
> eset <- rma(data)

I got the error below:

***********
Note: You did not specify a download type. Using a default value of: Source
This will be fine for almost all users

Error in getCdfInfo(object) : Could not obtain CDF environment, problems encountered:
Specified environment specified did not contain arabidopsis_tlgF_4x
Library - package arabidopsistlgf4xcdf not installed
Data for package affy did not contain arabidopsis_tlgF_4x
Bioconductor - arabidopsistlgf4xcdf not available
*********

Q1. Do I need to type the commands below every time after a restart? If I have to repeat them every time to make the cdf environment, this step will cost me a lot of time (the cdf file is big).

**********
> source("http://www.bioconductor.org/getBioC.R")
> getBioC("all")
> library(makecdfenv)
> Library(affy)
> make.cdf.package ("arabidopsistlgF.cdf")
**********

Next, I tried makecdfenv again, as below:

> env = make.cdf.env("arabidopsistlgF.cdf")
> library(makecdfenv)
> env = make.cdf.env("arabidopsistlgF.cdf")
> cel.files=list.files(pattern=".CEL$")
> data=ReadAffy(filenames=cel.files)
> pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))
> temp=rma(data)

I got the error below:

******
Note: You did not specify a download type. Using a default value of: Source
This will be fine for almost all users

Error in getCdfInfo(object) : Could not obtain CDF environment, problems encountered:
Specified environment specified did not contain arabidopsis_tlgF_4x
Library - package arabidopsistlgf4xcdf not installed
Data for package affy did not contain arabidopsis_tlgF_4x
Bioconductor - arabidopsistlgf4xcdf not available
*********

So I made a copy of "arabidopsistlgF.cdf", renamed it "arabidopsistlgF4x", and continued:

> env = make.cdf.env("arabidopsistlgF4x.cdf")
> cel.files=list.files(pattern=".CEL$")
> data=ReadAffy(filenames=cel.files)
>
> pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))

I got an error again:

********
Error in whatcdf("J_HpaII_Wt_10uM.CEL") : Could not open file J_HpaII_Wt_10uM.CEL
********

I thought I might need gcrma for CEL file normalization, so I tried:

> library(gcrma)
Loading required package: matchprobes
> Data <- ReadAffy()
> eset <- gcrma(Data)

I got an error again:

********
Computing affinities[1] "Checking to see if your internet connection works..."
Warning message:
unable to connect to 'www.bioconductor.org' on port 80.
Note: http://www.bioconductor.org/repository/devel/package/Source does not seem to have a valid repository, skipping
Warning messages:
1: Failed to read replisting at http://www.bioconductor.org/repository/devel/package/Source in: getReplisting(repURL, repFile, method = method)
2: unable to connect to 'www.bioconductor.org' on port 80.
Note: http://www.bioconductor.org/repository/devel/package/Win32 does not seem to have a valid repository, skipping
Note: You did not specify a download type. Using a default value of: Source
This will be fine for almost all users

Error in getCDF(cdfpackagename) : Environment arabidopsistlgf4xcdf was not found in the Bioconductor repository.
In addition: Warning message:
Failed to read replisting at http://www.bioconductor.org/repository/devel/package/Win32 in: getReplisting(repURL, repFile, method = method)
********

Q2. I cannot read my CEL files now. Our cdf file name is "arabidopsistlgF.cdf", but the cif file name is "arabidopsistlgF_4x.cif". Do I need to use the same name for the cif and cdf files, given that the CEL file includes the cif file name? And how can I start reading the CEL files?

Q3. I would also like to read and normalize a large number of CEL files. Could you please suggest which package is better for reading and normalizing a custom Affy array? And which is better, rma (Robust Multi-Array Average expression measure) or gcrma (Background adjustment using sequence information)?

Q4. If our array has over 3 million data points, how long will reading and normalization take for one array (does it depend on machine power)? Do you have any estimate of the computational efficiency? Reading the cdf file takes me about 20 minutes.

Thank you very much,
Junshi

--
Junshi Yazaki
The Salk Institute for Biological Studies
Normalization cdf affy makecdfenv
@james-w-macdonald-5106
Junshi Yazaki wrote:
> Hi Jim, Seth, Reddy, Paul,
>
> Thank you very much for your suggestion. I may be make cdf
> environment. Could you please help me how to confirm the env is OK or
> No? Next I tried cel file reading and normalize from our custom affy
> array. If my working flow are useful for affy beginner like me, could
> you please help me?
>
> At first, I typed below...
>
>> source("http://www.bioconductor.org/getBioC.R")
>> getBioC("all")
>> library(makecdfenv)
>> Library(affy)
>> make.cdf.package ("arabidopsistlgF.cdf")
>
> And move to Terminal on my Mac,
>
>> R CMD INSTALL arabidopsistlgFcdf

It should actually be arabidopsistlgfcdf. Note that the F is lower case.

> Return to R,
>
>> arabidopsistlgF = make.cdf.env("arabidopsistlgF.cdf")

This is an unnecessary step - you already made and installed the package.

> And I shut down my Mac. Is these step correct for making cdf
> environment? And then I started again.
>
>> source("http://www.bioconductor.org/getBioC.R")
>> getBioC()
>> library(affy)
>> Data <- readAffy()

At this point, try

cleancdfname(cdfName(Data))

if the result is not arabidopsistlgfcdf, then you need to make your cdfenv again, using the cleancdfname(). I am betting the cleancdfname will be arabidopsistlgf4xcdf, so you will need to do

make.cdf.package("arabidopsistlgF.cdf", packagename="arabidopsistlgf4xcdf")

And then install using R CMD INSTALL

>> eset <- rma(data)
>
> I got Error below,
> ***********
> Note: You did not specify a download type. Using a default value of:
> Source
> This will be fine for almost all users
>
> Error in getCdfInfo(object) : Could not obtain CDF environment, problems
> encountered:
> Specified environment specified did not contain arabidopsis_tlgF_4x
> Library - package arabidopsistlgf4xcdf not installed
> Data for package affy did not contain arabidopsis_tlgF_4x
> Bioconductor - arabidopsistlgf4xcdf not available
> *********
> Q1. I have question. Do I need typing below every time after restart? If
> I need the typing every time for making cdf env, I need lot of time for
> this step (cdf file is big).

No. If you install correctly, it should be there for you every time you run R.

> **********
>
>> source("http://www.bioconductor.org/getBioC.R")
>> getBioC("all")
>> library(makecdfenv)
>> Library(affy)
>> make.cdf.package ("arabidopsistlgF.cdf")
>
> **********
> And next, I tried makecdfenv again like below,
>
>> env = make.cdf.env("arabidopsistlgF.cdf")
>> library(makecdfenv)
>> env = make.cdf.env("arabidopsistlgF.cdf")
>> cel.files=list.files(pattern=".CEL$")
>> data=ReadAffy(filenames=cel.files)
>> pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))
>> temp=rma(data)
>
> I got Error below,
> ******
> Note: You did not specify a download type. Using a default value of:
> Source
> This will be fine for almost all users
>
> Error in getCdfInfo(object) : Could not obtain CDF environment, problems
> encountered:
> Specified environment specified did not contain arabidopsis_tlgF_4x
> Library - package arabidopsistlgf4xcdf not installed
> Data for package affy did not contain arabidopsis_tlgF_4x
> Bioconductor - arabidopsistlgf4xcdf not available
> *********
> So I made copy of "arabidopsistlgF.cdf", and change name
> "arabidopsistlgF4x". And continue,
>
>> env = make.cdf.env("arabidopsistlgF4x.cdf")
>> cel.files=list.files(pattern=".CEL$")
>> data=ReadAffy(filenames=cel.files)
>>
>> pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))
>
> I got Error again,
> ********
> Error in whatcdf("J_HpaII_Wt_10uM.CEL") : Could not open file
> J_HpaII_Wt_10uM.CEL
> ********
>
> I thought I may be need for cel file normalization, below,
>
>> library(gcrma)
>
> Loading required package: matchprobes
>
>> Data <- ReadAffy()
>> eset <- gcrma(Data)
>
> I got Error again,
> ********
> Computing affinities[1] "Checking to see if your internet connection
> works..."
> Warning message:
> unable to connect to 'www.bioconductor.org' on port 80.
> Note: http://www.bioconductor.org/repository/devel/package/Source does
> not seem to have a valid repository, skipping
> Warning messages:
> 1: Failed to read replisting at
> http://www.bioconductor.org/repository/devel/package/Source in:
> getReplisting(repURL, repFile, method = method)
> 2: unable to connect to 'www.bioconductor.org' on port 80.
> Note: http://www.bioconductor.org/repository/devel/package/Win32 does
> not seem to have a valid repository, skipping
> Note: You did not specify a download type. Using a default value of:
> Source
> This will be fine for almost all users
>
> Error in getCDF(cdfpackagename) : Environment arabidopsistlgf4xcdf was
> not found in the Bioconductor repository.
> In addition: Warning message:
> Failed to read replisting at
> http://www.bioconductor.org/repository/devel/package/Win32 in:
> getReplisting(repURL, repFile, method = method)
> ********
> Q2. I can not read my cel file now. Our cdf file name is
> "arabidopsistlgF.cdf" . But cif file name is "arabidopsistlgF_4x.cif".
> Do I need to use same name for cif and cdf? Because cel file include cif
> file name. And how can I start to read cel file?

Once you have the cdfenv installed correctly, you can read celfiles using ReadAffy().

> Q3. And also I would like to read cel file and normalization using a lot
> of cel files. Could you please suggest me what package is better for
> reading and normalization of affy custom array? and which is better rma
> (Robust Multi-Array Average expression measure) or gcrma (Background
> adjustment using sequence information)?

Which are better, apples or oranges? I guess it all depends on who you ask.

> Q4. If our array has over 3 million data, how long do I need for reading
> and normalization for 1 data (depend on machine power?)? Do you have
> some speculation for calculation efficiency? I need to read cdf file for
> about 20min.

If it takes that long to read in the cdf I am betting you are using virtual memory. In that case, you really need to get more RAM or things will be crushingly slow.

HTH,

Jim

> Thank you very much,
> Junshi

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
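The advice in this reply (find the name affy expects, build the package under that name, install it once) could be sketched roughly as follows. This is an illustration, not a tested recipe: `install_custom_cdf` is a hypothetical helper name, the affy and makecdfenv packages are assumed to be installed already, and the `.cdf` and `.CEL` files are assumed to sit in the working directory. Newer makecdfenv releases appear to expect an extra `species` argument, which can be passed through `...`.

```r
# Sketch only: build and install a custom CDF package under the name affy expects.
# `install_custom_cdf` is a hypothetical helper, not part of affy or makecdfenv.
install_custom_cdf <- function(cdf_file, cel_files, ...) {
  # Read the CEL files; in this thread that step succeeded before any
  # matching CDF package was installed.
  Data <- affy::ReadAffy(filenames = cel_files)

  # The package name affy will search for, derived from the CEL header
  # (reported in this thread as "arabidopsistlgf4xcdf").
  pkg <- affy::cleancdfname(affy::cdfName(Data))

  # Write the package source directory into the working directory.
  # Extra arguments (for example `species`) are passed through `...`.
  makecdfenv::make.cdf.package(cdf_file, packagename = pkg, ...)

  # Install from that directory; same effect as `R CMD INSTALL <pkg>`
  # typed in a terminal.
  install.packages(pkg, repos = NULL, type = "source")

  invisible(pkg)
}

# One-time setup (not run here):
# install_custom_cdf("arabidopsistlgF.cdf", list.files(pattern = "\\.CEL$"))
#
# Every later session then needs only:
# library(affy)
# Data <- ReadAffy()
# eset <- rma(Data)
```

Deriving the package name from the data rather than typing it by hand avoids the mismatch between `arabidopsistlgfcdf` and `arabidopsistlgf4xcdf` that causes the error above.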
Hi Jim,

Thank you very much for your suggestion. I tried to make the environment again as you said, as in the attached text.

> source("http://www.bioconductor.org/getBioC.R")
> getBioC("all")
> library(makecdfenv)
> Library(affy)
> make.cdf.package ("arabidopsistlgF.cdf")

Then I moved to the Terminal on my Mac:

> R CMD INSTALL arabidopsistlgfcdf

Back in R:

> source("http://www.bioconductor.org/getBioC.R")
> getBioC()
> library(affy)
> Data <- readAffy()
> cleancdfname(cdfName(Data))
[1] "arabidopsistlgf4xcdf"
> cleancdfname("arabidopsistlgf4xcdf")
[1] "arabidopsistlgf4xcdfcdf"
> library(makecdfenv)
> make.cdf.package("arabidopsistlgF.cdf", packagename="arabidopsistlgf4xcdf")
Reading CDF file.
Creating CDF environment
Wait for about 0 dots
> cel.files=list.files(pattern=".CEL$")
> data=ReadAffy(filenames=cel.files)
> eset <- rma(Data)
Note: You did not specify a download type. Using a default value of: Source
This will be fine for almost all users

Error in getCdfInfo(object) : Could not obtain CDF environment, problems encountered:
Specified environment specified did not contain arabidopsis_tlgF_4x
Library - package arabidopsistlgf4xcdf not installed
Data for package affy did not contain arabidopsis_tlgF_4x
Bioconductor - arabidopsistlgf4xcdf not available

I think the command make.cdf.package("arabidopsistlgF.cdf", packagename="arabidopsistlgf4xcdf") does work. But according to the message above, I cannot make the cdf environment (Bioconductor - arabidopsistlgf4xcdf not available). Could you please tell me what else I should do?

Thank you,
Junshi

--
Junshi Yazaki, Ph D
The Salk Institute for Biological Studies
10010 North Torrey Pines Road
La Jolla, CA 92037
Phone (858) 453-4100 x1533
FAX (858) 558-6379
Email: jyazaki@salk.edu
Junshi Yazaki wrote:
> Hi Jim,
>
> Thank you very much for your suggestion.
> I tried make environment again like attached txt as you said.

I think you have a serious misunderstanding about what you are doing here. I will try to point some things out, but you really need to be doing some close reading of the available documentation.

>>>> source("http://www.bioconductor.org/getBioC.R")
>>>> getBioC("all")

This installs *all* of the Bioconductor packages. This is probably a bit excessive, but the main problem here is it appears you think you have to re-install the Bioconductor packages every time you do anything with your computer. Once you have them installed, they are there for your use. This is about the same as installing Microsoft Office every time you want to open a Word document.

>>>> library(makecdfenv)
>>>> Library(affy)
>>>> make.cdf.package ("arabidopsistlgF.cdf")
>>>
>>> And move to Terminal on my Mac,
>>>
>>>> R CMD INSTALL arabidopsistlgfcdf

At this point you have installed the cdfenv, but it is *not* the correct name.

> back to R
>
>>>> source("http://www.bioconductor.org/getBioC.R")
>>>> getBioC()

Re-installing just the default packages. Please stop doing this...

>>>> library(affy)
>>>> Data <- readAffy()
>>
>> >cleancdfname(cdfName(Data))
>> [1] "arabidopsistlgf4xcdf"

Here you have established that the name the affy package is looking for is arabidopsistlgf4xcdf. Note that this is *different* from the name of the cdfenv you installed above.

>> > cleancdfname("arabidopsistlgf4xcdf")
>> [1] "arabidopsistlgf4xcdfcdf"
>> > library(makecdfenv)
>> >make.cdf.package("arabidopsistlgF.cdf",
>> packagename="arabidopsistlgf4xcdf")

Here you are making a package with the correct name, but unfortunately not ever installing it.

>> Reading CDF file.
>> Creating CDF environment
>> Wait for about 0 dots
>>
>>> cel.files=list.files(pattern=".CEL$")
>>> data=ReadAffy(filenames=cel.files)
>>> eset <- rma(Data)

Since you have not installed the package with the corrected name, the affy package goes to the Bioconductor website and tries to find the correct cdfenv. Since this is a custom chip, it can't find anything and errors out.

>> Note: You did not specify a download type. Using a default value of:
>> Source
>> This will be fine for almost all users
>>
>> Error in getCdfInfo(object) : Could not obtain CDF environment,
>> problems encountered:
>> Specified environment specified did not contain arabidopsis_tlgF_4x
>> Library - package arabidopsistlgf4xcdf not installed
>> Data for package affy did not contain arabidopsis_tlgF_4x
>
> Bioconductor - arabidopsistlgf4xcdf not available
>
> I think this command "make.cdf.package("arabidopsistlgF.cdf",
> packagename="arabidopsistlgf4xcdf")" does work.
> But according to above message, I can not make cdf environment
> (Bioconductor - arabidopsistlgf4xcdf not available). Could you please
> help me what should I do more?

No. You *are* making the cdf environment, but not installing the one with the correct name. You need to read the available documentation more closely and try to understand what it is you are trying to do. It appears you are simply trying different things in the hope that something will work.

Creating packages and installing them can be fairly confusing the first time you do it. I remember being a bit overwhelmed the first time I tried it, and thinking that the documentation was a bit sparse. However, all the information you need to do this is available to you in the help files for makecdfenv, and in 'R Installation and Administration'. It is well worth the time spent to read (maybe several times) all this documentation before going any further, and making sure you understand exactly what you are doing.

Good luck,

Jim

> Thank you,
> Junshi

--
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
1500 E Medical Center Drive
Ann Arbor MI 48109
734-647-5623
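The distinction drawn in this reply, between making a package and installing it, can be checked directly from R. The snippet below is a minimal sketch using only base R; `is_installed` is a hypothetical helper name, and the expected package name is the one reported by `cleancdfname(cdfName(Data))` earlier in the thread.

```r
# Hypothetical helper: TRUE if a package of that name is installed in any
# library on .libPaths(). system.file() returns "" for unknown packages.
is_installed <- function(pkg) nzchar(system.file(package = pkg))

# Name reported by cleancdfname(cdfName(Data)) in this thread.
expected <- "arabidopsistlgf4xcdf"

if (is_installed(expected)) {
  message(expected, " is installed; rma() should be able to find it.")
} else {
  message(expected, " is not installed; run `R CMD INSTALL ", expected,
          "` in the directory where make.cdf.package() wrote it.")
}
```

A package directory created by `make.cdf.package()` but never installed will still report as not installed here, which matches the "Library - package arabidopsistlgf4xcdf not installed" line in the error message.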