Question

Question about read.columns

Changbin Du ▴ 30

@changbin-du-4719

Last seen 10.5 years ago

Dear Bioconductor community,

I am using read.columns to read some columns into R and have run into the following problem. Can you help me?

I have a large data set named dd.txt with 2402 variables. The columns are:

    a1, b1, ..z1, a11, b11, ...z11, a111, b111, ..z111..

I don't know the relative positions of the columns, but I know I need the following variables:

    var<-c(a1, c1,a11,b11,f111)

Can I use read.columns to read the data into R?

I have tried the following code, but it does not work:

    hh<-read.columns("/house/homedirs/c/cdu/operon/gh/dd.dimer", required.col=NULL, text.to.search=var, sep="\t", skip=0, quote="", fill=T)
    dim(hh)
    468, 2402

    hh<-read.columns("/house/homedirs/c/cdu/operon/gh/dd.dimer", required.col=var, text.to.search="", sep="\t", skip=0, quote="", fill=T)
    dim(hh)
    0, 0

Thanks so much!

--
Sincerely,
Changbin

Changbin Du
DOE Joint Genome Institute
Bldg 400 Rm 457
2800 Mitchell Dr
Walnut Creek, CA 94598
Phone: 925-927-2856

• 1.1k views

ADD COMMENT • link updated 13.7 years ago by Valerie Obenchain ★ 6.8k • written 13.7 years ago by Changbin Du ▴ 30

score 0 · Answer 1 · 2011-06-27

Valerie Obenchain ★ 6.8k

@valerie-obenchain-4275

Last seen 3.1 years ago

United States

Hi Changbin,

Looking at the documentation for read.columns in the limma package, the required.col argument should select the columns you want.

It looks like you were successful in reading in all of the data on your first try. Check the column names in that file to be sure they match the names you are inputting with the variable "var". For example,

    myfile <- "/house/homedirs/c/cdu/operon/gh/dd.dimer"
    hh_full <- read.columns(myfile, quote="")
    colnames(hh_full)

You should see the column names you expect. Then try selecting the columns with

    var <- c("a1", "c1", "a11", "b11", "f111")
    hh_reduced <- read.columns(myfile, required.col=var, quote="")

Valerie
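
The name check suggested above can also be sketched in base R without loading the whole table. This is only an illustration under stated assumptions: the small temporary file, its column names, and the `wanted` vector are invented for the example, and it uses `scan()` on the header line rather than limma's read.columns.

```r
# Hypothetical stand-in for the real file: a tiny tab-delimited table.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("gene_id\ta1\tb1\tc1",
             "g1\t1\t2\t3",
             "g2\t4\t5\t6"), demo_file)

# Read only the header line to see the column names exactly as stored.
header_names <- scan(demo_file, what = "", sep = "\t", quote = "",
                     nlines = 1, quiet = TRUE)

# Names we intend to request (hypothetical); find any that are absent.
wanted <- c("a1", "c1", "z9")
missing_names <- setdiff(wanted, header_names)

print(header_names)
print(missing_names)
```

If `missing_names` is not empty, the requested names do not match the file's header exactly, which may explain an empty result when selecting by name.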

ADD COMMENT • link 13.7 years ago Valerie Obenchain ★ 6.8k

Hi Valerie,

Thanks for the information!

    > hh.full<-read.columns("/house/homedirs/c/cdu/operon/gh/dd.dimer",required.col=NULL, sep="\t", skip=0, quote="", fill=T)
    > colnames(hh.full)
    [1] "gene_id" "A{5}A" "A{5}C" "A{5}D" "A{5}E" "A{5}F" "A{5}G"

My variable names are right and the same as in my original file (dd.dimer).

But when I use the following code:

    > dimer.hh<-read.table("/house/homedirs/c/cdu/operon/gh/dd.dimer", sep="\t", skip=0, quote="", header=T, fill=T)
    > colnames(dimer.hh)
    [1] "gene_id" "A.5.A" "A.5.C" "A.5.D" "A.5.E" "A.5.F" "A.5.G"

The variable names changed: the { and } were replaced by periods. I don't know why.

Thanks!

ADD REPLY • link 13.7 years ago Changbin Du ▴ 30

On 06/27/2011 08:51 AM, Changbin Du wrote:
> The variables names changed. the { } are changed to .. , I dont know how?

I'm not sure why you are seeing the braces replaced by periods. I have tested your example on a sample file of my own where I named the columns "A{5}A", "A{5}C" etc. and I cannot replicate this behavior. The column names in the result are the same regardless of whether I use required.col=NULL or leave it out.

More importantly, have you been successful in extracting the columns you want with the required.col argument now that you know how the column names appear? For example,

    required.col <- c("A{5}A", "A{5}F")

Valerie
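
For readers without limma at hand, selecting columns by name when their positions are unknown can be approximated in base R. The sketch below is an assumption-laden illustration, not limma's implementation: the temporary file and the `wanted` vector are hypothetical, and the approach relies on `read.table()` skipping any column whose `colClasses` entry is "NULL".

```r
# Hypothetical stand-in for dd.dimer: column names contain braces.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("gene_id\tA{5}A\tA{5}C\tA{5}F",
             "g1\t1\t2\t3",
             "g2\t4\t5\t6"), demo_file)

# Column names exactly as they appear in the header line.
header_names <- scan(demo_file, what = "", sep = "\t", quote = "",
                     nlines = 1, quiet = TRUE)

# Columns to keep, given by name only (positions unknown).
wanted <- c("A{5}A", "A{5}F")

# "NULL" skips a column; NA lets read.table guess the column type.
col_classes <- rep("NULL", length(header_names))
col_classes[header_names %in% wanted] <- NA

# check.names = FALSE keeps the braces in the column names.
selected <- read.table(demo_file, header = TRUE, sep = "\t", quote = "",
                       check.names = FALSE, colClasses = col_classes)
print(colnames(selected))
print(dim(selected))
```

Skipping unwanted columns at read time avoids holding all 2402 columns in memory just to keep a handful.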

ADD REPLY • link 13.7 years ago Valerie Obenchain ★ 6.8k

Hi Valerie,

Thanks for the reply!

Yes, now I can successfully extract the columns I want with the required.col argument, as long as I use

    required.col <- c("A{5}A", "A{5}F")

read.columns works fine.

If I use read.table, however, even though the original column names are "A{5}A", "A{5}F", etc., I still get "A.5.D" "A.5.F" "A.5.G" etc. I don't know how that happens.

Anyway, read.columns works for me now.

Thanks for all the help!

ADD REPLY • link 13.7 years ago Changbin Du ▴ 30

Check the read.table help (hint: check.names), which points you to the 'make.names' function... that's why.

Cheers,
Cei
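
The effect of check.names can be reproduced on a small made-up file. In this sketch the temporary file and its contents are hypothetical; `make.names()` and the `check.names` argument of `read.table()` are standard base R.

```r
# Hypothetical file whose header contains braces, as in the thread.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("gene_id\tA{5}A\tA{5}C",
             "g1\t1\t2",
             "g2\t3\t4"), demo_file)

# Default behaviour: check.names = TRUE passes the header through
# make.names(), which replaces characters that are invalid in R names.
default_names <- colnames(read.table(demo_file, sep = "\t", header = TRUE,
                                     quote = ""))

# With check.names = FALSE the header is kept exactly as written.
kept_names <- colnames(read.table(demo_file, sep = "\t", header = TRUE,
                                  quote = "", check.names = FALSE))

# Calling make.names() directly shows the same conversion.
converted <- make.names(c("gene_id", "A{5}A", "A{5}C"))

print(default_names)
print(kept_names)
print(converted)
```

Passing check.names=FALSE to read.table is therefore one way to keep names such as "A{5}A" unchanged; such names then need backticks or `[[ ]]` when used to access columns.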

ADD REPLY • link 13.7 years ago Cei Abreu-Goodger ▴ 830