Basic question reading special chars using read.table
David ▴ 860
@david-3335
The header of the file (no quotes in the file):

########
Gene Coli-r1 1ug-r1 1ug-r2 Blank-r1
.... .... ... ... ...

#### reading the file
t = read.table(file="file.txt",header=TRUE)
colnames(t)

[1] "Gene" "Coli.r1" "X1ug.r1" "X1ug.r2" "X1ug.r3" "Blank.r1"

As you can see, the column names starting with a number are not treated as I expected: an "X" is added. Also, the "µ" is not handled properly; I have replaced it with "u" in my example.

Why are the column names changed, and how can I avoid that?

Thanks for any help!

david
@sean-davis-490
On Tue, Mar 31, 2009 at 11:47 AM, David martin <vilanew@gmail.com> wrote:
> How come that the names of the columns are changed ??? how to avoid that ???

Hi, David. The best place to start is to read the help for read.table(). See the argument "check.names" and its default.

Sean
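To see what the check.names default does, here is a minimal sketch. It assumes a small inline table standing in for file.txt (the data row and its values are made up for illustration) and uses the text= argument of read.table() so it runs without a file on disk.

```r
# Hypothetical stand-in for file.txt; the data row is invented for illustration.
txt <- "Gene Coli-r1 1ug-r1 1ug-r2 Blank-r1
geneA 1.0 2.0 3.0 4.0"

# check.names defaults to TRUE, so the header is passed through make.names()
default_names <- colnames(read.table(text = txt, header = TRUE))

# With check.names = FALSE the header is kept exactly as written in the file
raw_names <- colnames(read.table(text = txt, header = TRUE, check.names = FALSE))

print(default_names)
print(raw_names)
```

The first vector should show the "X" prefix and the dots, the second the names as they appear in the header line.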
@vincent-j-carey-jr-4
Please read the posting guide and the documentation on the function you are using. You need the check.names parameter to be set in a way that works for you. I don't know what will happen for your Greek character, but presumably if you set your environment variables suitably it will be propagated according to the rules of string handling in R.

'X's are prepended by default to nonsyntactic column names according to the behavior of make.names; if you set check.names=FALSE this is avoided.

> x = read.table("~/ta.txt", h=TRUE)
> x
  X1abc X2
1     1  2
> x = read.table("~/ta.txt", h=TRUE, check.names=FALSE)
> x
  1abc 2
1    1 2
> names(x)
[1] "1abc" "2"
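The renaming can also be reproduced without reading any file by calling make.names() directly. This is a small sketch; the headers vector is a hypothetical sample built from the names mentioned in this thread.

```r
# Hypothetical header names taken from the examples in this thread
headers <- c("Gene", "Coli-r1", "1ug-r1", "2")

# make.names() is what read.table() applies when check.names = TRUE
fixed <- make.names(headers)
print(fixed)

# The names that were altered are the nonsyntactic ones
changed <- headers[headers != fixed]
print(changed)
```

Comparing the vector before and after is a quick way to find out which headers in a file would be rewritten on import.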
Hi,

indeed, check.names=FALSE did work properly.

david
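One practical consequence of check.names=FALSE is that the preserved names are nonsyntactic, so they need quoting when used later. A minimal sketch, assuming a made-up two-replicate table (the values are invented for illustration):

```r
# Hypothetical data; only the header names come from the thread
txt <- "Gene 1ug-r1 1ug-r2
geneA 2.0 3.0"

d <- read.table(text = txt, header = TRUE, check.names = FALSE)

# With $ a nonsyntactic name must be wrapped in backticks
first_rep <- d$`1ug-r1`

# With [[ ]] an ordinary character string is enough
second_rep <- d[["1ug-r2"]]

print(first_rep)
print(second_rep)
```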
Yannick Wurm ▴ 220
@yannick-wurm-2314
Hello David,

R doesn't like names that begin with a number, hence the added X. It also reads '-' as 'minus', so the dash is not allowed in a name either and gets replaced with '.'. So if you choose column names that have no "strange or possibly mathematical" characters in them, you're safe.

Also, if your file contains weird characters (e.g. #, quotes or whitespace), read.table can fail to do what you expect, so whenever importing, check that your data is complete.

best,
yannick

--------------------------------------------
yannick . wurm @ unil . ch
Ant Genomics, Ecology & Evolution @ Lausanne
http://www.unil.ch/dee/page28685_fr.html
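As an illustration of the '#' problem, here is a sketch with an invented three-row table. By default read.table() treats '#' as the start of a comment (comment.char = "#"), so anything after it on a line is dropped without an error; setting comment.char = "" and quote = "" turns off comment and quote handling.

```r
# Hypothetical table; the second data row contains a '#' inside a field
txt <- "Gene Value Note
geneA 1 ok
geneB 2 lot#7
geneC 3 ok"

# Default settings: '#' starts a comment, so the rest of the field is lost
default_read <- read.table(text = txt, header = TRUE)

# Comment and quote handling switched off: the field is read as written
safe_read <- read.table(text = txt, header = TRUE,
                        comment.char = "", quote = "")

print(as.character(default_read$Note))
print(as.character(safe_read$Note))
```

Both calls return three rows, which is why the truncation is easy to miss unless the imported values are compared against the file.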