Convert DataFrame to data.frame While Keeping Column Name Syntax
Dario Strbenac ★ 1.5k
@dario-strbenac-5916
Australia

If I have a DataFrame object, but I want to use a classification algorithm like randomForest from CRAN, which complains about S4 objects being input, what is the best way to coerce it to a data.frame? I would like to keep the column names in their current gene symbol format. They might contain unusual symbols, as in HLA-A, for example. as.data.frame automatically converts column names to be syntactically valid, by doing things such as replacing hyphens with periods. data.frame(myDataFrame, check.names = FALSE) effectively does what I want, but it's a constructor rather than a function for converting between types. Is there anything better to use?
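For readers who want to see the renaming in isolation, here is a minimal base-R sketch of what check.names does. It only illustrates the behaviour described in the question; the toy values and the object names (symbols, kept, mangled) are made up for illustration.

```r
# Gene symbols that are not syntactically valid R names
symbols <- c("HLA-A", "HLA-B")

# make.names() is the function that check.names = TRUE applies to column names;
# it replaces invalid characters such as "-" with "."
fixed <- make.names(symbols)
print(fixed)

# check.names = FALSE leaves the symbols untouched
kept <- data.frame(`HLA-A` = 1:5, `HLA-B` = 2:6, check.names = FALSE)
print(colnames(kept))

# The default (check.names = TRUE) gives the period-separated names
mangled <- data.frame(`HLA-A` = 1:5, `HLA-B` = 2:6)
print(colnames(mangled))
```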

S4Vectors • 3.7k views
@james-w-macdonald-5106
United States
> df <- DataFrame("HLA-A" = 1:5, "HLA-B" = 2:6, check.names = FALSE)
> df
DataFrame with 5 rows and 2 columns
      HLA-A     HLA-B
  <integer> <integer>
1         1         2
2         2         3
3         3         4
4         4         5
5         5         6
> as(df, "data.frame")
  HLA-A HLA-B
1     1     2
2     2     3
3     3     4
4     4     5
5     5     6

It doesn't seem like the names are run through make.names when you coerce with as()?
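If you want to check this programmatically rather than by eye, something like the following sketch may help. It assumes S4Vectors is installed and that as() behaves as in the output above; the variable name plain is just for illustration.

```r
library(S4Vectors)

df <- DataFrame("HLA-A" = 1:5, "HLA-B" = 2:6, check.names = FALSE)

# Coerce with as() rather than as.data.frame()
plain <- as(df, "data.frame")

# Is the result an ordinary (non-S4) data.frame?
print(is.data.frame(plain))
print(isS4(plain))

# Are the column names carried over unchanged?
print(identical(colnames(plain), colnames(df)))
```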


Oh. You were using as.data.frame. That just ends up re-creating the data.frame, and on top of it all there's a ... argument that is ignored, so passing check.names = FALSE has no effect! LOL

> as.data.frame(df, check.names = FALSE)
  HLA.A HLA.B
1     1     2
2     2     3
3     3     4
4     4     5
5     5     6
Warning message:
In .local(x, row.names, optional, ...) : Arguments in '...' ignored

However, there is the 'optional' argument that you could use, and which is documented, so hypothetically you could have just figured this out yourself.

From ?as.data.frame, and then following to ?base::as.data.frame, you will see:

Arguments:

       x: any R object.

row.names: 'NULL' or a character vector giving the row names for the
          data frame.  Missing values are not allowed.

optional: logical. If 'TRUE', setting row names and converting column
          names (to syntactic names: see 'make.names') is optional.
          Note that all of R's 'base' package 'as.data.frame()' methods
          use 'optional' only for column names treatment, basically
          with the meaning of 'data.frame(*, check.names = !optional)'.
          See also the 'make.names' argument of the 'matrix' method.
> as.data.frame(df, optional = TRUE)
  HLA-A HLA-B
1     1     2
2     2     3
3     3     4
4     4     5
5     5     6
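
Once the names are kept, remember that they are no longer syntactic, so downstream code has to quote them. A small sketch, again assuming S4Vectors is installed (the variable names are hypothetical, and this is not specific to randomForest):

```r
library(S4Vectors)

df <- DataFrame("HLA-A" = 1:5, "HLA-B" = 2:6, check.names = FALSE)
plain <- as.data.frame(df, optional = TRUE)

# Non-syntactic column names need [[ ]] with a string, or backticks with $
hla_a <- plain[["HLA-A"]]
hla_b <- plain$`HLA-B`
print(hla_a)
print(hla_b)

# Subsetting by name keeps the original symbols
print(colnames(plain[, "HLA-A", drop = FALSE]))
```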